The emergence of artificial intelligence (AI) chatbots has opened up new possibilities for doctors and patients — but the technology also comes with the risk of misdiagnosis, data privacy issues and biases in decision-making.
One of the most popular examples is ChatGPT, which can mimic human conversations and create personalized medical advice. In fact, it recently passed the U.S. Medical Licensing Exam.
And because of its ability to generate human-like responses, some experts believe ChatGPT could help doctors with paperwork, examine X-rays (the platform is capable of reading photos) and weigh in on a patient’s surgery.
The software could become as crucial for doctors as the stethoscope was in the last century, said Dr. Robert Pearl, a professor at the Stanford University School of Medicine.
“It just won’t be possible to provide the best cutting-edge medicine in the future (without it),” he said, adding the platform is still years away from reaching its full potential.
“The current version of ChatGPT needs to be understood as a toy,” he said. “It’s probably two per cent of what’s going to happen in the future.”
This is because generative AI is growing in power and effectiveness, doubling every six to 10 months, according to researchers.
Developed by OpenAI, and released for testing to the general public in November 2022, ChatGPT had explosive uptake. After its release, over a million people signed up to use it in just five days, according to OpenAI CEO Sam Altman.
The software is currently free as it sits in its research phase, though there are plans to eventually charge.
“We will have to monetize it somehow at some point; the compute costs are eye-watering,” Altman said online on Dec. 5, 2022.
A physician’s digital assistant
Although ChatGPT is a relatively new platform, the idea of using AI in health care has been around for years.
In 2007, IBM created an open-domain question-answering system named Watson, which went on to win first place on the television game show Jeopardy! in 2011.
In 2017, a team of scientists used Watson to successfully identify new RNA-binding proteins that were altered in amyotrophic lateral sclerosis (ALS), highlighting the use of AI tools to accelerate scientific discovery in neurological disorders.
During the COVID-19 pandemic, researchers from the University of Waterloo developed AI models that predicted which COVID-19 patients were most likely to have severe kidney injury outcomes while in hospital.
What sets ChatGPT apart from the other AI platforms is its ability to communicate, said Huda Idrees, founder and CEO of Dot Health, a health data tracker.
“Within a health-care context, communicating with clients — for example, if someone needs to write a longish letter describing their care plan — it makes sense to use ChatGPT. It would save doctors a lot of time,” she said. “So from an efficiency perspective, I see it as a very strong communication tool.”
Its communication is so effective that a JAMA study published April 28 found ChatGPT may have a better bedside manner than some doctors.
The study drew 195 patient questions at random and compared physicians’ answers with the chatbot’s. The chatbot responses were preferred over physician responses and rated significantly higher for both quality and empathy.
On average, ChatGPT scored 21 per cent higher than physicians for quality of responses and 41 per cent higher for empathy, according to the study.
Pearl said he does not see the software taking over a doctor’s job; rather, he believes it will act like a digital assistant.
“It becomes a partner for the doctor to use,” he said. “Medical knowledge doubles every 73 days. It’s just not possible for a human being to stay up at that pace. There’s also more and more information about unusual conditions that ChatGPT can find in the literature and provide to the physician.”
Used to sift through the vast amount of medical knowledge, ChatGPT can help a physician save time and even help lead to a diagnosis, Pearl explained.
It’s still early days, but people are looking at using the platform as a tool to help monitor patients from home, explained Carrie Jenkins, a professor of philosophy at the University of British Columbia.
“We’re already seeing that there is work in monitoring patients’ sugars and automatically filling out the right insulin they should have if they need it for their diabetes,” Jenkins told Global News in February.
“Maybe one day it will help with our diagnostic process, but we are not there yet,” Jenkins added.
Results can be ‘fairly disturbing’
Previous studies have shown that physicians vastly outperform computer algorithms in diagnostic accuracy.
For example, a 2016 research letter published in JAMA Internal Medicine showed that physicians were correct more than 84 per cent of the time when diagnosing a patient, compared with a computer algorithm, which was correct 51 per cent of the time.
More recently, an emergency room doctor in the United States put ChatGPT to work in a real-world medical situation.
In an article published on Medium, Dr. Josh Tamayo-Sarver said he fed the AI platform the anonymized medical histories of past patients and the symptoms that brought them to the emergency department.
“The results were fascinating, but also fairly disturbing,” he wrote.
If he entered precise, detailed information, the chatbot did a “decent job” of bringing up common diagnoses he wouldn’t want to miss, he said.
But the platform only had about a 50 per cent success rate in correctly diagnosing his patients, he added.
“ChatGPT also misdiagnosed several other patients who had life-threatening conditions. It correctly suggested one of them had a brain tumor — but missed two others who also had tumors. It diagnosed another patient with torso pain as having a kidney stone — but missed that the patient actually had an aortic rupture,” he wrote.
Its developers have acknowledged this pitfall.
“ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers,” OpenAI stated on its website.
The potential for misdiagnosis is just one of the drawbacks of using ChatGPT in the health-care setting.
ChatGPT is trained on vast amounts of data made by humans, which means there can be inherent biases.
“There’s a lot of times where it’s factually incorrect, and that’s what gives me pause when it comes to specific health queries,” Idrees said, adding that not only does the software get facts wrong, but it can also pull biased information.
“It could be that there is a lot of anti-vax information available on the internet, so maybe it actually will reference more anti-vax links more than it needs to,” she explained.
Idrees pointed out that another limitation of the software is the difficulty of accessing private health information.
From lab results and screening tests to surgical notes, there is a “whole wealth” of information that is not easily accessible, even when it’s digitally captured.
“In order for ChatGPT to do anything … really impactful in health care, it would need to be able to consume and have a whole other set of language in order to communicate that health-care data,” she said.
“I don’t see how it’s going to magically access these treasure troves of health data unless the industry moves first.”
— with files from the Associated Press and Global News’ Kathryn Mannie