ChatGPT Shows Human-Level Assessment of Brain Tumor MRI Reports

As artificial intelligence advances, its uses and capabilities in real-world applications continue to reach new heights that may even surpass human expertise. In the field of radiology, where a correct diagnosis is crucial to ensure proper patient care, large language models, such as ChatGPT, could improve accuracy or at least offer a good second opinion.

To test its potential, graduate student Yasuhito Mitsuyama and Associate Professor Daiju Ueda's team at Osaka Metropolitan University’s Graduate School of Medicine led the researchers in comparing the diagnostic performance of GPT-4 based ChatGPT and radiologists on 150 preoperative brain tumor MRI reports. Based on these daily clinical notes written in Japanese, ChatGPT, two board-certified neuroradiologists, and three general radiologists were asked to provide differential diagnoses and a final diagnosis.

Subsequently, their accuracy was calculated based on the actual diagnosis of the tumor after its removal. The results stood at 73% for ChatGPT, a 72% average for neuroradiologists, and 68% average for general radiologists. Additionally, ChatGPT’s final diagnosis accuracy varied depending on whether the clinical report was written by a neuroradiologist or a general radiologist. The accuracy with neuroradiologist reports was 80%, compared to 60% when using general radiologist reports.

"These results suggest that ChatGPT can be useful for preoperative MRI diagnosis of brain tumors," stated graduate student Mitsuyama. "In the future, we intend to study large language models in other diagnostic imaging fields with the aims of reducing the burden on physicians, improving diagnostic accuracy, and using AI to support educational environments."

Mitsuyama Y, Tatekawa H, Takita H, Sasaki F, Tashiro A, Oue S, Walston SL, Nonomiya Y, Shintani A, Miki Y, Ueda D.
Comparative analysis of GPT-4-based ChatGPT's diagnostic performance with radiologists using real-world radiology reports of brain tumors.
Eur Radiol. 2024 Aug 28. doi: 10.1007/s00330-024-11032-8

Most Popular Now

Do Fitness Apps do More Harm than Good?

A study published in the British Journal of Health Psychology reveals the negative behavioral and psychological consequences of commercial fitness apps reported by users on social media. These impacts may...

AI Tool Beats Humans at Detecting Parasi…

Scientists at ARUP Laboratories have developed an artificial intelligence (AI) tool that detects intestinal parasites in stool samples more quickly and accurately than traditional methods, potentially transforming how labs diagnose...

Making Cancer Vaccines More Personal

In a new study, University of Arizona researchers created a model for cutaneous squamous cell carcinoma, a type of skin cancer, and identified two mutated tumor proteins, or neoantigens, that...

AI, Health, and Health Care Today and To…

Artificial intelligence (AI) carries promise and uncertainty for clinicians, patients, and health systems. This JAMA Summit Report presents expert perspectives on the opportunities, risks, and challenges of AI in health...

AI can Better Predict Future Risk for He…

A landmark study led by University' experts has shown that artificial intelligence can better predict how doctors should treat patients following a heart attack. The study, conducted by an international...

A New AI Model Improves the Prediction o…

Breast cancer is the most commonly diagnosed form of cancer in the world among women, with more than 2.3 million cases a year, and continues to be one of the...

AI System Finds Crucial Clues for Diagno…

Doctors often must make critical decisions in minutes, relying on incomplete information. While electronic health records contain vast amounts of patient data, much of it remains difficult to interpret quickly...

New AI Tool Makes Medical Imaging Proces…

When doctors analyze a medical scan of an organ or area in the body, each part of the image has to be assigned an anatomical label. If the brain is...