Brains vs. Bytes: Study Compares Diagnoses Made by AI and Clinicians

A University of Maine study compared how well artificial intelligence (AI) models and human clinicians handled complex or sensitive medical cases.

The study, published in the Journal of Health Organization and Management in May, evaluated more than 7,000 anonymized medical queries from the United States and Australia. The findings outline where the technology shows promise and what limitations must be addressed before AI is deployed with patients, and may inform the future development of AI tools, clinical procedures and public policy. The study also informs efforts to use AI to support healthcare professionals at a time of growing workforce shortages and increasing clinician burnout.

The results showed that most AI-generated responses met expert standards of accuracy, especially for factual and procedural queries, but the models often struggled with "why" and "how" questions.

The study also found that while responses were consistent within a given session, inconsistencies appeared when users posed the same questions in later tests. These discrepancies raise concerns, particularly when a patient’s health is at stake. The findings add to a growing body of evidence that will define AI’s role in healthcare.

"This isn't about replacing doctors and nurses," said C. Matt Graham, author of the study and associate professor of information systems and security management at the Maine Business School. "It's about augmenting their abilities. AI can be a second set of eyes; it can help clinicians sift through mountains of data, recognize patterns and offer evidence-based recommendations in real time."

The study also compared health metrics, including patient satisfaction, cost and treatment efficacy, across both countries. In Australia, which has a universal healthcare model, patients reported higher satisfaction and roughly one-quarter the costs of patients in the U.S., who also waited twice as long to see providers. Graham notes in the study that health system, regulatory and cultural differences like these will ultimately influence how AI is received and used, and that models should be trained to account for these variations.

While the accuracy of a diagnosis matters, so does the way it is delivered. In the study, AI responses frequently lacked the emotional engagement and empathetic nuance often conveyed by human clinicians.

The length of AI responses was strikingly consistent, with most falling between 400 and 475 words. Responses by human clinicians showed far more variation, with more concise answers written in response to simpler questions.

Vocabulary analysis revealed that AI regularly used clinical terms in its responses, which may be hard to understand or feel insensitive to some patients. In situations involving topics such as mental health or terminal illness, AI struggled to convey the compassion that is critical in effective patient-provider relationships.

"Healthcare professionals offer healing that is grounded in human connection, through sight, touch, presence and communication - experiences that AI cannot replicate," said Kelley Strout, associate professor of UMaine's School of Nursing, who was not involved in the study. "The synergy between AI and clinicians’ judgment, compassion and application of evidence-based practice has the potential to transform healthcare systems but only if accompanied by rigorous standards, ethical frameworks and safeguards to monitor for errors and unintended consequences."

The study arrives amid widespread and growing shortages in the U.S. healthcare workforce. Across the country, patients face long wait times, high costs and a shortage of primary care and specialty providers. These barriers are particularly acute in rural regions, where limited access often leads to delayed diagnoses and worsening health outcomes.

A report published by the Health Resources and Services Administration in 2024 projected that nonmetro areas will face a 42% shortage of primary care physicians by 2037. While a growing number of nurse practitioners and physician assistants are stepping in to fill the gap, demand for care is growing faster. Between 2022 and 2026, the population of people 65 and older in the U.S. is projected to increase 54%, a trend with significant implications for the demand for health services.

Strout said that while AI could help improve patient access and alleviate challenges - such as burnout, which affects more than half of primary care physicians in the U.S. - its use must be carefully approached.

AI-powered tools could provide round-the-clock virtual assistance and complement provider-to-patient communication through channels like online patient portals, which have surged in popularity since 2020. The technology, however, also raises fears of job displacement, and experts warn that rapid implementation without ethical guardrails may exacerbate disparities and compromise care quality.

"Technology is only one part of the solution," said Graham. "We need regulatory standards, human oversight and inclusive datasets. Right now, most AI tools are trained on limited populations. If we're not careful, we risk building systems that reflect and even magnify existing inequalities."

Strout added that as health care systems integrate AI into clinical practice, administrators must ensure that these tools are designed with patients and providers in mind. Lessons from past integration of technology, which at times failed to enhance care delivery, offer valuable guidance for AI developers.

"We must learn from past missteps. The electronic health record (EHR), for example, was largely developed around billing models rather than patient outcomes or provider workflows," Strout said. "As a result, EHR systems have often contributed to frustration among providers and diminished patient satisfaction. We cannot afford to repeat that history with AI."

Other factors, such as accountability for mistakes and patient privacy, are top of mind for medical ethicists, policy makers and AI researchers. Solutions to these ethical questions may vary depending on where they are adopted to account for different cultural and regulatory environments.

As AI continues to develop, many experts believe it will enhance the service efficiency and decision-making that providers offer to patients. The study’s findings support the growing consensus that AI’s limited ethical and emotional adaptability means that human clinicians remain indispensable. Graham says that, in addition to improving the performance of AI tools, future research should focus on managing ethical risks and adapting AI to diverse healthcare contexts to ensure the technology augments rather than undermines human care.

"Technology should enhance the humanity of medicine, not diminish it," Graham said. "That means designing systems that support clinicians in delivering care, not replacing them altogether."

Graham CM. Artificial intelligence vs human clinicians: a comparative analysis of complex medical query handling across the USA and Australia. J Health Organ Manag. 2025 May 27. doi: 10.1108/JHOM-02-2025-0100
