Diagnoses and Treatment Recommendations Given by AI were More Accurate than those of Physicians

A new study led by Prof. Dan Zeltzer, a digital health expert from the Berglas School of Economics at Tel Aviv University, compared the quality of diagnostic and treatment recommendations made by artificial intelligence (AI) and physicians at Cedars-Sinai Connect, a virtual urgent care clinic in Los Angeles, operated in collaboration with Israeli startup K Health. The paper was published in Annals of Internal Medicine and presented at the annual conference of the American College of Physicians (ACP). This work was supported with funding by K Health.

Prof. Zeltzer explains: "Cedars-Sinai operates a virtual urgent care clinic offering telemedical consultations with physicians who specialize in family and emergency care. Recently, an AI system was integrated into the clinic - an algorithm based on machine learning that conducts initial intake through a dedicated chat, incorporates data from the patient’s medical record, and provides the attending physician with detailed diagnostic and treatment suggestions at the start of the visit - including prescriptions, tests, and referrals. After interacting with the algorithm, patients proceed to a video visit with a physician who ultimately determines the diagnosis and treatment. To ensure reliable AI recommendations, the algorithm - trained on medical records from millions of cases - only offers suggestions when its confidence level is high, giving no recommendation in about one out of five cases. In this study, we compared the quality of the AI system's recommendations with the physicians' actual decisions in the clinic."

The researchers examined a sample of 461 online clinic visits over one month during the summer of 2024. The study focused on adult patients with relatively common symptoms - respiratory, urinary, eye, vaginal and dental. In all visits reviewed, patients were initially assessed by the algorithm, which provided recommendations, and then treated by a physician in a video consultation. Afterwards, all recommendations - from both the algorithm and the physicians - were evaluated by a panel of four doctors with at least ten years of clinical experience, who rated each recommendation on a four-point scale: optimal, reasonable, inadequate, or potentially harmful. The evaluators assessed the recommendations based on the patients' medical histories, the information collected during the visit, and transcripts of the video consultations.

The compiled ratings favored the AI: its recommendations were rated as optimal in 77% of cases, compared with 67% of the physicians' decisions. At the other end of the scale, AI recommendations were rated as potentially harmful less often than physicians' decisions (2.8% versus 4.6%). In 68% of the cases, the AI and the physician received the same score; in 21% of cases, the algorithm scored higher than the physician; and in 11% of cases, the physician's decision was considered better.
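For readers who want to check the arithmetic behind these figures, the following is a minimal Python sketch. It uses only the percentages reported in this article; the variable names and the derived "share of differing cases" figure are illustrative assumptions and do not come from the paper itself.

```python
# Shares reported in the article, in percent of rated visits.
optimal = {"ai": 77, "physician": 67}
potentially_harmful = {"ai": 2.8, "physician": 4.6}
paired = {"same_score": 68, "ai_higher": 21, "physician_higher": 11}


def gap(shares):
    """Difference (AI minus physician) in percentage points."""
    return round(shares["ai"] - shares["physician"], 1)


optimal_gap = gap(optimal)              # percentage-point gap in "optimal" ratings
harmful_gap = gap(potentially_harmful)  # negative means AI was rated harmful less often
paired_total = sum(paired.values())     # the three paired outcomes should cover all cases

# Among cases where the two scores differed, how often the AI scored higher.
# This is a derived illustration, not a figure reported in the article.
differing = paired["ai_higher"] + paired["physician_higher"]
ai_share_of_differing = 100 * paired["ai_higher"] / differing

print(optimal_gap, harmful_gap, paired_total, ai_share_of_differing)
```

The three paired outcomes sum to 100%, so the comparison accounts for every rated visit. The gaps are differences in percentage points, not relative changes.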

The explanations provided by the evaluators for the differences in ratings highlight several advantages of the AI system over human physicians: first, the AI adheres more strictly to medical association guidelines - for example, not prescribing antibiotics for a viral infection; second, the AI more comprehensively identifies relevant information in the medical record - such as recurrent cases of a similar infection that may influence the appropriate course of treatment; and third, the AI more precisely identifies symptoms that could indicate a more serious condition, such as eye pain reported by a contact lens wearer, which could signal an infection. Physicians, on the other hand, are more flexible than the algorithm and have an advantage in assessing the patient's real condition. For example, if a COVID-19 patient reports shortness of breath, a doctor may recognize it as relatively mild respiratory congestion, whereas the AI, based solely on the patient's answers, might refer them unnecessarily to the emergency room.

Prof. Zeltzer concludes: "In this study, we found that AI, based on a targeted intake process, can provide diagnostic and treatment recommendations that are, in many cases, more accurate than those made by physicians. One limitation of the study is that we do not know which of the physicians reviewed the AI's recommendations in the available chart, or to what extent they relied on these recommendations. Thus, the study only measured the accuracy of the algorithm's recommendations and not their impact on the physicians. The uniqueness of the study lies in the fact that it tested the algorithm in a real-world setting with actual cases, while most studies focus on examples from certification exams or textbooks. The relatively common conditions included in our study represent about two-thirds of the clinic's case volume, and thus the findings can be meaningful for assessing AI's readiness to serve as a decision-support tool in medical practice. We can envision a near future in which algorithms assist in an increasing portion of medical decisions, bringing certain data to the doctor's attention, and facilitating faster decisions with fewer human errors. Of course, many questions still remain about the best way to implement AI in the diagnostic and treatment process, as well as the optimal integration between human expertise and artificial intelligence in medicine."

Zeltzer D, Kugler Z, Hayat L, Brufman T, Ilan Ber R, Leibovich K, Beer T, Frank I, Shaul R, Goldzweig C, Pevnick J.
Comparison of Initial Artificial Intelligence (AI) and Final Physician Recommendations in AI-Assisted Virtual Urgent Care Visits.
Ann Intern Med. 2025 Apr;178(4):498-506. doi: 10.7326/ANNALS-24-03283
