Head-to-Head Against AI, Pharmacy Students Won

Students pursuing a Doctor of Pharmacy degree routinely take, and pass, rigorous exams to prove competency in several areas. Can ChatGPT accurately answer the same questions? A new study by University of Arizona R. Ken Coit College of Pharmacy researchers said no, it can’t.

Researchers found that ChatGPT 3.5, a form of artificial intelligence, fared worse than PharmD students in answering questions on therapeutics examinations that ensure students have the knowledge, skills, and critical thinking abilities to provide safe, effective and patient-centered care.

ChatGPT was less likely to correctly answer application-based questions (44%) than questions focused on recall of facts (80%). It was also less likely to correctly answer case-based questions (45%) than questions that weren’t focused on patient cases (74%). Overall, ChatGPT answered only 51% of the questions correctly.

The results provide additional insights into the uses and limitations of the technology and may also prove valuable in the development of pharmacy exam questions. The study findings appear in Currents in Pharmacy Teaching and Learning.

"AI has many potential uses in health care and education, and it’s not going away," said Christopher Edwards, PharmD, an associate clinical professor of pharmacy practice and science. "One of the things we were hoping to answer with the study was if students wanted to use AI on an exam, how would they perform? I wanted to have data to show the students and tell them they can do well in the exams by studying hard and they don’t necessarily need these tools."

A secondary goal was to find out what kinds of questions AI would struggle with. Coit College of Pharmacy Interim Dean Brian Erstad, PharmD, wasn’t surprised that ChatGPT did better with straightforward multiple choice and true-false questions and was less successful with application-based questions.

"The kinds of places where evidence is limited and judgment is required, which is often in a clinical setting, was where we found the technology somewhat lacking," he said. "Ironically those are the kinds of questions clinicians are always facing."

Edwards, Erstad, and Bernadette Cornelison, PharmD, an associate professor of pharmacy practice and science, evaluated answers to 210 questions from six exams in two pharmacotherapeutics courses that are part of the university’s Coit College of Pharmacy PharmD program.

One course was a first-year PharmD class focused on disorders related to nonprescription medications, including heartburn, diarrhea, atopic dermatitis, colds and allergies. The other was a second-year course that covered cardiology, neurology and critical care topics.

To compare the exam performances of pharmacy students and ChatGPT, the researchers calculated mean composite scores as a measure of the ability to correctly answer questions. For ChatGPT, they added the individual scores for each exam and divided by the number of exams. For the students, they divided the sum of the mean class performance on each exam by the number of exams. The mean composite score across the six exams was 53 for ChatGPT, compared with 82 for pharmacy students.
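The arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function names and all of the numbers are hypothetical and are not taken from the study. It also shows that an average of per-exam scores weights every exam equally, so it need not match the share of all questions answered correctly when exams differ in length.

```python
from statistics import mean


def mean_composite_score(exam_scores):
    """Average of per-exam scores, with every exam weighted equally."""
    return mean(exam_scores)


def pooled_percent_correct(correct_counts, question_counts):
    """Percent of all questions answered correctly, pooled across exams."""
    return 100 * sum(correct_counts) / sum(question_counts)


# Hypothetical data for three exams (NOT the study's data).
questions = [20, 40, 40]   # questions on each exam
correct = [16, 20, 16]     # questions answered correctly on each exam

# Per-exam percentage scores for the hypothetical test taker.
per_exam = [100 * c / q for c, q in zip(correct, questions)]

composite = mean_composite_score(per_exam)
pooled = pooled_percent_correct(correct, questions)

# For a class, the same averaging is applied to the mean class score per exam.
class_means = [84.0, 78.0, 81.0]  # hypothetical mean class scores
class_composite = mean_composite_score(class_means)

print(f"per-exam scores: {per_exam}")
print(f"mean composite:  {composite:.1f}")
print(f"pooled percent:  {pooled:.1f}")
print(f"class composite: {class_composite:.1f}")
```

With these made-up inputs the mean composite score is higher than the pooled percentage because the shortest exam has the best score. A gap between a composite score and an overall percent-correct figure can arise in the same way, although the article does not say whether that explains the difference between the two figures it reports.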

Educators, clinicians and others continue to debate the value of AI large language models, such as ChatGPT, in academic medicine. While such models will continue to play a range of roles in health care, pharmacy practice and other areas, many are concerned that relying too much on the technology could hamper the development of needed reasoning and critical thinking skills in students.

Both Erstad and Edwards acknowledged that in time, newer and more advanced technology may change these results.

Edwards CJ, Cornelison B, Erstad BL.
Comparison of a generative large language model to pharmacy student performance on therapeutics examinations.
Curr Pharm Teach Learn. 2025 Sep;17(9):102394. doi: 10.1016/j.cptl.2025.102394
