Creating Exam Questions with ChatGPT

For the study, the UKB (Universitätsklinikum Bonn) researchers created two sets of 25 multiple-choice questions (MCQs), each with five possible answers, one of which was correct. The first set was written by an experienced medical lecturer, the second by ChatGPT. A total of 161 students answered all questions in random order. For each question, students also indicated whether they thought it was created by a human or by ChatGPT.

Matthias Laupichler, one of the study authors and research associate at the Institute for Medical Didactics at the UKB, explains: "We were surprised that the difficulty of human-generated and ChatGPT-generated questions was virtually identical. Even more surprising for us, however, was that the students were unable to correctly identify the origin of the question in almost half of the cases. Although the results obviously need to be replicated in further studies, the automated generation of exam questions using ChatGPT and co. appears to be a promising tool for medical studies."

His colleague and co-author of the study Johanna Rother adds: "Lecturers can use ChatGPT to generate ideas for exam questions, which are then checked and, if necessary, revised by the lecturers. In our opinion, however, students in particular benefit from the automated generation of medical practice questions, as it has long been known that self-testing one's own knowledge is very beneficial for learning."

Tobias Raupach, Director of the Institute of Medical Didactics, continues: "We knew from previous studies that language models such as ChatGPT can answer the questions in medical state examinations. We have now been able to show for the first time that the software can also be used to write new questions that hardly differ from those of experienced teachers."

Tizian Kaiser, who is studying human medicine in his seventh semester, comments: "When working on the mock exam, I was quite surprised at how difficult it was for me to tell the questions apart. My approach was to differentiate between the questions based on their length, the complexity of their sentence structure and the difficulty of their content. But to be honest, in some situations I simply had to guess and the evaluation showed that I was barely able to differentiate between them. This leads me to the conviction that a meaningful knowledge query, as in this exam, is also possible exclusively through questions posed by the AI."

He is convinced that ChatGPT has great potential for student learning. It allows students to review what they have learned again and again in different ways. "There is the option of being quizzed by the AI on predefined topics, having mock exams designed or simulating oral exams in writing. The repetition of the material is thus tailored to the exam concept and the training possibilities are endless," says the study participant, while also qualifying: "However, I would only use ChatGPT for this purpose and not beforehand in the learning process, in which the study topics have to be worked through and summarized. Because while ChatGPT is excellent for repetition, I fear that errors can occur when preparing learning content. I wouldn't notice these errors without a prior overview of the topic."

It is known from other studies that regular testing, even and especially without grading, helps students retain learning content in the long term. Such tests can now be created with little effort. However, the current study should first be replicated in other contexts (i.e. other subjects, semesters and countries), and it should be investigated whether ChatGPT can also write question types other than the multiple-choice questions commonly used in medicine.

Laupichler MC, Rother JF, Grunwald Kadow IC, Ahmadi S, Raupach T.
Large Language Models in Medical Education: Comparing ChatGPT- to Human-Generated Exam Questions.
Acad Med. 2023 Dec 28. doi: 10.1097/ACM.0000000000005626
