AI Performs Comparably to Human Readers of Mammograms

Using a standardized assessment, researchers in the UK compared the performance of a commercially available artificial intelligence (AI) algorithm with that of human readers of screening mammograms. Their findings were published in Radiology, a journal of the Radiological Society of North America (RSNA).

Mammographic screening does not detect every breast cancer. False-positive interpretations can result in women without cancer undergoing unnecessary imaging and biopsy. One way to improve the sensitivity and specificity of screening mammography is to have two readers interpret every mammogram.

According to the researchers, double reading increases cancer detection rates by 6 to 15% and keeps recall rates low. However, this strategy is labor-intensive and difficult to achieve during reader shortages.

"There is a lot of pressure to deploy AI quickly to solve these problems, but we need to get it right to protect women's health," said Yan Chen, Ph.D., professor of digital screening at the University of Nottingham, United Kingdom.

Prof. Chen and her research team used test sets from the Personal Performance in Mammographic Screening, or PERFORMS, quality assurance assessment utilized by the UK’s National Health Service Breast Screening Program (NHSBSP), to compare the performance of human readers with AI. A single PERFORMS test consists of 60 challenging exams from the NHSBSP with abnormal, benign and normal findings. For each test mammogram, the reader's score is compared to the ground truth.

"It's really important that human readers working in breast cancer screening demonstrate satisfactory performance," she said. "The same will be true for AI once it enters clinical practice."

The research team used data from two consecutive PERFORMS test sets, a total of 120 screening mammograms, and evaluated the AI algorithm on the same two sets. The researchers compared the AI test scores with the scores of the 552 human readers, including 315 (57%) board-certified radiologists and 237 non-radiologist readers consisting of 206 radiographers and 31 breast clinicians.

"The 552 readers in our study represent 68% of readers in the NHSBSP, so this provides a robust performance comparison between human readers and AI," Prof. Chen said.

Treating each breast separately, there were 161/240 (67%) normal breasts, 70/240 (29%) breasts with malignancies, and 9/240 (4%) benign breasts. Masses were the most common malignant mammographic feature (45/70 or 64.3%), followed by calcifications (9/70 or 12.9%), asymmetries (8/70 or 11.4%), and architectural distortions (8/70 or 11.4%). The mean size of malignant lesions was 15.5 mm.

No difference in performance was observed between AI and human readers in the detection of breast cancer in the 120 exams. Human readers demonstrated a mean sensitivity of 90% and a mean specificity of 76%. AI performed comparably, with a sensitivity of 91% and a specificity of 77%.
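For readers unfamiliar with the two measures, the following is a minimal sketch of how sensitivity and specificity are computed from per-breast results. It assumes a simple cancer/no-cancer ground truth and a yes/no recall decision per breast. The function name and the toy data are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(truth, calls):
    """Return (sensitivity, specificity) for paired per-breast results.

    truth: list of bools, True if the breast has cancer (ground truth).
    calls: list of bools, True if the reader (human or AI) flagged it.
    """
    pairs = list(zip(truth, calls))
    true_pos = sum(1 for t, c in pairs if t and c)           # cancers flagged
    false_neg = sum(1 for t, c in pairs if t and not c)      # cancers missed
    true_neg = sum(1 for t, c in pairs if not t and not c)   # non-cancers cleared
    false_pos = sum(1 for t, c in pairs if not t and c)      # non-cancers flagged
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical toy data (NOT study data): 4 cancers, 6 non-cancers.
truth = [True, True, True, True, False, False, False, False, False, False]
calls = [True, True, True, False, False, False, False, False, True, True]

sens, spec = sensitivity_specificity(truth, calls)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this toy example, three of the four cancers are flagged and four of the six non-cancers are correctly cleared. Higher sensitivity means fewer missed cancers. Higher specificity means fewer unnecessary recalls.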

"The results of this study provide strong supporting evidence that AI for breast cancer screening can perform as well as human readers," Prof. Chen said.

Prof. Chen said more research is needed before AI can be used as a second reader in clinical practice.

"I think it is too early to say precisely how we will ultimately use AI in breast screening," she said. "The large prospective clinical trials that are ongoing will tell us more. But no matter how we use AI, the ability to provide ongoing performance monitoring will be crucial to its success."

Prof. Chen said it's important to recognize that AI performance can drift over time, and algorithms can be affected by changes in the operating environment.

"It's vital that imaging centers have a process in place to provide ongoing monitoring of AI once it becomes part of clinical practice," she said. "There are no other studies to date that have compared such a large number of human reader performance in routine quality assurance test sets to AI, so this study may provide a model for assessing AI performance in a real-world setting."

Chen Y, Taib AG, Darker IT, James JJ.
Performance of a Breast Cancer Detection AI Algorithm Using the Personal Performance in Mammographic Screening Scheme.
Radiology. 2023 Sep;308(3):e223299. doi: 10.1148/radiol.223299
