A Machine Learning Tool for Diagnosing, Monitoring Colorectal Cancer

Scientists aiming to advance cancer diagnostics have developed a machine learning tool that can identify metabolism-related differences in molecular profiles between patients with colorectal cancer and healthy people.

The analysis of biological samples from more than 1,000 people also revealed metabolic shifts associated with changing disease severity and with genetic mutations known to increase the risk for colorectal cancer.

Though there is more analysis to come, the resulting "biomarker discovery pipeline" shows promise as a noninvasive method of diagnosing colorectal cancer and monitoring disease progression, said Jiangjiang Zhu, co-senior author of the study and an associate professor of human sciences at The Ohio State University.

"We believe this is a good tool for disease diagnostics and monitoring, especially because metabolic-based biomarker analysis could also be utilized to monitor treatment effectiveness," said Zhu, also an investigator in The Ohio State University Comprehensive Cancer Center Molecular Carcinogenesis and Chemoprevention Research Program.

"When a patient is taking drug A versus drug B, especially for cancer, time is essential. If they don’t have a good response, we want to know that as soon as possible so we can change the treatment regimen. If metabolites can help indicate a treatment's effectiveness faster than traditional methods like pathology or protein markers, we hope they could be good indicators for doctors who are caring for patients."

The tool is not intended to replace colonoscopy as the gold standard for cancer screening, Zhu said, and further study with additional samples is planned before the pipeline would be ready for translation to a clinical setting.

The research was published recently in the journal iMetaOmics.

This work also represents an advance in machine learning techniques, combining two established methods to design the new platform: partial least squares-discriminant analysis (PLS-DA) for big-picture differentiation of molecular profiles, and an artificial neural network (ANN) that, in this case, pinpoints molecules that improve the platform’s predictive value. The team called the resulting biomarker pipeline PANDA, short for PLS-ANN-DA.

"We took the best of both worlds and put them together to leverage their strengths and complement each other to offset their potential weaknesses," Zhu said. "We were looking at all kinds of possibilities to tease out the biomarkers that could be predictive or indicative of disease progression and the different stages of the disease. That gave us some strong confidence that this method has great potential for future diagnoses."

Two sets of biological data extracted from blood samples were analyzed: metabolites, products of biochemical reactions that break down food to produce energy and perform other essential functions, and transcripts, RNA readouts of DNA instructions that predict related protein changes.

The biological samples are a significant part of the study’s strength, Zhu said, because they were collected as part of large research projects: The Ohio Colorectal Cancer Prevention Initiative (OCCPI) and an Ohio State Wexner Medical Center clinical laboratory biobank. In all, 626 samples came from people with colorectal cancer - including patients with high-risk genetic mutations. Another 402 samples from age- and gender-matched healthy individuals were obtained by Jieli Li, co-senior study author and associate professor-clinical of pathology in Ohio State’s College of Medicine.

"We, as humans, at different stages of our lives, actually have quite different biochemistry," Zhu said. "This valuable collection of samples enabled us to run high-throughput metabolomics analysis to understand the molecular changes from people who don't have cancer with people who have cancer, and also from early-stage to late-stage disease.

"We also have data from patients with genetic mutations that we can compare to the metabolite data to look at whether metabolic changes are an indication of predictive values for the genetic mutations. To our knowledge, this is the first time this has been done at this scope and scale because we are looking at literally hundreds of patients."

Biomarkers are tricky to rely on for diagnostics across different populations because many conditions affect molecular profiles in living systems. This study therefore highlights several molecular changes that show potential, though not certainty, for assessing the presence and progression of colorectal cancer in a nationally representative group of patients.

Metabolic pathways linked to purines, a family of compounds needed for DNA formation and degradation, stood out in the analysis: they were more active overall in cancer patients than in healthy controls, and less active at more advanced tumor stages.

"It's certainly an indication that this biomarker may be associated with the underlying mechanisms of cancer biology," Zhu said. "We are cautiously optimistic in saying that we’re not only doing biomarker discovery, but we’re also providing clues for mechanistic investigations."

The team plans to continue analyzing metabolites related to different types of biological signals to refine the PANDA biomarker pipeline.

"Some of the markers we identified are a little bit finicky, and there’s a lot of noise within those signals, but we have pushed the field forward to develop potential next-generation biomarkers and the novel bioinformatics pipeline for colorectal cancer diagnosis and monitoring," Zhu said.

This work was supported by the National Institute of General Medical Sciences, an Ohio State fellowship and Pelotonia, which funded the statewide OCCPI. Zhu is also supported by the Provost’s Scarlet and Gray Associate Professor Program at Ohio State.

Xu R, Jung H, Choueiry F, Zhang S, Pearlman R, Hampel H, Jin N, Li J, Zhu J.
Novel machine-learning bioinformatics reveal distinct metabolic alterations for enhanced colorectal cancer diagnosis and monitoring.
iMetaOmics, 2: e70003, 2025. doi: 10.1002/imo2.70003
