A Machine Learning Tool for Diagnosing, Monitoring Colorectal Cancer

Scientists aiming to advance cancer diagnostics have developed a machine learning tool that can identify metabolism-related differences in molecular profiles between patients with colorectal cancer and healthy people.

The analysis of biological samples from more than 1,000 people also revealed metabolic shifts associated with changing disease severity and with genetic mutations known to increase the risk for colorectal cancer.

Though there is more analysis to come, the resulting "biomarker discovery pipeline" shows promise as a noninvasive method of diagnosing colorectal cancer and monitoring disease progression, said Jiangjiang Zhu, co-senior author of the study and an associate professor of human sciences at The Ohio State University.

"We believe this is a good tool for disease diagnostics and monitoring, especially because metabolic-based biomarker analysis could also be utilized to monitor treatment effectiveness," said Zhu, also an investigator in The Ohio State University Comprehensive Cancer Center Molecular Carcinogenesis and Chemoprevention Research Program.

"When a patient is taking drug A versus drug B, especially for cancer, time is essential. If they don’t have a good response, we want to know that as soon as possible so we can change the treatment regimen. If metabolites can help indicate a treatment's effectiveness faster than traditional methods like pathology or protein markers, we hope they could be good indicators for doctors who are caring for patients."

The tool is not intended to replace colonoscopy as the gold standard for cancer screening, Zhu said, and further study with additional samples is planned before the pipeline would be ready for translation to a clinical setting.

The research was published recently in the journal iMetaOmics.

This work also represents an advance in machine learning techniques, combining two established methods to design the new platform: partial least squares-discriminant analysis (PLS-DA) for big-picture differentiation of molecular profiles, and an artificial neural network (ANN) that, in this case, pinpoints molecules that improve the platform’s predictive value. The team called the resulting biomarker pipeline PANDA, short for PLS-ANN-DA.

"We took the best of both worlds and put them together to leverage their strengths and complement each other to offset their potential weaknesses," Zhu said. "We were looking at all kinds of possibilities to tease out the biomarkers that could be predictive or indicative of disease progression and the different stages of the disease. That gave us some strong confidence that this method has great potential for future diagnoses."

Two sets of biological data extracted from blood samples were analyzed: metabolites, products of biochemical reactions that break down food to produce energy and perform other essential functions, and transcripts, RNA readouts of DNA instructions that predict related protein changes.

The biological samples are a significant part of the study’s strength, Zhu said, because they were collected as part of large research projects: The Ohio Colorectal Cancer Prevention Initiative (OCCPI) and an Ohio State Wexner Medical Center clinical laboratory biobank. In all, 626 samples came from people with colorectal cancer - including patients with high-risk genetic mutations. Another 402 samples from age- and gender-matched healthy individuals were obtained by Jieli Li, co-senior study author and associate professor-clinical of pathology in Ohio State’s College of Medicine.

"We, as humans, at different stages of our lives, actually have quite different biochemistry," Zhu said. "This valuable collection of samples enabled us to run high-throughput metabolomics analysis to understand the molecular changes from people who don't have cancer with people who have cancer, and also from early-stage to late-stage disease.

"We also have data from patients with genetic mutations that we can compare to the metabolite data to look at whether metabolic changes are an indication of predictive values for the genetic mutations. To our knowledge, this is the first time this has been done at this scope and scale because we are looking at literally hundreds of patients."

Biomarkers are tricky to rely on for diagnostics across different populations because many conditions affect molecular profiles in living systems. This study therefore highlights several molecular changes with potential, but not certainty, for assessing colorectal cancer's presence and progression in a regionally representative group of patients.

Metabolic pathways linked to purines, a family of compounds needed for DNA formation and degradation, stood out in the analysis: they were more active overall in cancer patients than in healthy controls, and less active at more advanced tumor stages.

"It's certainly an indication that this biomarker may be associated with the underlying mechanisms of cancer biology," Zhu said. "We are cautiously optimistic in saying that we’re not only doing biomarker discovery, but we’re also providing clues for mechanistic investigations."

The team plans to continue analyzing metabolites related to different types of biological signals to refine the PANDA biomarker pipeline.

"Some of the markers we identified are a little bit finicky, and there’s a lot of noise within those signals, but we have pushed the field forward to develop potential next-generation biomarkers and the novel bioinformatics pipeline for colorectal cancer diagnosis and monitoring," Zhu said.

This work was supported by the National Institute of General Medical Sciences, an Ohio State fellowship and Pelotonia, which funded the statewide OCCPI. Zhu is also supported by the Provost’s Scarlet and Gray Associate Professor Program at Ohio State.

Xu R, Jung H, Choueiry F, Zhang S, Pearlman R, Hampel H, Jin N, Li J, Zhu J.
Novel machine-learning bioinformatics reveal distinct metabolic alterations for enhanced colorectal cancer diagnosis and monitoring.
iMetaOmics, 2: e70003, 2025. doi: 10.1002/imo2.70003
