New Method Tracks the 'Learning Curve' of AI to Decode Complex Genomic Data

Annotatability is a new framework that addresses a major challenge in biological research by examining how artificial neural networks learn to label genomic data. Genomic datasets often contain vast numbers of annotated samples, but many of these samples are annotated incorrectly or ambiguously. Borrowing from recent advances in natural language processing and computer vision, the research team used artificial neural networks (ANNs) in an unconventional way: instead of merely using the ANNs to make predictions, the group inspected how difficult it was for the networks to learn to label different biological samples. Much like assessing why students find some examples harder than others, the team then used this source of information to identify mismatches in cell annotations, improve data interpretation, and uncover key cellular pathways linked to development and disease. Annotatability provides a more accurate method for analyzing single-cell genomic data, offering significant potential for advancing biological research and, in the longer term, improving disease diagnosis and treatment.

A new study led by Jonathan Karin, Reshef Mintz, Dr. Barak Raveh and Dr. Mor Nitzan from Hebrew University, published in Nature Computational Science, introduces a framework for interpreting single-cell and spatial omics data by monitoring the training dynamics of deep neural networks. The research aims to address the inherent ambiguities in cell annotations and offers a novel approach to understanding complex biological data.

Single-cell and spatial omics data have transformed our ability to explore cellular diversity and cellular behaviors in health and disease. However, the interpretation of these high-dimensional datasets is challenging, primarily due to the difficulty of assigning discrete and accurate annotations, such as cell types or states, to heterogeneous cell populations. These annotations are often subjective, noisy, and incomplete, making it difficult to extract meaningful insights from the data.

The researchers developed a new framework, Annotatability, which helps identify mismatches in cell annotations and better characterizes biological data structures. By monitoring the dynamics and difficulty of training a deep neural network over annotated data, Annotatability identifies areas where cell annotations are ambiguous or erroneous. The approach also highlights intermediate cell states and the complex, continuous nature of cellular development.
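The general idea of monitoring training dynamics can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple softmax regression in place of a deep neural network, synthetic data in place of gene expression profiles, and hypothetical function names (training_dynamics, summarize). For each sample it records, at every epoch, the probability the model assigns to that sample's given label, then summarizes the trajectory as a mean ("confidence") and a standard deviation ("variability"). Samples whose labels were deliberately flipped should end up with low confidence.

```python
import numpy as np

def training_dynamics(X, y, n_classes, epochs=50, lr=0.5, seed=0):
    """Train a softmax regression and record, per epoch, the probability
    the model assigns to each sample's *given* label.
    (Illustrative stand-in for a deep network; an assumption of this sketch.)"""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.01, size=(X.shape[1], n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[y]
    history = np.empty((epochs, X.shape[0]))
    for epoch in range(epochs):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        # Probability of the assigned label, before this epoch's update
        history[epoch] = probs[np.arange(len(y)), y]
        grad = (probs - onehot) / len(y)  # cross-entropy gradient w.r.t. logits
        W -= lr * X.T @ grad
        b -= lr * grad.sum(axis=0)
    return history

def summarize(history):
    """Per-sample mean and standard deviation of the label probability over epochs."""
    return history.mean(axis=0), history.std(axis=0)

# Synthetic "cells": two well-separated clusters, with 20 labels flipped on purpose.
rng = np.random.default_rng(1)
n = 200
X = np.vstack([rng.normal(-2, 1, (n, 5)), rng.normal(2, 1, (n, 5))])
y_true = np.repeat([0, 1], n)
y_noisy = y_true.copy()
flipped = rng.choice(2 * n, size=20, replace=False)
y_noisy[flipped] = 1 - y_noisy[flipped]

history = training_dynamics(X, y_noisy, n_classes=2)
confidence, variability = summarize(history)

clean = np.ones(2 * n, dtype=bool)
clean[flipped] = False
print("mean confidence, clean labels:  ", round(float(confidence[clean].mean()), 3))
print("mean confidence, flipped labels:", round(float(confidence[flipped].mean()), 3))
```

In this toy setting, low-confidence samples are candidates for erroneous annotations, while samples with intermediate confidence or high variability would be candidates for ambiguous or intermediate states. How the published framework defines and thresholds these categories is described in the paper itself.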

As part of the study, the team introduced a signal-aware graph embedding method that enables more precise downstream analysis of biological signals. This technique captures cellular communities associated with target signals and facilitates the exploration of cellular heterogeneity, developmental pathways, and disease trajectories.

The study demonstrates the applicability of Annotatability across a range of single-cell RNA sequencing and spatial omics datasets. Notable findings include the identification of erroneous annotations, delineation of developmental and disease-related cell states, and better characterization of cellular heterogeneity. The results highlight the potential of this framework for unraveling complex cellular behaviors and advancing our understanding of both health and disease at the single-cell level.

The researchers' work represents a significant step forward in genomic data interpretation, offering a powerful tool for unraveling cellular diversity and enhancing our ability to study the dynamics of health and disease.

Karin J, Mintz R, Raveh B, Nitzan M.
Interpreting single-cell and spatial omics data using deep neural network training dynamics.
Nat Comput Sci. 2024 Dec;4(12):941-954. doi: 10.1038/s43588-024-00721-5
