AI Matches Protein Interaction Partners

Proteins are the building blocks of life, involved in virtually every biological process. Understanding how proteins interact with each other is crucial for deciphering the complexities of cellular functions, and has significant implications for drug development and the treatment of diseases.

However, predicting which proteins bind together has been a challenging aspect of computational biology, primarily due to the vast diversity and complexity of protein structures. But a new study from the group of Ann-Florence Bitbol at EPFL might now change all that.

The team of scientists, including Umberto Lupo, Damiano Sgarbossa and Bitbol, has developed DiffPALM (Differentiable Pairing using Alignment-based Language Models), an AI-based approach that can significantly advance the prediction of interacting protein sequences. The study is published in PNAS.

DiffPALM leverages the power of protein language models, an advanced machine learning concept borrowed from natural language processing, to analyze and predict protein interactions among the members of two protein families with unprecedented accuracy. It uses these machine learning techniques to predict interacting protein pairs. This leads to a significant improvement over other methods that often require large, diverse datasets, and struggle with the complexity of eukaryotic protein complexes.

Another advantage of DiffPALM is its versatility, as it can work even with smaller sequence datasets and thus address rare proteins that have few homologs – proteins of different species that share common evolutionary ancestry. It relies on protein language models trained on multiple sequence alignments (MSAs), such as the MSA Transformer and AlphaFold's EvoFormer module, which allows it to understand and predict the complex interactions between proteins with a high degree of accuracy. Even more, using DiffPALM shows high promise when it comes to predicting the structure of protein complexes, which are intricate structures formed by the binding of multiple proteins, and are essential for many of the cell’s processes.

In the study, the team compared DiffPALM with traditional coevolution-based pairing methods, which study how protein sequences evolve together over time when they interact closely – changes in one protein can lead to changes in its interacting partner. This is an extremely important aspect of molecular and cell biology, which is well-captured by protein language models trained on MSAs. DiffPALM is shown to outperform traditional methods Top of Formon challenging benchmarks, demonstrating its robustness and efficiency.

The application of DiffPALM is obvious in the field of basic protein biology, but extends beyond it, as it has the potential to become a powerful tool in medical research and drug development. For instance, accurately predicting protein interactions can help understand disease mechanisms and develop targeted therapies.

The researchers have made DiffPALM freely available, hoping that the scientific community adopts it widely to further advancements in computational biology and enable researchers to explore the complexities of protein interactions.

By combining advanced machine learning techniques and efficient handling of complex biological data, DiffPALM marks a significant leap forward in computational biology. It not only enhances our understanding of protein interactions but also opens up new avenues in medical research, potentially leading to breakthroughs in disease treatment and drug development.

Umberto Lupo, Damiano Sgarbossa, Anne-Florence Bitbol.
Pairing interacting protein sequences using masked language modeling.
PNAS 24 June 2024. doi: 10.1073/pnas.2311887121

Most Popular Now

AI System Helps Doctors Identify Patient…

A new study from Vanderbilt University Medical Center shows that clinical alerts driven by artificial intelligence (AI) can help doctors identify patients at risk for suicide, potentially improving prevention efforts...

Smartphone App can Help Reduce Opioid Us…

Patients with opioid use disorder can reduce their days of opioid use and stay in treatment longer when using a smartphone app as supportive therapy in combination with medication, a...

AI's New Move: Transforming Skin Ca…

Pioneering research has unveiled a powerful new tool in the fight against skin cancer, combining cutting-edge artificial intelligence (AI) with deep learning to enhance the precision of skin lesion classification...

Leveraging AI to Assist Clinicians with …

Physical examinations are important diagnostic tools that can reveal critical insights into a patient's health, but complex conditions may be overlooked if a clinician lacks specialized training in that area...

AI can Improve Ovarian Cancer Diagnoses

A new international study led by researchers at Karolinska Institutet in Sweden shows that AI-based models can outperform human experts at identifying ovarian cancer in ultrasound images. The study is...

Predicting the Progression of Autoimmune…

Autoimmune diseases, where the immune system mistakenly attacks the body's own healthy cells and tissues, often have a preclinical stage before diagnosis that’s characterized by mild symptoms or certain antibodies...

Major EU Project to Investigate Societal…

A new €3 million EU research project led by University College Dublin (UCD) Centre for Digital Policy will explore the benefits and risks of Artificial Intelligence (AI) from a societal...

New AI Tool Uses Routine Blood Tests to …

Doctors around the world may soon have access to a new tool that could better predict whether individual cancer patients will benefit from immune checkpoint inhibitors - a type of...

Using AI to Uncover Hospital Patients�…

Across the United States, no hospital is the same. Equipment, staffing, technical capabilities, and patient populations can all differ. So, while the profiles developed for people with common conditions may...

New Method Tracks the 'Learning Cur…

Introducing Annotatability - a powerful new framework to address a major challenge in biological research by examining how artificial neural networks learn to label genomic data. Genomic datasets often contain...