Deep Machine Learning Completes Information about the Bioactivity of One Million Molecules

The Structural Bioinformatics and Network Biology laboratory, led by ICREA Researcher Dr. Patrick Aloy, has completed the bioactivity information for a million molecules using deep machine-learning computational models. It has also released a tool to predict the biological activity of any molecule, even when no experimental data are available.

This new methodology is based on the Chemical Checker, the largest database of bioactivity profiles for drug-like molecules to date, developed by the same laboratory and published in 2020. The Chemical Checker collects information from 25 spaces of bioactivity for each molecule. These spaces are linked to the chemical structure of the molecule, the targets with which it interacts, or the changes it induces at the clinical or cellular level. However, this highly detailed information about the mechanism of action is incomplete for most molecules: for any given molecule there may be information for one or two spaces of bioactivity but not for all 25.

With this new development, researchers integrate all the experimental information available with deep machine learning methods, so that all the activity profiles, from chemistry to clinical level, for all molecules can be completed.
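The idea of completing a profile can be illustrated with a minimal sketch: keep the experimental data where they exist and fall back to a model prediction for every space that is empty. The space labels, the two-number vectors and the `dummy_predictor` function below are illustrative assumptions, not the Chemical Checker's actual data format or the laboratory's trained networks.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]

# 25 placeholder space labels (5 groups x 5 sub-spaces); labels are illustrative.
SPACES = [f"{group}{i}" for group in "ABCDE" for i in range(1, 6)]


def missing_spaces(profile: Dict[str, Optional[Vector]]) -> List[str]:
    """Return the spaces for which the molecule has no data."""
    return [space for space in SPACES if profile.get(space) is None]


def complete_profile(
    profile: Dict[str, Optional[Vector]],
    predictor: Callable[[str], Vector],
) -> Dict[str, Vector]:
    """Keep measured data; use the predictor only where data are missing."""
    completed = {}
    for space in SPACES:
        measured = profile.get(space)
        completed[space] = measured if measured is not None else predictor(space)
    return completed


def dummy_predictor(space: str) -> Vector:
    """Hypothetical stand-in for a trained model; returns a constant vector."""
    return [0.0, 0.0]


# A molecule with experimental data in only two of the 25 spaces.
profile = {"A1": [0.2, 0.9], "B4": [0.7, 0.1]}
completed = complete_profile(profile, dummy_predictor)

print(len(missing_spaces(profile)))    # spaces without data before completion
print(len(missing_spaces(completed)))  # spaces without data after completion
```

In a real setting the predictor would be a trained model rather than a constant, but the split between measured and inferred spaces would stay the same.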

"The new tool also allows us to forecast the bioactivity spaces of new molecules, and this is crucial in the drug discovery process as we can select the most suitable candidates and discard those that, for one reason or another, would not work," explains Dr. Aloy.

The software library is freely accessible to the scientific community at bioactivitysignatures.org and it will be regularly updated by the researchers as more biological activity data become available. With each update of experimental data in the Chemical Checker, artificial neural networks will also be revised to refine the estimates.

Predictions and reliability

The bioactivity data predicted by the model have a greater or lesser degree of reliability depending on various factors, including the volume of experimental data available and the characteristics of the molecule.

In addition to predicting aspects of activity at the biological level, the system developed by Dr. Aloy's team provides a measure of the degree of reliability of the prediction for each molecule. "All models are wrong, but some are useful! A measure of confidence allows us to better interpret the results and highlight which spaces of bioactivity of a molecule are accurate and in which ones an error rate can be contemplated," explains Dr. Martino Bertoni, first author of the work.
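As a rough illustration of how such a confidence measure might be used, the sketch below separates the predicted spaces of one molecule into those that meet a confidence cutoff and those that do not. The 0 to 1 confidence scale, the 0.5 threshold, the space names and the values are all assumptions made for the example; the article does not specify how the team's reliability score is computed or scaled.

```python
from typing import Dict, List, Tuple

# Each prediction pairs a predicted vector with a confidence score (assumed 0 to 1).
Prediction = Tuple[List[float], float]


def split_by_confidence(
    predictions: Dict[str, Prediction], threshold: float = 0.5
) -> Tuple[List[str], List[str]]:
    """Split space names into (reliable, uncertain) by a confidence cutoff."""
    reliable: List[str] = []
    uncertain: List[str] = []
    for space in sorted(predictions):
        _, confidence = predictions[space]
        if confidence >= threshold:
            reliable.append(space)
        else:
            uncertain.append(space)
    return reliable, uncertain


# Made-up predictions for a single molecule.
predictions = {
    "space_1": ([0.2, 0.9], 0.92),
    "space_2": ([0.5, 0.4], 0.35),
    "space_3": ([0.1, 0.3], 0.71),
}

reliable, uncertain = split_by_confidence(predictions, threshold=0.5)
print("reliable:", reliable)
print("uncertain:", uncertain)
```

The cutoff is a judgment call: a stricter threshold leaves fewer spaces to rely on but lowers the chance of acting on a poor prediction.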

Testing the system with the IRB Barcelona compound library

To validate the tool, the researchers searched the library of compounds at IRB Barcelona for those that could be good drug candidates to modulate the activity of a cancer-related transcription factor, SNAIL1. Its activity is almost impossible to modulate through the direct binding of drugs, so it is considered an 'undruggable' target. From a first set of 17,000 compounds, the deep machine learning models identified 131 whose predicted characteristics (in their dynamics, interaction with target cells and proteins, etc.) fit the target.

The ability of these compounds to degrade SNAIL1 was then tested experimentally. For a high percentage of them, the observed degradation capacity is consistent with what the models had predicted, thus validating the system.

This work was made possible by funding from the Government of Catalonia, the Spanish Ministry of Science and Innovation, the European Research Council, the European Commission, the State Research Agency and the ERDF.

Bertoni M, Duran-Frigola M, Badia-I-Mompel P, Pauls E, Orozco-Ruiz M, Guitart-Pla O, Alcalde V, Diaz VM, Berenguer-Llergo A, Brun-Heath I, Villegas N, de Herreros AG, Aloy P.
Bioactivity descriptors for uncharacterized chemical compounds.
Nat Commun. 2021 Jun 24;12(1):3932. doi: 10.1038/s41467-021-24150-4
