Welcome Evo, Generative AI for the Genome

Brian Hie runs the Laboratory of Evolutionary Design at Stanford, where he works at the crossroads of artificial intelligence and biology. Not long ago, Hie pondered a provocative question: If a tool like ChatGPT can write original sentences based on patterns found in massive collections of previously written words, what happens if we replace written words with genetic code?

The answer to that seemingly simple question has become Evo, a generative AI model that writes genetic code. Hie and his colleagues at the Arc Institute and the University of California, Berkeley, introduced Evo in a paper in the journal Science. Hie says that researchers might use Evo to understand how microbial and viral genomes work, to fashion new proteins, such as drugs, that never existed before, and to reprogram microbes to accomplish remarkable tasks, from improving photosynthesis for carbon sequestration and higher crop yields to gobbling up microplastics from the oceans.

"Instead of having to use brute force testing or mining promising sequences from nature, all of which are quite unpredictable, we now have an AI model for generating systems of interest, allowing researchers to focus only on the most promising possibilities," said Hie, assistant professor of chemical engineering. "Evo puts the genomes of whole lifeforms within reach and accelerates the bioengineering design process."

Evo could even lead to a deeper understanding of evolution itself, new insights into genetic diseases, and new treatments, all achieved on a computer rather than in a lab.

Natural insight

The inspiration comes from nature itself. The instructions of all life are encoded in DNA. A better understanding of the complex interplay of DNA, RNA, and proteins, and of how they have evolved over time, will lead to deeper knowledge and the ability to reprogram microbes for useful purposes.

But the task is harder than it sounds. Even simple microbes have complex genomes with millions of base pairs. Evo makes two key advances over similar existing tools: it expands the length of sequence a model can process at once, known as the "context window," from roughly 8,000 base pairs to more than 131,000 base pairs, and it improves the resolution to the scale of individual nucleotides, the building blocks of DNA.

Evo was trained on the genomes of 80,000 microbes and 2.7 million prokaryotic and phage genomes, covering 300 billion nucleotides, as well as on smaller DNA loops known as plasmids. To preempt the use of Evo for the development of bioweapons, however, the team had to exclude the genomes of viruses known to infect humans and certain other organisms.

Evo can learn how small changes in nucleotide sequences affect the evolutionary fitness of whole organisms, and it can generate DNA sequences of more than 1 million base pairs, more than seven times the context window of 131,000 base pairs, Hie added. By comparison, the smallest “minimal” bacterial genomes are about 580,000 base pairs in length, the researchers note.
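For readers who want to check the scale comparisons, the short Python sketch below reproduces the arithmetic from the rounded figures quoted in this article. The constant and function names are illustrative assumptions, not part of Evo or its code, and because several of the figures are reported as "more than" or "about," the ratios are approximate.

```python
# Rounded figures quoted in the article; constant names are illustrative only.
PRIOR_CONTEXT_BP = 8_000        # "roughly 8,000 base pairs"
EVO_CONTEXT_BP = 131_000        # "more than 131,000 base pairs"
EVO_GENERATED_BP = 1_000_000    # "more than 1 million base pairs"
MINIMAL_GENOME_BP = 580_000     # "about 580,000 base pairs"


def fold(numerator_bp: int, denominator_bp: int) -> float:
    """Return how many times larger one sequence length is than another."""
    return numerator_bp / denominator_bp


context_gain = fold(EVO_CONTEXT_BP, PRIOR_CONTEXT_BP)
generated_vs_context = fold(EVO_GENERATED_BP, EVO_CONTEXT_BP)
generated_vs_minimal = fold(EVO_GENERATED_BP, MINIMAL_GENOME_BP)

print(f"Context window vs. earlier tools: {context_gain:.1f}x")
print(f"Generated length vs. context window: {generated_vs_context:.1f}x")
print(f"Generated length vs. minimal genome: {generated_vs_minimal:.1f}x")
```

On these rounded numbers, the context window is roughly 16 times longer than that of earlier tools, and a 1-million-base-pair generation is about 7.6 times the context window and about 1.7 times the length of the smallest minimal bacterial genome.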

Proof of concept

As a proof of concept of Evo's design capabilities, Hie and colleagues prompted Evo to generate novel synthetic CRISPR-Cas molecular complexes and systems. CRISPR-Cas systems are like tiny molecular machines that use proteins and RNA in tandem to edit DNA. In response to that prompt, Evo created a fully functional, previously unknown CRISPR system that was validated after testing 11 possible designs. Evo's CRISPR exploration is the first example of simultaneous protein-RNA codesign using a language model, Hie noted.

Next up, Hie is working to expand Evo's ability to process larger genomic sequences, to achieve greater control over its outputs, and to broaden his research beyond the microbial world to human and other genomes.

"Evo opens up a lot of very interesting research at the intersection of machine learning and biology," Hie said. "It creates opportunities for discoveries that were previously unimaginable and accelerates our ability to engineer life itself."

Evo is open source and publicly available for interested researchers to download.

The research was supported by the Fannie and John Hertz Foundation; National Science Foundation Graduate Fellowship Program; National Center for Advancing Translational Sciences of the National Institutes of Health; National Institutes of Health; National Science Foundation grants; US DEVCOM Army Research Laboratory grants; Office of Naval Research; Stanford HAI; NXP, Xilinx, LETI-CEA, Intel, IBM, Microsoft, NEC, Toshiba, TSMC, ARM, Hitachi, BASF, Accenture, Ericsson, Qualcomm, Analog Devices, Google Cloud, Salesforce, Total, the HAI-GCP Cloud Credits for Research program, the Stanford Data Science Initiative, and members of the Stanford DAWN project: Meta, Google, and VMWare; the Arc Institute; the Rainwater Foundation; the Curci Foundation; Rose Hill Investigators Program; V. and N. Khosla; S. Altman; anonymous gifts to the Hsu laboratory; V. Gupta; and R. Tonsing.

Nguyen E, Poli M, Durrant MG, Kang B, Katrekar D, Li DB, Bartie LJ, Thomas AW, King SH, Brixi G, Sullivan J, Ng MY, Lewis A, Lou A, Ermon S, Baccus SA, Hernandez-Boussard T, Ré C, Hsu PD, Hie BL.
Sequence modeling and design from molecular to genome scale with Evo.
Science. 2024 Nov 15;386(6723):eado9336. doi: 10.1126/science.ado9336
