Welcome Evo, Generative AI for the Genome

Brian Hie runs the Laboratory of Evolutionary Design at Stanford, where he works at the crossroads of artificial intelligence and biology. Not long ago, Hie pondered a provocative question: If a tool like ChatGPT can write original sentences based on patterns found in massive collections of previously written words, what happens if we replace written words with genetic code?

The answer to that seemingly simple question has become Evo, a generative AI model that writes genetic code. Hie and his colleagues at the Arc Institute and the University of California, Berkeley, introduced Evo in a paper in the journal Science. Hie says that researchers might use Evo to understand how microbial and viral genomes work, to fashion new proteins (for example, drugs) that never existed before, and to reprogram microbes to accomplish remarkable tasks, from improving photosynthesis for carbon sequestration and higher crop yields to gobbling up microplastics from the oceans.

"Instead of having to use brute force testing or mining promising sequences from nature, all of which are quite unpredictable, we now have an AI model for generating systems of interest, allowing researchers to focus only on the most promising possibilities," said Hie, assistant professor of chemical engineering. "Evo puts the genomes of whole lifeforms within reach and accelerates the bioengineering design process."

Evo could even lead to a deeper understanding of evolution itself, new insight into genetic diseases, and new treatments, all achieved on a computer rather than in a lab.

Natural insight

The inspiration comes from nature itself. The instructions of all life are encoded in DNA. A better understanding of the complex interplay of DNA, RNA, and proteins, and of how they have evolved over time, will lead to deeper knowledge and to the ability to reprogram microbes into useful technologies.

But it is not as easy as it seems. Even simple microbes have complex genomes with millions of base pairs. Evo makes two key advances over similar existing tools: it expands the length of sequence a model can process at once, known as the "context window," from roughly 8,000 base pairs to more than 131,000 base pairs, and it improves the resolution to the scale of individual nucleotides, the building blocks of DNA.
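To make these two advances concrete, here is a minimal Python sketch. It is not Evo's actual code: the four-letter vocabulary `VOCAB` and the helper names `tokenize` and `windows_needed` are hypothetical, and the window sizes are the round figures quoted above. It shows what single-nucleotide resolution means (one token per base) and how many separate windows a genome of a given length would have to be cut into.

```python
import math

# Hypothetical single-nucleotide vocabulary for illustration only;
# Evo's real tokenizer may differ.
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3}

OLD_WINDOW = 8_000    # approximate context of earlier tools, per the article
EVO_WINDOW = 131_000  # Evo's context window, rounded as in the article


def tokenize(sequence: str) -> list[int]:
    """Map each nucleotide to one token (single-nucleotide resolution)."""
    return [VOCAB[base] for base in sequence.upper()]


def windows_needed(genome_length: int, window: int) -> int:
    """Number of non-overlapping windows needed to cover a genome."""
    return math.ceil(genome_length / window)


genome_length = 2_000_000  # an illustrative 2-million-base-pair genome
print(tokenize("ACGT"))                            # one token per base
print(windows_needed(genome_length, OLD_WINDOW))   # windows at ~8,000 bp
print(windows_needed(genome_length, EVO_WINDOW))   # windows at ~131,000 bp
```

Under these assumptions, the illustrative 2-million-base-pair genome spans 250 of the older windows but only 16 of Evo's, so the model sees far more of the genome at once.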

Evo was trained on the genomes of 80,000 microbes and 2.7 million prokaryotic and phage genomes, covering 300 billion nucleotides, as well as on smaller DNA loops known as plasmids. To preempt the use of Evo for the development of bioweapons, however, the team had to exclude the genomes of viruses known to infect humans and certain other organisms.

Evo is able to learn how small changes in nucleotide sequences affect the evolutionary fitness of whole organisms and generate DNA sequences of more than 1 million base pairs - more than seven times the context window of 131,000 base pairs, Hie added. By comparison, the smallest “minimal” bacterial genomes are about 580,000 base pairs in length, the researchers note.
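The size comparisons in this paragraph are simple ratios. The short Python check below uses only the round numbers quoted here, treating "more than 1 million" as a lower bound; the variable names are illustrative.

```python
generated_length = 1_000_000  # base pairs Evo can generate (lower bound), per the article
context_window = 131_000      # Evo's context window, per the article
minimal_genome = 580_000      # smallest "minimal" bacterial genome, per the article

# How much longer is a generated sequence than one context window?
ratio_to_context = generated_length / context_window
# How does it compare with the smallest known bacterial genomes?
ratio_to_minimal = generated_length / minimal_genome

print(f"{ratio_to_context:.2f}x the context window")
print(f"{ratio_to_minimal:.2f}x a minimal bacterial genome")
```

The first ratio comes out at about 7.63, consistent with "more than seven times," and the second at about 1.72, meaning a generated sequence can exceed the length of a minimal bacterial genome.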

Proof of concept

As a proof of concept of Evo's design capabilities, Hie and colleagues prompted Evo to generate novel synthetic CRISPR-Cas molecular complexes and systems. CRISPR-Cas systems are like tiny molecular machines that use proteins and RNA in tandem to edit DNA. In response to that prompt, Evo created a fully functional, previously unknown CRISPR system that was validated after testing 11 possible designs. Evo's CRISPR exploration is the first example of simultaneous protein-RNA codesign using a language model, Hie noted.

Next up, Hie is already working on expanding Evo's ability to process larger genomic sequences and on achieving greater control over its outputs, as well as on broadening his research beyond the microbial world to human and other genomes.

"Evo opens up a lot of very interesting research at the intersection of machine learning and biology," Hie said. "It creates opportunities for discoveries that were previously unimaginable and accelerates our ability to engineer life itself."

Evo is open source and publicly available for interested researchers to download.

The research was supported by the Fannie and John Hertz Foundation; National Science Foundation Graduate Fellowship Program; National Center for Advancing Translational Sciences of the National Institutes of Health; National Institutes of Health; National Science Foundation grants; US DEVCOM Army Research Laboratory grants; Office of Naval Research; Stanford HAI; NXP, Xilinx, LETI-CEA, Intel, IBM, Microsoft, NEC, Toshiba, TSMC, ARM, Hitachi, BASF, Accenture, Ericsson, Qualcomm, Analog Devices, Google Cloud, Salesforce, Total, the HAI-GCP Cloud Credits for Research program, the Stanford Data Science Initiative, and members of the Stanford DAWN project: Meta, Google, and VMWare; the Arc Institute; the Rainwater Foundation; the Curci Foundation; Rose Hill Investigators Program; V. and N. Khosla; S. Altman; anonymous gifts to the Hsu laboratory; V. Gupta; and R. Tonsing.

Nguyen E, Poli M, Durrant MG, Kang B, Katrekar D, Li DB, Bartie LJ, Thomas AW, King SH, Brixi G, Sullivan J, Ng MY, Lewis A, Lou A, Ermon S, Baccus SA, Hernandez-Boussard T, Ré C, Hsu PD, Hie BL.
Sequence modeling and design from molecular to genome scale with Evo.
Science. 2024 Nov 15;386(6723):eado9336. doi: 10.1126/science.ado9336
