Welcome Evo, Generative AI for the Genome

Brian Hie runs the Laboratory of Evolutionary Design at Stanford, where he works at the crossroads of artificial intelligence and biology. Not long ago, Hie pondered a provocative question: If a tool like ChatGPT can write original sentences based on patterns found in massive collections of previously written words, what happens if we replace written words with genetic code?

The answer to that seemingly simple question has become Evo, a generative AI model that writes genetic code. Hie and his colleagues at the Arc Institute and the University of California, Berkeley, introduced Evo in a paper in the journal Science. Hie says that researchers might use Evo to understand how microbial and viral genomes work, to fashion new proteins (i.e., drugs) that never existed before, and to reprogram microbes to accomplish remarkable tasks, from improving photosynthesis for carbon sequestration and higher crop yields to gobbling up microplastics from the oceans.

"Instead of having to use brute force testing or mining promising sequences from nature, all of which are quite unpredictable, we now have an AI model for generating systems of interest, allowing researchers to focus only on the most promising possibilities," said Hie, assistant professor of chemical engineering. "Evo puts the genomes of whole lifeforms within reach and accelerates the bioengineering design process."

Evo could even lead to deeper understanding of evolution itself, new understandings of genetic diseases, and new treatments – all achieved on a computer rather than in a lab.

Natural insight

The inspiration comes from nature itself. The instructions of all life are encoded in DNA. Better understanding of the complex interplay of DNA, RNA, and bioproteins - and how they have evolved over time - will lead to deeper knowledge and the ability to reprogram the microbes into useful technologies.

But all is not so easy as it seems. Even simple microbes have complex genomes with millions of base pairs. Two of Evo’s key advances compared to similar existing tools are expanding the length of sequences models can process at once from roughly 8,000 base pairs to more than 131,000 base pairs - known as the "context window" - and improving the resolution to the scale of individual nucleotides, the building blocks of DNA.

Evo was trained on the genomes of 80,000 microbes and 2.7 million prokaryotic and phage genomes, covering 300 billion nucleotides, as well as on smaller DNA loops known as plasmids. To preempt the use of Evo for the development of bioweapons, however, the team had to exclude the genomes of viruses known to infect humans and certain other organisms.

Evo is able to learn how small changes in nucleotide sequences affect the evolutionary fitness of whole organisms and generate DNA sequences of more than 1 million base pairs - more than seven times the context window of 131,000 base pairs, Hie added. By comparison, the smallest “minimal” bacterial genomes are about 580,000 base pairs in length, the researchers note.

Proof of concept

As a proof of concept of Evo's design capabilities, Hie and colleagues prompted Evo to generate novel synthetic CRISPR-Cas molecular complexes and systems. CRISPR-Cas systems are like tiny molecular machines that use proteins and RNA in tandem to edit DNA. In response to that prompt, Evo created a fully functional, previously unknown CRISPR system that was validated after testing 11 possible designs. Evo's CRISPR exploration is the first example of simultaneous protein-RNA codesign using a language model, Hie noted.

Next up, Hie is already working on expanding Evo's ability to process larger genomic sequences as well as to achieve greater control over its outputs, as well as to broaden his research beyond the microbial world to human and other genomes.

"Evo opens up a lot of very interesting research at the intersection of machine learning and biology," Hie said. "It creates opportunities for discoveries that were previously unimaginable and accelerates our ability to engineer life itself."

Evo is open source and publicly available for interested researchers to download.

The research was supported by the Fannie and John Hertz Foundation; National Science Foundation Graduate Fellowship Program; National Center for Advancing Translational Sciences of the National Institutes of Health; National Institutes of Health; National Science Foundation grants; US DEVCOM Army Research Laboratory grants; Office of Naval Research; Stanford HAI; NXP, Xilinx, LETI-CEA, Intel, IBM, Microsoft, NEC, Toshiba, TSMC, ARM, Hitachi, BASF, Accenture, Ericsson, Qualcomm, Analog Devices, Google Cloud, Salesforce, Total, the HAI-GCP Cloud Credits for Research program, the Stanford Data Science Initiative, and members of the Stanford DAWN project: Meta, Google, and VMWare; the Arc Institute; the Rainwater Foundation; the Curci Foundation; Rose Hill Investigators Program; V. and N. Khosla; S. Altman; anonymous gifts to the Hsu laboratory; V. Gupta; and R. Tonsing.

Nguyen E, Poli M, Durrant MG, Kang B, Katrekar D, Li DB, Bartie LJ, Thomas AW, King SH, Brixi G, Sullivan J, Ng MY, Lewis A, Lou A, Ermon S, Baccus SA, Hernandez-Boussard T, Ré C, Hsu PD, Hie BL.
Sequence modeling and design from molecular to genome scale with Evo.
Science. 2024 Nov 15;386(6723):eado9336. doi: 10.1126/science.ado9336

Most Popular Now

Unlocking the 10 Year Health Plan

The government's plan for the NHS is a huge document. Jane Stephenson, chief executive of SPARK TSL, argues the key to unlocking its digital ambitions is to consider what it...

Alcidion Grows Top Talent in the UK, wit…

Alcidion has today announced the addition of three new appointments to their UK-based team, with one internal promotion and two external recruits. Dr Paul Deffley has been announced as the...

AI can Find Cancer Pathologists Miss

Men assessed as healthy after a pathologist analyses their tissue sample may still have an early form of prostate cancer. Using AI, researchers at Uppsala University have been able to...

AI, Full Automation could Expand Artific…

Automated insulin delivery (AID) systems such as the UVA Health-developed artificial pancreas could help more type 1 diabetes patients if the devices become fully automated, according to a new review...

How AI could Speed the Development of RN…

Using artificial intelligence (AI), MIT researchers have come up with a new way to design nanoparticles that can more efficiently deliver RNA vaccines and other types of RNA therapies. After training...

MIT Researchers Use Generative AI to Des…

With help from artificial intelligence, MIT researchers have designed novel antibiotics that can combat two hard-to-treat infections: drug-resistant Neisseria gonorrhoeae and multi-drug-resistant Staphylococcus aureus (MRSA). Using generative AI algorithms, the research...

AI Hybrid Strategy Improves Mammogram In…

A hybrid reading strategy for screening mammography, developed by Dutch researchers and deployed retrospectively to more than 40,000 exams, reduced radiologist workload by 38% without changing recall or cancer detection...

New Training Year Starts at Siemens Heal…

In September, 197 school graduates will start their vocational training or dual studies in Germany at Siemens Healthineers. 117 apprentices and 80 dual students will begin their careers at Siemens...

Penn Developed AI Tools and Datasets Hel…

Doctors treating kidney disease have long depended on trial-and-error to find the best therapies for individual patients. Now, new artificial intelligence (AI) tools developed by researchers in the Perelman School...

Are You Eligible for a Clinical Trial? C…

A new study in the academic journal Machine Learning: Health discovers that ChatGPT can accelerate patient screening for clinical trials, showing promise in reducing delays and improving trial success rates. Researchers...

New AI Tool Addresses Accuracy and Fairn…

A team of researchers at the Icahn School of Medicine at Mount Sinai has developed a new method to identify and reduce biases in datasets used to train machine-learning algorithms...

Global Study Reveals How Patients View M…

How physicians feel about artificial intelligence (AI) in medicine has been studied many times. But what do patients think? A team led by researchers at the Technical University of Munich...