Welcome Evo, Generative AI for the Genome

Brian Hie runs the Laboratory of Evolutionary Design at Stanford, where he works at the crossroads of artificial intelligence and biology. Not long ago, Hie pondered a provocative question: If a tool like ChatGPT can write original sentences based on patterns found in massive collections of previously written words, what happens if we replace written words with genetic code?

The answer to that seemingly simple question has become Evo, a generative AI model that writes genetic code. Hie and his colleagues at the Arc Institute and the University of California, Berkeley, introduced Evo in a paper in the journal Science. Hie says that researchers might use Evo to understand how microbial and viral genomes work, to fashion new proteins (for example, drugs) that never existed before, and to reprogram microbes to accomplish remarkable tasks, from improving photosynthesis for carbon sequestration and higher crop yields to gobbling up microplastics from the oceans.

"Instead of having to use brute force testing or mining promising sequences from nature, all of which are quite unpredictable, we now have an AI model for generating systems of interest, allowing researchers to focus only on the most promising possibilities," said Hie, assistant professor of chemical engineering. "Evo puts the genomes of whole lifeforms within reach and accelerates the bioengineering design process."

Evo could even lead to a deeper understanding of evolution itself, new insights into genetic diseases, and new treatments – all achieved on a computer rather than in a lab.

Natural insight

The inspiration comes from nature itself. The instructions for all life are encoded in DNA. A better understanding of the complex interplay of DNA, RNA, and proteins - and how they have evolved over time - will lead to deeper knowledge and the ability to reprogram microbes into useful technologies.

But this is not as easy as it seems. Even simple microbes have complex genomes with millions of base pairs. Two of Evo’s key advances over similar existing tools are expanding the length of sequence a model can process at once - known as the "context window" - from roughly 8,000 base pairs to more than 131,000 base pairs, and improving the resolution to the scale of individual nucleotides, the building blocks of DNA.
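To make "single-nucleotide resolution" and "context window" concrete, here is a minimal Python sketch. It is illustrative only and is not Evo's actual tokenizer or API: the names `NUCLEOTIDE_TO_ID`, `tokenize`, `split_into_windows`, and `windows_needed` are hypothetical, and the window size of 131,072 is an assumed value consistent with the "more than 131,000 base pairs" cited above.

```python
# Illustrative sketch only: not Evo's real tokenizer, vocabulary, or API.
# Hypothetical vocabulary with one token per nucleotide.
NUCLEOTIDE_TO_ID = {"A": 0, "C": 1, "G": 2, "T": 3}

# Assumed window size; the article only says "more than 131,000 base pairs".
CONTEXT_WINDOW = 131_072


def tokenize(sequence: str) -> list[int]:
    """Map each nucleotide to one integer token (single-nucleotide resolution)."""
    return [NUCLEOTIDE_TO_ID[base] for base in sequence.upper()]


def split_into_windows(tokens: list[int], window: int = CONTEXT_WINDOW) -> list[list[int]]:
    """Cut a token list into consecutive chunks no longer than the window."""
    return [tokens[i:i + window] for i in range(0, len(tokens), window)]


def windows_needed(genome_length: int, window: int = CONTEXT_WINDOW) -> int:
    """Number of windows required to cover a genome of the given length."""
    return -(-genome_length // window)  # ceiling division


if __name__ == "__main__":
    print(tokenize("ACGT"))
    print(split_into_windows(tokenize("ACGTA"), window=2))
    print(windows_needed(580_000))
```

Under these assumptions, each base becomes exactly one token, so a genome of millions of base pairs is far longer than any single window and has to be handled in several pieces.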

Evo was trained on 80,000 microbial genomes and 2.7 million prokaryotic and phage genomes, covering 300 billion nucleotides, as well as on smaller DNA loops known as plasmids. To preempt the use of Evo for the development of bioweapons, however, the team had to exclude the genomes of viruses known to infect humans and certain other organisms.

Evo can learn how small changes in nucleotide sequences affect the evolutionary fitness of whole organisms, and it can generate DNA sequences of more than 1 million base pairs - more than seven times its context window of 131,000 base pairs, Hie added. By comparison, the smallest “minimal” bacterial genomes are about 580,000 base pairs in length, the researchers note.

Proof of concept

As a proof of concept of Evo's design capabilities, Hie and colleagues prompted Evo to generate novel synthetic CRISPR-Cas molecular complexes and systems. CRISPR-Cas systems are like tiny molecular machines that use proteins and RNA in tandem to edit DNA. In response to that prompt, Evo created a fully functional, previously unknown CRISPR system that was validated after testing 11 possible designs. Evo's CRISPR exploration is the first example of simultaneous protein-RNA codesign using a language model, Hie noted.

Next up, Hie is already working on expanding Evo's ability to process larger genomic sequences, achieving greater control over its outputs, and broadening his research beyond the microbial world to human and other genomes.

"Evo opens up a lot of very interesting research at the intersection of machine learning and biology," Hie said. "It creates opportunities for discoveries that were previously unimaginable and accelerates our ability to engineer life itself."

Evo is open source and publicly available for interested researchers to download.

The research was supported by the Fannie and John Hertz Foundation; National Science Foundation Graduate Fellowship Program; National Center for Advancing Translational Sciences of the National Institutes of Health; National Institutes of Health; National Science Foundation grants; US DEVCOM Army Research Laboratory grants; Office of Naval Research; Stanford HAI; NXP, Xilinx, LETI-CEA, Intel, IBM, Microsoft, NEC, Toshiba, TSMC, ARM, Hitachi, BASF, Accenture, Ericsson, Qualcomm, Analog Devices, Google Cloud, Salesforce, Total, the HAI-GCP Cloud Credits for Research program, the Stanford Data Science Initiative, and members of the Stanford DAWN project: Meta, Google, and VMWare; the Arc Institute; the Rainwater Foundation; the Curci Foundation; Rose Hill Investigators Program; V. and N. Khosla; S. Altman; anonymous gifts to the Hsu laboratory; V. Gupta; and R. Tonsing.

Nguyen E, Poli M, Durrant MG, Kang B, Katrekar D, Li DB, Bartie LJ, Thomas AW, King SH, Brixi G, Sullivan J, Ng MY, Lewis A, Lou A, Ermon S, Baccus SA, Hernandez-Boussard T, Ré C, Hsu PD, Hie BL.
Sequence modeling and design from molecular to genome scale with Evo.
Science. 2024 Nov 15;386(6723):eado9336. doi: 10.1126/science.ado9336
