AI Model Forecasts Disease Risk Decades in Advance

Imagine a future where your medical history could help predict what health conditions you might face in the next two decades. Researchers have developed a generative AI model that uses large-scale health records to estimate how human health may change over time. It can forecast the risk and timing of over 1,000 diseases and predict health outcomes over a decade in advance.

This new generative AI model was custom-built using algorithmic concepts similar to those used in large language models (LLMs). It was trained on anonymised patient data from 400,000 participants in the UK Biobank. Researchers also successfully tested the model using data from 1.9 million patients in the Danish National Patient Registry. Because it was tested on data from two entirely separate healthcare systems, this approach is one of the most comprehensive demonstrations to date of how generative AI can model human disease progression at scale.

"Our AI model is a proof of concept, showing that it’s possible for AI to learn many of our long-term health patterns and use this information to generate meaningful predictions," said Ewan Birney, Interim Executive Director at the European Molecular Biology Laboratory (EMBL). "By modelling how illnesses develop over time, we can start to explore when certain risks emerge and how best to plan early interventions. It’s a big step towards more personalised and preventive approaches to healthcare."

This work, published in the journal Nature, was a collaboration between EMBL, the German Cancer Research Centre (DKFZ), and the University of Copenhagen.

Just as large language models can learn the structure of sentences, this AI model learns the "grammar" of health data to model medical histories as sequences of events unfolding over time. These events include medical diagnoses or lifestyle factors such as smoking. The model learns to forecast disease risk from the order in which such events happen and how much time passes between these events.
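As a rough illustration of this idea, the sketch below turns a handful of dated health events into an ordered sequence in which each event carries the time elapsed since the previous one. It is not the researchers' code: the `HealthEvent` class, the `to_sequence` function and the event labels are hypothetical names chosen for this example, and the patient history is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthEvent:
    age_years: float  # age at which the event was recorded
    code: str         # hypothetical label for a diagnosis or lifestyle factor


def to_sequence(events):
    """Sort events by age and pair each with the years since the previous event."""
    ordered = sorted(events, key=lambda e: e.age_years)
    sequence = []
    previous_age = None
    for event in ordered:
        gap = 0.0 if previous_age is None else event.age_years - previous_age
        sequence.append((event.code, gap))
        previous_age = event.age_years
    return sequence


# Invented history, listed out of order on purpose
history = [
    HealthEvent(52.0, "hypertension"),
    HealthEvent(41.5, "smoking"),
    HealthEvent(58.25, "type_2_diabetes"),
]
sequence = to_sequence(history)
print(sequence)
```

A sequence model would consume pairs like these, so that both the order of events and the gaps between them are available as inputs.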

"Medical events often follow predictable patterns," said Tom Fitzgerald, Staff Scientist at EMBL’s European Bioinformatics Institute (EMBL-EBI). "Our AI model learns those patterns and can forecast future health outcomes. It gives us a way to explore what might happen based on a person’s medical history and other key factors. Crucially, this is not a certainty, but an estimate of the potential risks."

The model performs especially well for conditions with clear and consistent progression patterns, such as certain types of cancer, heart attacks, and septicaemia, which is a type of blood poisoning. However, the model is less reliable for more variable conditions, such as mental health disorders or pregnancy-related complications that depend on unpredictable life events.

Like weather forecasts, this new AI model provides probabilities, not certainties. It doesn’t predict exactly what will happen to an individual, but it offers well-calibrated estimates of how likely certain conditions are to occur over a given period. For example, it could predict the chance of developing heart disease within the next year. These risks are expressed as rates over time, similar to forecasting a 70% chance of rain tomorrow. Generally, forecasts over a shorter period of time have higher accuracy than long-range ones.
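To see how a rate over time relates to a probability, the sketch below converts a yearly rate into the chance of at least one event within a given period. It assumes the rate stays constant over that period, which is a simplification made for this example and not a description of how the model itself computes risk. The function name `probability_within` and the rate of 0.01 per year are illustrative.

```python
import math


def probability_within(rate_per_year: float, years: float) -> float:
    """Chance of at least one event within `years`, assuming a constant rate."""
    if rate_per_year < 0 or years < 0:
        raise ValueError("rate and horizon must be non-negative")
    return 1.0 - math.exp(-rate_per_year * years)


# Illustrative rate of 0.01 events per year (an assumption, not a model output)
one_year = probability_within(0.01, 1)
ten_year = probability_within(0.01, 10)
print(f"1 year: {one_year:.4f}, 10 years: {ten_year:.4f}")
```

Under this assumption the ten-year probability comes out slightly below ten times the yearly rate, because the calculation counts only the chance of a first event.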

For example, the model predicts varying levels of risk for heart attacks. In the UK Biobank cohort aged 60–65, the risk of heart attack ranges from 4 in 10,000 per year for some men to approximately 1 in 100 per year for others, depending on their prior diagnoses and lifestyle. Women have a lower risk on average, but a similar spread of risk. Moreover, the risks increase, on average, as people age. A systematic assessment on data from the UK Biobank not used for training showed that these calculated risks correspond well to the observed number of cases across age and sex groups.
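The comparison of calculated risks with observed cases can be pictured with a simple check: within each group, add up the predicted risks to get an expected number of cases, then set that against the number of cases actually seen. The sketch below does this on a few toy records. It is an assumption about what such a check might look like, not the assessment used in the study, and the function name `calibration_by_group` and all of the numbers are invented for illustration.

```python
from collections import defaultdict


def calibration_by_group(records):
    """Return {group: (expected_cases, observed_cases)}.

    Each record is (group, predicted_risk, outcome), where outcome is 1 if the
    condition occurred in the period and 0 otherwise.
    """
    expected = defaultdict(float)
    observed = defaultdict(int)
    for group, risk, outcome in records:
        expected[group] += risk     # summed risks give the expected case count
        observed[group] += outcome  # count of cases that actually occurred
    return {group: (expected[group], observed[group]) for group in expected}


# Toy records invented for illustration; not real cohort data
toy_records = [
    ("men 60-65", 0.25, 0),
    ("men 60-65", 0.25, 1),
    ("men 60-65", 0.5, 0),
    ("women 60-65", 0.125, 0),
    ("women 60-65", 0.375, 1),
]
result = calibration_by_group(toy_records)
for group, (expected_cases, observed_cases) in result.items():
    print(group, expected_cases, observed_cases)
```

When the expected and observed counts are close across groups, the risk estimates are said to be well calibrated at the population level, even though no individual outcome is predicted with certainty.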

The model is calibrated to produce accurate population-level risk estimates, forecasting how often certain conditions occur within groups of people. However, like any AI model, it has limitations. For example, because the model's training data from the UK Biobank comes primarily from individuals aged 40–60, childhood and adolescent health events are underrepresented. The model also contains demographic biases due to gaps in the training data, including the underrepresentation of certain ethnic groups.

While the model isn’t ready for clinical use, it could already help researchers:

  • understand how diseases develop and progress over time,
  • explore how lifestyle and past illnesses affect long-term disease risk,
  • simulate health outcomes using artificial patient data, in situations where real-world data are difficult to obtain or access.

In the future, similar AI tools trained on more representative datasets could assist clinicians in identifying high-risk patients early. With ageing populations and rising rates of chronic illness, being able to forecast future health needs could help healthcare systems plan better and allocate resources more efficiently. But much more testing, consultation, and robust regulatory frameworks are needed before AI models can be deployed in a clinical setting.

"This is the beginning of a new way to understand human health and disease progression," said Moritz Gerstung, Head of the Division of AI in Oncology at DKFZ and former Group Leader at EMBL-EBI. "Generative models such as ours could one day help personalise care and anticipate healthcare needs at scale. By learning from large populations, these models offer a powerful lens into how diseases unfold, and could eventually support earlier, more tailored interventions."

This AI model was trained using anonymised health data under strict ethical rules. UK Biobank participants gave informed consent, and Danish data were accessed in accordance with national regulations that require the data to remain within Denmark. Researchers used secure, virtual systems to analyse the data without moving them across borders. These safeguards help ensure that AI models are developed and used in ways that respect privacy and uphold ethical standards.

Shmatko A, Jung AW, Gaurav K, Brunak S, Mortensen LH, Birney E, Fitzgerald T, Gerstung M.
Learning the natural history of human disease with generative transformers.
Nature. 2025 Sep 17. doi: 10.1038/s41586-025-09529-3
