Scientists have developed an AI that can predict future disease risk using data from just one night of sleep. The system analyzes detailed physiological signals, looking for hidden patterns across the brain, heart, and breathing.
The technology successfully forecasts risks for conditions such as cancer, dementia, and heart disease. The results suggest sleep contains early health warnings that doctors have largely overlooked.
SleepFM
The system, called SleepFM, was trained using almost 600,000 hours of sleep recordings from 65,000 individuals. These recordings came from polysomnography, an in-depth sleep test that uses multiple sensors to track brain activity, heart function, breathing patterns, eye movement, leg motion, and other physical signals during sleep.
Polysomnography is considered the gold standard for evaluating sleep and is typically performed overnight in a laboratory setting. While it is widely used to diagnose sleep disorders, researchers realised it also captures a vast amount of physiological information that has rarely been fully analysed.
In routine clinical practice, only a small portion of this information is examined. Recent advances in artificial intelligence now allow researchers to analyse these large and complex datasets more thoroughly. According to the team, this work is the first to apply AI to sleep data on such a massive scale.
AI training
To unlock insights from the data, the researchers built a foundation model, a type of AI designed to learn broad patterns from very large datasets and then apply that knowledge to many tasks. Large language models such as those behind ChatGPT use a similar approach, though they are trained on text rather than biological signals.
SleepFM was trained on 585,000 hours of polysomnography data collected from patients evaluated at sleep clinics. Each sleep recording was divided into five-second segments, which function much like words used to train language-based AI systems.
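As a rough illustration of the segmentation step, the sketch below cuts a single signal into consecutive five-second windows. The function name, the sampling rate, and the choice to use non-overlapping windows and drop a trailing partial window are all assumptions made for the example; the article does not say how SleepFM handles these details.

```python
def split_into_segments(samples, sampling_rate_hz, segment_seconds=5):
    """Split a 1-D signal into consecutive, non-overlapping fixed-length segments.

    Hypothetical helper: any trailing partial segment is dropped.
    """
    segment_len = int(sampling_rate_hz * segment_seconds)
    if segment_len <= 0:
        raise ValueError("segment length must be positive")
    n_full = len(samples) // segment_len
    return [samples[i * segment_len:(i + 1) * segment_len] for i in range(n_full)]


# Toy input: 12 seconds of a made-up signal sampled at 4 Hz (48 samples).
signal = list(range(48))
segments = split_into_segments(signal, sampling_rate_hz=4)
print(len(segments), len(segments[0]))  # 2 20

# Back-of-envelope count, assuming non-overlapping five-second windows:
total_segments = 585_000 * 3600 // 5
print(total_segments)  # 421200000
```

Under that non-overlapping assumption, 585,000 hours of recordings would correspond to roughly 421 million five-second segments per signal channel.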
The model integrates multiple streams of information, including brain signals, heart rhythms, muscle activity, pulse measurements, and airflow during breathing, and learns how these signals interact. To help the system understand these relationships, the researchers developed a training method called leave-one-out contrastive learning. This approach removes one type of signal at a time and asks the model to reconstruct it using the remaining data.
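The sketch below shows one common way a leave-one-out contrastive objective can be written. It is an assumption-laden illustration, not SleepFM's actual implementation. It assumes each five-second segment has already been encoded into one embedding vector per signal type, and it scores the held-out signal's embedding against the average of the remaining signals' embeddings using an InfoNCE-style loss. The modality names, embedding values, and temperature are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def leave_one_out_loss(batch, temperature=0.1):
    """batch: list of segments, each a dict mapping modality name -> embedding.

    For every held-out modality, its embedding should be more similar to the
    context built from the other modalities of the SAME segment than to the
    contexts of the other segments in the batch.
    """
    modalities = list(batch[0].keys())
    total, count = 0.0, 0
    for held_out in modalities:
        # One context vector per segment, built without the held-out modality.
        contexts = [
            mean_vector([seg[m] for m in modalities if m != held_out])
            for seg in batch
        ]
        for i, seg in enumerate(batch):
            logits = [cosine(seg[held_out], ctx) / temperature for ctx in contexts]
            # Negative log-softmax of the matching segment (log-sum-exp for stability).
            max_logit = max(logits)
            log_denominator = max_logit + math.log(
                sum(math.exp(l - max_logit) for l in logits)
            )
            total += log_denominator - logits[i]
            count += 1
    return total / count


# Toy batch: two segments whose signals agree with each other.
aligned = [
    {"brain": [1.0, 0.0], "heart": [1.0, 0.0], "breathing": [1.0, 0.0]},
    {"brain": [0.0, 1.0], "heart": [0.0, 1.0], "breathing": [0.0, 1.0]},
]
# Same batch with the brain embeddings swapped between segments.
misaligned = [
    {"brain": [0.0, 1.0], "heart": [1.0, 0.0], "breathing": [1.0, 0.0]},
    {"brain": [1.0, 0.0], "heart": [0.0, 1.0], "breathing": [0.0, 1.0]},
]
loss_aligned = leave_one_out_loss(aligned)
loss_misaligned = leave_one_out_loss(misaligned)
print(loss_aligned < loss_misaligned)  # True
```

In this toy setup the loss is near zero when all signals from a segment agree and rises when one signal is swapped in from a different segment, which is the pressure that pushes a model to learn how the signals relate.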
After training, the researchers adapted the model for specific tasks. They first tested it on standard sleep assessments, such as identifying sleep stages and evaluating sleep apnea severity. In these tests, SleepFM matched or exceeded the performance of leading models currently in use.
The researchers next tested whether sleep data could predict future disease. To do this, they linked polysomnography records with long-term health outcomes from the same individuals. This was possible because the researchers had access to decades of medical records from a single sleep clinic.
Outcome
Using the combined dataset, SleepFM reviewed more than 1,000 disease categories and identified 130 conditions that could be predicted with reasonable accuracy using sleep data alone. The strongest results were seen for cancers, pregnancy complications, circulatory diseases, and mental health disorders, with prediction scores above a C-index of 0.8.
C-index: The concordance index is a rank correlation measure between a variable X and a possibly censored variable Y, with an event/censoring indicator. In survival analysis, a pair of patients is called concordant if the model predicts a lower risk for the patient who experiences the event at the later timepoint. The concordance probability (C-index) is the proportion of concordant pairs among all comparable pairs of subjects. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. It can be used to measure and compare the discriminative power of risk prediction models.
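To make the definition concrete, here is a minimal sketch of Harrell's version of the C-index for right-censored data. The function name and the toy patient data are invented for illustration, and the article does not state which estimator the SleepFM study used. A pair counts as comparable only when the patient with the shorter observed time actually experienced the event; pairs with tied times are skipped in this simplified version.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (simplified, O(n^2)).

    times:  observed follow-up time per patient
    events: 1 if the event was observed, 0 if the patient was censored
    risks:  model score per patient, higher meaning an earlier expected event
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable pair: patient i had the event before patient j's observed time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: four patients, the third one censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risks)
print(c_index)  # 0.8
```

Here five pairs are comparable and the model ranks four of them correctly, giving 4/5 = 0.8. The one miss is the second patient, who had the event at time 4 but was scored as lower risk than the censored third patient.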
SleepFM performed especially well when predicting Parkinson’s disease (C-index 0.89), dementia (0.85), hypertensive heart disease (0.84), heart attack (0.81), prostate cancer (0.89), breast cancer (0.87), and death (0.84).
The research appears in the journal Nature Medicine, titled “A multimodal sleep foundation model for disease prediction.”
