Predicting Heart Failure Years in Advance with AI and Daily ECGs

Author: Denis Avetisyan


A new deep learning model analyzes full-day electrocardiograms to forecast heart failure risk up to five years before traditional diagnosis.

Holter ECG examinations, when coupled with the DeepHHF model for opportunistic analysis, offer a pathway to identify patients at moderate or high risk of heart failure, facilitating proactive interventions such as BNP testing or echocardiography and potentially improving preventative care.

DeepHHF leverages 24-hour Holter monitoring data and time series analysis to achieve improved heart failure risk prediction compared to conventional methods.

Despite increasing prevalence, early heart failure (HF) risk assessment remains a clinical challenge. This study, ‘Modeling Day-Long ECG Signals to Predict Heart Failure Risk with Explainable AI’, addresses this gap by demonstrating the feasibility of a deep learning model, DeepHHF, for predicting five-year HF risk using continuous 24-hour electrocardiogram (ECG) data. Achieving an area under the curve of 0.80, DeepHHF outperformed conventional approaches and highlighted key cardiac patterns, particularly arrhythmias, during daytime hours. Could widespread implementation of AI-driven Holter ECG analysis offer a cost-effective and accessible pathway to proactive cardiovascular care?


Unveiling the Complexity of Heart Failure

Heart failure represents a significant and growing global health challenge, not simply due to its increasing prevalence, but also its inherent complexity. The syndrome isn’t monolithic; it manifests in diverse phenotypes – notably heart failure with reduced ejection fraction (HFrEF), heart failure with mildly reduced ejection fraction (HFmrEF), and heart failure with preserved ejection fraction (HFpEF) – each with differing underlying mechanisms and responses to treatment. This heterogeneity complicates predictive efforts, as a single risk assessment tool often fails to accurately capture the nuanced risk profiles within these subgroups. Consequently, identifying individuals at high risk of developing heart failure, or predicting disease progression, requires a deeper understanding of these distinct phenotypes and the subtle biological variations that define them, presenting a considerable hurdle for clinicians and researchers alike.

Current methods for gauging heart failure risk, such as the PCP-HF Score, which achieves an area under the receiver operating characteristic curve (AUROC) of 0.74, frequently fall short of the precision needed for individualized patient care. While valuable as a general indicator, these established risk scores often treat patients as a homogeneous group, overlooking the physiological differences that contribute to varying levels of susceptibility. This lack of granularity hinders timely interventions, as individuals at genuinely high risk may be misclassified, delaying crucial treatment and potentially worsening outcomes. Consequently, there is a pressing need for more sophisticated predictive models capable of discerning subtle risk factors and tailoring preventative strategies to each patient’s specific profile, moving beyond broad categorization towards a more personalized approach to heart failure management.

Accurate heart failure prediction hinges on the ability to discern nuanced indicators hidden within the body’s intricate physiological data streams. These signals – encompassing electrocardiograms, echocardiograms, and even subtle variations in blood pressure – often contain predictive information beyond the scope of traditional clinical assessments. Consequently, researchers are increasingly turning to advanced analytical approaches, including machine learning and artificial intelligence, to extract these subtle patterns. These techniques can process high-dimensional datasets, identify non-linear relationships, and ultimately, offer a more granular and personalized assessment of an individual’s risk trajectory. The successful implementation of these methods promises to move beyond broad risk categorization and towards proactive, targeted interventions that can significantly improve patient outcomes and reduce the escalating burden of heart failure.

The DeepHHF model demonstrates clinically valuable performance, as evidenced by subgroup AUROC analysis based on time since heart failure diagnosis, Kaplan-Meier survival curves revealing improved risk stratification compared to short-window inputs, and prevalence analysis of known heart failure risk factors across true/false positive/negative groups.
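The Kaplan-Meier curves mentioned in the caption estimate, at each follow-up time, the fraction of patients who remain free of heart failure while accounting for patients whose follow-up ends early (censoring). The following is a minimal sketch of that estimator, not the study's analysis code; the function name `kaplan_meier` and the toy follow-up data are illustrative assumptions.

```python
def kaplan_meier(durations, events):
    """Kaplan-Meier survival estimate.

    durations: follow-up time for each patient (any consistent unit).
    events:    1 if heart failure was observed at that time, 0 if censored.
    Returns a list of (time, survival_probability) at each observed event time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)      # patients still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0         # events occurring exactly at time t
        leaving = 0          # events plus censored patients at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy data (years of follow-up), invented purely for illustration.
toy_durations = [1, 2, 2, 3, 4, 5]
toy_events = [1, 1, 0, 1, 0, 1]
toy_curve = kaplan_meier(toy_durations, toy_events)
```

Plotting one such curve per predicted risk group shows how well a model separates patients: the wider the gap between the low-risk and high-risk curves, the better the stratification.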

DeepHHF: A New Lens for Cardiac Prediction

DeepHHF is a deep learning model developed for the prediction of five-year heart failure risk. The model utilizes 24-hour Holter electrocardiogram (ECG) data as its primary input, a data type frequently collected in clinical settings due to its non-invasive nature and ability to provide continuous cardiac monitoring. This reliance on readily available ECG data aims to facilitate wider adoption and integration into existing clinical workflows. DeepHHF’s architecture is specifically designed to process the temporal characteristics of these ECG signals to identify patterns indicative of future heart failure development, offering a potentially valuable tool for proactive risk assessment and patient management.

DeepHHF utilizes a Transformer Encoder architecture to process electrocardiogram (ECG) signals, a design choice enabling the model to capture long-range dependencies within the data. To efficiently handle the raw ECG input, the model incorporates EnCodec, a neural audio codec originally developed for audio compression. EnCodec serves to reduce the dimensionality of the ECG signals while preserving clinically relevant information, thereby facilitating faster processing and reducing computational demands without significant loss of predictive power. This approach allows DeepHHF to directly analyze the complex waveforms of the ECG and automatically learn relevant features for heart failure risk prediction.
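To see why a compression stage matters before a Transformer, it helps to count samples. The sketch below is back-of-the-envelope arithmetic only: the sampling rate `FS_HZ` and the codec downsampling factor `HOP` are hypothetical values chosen for illustration, not figures reported for DeepHHF.

```python
def samples_in_recording(hours, sampling_rate_hz):
    """Number of raw samples in a single-lead recording of the given length."""
    return int(hours * 3600 * sampling_rate_hz)


def frames_after_codec(n_samples, hop):
    """Number of latent frames if the codec emits one frame per `hop` samples."""
    return -(-n_samples // hop)  # ceiling division


FS_HZ = 128   # hypothetical Holter sampling rate (assumption)
HOP = 320     # hypothetical codec downsampling factor (assumption)

raw_samples = samples_in_recording(24, FS_HZ)
latent_frames = frames_after_codec(raw_samples, HOP)

# Self-attention cost grows with the square of the sequence length,
# so shortening the sequence by HOP cuts pairwise comparisons by HOP**2.
attention_savings = (raw_samples ** 2) // (latent_frames ** 2)
```

Under these assumed values, a day of ECG shrinks from roughly eleven million raw samples to a few tens of thousands of frames, which is the difference between an intractable and a workable input length for attention-based models.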

The DeepHHF model improves 5-year heart failure risk prediction, achieving an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.80. This performance exceeds that of the established PCP-HF scoring system, which yields an AUROC of 0.74 when evaluated on the same dataset. The AUROC metric quantifies the model’s ability to discriminate between patients who will and will not develop heart failure within the five-year timeframe; 0.5 corresponds to chance and 1.0 to perfect separation. The 0.06 difference in AUROC between DeepHHF and PCP-HF indicates a meaningful gain in discriminatory power, although AUROC measures how well patients are ranked by risk rather than accuracy at any particular decision threshold.
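AUROC has a concrete interpretation: it is the probability that a randomly chosen patient who develops heart failure receives a higher risk score than a randomly chosen patient who does not. A minimal sketch of that pairwise definition follows; the function name `auroc` and the toy labels and scores are illustrative assumptions, not study data.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    labels: 1 for patients who developed heart failure, 0 otherwise.
    scores: model risk scores (higher means higher predicted risk).
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 3 positives, 4 negatives, invented for illustration.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]
toy_auc = auroc(toy_labels, toy_scores)  # 10 of 12 pairs ranked correctly
```

Read this way, an AUROC of 0.80 means the model ranks a future heart failure patient above a non-case in about four out of five such pairs, compared with roughly three out of four at 0.74.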

Traditional heart failure risk prediction models frequently depend on feature engineering, a process where clinicians or data scientists manually identify and extract relevant characteristics from ECG data, such as QRS duration or ST segment depression, for input into predictive algorithms. DeepHHF diverges from this approach by directly processing raw ECG signals. This is achieved through the integration of a Transformer Encoder and the EnCodec neural audio codec, allowing the model to autonomously learn and extract pertinent features directly from the waveform data without prior manual intervention. This end-to-end learning capability eliminates the potential biases and limitations inherent in manually defined features and allows DeepHHF to identify subtle patterns within the raw signal that might be overlooked by conventional methods.

A logistic regression classifier combining the DeepHHF score with patient demographics, comorbidities, and PCP-HF features significantly improved performance, as demonstrated by an area under the receiver operating characteristic curve (AUROC) of 0.87 (95% CI: 0.83-0.91) for the combined features compared to 0.76 (95% CI: 0.71-0.81) using DeepHHF alone with a 24-hour ECG.

Illuminating the ‘Why’: Interpreting DeepHHF’s Predictions

Gradient Attention Rollout is employed as a post-hoc interpretability method to estimate how much each part of the input ECG signal contributes to the DeepHHF model’s output. Attention rollout traces the flow of attention through the Transformer’s layers by combining the per-layer attention maps; the gradient-based variant additionally weights each attention map by the gradient of the prediction with respect to it, so that attention with little influence on the final risk assessment is discounted. The resulting relevance scores are then mapped back onto the ECG waveform, highlighting the specific segments (typically measured in milliseconds) that most strongly contributed to the prediction. This allows for visual inspection of the ECG and identification of critical features driving the model’s decision-making process, providing a quantifiable basis for understanding the model’s focus.
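The general idea can be sketched in a few lines. The version below follows one common formulation of gradient-weighted attention rollout (weight each attention map by its gradient, clip negatives, average over heads, add the identity for the residual connection, renormalize, and multiply across layers); the exact variant used for DeepHHF may differ, and the function names and toy matrices are illustrative assumptions.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gradient_attention_rollout(attentions, gradients):
    """attentions, gradients: [layer][head][query][key] nested lists.

    Returns an n x n relevance matrix whose rows sum to 1.
    """
    n = len(attentions[0][0])
    rollout = identity(n)
    for layer_attn, layer_grad in zip(attentions, gradients):
        heads = len(layer_attn)
        fused = []
        for i in range(n):
            row = []
            for j in range(n):
                # Gradient-weighted attention, negatives clipped, head-averaged.
                weighted = sum(
                    max(layer_attn[h][i][j] * layer_grad[h][i][j], 0.0)
                    for h in range(heads)
                ) / heads
                # Identity term models the residual (skip) connection.
                row.append(weighted + (1.0 if i == j else 0.0))
            total = sum(row)
            fused.append([v / total for v in row])
        rollout = matmul(fused, rollout)
    return rollout


# Toy input: 2 layers, 1 head, 3 tokens (values invented for illustration).
toy_attn = [
    [[[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.1, 0.8]]],
    [[[0.7, 0.2, 0.1], [0.3, 0.4, 0.3], [0.2, 0.2, 0.6]]],
]
toy_grad = [
    [[[1.0, -0.5, 0.2], [0.4, 1.0, 0.0], [0.0, 0.3, 1.0]]],
    [[[0.5, 0.5, -1.0], [0.1, 0.8, 0.2], [0.6, 0.0, 0.9]]],
]
relevance = gradient_attention_rollout(toy_attn, toy_grad)
```

In a real model the attention maps and gradients would be captured with framework hooks during a forward and backward pass, and the row of `relevance` belonging to the classification token would be projected back onto the time axis of the recording.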

The ability to interpret DeepHHF’s predictions provides clinicians with crucial insight into the model’s reasoning process for each individual risk assessment. This transparency extends beyond a simple risk score, detailing which specific features within the electrocardiogram (ECG) contributed most significantly to the outcome. Consequently, clinicians can evaluate the validity of the prediction in the context of their own clinical knowledge and patient history, increasing their confidence in the model’s output. This enhanced understanding facilitates more informed decision-making regarding patient management, enabling clinicians to corroborate or question the assessment and ultimately integrate it responsibly into their overall diagnostic and therapeutic strategies.

Analysis of DeepHHF’s predictive features demonstrates the model’s capacity to identify nuanced variations in electrocardiogram morphology that correlate with cardiac function. Specifically, the model does not rely on easily discernible, broad ECG features, but instead focuses on subtle waveform characteristics. These identified morphological variations extend beyond standard clinical interpretation and suggest the potential for DeepHHF to capture information related to underlying pathophysiological processes not typically assessed via conventional ECG analysis, indicating a sensitivity to more granular aspects of cardiac electrophysiology.

DeepHHF model explainability analysis using gradient attention rollout reveals that attention focuses on characteristic heartbeats, demonstrates circadian variability, and successfully clusters heartbeats into four distinct types, as visualized through t-SNE dimensionality reduction and averaging of representative beats.

Beyond Prediction: The Rhythm of Cardiac Health

Early investigations suggest that the predictive capabilities of DeepHHF are intricately linked to the body’s natural circadian rhythms and their impact on cardiac function. The heart doesn’t operate at a constant state; instead, its activity fluctuates throughout the day, governed by internal biological clocks. DeepHHF appears to be sensitive to these time-dependent variations, potentially identifying subtle changes in heart behavior that correlate with increased risk of heart failure. This indicates the model isn’t simply recognizing static patterns, but is actively processing how cardiac signals evolve over time, mirroring the dynamic nature of physiological processes and opening avenues for a more nuanced understanding of heart health.

The efficacy of DeepHHF extends beyond simple prediction, hinting at its capacity to discern subtle, time-dependent fluctuations within cardiac function that correlate with heart failure risk. Cardiac activity isn’t constant; it’s modulated by internal biological rhythms, including circadian patterns, and these variations can serve as early indicators of declining health. The model doesn’t merely identify if a patient is at risk, but appears to be sensitive to when these risks are most pronounced, potentially reflecting periods of heightened vulnerability linked to these natural oscillations. This suggests DeepHHF is learning to recognize complex relationships between temporal patterns in heart signals and the underlying physiological processes that contribute to heart failure, offering a pathway towards a more nuanced understanding of disease progression.

For patients presenting with frequent heart failure diagnoses, the DeepHHF model exhibits a remarkable capacity to identify relevant echocardiogram examinations, achieving a sensitivity of 85.0% in pinpointing these crucial tests. This high level of accuracy suggests the model effectively prioritizes investigations for individuals already exhibiting multiple indicators of cardiac distress. The ability to accurately flag these examinations holds significant potential for streamlining diagnostic workflows and ensuring timely intervention in a population where early detection is paramount. By focusing attention on those most likely to benefit from detailed cardiac imaging, DeepHHF contributes to a more efficient and targeted approach to heart failure management.

The potential to translate DeepHHF’s rhythmic insights into clinical practice centers on the concept of personalized, time-dependent interventions. Recognizing that cardiac function fluctuates with underlying biological rhythms opens avenues for tailoring treatments, such as medication dosage or the timing of cardiac rehabilitation, to coincide with periods of optimal physiological response. This approach moves beyond a one-size-fits-all model, acknowledging that a patient’s susceptibility to heart failure events isn’t constant. By anticipating moments of heightened vulnerability, clinicians could proactively adjust care plans, potentially diminishing the impact of risk factors and improving overall patient outcomes. Further research will focus on identifying specific biomarkers and individual patient characteristics that refine these time-sensitive strategies, ultimately striving to deliver interventions precisely when they are most effective.

Detailed review of a patient’s medical history (including ICD-9 codes, echocardiograms, hospitalizations, and mortality data) confirmed a positive heart failure diagnosis in an outlier case.

Towards a Future of Precision Cardiovascular Care

DeepHHF signifies a considerable advancement in the application of artificial intelligence to individualized heart health management. This approach moves beyond traditional, generalized risk assessments by using deep learning to analyze continuous 24-hour Holter ECG recordings rather than a handful of summary clinical variables. The system doesn’t simply identify existing conditions; it estimates the likelihood of future heart failure, enabling proactive interventions tailored to each patient’s risk profile. By identifying subtle patterns that are easily missed on visual review, DeepHHF offers the potential to move from reactive treatment of cardiovascular disease to a preventative, personalized model of care, ultimately aiming to optimize outcomes and improve the quality of life for those at risk.

The clinical utility of DeepHHF is underscored by its Number Needed to Screen (NNS) values: 30 for individuals at moderate risk and 21 for those considered high risk. This metric indicates the number of people who need to be screened to identify a single true case of heart failure, and these figures suggest that follow-up testing of DeepHHF-flagged patients could be reasonably efficient. A lower NNS directly translates to a more efficient use of healthcare resources and, crucially, a greater potential to intervene before symptoms of heart failure manifest. These findings hint at a future where proactive screening with AI-powered tools like DeepHHF could reduce the burden of this debilitating condition by enabling earlier diagnosis and targeted preventative measures.
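As a rough guide to reading these numbers: if NNS is taken to mean the number of flagged patients who must receive follow-up testing to find one who develops heart failure, it is simply the reciprocal of the positive predictive value within that risk group. The sketch below assumes that definition; the function name and the patient counts are hypothetical, chosen only so the arithmetic lands on the reported values.

```python
def number_needed_to_screen(flagged, true_cases):
    """Patients screened per true case found (reciprocal of PPV).

    flagged:    number of patients placed in the risk group.
    true_cases: number of those who actually develop heart failure.
    """
    if true_cases <= 0:
        raise ValueError("true_cases must be positive")
    return flagged / true_cases


# Hypothetical counts, not taken from the study.
nns_moderate = number_needed_to_screen(flagged=300, true_cases=10)
nns_high = number_needed_to_screen(flagged=210, true_cases=10)

# Equivalent positive predictive values for each group.
ppv_moderate = 1 / nns_moderate
ppv_high = 1 / nns_high
```

Under this reading, an NNS of 21 corresponds to just under 5% of high-risk patients turning out to be true cases, and an NNS of 30 to a little over 3% of moderate-risk patients.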

The predictive power of DeepHHF, while already promising, stands to be significantly amplified by incorporating a more holistic view of individual patient data. Future iterations of the model could integrate continuous physiological monitoring – such as heart rate variability, sleep patterns, and activity levels collected via wearable sensors – alongside comprehensive genomic profiles. This multi-faceted approach would allow for the identification of subtle biomarkers and personalized risk signatures currently undetectable, moving beyond generalized risk assessments. Consequently, clinicians could tailor preventative interventions – from lifestyle modifications to precisely targeted pharmacological therapies – based on a patient’s unique biological makeup and predicted trajectory, ultimately shifting the paradigm from reactive treatment to proactive prevention of cardiovascular disease.

The evolving landscape of cardiovascular health is shifting toward proactive intervention, driven by the potential of advanced analytics to decipher the subtle language of biological rhythms. Current approaches often address heart failure after symptoms manifest, but emerging research suggests that predictive modeling, attuned to individual circadian and other physiological oscillations, could identify individuals at risk long before clinical presentation. By integrating continuous monitoring data – encompassing heart rate variability, sleep patterns, and activity levels – with sophisticated algorithms, it may become possible to detect early warning signs of cardiac dysfunction. This predictive capability doesn’t merely offer earlier treatment opportunities; it envisions a future where personalized lifestyle adjustments, preventative medications, or targeted therapies are deployed before the onset of heart failure, effectively shifting the paradigm from reactive care to proactive prevention and promising a substantial reduction in morbidity and mortality.

Analysis of patient records with repeated heart failure diagnoses reveals a distinction between diagnoses made before and after 2018, when hospital-registered data was included, highlighting potential shifts in diagnostic patterns.

The research subtly highlights the importance of understanding underlying structures to predict future outcomes, echoing John Locke’s sentiment: “All mankind… being all equal and independent, no one ought to harm another in his life, health, liberty or possessions.” Just as Locke argued for inherent rights based on natural law, this model seeks to discern patterns within the natural rhythm of the heart, captured in 24-hour ECG signals, to preemptively identify individuals at risk. DeepHHF doesn’t merely present a prediction; it attempts to reveal the subtle indicators within the complex time series data, offering a glimpse into the body’s internal state and potentially safeguarding future health. The model’s predictive power stems from recognizing these foundational patterns, akin to understanding the fundamental principles governing human existence.

Where Do We Go From Here?

The demonstrated capacity to anticipate heart failure risk five years in advance, using only the readily available signal of a standard Holter monitor, feels less like a destination and more like a sharpened question. Performance gains, while notable, should not lull the field into complacency. The elegance of a prediction is inversely proportional to its opacity; current models, even those incorporating explainable AI elements, remain fundamentally ‘black boxes’. True insight demands a move beyond simply identifying what predicts failure, towards understanding why these particular temporal patterns portend a specific outcome.

A consistent user experience, built upon easily interpretable outputs, is not merely a convenience – it is empathy made manifest. The future likely lies in hybrid approaches; models that seamlessly integrate deep learning’s predictive power with established physiological principles. Focusing solely on predictive accuracy risks building systems that are powerful but ultimately untrustworthy, or worse, clinically useless if their reasoning cannot be meaningfully communicated.

Finally, the inherent limitations of time-series analysis should not be ignored. The five-year horizon, while clinically relevant, represents a static endpoint. The heart, of course, is a dynamic organ. Future research must consider models that continually recalibrate risk assessments, incorporating longitudinal data and adapting to the patient’s evolving physiological state. Only then will prediction truly become prevention.


Original article: https://arxiv.org/pdf/2601.00014.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/

2026-01-05 06:58