Forecasting ED Crowds: Can Machine Learning Predict Hospital Admissions?

Author: Denis Avetisyan

New research explores the potential of machine learning models to accurately forecast daily arrivals in emergency departments, offering insights for improved resource allocation.

XGBoost modeling of total hospital admissions reveals a hierarchy of predictive features, suggesting that forecasting accuracy hinges on understanding the relative influence of each contributing factor rather than simply identifying correlations.

A comparative analysis of time series and machine learning methods (XGBoost, LSTM, and SARIMAX) for short-term Emergency Department demand prediction, decomposed by ward and clinical complexity.

Accurate forecasting remains a persistent challenge in healthcare despite increasing data availability. This research, detailed in ‘Early predicting of hospital admission using machine learning algorithms: Priority queues approach’, compares the performance of SARIMAX, XGBoost, and LSTM models in forecasting daily Emergency Department arrivals, stratified by both ward and patient complexity. Results demonstrate viable short-term prediction across all models, with XGBoost achieving the highest accuracy for total admissions, but all models consistently underestimate infrequent, large-scale demand surges. Can future research integrate exogenous variables or novel algorithms to better anticipate and mitigate these critical, unpredictable spikes in patient volume?

The Inevitable Complexity of Anticipation

Effective hospital administration hinges on the ability to anticipate future resource demands, a necessity directly linked to both patient well-being and streamlined operations. Inaccurate forecasting can lead to critical shortages of beds, staff, and essential supplies, potentially compromising the quality of care and increasing patient wait times. Conversely, overestimation results in wasted resources and inflated costs, diminishing a hospital’s financial efficiency. Therefore, precise prediction isn’t merely a logistical exercise, but a fundamental component of delivering optimal healthcare; it enables proactive staffing, efficient supply chain management, and ultimately, the capacity to meet the fluctuating needs of the patient population with both effectiveness and fiscal responsibility.

Historically, predicting hospital demand has proven remarkably difficult due to the chaotic nature of patient arrivals and the multitude of influencing factors. Conventional statistical forecasting techniques, designed for relatively stable systems, often falter when confronted with the unpredictable spikes and dips inherent in emergency departments and fluctuating seasonal illnesses. These methods frequently assume consistent patterns, a premise easily disrupted by unforeseen events – from localized outbreaks and large-scale accidents to even global pandemics. Consequently, forecasts generated by these traditional approaches can be significantly inaccurate, leading to either resource shortages that compromise patient care, or wasteful over-provisioning that strains budgets and logistical capabilities. The inherent variability in patient conditions, coupled with the complex interplay of external influences, necessitates more sophisticated modeling approaches capable of adapting to real-time fluctuations and accounting for a wider range of potential disruptions.

Effective hospital management increasingly relies on predicting demand at a highly specific level – not just overall patient volume, but the needs of individual wards. This granular approach to forecasting is essential because resource consumption varies dramatically between specialties; a cardiology ward, for example, will require significantly different staffing and equipment compared to an orthopedics unit. Consequently, models that treat the hospital as a monolithic entity often fail to accurately allocate resources, leading to bottlenecks, delays in care, and increased operational costs. By focusing on ward-specific demand, hospitals can optimize staffing levels, ensure appropriate equipment availability, and ultimately improve patient outcomes through more responsive and efficient care delivery. This precision necessitates advanced analytical techniques capable of discerning subtle patterns within each ward’s patient flow and anticipating future needs with greater accuracy.

Hospital resource utilization is profoundly shaped by the complexity of patient conditions, as categorized by Diagnosis-Related Groups (DRGs). Patients within higher DRG complexity levels – those requiring more intensive interventions and prolonged stays – demonstrably consume a disproportionate share of resources, including nursing time, specialized equipment, and pharmaceutical supplies. Consequently, effective forecasting models must move beyond simple patient volume predictions and incorporate granular data reflecting varying levels of patient acuity. Models that fail to account for DRG complexity risk underestimating resource needs, potentially leading to compromised care and operational inefficiencies; conversely, accurately modeling this complexity allows for proactive resource allocation and optimized hospital performance, ensuring appropriate care is available for all patients regardless of their individual needs.

SHAP waterfall plots reveal how feature contributions vary depending on the accuracy of weekly total admission predictions.

From Statistical Benchmarks to Machine Learning Potential

Established time-series forecasting methods – including the Autoregressive Integrated Moving Average (ARIMA) model, Seasonal ARIMA with exogenous regressors (SARIMAX), and the Seasonal Naive method – serve as fundamental benchmarks in predictive modeling. These statistical techniques use historical data to identify patterns and extrapolate future values, relying on assumptions of stationarity and linearity within the time series. The ARIMA model, for instance, predicts from past values of the series and past forecast errors, while SARIMAX extends this with seasonal components and exogenous variables. The Seasonal Naive method provides a simple, yet often surprisingly effective, baseline by assuming the next value will be identical to the corresponding value in the previous season. Comparing more complex models, such as those based on machine learning, against these established methods is crucial for determining whether the added computational cost and complexity are justified by a demonstrable improvement in forecast accuracy.
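To make the Seasonal Naive baseline concrete, here is a minimal sketch in plain Python. The arrival counts are hypothetical, and the weekly season length of 7 is an assumption chosen for daily data; the paper's own configuration is not stated in this article.

```python
def seasonal_naive(history, season_length, horizon):
    """Forecast each future step as the value observed one season earlier."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    # Repeat the most recent season cyclically across the forecast horizon.
    return [last_season[h % season_length] for h in range(horizon)]


# Hypothetical daily arrival counts for two consecutive weeks.
arrivals = [120, 110, 105, 108, 115, 90, 85,
            125, 112, 104, 110, 118, 92, 88]

forecast = seasonal_naive(arrivals, season_length=7, horizon=10)
print(forecast)
```

Any more elaborate model has to beat this forecast on held-out data before its extra complexity can be considered justified.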

Long Short-Term Memory (LSTM) networks and Extreme Gradient Boosting (XGBoost) represent advanced machine learning techniques capable of modeling intricate patterns in time-series data. Unlike traditional statistical methods which often assume linearity and fixed relationships, these algorithms can automatically learn and represent non-linear dependencies and complex temporal interactions. LSTMs, a type of recurrent neural network, excel at capturing long-range dependencies by maintaining an internal state that remembers past information, allowing them to effectively model sequential data. XGBoost, a gradient boosting algorithm, combines multiple decision trees to create a strong predictive model, effectively handling complex relationships and feature interactions. Both methods offer the potential to improve forecast accuracy when dealing with data exhibiting non-linear trends or where past observations significantly influence future values.
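Tree-based learners such as gradient boosting do not consume a time series directly; the series is usually recast as a table in which each row holds past values (lags) and the target is the current value. The sketch below shows one way to build such a table in plain Python. The helper name `make_lag_features`, the chosen lags (1 and 7 days), and the data are illustrative assumptions, not the feature set used in the paper.

```python
def make_lag_features(series, lags):
    """Turn a univariate series into (X, y) rows for a supervised learner.

    Each row of X holds the values observed `lag` steps before the target.
    """
    max_lag = max(lags)
    X, y = [], []
    # Start at max_lag so that every requested lag is available.
    for t in range(max_lag, len(series)):
        X.append([series[t - lag] for lag in lags])
        y.append(series[t])
    return X, y


# Hypothetical daily arrival counts.
arrivals = [120, 110, 105, 108, 115, 90, 85, 125, 112, 104]

# Lag 1 (yesterday) and lag 7 (same weekday last week).
X, y = make_lag_features(arrivals, lags=[1, 7])
for row, target in zip(X, y):
    print(row, "->", target)
```

The resulting rows could then be passed to any regressor; calendar features such as day of week would typically be appended as extra columns.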

The forecasting models employed in this analysis rely on routinely collected hospital data, with Total Hospital Admissions as the primary input feature. This data, typically available through hospital information systems, provides a historical record of patient volume which serves as the basis for predicting future demand. The granularity of this data – often daily or weekly – allows for the identification of temporal patterns and trends. While other variables such as patient demographics or diagnoses can be incorporated, the readily accessible nature of total admission counts makes it a foundational element for both statistical time-series models and machine learning algorithms. The consistent collection of this data across time is critical for model training and reliable forecasting.

Evaluation of forecasting models using total hospital admissions data resulted in a Mean Absolute Percentage Error (MAPE) ranging from 10% to 20%. This level of accuracy is considered viable for short-term forecasting applications in hospital resource planning. MAPE is the mean of the absolute differences between predicted and actual values, each expressed as a percentage of the actual value, which makes it a readily interpretable metric for assessing forecast performance. By a common rule of thumb, a MAPE below 20% indicates good forecasting skill, allowing for reasonably reliable predictions of total admission volumes.
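The MAPE calculation described above can be written in a few lines. This is a minimal sketch with made-up numbers; the function name `mape` and the choice to reject zero actual values (for which the percentage error is undefined) are assumptions of this example.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, returned in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical actual and forecast daily admissions.
actual = [100, 200, 400]
predicted = [90, 220, 400]

print(round(mape(actual, predicted), 2))
```

Because each error is divided by the actual value, the same absolute miss counts for more on a quiet day than on a busy one, which is worth keeping in mind when comparing wards of very different sizes.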

Model performance in forecasting total hospital admissions is heavily dependent on the accurate representation of demand at the ward level. Aggregate forecasts failing to account for varying demand profiles between specialized units – such as cardiology, oncology, or intensive care – will inherently exhibit reduced accuracy. Ward-specific demand is influenced by factors including seasonal illness prevalence, elective procedure schedules, and patient demographics unique to each unit. Consequently, models incorporating granular, ward-level data, or employing techniques to disaggregate total admissions into these specific demands, consistently outperform those relying solely on aggregate historical data. Improvements in representing this ward-specific granularity directly correlate with reductions in forecast error metrics, such as Mean Absolute Percentage Error (MAPE).

XGBoost, LSTM, and SARIMAX models demonstrate varying predictive accuracy for total arrivals during peak and trough weeks for complex admissions.

Counterfactuals: Reconstructing Reality from Distorted Signals

To address data distortion caused by external disruptions, a Prophet model was utilized to generate synthetic counterfactual data. This involved creating an alternative historical dataset reflecting what the observed time series might have looked like in the absence of the disruptive event. The Prophet model, designed for forecasting time series data with strong seasonality, was calibrated using historical data prior to the disruption. It then generated predictions for the period affected by the disruption, effectively replacing the distorted observations with these model-derived values. This process creates a more stable and representative dataset for subsequent model training and evaluation, minimizing the impact of anomalous data points introduced by the external factors.
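The replacement step can be illustrated without Prophet itself. The sketch below swaps in a much simpler stand-in, a per-weekday mean estimated only from pre-disruption observations, to show the mechanics of fitting on the clean period and overwriting the distorted one. It is not the paper's method: the function names, the weekday-mean model, and the data are all assumptions made for illustration.

```python
from collections import defaultdict


def weekday_profile(values):
    """Mean value per weekday slot (index modulo 7) over a reference period."""
    buckets = defaultdict(list)
    for i, v in enumerate(values):
        buckets[i % 7].append(v)
    return {day: sum(vs) / len(vs) for day, vs in buckets.items()}


def replace_disrupted(series, start, end):
    """Return a copy of `series` with series[start:end] replaced by
    weekday means fitted only on observations before `start`."""
    if start < 7:
        raise ValueError("need at least one full week before the disruption")
    profile = weekday_profile(series[:start])
    repaired = list(series)
    for i in range(start, end):
        repaired[i] = profile[i % 7]
    return repaired


# Hypothetical data: two normal weeks followed by one disrupted week.
series = [100, 90, 80, 85, 95, 70, 60,
          110, 100, 90, 95, 105, 80, 70,
          20, 15, 10, 12, 18, 9, 5]

repaired = replace_disrupted(series, start=14, end=21)
print(repaired[14:])
```

A model such as Prophet plays the same role as `weekday_profile` here, but additionally captures trend and yearly seasonality, so its counterfactual values would be less crude than a flat weekday average.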

The introduction of synthetic counterfactual data addresses data distortion by systematically substituting anomalous or incomplete historical observations with statistically plausible values. This process ensures that forecasting models are trained on a dataset more accurately reflecting typical conditions, rather than being skewed by temporary disruptions or reporting inconsistencies. By generating these replacements based on established patterns and relationships within the historical data, the artificial dataset effectively expands the training set with representative instances, thereby improving the robustness and generalizability of predictive algorithms. The resulting dataset allows for more reliable model calibration and reduces the impact of outliers on model performance.

Synthetic Counterfactual Data was incorporated into the training datasets of both statistical and machine learning models to assess its impact on forecasting performance. Specifically, the artificially generated data was utilized within a SARIMAX model, alongside machine learning algorithms including XGBoost and LSTM. This integration allowed for a comparative analysis of model accuracy when trained on datasets augmented with counterfactuals, rather than relying solely on potentially distorted historical observations. The objective was to determine if the synthetic data could improve the robustness and predictive capabilities of each modeling approach.

Performance evaluation of the forecasting models following integration of synthetic counterfactual data revealed a Mean Absolute Percentage Error (MAPE) of 20.07% when utilizing the SARIMAX model to forecast major complexity admissions. In comparison, the XGBoost model demonstrated a lower MAPE of 10.32% when forecasting total admissions. These results indicate varying degrees of accuracy between the models and across different admission types following the implementation of the data augmentation strategy.

The implementation of synthetic counterfactual data is designed to enhance the robustness of forecasting models when confronted with unpredictable events and variations in ward-specific demand. Traditional time series analysis relies on historical data; however, external disruptions can distort these observations, leading to inaccurate predictions. By generating synthetic data to replace these distortions, the forecasting process becomes less sensitive to anomalous periods. This allows models, including statistical methods like SARIMAX and machine learning algorithms such as XGBoost and LSTM, to train on a more representative dataset, with the aim of improving their ability to predict future ward demand despite fluctuating and unforeseen circumstances. With this approach, SARIMAX reached a Mean Absolute Percentage Error (MAPE) of 20.07% for major complexity admissions and XGBoost reached 10.32% for total admissions.

XGBoost, LSTM, and SARIMAX models demonstrate varying predictive accuracy for total arrivals, performing differently on peak and trough weeks.

Beyond Prediction: Dissecting the Logic of Demand

The study used SHapley Additive exPlanations (SHAP) values to dissect the decision-making process within the XGBoost model, moving beyond simple prediction to reveal how each input feature influences the forecasted resource demand. SHAP values assign each feature a contribution score for a particular prediction; together, the contributions explain the deviation of that prediction from the average prediction. This approach provides a granular understanding, pinpointing whether a feature pushes a prediction higher or lower, and to what extent. By decomposing predictions into their component parts, SHAP values offer a transparent view into the model’s reasoning, which builds trust, lets stakeholders validate the logic behind the forecasts, and ultimately supports better decision-making.
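The additive decomposition behind SHAP can be shown on a toy example. The sketch below computes exact Shapley values by brute force for a hand-written three-feature function, using a single reference point as the baseline. This is not the SHAP library or the tree-specific algorithm normally used with XGBoost; the model, feature names, and numbers are hypothetical and serve only to show that the contributions sum to the gap between a prediction and the baseline prediction.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `model` at `instance` relative to `baseline`."""
    n = len(instance)

    def value(subset):
        # Features in `subset` take the instance's value, the rest the baseline's.
        x = [instance[j] if j in subset else baseline[j] for j in range(n)]
        return model(x)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to subset s.
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


def toy_model(x):
    """Hypothetical demand model with one interaction term."""
    lag_1, lag_7, is_weekend = x
    return 0.5 * lag_1 + 0.3 * lag_7 - 10.0 * is_weekend + 0.01 * lag_1 * is_weekend


instance = [120, 110, 1]   # the day being explained
baseline = [100, 100, 0]   # a reference "typical" day

phis = shapley_values(toy_model, instance, baseline)
for name, phi in zip(["lag_1", "lag_7", "is_weekend"], phis):
    print(f"{name}: {phi:+.2f}")
print("sum of contributions:", round(sum(phis), 2))
print("prediction gap:", round(toy_model(instance) - toy_model(baseline), 2))
```

A waterfall plot is a visual rendering of exactly this list: it starts at the baseline prediction and stacks each feature's signed contribution until it reaches the model's output for that day.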

Analysis of the XGBoost model’s feature importance revealed key drivers of hospital resource demand, pinpointing specific factors with the greatest predictive power. The study demonstrated that patient admission rates, seasonal influenza prevalence, and day of the week consistently ranked as the most influential variables. This wasn’t simply a matter of correlation; the model’s structure allowed for quantifying the impact of each feature on predicted demand, revealing, for instance, that a surge in influenza cases consistently resulted in a proportionally larger increase in required resources compared to a typical weekday increase in admissions. Identifying these dominant factors empowers hospital administrators to proactively adjust staffing levels, optimize supply chain logistics, and strategically allocate critical equipment, ultimately improving preparedness and responsiveness to fluctuating needs.

The predictive power of the XGBoost model extends beyond simply forecasting resource demand; crucially, it offers a window into why a particular prediction was made. Hospital administrators can now dissect the model’s reasoning, identifying the specific factors (such as seasonal trends, patient demographics, or even day-of-week effects) that most heavily influenced the anticipated need for resources. This level of interpretability moves beyond reactive planning, allowing for proactive adjustments based on an understanding of the underlying drivers of demand. Consequently, decisions regarding staffing levels, supply procurement, and bed allocation can be grounded in data-driven insights, leading to more efficient operations and a greater capacity to meet patient needs effectively.

The convergence of enhanced forecasting accuracy and model interpretability directly fuels improvements in hospital resource allocation and, crucially, patient care. By moving beyond simple predictions to understand why a particular demand level is forecast – factoring in nuanced, ward-specific needs – administrators can proactively adjust staffing, supplies, and bed availability. This granular understanding minimizes wasteful overstocking while simultaneously mitigating the risks associated with shortages, leading to more efficient operations and a demonstrably positive impact on the quality of care delivered. The ability to anticipate and respond to localized demand fluctuations ensures that each patient receives timely and appropriate attention, ultimately contributing to improved health outcomes and a more resilient healthcare system.

The proposed LSTM model utilizes a recurrent architecture to predict sequential data over multiple time steps.

The pursuit of forecasting, as demonstrated by this research into Emergency Department arrivals, isn’t about conquering uncertainty, but rather cultivating a cautious understanding of it. The models, whether statistical or machine learning, offer glimpses into probable futures, yet consistently stumble when confronted with the unpredictable peaks of demand. This echoes a fundamental truth: systems aren’t built, they evolve, and their predictions are prophecies, always shadowed by the potential for unforeseen circumstances. As Edsger W. Dijkstra observed, “It’s not enough to do the right thing; you have to know what the right thing is.” The ability to accurately decompose arrivals by ward and complexity is a step toward that knowledge, but vigilance remains paramount, for even the most sophisticated algorithms cannot fully anticipate the chaos inherent in complex systems.

The Horizon Holds More Questions

The pursuit of predictable Emergency Department arrivals resembles all attempts to tame a complex system: each successful forecast is merely a reprieve, a temporary flattening of the inevitable curve. This work, by dissecting demand into ward and complexity strata, achieves a granular view, but the true challenge – anticipating the unpredictable surge – remains largely untouched. Statistical models and machine learning algorithms alike offer refined maps of the familiar landscape, yet stumble at the edges, where novelty resides. Every architecture promises freedom until it demands DevOps sacrifices, and accurate short-term prediction, while valuable, is insufficient against the weight of unforeseen circumstances.

Future efforts would be well-served to abandon the quest for perfect foresight. Instead, focus should shift towards building resilient systems that absorb uncertainty. The question isn’t ‘how do we predict the unknown?’ but ‘how do we adapt when the predictable fails?’ Perhaps the focus should be less on modeling arrivals, and more on modeling the capacity to respond – a dynamic resource allocation that anticipates its own limitations.

Order is just a temporary cache between failures. The next generation of Emergency Department management systems will not be defined by their predictive power, but by their ability to gracefully degrade, to reconfigure, and to learn from the chaos they inevitably encounter. The horizon holds not a clear picture of future demand, but a multitude of questions, and that, ultimately, is a more honest foundation upon which to build.

Original article: https://arxiv.org/pdf/2601.15481.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
