Beyond the Forecast: AI and Machine Learning Sharpen Subseasonal Predictions

Author: Denis Avetisyan

A new probabilistic bias correction framework substantially improves the accuracy of subseasonal weather forecasts (two to six weeks ahead) generated by both traditional and artificial intelligence models.

Probabilistic bias correction refines predictive accuracy by integrating lagged observations, climatological quantiles (such as quintiles or terciles), and deterministic forecasts from dynamical or artificial intelligence models into a learning-enhanced probabilistic framework designed to mitigate systematic errors.

This research introduces a machine learning approach to probabilistic bias correction that enhances subseasonal forecasting performance, outperforming existing operational systems and improving extreme weather prediction.

Despite recent advances in weather forecasting, skill degrades rapidly at subseasonal timescales (2-6 weeks ahead) due to compounding errors and persistent biases. This limitation is addressed in ‘Enhancing AI and Dynamical Subseasonal Forecasts with Probabilistic Bias Correction’, which introduces a machine learning framework, probabilistic bias correction (PBC), to substantially reduce systematic error in both dynamical and AI-based forecasts. Demonstrating state-of-the-art performance, PBC doubled the subseasonal skill of one AI system and improved the accuracy of a leading dynamical model, achieving first place in ECMWF’s 2025 real-time forecasting competition. Could this approach unlock more reliable predictions of extreme events and fundamentally improve climate adaptation strategies for vulnerable communities worldwide?

The Illusion of Predictability

Current weather forecasting systems demonstrate remarkable accuracy when predicting conditions within a few days, but their predictive power diminishes significantly when looking two to six weeks ahead – a period known as the ‘subseasonal’ timescale. This gap in predictability poses a substantial challenge for sectors heavily reliant on anticipating weather patterns beyond the immediate future. Agriculture, for instance, requires knowledge of rainfall and temperature trends weeks in advance to optimize planting and harvesting, while disaster preparedness relies on subseasonal forecasts to proactively mitigate the impacts of droughts, floods, and heatwaves. The difficulty lies in the chaotic nature of the atmosphere; small initial uncertainties grow rapidly over this timeframe, limiting the skill of traditional models. Consequently, improving subseasonal prediction is a crucial area of ongoing research, with the potential to deliver significant economic and societal benefits.

Dynamical models are the cornerstone of weather and climate prediction, relying on fundamental physics to simulate atmospheric behavior. While adept at capturing large-scale weather patterns, they frequently exhibit systematic biases (consistent over- or underestimation of certain variables), and their skill diminishes considerably at lead times of two to six weeks. Consequently, forecasts in this range often lack the accuracy needed for critical decision-making in sectors like agriculture, water resource management, and disaster preparedness. Addressing these limitations requires techniques that correct inherent model errors and better represent the complex interactions within the Earth system.

The pursuit of skillful subseasonal forecasting hinges on a nuanced understanding of Earth’s complex systems and a dedicated effort to refine predictive models. Current dynamical models, while adept at short-range forecasts, frequently exhibit biases and struggle to accurately represent the intricate interplay of atmospheric, oceanic, and land surface processes operating over the 2-6 week timescale. Achieving reliable predictions requires not only capturing these interactions, such as the Madden-Julian Oscillation’s influence on global weather patterns, but also systematically identifying and correcting inherent model errors. This involves statistical post-processing techniques, ensemble forecasting methods, and the integration of observational data to calibrate model outputs, ultimately bridging the gap between current limitations and the demand for actionable long-range information.

Probabilistic bias correction (PBC) substantially enhances the skill of ECMWF forecasts for extreme weather events, improving temperature, pressure, and precipitation forecasting by 35-100% globally and consistently delivering positive flood forecasting skill, as demonstrated by analysis from 2016-2024 and the Global Disaster Alert and Coordination System.

Beyond Determinism: A Shift in Perspective

Probabilistic Bias Correction (PBC) represents a departure from deterministic forecasting methods that typically yield a single predicted value. Traditional forecasts often exhibit systematic errors, or biases, resulting from imperfections in model physics, incomplete data, or computational limitations. PBC explicitly identifies and corrects for these biases by generating a probability distribution of potential outcomes, rather than a singular point estimate. This distribution reflects the uncertainty inherent in the forecast and allows users to assess the likelihood of different scenarios. By quantifying the range of possible results, PBC provides a more comprehensive and informative forecast, enabling risk assessment and improved decision-making in applications where understanding forecast uncertainty is critical.

Probabilistic Bias Correction (PBC) utilizes two primary machine learning algorithms, Persistence++ and Debias++, to address systematic errors present in dynamical forecasts. Persistence++ functions by learning from the historical performance of the dynamical model and blending it with a simple persistence forecast – assuming the next value will be similar to the current one – to reduce bias. Debias++ employs a more complex approach, training on historical forecast errors to directly predict and remove the bias component from future dynamical forecasts. Both algorithms are trained on historical data and applied post-hoc to the raw dynamical model output, resulting in calibrated probabilistic forecasts with improved reliability and reduced systematic under- or over-estimation.
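The general idea of removing a learned bias and blending with persistence can be illustrated with a deliberately simple sketch. This is not the Persistence++ or Debias++ algorithm, whose details are not given here; it is a plain mean-bias removal followed by a one-parameter blend, working on point values rather than probabilities. All data values and function names (`mean_bias`, `fit_blend_weight`) are hypothetical.

```python
from statistics import fmean


def mean_bias(forecasts, observations):
    """Average signed error (forecast minus observation) over the history."""
    return fmean(f - o for f, o in zip(forecasts, observations))


def fit_blend_weight(debiased, persistence, observations, steps=20):
    """Pick w in [0, 1] minimizing the historical mean squared error of
    w * debiased + (1 - w) * persistence, by a coarse grid search."""
    def mse(w):
        return fmean(
            (w * d + (1 - w) * p - o) ** 2
            for d, p, o in zip(debiased, persistence, observations)
        )
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=mse)


# Hypothetical history for one grid cell (temperature anomalies, degrees C).
hist_forecast = [1.8, 0.9, 2.4, -0.2, 1.1]     # raw dynamical forecasts
hist_lagged_obs = [0.5, -0.1, 1.0, -1.2, 0.3]  # latest observation at forecast time
hist_observed = [0.7, 0.0, 1.3, -1.0, 0.2]     # what actually verified

bias = mean_bias(hist_forecast, hist_observed)
debiased_hist = [f - bias for f in hist_forecast]
w = fit_blend_weight(debiased_hist, hist_lagged_obs, hist_observed)

# Apply the learned bias and blend weight to a new forecast.
new_forecast, new_lagged_obs = 1.5, 0.4
corrected = w * (new_forecast - bias) + (1 - w) * new_lagged_obs
print(f"bias={bias:.2f} weight={w:.2f} corrected={corrected:.2f}")
```

In this toy history the raw model runs consistently warm, so subtracting the learned bias pulls the new forecast down before it is blended with the lagged observation.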

Ensemble forecasting forms the basis of this probabilistic bias correction framework by generating multiple forecasts, each representing a plausible future state based on slightly perturbed initial conditions or model parameters. This approach moves beyond single-valued deterministic forecasts to provide a distribution of possible outcomes, allowing for the quantification of forecast uncertainty. By analyzing the spread and characteristics of this ensemble, a probabilistic prediction – specifying the likelihood of various outcomes – is generated. This probabilistic output is demonstrably more informative than a single point estimate, as it allows decision-makers to assess risks, plan for a range of possibilities, and make more robust and actionable decisions based on quantified uncertainty.
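One common way to turn an ensemble into a probabilistic prediction is to count the fraction of members that fall into each climatological category. The sketch below assumes tercile categories and uses made-up tercile edges and ensemble values; the helper name `category_probabilities` is hypothetical, and the source does not state that PBC derives its probabilities this way.

```python
from bisect import bisect_right


def category_probabilities(members, edges):
    """Fraction of ensemble members in each category defined by sorted edges.

    There are len(edges) + 1 categories. A member exactly equal to an edge
    is counted in the upper of the two adjacent categories.
    """
    counts = [0] * (len(edges) + 1)
    for value in members:
        counts[bisect_right(edges, value)] += 1
    return [c / len(members) for c in counts]


# Hypothetical climatological tercile edges and a 10-member ensemble.
tercile_edges = [-0.4, 0.5]
ensemble = [-0.9, -0.5, -0.1, 0.2, 0.3, 0.6, 0.8, 1.1, 1.4, 1.7]

probs = category_probabilities(ensemble, tercile_edges)
print(probs)  # probabilities for below-normal, near-normal, above-normal
```

Raw member counts like these inherit any bias or under-dispersion in the ensemble, which is the kind of systematic error a calibration step is meant to reduce.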

Probabilistic bias correction (PBC) consistently improves the forecast skill of the ECMWF model across all seasons and years (2016-2024) globally.

The Rise of Hybrid Intelligence

PoET (Physics-informed Observation Encoding Transformer), AIFS-SUBS (the subseasonal configuration of ECMWF’s Artificial Intelligence Forecasting System), and FuXi-S2S belong to a new class of subseasonal forecasting models built on deep learning architectures. FuXi-S2S is a machine learning model designed specifically for subseasonal-to-seasonal prediction. These models move beyond traditional statistical approaches by incorporating learned representations directly from observational data and dynamical model outputs. PoET, for example, encodes observational data into a latent space to inform the forecasting process, while AIFS-SUBS combines outputs from multiple forecasting systems using AI-driven weighting schemes. Evaluations demonstrate that these architectures achieve improved skill in predicting subseasonal phenomena, such as temperature and precipitation anomalies, compared to traditional methods and earlier generation statistical models.

Hybrid forecasting models integrate the strengths of both dynamical and artificial intelligence (AI) approaches to improve predictive skill. Dynamical forecasts, derived from numerical weather prediction systems based on physical laws, excel at capturing large-scale atmospheric behavior but can be limited by computational constraints and chaotic effects. AI models, particularly deep learning architectures, demonstrate proficiency in identifying complex patterns and relationships within data. By combining these methodologies – often through techniques like model blending or AI-driven post-processing of dynamical outputs – hybrid models leverage the physically-grounded realism of dynamical forecasts with the pattern recognition and error correction capabilities of AI, resulting in forecasts that outperform either approach in isolation.

MicroDuet achieves enhanced forecasting performance by combining the outputs of two distinct forecasting systems: PBC-ECMWF and PBC-PoET. PBC-ECMWF applies probabilistic bias correction to the dynamical forecasts of the European Centre for Medium-Range Weather Forecasts (ECMWF) model, while PBC-PoET applies it to the PoET model and its pattern recognition strengths. Combining the two mitigates individual model biases and uncertainties, yielding better overall forecast skill than either system operating independently. This demonstrates the efficacy of ensemble techniques in subseasonal forecasting, where combining diverse predictive signals can lead to more robust and accurate predictions.

MicroDuet combines probabilistic bias correction of the ECMWF ensemble with a hybrid AI/dynamical PoET ensemble to capitalize on the strengths of both dynamical and data-driven forecasting approaches.
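Combining two probabilistic forecasts can be as simple as a weighted average of their category probabilities. The sketch below assumes equal weights and quintile categories; how MicroDuet actually weights PBC-ECMWF and PBC-PoET is not stated here, and the probability values and the `combine` helper are hypothetical.

```python
def combine(prob_a, prob_b, weight_a=0.5):
    """Weighted average of two categorical probability forecasts.

    The result is renormalized so it sums to 1 even if the inputs carry
    small rounding errors.
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("forecasts must have the same number of categories")
    mixed = [weight_a * a + (1.0 - weight_a) * b for a, b in zip(prob_a, prob_b)]
    total = sum(mixed)
    return [p / total for p in mixed]


# Hypothetical quintile probabilities for one grid cell and lead time.
pbc_ecmwf = [0.10, 0.15, 0.20, 0.25, 0.30]
pbc_poet = [0.05, 0.10, 0.25, 0.30, 0.30]

duet = combine(pbc_ecmwf, pbc_poet)
print([round(p, 3) for p in duet])
```

Averaging probabilities tends to help most when the two systems make different errors, which is the motivation for pairing a dynamical ensemble with a data-driven one.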

Validation and the Pursuit of Progress

The rapidly evolving field of subseasonal forecasting benefits significantly from competitive platforms like the ‘AI Weather Quest’, which provides a crucial arena for rigorously evaluating and refining predictive models. This competition isn’t merely about ranking performance; it actively drives innovation by challenging developers to push the boundaries of accuracy in predicting weather patterns weeks in advance. Models such as MicroDuet are subjected to intense scrutiny through standardized benchmarks and objective scoring, fostering a cycle of continuous improvement. The resulting advancements, born from this competitive environment, translate directly into more reliable forecasts, impacting sectors ranging from agriculture and energy to disaster preparedness and public safety, ultimately demonstrating the power of collaborative competition in scientific progress.

The evaluation of probabilistic weather forecasts demands a robust and objective metric, and the Ranked Probability Skill Score (RPSS) fulfills this need within competitive environments like the ‘AI Weather Quest’. Unlike simple accuracy measures, RPSS assesses the entire probability distribution predicted by a forecast model, rewarding skillful predictions of both the most likely outcome and the range of possible outcomes. It compares the predictive power of a model against a baseline, often a climatological forecast or a simpler statistical model. A score of 1 indicates a perfect forecast, 0 indicates no improvement over the baseline, and negative values indicate a forecast worse than the baseline. This scoring system allows for a clear and unbiased ranking of different forecasting models, driving improvement in subseasonal prediction by pinpointing where models excel or require further refinement.
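The standard definition of RPSS can be written in a few lines. The ranked probability score (RPS) sums the squared differences between cumulative forecast probabilities and the cumulative observed outcome; RPSS is one minus the ratio of the forecast's mean RPS to the reference's mean RPS. The example below uses tercile categories with made-up forecasts and outcomes, and the competition's exact aggregation (for instance, any area weighting) is not specified here.

```python
from itertools import accumulate


def rps(probs, observed_category):
    """Ranked probability score for one forecast (lower is better)."""
    cum_forecast = list(accumulate(probs))
    cum_observed = [1.0 if k >= observed_category else 0.0
                    for k in range(len(probs))]
    return sum((f - o) ** 2 for f, o in zip(cum_forecast, cum_observed))


def rpss(forecasts, reference, observed):
    """Skill of `forecasts` relative to a fixed `reference` distribution."""
    n = len(observed)
    mean_forecast = sum(rps(p, o) for p, o in zip(forecasts, observed)) / n
    mean_reference = sum(rps(reference, o) for o in observed) / n
    return 1.0 - mean_forecast / mean_reference


# Hypothetical tercile forecasts and the category index that verified.
climatology = [1 / 3, 1 / 3, 1 / 3]
forecasts = [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1], [0.3, 0.4, 0.3]]
observed = [2, 0, 2]

score = rpss(forecasts, climatology, observed)
print(f"RPSS = {score:.3f}")
```

Because the score works on cumulative probabilities, a forecast that puts weight on a category adjacent to the observed one is penalized less than one that puts weight far from it.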

Probabilistic bias correction (PBC) markedly enhances the accuracy of subseasonal weather forecasts, as evidenced by significant improvements in the Ranked Probability Skill Score (RPSS) when compared to a debiased Ensemble Member Selection (EMS) from the European Centre for Medium-Range Weather Forecasts (ECMWF). Globally averaged data from 2016 to 2024 reveals that PBC more than doubles the RPSS for precipitation forecasts, achieving gains exceeding 100%. Furthermore, substantial skill increases are observed for temperature, ranging from 26% to 38%, and for mean sea level pressure, improving by 16% to 33%. These results demonstrate that PBC not only refines forecast precision but also represents a considerable advancement in the field of subseasonal prediction, offering a more reliable and informative outlook for weather-sensitive sectors.

The culmination of focused development saw MicroDuet achieve a decisive victory in the AI Weather Quest competition, securing first place across all evaluated weather variables and forecast lead times for the September-October-November season. This competitive setting, designed to rigorously test subseasonal forecasting models, provided definitive validation of MicroDuet’s superior performance capabilities. The achievement isn’t simply a ranking; it represents a substantial leap forward in predictive accuracy, demonstrating the model’s ability to consistently outperform its peers in a demanding, real-world evaluation. This success confirms the effectiveness of the underlying architectural choices and training methodologies employed in MicroDuet’s creation, solidifying its position as a leading model in the field of subseasonal forecasting.

A comprehensive evaluation reveals that the probabilistic forecasting capabilities of the model consistently surpass those of FuXi-S2S across a global grid. Specifically, the model exhibits skill gains in 99-100% of grid cells when predicting temperature, and achieves similarly high coverage (97-99%) in forecasting precipitation. Even for mean sea level pressure, skill improvements are demonstrated in 96-100% of grid cells. This near-universal enhancement underscores the model’s robustness and its ability to deliver improved forecasts irrespective of geographic location.
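A statistic of this kind, the share of grid cells where one system beats another, reduces to a simple count once a per-cell skill score is available for both. The sketch below assumes an unweighted count over cells and uses made-up skill values; the names `fraction_improved`, `skill_candidate`, and `skill_baseline` are hypothetical.

```python
def fraction_improved(skill_candidate, skill_baseline):
    """Share of grid cells where the candidate's skill exceeds the baseline's."""
    if len(skill_candidate) != len(skill_baseline):
        raise ValueError("both inputs must cover the same grid cells")
    wins = sum(1 for c, b in zip(skill_candidate, skill_baseline) if c > b)
    return wins / len(skill_candidate)


# Hypothetical per-cell skill scores, flattened from a latitude-longitude grid.
skill_candidate = [0.12, 0.05, 0.30, -0.02, 0.18]
skill_baseline = [0.08, 0.06, 0.21, -0.10, 0.11]

share = fraction_improved(skill_candidate, skill_baseline)
print(f"{share:.0%} of grid cells improved")
```

A count like this says how widespread the gains are but not how large they are, so it complements rather than replaces globally averaged skill.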

In the 2025 AI Weather Quest competition, MicroDuet, an ensemble of PBC-ECMWF and PBC-PoET, achieved first place globally for all forecast variables and lead times, surpassing the skill of governmental weather agencies, ECMWF’s AI models, and 34 other competing teams including FengshunHybrid and LPM.

The pursuit of enhanced subseasonal forecasting, as detailed in this work, reveals a familiar pattern. Systems, even those built upon the latest machine learning techniques, are not immune to inherent biases. This research, with its probabilistic bias correction framework, doesn’t so much build a better forecast as it cultivates one, attempting to nudge the inevitable drift toward inaccuracy. As John McCarthy observed, “It is better to solve a problem that nobody understands than to solve a problem that nobody cares about.” This pursuit of refinement, addressing the subtle but crucial errors within complex models, exemplifies a dedication to understanding – and mitigating – the inherent fragility of even the most sophisticated predictive ecosystems. The very act of correction acknowledges the system’s tendency toward dependency on initial conditions and the propagation of error.

What Lies Ahead?

The pursuit of accurate subseasonal forecasts, even augmented by machine learning, remains a study in deferred resolution. This work demonstrates a refinement – a smoothing of edges – but does not address the fundamental problem: the chaotic nature of the atmosphere resists precise prediction beyond a narrow window. The framework presented offers marginal gains, yet each gain introduces a new dependency, a subtle lock-in to a particular architecture. Technologies change, dependencies remain. The focus inevitably shifts from model improvement to data assimilation, to the endless quest for the perfect observation to nudge the inevitable toward a slightly less imperfect outcome.

The true challenge isn’t achieving higher skill scores, but managing the illusion of control. Operational forecasting systems are not solved problems; they are elaborate compromises frozen in time. The field will likely see a proliferation of bias correction techniques, each tailored to specific models and regions, creating a fragmented landscape of localized improvements. This is not progress, precisely, but adaptation – a continual recalibration to the ever-present noise.

Perhaps the most fruitful path lies not in chasing perfect forecasts, but in embracing uncertainty. Probabilistic forecasting, while theoretically sound, often struggles with interpretability. The future may well demand a shift in focus – from predicting what will happen to quantifying what is possible, and preparing for a range of plausible futures, rather than a single, illusory certainty.

Original article: https://arxiv.org/pdf/2604.16238.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
