Contagion’s Hidden Networks: Mapping Volatility Spillovers with Machine Learning

Author: Denis Avetisyan


New research reveals a surprisingly limited scope of volatility transmission between financial markets, defying expectations of widespread contagion.

Realized volatility, estimated via the Yang-Zhang method across six futures markets from May 2002 to January 2025 (a period encompassing 5,699 observations), quantifies the annualized fluctuation inherent in those markets, revealing patterns discernible only through rigorous time-series analysis.

A hybrid HAR-ElasticNet model identifies a sparse network of volatility spillovers, largely concentrated within commodity markets, with minimal linkages between equities, treasuries, and other asset classes.

Understanding interconnectedness in financial markets is often hampered by the ‘curse of dimensionality’ as spillover effects become obscured in high-dimensional systems. This paper, ‘Volatility Spillovers in High-Dimensional Financial Systems: A Machine Learning Approach’, addresses this challenge by employing a hybrid HAR-ElasticNet framework to identify volatility transmission across commodities, equities, and treasuries. The analysis reveals a surprisingly sparse network where equity markets primarily drive volatility, while agricultural commodities remain largely isolated, despite strong persistence in own-volatility. Can this network structure be leveraged to improve forecasting accuracy and risk management strategies in complex financial systems?


The Persistence of Volatility: A Statistical Memory

Unlike typical data sets where each observation is independent and identically distributed (i.i.d.), financial time series demonstrate a pronounced tendency for volatility to persist over time. This means that significant price fluctuations, or ‘shocks’, don’t simply dissipate; instead, they exert a continuing influence on subsequent market behavior. A large price swing today doesn’t just affect tomorrow’s returns – it increases the likelihood of further substantial movements in the near future, creating clusters of high and low volatility. This ‘memory’ inherent in financial data fundamentally distinguishes it from simpler statistical models and presents a core challenge for accurately forecasting risk and pricing financial instruments. The persistence isn’t necessarily a reflection of any underlying causal mechanism, but rather an observed statistical property demanding specialized modeling techniques to avoid underestimating future market turbulence.

Generalized Autoregressive Conditional Heteroskedasticity (GARCH) models have long been a mainstay in financial time series analysis, successfully capturing many features of volatility clustering. However, these models frequently fall short when tasked with representing the prolonged influence of past events on current market fluctuations, a phenomenon known as long memory. This limitation arises because standard GARCH formulations often rely on relatively short-term memory components, inadequately reflecting the persistence observed in real-world financial data. Consequently, predictions generated from these models can underestimate the magnitude and duration of volatility shifts, leading to inaccurate risk assessments in areas like option pricing and portfolio management. The inability to fully account for long-memory effects can therefore result in a systematic underestimation of potential losses and an incomplete understanding of market dynamics.
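
The short-memory limitation is easiest to see in simulation. The sketch below is purely illustrative, with made-up parameter values rather than anything estimated in the paper: it generates a GARCH(1,1) series and prints how quickly a variance shock fades. The decay is geometric in $\alpha + \beta$, so even a highly persistent GARCH process forgets shocks far faster than long-memory data suggest.

```python
import numpy as np

# Illustrative GARCH(1,1) parameters (not estimated from the paper's data).
omega, alpha, beta = 1e-5, 0.08, 0.90
persistence = alpha + beta   # governs how quickly variance shocks die out

rng = np.random.default_rng(0)
n = 5_000
sigma2 = np.empty(n)
returns = np.empty(n)
sigma2[0] = omega / (1.0 - persistence)                  # unconditional variance
returns[0] = np.sqrt(sigma2[0]) * rng.standard_normal()

for t in range(1, n):
    sigma2[t] = omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1]
    returns[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

# In a GARCH(1,1), the effect of a variance shock h days ahead shrinks
# geometrically, like persistence ** h: the "short memory" that makes the
# model understate how long real volatility episodes last.
for h in (1, 5, 22, 66):
    print(f"fraction of shock remaining after {h:>2} days: {persistence ** h:.3f}")
```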

The accurate representation of volatility persistence extends far beyond theoretical refinement, proving essential for practical applications in modern finance. Derivative pricing, for instance, relies heavily on anticipating future price swings; underestimating volatility can lead to significantly mispriced options and substantial losses for traders. Similarly, portfolio optimization strategies, designed to maximize returns for a given level of risk, require precise volatility forecasts to appropriately allocate assets and mitigate potential downturns. Perhaps most critically, systemic risk management – the effort to prevent the failure of one institution from triggering a cascade of failures throughout the financial system – depends on correctly assessing the interconnectedness of risks and the potential for volatility to spread, making accurate modeling a cornerstone of financial stability.

Constructing effective volatility models presents a fundamental challenge: reconciling the desire for simplicity with the intricate realities of financial data. Researchers continually navigate a trade-off between model parsimony – utilizing fewer parameters for computational efficiency and reduced risk of overfitting – and the need to accurately represent the subtle, long-range dependencies inherent in volatility clusters. Highly complex models, while potentially capturing nuanced dynamics, often become unwieldy and susceptible to estimation errors, diminishing their predictive power. Conversely, overly simplified models may fail to adequately reflect the persistent nature of volatility, leading to underestimation of risk and inaccurate pricing of financial instruments. Therefore, advancements in volatility modeling increasingly focus on innovative approaches that achieve an optimal balance – harnessing the power of complex dependencies while maintaining a degree of interpretability and statistical robustness.

During the final five months of the test period, both the hybrid HAR-ElasticNet and univariate HAR models tracked realized volatility closely, with nearly identical root mean squared errors of 0.0044; the hybrid model's extreme sparsity (only 8% of cross-market coefficients nonzero) effectively reduces its complexity to that of a univariate HAR model, which explains the near-identical performance.

Measuring Volatility with High-Frequency Data

Realized volatility is calculated using the Yang-Zhang estimator, a range-based method built from daily open, high, low, and close prices rather than close-to-close returns alone. The estimator combines three variance components: an overnight (close-to-open) term, an open-to-close term, and the drift-robust Rogers-Satchell range term, weighted so as to minimize the variance of the overall estimate. Formally, $\sigma^2_{YZ} = \sigma^2_{o} + k\,\sigma^2_{c} + (1-k)\,\sigma^2_{RS}$, where $\sigma^2_{o}$ is the variance of overnight log returns, $\sigma^2_{c}$ the variance of open-to-close log returns, $\sigma^2_{RS}$ the Rogers-Satchell component, and $k$ a weighting constant that depends on the estimation window. Because it exploits the full daily trading range, the estimator captures intraday price swings that close-to-close measures miss, yielding a substantially less noisy volatility proxy.
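
For readers who want the mechanics, the following is a minimal sketch of the Yang-Zhang calculation, assuming daily OHLC bars in a pandas DataFrame with columns named open, high, low, and close; the 22-day window and 252-day annualization factor are illustrative defaults, not necessarily the paper's settings.

```python
import numpy as np
import pandas as pd

def yang_zhang_volatility(ohlc: pd.DataFrame, window: int = 22,
                          trading_days: int = 252) -> pd.Series:
    """Rolling annualized Yang-Zhang volatility from daily OHLC bars.

    Expects columns 'open', 'high', 'low', 'close'. Window length and
    annualization factor are illustrative choices, not the paper's settings.
    """
    o, h, l, c = ohlc["open"], ohlc["high"], ohlc["low"], ohlc["close"]

    log_oc_prev = np.log(o / c.shift(1))        # overnight (close-to-open) return
    log_co = np.log(c / o)                      # open-to-close return
    log_ho, log_lo = np.log(h / o), np.log(l / o)
    log_hc, log_lc = np.log(h / c), np.log(l / c)

    # Overnight and open-to-close variance components.
    sigma_o = log_oc_prev.rolling(window).var()
    sigma_c = log_co.rolling(window).var()

    # Rogers-Satchell component, robust to drift within the trading day.
    rs = log_hc * log_ho + log_lc * log_lo
    sigma_rs = rs.rolling(window).mean()

    # Yang-Zhang weighting constant for the chosen window.
    k = 0.34 / (1.34 + (window + 1) / (window - 1))

    variance = sigma_o + k * sigma_c + (1 - k) * sigma_rs
    return np.sqrt(variance * trading_days)
```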

The Yang-Zhang estimator is robust to two features that distort simpler measures: jumps between the previous close and the next open, and drift in the price path over the trading day. By treating the overnight gap as its own variance component and using the drift-independent Rogers-Satchell term for the intraday range, it avoids the systematic biases that affect close-to-close or simple range-based estimators, and it achieves the lowest estimation variance among the common range-based alternatives. The result is a realized-volatility series that more faithfully reflects the underlying volatility process and offers a better signal-to-noise ratio for downstream modeling.

The Heterogeneous Autoregressive (HAR) model captures long-memory behavior by regressing next-period realized volatility on components measured over different horizons: the most recent daily value $RV_t$, a weekly component equal to the average realized volatility over the previous five trading days, and a monthly component averaging the previous 21 trading days, i.e. $RV_{t+1} = \beta_0 + \beta_d RV_t + \beta_w RV_t^{(w)} + \beta_m RV_t^{(m)} + \varepsilon_{t+1}$. This aggregation reflects the fact that volatility decays at different rates across horizons: daily shocks are absorbed quickly, while their effects linger at the weekly and monthly scales. The HAR model represents this multi-scale persistence parsimoniously, offering a statistically robust approach to modeling realized volatility dynamics.
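
As a sketch of how these components translate into a regression, the helper below builds the daily, weekly, and monthly predictors from a daily realized-volatility series (for example, the output of the Yang-Zhang function above) and fits the HAR equation by ordinary least squares. The 5- and 21-day horizons follow the text; the variable names and the use of plain least squares are illustrative assumptions.

```python
import numpy as np
import pandas as pd

def fit_har(rv: pd.Series) -> dict:
    """Fit RV_{t+1} = b0 + bd*RV_t + bw*RV_t^(5d) + bm*RV_t^(21d) + e by OLS."""
    df = pd.DataFrame({
        "daily": rv,
        "weekly": rv.rolling(5).mean(),    # average over the previous 5 trading days
        "monthly": rv.rolling(21).mean(),  # average over the previous 21 trading days
    })
    df["target"] = rv.shift(-1)            # next-day realized volatility
    df = df.dropna()

    X = np.column_stack([np.ones(len(df)), df[["daily", "weekly", "monthly"]]])
    beta, *_ = np.linalg.lstsq(X, df["target"].to_numpy(), rcond=None)
    return dict(zip(["const", "daily", "weekly", "monthly"], beta))
```

In a fit like this, the three loadings typically sum to a value near one, consistent with the high persistence discussed below.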

Ordinary Least Squares (OLS) regression is employed to estimate the model, and the fitted persistence is striking: the combined autoregressive loading is approximately 0.99, meaning that volatility observed today is almost fully carried into the forecast for tomorrow. This near-unit-root value is consistent with established empirical findings in financial markets and implies slow mean reversion, with elevated or subdued volatility persisting for extended periods rather than dissipating quickly. The OLS estimation captures this characteristic directly from the data, providing a realistic representation of volatility dynamics without imposing artificial constraints on the persistence parameter.

The hybrid HAR-ElasticNet model reveals strong own-market persistence and sparse cross-market spillovers, with only 8% of the 90 cross-market coefficients being nonzero, and the equity futures (ES, NQ) and Treasury futures (ZF, ZN) exhibiting no transmission channels.

Mapping the Network of Cross-Market Volatility

Cross-market spillover effects represent the transmission of volatility between distinct asset classes. This analysis examines how changes in volatility within one market – including crude oil, soybeans, stock indices, and treasury indices – systematically impact volatility in others. The investigation moves beyond assessing simple correlations to determine if volatility in one asset class can be statistically linked to volatility changes in another, thereby identifying interconnectedness and potential systemic risk. This approach allows for the quantification of these interdependencies, revealing which asset classes act as key transmitters or receivers of volatility and informing strategies for portfolio diversification and risk management.

ElasticNet regularization is utilized to construct sparse cross-market spillover networks by combining L1 and L2 regularization techniques. This method simultaneously performs feature selection – identifying the most impactful cross-market connections – and shrinkage of coefficient estimates to mitigate the effects of multicollinearity and noise. Specifically, ElasticNet imposes a penalty on the sum of the absolute values (L1) and the squared values (L2) of the coefficients, driving less significant connections to zero and refining the estimates of remaining, statistically relevant relationships. This results in a parsimonious network representation focused on the key pathways of volatility transmission, improving model interpretability and predictive power by reducing overfitting.
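
A minimal sketch of this step is shown below, assuming the lagged HAR features of all six markets have already been stacked into a design matrix; the l1_ratio grid, cross-validation scheme, and function name are illustrative choices rather than the paper's exact configuration.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV

def fit_spillover_equation(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Fit one equation of the spillover system.

    X stacks the lagged daily/weekly/monthly HAR features of every market
    (own-market and cross-market columns); y is the target market's next-day
    realized volatility. The combined L1/L2 penalty shrinks noisy coefficients
    and sets most cross-market entries exactly to zero.
    """
    model = ElasticNetCV(
        l1_ratio=[0.1, 0.5, 0.9, 1.0],  # mix between ridge-like and lasso-like penalties
        cv=5,                           # illustrative; a time-series split is often preferable
        max_iter=10_000,
    )
    model.fit(X, y)
    return model.coef_                  # nonzero entries are the retained spillover channels
```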

Traditional correlation analysis identifies statistical associations between asset volatilities but does not establish the direction of influence. The paper's methodology, employing ElasticNet regularization, moves beyond correlation by identifying sparse networks of volatility transmission: the estimated coefficients indicate the magnitude and direction of the predictive influence that one asset's lagged volatility exerts on another's, in the spirit of Granger causality rather than structural causation. By isolating statistically meaningful connections and suppressing spurious ones, the analysis reveals which assets' volatility systematically leads volatility elsewhere, providing a more robust picture of systemic risk than correlation-based approaches.

Analysis of cross-market spillover networks, utilizing ElasticNet regularization, revealed a high degree of sparsity in volatility transmission. Of the 90 cross-market coefficients examined, only 7 survived regularization with nonzero estimates, just 8% of the total. This finding indicates that most apparent cross-market relationships add little predictive value once noise and collinearity are penalized, and that volatility propagation is concentrated in a small number of key connections between asset classes. Identifying these retained coefficients allows for a more precise understanding of systemic risk and improved modeling of cross-market volatility dynamics.
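
Reading the retained network off the fitted models is mechanical. The sketch below collapses the three HAR horizons into a single coefficient per market pair for brevity, and all names are illustrative.

```python
import numpy as np

def nonzero_edges(coef: np.ndarray, markets: list[str], tol: float = 1e-8):
    """List directed spillover edges (source -> target) whose coefficient
    survived regularization with a nonzero estimate."""
    edges = []
    for i, target in enumerate(markets):        # row i: the equation for market i
        for j, source in enumerate(markets):    # column j: lagged volatility of market j
            if i != j and abs(coef[i, j]) > tol:
                edges.append((source, target, float(coef[i, j])))
    return edges

# With six markets and only a handful of surviving coefficients, the resulting
# edge list is short, which is exactly the sparsity the analysis reports.
```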

Following a one-standard-deviation shock to equities ES and NQ, impulse responses are shown with 95% confidence intervals derived from 1,000 bootstrap simulations.

Tracing Volatility Shocks and Systemic Risk

Joint Impulse Response Functions provide a powerful method for dissecting the complex web of interconnected financial markets, revealing how a sudden increase in volatility within one sector ripples outwards to others. Unlike traditional analyses that examine shocks in isolation, this approach simultaneously accounts for volatility fluctuations occurring across multiple markets, offering a more realistic portrayal of systemic behavior. By tracing the dynamic response of each market to an initial volatility shock, researchers can pinpoint the speed and extent of these spillover effects, identifying which markets are most vulnerable and how quickly contagion can spread. This technique allows for the detection of potential amplification mechanisms – instances where initial shocks are exacerbated by market interactions – ultimately enhancing the understanding of systemic risk and informing more robust risk management strategies.
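
To make the mechanics concrete, the sketch below propagates a one-off shock through a single-lag approximation of the estimated spillover system. Collapsing the HAR structure into one coefficient matrix is a simplification for illustration, and the bootstrap bands the paper reports would come from re-estimating that matrix on resampled data.

```python
import numpy as np

def impulse_response(A: np.ndarray, shock_index: int, shock_size: float,
                     horizon: int = 20) -> np.ndarray:
    """Propagate a one-off volatility shock through a VAR(1)-style system.

    A is an (n_markets x n_markets) matrix of estimated spillover coefficients,
    a single-lag simplification of the HAR system. Row h of the output holds
    every market's response h steps after the shock.
    """
    n = A.shape[0]
    responses = np.zeros((horizon + 1, n))
    responses[0, shock_index] = shock_size   # e.g. one standard deviation of that market's RV
    for h in range(1, horizon + 1):
        responses[h] = A @ responses[h - 1]  # the shock decays and spreads via the network
    return responses

# Confidence bands like those in the figures would come from re-estimating A on
# bootstrap resamples (the paper uses 1,000) and taking the 2.5th and 97.5th
# percentiles of the resulting response paths at each horizon.
```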

The propagation of volatility between financial markets isn’t a gradual diffusion, but rather a series of rapid and often substantial spillover effects. Detailed analysis indicates that a volatility shock originating in one market doesn’t simply fade with distance; instead, it can quickly escalate as it moves through interconnected systems. This research identifies specific pathways and timeframes for these transmissions, revealing how initial fluctuations can be amplified by market dynamics and feedback loops. Notably, the study highlights potential mechanisms – such as common latent factors or correlated trading strategies – that contribute to this acceleration. Understanding both the speed and magnitude of these effects is crucial, as delayed or underestimated responses to volatility shocks can significantly increase systemic risk and destabilize the broader financial landscape.

Understanding the propagation of volatility shocks across financial markets is paramount for grasping systemic risk, as interconnectedness can rapidly transform localized events into widespread crises. This analysis provides crucial insights for risk managers, enabling them to move beyond isolated assessments and incorporate the potential for contagion. By identifying the speed and magnitude of spillover effects, strategies can be developed to mitigate these risks – whether through enhanced monitoring of correlated assets, dynamic hedging techniques, or proactive capital allocation. Ultimately, a comprehensive understanding of these interconnected dynamics is not merely an academic exercise, but a fundamental requirement for preserving financial stability and safeguarding against future systemic events.

Financial stability assessments often treat markets in isolation, yet the analysis reveals a critical need to account for interconnectedness. The study demonstrates that volatility shocks don’t remain contained within a single market; instead, they propagate across systems, influencing seemingly unrelated assets. This interconnectedness significantly impacts risk assessment, as traditional models failing to capture these spillover effects may underestimate systemic vulnerabilities. Supporting this, the model achieves an out-of-sample Root Mean Squared Error (RMSE) of 0.0044 – a performance level comparable to a standard univariate Heterogeneous Autoregressive (HAR) model, validating its ability to accurately reflect market dynamics while explicitly incorporating the influence of cross-market relationships. These findings highlight that a holistic view of financial systems, acknowledging the intricate web of dependencies, is paramount for effective risk management and maintaining overall stability.
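
The out-of-sample comparison itself reduces to a routine calculation; the trivial helper below assumes arrays of realized and forecast volatility for the held-out days.

```python
import numpy as np

def rmse(actual: np.ndarray, forecast: np.ndarray) -> float:
    """Root mean squared error of volatility forecasts over the held-out test days."""
    return float(np.sqrt(np.mean((np.asarray(actual) - np.asarray(forecast)) ** 2)))
```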

A one-standard-deviation shock to the Treasury futures (ZF and ZN) elicits the depicted impulse responses, with shaded regions representing 95% confidence intervals estimated from 1,000 bootstrap simulations.

The study meticulously dissects volatility transmission, revealing a network far less interconnected than conventional wisdom suggests. This sparseness, achieved through the HAR-ElasticNet model, implies that systemic risk, while present, isn’t the all-encompassing web often portrayed. It’s a fitting echo of John Stuart Mill’s observation that “The only way to have a firm opinion on anything is to know nothing about it.” The model doesn’t assume pervasive interconnectedness; instead, it actively seeks to disprove it through rigorous regularization. The findings suggest that focusing solely on broad systemic risk overlooks the concentrated nature of volatility spillovers, primarily within commodity markets. If the model had revealed a dense, fully connected network, one might suspect an overfitted, overly optimistic assumption about market integration.

What’s Next?

The apparent sparsity of volatility transmission revealed by this work is less a conclusion than an invitation to reconsider what constitutes “significant” linkage. The model, while adept at separating signal from noise, remains fundamentally constrained by the data it consumes – realized volatility, however meticulously calculated, is still a proxy for the complex, often irrational, behavior driving financial markets. Future iterations must grapple with the inherent limitations of relying solely on observable price movements, perhaps by incorporating alternative data sources reflecting investor sentiment or macroeconomic fundamentals.

Moreover, the concentration of spillovers within commodity markets begs further scrutiny. Is this a genuine characteristic of those markets, or an artifact of the model’s regularization parameters? A systematic exploration of different penalty functions and cross-validation techniques is essential, alongside a rigorous assessment of the model’s out-of-sample performance. The pursuit of parsimony should not eclipse the need for robustness; a model that elegantly explains the past is of limited value if it fails to anticipate the future.

Ultimately, data isn’t the goal; it’s a mirror of human error. The relative isolation of equities and treasuries, if confirmed by subsequent research, suggests either a degree of market efficiency previously underestimated, or simply that the linkages exist but operate on timescales or through mechanisms this model cannot capture. Even what can’t be measured still matters; it’s just harder to model.


Original article: https://arxiv.org/pdf/2601.03146.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
