Beyond SABR: Refining Volatility with Geometric Deep Learning

Author: Denis Avetisyan

A novel hybrid approach combines the strengths of established financial modeling with the power of neural networks to deliver more accurate and stable implied volatility surfaces.

Comparison of the Hagan formula with Monte Carlo estimates of the SABR implied volatility smile, showing how closely the analytical approximation tracks the computationally intensive simulation.

This paper introduces a geometry-aware residual correction to Hagan’s SABR model using a neural network trained on geometric features, enhancing financial modeling precision.

Accurate and robust modeling of implied volatility surfaces remains a persistent challenge in quantitative finance. This paper, ‘A Geometry-Aware Residual Correction of Hagan’s SABR Implied Volatility Formula’, introduces a hybrid methodology that combines the analytical structure of the SABR model with a neural network trained on geometrically informed features. By learning the residual error relative to Hagan’s approximation, rather than directly predicting implied volatility, the framework delivers improved accuracy and stability while preserving model interpretability. Could this approach unlock more efficient real-time pricing and calibration in complex trading environments?

The Illusion of Precision: Modeling Volatility’s Smile

Option pricing, at its core, isn’t simply about predicting a future value; it demands an accurate representation of market expectations regarding price fluctuations. This is where the ‘smile’ of implied volatility emerges as a central challenge. Rather than a uniform expectation of volatility across all strike prices, options markets consistently exhibit a pattern where out-of-the-money and in-the-money options are priced as if they anticipate higher volatility than at-the-money options – a curve resembling a smile when plotted. Capturing this ‘smile’ accurately is notoriously difficult because it reflects complex factors like investor fear, supply and demand imbalances, and the anticipation of extreme events – elements not easily captured by traditional financial models. Failing to model the volatility smile correctly can lead to significant mispricing of options and substantial risk for those involved in trading or hedging strategies, making its accurate representation a critical pursuit in quantitative finance.

The SABR model, a cornerstone of volatility modeling thanks to its closed-form approximations for implied volatility, faces limitations when confronted with the intricacies of real-world financial markets. While the model offers a comparatively swift calculation of implied volatility, which is crucial for pricing derivatives, its foundational assumptions often diverge from observed behaviors. Specifically, the model presumes a constant correlation between the underlying asset and its volatility, a simplification that fails to capture phenomena like volatility clustering and the impact of macroeconomic events. This rigidity becomes particularly problematic during periods of market stress, where correlations shift dynamically and the model’s predictions can deviate significantly from actual price movements. Consequently, despite its elegance, the SABR model frequently requires recalibration and often struggles to reproduce the full pattern of implied volatility observed across strike prices and expiration dates, the pattern known as the volatility smile or skew.

The pursuit of accurate option pricing frequently encounters limitations within current methodologies. Many established techniques, while theoretically sound, demand substantial computational resources to generate reliable results, particularly when dealing with complex portfolio scenarios or exotic options. This reliance on intensive simulations introduces both time constraints and potential for numerical error. Alternatively, some models prioritize speed through simplification, sacrificing precision in the estimation of risk exposures – a critical flaw when managing large positions or assessing tail risk. Consequently, financial institutions often face a trade-off between computational feasibility and the granularity needed for effective risk management, prompting ongoing research into more efficient and accurate volatility modeling approaches.

GeoNN accurately captures implied volatility curvature across short- and intermediate-term maturities, exhibiting only slight underestimation of wings at longer horizons, as compared to Monte Carlo simulations.

Analytical Tools and the Pursuit of Ground Truth

The Hagan formula is a closed-form approximation designed to efficiently calculate implied volatility under the Stochastic Alpha Beta Rho (SABR) model. This formula, derived through asymptotic expansions, provides a rapid estimate of implied volatility for a given moneyness and tenor, making it computationally suitable for tasks requiring numerous calculations, such as calibration and risk management. While not perfectly accurate, the Hagan formula serves as the baseline that more expensive methods, such as Monte Carlo simulation, are used to check, and it is widely used in practice due to its speed and relative accuracy, particularly for short-dated options. The formula approximates the implied volatility $\hat{\sigma}$ from the SABR parameters α, β, ρ, and the volatility of volatility ν, together with the forward rate $F$, the strike $K$, and the tenor $T$.
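
To make the formula concrete, here is a minimal sketch of the standard lognormal (Black) form of Hagan’s approximation as published by Hagan et al. in 2002. The function name `hagan_lognormal_vol` and its argument names are illustrative choices, not taken from the paper discussed here, and the paper may use a different variant or additional safeguards.

```python
import math

def hagan_lognormal_vol(F, K, T, alpha, beta, rho, nu):
    """Hagan et al. (2002) lognormal SABR implied-volatility approximation.

    F: forward, K: strike, T: time to expiry in years,
    alpha: initial volatility, beta: elasticity in [0, 1],
    rho: correlation with |rho| < 1, nu: volatility of volatility.
    Hypothetical helper for illustration only.
    """
    omb = 1.0 - beta                       # shorthand for (1 - beta)
    log_fk = math.log(F / K)
    fk_pow = (F * K) ** (omb / 2.0)        # (F K)^((1 - beta) / 2)

    # First-order correction in time to expiry.
    time_term = 1.0 + T * (
        omb ** 2 / 24.0 * alpha ** 2 / fk_pow ** 2
        + rho * beta * nu * alpha / (4.0 * fk_pow)
        + (2.0 - 3.0 * rho ** 2) / 24.0 * nu ** 2
    )

    # Denominator carrying the log-moneyness expansion.
    denom = fk_pow * (
        1.0 + omb ** 2 / 24.0 * log_fk ** 2 + omb ** 4 / 1920.0 * log_fk ** 4
    )

    # z / x(z) tends to 1 as z tends to 0 (at the money, or when nu = 0).
    z = nu / alpha * fk_pow * log_fk
    if abs(z) < 1e-8:
        z_over_x = 1.0
    else:
        x = math.log((math.sqrt(1.0 - 2.0 * rho * z + z * z) + z - rho) / (1.0 - rho))
        z_over_x = z / x

    return alpha / denom * z_over_x * time_term

# Example: a three-point smile with negative correlation (illustrative inputs).
smile = [hagan_lognormal_vol(100.0, k, 1.0, 0.2, 1.0, -0.3, 0.4) for k in (80.0, 100.0, 120.0)]
```

With β = 1 and ν = 0 the expression collapses to a flat volatility equal to α, which is a convenient sanity check on any implementation.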

Monte Carlo simulation generates implied volatility surfaces by simulating numerous stochastic paths of the underlying asset, averaging the option payoff across those paths to obtain a price, and then inverting a Black-type pricing formula to find the implied volatility that reproduces that price. While computationally intensive, this method offers significant flexibility, accommodating complex payoff structures, path-dependent options, and models that lack closed-form solutions. The resulting implied volatilities serve as a benchmark for evaluating the accuracy of faster, analytical approximations like the Hagan formula, and are crucial for validating model calibration against market data; however, keeping the sampling error small typically requires a substantial number of simulated paths, making the method considerably slower than analytical alternatives.
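
The procedure can be sketched in a few lines. The following is a minimal illustration, assuming a plain Euler step for the forward, an exact lognormal step for the volatility, absorption at zero, and bisection on the undiscounted Black-76 formula. The discretization, the path counts, and all function names are assumptions made for this example and are not the paper’s simulation setup.

```python
import math
import numpy as np

def _norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black76_call(F, K, T, sigma):
    """Undiscounted Black-76 call price."""
    s = sigma * math.sqrt(T)
    d1 = (math.log(F / K) + 0.5 * s * s) / s
    return F * _norm_cdf(d1) - K * _norm_cdf(d1 - s)

def implied_vol_bisect(price, F, K, T, lo=1e-6, hi=5.0, iters=100):
    """Invert Black-76 by bisection; the call price is increasing in sigma."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if black76_call(F, K, T, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sabr_mc_implied_vol(F0, K, T, alpha, beta, rho, nu,
                        n_paths=50_000, n_steps=50, seed=0):
    """Monte Carlo SABR call price mapped to a Black implied volatility.

    Assumes 0 < beta <= 1 and |rho| <= 1. Illustrative scheme only.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    F = np.full(n_paths, F0, dtype=float)
    sigma = np.full(n_paths, alpha, dtype=float)
    for _ in range(n_steps):
        z1 = rng.standard_normal(n_paths)
        # Correlate the volatility shock with the forward shock.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.standard_normal(n_paths)
        # Euler step for the forward, floored at zero (absorbing for beta > 0).
        F = np.maximum(F + sigma * F ** beta * sqrt_dt * z1, 0.0)
        # Exact step for the driftless lognormal volatility process.
        sigma = sigma * np.exp(nu * sqrt_dt * z2 - 0.5 * nu ** 2 * dt)
    price = float(np.mean(np.maximum(F - K, 0.0)))  # average payoff = price
    return implied_vol_bisect(price, F0, K, T)
```

Note that the inversion happens once, on the averaged price, not path by path. Repeating the call over a grid of strikes and maturities yields the reference surface against which an approximation can be scored.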

Variance reduction techniques are critical for optimizing Monte Carlo simulations used in financial modeling because convergence is inherently costly: the standard error shrinks only with the square root of the number of paths. Specifically, the control variate method improves efficiency by leveraging a related variable with a known expected value to reduce the variance of the estimator. This is achieved by subtracting from each simulated outcome a scaled version of the control variate’s deviation from its known expected value; because that deviation has zero mean, the adjustment lowers the overall variance without biasing the result. The effectiveness of a control variate depends on the degree of correlation between the simulated variable and the control; stronger correlations (in absolute value) lead to greater variance reduction. Other techniques, such as importance sampling and stratified sampling, also contribute to enhanced simulation performance by altering the sampling distribution or partitioning the sample space, respectively, ultimately decreasing the computational time required to achieve a specified level of accuracy.
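
A small sketch makes the control variate adjustment concrete. It prices an at-the-money call on a lognormal forward and uses the terminal forward itself as the control, since its expectation equals today’s forward. This choice of control, the function name `control_variate_estimate`, and the toy parameters are assumptions for illustration; the paper’s actual control variate is not specified here.

```python
import numpy as np

def control_variate_estimate(y, x, x_mean):
    """Control-variate estimate of E[y] using control x with known mean x_mean.

    Returns (estimate, standard error with control, standard error without).
    The scaling coefficient is estimated from the same sample, which adds a
    small bias that vanishes as the sample grows.
    """
    cov = np.cov(y, x)                   # 2x2 sample covariance matrix
    c = cov[0, 1] / cov[1, 1]            # variance-minimizing coefficient
    adjusted = y - c * (x - x_mean)      # subtract the scaled zero-mean deviation
    n = len(y)
    se_cv = adjusted.std(ddof=1) / np.sqrt(n)
    se_plain = y.std(ddof=1) / np.sqrt(n)
    return adjusted.mean(), se_cv, se_plain

# Toy example: lognormal forward with 20% volatility over one year.
rng = np.random.default_rng(0)
F0, K, vol, T = 100.0, 100.0, 0.2, 1.0
z = rng.standard_normal(20_000)
F_T = F0 * np.exp(-0.5 * vol ** 2 * T + vol * np.sqrt(T) * z)
payoff = np.maximum(F_T - K, 0.0)
estimate, se_cv, se_plain = control_variate_estimate(payoff, F_T, F0)
```

Comparing `se_cv` with `se_plain` shows how much sampling error the control removes for the same number of paths.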

GeoNN significantly reduces dispersion when correlating neural network predictions with Monte Carlo implied volatilities across in-the-money, at-the-money, and out-of-the-money options, demonstrating improved accuracy over the Naive Deep Network (NDN).

Learning the Landscape: Neural Networks and Implied Volatility

Deep Feedforward Networks (DFNNs) provide a data-driven approach to estimating implied volatility, circumventing the need for analytical solutions or parametric models traditionally employed in options pricing. Unlike methods reliant on assumptions about the volatility surface – such as the Hagan formula – DFNNs learn the relationship between option characteristics (strike price, time to maturity, underlying asset price) and implied volatility directly from market data. This is achieved through multiple layers of interconnected nodes, enabling the network to model complex, non-linear relationships. The network is trained to minimize the difference between predicted and observed implied volatilities, effectively learning a function that maps option parameters to their corresponding implied volatility values. This direct approximation capability allows for potentially higher accuracy and adaptability to various market conditions, although performance is contingent on the quality and quantity of training data.

Residual Neural Networks (ResNets) improve the accuracy of implied volatility surface calibration by learning the residual difference between the Hagan formula’s output and the true implied volatility. The Hagan formula, while efficient, exhibits limitations in capturing the full complexity of the implied volatility surface, particularly in the tails and away from the at-the-money strikes. ResNets decompose the problem into two parts: the Hagan formula provides an initial estimate, and the ResNet learns to predict the correction needed to refine this estimate. This approach allows the network to focus on learning the more subtle deviations from the Hagan approximation, leading to a more accurate representation of the implied volatility surface and improved calibration performance. The network effectively learns a function $f(X, Y)$ where $X$ and $Y$ represent the strike and maturity, respectively, and the output is added to the Hagan formula’s result to produce the final calibrated implied volatility.
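
The decomposition into a baseline plus a learned correction can be illustrated with a toy example. The sketch below trains a one-hidden-layer network with plain gradient descent to fit the gap between a flat stand-in baseline and a curved stand-in reference smile. Everything here is an assumption made for illustration: the data are synthetic, the input is a single log-moneyness feature, and the architecture and the name `train_residual_mlp` bear no relation to the paper’s actual network or features.

```python
import numpy as np

def train_residual_mlp(x, residual, hidden=16, lr=0.05, steps=2000, seed=0):
    """Fit a tiny tanh network to `residual` with full-batch gradient descent.

    x: array of shape (n, d); residual: array of shape (n,).
    Returns a function mapping new inputs to predicted corrections.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    W1 = rng.standard_normal((d, hidden))
    b1 = np.zeros(hidden)
    W2 = np.zeros((hidden, 1))     # zero init: the correction starts at exactly 0
    b2 = np.zeros(1)
    y = residual.reshape(-1, 1)
    for _ in range(steps):
        h = np.tanh(x @ W1 + b1)               # hidden activations, (n, hidden)
        pred = h @ W2 + b2                     # predicted residual, (n, 1)
        g_pred = 2.0 * (pred - y) / n          # gradient of the mean squared error
        g_W2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_pre = (g_pred @ W2.T) * (1.0 - h ** 2)   # backprop through tanh
        g_W1 = x.T @ g_pre
        g_b1 = g_pre.sum(axis=0)
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2

    def correction(x_new):
        return (np.tanh(x_new @ W1 + b1) @ W2 + b2).ravel()

    return correction

# Synthetic stand-ins, NOT the paper's data.
k = np.linspace(-0.5, 0.5, 101).reshape(-1, 1)   # log-moneyness grid
baseline_vol = np.full(101, 0.20)                # stands in for the Hagan output
reference_vol = 0.20 + 0.10 * k.ravel() ** 2     # stands in for Monte Carlo
correction = train_residual_mlp(k, reference_vol - baseline_vol)
corrected_vol = baseline_vol + correction(k)     # baseline plus learned residual
```

Because the output layer starts at zero, the corrected surface initially coincides with the baseline, so training can only be judged by how far it moves the result toward the reference.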

The performance of neural networks used for implied volatility calibration is heavily dependent on the characteristics of the training dataset. Specifically, data quality, encompassing accuracy, completeness, and representativeness of the entire option surface, directly impacts model convergence and generalization ability. Insufficient or noisy training data can lead to overfitting, where the network memorizes the training examples but fails to accurately predict volatilities for unseen data points. Conversely, a large, clean, and diverse dataset, covering a wide range of strikes, maturities, and underlying asset prices, facilitates robust learning and improved out-of-sample performance. Data preprocessing steps, such as outlier removal and error correction, are crucial for ensuring data integrity and maximizing model accuracy. Furthermore, the distribution of the training data should reflect the expected distribution of market conditions to avoid bias in the calibrated implied volatility surface.

ResNN demonstrates stable convergence and reduced overfitting, achieving a validation loss of approximately 0.113 during training.

Beyond Euclidean Space: Geometry-Aware Neural Networks

Geometric Neural Networks (GNNs, labeled GeoNN in the figures) incorporate principles from Riemannian geometry to better represent and process data derived from the SABR model. Specifically, the network architecture is designed to operate on manifolds, acknowledging the non-Euclidean nature of implied volatility surfaces. This approach differs from traditional neural networks, which assume a flat, Euclidean input space. By leveraging the intrinsic geometric structure of the SABR model, including concepts like curvature and geodesic distances, GNNs can more accurately capture the relationships between model parameters and implied volatilities, leading to improved calibration and hedging performance. The implementation involves defining appropriate metrics and connections on the manifold to facilitate gradient-based optimization during network training.

The Geometric Residual Neural Network (GRNN, labeled GeoResNN in the figures) architecture integrates the benefits of both residual learning and geometry-aware input features to improve model performance. Residual learning, implemented through skip connections, facilitates the training of deeper networks by mitigating the vanishing gradient problem. Simultaneously, incorporating Riemannian geometry into the input layer allows the network to directly process the intrinsic geometric structure of the underlying data, specifically within the SABR model. This combination results in enhanced accuracy and improved stability during training, as demonstrated by quantitative results: on held-out test datasets the GRNN achieved an R² score of 0.97, exceeding the performance of a Naive Deep Network (R² = 0.73), a Geometric Neural Network (R² = 0.94), and a Residual Neural Network (R² = 0.92).

Quantitative validation on held-out data gave the GRNN a coefficient of determination (R²) of 0.97. This performance exceeds that of several baseline models: a Naive Deep Network achieved an R² of 0.73, a standard Geometric Neural Network yielded an R² of 0.94, and a Residual Neural Network achieved an R² of 0.92. These results indicate a marked improvement in predictive accuracy when geometry-aware features and residual learning are combined.
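
For readers who want to reproduce this kind of comparison, the coefficient of determination can be computed as one minus the ratio of residual to total sum of squares. The sketch below uses that textbook definition; whether the paper uses exactly this convention is an assumption, and the function name `r_squared` is illustrative.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    y_true: reference values (for example, Monte Carlo implied volatilities).
    y_pred: model outputs on the same points.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    return 1.0 - ss_res / ss_tot
```

A value of 1 means the predictions match the reference exactly, while a model that always predicts the mean of the reference scores 0.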

GeoNN demonstrates greater robustness to stressed SABR parameters, preserving skew and central curvature more effectively than a naive model.

Toward a Predictive Future: Beyond Calibration

Geometry-aware neural networks represent a significant leap beyond traditional calibration methods in financial modeling. These techniques don’t merely refine existing models to fit observed data; they fundamentally reshape the approach to understanding complex financial landscapes. By explicitly incorporating the underlying geometric structure of option prices – the relationships between strikes, maturities, and implied volatility – these networks can more accurately capture subtle market dynamics. This allows for not only improved derivative pricing but, crucially, the potential for real-time risk management. The ability to swiftly assess portfolio sensitivities and predict market movements under various scenarios provides a substantial advantage in volatile conditions, offering a proactive approach to mitigating financial exposure and optimizing investment strategies. This methodology promises a future where financial institutions can move beyond reactive adjustments to embrace predictive, geometry-informed decision-making.

Continued advancements in this field necessitate exploration beyond current datasets; integrating alternative data sources – such as sentiment analysis from news articles, high-frequency trading data, or macroeconomic indicators – promises to refine predictive capabilities and enhance model robustness. Simultaneously, the development of more complex neural network architectures, including transformers and graph neural networks, offers a pathway to capture intricate relationships within financial data that simpler models may overlook. Researchers are actively investigating hybrid approaches, combining the strengths of these advanced architectures with established financial models, to create systems capable of not only accurate predictions but also interpretable risk assessments and real-time adaptation to changing market dynamics. This ongoing pursuit of both data diversification and architectural innovation is poised to unlock the full potential of geometry-aware neural networks in finance.

The development of geometry-aware neural networks signals a potential paradigm shift in financial modeling, moving beyond traditional methods hampered by computational constraints and calibration challenges. These networks offer the promise of models that not only achieve heightened accuracy in complex calculations, such as those required for derivative pricing and risk assessment, but do so with significantly improved efficiency. This is achieved through the inherent ability of these architectures to learn the underlying geometric structure of financial data, allowing for faster computation and more robust predictions even with high-dimensional datasets. Consequently, financial institutions may soon be able to react to market changes in real time, implement more sophisticated risk management strategies, and ultimately gain a competitive edge through more informed decision-making.

GeoResNN demonstrates strong alignment with Monte Carlo simulations across all strike prices, exhibiting minimal dispersion in predicted implied volatilities.

The pursuit of a more accurate implied volatility surface, as detailed in this work, echoes a fundamental tenet of rational inquiry. This study doesn’t claim to find the truth about market behavior, but rather to iteratively refine an approximation. As Simone de Beauvoir observed, “One is not born, but rather becomes a woman,” – a statement readily adaptable to models. Just as identity is a process of becoming, so too is a robust financial model. The hybrid framework presented here, combining the analytical strength of the SABR model with the adaptive capacity of neural networks, doesn’t offer a perfect prediction, but a continually improving one. Predictive power is not causality; it’s a refined estimate, sculpted through repeated testing and correction, a becoming rather than a being.

What Lies Ahead?

The pursuit of accurate implied volatility surfaces will undoubtedly continue, though one suspects the diminishing returns will soon demand more than incremental adjustments to established formulas. This work, blending the analytical elegance of SABR with the pattern-matching capacity of neural networks, offers a fleeting glimpse of potential, but it’s crucial to acknowledge what remains stubbornly unresolved. The model’s reliance on geometric features, while innovative, raises the question of whether truly critical information is being discarded in the abstraction. Every dataset is, after all, just an opinion from reality.

Future investigations should prioritize stress-testing this hybrid approach against extreme market conditions, since the outliers are, predictably, where these models falter most dramatically. Further exploration of alternative neural network architectures, perhaps those better suited to capturing temporal dependencies, could also prove fruitful. However, a more fundamental shift might be required. The devil isn’t in the details; he’s in the outliers, and chasing ever-more-complex calibrations risks overfitting to a present that will inevitably prove illusory.

Ultimately, the goal isn’t merely to predict the surface, but to understand the underlying economic forces that create it. A model that perfectly reproduces observed prices without offering insight into the dynamics driving those prices is, at best, a sophisticated form of accounting. The true test will be whether this approach, or its successors, can meaningfully improve risk management and inform more rational investment decisions – a benchmark far more rigorous than mere calibration error.

Original article: https://arxiv.org/pdf/2605.06604.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/