Pricing Volatility with AI: A New Approach to Options Modeling

Author: Denis Avetisyan


Researchers have developed a novel physics-informed deep learning method to accurately price financial options under the widely-used Heston model.

The DeepSVM architecture integrates a DeepONet core with a hard-constrained ansatz to precisely satisfy terminal payoff conditions, while residual-based adaptive refinement stabilizes the training process, a configuration designed not to prevent decay, but to manage it with exacting control.

This work introduces DeepSVM, a deep operator network leveraging physics-informed neural networks and Sobolev regularization for improved stochastic volatility modeling.

Efficiently calibrating stochastic volatility models remains computationally challenging due to the repeated solving of partial differential equations. This limitation motivates the work presented in ‘DeepSVM: Learning Stochastic Volatility Models with Physics-Informed Deep Operator Networks’, which introduces a novel physics-informed Deep Operator Network capable of learning the solution operator for the Heston model across its entire parameter space without requiring labelled training data. DeepSVM achieves high pricing accuracy by enforcing key financial constraints and employing adaptive refinement techniques, yet exhibits noise in derivative calculations, specifically in the at-the-money regime. How can higher-order regularization strategies further refine physics-informed operator learning to ensure smooth and reliable derivative estimations for complex financial models?


The Inevitable Decay of Conventional Pricing

The foundation of fair and precise option pricing rests upon the ability to model market volatility, yet conventional methods frequently struggle to represent the intricacies of real-world financial behavior. Traditional models, often reliant on assumptions of constant or predictable volatility, fail to capture phenomena like volatility clustering – periods of high volatility followed by periods of calm – and the presence of ‘jumps’ in asset prices. This inadequacy stems from the fact that financial markets are inherently dynamic and influenced by a multitude of factors, rendering simple assumptions unrealistic. Consequently, options may be systematically mispriced, leading to incorrect hedging strategies and potentially significant financial losses for those relying on these models. The pursuit of more sophisticated volatility models isn’t merely an academic exercise; it’s a critical need for maintaining market stability and ensuring accurate risk assessment in a complex financial landscape, especially considering the growing volume and variety of derivative instruments traded globally.

Financial models frequently assume volatility – the degree of price fluctuation – follows predictable patterns, yet real-world markets demonstrate a persistent irregularity. This disconnect leads to systematic mispricing of options contracts, as standard models underestimate the probability of extreme price movements. Consequently, risk assessments relying on these flawed valuations can be severely inaccurate, potentially exposing investors to unexpected losses and hindering effective portfolio management. The challenge isn’t simply that volatility changes, but how it changes – exhibiting features like volatility clustering (periods of high volatility followed by periods of calm) and jumps that traditional models, often built on the assumption of continuous diffusion, fail to adequately capture. The consequence is that derivative pricing can diverge significantly from observed market prices, creating arbitrage opportunities and ultimately undermining the reliability of financial instruments.

The valuation of European options presents a unique modeling challenge due to the preference for closed-form analytical solutions. Unlike their American counterparts, European options cannot be exercised before maturity, demanding precise pricing formulas to avoid arbitrage opportunities. This reliance on analytical tractability restricts the complexity of volatility models that can be effectively employed; models must yield solutions expressible in standard mathematical forms. Consequently, even slight inaccuracies in volatility estimation can propagate into significant pricing errors, particularly for options with longer maturities or complex payoff structures. This creates a constant drive for more sophisticated, yet analytically manageable, models capable of capturing the nuances of market dynamics without sacrificing the speed and reliability of a closed-form solution.

Attempts to refine option pricing beyond traditional models have led to sophisticated approaches like Local Stochastic Volatility (LSV) and Rough Volatility, each striving to capture the intricacies of market fluctuations. LSV models incorporate both local volatility (volatility that varies with the underlying asset price and time) and a stochastic process governing volatility itself, offering greater flexibility. Rough Volatility, inspired by the mathematical concept of fractional Brownian motion, aims to model the ‘roughness’ inherent in volatility surfaces, recognizing that volatility isn’t always smooth. However, these advancements come at a cost; implementing LSV and Rough Volatility requires intensive computation. Closed-form solutions, readily available for simpler models, are often unattainable, necessitating complex numerical methods like Monte Carlo simulations or finite difference schemes. This computational burden can significantly increase the time and resources needed for pricing and risk management, presenting a practical hurdle despite the theoretical improvements in accuracy. The trade-off between model fidelity and computational feasibility remains a central challenge in quantitative finance.

DeepSVM accurately replicates the Heston semi-analytic solution for option pricing and corresponding Greeks (Delta and Gamma) across a range of randomly selected parameters, as demonstrated by close agreement in price and minimal absolute pricing error as a function of log-moneyness.

The Heston Model: A Necessary Equilibrium

The Heston model is a prevalent stochastic volatility model used in financial mathematics due to its ability to capture key features of volatility dynamics, specifically the volatility smile and kurtosis, while remaining analytically tractable. Unlike models assuming constant volatility, Heston’s model allows the variance itself to be a stochastic process, modeled as a Cox-Ingersoll-Ross (CIR) process. This results in a closed-form solution for option prices under certain conditions, simplifying calculations compared to more complex models. The model’s parameters – mean reversion rate, volatility of volatility, correlation between the asset price and its volatility, and the initial variance – provide flexibility in fitting observed market data. While more sophisticated models exist, the Heston model represents a balance between capturing realistic volatility behavior and maintaining computational efficiency, making it widely adopted in both theoretical research and practical applications like options pricing and risk management.
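
For concreteness, the standard risk-neutral form of the Heston dynamics is

$$dS_t = r S_t\,dt + \sqrt{v_t}\,S_t\,dW_t^S, \qquad dv_t = \kappa(\theta - v_t)\,dt + \sigma\sqrt{v_t}\,dW_t^v, \qquad dW_t^S\,dW_t^v = \rho\,dt,$$

where $\kappa$ is the mean-reversion rate, $\theta$ the long-run variance, $\sigma$ the volatility of volatility, $\rho$ the correlation, and $v_0$ the initial variance.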

The Heston model’s pricing framework is fundamentally based on solving a two-dimensional parabolic partial differential equation (PDE). This PDE describes the evolution of the asset price and its variance, and obtaining analytical solutions is generally intractable. Consequently, numerical methods are required, typically involving discretization of both the asset price and variance dimensions. The computational intensity arises from the need for fine discretization to ensure accuracy and stability, particularly when modeling path-dependent options or those sensitive to volatility changes. Robust numerical schemes, such as finite difference methods (explicit, implicit, or Crank-Nicolson) or more advanced techniques like operator splitting, are essential to handle the stiffness and potential ill-conditioning of the PDE and achieve convergence within reasonable computational time. The complexity scales with the desired accuracy and the range of asset prices and variances considered in the solution domain.
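
In its standard form (omitting any market price of volatility risk), this pricing PDE for the option value $V(S, v, t)$ reads

$$\frac{\partial V}{\partial t} + \tfrac{1}{2} v S^2 \frac{\partial^2 V}{\partial S^2} + \rho \sigma v S \frac{\partial^2 V}{\partial S\,\partial v} + \tfrac{1}{2} \sigma^2 v \frac{\partial^2 V}{\partial v^2} + r S \frac{\partial V}{\partial S} + \kappa(\theta - v) \frac{\partial V}{\partial v} - r V = 0,$$

subject to the terminal condition $V(S, v, T) = \max(S - K, 0)$ for a European call.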

The Feynman-Kac Partial Differential Equation (PDE) establishes a direct relationship between the Heston model’s pricing equation and the stochastic process governing the underlying asset’s volatility. Specifically, the FK PDE demonstrates that the option price can be expressed as the expected value of the payoff function conditional on the path-dependent volatility process. This is achieved through a probabilistic representation where the solution to the PDE is equivalent to integrating the payoff function multiplied by a weighting function determined by the volatility process’s transition density. Formally, this connection allows for the derivation of the PDE directly from the stochastic control problem inherent in the Heston model, validating its mathematical consistency and providing a framework for alternative solution methods based on Monte Carlo simulation.
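
For a European call, this probabilistic representation is simply

$$V(t, S_t, v_t) = \mathbb{E}^{\mathbb{Q}}\!\left[\, e^{-r(T - t)} \max(S_T - K, 0) \,\middle|\, S_t, v_t \right],$$

which is precisely the quantity a Monte Carlo estimator approximates by simulating paths of the coupled $(S_t, v_t)$ dynamics.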

The Log-Moneyness transformation, represented as $\log(K/X)$, where K is the strike price and X is the asset price, is implemented to address numerical challenges inherent in solving the Heston PDE. This transformation effectively re-scales the spatial domain, mapping the original asset price range to a more manageable and uniform distribution. This re-scaling improves the conditioning of the PDE, leading to enhanced stability in finite difference or other numerical schemes. Consequently, larger time steps and spatial grids can be used without sacrificing accuracy, directly resulting in significant computational efficiency gains. The transformation also mitigates issues arising from boundary conditions at extreme asset prices, further contributing to a more robust and accurate solution process.
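
A brief sketch, under the convention stated above, shows why the transformation helps: writing $u(x, v, t) = V(X, v, t)$ with $x = \log(K/X)$, the chain rule gives

$$X \frac{\partial V}{\partial X} = -\frac{\partial u}{\partial x}, \qquad X^2 \frac{\partial^2 V}{\partial X^2} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial u}{\partial x},$$

so the $X$-dependent coefficients of the pricing PDE become constant in $x$ (for fixed variance), which is what makes the transformed equation easier to handle numerically.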

Learning the Operator: A New Form of Preservation

DeepSVM employs a novel neural network architecture combining DeepONet and Physics-Informed Neural Networks (PINNs) to directly learn the option pricing operator for the Heston stochastic volatility model. DeepONet facilitates learning the mapping between source and target spaces, in this case, input parameters characterizing the Heston model – such as the volatility of volatility, correlation, and rate of mean reversion – and the corresponding option prices. PINNs are integrated to enforce the underlying partial differential equation (PDE) governing the Heston model, namely the Feynman-Kac pricing PDE described above. This allows the network to learn the operator – the function that transforms model parameters into option prices – without requiring explicit, numerical solutions of the PDE at each evaluation point. The Heston model’s parameters, $K$ (strike price), $T$ (time to maturity), $S$ (spot price), $\sigma$ (volatility of volatility), $\rho$ (correlation), and $v_0$ (initial variance), serve as inputs to the network, which outputs the option price.

Traditional methods for pricing financial derivatives under models like Heston typically involve solving the partial differential equation (PDE) governing the asset price, often through computationally intensive techniques such as finite difference or Monte Carlo simulation. DeepSVM departs from this approach by directly learning the option pricing operator – the mapping from underlying asset price and time to option price – using a DeepONet architecture. This operator learning paradigm allows DeepSVM to predict option prices without iteratively solving the PDE at evaluation time. Consequently, once the operator is trained, price predictions can be obtained with significantly reduced computational cost, potentially offering a substantial speedup compared to conventional PDE-solving methods, especially for high-dimensional problems or real-time applications.
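
As a rough illustration of this operator-learning setup, a minimal DeepONet-style network might pair a branch net that embeds the Heston parameters with a trunk net that embeds the evaluation point. The layer sizes, input choices, and class names below are assumptions for the sketch, not the architecture reported in the paper.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Plain fully connected network with tanh activations."""
    def __init__(self, sizes):
        super().__init__()
        layers = []
        for i in range(len(sizes) - 1):
            layers.append(nn.Linear(sizes[i], sizes[i + 1]))
            if i < len(sizes) - 2:
                layers.append(nn.Tanh())
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

class HestonDeepONet(nn.Module):
    """Branch net embeds the Heston parameters; trunk net embeds the
    evaluation point (log-moneyness x, variance v, time t); the price
    is the inner product of the two embeddings."""
    def __init__(self, n_params=5, latent=64):
        super().__init__()
        self.branch = MLP([n_params, 64, 64, latent])  # e.g. (kappa, theta, sigma, rho, v0)
        self.trunk = MLP([3, 64, 64, latent])          # (x, v, t)

    def forward(self, params, xvt):
        b = self.branch(params)                    # (batch, latent)
        tr = self.trunk(xvt)                        # (batch, latent)
        return (b * tr).sum(dim=-1, keepdim=True)   # (batch, 1) price
```

Once trained, evaluating such a network for a new parameter vector is a single forward pass, which is where the speedup over repeated PDE solves comes from.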

The Hard-Constrained Ansatz within DeepSVM embeds the known terminal and boundary conditions of the Heston partial differential equation (PDE) directly into the neural network architecture. Rather than adding penalty terms to the loss function, the network’s output is wrapped in an ansatz that satisfies these pre-defined conditions by construction. This approach differs from traditional PINN implementations that enforce boundary conditions as soft constraints, offering guaranteed satisfaction of the terminal payoff and improving the stability and accuracy of the learned solution, particularly when dealing with complex financial models like the Heston model where precise boundary condition enforcement is critical for option pricing.
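
A minimal sketch of one such hard-constraint construction, assuming a multiplicative $(T - t)$ factor so the terminal payoff is recovered exactly at maturity (the specific ansatz used in the paper may differ), and reusing the hypothetical model interface from the sketch above:

```python
import torch

def hard_constrained_price(model, params, S, v, t, K, T):
    """Illustrative hard-constraint ansatz: the network output is scaled by
    (T - t), so at t = T the prediction collapses exactly to the terminal
    payoff max(S - K, 0); no loss penalty is needed to enforce it."""
    payoff = torch.clamp(S - K, min=0.0).unsqueeze(-1)  # exact terminal condition
    x = torch.log(K / S)                                 # log-moneyness, per the convention above
    xvt = torch.stack([x, v, t], dim=-1)                 # trunk-net input
    return payoff + (T - t).unsqueeze(-1) * model(params, xvt)
```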

Sobolev training, as implemented in DeepSVM, enhances operator learning by directly incorporating derivative information into the loss function. Instead of solely minimizing the error between predicted and true option prices, the training process also minimizes the error in the predicted derivatives of the operator. This is achieved by adding terms to the loss function proportional to the $L^2$ norm of the derivatives of the learned operator, effectively penalizing solutions that are not smooth. By minimizing these derivative errors, the model is encouraged to learn a more well-behaved operator, leading to improved generalization performance, particularly when evaluating the operator at inputs not seen during training. The use of Sobolev norms facilitates learning operators that satisfy certain regularity conditions, resulting in more stable and accurate option price predictions.
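
One possible instantiation of such a derivative penalty, sketched here as a curvature regularizer along the log-moneyness direction obtained via automatic differentiation (the choice of derivative, the weighting, and the model interface are assumptions, not the paper’s exact formulation):

```python
import torch

def sobolev_penalty(model, params, x, v, t, lam=1e-3):
    """Illustrative Sobolev-style term: penalize the squared second derivative
    of the predicted price with respect to log-moneyness, discouraging the
    jagged derivative estimates mentioned above. The weight lam is assumed."""
    x = x.clone().requires_grad_(True)
    xvt = torch.stack([x, v, t], dim=-1)
    price = model(params, xvt)
    d1 = torch.autograd.grad(price.sum(), x, create_graph=True)[0]
    d2 = torch.autograd.grad(d1.sum(), x, create_graph=True)[0]
    return lam * (d2 ** 2).mean()  # added to the PDE-residual loss during training
```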

Adaptive Refinement: Focusing the Inevitable

Residual-Based Adaptive Refinement (RAR) is a training methodology that prioritizes regions of the solution space exhibiting the largest errors during model training. This is achieved by dynamically adjusting the sampling distribution to concentrate on areas where the model’s performance is suboptimal. By focusing computational resources on these high-error regions, RAR accelerates the convergence rate of the training process and improves the overall accuracy of the resulting model. This targeted approach contrasts with uniform sampling techniques, which allocate equal resources to all areas regardless of error magnitude, and is particularly effective in complex function approximation tasks such as option pricing.

Residual-Based Adaptive Refinement (RAR) prioritizes training samples based on the magnitude of the residual error, effectively focusing computational resources on regions of the solution space where the model exhibits the greatest uncertainty. This adaptive sampling strategy contrasts with uniform sampling, which allocates equal weight to all data points regardless of their contribution to the overall error. By concentrating on high-error regions, RAR ensures the neural network learns to accurately represent the most critical features of the pricing operator – those that most significantly influence the option price. This targeted learning process improves the model’s ability to generalize and achieve accurate pricing across a broader range of inputs, ultimately leading to faster convergence and improved performance.
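
A minimal sketch of one RAR step under these assumptions: draw a pool of random candidate collocation points, score them by the magnitude of the PDE residual, and retain the worst offenders for subsequent training. The domain bounds, pool sizes, and the residual callable below are placeholders, not values from the paper.

```python
import torch

def rar_select(model, params, pde_residual, n_candidates=10_000, n_add=200):
    """Illustrative RAR step: sample candidate collocation points uniformly
    over an assumed (log-moneyness, variance, time) box, score them by the
    absolute PDE residual, and return the worst n_add points so they can be
    appended to the training set."""
    lo = torch.tensor([-2.0, 0.0, 0.0])   # assumed lower bounds for (x, v, t)
    hi = torch.tensor([2.0, 1.0, 1.0])    # assumed upper bounds
    cand = lo + (hi - lo) * torch.rand(n_candidates, 3)
    # pde_residual is a placeholder callable that evaluates the physics-informed
    # residual at each candidate point (handling autograd internally).
    res = pde_residual(model, params, cand).detach().abs().reshape(-1)
    worst = torch.topk(res, n_add).indices
    return cand[worst]
```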

Compared to uniform sampling methods, which allocate computational resources equally across the entire solution space, a targeted approach focusing on regions of high error achieves greater efficiency and accuracy. Uniform sampling can be highly inefficient, particularly in high-dimensional problems, as it dedicates resources to areas where the model already performs well or where the gradient is negligible. Conversely, concentrating computational effort on areas exhibiting significant error – as implemented through Residual-Based Adaptive Refinement (RAR) – allows for faster convergence and more precise approximations of the pricing operator. This selective allocation of resources minimizes redundant calculations and maximizes the impact of each training iteration, resulting in a demonstrably improved solution with pricing residuals of $\mathcal{O}(10^{-5})$.

The integration of DeepSVM with Residual-Based Adaptive Refinement (RAR) represents a methodological advancement in option pricing. This combination achieves pricing residuals on the order of $\mathcal{O}(10^{-5})$, indicating a high degree of accuracy. Critically, this level of precision is comparable to that obtained through semi-analytic solution methods, which are computationally expensive and often limited in their applicability to complex financial instruments. The DeepSVM-RAR approach offers a data-driven alternative capable of achieving similar accuracy with potentially greater efficiency and scalability, particularly when dealing with high-dimensional option pricing problems.

Spatial maps of the PDE residual mean-squared error reveal consistent error distributions across time-to-maturity and parameter vectors, visualized using a logarithmic color scale.

A Shift in Perspective: Embracing the Inevitable Decay

The innovative framework demonstrably reduces computational demands when pricing options, especially in scenarios defined by complex, high-dimensional volatility surfaces. Traditional methods often struggle with the exponential growth in complexity as the number of underlying assets or volatility parameters increases, requiring substantial processing time and resources. This approach, however, bypasses these limitations by learning the option pricing operator directly, rather than relying on computationally intensive simulations or discretizations. The result is a significant speedup, allowing for rapid pricing and calibration even in highly complex markets. This computational advantage is not merely incremental; it unlocks the potential for real-time risk management and more sophisticated trading strategies previously hampered by prohibitive computational costs, potentially revolutionizing the way financial institutions approach derivative pricing.

The developed framework isn’t limited to the specific stochastic volatility model initially employed; its architecture is designed for broad applicability. Researchers anticipate successful extensions to encompass a wider range of models, including those with more complex volatility dynamics. Importantly, the methodology isn’t restricted to continuous processes; it holds the potential to incorporate jump-diffusion processes, which account for sudden, discontinuous price movements. This adaptability stems from the operator learning approach, which focuses on learning the underlying dynamics rather than being tied to a specific mathematical formulation. Consequently, the model promises a versatile tool for pricing derivatives under a variety of market conditions and assumptions, ultimately enhancing its practical relevance and predictive power in financial modeling.

Ongoing investigations are directed toward enhancing the model’s resilience to market fluctuations and broadening its applicability across diverse financial instruments. Current efforts prioritize stress-testing the framework with historical and simulated data, encompassing extreme events and varying market regimes, to ensure consistent performance under adverse conditions. Researchers are also actively exploring the model’s adaptation to more complex stochastic volatility structures and its potential integration with high-frequency trading algorithms. A key component of this future work involves validating the model’s predictive power and profitability through backtesting on real-world options data, ultimately aiming to demonstrate its practical value for traders and risk managers in live market environments.

The proposed framework demonstrates the potential for significantly enhanced accuracy and efficiency in option pricing by strategically combining operator learning with Monte Carlo simulation. This synergy allows the model to leverage the strengths of both approaches – the broad applicability of Monte Carlo and the speed of learned operators – resulting in highly competitive pricing results. Crucially, the model achieves near-instantaneous inference, characterized by an inference cost of $\mathcal{O}(1)$. This represents a substantial departure from traditional methods, effectively eliminating the computational bottleneck that often hinders real-time pricing and calibration in complex financial markets, and opening doors for more dynamic and responsive trading strategies.

The pursuit of accurate derivative calculations, as demonstrated by DeepSVM’s application to the Heston model, inevitably encounters the limits of any system striving for perfect representation. The model’s success in pricing options is a fleeting moment against the backdrop of inevitable decay – a phenomenon acutely observed in the need for further regularization techniques to ensure smooth Greek calculations. As John Locke stated, “All mankind… being all equal and independent, no one ought to harm another in his life, health, liberty, or possessions.” This echoes the inherent instability within complex systems; just as individuals require protection, so too do mathematical models require robust regularization to maintain their integrity and prevent the ‘harm’ of inaccurate derivatives over time. Any improvement in accuracy ages faster than expected, necessitating continual refinement.

What Lies Ahead?

The architecture presented here, DeepSVM, offers a predictable outcome: a functional approximation of a known model. Every architecture lives a life, and this one demonstrably navigates the Heston landscape with increasing competence. However, the pursuit of smoothness – of derivatives that don’t betray the underlying approximation – reveals a familiar constraint. The model’s ability to accurately calculate Greeks remains a challenge, highlighting that improvements age faster than one can understand them.

Future iterations will undoubtedly explore more sophisticated regularization techniques, attempting to impose order on a system fundamentally defined by stochasticity. Yet, a broader question persists: is the pursuit of perfect derivative calculation merely a refinement of the tool, or a misguided attempt to eliminate inherent uncertainty? The model replicates; it does not explain. The true limit is not computational speed, but the acknowledgement that the model is, ultimately, a map-not the territory.

The next stage may well involve a move beyond the Heston model itself. The architecture’s strength lies in its capacity to incorporate physics-informed constraints. Applying this framework to more complex volatility surfaces, or even entirely different financial instruments, presents a natural extension. The inevitable decay of any model’s predictive power is certain; the grace with which it ages is the metric worth observing.


Original article: https://arxiv.org/pdf/2512.07162.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
