Predicting Claims Costs with Neural Networks

Author: Denis Avetisyan


New research demonstrates how machine learning can improve the accuracy of loss reserving by leveraging both initial estimates and actual payment data.

The study demonstrates that the forecasting model assesses outstanding claim amounts accurately, with predictions approaching the actual outstanding amounts (ratios near unity) and showing a strong correlation, indicated by larger values, with the forecasts of a conventional case estimate; both metrics are evaluated across accident and notification quarters.

This study investigates the application of neural networks to individual loss reserving, showing that incorporating case estimate data enhances predictive performance compared to traditional methods.

Accurately forecasting future claim liabilities remains a central challenge in actuarial science, despite increasing computational power. This paper, ‘On the use of case estimate and transactional payment data in neural networks for individual loss reserving’, investigates the potential of neural networks to improve individual loss reserving by leveraging both granular payment histories and routinely produced case estimates. Results demonstrate that incorporating case estimate data significantly enhances predictive accuracy, although equipping networks with memory capabilities yields limited gains. Given the variability in case estimation practices across insurers, can a standardised methodology unlock their full predictive potential and refine future reserving approaches?


The Evolving Challenge of Loss Reserving

Historically, loss reserving – the process of estimating future claim payments – relied heavily on aggregate data and simplified assumptions. However, the advent of detailed, individual claims information presents a significant challenge to these traditional actuarial methods. The sheer volume and complexity of modern data, encompassing numerous variables and intricate relationships, often overwhelm techniques designed for broader generalizations. Consequently, standard approaches may fail to capture nuanced patterns within the data, leading to inaccurate projections of ultimate liabilities. This inability to effectively process granular information can result in underestimation of future claims, potentially jeopardizing an insurer’s financial stability, or conversely, overestimation leading to unnecessary capital reserves and reduced profitability. A shift towards more sophisticated, data-driven models is therefore crucial for accurately assessing risk and ensuring financial solvency in the contemporary insurance landscape.

The escalating number of individual claims processed by insurance companies is fundamentally reshaping loss reserving practices. Traditional methods, reliant on aggregated data and broad assumptions, are proving increasingly inadequate in the face of this granular detail. A shift towards data-driven approaches, leveraging machine learning, statistical modeling, and advanced analytics, becomes not merely beneficial but essential for accurate liability estimation. These techniques can identify subtle patterns and correlations within the claims data that would otherwise remain hidden, enabling insurers to move beyond simplistic averages and build more precise projections of future costs. This transition promises a more responsive and financially sound approach to reserving, ultimately improving an insurer’s ability to meet its obligations and maintain stability in a dynamic risk landscape.

The estimation of Incurred But Not Reported (IBNR) claims represents a substantial hurdle in actuarial science, as these represent liabilities for events that have occurred but haven’t yet been formally reported to insurers. Traditional methods often fall short in accurately predicting these future costs due to the inherent uncertainty and delayed reporting patterns. Consequently, insurers are increasingly reliant on sophisticated statistical modeling techniques – including generalized linear models, chain-ladder methods with varying inflation assumptions, and even machine learning algorithms – to analyze historical claims data, identify emerging trends, and project ultimate claim amounts. These models attempt to account for factors like claim severity, frequency, reporting delays, and external economic influences, aiming to minimize the potential for under-reserving – which threatens solvency – or over-reserving, which impacts profitability and capital efficiency. The complexity lies not only in the modeling itself, but also in the constant need for model validation and refinement as new data becomes available and reporting patterns evolve.
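
As a point of reference for the aggregate methods mentioned above, the sketch below applies the basic chain-ladder idea to a toy cumulative payment triangle. The triangle and all figures are invented for illustration and are not taken from the paper.

```python
import numpy as np

# Toy cumulative payment triangle (rows = accident periods, columns =
# development periods); NaN marks cells not yet observed. Figures are invented.
triangle = np.array([
    [100.0, 150.0, 165.0, 170.0],
    [110.0, 168.0, 182.0, np.nan],
    [120.0, 175.0, np.nan, np.nan],
    [130.0, np.nan, np.nan, np.nan],
])
n_dev = triangle.shape[1]

# Volume-weighted development factors from adjacent, jointly observed columns.
factors = []
for j in range(n_dev - 1):
    observed = ~np.isnan(triangle[:, j + 1])
    factors.append(triangle[observed, j + 1].sum() / triangle[observed, j].sum())

# Roll each accident period's latest observed value forward to ultimate.
latest, ultimates = [], []
for row in triangle:
    last = int(np.max(np.where(~np.isnan(row))))
    value = row[last]
    latest.append(value)
    for j in range(last, n_dev - 1):
        value *= factors[j]
    ultimates.append(value)

outstanding = np.array(ultimates) - np.array(latest)
print("development factors:", np.round(factors, 3))
print("estimated outstanding per accident period:", np.round(outstanding, 1))
```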

Tuned models accurately predict outstanding claim amounts across all quarters since notification, as demonstrated by the validation set predictions.

Neural Networks: A Superior Modeling Paradigm

Neural networks, as applied to loss reserving, offer a non-linear modeling capability exceeding that of traditional generalized linear models (GLMs). This flexibility stems from the multiple layers of interconnected nodes, or “neurons,” which allow the network to identify and quantify intricate interactions between predictor variables within claims data. Specifically, neural networks can capture relationships that are not easily modeled through simple additive or multiplicative terms, potentially reducing prediction error and improving the accuracy of ultimate loss estimates. The capacity to model these complex relationships is particularly valuable in scenarios involving high-dimensional data, numerous categorical variables, and non-linear effects on claim severity or frequency, leading to a more nuanced understanding of underlying risk factors and improved predictive performance compared to methods relying on predefined functional forms.
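
To make the contrast with a GLM concrete, here is a minimal feed-forward network for claim-level cost prediction. This is a sketch only: the feature count, layer sizes, loss choice, and randomly generated data are placeholders, not the architecture used in the study.

```python
import numpy as np
import tensorflow as tf

# Placeholder claim-level features (e.g. injury type, payments to date, current
# case estimate) and placeholder ultimate costs; real data would replace these.
rng = np.random.default_rng(0)
X = rng.random((1000, 12)).astype("float32")
y = rng.lognormal(mean=8.0, sigma=1.0, size=1000).astype("float32")

inputs = tf.keras.Input(shape=(12,))
h = tf.keras.layers.Dense(64, activation="relu")(inputs)
h = tf.keras.layers.Dense(32, activation="relu")(h)
outputs = tf.keras.layers.Dense(1, activation="exponential")(h)  # positive costs
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer="adam", loss="msle")  # mean squared logarithmic error
model.fit(X, y, epochs=20, batch_size=64, validation_split=0.2, verbose=0)
```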

Utilizing individual claims data, as opposed to aggregated data, facilitates granular risk assessment by enabling actuaries to model the characteristics of each claim separately. This approach allows for the consideration of a wider range of predictive variables – including claim specifics, policyholder details, and time-varying factors – which are often lost when data is aggregated. Consequently, models built on individual claims data can more accurately identify patterns and relationships that influence ultimate losses, leading to improved forecasting accuracy and a more refined understanding of risk exposure. This is particularly valuable for long-tail lines of business where claims develop over extended periods and individual claim characteristics significantly impact overall loss ratios.

Neural network architecture selection is contingent on data characteristics and modeling goals. Feed-Forward Neural Networks (FFNNs) are suitable for static datasets where the order of claims is irrelevant, processing each claim independently based on its features; their strength lies in identifying non-linear relationships between input variables and predicted loss values. Conversely, Recurrent Neural Networks (RNNs), particularly Long Short-Term Memory (LSTM) networks, excel with sequential data, acknowledging the temporal dependencies within claims history; this is valuable when predicting losses on claims with evolving information or where patterns emerge over time. The increased complexity of RNNs, however, demands larger datasets and greater computational resources compared to FFNNs, necessitating a trade-off between model performance and practical implementation constraints.
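
The practical difference between the two architectures shows up first in how the claim data must be arranged. The snippet below sketches the input shapes under assumed dimensions: one static feature vector per claim for an FFNN versus a padded quarterly transaction sequence per claim for an LSTM. The dimension names are illustrative only.

```python
import numpy as np

# Illustrative dimensions only; real feature sets depend on the insurer's data.
n_claims, n_static, n_quarters, n_seq_feats = 5000, 12, 20, 4

# FFNN input: one fixed-length feature vector per claim (order irrelevant).
X_ffnn = np.zeros((n_claims, n_static), dtype="float32")

# LSTM input: one time-ordered sequence per claim, e.g. quarterly payments and
# case-estimate revisions, zero-padded to a common length, with a mask marking
# which quarters hold real observations.
X_lstm = np.zeros((n_claims, n_quarters, n_seq_feats), dtype="float32")
observed = np.zeros((n_claims, n_quarters), dtype=bool)
```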

At the valuation date, boxplots demonstrate that the FNN+ model consistently predicts outstanding claim amounts with greater accuracy (values closer to 1) and lower $\text{vsCE}$ values than the FNN model, across all quarters since claim notification.

Validating Predictive Power: Rigorous Testing Procedures

Cross-validation is a resampling technique used to evaluate the ability of a neural network model to generalize to unseen data. The process involves partitioning the available historical data into multiple subsets, or “folds.” The model is then trained on a portion of these folds and tested on the remaining, unseen fold. This process is repeated iteratively, using a different fold for testing each time. By averaging the performance metrics across all iterations, a more robust and reliable estimate of the model’s generalization performance is obtained. This is crucial because neural networks, due to their complexity, are prone to overfitting – memorizing the training data rather than learning underlying patterns – which results in poor performance on new, real-world claims data. Cross-validation helps to detect and mitigate overfitting, ensuring the model accurately predicts future outcomes.
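
A minimal version of the folding pattern looks like the sketch below; a simple linear model stands in for the neural network, since the point here is the resampling loop rather than the model. Note that plain shuffled k-fold assumes exchangeable observations, which the time-aware variants in the next paragraph relax.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression

# Placeholder data; in practice X would hold claim features and y the targets.
rng = np.random.default_rng(0)
X = rng.random((1000, 12))
y = rng.random(1000)

scores = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])  # stand-in model
    preds = model.predict(X[val_idx])
    scores.append(np.mean((preds - y[val_idx]) ** 2))  # validation MSE per fold
print("mean fold error:", np.mean(scores), "+/-", np.std(scores))
```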

Fixed Origin Cross-Validation and Rolling Origin Cross-Validation are specialized techniques used to evaluate time series forecasting models, such as those predicting future insurance claims. Fixed Origin Cross-Validation divides the time series into training and validation sets using a single, fixed split point, iteratively training on earlier data and validating on later data. Rolling Origin Cross-Validation improves upon this by shifting the split point forward in time with each iteration, effectively creating multiple training and validation sets and providing a more robust estimate of out-of-sample performance. This ‘rolling’ approach simulates the model’s predictive capability on truly unseen future data, crucial for assessing the reliability of time-dependent predictions and preventing optimistic bias inherent in standard cross-validation methods when applied to sequential data.
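
One way to implement the rolling-origin idea is scikit-learn's `TimeSeriesSplit`, which always trains on earlier periods and validates on the ones that follow. The paper may use its own splitting scheme, so treat this as a sketch of the principle rather than its implementation.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# 40 calendar quarters of experience, in chronological order.
quarters = np.arange(40)

# Each successive split moves the origin forward: the training window grows,
# and validation always lies strictly after it, so no future data leaks back.
for train_idx, val_idx in TimeSeriesSplit(n_splits=5).split(quarters):
    print(f"train on quarters 0-{train_idx[-1]}, "
          f"validate on quarters {val_idx[0]}-{val_idx[-1]}")
```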

Data simulation tools address limitations in real-world claim datasets by generating synthetic data for model validation and stress testing. SPLICE Simulator and SynthETIC are examples of such tools, utilizing statistical methods to create claim records that mimic the characteristics of actual data, including frequency, severity, and correlations between variables. This artificially expanded dataset allows for the evaluation of model performance under a wider range of scenarios, including rare events or conditions not adequately represented in historical data. Supplementing real data with synthetic data improves the robustness of model validation, identifies potential weaknesses, and enhances confidence in predictive accuracy, particularly for infrequently occurring claim types.
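
The following is a generic frequency-severity simulation in the spirit of such tools, not the SynthETIC or SPLICE API: Poisson claim counts per accident quarter, lognormal severities, and a gamma-distributed notification delay, all with invented parameters.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative parameters only; calibrated simulators fit these to real data.
n_quarters = 40
counts = rng.poisson(lam=50, size=n_quarters)   # claims per accident quarter

records = []
for q, n in enumerate(counts):
    severities = rng.lognormal(mean=8.0, sigma=1.2, size=n)   # ultimate costs
    delays = rng.gamma(shape=2.0, scale=1.5, size=n)          # quarters to notify
    for sev, d in zip(severities, delays):
        records.append({"accident_quarter": q,
                        "notification_quarter": q + int(d),
                        "ultimate_cost": sev})
print(len(records), records[0])
```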

Advanced Architectures Yield Superior Predictive Accuracy

Long Short-Term Memory (LSTM) networks represent a significant advancement in the analysis of sequential claim data. Unlike traditional neural networks which struggle with information retention over extended periods, LSTMs are specifically designed to capture long-term dependencies. This capability is crucial in insurance claim prediction, as the history of a claim – including prior payments, reported injuries, and evolving circumstances – heavily influences its ultimate cost. By effectively ‘remembering’ relevant information from the entire claim lifecycle, LSTMs can identify subtle patterns and correlations that would otherwise be missed, resulting in markedly improved prediction accuracy and more reliable reserve estimations. The architecture’s internal mechanisms allow it to selectively retain or discard information, mitigating the vanishing gradient problem common in standard recurrent networks and enabling the modeling of complex, time-dependent relationships within the data.
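
A minimal Keras sketch of this idea, with invented dimensions and randomly generated padded sequences standing in for real transaction histories, might look as follows; the masking layer is there to skip zero-padded quarters once real, variable-length histories are supplied.

```python
import numpy as np
import tensorflow as tf

# Invented dimensions: 2000 claims, up to 20 quarterly records per claim, each
# with 4 features (e.g. payment, case estimate, payment count, quarter index).
rng = np.random.default_rng(0)
X_seq = rng.random((2000, 20, 4)).astype("float32")
y = rng.lognormal(8.0, 1.0, 2000).astype("float32")

inputs = tf.keras.Input(shape=(20, 4))
h = tf.keras.layers.Masking(mask_value=0.0)(inputs)   # ignore zero-padded steps
h = tf.keras.layers.LSTM(32)(h)                       # summarise claim history
outputs = tf.keras.layers.Dense(1, activation="exponential")(h)
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer="adam", loss="msle")
model.fit(X_seq, y, epochs=10, batch_size=64, validation_split=0.2, verbose=0)
```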

Distributional Refinement Networks represent a significant advancement in claim cost prediction by moving beyond single-point estimates to forecast the entire probability distribution of potential claim payments. This holistic approach offers a far more comprehensive understanding of financial liability than traditional methods, allowing insurers to quantify not just the most likely cost, but also the range of possible outcomes and their associated probabilities. By predicting the full distribution, risk managers can better assess capital adequacy, design more effective reinsurance strategies, and ultimately, improve the overall financial stability of the organization. This capability is crucial for navigating the inherent uncertainty in insurance and making informed decisions based on a complete picture of potential exposures, leading to more accurate reserving and enhanced risk management practices.
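
One simple way to obtain a predictive distribution rather than a point estimate, shown below as a sketch of the general idea and not of the paper's Distributional Refinement Network, is to have the network output the parameters of an assumed lognormal distribution and train it by negative log-likelihood.

```python
import numpy as np
import tensorflow as tf

def lognormal_nll(y_true, params):
    """Negative log-likelihood of a lognormal claim-cost model
    (additive constants that do not affect optimisation are dropped)."""
    y = tf.reshape(y_true, [-1])
    mu, log_sigma = params[:, 0], params[:, 1]
    z = (tf.math.log(y) - mu) / tf.exp(log_sigma)
    return tf.reduce_mean(0.5 * tf.square(z) + log_sigma)

inputs = tf.keras.Input(shape=(12,))
h = tf.keras.layers.Dense(64, activation="relu")(inputs)
params = tf.keras.layers.Dense(2)(h)          # per-claim [mu, log_sigma]
model = tf.keras.Model(inputs, params)
model.compile(optimizer="adam", loss=lognormal_nll)

# Placeholder features and costs; real claim data would replace these.
rng = np.random.default_rng(0)
X = rng.random((1000, 12)).astype("float32")
y = rng.lognormal(8.0, 1.0, 1000).astype("float32")
model.fit(X, y, epochs=10, batch_size=64, verbose=0)
```

The fitted parameters then give, per claim, not only a central estimate but also quantiles of the predicted cost, which is what supports the capital and reinsurance uses described above.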

Rigorous evaluation of the predictive models utilizes both Mean Absolute Logarithmic Error and Mean Squared Logarithmic Error to precisely quantify reserve accuracy. Results demonstrate a substantial improvement following the integration of case estimate data; initial analyses revealed a mean reserve error of 20.5%, which decreased dramatically to just 7.7% with the inclusion of this additional information. This significant reduction highlights the value of incorporating expert judgment alongside algorithmic prediction, leading to more reliable financial forecasting and improved risk mitigation strategies within claims reserving.
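
For reference, the two logarithmic error metrics can be computed as in the sketch below; the exact offset and aggregation choices used in the paper may differ, and the numbers here are illustrative only.

```python
import numpy as np

def mean_absolute_log_error(actual, predicted, offset=1.0):
    # offset avoids log(0) for claims with nothing outstanding
    return np.mean(np.abs(np.log(actual + offset) - np.log(predicted + offset)))

def mean_squared_log_error(actual, predicted, offset=1.0):
    return np.mean((np.log(actual + offset) - np.log(predicted + offset)) ** 2)

actual = np.array([1_000.0, 25_000.0, 400.0])       # illustrative reserves
predicted = np.array([1_200.0, 21_000.0, 350.0])
print(mean_absolute_log_error(actual, predicted),
      mean_squared_log_error(actual, predicted))
```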

LSTM+ and FNN+ models demonstrate strong predictive performance on outstanding claim amounts, as indicated by boxplots showing proportions near 1 and larger $\text{vsCE}_{\text{OCL}}$ values across quarters since notification.

The pursuit of accurate loss reserving, as detailed in the study, demands a rigorous approach to data utilization. Every input, every feature, must contribute meaningfully to the predictive model; redundancy is anathema. This aligns perfectly with Mary Wollstonecraft’s assertion: “The mind, when once awakened, will not be satisfied with shadows.” The incorporation of case estimate data, demonstrating significant performance gains, is not merely a refinement but an illumination: a move away from the ‘shadows’ of traditional methods toward a provably more accurate assessment of individual claim liabilities. The study’s success stems from a dedication to extracting signal from noise, mirroring a commitment to intellectual clarity and mathematical precision.

What Lies Ahead?

The demonstrated efficacy of neural networks in loss reserving, particularly when augmented with case estimate data, does not represent a culmination, but rather a necessary initial step. The current focus on predictive accuracy, while pragmatically valuable, obscures a more fundamental question: can these models truly understand the underlying stochastic processes governing claim development? The observed improvements over traditional methods suggest a capacity for pattern recognition exceeding simple extrapolation, yet a formal, mathematically rigorous demonstration of this remains elusive. Simply achieving lower error metrics does not equate to understanding.

Future work should therefore prioritize the development of models grounded in actuarial theory, not merely statistical optimization. The incorporation of explicit process assumptions – perhaps through the use of recurrent neural networks designed to model time-series dependencies with provable stability – could yield more robust and interpretable results. The challenge lies in reconciling the flexibility of neural networks with the need for actuarial soundness; a pursuit of elegance beyond mere empirical performance.

Furthermore, the inherent limitations of transactional data – its retrospective nature and susceptibility to reporting biases – demand attention. Exploration of causal inference techniques, alongside methods for quantifying and mitigating data imperfections, represents a critical next step. Only through a commitment to mathematical clarity and a healthy skepticism towards purely data-driven approaches can this field progress beyond a sophisticated form of curve fitting.


Original article: https://arxiv.org/pdf/2601.05274.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
