Author: Denis Avetisyan
A new approach combines the power of deep learning with interpretable statistics to better predict mortgage defaults and understand the factors driving credit risk.

This research introduces a semi-structured multi-state model leveraging deep neural networks and linear terms to enhance prediction accuracy and maintain model transparency in mortgage default prediction.
Accurately forecasting mortgage delinquency requires balancing model flexibility with transparent effect estimation. This is addressed in ‘Semi-structured multi-state delinquency model for mortgage default’, which proposes a novel discrete-time multi-state framework combining interpretable linear predictors with a deep neural network to capture complex, nonlinear relationships in credit transitions. Evaluated on Freddie Mac loan-level data, the resulting semi-structured model demonstrates improved discrimination at early prediction horizons while maintaining overall accuracy similar to benchmark generalized additive models. Could this approach offer a practical compromise between predictive power and model interpretability, applicable to other areas of credit risk assessment and beyond?
Beyond Static Risk: Modeling the Dynamics of Creditworthiness
Conventional credit risk assessments frequently fall short because they treat a borrower’s financial health as static, failing to account for the constant shifts in circumstance that influence repayment ability. These models often rely on snapshots of credit history, neglecting the dynamic processes of income fluctuations, unexpected expenses, or changing economic conditions that can rapidly move an individual from a position of good standing toward delinquency. This simplification introduces significant inaccuracies, as it overlooks the nuanced transitions borrowers experience – a process far more complex than a simple binary categorization of ‘good’ or ‘bad’ credit. Consequently, lenders may underestimate potential defaults or misprice risk, leading to financial instability and inefficient capital allocation. Addressing this limitation requires models capable of tracking and predicting these dynamic shifts, moving beyond static assessments to a more granular understanding of borrower behavior.
The ability to precisely chart a borrower’s journey through varying financial health is paramount for robust risk management. Traditional credit scoring often provides a static snapshot, failing to capture the nuanced shifts from consistent repayment to early delinquency and, ultimately, potential default. A dynamic model, however, allows institutions to anticipate changes in creditworthiness, enabling proactive interventions like tailored repayment plans or adjusted credit limits. This granular understanding minimizes potential losses by facilitating timely mitigation strategies and optimizing capital allocation. Furthermore, accurate transition modeling enhances the reliability of stress testing and regulatory compliance, fostering a more stable and resilient financial system, as it moves beyond simple binary classifications to a probabilistic assessment of ongoing risk.
Current credit risk modeling frequently falters due to oversimplified representations of borrower behavior. Traditional approaches often assume static risk profiles or rely on Markovian transitions – predicting future states solely on the present one – which fails to account for the complex interplay of economic conditions, individual circumstances, and behavioral patterns that drive delinquency and default. These models frequently presume homogeneity within risk segments, overlooking the significant heterogeneity in how borrowers respond to financial stress. Consequently, predictions can be significantly skewed, leading to underestimation of potential losses and ineffective risk mitigation strategies. The reliance on these limiting assumptions diminishes the predictive power of these models, particularly during periods of economic volatility or when faced with novel financial pressures, and necessitates the development of more nuanced and adaptive techniques.

A Multi-State Framework: Capturing the Nuances of Credit Transitions
The multi-state model represents an advancement over traditional survival analysis by allowing for the definition and tracking of multiple, mutually exclusive states. Standard survival analysis typically focuses on a single endpoint, such as default, whereas the multi-state model defines a series of possible states – for example, ‘current’, ‘30 days past due’, ‘60 days past due’, ‘default’ – that a borrower can transition between over time. This allows analysts to model not only the time to default, but also the probability of transitioning from one creditworthiness stage to another, offering a more complete picture of borrower behavior and risk profiles. The model accomplishes this by estimating transition intensities – the instantaneous rates of moving from one state to another – rather than a single hazard rate.
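To make the state-transition bookkeeping concrete, here is a minimal Python sketch that tabulates per-period transition rates from observed discrete-time state paths. The state labels, data layout, and function name are illustrative assumptions, not the paper’s implementation.

```python
import numpy as np

# Hypothetical state space for a mortgage delinquency model.
STATES = ["current", "30dpd", "60dpd", "default"]

def empirical_transition_rates(paths):
    """Estimate per-period transition rates from observed state paths.

    paths: list of lists of state indices, one list per loan,
           observed at equally spaced (e.g., monthly) intervals.
    Returns a matrix R where R[i, j] is the fraction of periods
    spent in state i that ended with a move to state j.
    """
    n = len(STATES)
    counts = np.zeros((n, n))   # transitions i -> j (including i -> i)
    exposure = np.zeros(n)      # periods observed in state i
    for path in paths:
        for frm, to in zip(path[:-1], path[1:]):
            counts[frm, to] += 1
            exposure[frm] += 1
    # Guard against division by zero for states never visited.
    return counts / np.maximum(exposure, 1)[:, None]

# Toy example: two loans, one of which slides into default.
loans = [
    [0, 0, 0, 1, 0, 0],   # cures after a 30dpd spell
    [0, 1, 2, 2, 3],      # deteriorates to default
]
print(empirical_transition_rates(loans))
```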
Traditional credit risk models often categorize borrowers into broad states, such as ‘good’ or ‘defaulted’, obscuring the intermediate stages of creditworthiness. The multi-state model addresses this limitation by defining a series of distinct states – for example, ‘current’, ‘30 days past due’, ‘90 days past due’, ‘in collections’ – and explicitly modeling the transitions between them. This allows for a more detailed observation of borrower behavior, capturing the evolving nature of credit risk as a series of state changes rather than a single event. By tracking these transitions, the model provides a more nuanced and realistic representation of how borrowers move through different stages of financial health, improving the accuracy of risk assessment and prediction compared to simpler models that lack this granularity.
The multi-state model leverages the granularity of loan-level data – encompassing attributes such as loan amount, interest rate, borrower credit score, payment history, and demographic information – to create a statistically robust risk assessment. This data is utilized to estimate transition intensities between states, quantifying the likelihood of a borrower moving from current payment to delinquency, or from delinquency to default. By directly incorporating these detailed features as covariates in the model, the framework moves beyond aggregate statistics, enabling the identification of specific risk factors and their impact on borrower behavior. This data-driven approach facilitates both a more accurate prediction of individual borrower risk profiles and the calibration of risk parameters for portfolio-level analysis.
The multi-state model is fundamentally structured to calculate transition probabilities between defined states, reflecting the evolving risk profile of a borrower over time. Unlike traditional survival analysis focused solely on time-to-event, this approach explicitly models the likelihood of moving from one state (e.g., current, 30 days past due, 90 days past due, defaulted) to another within a specified period. These probabilities are not static; they are functions of borrower characteristics and macroeconomic factors, allowing for a dynamic assessment of credit risk. The model’s output provides a matrix of transition probabilities, enabling the calculation of expected future states and associated risk exposures, thereby providing a more nuanced understanding of borrower behavior than static risk scores.
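As a simple illustration of how a matrix of one-period transition probabilities yields expected future states, the sketch below chains a hypothetical transition matrix over several months. It assumes time-homogeneity for brevity, which the covariate-driven model described above would not.

```python
import numpy as np

# Hypothetical one-month transition matrix over
# (current, 30dpd, 60dpd, default); rows sum to one,
# and default is treated as absorbing.
P = np.array([
    [0.96, 0.03, 0.00, 0.01],
    [0.40, 0.40, 0.15, 0.05],
    [0.10, 0.20, 0.50, 0.20],
    [0.00, 0.00, 0.00, 1.00],
])

def horizon_probabilities(P, months):
    """k-step transition probabilities via matrix powers, P(k) = P^k,
    valid under a time-homogeneous discrete-time chain."""
    return np.linalg.matrix_power(P, months)

# State distribution after 12 months, starting from 'current'.
print(horizon_probabilities(P, 12)[0])
```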

Balancing Power and Interpretability: The Semi-Structured Prediction Approach
A semi-structured additive predictor represents a hybrid modeling approach that leverages the benefits of both linear models and neural networks. Traditional linear models, including those utilizing splines, offer high interpretability but can be limited in their ability to capture complex relationships. Conversely, neural networks excel at modeling non-linear interactions but often lack transparency. This predictor combines these approaches by creating an additive model with a structured component – encompassing linear terms and splines – and an unstructured neural network component. The structured component provides a readily interpretable baseline prediction, while the neural network refines this prediction by modeling residual complexity, resulting in a model that aims to balance predictive power with model understanding.
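A minimal PyTorch sketch of such a predictor is shown below: a linear layer over the structured covariates plus a small multilayer network over the remaining inputs, summed on the logit scale. Layer sizes, names, and training details are assumptions for illustration, not the paper’s architecture.

```python
import torch
import torch.nn as nn

class SemiStructuredPredictor(nn.Module):
    """Additive predictor: eta(x) = structured linear part + deep part.

    `x_struct` holds the interpretable covariates (possibly spline
    basis expansions); `x_deep` feeds the unstructured network.
    """
    def __init__(self, n_struct, n_deep, hidden=32):
        super().__init__()
        self.linear = nn.Linear(n_struct, 1)      # interpretable part
        self.deep = nn.Sequential(                # nonlinear residual part
            nn.Linear(n_deep, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x_struct, x_deep):
        return self.linear(x_struct) + self.deep(x_deep)

model = SemiStructuredPredictor(n_struct=5, n_deep=10)
eta = model(torch.randn(4, 5), torch.randn(4, 10))  # logits for 4 loans
prob = torch.sigmoid(eta)                           # transition probabilities
```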
The structured component of a semi-structured additive predictor leverages established statistical techniques – linear terms and splines – to generate a readily interpretable baseline prediction. Linear terms model relationships assuming a constant rate of change, while splines, typically cubic or thin-plate, allow for non-linear relationships while maintaining smoothness and avoiding overfitting. This structured component effectively captures the dominant trends in the data, providing a transparent and easily understandable initial forecast. The use of these techniques ensures that the primary drivers of the prediction are explicitly modeled and quantifiable, contributing to overall model interpretability and facilitating direct comparison with traditional statistical models.
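To make the spline part tangible, here is a small NumPy sketch of a truncated-power cubic spline basis; the basis family and knot placement are illustrative choices, since the paper’s exact spline specification is not given here.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Truncated-power basis for a cubic spline:
    columns are x, x^2, x^3, then (x - k)_+^3 for each knot k."""
    x = np.asarray(x, dtype=float)
    cols = [x, x**2, x**3]
    for k in knots:
        cols.append(np.clip(x - k, 0.0, None) ** 3)
    return np.column_stack(cols)

# e.g., a smooth effect of borrower credit score on the logit scale
score = np.linspace(500, 820, 7)
B = cubic_spline_basis(score, knots=[620, 680, 740])
print(B.shape)  # (7, 6): three polynomial columns + three knot columns
```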
The neural network component of a semi-structured predictor addresses limitations of linear models by capturing complex, non-linear relationships within the data. While linear terms and splines effectively model additive effects, they struggle with interactions where the effect of one feature depends on the value of another. Neural networks, with their multiple layers and non-linear activation functions, can approximate arbitrarily complex functions, allowing them to model these interactions. This capacity for modeling non-linear interactions typically results in increased predictive accuracy, particularly in datasets where such relationships are prevalent, by providing a more nuanced representation of the underlying data generating process.
Orthogonalization within a semi-structured additive predictor enforces statistical independence between the structured and unstructured components. This is achieved by subtracting the projection of the neural network’s output onto the space spanned by the structured component’s terms. The resulting residual error is then modeled by the neural network, ensuring its contribution is solely focused on the non-linear variance not already explained by the linear and spline terms. Consequently, the effect of each feature within the structured component remains directly interpretable as its coefficient reflects the unique contribution to the prediction, and the neural network’s contribution is isolated to complex interactions beyond the scope of the structured model.
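A hedged sketch of this step: project the network output out of the column space of the structured design matrix, so the deep component can only explain variation that the linear and spline terms cannot. Names and shapes below are hypothetical.

```python
import numpy as np

def orthogonalize(deep_out, X_struct):
    """Remove from `deep_out` its projection onto span(X_struct).

    With hat matrix H = X (X^T X)^{-1} X^T, the result (I - H) deep_out
    is orthogonal to every structured column, so the structured
    coefficients keep their interpretable meaning.
    """
    # Least-squares projection; lstsq is more stable than an explicit inverse.
    coef, *_ = np.linalg.lstsq(X_struct, deep_out, rcond=None)
    return deep_out - X_struct @ coef

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))        # structured design matrix
u = rng.normal(size=100)             # raw network output
u_perp = orthogonalize(u, X)
print(np.abs(X.T @ u_perp).max())    # ~0: orthogonal to structured columns
```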

From Logit to Probability: Precise Estimation of Transition Dynamics
Within multi-state modeling, the prediction of transitions between distinct states relies heavily on binary logistic regression. This statistical method assesses the likelihood of an event, a transition from one state to another, occurring based on a set of predictor variables. Rather than directly estimating the duration spent in a state, logistic regression focuses on the probability of transitioning within a defined time interval. The output of this regression isn’t a simple yes/no prediction, but a value between zero and one, representing the estimated probability of that specific transition. This probability is crucial because it forms the foundation for understanding and predicting individual trajectories through the various states of the model, allowing for a nuanced assessment of risk and progression. In the standard logistic form,

$$P(\text{transition}) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_n x_n)}}$$
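As a concrete (and entirely synthetic) illustration of fitting one such binary logit for a single transition, the following uses scikit-learn; the covariates and outcomes are simulated noise purely to show the mechanics.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 1000
# Hypothetical loan-month covariates for loans currently 30 days past due.
X = np.column_stack([
    rng.normal(700, 50, n),    # credit score
    rng.normal(0.8, 0.1, n),   # loan-to-value ratio
])
# 1 if the loan rolled to 60 days past due next month, else 0 (simulated).
y = rng.binomial(1, 0.2, n)

clf = LogisticRegression().fit(X, y)
p = clf.predict_proba(X[:3])[:, 1]  # P(30dpd -> 60dpd) for three loans
print(p)
```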
Within multi-state modeling, the transition from one state to another isn’t simply a matter of observing the change, but quantifying the probability of that change. The ‘Exact Discrete-Time Transformation’ addresses this need by providing a rigorous method to convert the outputs of binary logistic regression – which initially predict the log-odds of transition – into directly interpretable, competing transition probabilities. This transformation isn’t merely a scaling exercise; it ensures that probabilities for transitioning to different states sum to one at each discrete time point, a crucial requirement for maintaining statistical accuracy and interpretability. Unlike approximations that may introduce bias, this exact method delivers precise probability estimates, allowing for reliable comparisons between transition risks and ultimately, more informed decision-making based on the underlying dynamic processes being modeled.
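The article does not spell out the transformation’s formula, so the sketch below uses one standard construction with the stated property: a multinomial-logit (softmax) normalization over the competing transitions, with ‘stay’ as the reference category. Treat it as an assumption-laden stand-in, not the paper’s exact estimator.

```python
import numpy as np

def competing_transition_probs(logits):
    """Map per-destination logits (one per competing transition out of
    the current state, with an implicit logit of 0 for staying put)
    to probabilities that sum to one at this time point."""
    logits = np.append(logits, 0.0)        # 'stay' as reference category
    z = np.exp(logits - logits.max())      # numerically stabilized softmax
    return z / z.sum()

# Hypothetical logits for 30dpd -> (current, 60dpd, default), plus stay.
print(competing_transition_probs(np.array([-0.5, -1.2, -3.0])))
```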
Traditional methods for estimating state transitions often rely on continuous-time approximations, which assume events can occur at any point in time. However, in many real-world applications, data is collected at discrete intervals, and applying continuous-time models introduces inaccuracies. Recent research demonstrates a superior approach through the ‘Exact Discrete-Time Transformation,’ yielding markedly lower error rates when compared to these continuous-time methods. Specifically, simulations reveal significantly reduced Mean Squared Error (MSE) and Mean Absolute Error (MAE) – key metrics for evaluating prediction accuracy – confirming that modeling transitions within the observed discrete time frame provides a more precise and reliable assessment of risk. This improvement is crucial for applications where accurate timing of events is paramount, offering a robust alternative for analyzing dynamic processes.
The Aalen-Johansen estimator represents a significant advancement in multi-state modeling by providing a refined method for calculating transition intensities – the instantaneous rates at which individuals move between different states. This estimator doesn’t simply predict transitions; it offers a comprehensive and reliable assessment of the underlying risk factors driving these changes. Rigorous simulations demonstrate the estimator’s precision, consistently clustering estimates closely around the true values for both baseline risks and linear effects. Furthermore, even with more complex nonlinear effects, the estimator’s curves exhibit clear convergence, indicating a stable and dependable method for discerning intricate relationships within multi-state data and bolstering the accuracy of risk prediction.
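For intuition, the Aalen-Johansen estimator composes the transition probability matrix as a product over event times of (I + ΔA(u)), where ΔA(u) collects the estimated intensity increments. A minimal NumPy sketch with made-up increments:

```python
import numpy as np

def aalen_johansen(increments):
    """Product-limit form of the Aalen-Johansen estimator:
    P = prod over event times u of (I + dA(u)), where dA(u) has
    off-diagonal entries N_ij(u)/Y_i(u) (observed transitions over
    at-risk counts) and diagonals making each row sum to zero."""
    n = increments[0].shape[0]
    P = np.eye(n)
    for dA in increments:
        P = P @ (np.eye(n) + dA)
    return P

# Two hypothetical event times in a 3-state model (state 2 absorbing).
dA1 = np.array([[-0.10,  0.08, 0.02],
                [ 0.05, -0.15, 0.10],
                [ 0.00,  0.00, 0.00]])
dA2 = np.array([[-0.05,  0.04, 0.01],
                [ 0.02, -0.12, 0.10],
                [ 0.00,  0.00, 0.00]])
print(aalen_johansen([dA1, dA2]))  # rows sum to one by construction
```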

The pursuit of predictive accuracy, as demonstrated by this multi-state delinquency model, echoes a historical tension. This work’s blending of interpretable linear terms with deep learning’s nonlinear capacity highlights the need for responsible innovation. As René Descartes famously stated, “Good sense is the most evenly distributed thing in the world,” yet applying it to algorithmic design remains a challenge. The model’s emphasis on both performance and transparency suggests a recognition that tools without values are weapons, a crucial consideration when automating decisions with significant financial implications. Understanding transition probabilities, a central concept of the study, isn’t merely about prediction, but about discerning the factors driving those transitions and ensuring fairness in the process.
What’s Next?
The pursuit of predictive accuracy in credit risk, as demonstrated by this work, invariably raises the question of what is being predicted for. The combination of interpretable linear terms and deep learning offers a momentary reprieve from the black box critique, yet does not fundamentally address the ethical implications of automated financial decisions. Scalability without a corresponding commitment to fairness merely accelerates existing biases, embedding them deeper into the systems that govern economic opportunity.
Future research must move beyond solely optimizing for Area Under the Curve. The field should prioritize techniques for quantifying and mitigating disparate impact, acknowledging that “transparency” is insufficient if the underlying model reflects a fundamentally inequitable worldview. Semi-structured data, while offering a path toward interpretability, also invites scrutiny of the features deemed relevant – each selection reveals an assumption about what constitutes ‘creditworthiness.’
Ultimately, the challenge lies not in building more sophisticated models, but in defining what a just financial system should look like. Privacy is not a checkbox to be added during model development, but a foundational design principle. The next generation of credit risk models must treat individuals not as data points, but as agents deserving of dignity and equitable treatment, even – or especially – within the logic of algorithmic finance.
Original article: https://arxiv.org/pdf/2603.26309.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/