Smarter Portfolios: AI Predicts Asset Performance with Confidence

Author: Denis Avetisyan


A new approach leverages the power of artificial intelligence to refine asset selection and deliver improved risk-adjusted returns in financial markets.

A system distills firm characteristics into high-dimensional latent factors, then forecasts their uncertainty with time-series models to select the most predictable subset for portfolio optimization, a process that mirrors the inherent limitations of any model: even the most refined projections are ultimately subject to the unpredictable forces of the market, and any perceived certainty may vanish beyond the horizon of unforeseen events.

This paper introduces a scalable method for portfolio optimization using Conditional Autoencoders and uncertainty quantification to dynamically select relevant factors.

While latent factor models offer a powerful approach to asset pricing, scaling these models often degrades performance due to increased noise. This paper, ‘Scaling Conditional Autoencoders for Portfolio Optimization via Uncertainty-Aware Factor Selection’, introduces a scalable framework coupling high-dimensional Conditional Autoencoders with an uncertainty-aware factor selection procedure, demonstrably improving portfolio performance. By ranking factors according to forecast uncertainty, using models such as Chronos, XGBoost, and bootstrap sampling, we identify predictable subsets that consistently deliver enhanced risk-adjusted returns and outperform individual forecasting models when combined in an ensemble. Could this uncertainty quantification approach unlock further gains in complex, high-dimensional financial modeling?


The Illusion of Control: Peering into Latent Forces

Portfolio construction has long centered on the idea that asset returns aren’t random, but are instead driven by a smaller set of underlying, yet often unobservable, forces known as latent factors. These factors – encompassing macroeconomic trends, investor sentiment, or industry-specific dynamics – offer a potential pathway to systematic profit, as portfolios are tilted to capitalize on their influence. However, the very nature of these latent factors introduces inherent uncertainty; they are not directly measurable and must be inferred from complex market data. This inference is subject to noise, estimation error, and the ever-present possibility that the relationships observed in the past will not hold in the future. Consequently, while identifying these factors is a crucial first step, acknowledging their inherent uncertainty is paramount for building resilient and adaptive investment strategies.

The pursuit of predictive accuracy in financial markets hinges on identifying latent factors, the underlying drivers of asset returns, but conventional forecasting techniques often falter when confronted with the inherent complexity and noise of real-world time series data. Standard statistical models, while useful, frequently struggle to disentangle genuine predictive signals from random fluctuations, leading to unreliable forecasts and potentially flawed investment strategies. This difficulty arises from the non-stationary nature of financial data, the presence of regime shifts, and the sheer volume of variables interacting in complex ways. Consequently, even sophisticated econometric approaches can produce estimates with substantial uncertainty, highlighting the need for more robust and nuanced forecasting methodologies that acknowledge and quantify this inherent unpredictability.

Successfully navigating financial markets demands more than simply identifying factors believed to influence asset returns; it requires a precise understanding of the uncertainty surrounding those factors. Robust portfolio construction isn’t about pinpoint accuracy in prediction, but rather acknowledging the range of plausible outcomes. Quantifying this predictive uncertainty allows investors to move beyond single-point forecasts and instead consider a distribution of possibilities, enabling the creation of portfolios resilient to unforeseen market events. Techniques that incorporate uncertainty estimates, such as Bayesian methods or ensemble forecasting, facilitate a more nuanced risk assessment and optimization process, ultimately leading to portfolios better equipped to withstand volatility and achieve long-term goals. Ignoring this inherent uncertainty can result in portfolios that appear optimal under ideal conditions, but are dangerously fragile when confronted with the inevitable complexities of real-world financial data.
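
The bootstrap is one simple way to turn a single historical series into a distribution of estimates rather than a point forecast. The sketch below is illustrative only (synthetic data, not the paper's pipeline): an IID bootstrap of a factor's mean return produces a prediction interval that quantifies estimation uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical historical returns of a latent factor (monthly, in %).
factor_returns = rng.normal(loc=0.5, scale=2.0, size=120)

def bootstrap_interval(returns, n_boot=5000, level=0.90, rng=None):
    """IID bootstrap of the mean return: resample with replacement
    and report the central `level` interval of the resampled means."""
    rng = rng or np.random.default_rng()
    means = np.array([
        rng.choice(returns, size=len(returns), replace=True).mean()
        for _ in range(n_boot)
    ])
    lo, hi = np.quantile(means, [(1 - level) / 2, (1 + level) / 2])
    return lo, hi

lo, hi = bootstrap_interval(factor_returns, rng=rng)
print(f"90% interval for the mean factor return: [{lo:.3f}, {hi:.3f}]")
```

A portfolio rule can then act on the interval width instead of the point estimate, preferring factors whose intervals are narrow.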

Portfolios built on the assumption of precise factor forecasts often exhibit a dangerous illusion of stability. When predictive uncertainty is disregarded, optimization algorithms tend to concentrate investments in a limited number of assets perceived to offer the highest returns, creating portfolios that are overly sensitive to even small shifts in market conditions. This overconfidence stems from an underestimation of potential forecast errors; slight deviations from projected factor behavior can trigger disproportionately large losses, as the portfolio lacks diversification to absorb unexpected outcomes. Consequently, portfolios ignoring predictive uncertainty are particularly vulnerable during periods of market stress or when underlying economic relationships change, highlighting the critical need to incorporate forecast risk into the investment process and build truly robust strategies.

Uncertainty-aware latent factor selection consistently outperforms the SPY benchmark, with Ensemble (A), combining SPY with adaptive strategies, maximizing risk-adjusted returns, while Ensemble (B), built solely from adaptive strategies, delivers the highest total return and annualized growth.
Uncertainty-aware latent factor selection consistently outperforms the SPY benchmark, with Ensemble (A), combining SPY with adaptive strategies, maximizing risk-adjusted returns, while Ensemble (B), built solely from adaptive strategies, delivers the highest total return and annualized growth.

Beyond Prediction: Harnessing Time-Series Foundation Models

The Chronos time-series foundation model is utilized to improve latent factor return forecasting through transfer learning. Pre-trained on the M4 dataset, a large and diverse collection of time-series data encompassing over 100,000 series, Chronos acquires robust feature extraction capabilities. This pre-training allows the model to generalize effectively to the task of forecasting latent factor returns, even with limited historical data specific to those factors. By leveraging the knowledge embedded within Chronos’ parameters, the model requires less task-specific training and achieves improved performance compared to models trained from scratch. The M4 dataset’s breadth ensures exposure to a wide range of time-series characteristics, enhancing the model’s adaptability and predictive power.

Probabilistic forecasting, enabled by the Chronos time-series foundation model and Quantile Regression, moves beyond single-value predictions to estimate the entire probability distribution of future outcomes. Quantile Regression specifically models different quantiles of the forecast distribution, such as the 5th percentile, the median (50th percentile), and the 95th percentile. This allows for the generation of prediction intervals, quantifying the uncertainty associated with the forecast. Instead of a single expected value, the model outputs a range of plausible values with associated probabilities, providing a more comprehensive assessment of potential future factor behavior and facilitating risk management by identifying likely best- and worst-case scenarios.
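
Quantile forecasting can be sketched with off-the-shelf tools. The example below is not the paper's Chronos-based pipeline; it uses scikit-learn's gradient boosting with the pinball (quantile) loss on synthetic data to produce 5th, 50th, and 95th percentile predictions and check the empirical coverage of the resulting band.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Synthetic factor-return data with heteroscedastic noise (illustrative only).
X = rng.uniform(-1, 1, size=(500, 3))
y = X @ np.array([0.4, -0.2, 0.1]) + rng.normal(0, 0.1 + 0.2 * np.abs(X[:, 0]))

# One model per quantile; loss="quantile" optimizes the pinball loss.
quantiles = {}
for q in (0.05, 0.50, 0.95):
    m = GradientBoostingRegressor(loss="quantile", alpha=q,
                                  n_estimators=200, max_depth=2)
    quantiles[q] = m.fit(X, y).predict(X)

coverage = np.mean((y >= quantiles[0.05]) & (y <= quantiles[0.95]))
print(f"empirical coverage of the 5-95% band: {coverage:.2f}")
```

Well-calibrated quantiles should cover roughly 90% of outcomes; a factor whose band is both narrow and well calibrated is a natural candidate for selection.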

Gradient Boosted Trees (GBTs) are utilized as a post-processing step to enhance the accuracy and reliability of quantile predictions generated by the Chronos model and Quantile Regression. This involves training GBTs on the errors of the initial quantile forecasts, effectively learning to correct systematic biases and improve calibration. The GBT models leverage features derived from historical data and the initial quantile predictions themselves. By combining the strengths of the probabilistic forecasting approach with the error-correction capabilities of GBTs, the resulting refined quantile forecasts demonstrate reduced prediction intervals and improved statistical properties, particularly in scenarios with non-linear relationships or complex dependencies within the time-series data. This refinement process leads to more robust and dependable probabilistic forecasts of latent factor returns.

Traditional time-series forecasting often produces point forecasts, which represent a single predicted value for a future time step. In contrast, the implemented probabilistic forecasting approach, utilizing Chronos and quantile regression, generates a distribution of potential outcomes. This distribution allows for the calculation of prediction intervals, quantifying the uncertainty associated with each forecast. Rather than simply predicting a single expected return for a latent factor, the model provides a range of possible returns, along with the probability of each occurring. This nuanced understanding of potential factor behavior is critical for risk management and portfolio optimization, enabling more informed decision-making under uncertainty, and is superior to methods relying solely on central tendency estimates.

Uncertainty-aware pruning consistently improves the risk-adjusted performance of CAE models, as demonstrated by the concave risk-return frontier, and exhibits robustness across different numbers of latent factors.

Constructing Resilience: Robust Risk Measures in Portfolio Design

Conditional Autoencoders (CAE) are employed to derive latent factor portfolios from a combination of firm characteristics, such as size, value, and momentum, and historical asset returns. This process involves training the CAE to compress high-dimensional input data (firm characteristics and returns) into a lower-dimensional latent space, effectively identifying the underlying factors driving asset behavior. The resulting latent factors are then used as inputs for portfolio construction, enabling the creation of portfolios based on these statistically significant, data-driven representations of asset risk and return. The CAE methodology facilitates the discovery of factors beyond those traditionally used, potentially enhancing portfolio diversification and performance.
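
The paper's CAE is a neural network, but the bottleneck idea can be illustrated with its linear analogue: a truncated SVD, which is the optimal linear autoencoder. The sketch below (a synthetic firm-characteristics panel, illustrative only) compresses characteristics into a few latent factors and measures reconstruction quality.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical panel: 200 firms x 10 standardized characteristics,
# generated from 3 underlying factors plus noise.
n_firms, n_chars, k = 200, 10, 3
true_factors = rng.normal(size=(n_firms, k))
loadings = rng.normal(size=(k, n_chars))
X = true_factors @ loadings + 0.1 * rng.normal(size=(n_firms, n_chars))

# Linear "encoder/decoder": the rank-k truncated SVD is the best
# rank-k reconstruction, i.e. a linear autoencoder bottleneck.
U, s, Vt = np.linalg.svd(X - X.mean(0), full_matrices=False)
latent = U[:, :k] * s[:k]            # encoded latent factors
X_hat = latent @ Vt[:k] + X.mean(0)  # decoded reconstruction

explained = 1 - np.sum((X - X_hat) ** 2) / np.sum((X - X.mean(0)) ** 2)
print(f"variance explained by {k} latent factors: {explained:.2f}")
```

A nonlinear CAE replaces the two matrix multiplications with learned encoder and decoder networks conditioned on the characteristics, but the compress-then-reconstruct logic is the same.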

Portfolio optimization utilizes multiple risk measures to provide a comprehensive evaluation of performance beyond simple return maximization. The $Sharpe\, Ratio$ quantifies risk-adjusted return by dividing excess return by total risk, measured by standard deviation. The $Sortino\, Ratio$ focuses specifically on downside risk, using only negative deviations below a specified return target. The $Omega\, Ratio$ represents the probability of gains exceeding losses, offering a non-parametric assessment of risk and return. Finally, $Maximum\, Drawdown$ identifies the largest peak-to-trough decline during a specific period, providing a measure of potential loss and informing risk tolerance considerations. Combining these measures allows for a nuanced understanding of portfolio risk and facilitates the selection of portfolios aligned with investor preferences.
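
All four measures are straightforward to compute from a return series. A minimal numpy sketch with hypothetical monthly returns, using the common conventions above (a zero target for the Sortino and Omega ratios):

```python
import numpy as np

def sharpe(r, rf=0.0):
    """Mean excess return divided by total volatility."""
    ex = r - rf
    return ex.mean() / ex.std()

def sortino(r, target=0.0):
    """Mean excess return divided by downside deviation only."""
    downside = np.minimum(r - target, 0.0)
    return (r.mean() - target) / np.sqrt(np.mean(downside ** 2))

def omega(r, threshold=0.0):
    """Ratio of total gains above the threshold to total losses below it."""
    gains = np.clip(r - threshold, 0, None).sum()
    losses = np.clip(threshold - r, 0, None).sum()
    return gains / losses

def max_drawdown(r):
    """Largest peak-to-trough decline of the cumulative wealth curve."""
    wealth = np.cumprod(1 + r)
    peak = np.maximum.accumulate(wealth)
    return np.max(1 - wealth / peak)

rng = np.random.default_rng(3)
returns = rng.normal(0.01, 0.04, size=240)  # hypothetical monthly returns
print(sharpe(returns), sortino(returns), omega(returns), max_drawdown(returns))
```

Reporting the four side by side makes trade-offs visible: two portfolios with equal Sharpe ratios can differ sharply in drawdown and downside deviation.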

The Tangency Portfolio is determined as the point on the efficient frontier where the Sharpe Ratio is maximized. This portfolio represents the optimal balance between expected return and total risk, as measured by standard deviation. Calculation involves identifying the weights of each asset within the portfolio that yield the highest $\frac{E[R_p] - R_f}{\sigma_p}$, where $E[R_p]$ is the expected portfolio return, $R_f$ is the risk-free rate, and $\sigma_p$ represents the portfolio’s standard deviation. The resulting portfolio provides the highest possible return for a given level of risk, or conversely, the lowest risk for a given level of return, within the defined investment universe and optimization constraints.
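
Under the standard mean-variance assumptions, the tangency portfolio has a closed form: weights proportional to $\Sigma^{-1}(\mu - R_f)$, normalized to sum to one. A hedged sketch with hypothetical expected returns and covariances:

```python
import numpy as np

def tangency_weights(mu, cov, rf=0.0):
    """Closed-form max-Sharpe portfolio: w proportional to cov^-1 (mu - rf),
    rescaled so the weights sum to one."""
    excess = mu - rf
    raw = np.linalg.solve(cov, excess)
    return raw / raw.sum()

# Hypothetical inputs for three assets (annualized).
mu = np.array([0.08, 0.10, 0.06])
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.03]])

w = tangency_weights(mu, cov, rf=0.02)
sr = (w @ mu - 0.02) / np.sqrt(w @ cov @ w)
print("weights:", np.round(w, 3), "Sharpe:", round(sr, 3))
```

By construction this Sharpe ratio is at least as high as that of any single asset or other mix in the universe, which is why the tangency point anchors the optimization.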

Log-Sum-Exp regularization is implemented during the factor selection process to enhance the stability and manage the complexity of the extracted latent factors. This technique adds a penalty term to the optimization function, calculated as $\lambda \sum_{i} \log(1 + \exp(\beta_i))$, where $\lambda$ is a regularization parameter and $\beta_i$ represents the weight assigned to the $i$-th factor. By utilizing the log-sum-exp function, the regularization promotes a smoother, more gradual evolution of factor weights, preventing abrupt shifts in portfolio composition and mitigating the risk of overfitting to historical data. This approach effectively encourages the model to maintain a balance between utilizing informative factors and avoiding overly complex factor combinations, resulting in more robust and interpretable portfolios.
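
The penalty itself is simple to compute, though the naive form overflows for large weights. A small sketch (with an illustrative $\lambda$ and factor weights) using numpy's numerically stable `logaddexp`:

```python
import numpy as np

def log_sum_exp_penalty(beta, lam=0.1):
    """Penalty lam * sum_i log(1 + exp(beta_i)), computed via
    logaddexp(0, beta) so large beta_i do not overflow exp()."""
    return lam * np.sum(np.logaddexp(0.0, beta))

# Illustrative factor weights, including one large enough to overflow exp().
beta = np.array([-3.0, 0.0, 2.0, 50.0])
print(log_sum_exp_penalty(beta))
```

For large positive $\beta_i$ the term grows linearly (like an L1 penalty), while for negative $\beta_i$ it decays smoothly toward zero, which is what keeps factor weights evolving gradually rather than switching on and off.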

Low correlations in out-of-sample returns (2000-2024) between adaptive strategies (including ZS-Chronos, Q-Boost, IID-BS, CAE, SPY, and ensemble portfolios) demonstrate their complementary predictive capabilities and support the benefits of ensemble construction.

Beyond the Horizon: Implications for a New Era of Financial Modeling

Traditional portfolio construction often relies on point estimates for expected returns, overlooking the inherent uncertainty in latent factors that drive asset performance. This approach can lead to portfolios that are overly optimistic and vulnerable to unexpected market shifts. Instead, the methodology presented explicitly models this uncertainty, recognizing that factor returns are not fixed values but rather distributions of possible outcomes. By incorporating this probabilistic view, the framework generates more realistic portfolio allocations that account for a wider range of potential scenarios. The resulting portfolios are demonstrably more robust, better equipped to withstand market volatility, and capable of delivering consistent, risk-adjusted returns even under adverse conditions. This shift from deterministic to probabilistic modeling represents a significant advancement in financial modeling, enabling asset managers to build portfolios that are not only optimized for expected returns but also resilient to unforeseen risks.

The convergence of time-series foundation models with sophisticated risk assessment techniques demonstrably enhances portfolio performance, yielding improvements in both returns and risk mitigation. Empirical results indicate that portfolios constructed using this methodology achieve a compelling balance between reward and potential loss, evidenced by a Sharpe ratio of 2.22 and a Sortino ratio reaching 4.01. Crucially, downside risk is significantly constrained, with the model registering a maximum drawdown below 10%. This suggests a robust strategy capable of navigating market fluctuations while delivering consistent, risk-adjusted returns, offering a substantial advantage in volatile financial landscapes.

Asset managers currently face unprecedented challenges stemming from rapidly shifting global dynamics and increasingly intricate market behaviors. This research introduces a methodology designed to empower these professionals with a sophisticated approach to portfolio construction, specifically tailored for navigating such turbulent conditions. By explicitly modeling uncertainty and leveraging the predictive capabilities of time-series foundation models, the framework enables a more nuanced understanding of risk and return profiles. The resulting portfolios pair enhanced risk-adjusted returns with significantly reduced downside risk, as the Sharpe, Sortino, and maximum drawdown figures above attest. This represents a substantive advancement beyond traditional methods, offering a practical and robust tool for achieving long-term investment success in an environment characterized by heightened volatility and complexity.

Long-term investment success hinges fundamentally on the capacity to quantify and effectively manage inherent market uncertainty. Recent performance data demonstrates this principle, as exemplified by ZS-Chronos, which achieved an annual return of 11.79% following 2018. This result isn’t simply about gains, but also about the quality of those returns; the model consistently delivered a Sharpe ratio of 1.826, indicating strong risk-adjusted performance, and a Sortino ratio of 3.898, highlighting superior downside risk protection. These metrics collectively suggest that a methodology focused on explicitly modeling uncertainty doesn’t merely predict market behavior, but actively constructs portfolios resilient to adverse conditions, ultimately fostering sustainable growth and investor confidence.

Despite benchmark index losses, all adaptive strategies generated positive yearly returns during market drawdowns, except for ZS-Chronos, which experienced a minor 1.25% loss in 2008.

The pursuit of optimal portfolios, as detailed within this work, resembles an attempt to map the contours of the unknowable. Each iteration of the Conditional Autoencoder, each refinement of factor selection through uncertainty quantification, is a step further into a complex system where complete knowledge remains elusive. It echoes a sentiment articulated by Georg Wilhelm Friedrich Hegel: “We do not know what we do not know.” The study’s emphasis on uncertainty isn’t a concession to limitation, but a recognition that true understanding arises from acknowledging the inherent gaps in any model, mirroring the way a black hole bends perception: a constant reminder that the map is never the territory.

What Lies Beyond the Horizon?

The pursuit of optimized portfolios, rendered through the lens of Conditional Autoencoders and uncertainty quantification, arrives at a familiar impasse. Improved performance, even statistically significant gains, is not a destination, but a momentary reprieve. The very factors selected as predictive, those deemed ‘important’ by the algorithm, are transient signals, subject to the same gravitational pull as any other data point. Any prediction is just a probability, and it can be destroyed by gravity. The model’s success hinges on capturing relationships within a specific historical window; a window which, by definition, does not include the future.

Future work will undoubtedly explore more sophisticated architectures, larger datasets, and perhaps even attempts to incorporate non-stationarity directly into the latent space. Yet, the core limitation remains: the map is not the territory. The algorithm can refine its understanding of past behavior, but it cannot anticipate the truly novel event, the black swan that renders all prior optimization irrelevant. Black holes don’t argue; they consume.

The next iteration will likely demand a reckoning with the inherent unknowability of asset pricing. Perhaps the focus should shift from maximizing returns to minimizing regret: not seeking the optimal path, but building resilience against inevitable disruption. The true challenge lies not in predicting the future, but in preparing for its unpredictability.


Original article: https://arxiv.org/pdf/2511.17462.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
