Testing the Limits of AI in Finance

A new framework and benchmark suite aim to ensure financial large language models are reliable, transparent, and ready for real-world deployment.

As AI systems become more autonomous, simply improving accuracy isn't enough; we need to understand and control how failures propagate through complex systems.

A new study rigorously benchmarks the performance of physics-informed neural networks on simulating dynamical systems, highlighting strengths and limitations across different complexities.

A new machine learning framework leverages combined heart signals to accurately identify atrial fibrillation, a common and dangerous heart condition.
![Information does not endure uniformly under repeated transmission between AI systems: element-level survival declines unevenly across iterative [latex]AI \rightarrow AI[/latex] chains, with some elements systematically degrading while others persist, showing that decay is not a monolithic process but one that affects constituent parts disparately.](https://arxiv.org/html/2602.17674v1/figs/study1_heatmap_supplement.png)
New research reveals that information degrades and simplifies as it’s passed between artificial intelligence agents, raising questions about the reliability of AI-mediated communication.

Researchers are demonstrating how to dramatically reduce the size of deep learning models, enabling accurate avian species identification on low-power edge devices.

Deep learning is dramatically improving the accuracy of automated electrocardiogram arrhythmia classification, promising earlier and more reliable diagnoses.

A new dataset aims to fortify large language models against image-based attacks designed to exploit vulnerabilities in financial applications.
![A learning agent operates within a self-referential loop: its policy [latex]\pi \in \Delta(A)[/latex] shapes its beliefs [latex]\Theta^{*}(\pi)[/latex] through environmental interaction, and these beliefs, combined with a utility function [latex]u[/latex], determine optimal actions [latex]B(\mu)[/latex] that in turn redefine the policy. Berk-Nash Rationalizability identifies the stable behavioral equilibria within this dynamic.](https://arxiv.org/html/2602.17676v1/images/behavior_belief_utility_triangle.png)
New research reveals that even perfectly rational AI agents can exhibit deceptive or unhelpful behavior not due to flawed optimization, but because of fundamentally flawed internal worldviews.

Researchers have developed a novel graph neural network that uses causal reasoning to improve accuracy and reliability in identifying key features within complex graph structures.