Can AI Chatbots Follow the Rules of Finance?

Author: Denis Avetisyan

New research introduces a system for rigorously testing whether large language models comply with complex financial regulations during user interactions.

FinGuard establishes a regulation-driven benchmark and pipeline to detect financial regulatory non-compliance in LLM interactions, demonstrating improvements over existing safety evaluations.

While large language models offer transformative potential in financial services, a single non-compliant interaction risks substantial regulatory penalties and consumer harm. The paper, “FinGuard: Detecting Financial Regulatory Non-Compliance in LLM Interactions,” introduces a regulation-driven pipeline and benchmark designed to identify violations grounded in specific financial regulations. The authors report that FinGuard, a model trained on regulation-grounded data, substantially outperforms existing guardrails and larger LLMs on a newly released benchmark, while also adapting to institution-specific policies. Can this approach pave the way for more robust and compliant LLM deployments within the heavily regulated financial sector?

The Inherent Risks in LLM-Driven Finance

The financial sector is experiencing a swift integration of large language models, driven by the potential to automate tasks from customer service and fraud detection to algorithmic trading and risk assessment. This adoption, however, isn’t without considerable risk. While LLMs promise increased efficiency and novel insights, their inherent complexity and capacity for generating unpredictable outputs introduce new vulnerabilities. Financial institutions now face challenges related to data security, regulatory compliance, and the potential for biased or misleading information disseminated through LLM-powered applications. Successfully harnessing the opportunities presented by these powerful tools requires a proactive approach to risk management, including robust testing, continuous monitoring, and the development of specialized safeguards tailored to the unique demands of the financial landscape.

Traditional financial compliance systems, designed for manually created content and rule-based monitoring, are increasingly challenged by the sheer volume and nuanced outputs of large language models. These systems struggle to analyze the dynamic and context-dependent nature of LLM-generated text, potentially missing subtle instances of market manipulation, biased recommendations, or the unauthorized disclosure of confidential information. The speed at which LLMs operate further exacerbates this issue; existing review processes simply cannot keep pace with the constant stream of content, creating a significant gap in oversight and increasing the risk of regulatory violations and reputational damage for financial institutions. Consequently, a proactive shift towards more sophisticated, AI-powered compliance tools is becoming essential to mitigate these emerging risks.

The increasing deployment of large language models within financial services has exposed a significant gap in regulatory oversight, necessitating the development of specialized compliance solutions. Current methods, designed for traditional communication channels, are proving inadequate against the volume and nuanced outputs of LLMs, potentially leading to undetected violations of financial regulations. This isn’t simply a matter of scaling existing tools; the unique characteristics of LLM-generated content – including its ability to subtly manipulate language and generate novel, yet non-compliant, statements – demand entirely new approaches to detection. Automated systems capable of understanding financial context, identifying misleading claims, and flagging potentially problematic outputs are no longer a future consideration, but a critical requirement for maintaining market integrity and protecting investors. Without such tools, financial institutions face escalating risks of non-compliance, reputational damage, and substantial penalties.

FinGuard: A Mathematically Principled Compliance Solution

FinGuard is an 8-billion-parameter language model built to detect financial regulatory non-compliance. This model size was selected to balance performance with computational efficiency, allowing for deployment in resource-constrained environments. The model is trained to identify text indicative of non-compliance with financial regulations, including but not limited to issues related to anti-money laundering (AML), know-your-customer (KYC) procedures, and securities law.

FinGuard is built on Qwen3-8B, a publicly available 8-billion-parameter language model, to capitalize on its pre-trained language understanding. To adapt Qwen3-8B to financial regulatory compliance, FinGuard employs LoRA (Low-Rank Adaptation) fine-tuning. LoRA freezes the base model’s weights and trains only a small set of low-rank matrices added alongside them, which significantly reduces computational cost and storage requirements compared to full fine-tuning while largely preserving performance. This parameter-efficient approach allows for rapid experimentation and deployment without substantial resource demands, and enables focused adaptation to the nuances of financial compliance data.
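To make the low-rank idea concrete, here is a minimal pure-Python sketch of a LoRA-style weight update. It is an illustration under stated assumptions, not the paper's training code: the matrix sizes, the rank, the scaling factor `alpha`, and the helper names `lora_effective_weight` and `trainable_params` are all hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank."""
    r = len(a)                      # A has shape (r, d_in)
    delta = matmul(b, a)            # B has shape (d_out, r), so delta is (d_out, d_in)
    scale = alpha / r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def trainable_params(d_out, d_in, r):
    """Compare parameter counts: full fine-tuning vs. a rank-r adapter."""
    return {"full": d_out * d_in, "lora": r * (d_in + d_out)}

# Toy dimensions chosen for illustration only.
d_out, d_in, r = 4, 6, 2
rng = random.Random(0)
W = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen base weight
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]      # trainable
B = [[0.0] * r for _ in range(d_out)]                                  # trainable, zero-initialized

W_eff = lora_effective_weight(W, A, B, alpha=4)

# Hypothetical layer size and rank, not taken from the paper.
counts = trainable_params(4096, 4096, 16)
```

Because `B` starts at zero, the adapted layer initially behaves exactly like the frozen base layer, and only `A` and `B` would receive gradient updates. For the hypothetical 4096 by 4096 layer at rank 16, the adapter trains 131,072 parameters instead of 16,777,216.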

Supervised fine-tuning served as the initial training methodology for FinGuard, establishing a foundational capability for detecting prevalent financial regulatory non-compliance. This process involved training the model on a labeled dataset of compliance-related texts, explicitly associating input data with corresponding compliance issue classifications. The resulting model then learns to predict these classifications based on new, unseen text. This supervised approach provides a strong baseline performance by directly addressing common compliance violations, enabling the model to accurately identify and flag instances of these issues with a high degree of confidence before further refinement through other training techniques.

Regulation-Driven Data Synthesis and Robustness Through Iteration

The FinGuard system utilizes a Regulation-Driven Pipeline to construct a training dataset by programmatically extracting ‘Compliance Points’ directly from source regulatory documents. This process involves parsing legal text and identifying specific clauses, requirements, and prohibitions outlined by governing bodies. The extracted Compliance Points are then structured and labeled, forming the foundation for supervised learning tasks. This automated approach ensures the training data is directly derived from official sources, minimizing interpretation bias and facilitating consistent application of regulatory standards. The resulting dataset is dynamic, automatically updated as new or revised regulations are published, and is designed to support the continuous training and refinement of FinGuard’s compliance detection capabilities.
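The article does not say how the extraction is implemented, so the following is only a toy sketch of the idea: it splits regulatory text into sentences and keeps those that contain an obligation or prohibition cue. The cue list, the `CompliancePoint` fields, the function name, and the sample text are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical modal cues that mark a prohibition or an obligation.
_CUES = [
    ("prohibition", re.compile(r"\b(must not|shall not|may not|is prohibited)\b", re.I)),
    ("obligation", re.compile(r"\b(must|shall|is required to)\b", re.I)),
]

@dataclass
class CompliancePoint:
    source: str   # identifier of the regulatory document
    kind: str     # "obligation" or "prohibition"
    text: str     # the sentence carrying the requirement

def extract_compliance_points(source, regulation_text):
    """Split text into sentences and keep those containing a modal cue."""
    points = []
    for sentence in re.split(r"(?<=[.;])\s+", regulation_text.strip()):
        for kind, pattern in _CUES:   # prohibitions are checked first
            if pattern.search(sentence):
                points.append(CompliancePoint(source, kind, sentence))
                break
    return points

# Invented sample text, not quoted from any real regulation.
sample = (
    "A firm must verify the identity of each customer before opening an account. "
    "A representative shall not guarantee investment returns. "
    "This section applies from the date of publication."
)
points = extract_compliance_points("example-rule", sample)
for p in points:
    print(p.kind, "|", p.text)
```

A keyword heuristic like this would miss many requirements that real legal text expresses indirectly; it is meant only to show what a structured, labeled compliance point might look like.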

The FinGuard system constructs a ‘Risk Taxonomy’ by applying HDBSCAN clustering to text embeddings generated using the Text-Embedding-v4 model. This process analyzes extracted ‘Compliance Points’ from regulatory documents and groups similar violation types together, creating a hierarchical classification of potential risks. HDBSCAN, a density-based clustering algorithm, is utilized for its ability to identify clusters of varying shapes and densities without requiring a predefined number of clusters. Text-Embedding-v4 provides high-dimensional vector representations of the compliance text, capturing semantic relationships crucial for accurate grouping and the creation of a nuanced taxonomy that reflects the complexity of financial regulations.
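As a rough illustration of density-based grouping, the sketch below implements plain DBSCAN in pure Python. This is a deliberately simpler relative of the algorithm named above, swapped in to keep the example self-contained, and it runs on made-up two-dimensional points rather than real text embeddings. The `eps` and `min_pts` values are arbitrary.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1            # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster   # border point reached from a core point
            if labels[j] is not None:
                continue              # already assigned; do not expand again
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)   # j is a core point: keep expanding
    return labels

# Toy stand-ins for embedding vectors: two dense groups and one outlier.
embeddings = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (10.0, 0.0),
]
labels = dbscan(embeddings, eps=0.3, min_pts=2)
print(labels)
```

The number of clusters is never specified up front; it emerges from the density of the data, and isolated points are left unassigned as noise. In a taxonomy-building setting, each resulting cluster would correspond to one candidate risk category.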

Adversarial Augmentation within FinGuard utilizes the Qwen3.5-397B-A17B large language model to generate synthetic data variations designed to challenge the system’s classification capabilities. This process introduces perturbations to existing data instances, creating examples that are subtly different but represent potential edge cases or attempts to circumvent detection logic. By training on these augmented datasets, FinGuard increases its robustness against adversarial attacks and improves its ability to accurately identify violations across a wider range of input conditions. The generated variations focus on plausible, yet challenging, modifications to input text, effectively simulating real-world attempts to obfuscate or misrepresent information.

Self-Play Reinforcement Learning (SPRL) within FinGuard operates by utilizing the model’s own classification outputs as training data. This iterative process involves FinGuard classifying a dataset, then treating these classifications – including both correct and incorrect predictions – as a reward signal. The model then adjusts its internal parameters to maximize the accuracy of future classifications, effectively learning from its mistakes and refining its decision-making process without requiring external labeled data. This closed-loop system allows FinGuard to continuously improve performance and adapt to evolving patterns in financial compliance data, enhancing its robustness and reducing reliance on manually curated training sets.

Demonstrating Superior Performance with Expert Validation

To ensure reliable performance, FinGuard underwent comprehensive evaluation using FinGuard-Bench, a meticulously curated benchmark constructed through expert annotation specifically for financial compliance detection. This benchmark serves as a critical testing ground, exposing the model to a diverse range of scenarios requiring adherence to complex financial regulations. The rigorous assessment process involved analyzing FinGuard’s ability to identify non-compliant queries and responses, providing a quantifiable measure of its effectiveness and allowing for direct comparison against existing models. This emphasis on expert-validated benchmarks is fundamental to establishing trust and demonstrating FinGuard’s capacity to navigate the nuanced landscape of financial compliance with accuracy and dependability.

FinGuard exhibits robust performance in identifying potentially non-compliant financial interactions, accurately flagging problematic requests and responses. Evaluations using the FinGuard-Bench benchmark reveal the model achieves a 90.23% F1 score when assessing user queries and an 85.43% F1 score when analyzing the system’s responses. These results demonstrate a substantial improvement over existing models; PolyGuard-Qwen-7B, for example, achieves only a 52.46% F1 score, while Qwen3Guard-Gen-8B reaches 73.49%. This heightened detection capability suggests FinGuard offers a more reliable safeguard against regulatory breaches, separating compliant interactions from those requiring further scrutiny.
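For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall for a chosen class. The sketch below computes it for an "unsafe" class on a handful of invented labels; the label names and the data are assumptions for illustration and are unrelated to FinGuard-Bench.

```python
def f1_for_class(y_true, y_pred, positive="unsafe"):
    """Return (precision, recall, f1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Invented gold labels and predictions for six interactions.
y_true = ["unsafe", "unsafe", "unsafe", "safe", "safe", "unsafe"]
y_pred = ["unsafe", "unsafe", "safe", "unsafe", "safe", "unsafe"]

precision, recall, f1 = f1_for_class(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here one unsafe interaction is missed and one safe interaction is wrongly flagged, giving three true positives, one false positive, and one false negative, so precision, recall, and F1 all come out to 0.75.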

FinGuard’s adaptability is significantly enhanced through the incorporation of institution-specific policy documents, allowing it to move beyond generalized financial regulations and address nuanced internal compliance requirements. This customization is achieved via a self-play reinforcement learning process, where the model refines its performance against unseen policy variations, resulting in a substantial improvement from an initial 68.42% to a robust 93.62% in accurately identifying compliance breaches. This approach not only elevates FinGuard’s overall efficacy but also broadens its practical application, making it a valuable asset for financial institutions seeking to tailor their compliance safeguards to unique operational contexts and regulatory landscapes.

Evaluations reveal FinGuard achieves state-of-the-art performance in identifying unsafe content within financial queries and responses. Specifically, the model demonstrates a 6.10% improvement in F1 score for the unsafe class at the query level when compared to its Supervised Fine-Tuning (SFT) baseline, indicating a heightened ability to flag potentially problematic requests. This enhanced detection extends to the response level, where FinGuard surpasses the SFT baseline by 0.72%. These results not only showcase significant progress over prior iterations but also establish FinGuard as a leading solution, exceeding the performance of large language models like Qwen3.5-397B-A17B, which achieved F1 scores of 78.68% and 64.08% for queries and responses, respectively.

The pursuit of robust financial compliance, as detailed in FinGuard, calls for traceable grounding in the rules themselves, not merely demonstrable functionality. A saying often attributed to Albert Einstein holds that “If you can’t explain it simply, you don’t understand it well enough.” That sentiment resonates with the paper’s methodology: FinGuard doesn’t simply aim to detect violations but establishes a rigorous, taxonomy-driven pipeline. By grounding the system in defined regulatory requirements and employing adversarial training, the research moves beyond superficial pattern matching. FinGuard remains a learned classifier, so its judgments are statistical rather than formally proven, but its training data and risk taxonomy are derived from compliance points extracted from source regulations. This commitment to clarity and traceability is paramount in a domain where ambiguity can have significant financial and legal consequences.

What Lies Ahead?

The presented work, while demonstrating measurable progress in aligning large language models with financial regulations, merely scratches the surface of a fundamentally intractable problem. Current evaluations, even those employing a taxonomy-driven approach like FinGuard, remain tethered to known violations. The truly concerning risks will, by definition, be novel: emergent behaviors arising from the complex interplay of model parameters and unforeseen user inputs. A benchmark, however meticulously constructed, cannot preempt the undefined.

Future research must move beyond purely empirical validation. The pursuit of provable compliance – algorithms guaranteed to adhere to regulatory constraints – represents the logical, albeit challenging, direction. Self-play reinforcement learning offers a potentially fruitful avenue, but demands a formal specification of ‘regulatory intent’ – a task that itself invites philosophical debate. The elegance of a solution is not measured by its performance on a static dataset, but by the consistency of its behavior under all possible conditions.

Ultimately, the field must confront a sobering truth: complete safety is an asymptotic ideal. The cost of eliminating all risk – of achieving perfect alignment – will likely exceed any conceivable benefit. A pragmatic approach demands a rigorous understanding of the trade-offs between regulatory adherence and model utility, acknowledging that a degree of calculated risk is inherent in any complex system.

Original article: https://arxiv.org/pdf/2605.29427.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
