The AI Security Paradox: Why Today’s Language Models Fall Short

Author: Denis Avetisyan


A new analysis reveals that current large language models are fundamentally unsuited for critical security roles, raising serious concerns for risk management and regulatory compliance.

This paper argues that the inherent unreliability of large language models conflicts with established security norms and emerging EU regulations like the AI Act and Cyber Resilience Act.

While increasingly touted as innovative tools, the application of Large Language Models (LLMs) to cybersecurity presents a paradoxical risk to the very systems they aim to protect. This paper, ‘Large Language Models as a (Bad) Security Norm in the Context of Regulation and Compliance’, examines how current LLMs fall short of established security norms and legal obligations, particularly within the evolving EU regulatory landscape. Our analysis reveals that inherent limitations in LLM architecture create vulnerabilities that conflict with fundamental cybersecurity principles and risk non-compliance with legislation like the AI Act and Cyber Resilience Act. Given the growing reliance on these models, can we adequately mitigate these risks before they compromise critical digital infrastructure?


The Inevitable Erosion: Navigating the Evolving Threat Landscape

Contemporary cybersecurity standards, bolstered by legislation such as the Network and Information Systems Directive 2 (NIS2) and the Cyber Resilience Act, face escalating challenges due to the inherent complexity of modern technological systems. These frameworks, while built upon established security principles, are increasingly strained by the expanding attack surface created by interconnected devices, cloud computing, and intricate software dependencies. The proliferation of digital assets and the rapid pace of innovation consistently introduce new vulnerabilities that outpace traditional mitigation strategies. Consequently, a reliance solely on conventional best practices proves insufficient, necessitating a dynamic and adaptive approach to security that acknowledges the evolving threat landscape and prioritizes proactive resilience over reactive defense.

Contemporary cybersecurity relies significantly on a foundation of established best practices – regular patching, robust access controls, and network segmentation, among others – yet these norms are increasingly challenged by a rapidly evolving threat landscape. While historically effective, these practices struggle to keep pace with the sheer volume of new vulnerabilities discovered daily, and the ingenuity of attackers employing advanced techniques like polymorphic malware and supply chain attacks. The inherent complexity of modern IT systems, encompassing cloud infrastructure, interconnected devices, and increasingly sophisticated software, expands the attack surface and introduces novel exploitation pathways. This creates a critical tension where established defenses, though necessary, are no longer sufficient to guarantee comprehensive protection against determined and resourceful adversaries, necessitating a continuous adaptation and reinforcement of security measures.

The emergence of Large Language Models (LLMs) presents a fundamental shift in the cybersecurity landscape, necessitating a move beyond conventional defense strategies. Recent analyses reveal a significant incompatibility between LLMs and established security standards, as these models can be exploited in ways previously unseen – from generating highly convincing phishing attacks to automating vulnerability discovery. Traditional approaches, built on signature-based detection and perimeter defense, struggle to address the nuanced and adaptive threats posed by LLMs, which can rapidly generate novel malicious content and evade existing safeguards. This incompatibility isn’t simply a matter of scaling existing defenses; it demands a reassessment of core security principles, focusing on proactive threat modeling, behavioral analysis, and the development of AI-powered security tools capable of understanding and mitigating the unique risks introduced by these powerful models.

The safeguarding of intellectual property has become critically important in an era increasingly reliant on Large Language Models. These powerful AI systems, while offering substantial benefits, present unique risks of inadvertent data exposure. LLMs are trained on vast datasets, and despite efforts to sanitize this information, sensitive details – including proprietary code, confidential business strategies, and personally identifiable information – can be embedded within the model’s parameters. Consequently, seemingly innocuous prompts can elicit the unintentional release of this protected content, creating significant legal and reputational risks for organizations. This vulnerability extends beyond direct data breaches, as LLMs can also facilitate the reconstruction of proprietary algorithms or the leakage of trade secrets through the analysis of generated outputs, demanding a proactive shift toward AI-aware data governance and security protocols.

The Shifting Sands: LLMs and the Paradigm of Vulnerability

Large Language Models (LLMs) are vulnerable to adversarial attacks, where subtly modified inputs can cause incorrect or unintended outputs, and data poisoning, which involves introducing malicious data into the training set to manipulate model behavior. Adversarial attacks can bypass input validation and exploit statistical weaknesses in the model’s decision boundaries. Data poisoning attacks, conversely, compromise the model’s foundational knowledge, leading to persistent and systemic errors. Both attack vectors directly impact the integrity and reliability of LLMs, potentially causing inaccurate information dissemination, biased outputs, and compromised system security. Successful exploitation of these vulnerabilities can lead to significant operational and reputational damage, particularly in applications where LLMs are used for critical decision-making.

Large Language Models (LLMs) introduce novel code vulnerabilities distinct from those found in traditional software. These arise from the complex interactions within the Transformer architecture and the data-driven nature of model training. Vulnerabilities include prompt injection, where malicious input manipulates model output, and data poisoning, where compromised training data leads to predictable failures or biased responses. Identifying these weaknesses requires analysis beyond conventional static and dynamic code analysis; techniques must address the model’s parameters, attention mechanisms, and embedding layers. Furthermore, the stochastic nature of LLM operation necessitates probabilistic testing and fuzzing approaches to uncover edge cases and potential exploits, demanding a specialized skillset beyond typical software security expertise.

Unlike Symbolic AI systems, which operate on explicitly defined rules and data structures allowing for complete auditability and predictable behavior, Large Language Models (LLMs) function as largely opaque “black boxes”. This lack of transparency poses significant challenges to risk management and digital forensics. The internal decision-making processes of LLMs are difficult to trace, making it challenging to determine the root cause of errors, biases, or malicious outputs. Consequently, verifying the integrity of LLM-generated content, establishing accountability, and conducting effective investigations into security incidents become substantially more complex compared to traditional, rule-based systems where logic and data flow are readily inspectable.

The Transformer architecture, foundational to Large Language Models, presents inherent incompatibilities with established cybersecurity principles and evolving legal frameworks. Our research indicates that the probabilistic and context-dependent nature of Transformer models violates principles of deterministic behavior required for auditability and accountability. Specifically, the models’ reliance on distributed representations and attention mechanisms hinders the implementation of data provenance tracking, a core requirement of regulations like the CRA, NIS2, and the proposed EU AI Act. Traditional security practices, such as input validation and static analysis, are ineffective against the nuanced adversarial attacks targeting the attention layers and embedding spaces within the Transformer. This architectural mismatch necessitates a re-evaluation of existing security standards and the development of novel mitigation strategies tailored to the unique characteristics of Transformer-based LLMs.

Fortifying Against the Inevitable: Proactive Security Strategies

Modern cybersecurity demands a holistic approach extending beyond direct network defenses to encompass the security of the entire supply chain and the legal frameworks governing vendor relationships. Supply chain vulnerabilities – encompassing hardware, software, and services sourced from third parties – represent significant attack vectors, requiring thorough vendor risk assessments, security audits, and continuous monitoring. Contractual security provisions should explicitly define security requirements, data protection responsibilities, incident response protocols, and liability terms for all vendors and service providers. These provisions must address data residency, access controls, and the right to audit vendor security practices, ensuring alignment with organizational security policies and regulatory compliance. Failure to address these interconnected elements creates exploitable weaknesses, even with robust perimeter defenses.

Kerchkoff’s Principle, originally formulated for cryptographic systems, posits that a system should remain secure even if the entirety of its design is publicly known; security should not rely on keeping the algorithm or implementation secret. Applying this to AI systems means that reliance on the secrecy of model weights, architecture, or training data is a flawed security strategy. Attackers can often reverse engineer or discover these details, rendering such “security through obscurity” ineffective. Instead, security should be built into the system’s operational logic, data handling procedures, and access controls, focusing on robust input validation, differential privacy, and adversarial training to maintain functionality and data integrity even with complete knowledge of the underlying AI model.

Small Language Models (SLMs) present a viable alternative to Large Language Models (LLMs) in scenarios where computational resources or security are paramount. While LLMs excel in complex tasks due to their extensive parameter counts, SLMs, with significantly fewer parameters, offer reduced attack surfaces and decreased computational demands. This reduction in complexity facilitates greater transparency and explainability in model decision-making, allowing for more effective auditing and vulnerability assessment. Consequently, SLMs are particularly suitable for applications requiring high reliability and deterministic behavior, such as embedded systems, edge computing, and specific natural language processing tasks where nuanced understanding isn’t critical, despite a potential trade-off in overall performance compared to LLMs.

Effective risk mitigation is not a one-time event, but rather a cyclical process demanding continuous attention. This process begins with comprehensive risk identification, encompassing potential vulnerabilities and threat vectors. Following identification, a rigorous assessment phase quantifies the likelihood and potential impact of each identified risk, enabling prioritization of mitigation efforts. Mitigation strategies, once implemented, require ongoing monitoring and validation to ensure effectiveness. Proactive digital forensics plays a crucial role by anticipating potential incidents, collecting and analyzing data before an attack occurs, and establishing a baseline for detecting anomalies. This pre-incident data collection enhances incident response capabilities and informs future risk assessments, completing the cycle and improving overall security posture.

The Horizon of Adaptation: Regulation and Innovation in AI Security

The forthcoming AI Act represents a landmark attempt to govern artificial intelligence, with a significant emphasis on cybersecurity throughout the AI lifecycle. This comprehensive legislation seeks to establish a risk-based framework, categorizing AI systems based on their potential harm and imposing corresponding obligations on developers and deployers. Crucially, the Act mandates security requirements – encompassing data protection, algorithmic transparency, and vulnerability mitigation – for high-risk AI applications, such as those used in critical infrastructure or law enforcement. By defining clear legal responsibilities and enforcement mechanisms, the AI Act aims to foster innovation while simultaneously safeguarding against malicious use and ensuring that AI systems are resilient to cyberattacks, ultimately building public trust in this rapidly evolving technology and setting a global precedent for responsible AI governance.

The emergence of AI-powered development tools like Copilot underscores a critical need for integrating secure coding practices throughout the software development lifecycle. While these tools offer substantial gains in developer productivity by automating code suggestions and generation, they also introduce new avenues for vulnerabilities if not carefully managed. The technology’s reliance on vast datasets for training means it can inadvertently propagate insecure coding patterns or suggest code containing known weaknesses. Consequently, developers must prioritize robust vulnerability detection techniques – including static and dynamic analysis – alongside continuous security testing to validate AI-generated code. This proactive approach isn’t simply about fixing flaws after they’re introduced, but about equipping development environments to recognize and prevent them in the first place, ensuring that the benefits of AI assistance don’t compromise the overall security posture of applications.

The increasing complexity of artificial intelligence necessitates a move towards explainable AI (XAI) not simply as a technical refinement, but as a foundational requirement for responsible deployment. Traditional “black box” AI systems, while often achieving impressive results, lack transparency, making it difficult to understand why a particular decision was reached. This opacity hinders effective risk management, as vulnerabilities and biases remain hidden until they manifest as real-world consequences. XAI aims to address this by developing techniques that allow humans to understand and interpret the reasoning behind AI outputs, fostering trust and accountability. By revealing the factors influencing AI decisions, stakeholders can identify potential errors, mitigate risks, and ensure that these powerful systems align with ethical principles and regulatory requirements. Ultimately, the pursuit of explainability is not about sacrificing performance, but about building AI that is both capable and trustworthy, paving the way for wider adoption and societal benefit.

The rapidly evolving landscape of artificial intelligence necessitates a parallel commitment to continuous innovation in cybersecurity. Traditional security measures are proving inadequate against increasingly sophisticated threats, a reality underscored by recent analyses demonstrating the incompatibility of large language models (LLMs) with existing security standards. This mismatch highlights a critical need for novel methodologies – encompassing areas like adversarial machine learning, differential privacy, and homomorphic encryption – to proactively defend against AI-powered attacks and ensure the integrity of these systems. Maintaining a resilient cybersecurity posture requires not simply reacting to vulnerabilities, but anticipating them through ongoing research, development, and implementation of cutting-edge security technologies, effectively establishing a dynamic defense against a constantly shifting threat environment.

The pursuit of security through Large Language Models presents a curious paradox. This paper meticulously details the inherent flaws in applying such systems to critical infrastructure, highlighting a fundamental disconnect between their capabilities and established security norms. It’s a situation remarkably captured by Bertrand Russell, who observed, “The difficulty lies not so much in developing new ideas as in escaping from old ones.” The reliance on LLMs for cybersecurity, despite mounting evidence of their unsuitability, exemplifies this resistance to abandoning established approaches – in this case, the assumption that automation inherently equates to improved security. The inevitable decay of these systems, as detailed in the study, isn’t due to technical error, but rather a consequence of applying a tool designed for linguistic tasks to a domain demanding absolute reliability and adherence to regulatory frameworks like the proposed AI Act.

What’s Next?

The pursuit of automated security, as currently embodied by large language models, reveals a fundamental tension. Systems are built to resist entropy, yet the very nature of intelligence – artificial or otherwise – is to explore, to deviate, and ultimately, to decay. Treating LLMs as security infrastructure is akin to building defenses on shifting sands; the illusion of control will inevitably erode. The question is not whether these models will fail in a security context, but when, and what the collateral will be.

Future work must move beyond the seductive simplicity of ‘automation’ and grapple with the inherent limitations of these systems. The forthcoming EU regulations-the AI Act and the Cyber Resilience Act-will likely serve as critical pressure tests, exposing the incompatibility of current LLM approaches with established norms of due diligence and risk management. The focus should shift toward acknowledging LLMs as sources of risk, rather than risk mitigators.

Ultimately, a more sustainable path lies in recognizing that uptime is a rare phase of temporal harmony, and technical debt, in this context, is not merely a coding issue but a form of systemic erosion. The field needs to abandon the pursuit of flawless automation and instead concentrate on building resilient systems that can gracefully accommodate-and even learn from-inevitable failure.


Original article: https://arxiv.org/pdf/2512.16419.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/

See also:

2025-12-19 19:45