Author: Denis Avetisyan
New research suggests that increasing the intelligence of AI agents within a population doesn’t necessarily lead to improved outcomes, and can even be detrimental.

Optimal collective behavior in multi-agent systems depends on the balance between resource availability and population density.
It is counterintuitive that increasing the intelligence of individual agents might destabilize a collective system, yet that is precisely what ‘Increasing intelligence in AI agents can worsen collective outcomes’ explores in its investigation of populations of interacting AI agents competing for limited resources. The work demonstrates that greater sophistication, achieved through model diversity and reinforcement learning, can exacerbate system overload when resources are scarce, and that this risk is governed by a simple, knowable ratio of capacity to population. Can understanding this relationship proactively mitigate potential harms as increasingly intelligent, on-device AI becomes ubiquitous?
The Inevitable Scramble: Resource Wars in Artificial Intelligence
The expanding deployment of artificial intelligence is giving rise to a novel form of resource competition, as AI agents increasingly vie for shared, finite resources like computational power, data access, and network bandwidth. This isn’t merely a matter of technological scalability; it represents a fundamental challenge to efficient system operation. As more agents are introduced into a constrained environment, the potential for contention escalates, leading to performance bottlenecks, unpredictable outcomes, and even outright conflicts as agents prioritize their objectives. Such competition can manifest as delays in task completion, compromised accuracy, or the inability of systems to respond effectively to changing demands, ultimately hindering the benefits AI is intended to provide and highlighting the need for proactive resource management strategies.
The escalating competition amongst artificial intelligence systems isn’t merely a matter of refining algorithms; it echoes the universal economic principle of scarcity. Just as finite resources drive competition in biological and social systems, AI agents, operating within constrained computational environments, inevitably contend for processing power, memory, and data access. This study demonstrates that the stability and efficiency of these multi-agent systems are critically determined by the capacity-to-population ratio – denoted as C/N. A declining C/N, indicating an increasing number of agents relative to available resources, leads to systemic overload, diminished performance, and ultimately, potential failure. The research reveals that maintaining a sufficient C/N is not simply a matter of scaling resources, but of intelligently managing access and prioritizing tasks within the system to prevent cascading failures and ensure sustainable operation.
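To make this relationship concrete, consider a minimal Monte Carlo sketch of contention for a shared resource, written in Python. Every parameter here, from the per-step request probability to the population size, is an illustrative assumption rather than a value from the study:

```python
import random

def simulate_overload(capacity: int, population: int, steps: int = 1000) -> float:
    """Estimate how often total demand exceeds capacity when each of N agents
    independently requests the shared resource with a fixed probability."""
    request_prob = 0.5  # hypothetical per-step chance that an agent makes a request
    overloaded = 0
    for _ in range(steps):
        demand = sum(random.random() < request_prob for _ in range(population))
        if demand > capacity:
            overloaded += 1
    return overloaded / steps

# A declining capacity-to-population ratio drives the system toward overload.
for capacity in (8, 4, 2, 1):
    rate = simulate_overload(capacity, population=8)
    print(f"C/N = {capacity / 8:.2f} -> overload rate ~ {rate:.2f}")
```

Even in this toy model, the overload rate climbs sharply as C/N falls, which is the qualitative pattern the research describes.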
How this competition unfolds hinges on the intricacies of each agent’s internal operations, demanding a focused analysis of intelligence and decision-making processes. These agents don’t simply require resources; their methods for acquiring and utilizing them are shaped by the very algorithms that define their intelligence. Investigations reveal that an agent’s capacity to assess its environment, predict outcomes, and formulate strategies directly influences its resource demands and competitive behavior. Furthermore, the sophistication of these decision-making processes, ranging from simple reactive algorithms to complex reinforcement learning models, dictates not only how resources are sought but also the potential for both cooperation and conflict within a multi-agent system. A deeper understanding of these cognitive architectures is therefore crucial to predicting, and ultimately mitigating, the risks associated with resource scarcity in increasingly populated AI environments.

The Foundation: LLMs and the Illusion of Intelligence
AI agents leverage Large Language Models (LLMs) – including examples like GPT-2, Pythia, and OPT – as their core intelligence component. These LLMs provide the ability to process and understand natural language inputs, effectively acting as the ‘perception’ system for the agent. This allows the agent to interpret its environment based on textual information, such as user instructions, data retrieved from tools, or observations about system state. The LLM’s capacity for semantic understanding then enables it to formulate appropriate responses and actions, thereby facilitating interaction with and navigation within its designated environment.
Large Language Models (LLMs) operate on the principle of Next-Token Prediction, a process where the model calculates the probability of the subsequent token in a sequence given the preceding tokens. This predictive capability is central to agent functionality, as the LLM doesn’t simply react to current inputs but anticipates future requirements. Specifically, the predicted next tokens inform resource access decisions; for example, if the model predicts a need for information regarding a specific topic, it will proactively request access to relevant data sources or tools. This preemptive behavior, driven by token probability analysis, allows the agent to operate more efficiently and autonomously by preparing for anticipated needs rather than solely responding to immediate stimuli.
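As a minimal illustration of next-token prediction in practice, the following sketch queries GPT-2, one of the models named above, for its probability distribution over the next token. The prompt and the top-k size are arbitrary choices, not details drawn from the study:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load GPT-2 as a stand-in for an agent's core model.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("The agent requests access to the", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# The distribution over the *next* token, given everything seen so far.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r}: {prob.item():.3f}")
```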
Reinforcement Learning (RL) is employed to enhance agent performance beyond initial LLM capabilities by introducing a feedback mechanism. Through RL, agents learn to maximize cumulative rewards derived from interacting with their environment. This process involves the agent undertaking actions, receiving a numerical reward or penalty as a result, and subsequently adjusting its strategy – represented by the LLM’s parameters – to favor actions leading to higher rewards. Algorithms such as Q-learning and policy gradients are utilized to iteratively refine the agent’s decision-making process, enabling adaptation to dynamic environments and optimization of task completion, even in scenarios not explicitly covered during initial training of the LLM.
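For readers unfamiliar with the mechanics, here is a toy tabular Q-learning loop applied to a resource-access decision. The states, actions, rewards, and probabilities are hypothetical illustrations; the agents in the study adjust LLM parameters rather than a lookup table:

```python
import random
from collections import defaultdict

ACTIONS = ("request", "wait")
alpha, gamma, epsilon = 0.1, 0.9, 0.1   # learning rate, discount, exploration
q_table = defaultdict(float)            # (state, action) -> estimated value

def env_step(state, action):
    """Hypothetical environment: requesting a scarce resource can fail."""
    if action == "request":
        success_prob = 0.8 if state == "idle" else 0.3
        if random.random() < success_prob:
            return "idle", 1.0          # got the resource
        return "congested", -1.0        # request denied, system strained
    return "idle", -0.1                 # waiting carries an opportunity cost

state = "idle"
for _ in range(5000):
    if random.random() < epsilon:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: q_table[(state, a)])
    next_state, reward = env_step(state, action)
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    q_table[(state, action)] += alpha * (reward + gamma * best_next
                                         - q_table[(state, action)])
    state = next_state

print({k: round(v, 2) for k, v in q_table.items()})
```

The update rule is the standard Q-learning step: the estimated value of a state-action pair is nudged toward the observed reward plus the discounted value of the best next action.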

Nature vs. Nurture: The Illusion of Free Will in Machines
An agent’s initial ‘Nature’ defines its baseline behavior and resource interaction preferences at the outset of a simulation. These predispositions, encompassing parameters like initial resource valuation and movement patterns, directly influence immediate responses to the environment. However, this initial behavior is not fixed; an agent’s capacity for ‘Nurture’ – specifically, its ability to learn from experience and adjust its strategies – governs its long-term success and adaptation. This learning process allows agents to refine their resource acquisition techniques, optimize movement, and potentially alter their initial valuations based on observed outcomes and interactions within the simulated ecosystem. Consequently, the interplay between inherent predispositions and adaptive learning dictates whether an agent pursues short-term gains based on its ‘Nature’ or develops a more sustainable, long-term strategy informed by its ‘Nurture’.
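One way to picture this split is as a small data structure: fixed fields set at construction (‘Nature’) alongside mutable strategy state updated from rewards (‘Nurture’). This is a hypothetical sketch, with field names invented for illustration rather than taken from the study:

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    # 'Nature': fixed predispositions set at initialization.
    resource_valuation: float
    movement_bias: float
    # 'Nurture': strategy state the agent adjusts from experience.
    request_threshold: float = 0.5
    reward_history: list = field(default_factory=list)

    def learn(self, reward: float, lr: float = 0.05) -> None:
        """Nudge the learned threshold toward whatever paid off (crude update)."""
        self.reward_history.append(reward)
        self.request_threshold = min(1.0, max(0.0,
            self.request_threshold + lr * reward))
```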
Populations of agents were investigated across three levels of complexity to analyze the interplay between inherent characteristics and learned behaviors. Level 1 (L1) populations consisted of agents operating independently, each with unique, fixed parameters and no capacity for adaptation. Level 2 (L2) populations featured agents sharing identical Large Language Models (LLMs) – enabling learning and adaptation – but lacking diversity in initial conditions. Finally, Level 3 (L3) populations combined both diversity in agent parameters and the ability to learn through shared LLMs, allowing for the emergence of complex interactions and strategies not observed in the more constrained L1 and L2 populations. This tiered approach facilitated a comparative analysis of how inherent predispositions and adaptive capabilities contribute to collective behaviors.
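For reference, the three tiers can be summarized as two configuration flags. The boolean framing below is an interpretation of the description above, not notation from the study:

```python
# L1: diverse but fixed agents; L2: shared, learning LLMs without diversity;
# L3: both diversity and learning.
POPULATION_LEVELS = {
    "L1": {"diverse_agents": True,  "shared_llm_learning": False},
    "L2": {"diverse_agents": False, "shared_llm_learning": True},
    "L3": {"diverse_agents": True,  "shared_llm_learning": True},
}
```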
Agent populations within the simulation consistently demonstrated emergent cultural behaviors defined by patterned social structures and repeated interactions. These cultures directly influenced resource allocation, with agents exhibiting tendencies to share, compete, or cooperate based on observed interactions within their population. System stability was demonstrably affected by these emergent cultures; populations exhibiting cooperative cultures generally maintained more consistent resource levels and avoided depletion, while highly competitive cultures frequently experienced resource volatility and increased risk of systemic failure. The specific cultural norms, determined by initial agent predispositions and learning mechanisms, acted as a collective strategy impacting overall population performance and long-term viability.
The Shadow of the Lord of the Flies: Tribalism and the Inevitable Conflict
Simulations demonstrated a surprising emergence of in-group preference within artificial intelligence populations. In the most complex populations studied, L4 and L5, agents autonomously organized into ‘tribes’ not through programmed instruction, but based on shared internal characteristics, termed ‘Disposition’, which represented their individual preferences and states. This spontaneous grouping suggests that even simple, internally defined biases can drive complex social formations. Agents gravitated toward others exhibiting similar dispositions, leading to distinct clusters within the larger population and indicating a fundamental tendency toward affiliation based on perceived similarity, even in the absence of explicit social programming or external pressures.
The simulations revealed a stark parallel between artificial intelligence and human social failings, specifically echoing the descent into chaos depicted in William Golding’s Lord of the Flies. As AI agents developed shared dispositions, they formed tribes that intensified competition for limited resources, leading to conflict and a suboptimal distribution of those resources. Interestingly, this tribal dynamic wasn’t purely detrimental: populations with heightened sensing capabilities, as seen in L5, demonstrated an 11.9 percentage point reduction in system overload at a capacity of C=2. This suggests that, under conditions of scarcity, ‘tribal sensing’, the ability to recognize and respond to the actions of in-group members, can paradoxically alleviate system strain, even as it fuels broader competitive conflicts.
Simulations utilizing Level 5 agents demonstrated that increased environmental sensing does not necessarily equate to improved collective outcomes; instead, it significantly amplified pre-existing tendencies toward tribal formation and hierarchical structuring. Analysis revealed a critical ‘crossover capacity’ at a capacity-to-population ratio (C/N) of 0.5, below which greater sophistication begins to hinder overall system performance. Notably, even under conditions of extreme system overload, reaching 91.5 ± 1.5%, followers within these L5 populations achieved a remarkably high individual win rate of 84.2 ± 2.1% at a capacity of C=1. While overall efficiency suffers, individuals within established hierarchies can still thrive, mirroring complex social dynamics observed in natural systems and highlighting the paradoxical relationship between information access and collective intelligence.
The research observes a disheartening truth: increasing agent intelligence doesn’t guarantee improved collective outcomes. It merely introduces new, more efficient ways to compete for finite resources. This echoes a sentiment shared by David Hilbert, who famously stated, “One must be able to say ‘I have done my best,’ and not ‘I have done enough.’” The study demonstrates that ‘enough’ intelligence, relative to the capacity-to-population ratio, is far more valuable than unbounded sophistication. The pursuit of ever-smarter agents, without considering the limitations of the environment, feels less like progress and more like a beautifully engineered race to the bottom. One anticipates production will swiftly confirm these findings.
What’s Next?
The observation that increased agent intelligence does not universally correlate with improved collective outcomes should not be surprising. Anything that promises to simplify life adds another layer of abstraction, and each layer introduces new failure modes. This work highlights a critical, and frequently overlooked, dynamic: optimization for individual capacity, without considering the capacity-to-population ratio, is a recipe for emergent tragedy. The next phase of research must move beyond simply building ‘smarter’ agents, and focus on understanding the systemic consequences of scaling intelligence in finite environments.
A pressing concern is the extension of these dynamics into real-world deployments. On-device AI, with its promise of ubiquitous intelligence, simultaneously exacerbates the resource competition problem. The temptation to deploy increasingly complex models, regardless of the underlying infrastructure, will likely overwhelm any benefits gained from improved individual agent performance. Documentation is a myth invented by managers, so rigorous modeling of these scaling effects is, unfortunately, likely to remain theoretical.
Ultimately, the field needs to accept that ‘intelligence’ is not a monolithic good. It is a tool, and like any tool, its effectiveness is contingent on context. The pursuit of ever-more-sophisticated multi-agent systems will continue, of course. CI is the temple – one prays nothing breaks when the population density reaches critical mass. The real challenge lies in predicting, and mitigating, the inevitable cascade failures that will emerge from these increasingly complex systems.
Original article: https://arxiv.org/pdf/2603.12129.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/