When AI Collides: Tribalism in Multi-Agent Systems

Author: Denis Avetisyan


New research reveals that even sophisticated AI agents can exhibit surprisingly human-like, and counterproductive, behavior when competing for limited resources.

The system, populated by 154 agents, fractured into three behavioral clusters: Opportunistic (48.1%, characterized by high request frequency and system overload), Aggressive (27.3%, frequent requests with moderate efficiency), and Conservative (24.7%, experiencing severe resource starvation lasting up to 73.5 rounds). Even within a seemingly uniform population, distinct and potentially destabilizing interaction profiles emerged, with no agents exhibiting near-baseline behavior; the separation into three clusters is supported by a silhouette score of 0.458 for k=3.

Experiments with multiple language model agents demonstrate that correlated decision-making can lead to congestion and undermine efficiency in shared infrastructure environments.

Despite growing optimism surrounding artificial intelligence, the deployment of multi-agent systems for critical infrastructure control presents unexpected challenges. This is explored in ‘Three AI-agents walk into a bar… “Lord of the Flies” tribalism emerges among smart AI-Agents’, which demonstrates that, in scenarios requiring shared resource allocation, intelligent agents can exhibit counterproductive, tribalistic behavior. Specifically, the research reveals that these agents not only fail to optimize resource utilization but often exacerbate congestion and systemic failure, performing worse than random decision-making. Could the very intelligence we imbue in these systems inadvertently lead to emergent dynamics that undermine their intended purpose and compromise overall system stability?


The Inevitable Strain: Systems Under Pressure

The expanding deployment of AI agents driven by large language models (LLMs) is poised to dramatically increase the risk of resource overload within computing systems. As these agents proliferate, simultaneously requesting access to shared resources – processing power, memory, and network bandwidth – the potential for contention escalates rapidly. This isn’t simply a matter of increased demand; the very nature of LLM agents, often operating with a degree of autonomy and unpredictability, complicates traditional resource allocation strategies. Systems designed for predictable workloads struggle to adapt to the dynamic and often competing needs of numerous AI entities, potentially leading to performance degradation, instability, and even system failure as demand outstrips available capacity. The issue isn’t theoretical; as agents become more sophisticated and integrated into daily operations, the likelihood of encountering these resource bottlenecks increases significantly, demanding proactive solutions to ensure reliable and scalable AI deployments.

The proliferation of artificial intelligence agents, each driven by large language models, introduces a critical challenge: resource overload. As numerous agents operate within a shared system, they concurrently request access to finite resources – processing power, memory, and network bandwidth – creating contention. This simultaneous demand isn’t merely a matter of slowed performance; it fundamentally threatens system stability. When requests exceed available resources, agents experience delays, errors, and potentially even failure, cascading into broader operational disruptions. The issue isn’t simply about how much demand exists, but the unpredictable and decentralized nature of these requests, making traditional resource allocation methods – designed for predictable loads – increasingly ineffective at preventing critical bottlenecks and ensuring reliable performance.

Conventional resource allocation strategies are proving inadequate when faced with the dynamic demands of large language model (LLM) agents. A recent study highlights a counterintuitive phenomenon: as LLM agents grow in complexity, their performance actually decreases under resource pressure. Specifically, when systems experienced significant overload – exceeding 70% capacity – larger agents demonstrated a coordination failure rate of 72.5%. This is markedly worse than the 53.8% observed in smaller models and considerably higher than the performance of a random baseline, which ranged from 25.9% to 31.25%. The research suggests that the very capabilities enabling advanced reasoning in larger LLMs may be undermined by their increased sensitivity to resource contention, indicating a need for novel allocation methods tailored to the unique challenges posed by decentralized, AI-driven systems.

A system of <span class="katex-eq" data-katex-display="false">N=3</span> AI agents dynamically compete for access to a shared resource with capacity <span class="katex-eq" data-katex-display="false">C</span> at each time step, representing a generalizable model applicable to various resource constraints like compute power, energy, or bandwidth.
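The round-based setup in the figure can be sketched as a minimal simulation. The request probability, capacity, and round count below are illustrative assumptions, not the paper's protocol; the point is only that overload occurs whenever joint demand exceeds the shared capacity C:

```python
import random

def run_rounds(n_agents=3, capacity=2, n_rounds=1000, p_request=0.7, seed=0):
    """Round-based contention model: each agent independently decides whether
    to request the shared resource; a round overloads when demand > capacity.
    Returns the fraction of overloaded rounds."""
    rng = random.Random(seed)
    overloads = 0
    for _ in range(n_rounds):
        demand = sum(rng.random() < p_request for _ in range(n_agents))
        if demand > capacity:
            overloads += 1
    return overloads / n_rounds

# With 3 agents, capacity 2, and independent requests at p = 0.7, overload
# requires all three to request at once, so the expected rate is 0.7^3 = 0.343.
rate = run_rounds()
```

The same loop generalizes to any resource by reinterpreting `capacity` as compute, energy, or bandwidth units.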

The Echo Chamber: Agent Behavior and Systemic Risk

Agent strategies demonstrably influence system-level behavior through variations in resource access and utilization. Specifically, agents employing an aggressive strategy prioritize immediate gains, leading to rapid resource consumption and potential contention. Conservative agents, conversely, exhibit restrained resource requests, potentially underutilizing available capacity. Opportunistic agents dynamically adjust their resource demands based on observed availability, while risk-averse agents minimize resource usage to maintain stability. These differing approaches create distinct behavioral profiles, as evidenced by a Silhouette Score of 0.458, indicating a moderate, but measurable, cluster separation in how these agents interact with system resources.
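The silhouette score used above can be computed directly. The sketch below implements the metric from its definition on made-up one-dimensional behavioral features (e.g. request frequency); the data and cluster labels are illustrative, not the paper's:

```python
def silhouette(points, labels):
    """Mean silhouette coefficient for 1-D points. For each point:
    a = mean distance to its own cluster, b = lowest mean distance to any
    other cluster, s = (b - a) / max(a, b). Scores near 1 indicate
    well-separated clusters; assumes every cluster has >= 2 members."""
    clusters = {}
    for i, label in enumerate(labels):
        clusters.setdefault(label, []).append(i)
    scores = []
    for i, label in enumerate(labels):
        same = [j for j in clusters[label] if j != i]
        a = sum(abs(points[i] - points[j]) for j in same) / len(same)
        b = min(
            sum(abs(points[i] - points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Two clearly separated clusters on a line score close to 1; a score like
# 0.458 indicates measurable but partially overlapping profiles.
score = silhouette([0.0, 1.0, 2.0, 10.0, 11.0, 12.0], [0, 0, 0, 1, 1, 1])
```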

The convergence of agent decisions towards similar actions, particularly when mediated by Large Language Models (LLMs), creates a positive feedback loop that escalates the probability of system overload. LLMs, designed to identify patterns and generate consistent outputs, can unintentionally reinforce correlated agent choices, even if those choices are individually suboptimal. This amplification occurs because agents, leveraging the LLM’s responses, increasingly converge on the same limited set of resources or actions. The resulting concentrated demand exceeds system capacity, leading to performance degradation or failure; this is distinct from simple increased load, as the pattern of demand is the critical factor. Without mechanisms to introduce diversity or mitigate correlated behavior, this feedback loop can rapidly destabilize the system, even with a moderate overall number of agents.
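Why correlated demand is worse than merely increased demand can be made concrete with a small analytic comparison (parameters are illustrative, not drawn from the paper): at the same per-agent request probability, perfectly correlated agents overload the system more often than independent ones.

```python
from math import comb

def overload_prob_independent(n, p, capacity):
    """P(demand > capacity) when n agents request independently with
    probability p: the upper tail of a Binomial(n, p) distribution."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(capacity + 1, n + 1))

def overload_prob_fully_correlated(n, p, capacity):
    """Perfectly correlated agents act as one block: demand is n with
    probability p and 0 otherwise."""
    return p if n > capacity else 0.0

# Same average load in both cases (n * p = 5 requests expected), but the
# correlated population overloads more often.
n, p, cap = 10, 0.5, 5
indep = overload_prob_independent(n, p, cap)      # ~0.377
corr = overload_prob_fully_correlated(n, p, cap)  # 0.5
```

The gap widens as correlation concentrates demand into fewer, larger bursts, which is the amplification mechanism described above.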

Correlation amplification describes the phenomenon where individually rational decisions made by multiple agents can collectively result in systemic instability. Our analysis revealed distinct patterns of resource access among different agent behavioral profiles – aggressive, conservative, opportunistic, and risk-averse – as indicated by a Silhouette Score of 0.458. This score suggests a moderate degree of separation between these clusters, meaning while distinguishable, there is some overlap in resource utilization. The observed correlation isn’t necessarily intentional coordination, but rather a convergence of independent choices that, when aggregated, can exacerbate system load and potentially lead to failure. This effect highlights the importance of considering collective behavior when designing and evaluating multi-agent systems.

Homogeneous populations of risk-averse agents nearly match the performance of a random baseline, while optimistic agents demonstrate significantly higher overload <span class="katex-eq" data-katex-display="false">S_{eff}</span>, suggesting partially diverse action sequences despite uniform personality prompts.

The Illusion of Control: Congestion and System Resilience

Effective congestion control is critical for system stability and performance under load, as high demand can quickly overwhelm resources and lead to service degradation or failure. Without mechanisms to manage incoming requests, systems experience increased latency, packet loss, and reduced throughput. This is especially relevant in network systems and distributed computing environments where shared resources are common. Prioritizing congestion control ensures that systems can maintain acceptable levels of service even during peak usage, preventing cascading failures and safeguarding data integrity. A robust congestion control strategy directly correlates with improved system safety, efficiency, and overall user experience.

A Capacity-Matching Random Baseline operates by accepting requests randomly up to a predetermined system capacity, serving as a foundational performance comparison point. While simple to implement, this baseline demonstrates an overload frequency between 25.9% and 31.25% in testing, indicating substantial room for improvement through more advanced congestion control mechanisms. Sophisticated strategies are necessary to move beyond this random acceptance rate and optimize resource utilization by dynamically adjusting acceptance rates based on observed system load and prioritizing efficient queue management. This allows for increased throughput and reduced latency compared to the baseline’s static capacity limit.

Effective congestion control strategies are fundamentally limited by system capacity; any approach must acknowledge the finite resources available and prioritize a balance between allowing access to those resources and maintaining overall system stability. Evaluations using a Capacity-Matching Random Baseline demonstrate that even a simplistic, non-adaptive approach results in overload frequencies ranging from 25.9% to 31.25%. This indicates that while a basic level of functionality can be achieved without complex algorithms, substantial improvements are necessary to move beyond naive methods and achieve efficient, reliable operation under high-demand conditions.

The Predictable Failure: Game Theory and Resource Harmony

The ‘El Farol Bar Problem’, originally conceived as a social dilemma involving attendance at a popular bar, provides a compelling model for understanding challenges in decentralized resource allocation. This game-theoretic scenario illustrates how rational individuals, acting independently and with incomplete information, can inadvertently create congestion or underutilization. Each participant attempts to predict whether attendance will be low enough to enjoy the bar – but if everyone reasons similarly, the bar becomes overcrowded, defeating the purpose. This dynamic translates directly to scenarios like network traffic, where numerous agents – users requesting data – attempt to access limited resources. Successfully navigating such situations demands strategies that move beyond individual optimization and consider the collective impact of decentralized decisions, highlighting the need for mechanisms that encourage a balance between access and avoidance of systemic overload.
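A toy version of the El Farol dynamic shows the self-defeating feedback directly. The predictor rule (attend if a noisy extrapolation of last round's turnout stays under the comfort threshold) and all parameters are illustrative assumptions:

```python
import random

def el_farol(n_agents=100, capacity=60, n_rounds=200, seed=0):
    """Each agent attends if its noisy prediction of attendance, based on
    last round's turnout, falls below the comfort threshold. Because every
    agent reasons the same way, attendance oscillates around the threshold
    instead of settling at it."""
    rng = random.Random(seed)
    last = capacity  # start exactly at the threshold
    history = []
    for _ in range(n_rounds):
        attend = sum(
            1 for _ in range(n_agents)
            if last + rng.gauss(0, 10) < capacity
        )
        history.append(attend)
        last = attend
    return history

history = el_farol()
```

An empty bar one round convinces everyone to attend the next, overcrowding it, which then empties it again: individually sensible predictions produce collectively unstable swings on both sides of capacity.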

Considering resource allocation as a coordination game allows for the prediction and mitigation of systemic overload. This approach frames interactions between agents – be they individuals accessing a network, or devices requesting bandwidth – as strategic choices within a game where success hinges on anticipating others’ behavior. By applying game-theoretic principles, researchers can identify ‘stable states’ where no agent benefits from unilaterally changing its strategy, preventing cascading failures or congestion. Simulations demonstrate that strategies promoting decentralized coordination, where agents adapt to observed resource availability, are significantly more robust than centralized control mechanisms. These decentralized strategies effectively distribute load, reducing the likelihood of bottlenecks and ensuring continued functionality even under fluctuating demand, ultimately enhancing the resilience of the entire system.

Research demonstrates a critical relationship between maximizing resource efficiency and maintaining system stability through proactive congestion control. Simulations revealed that an ‘opportunistic’ approach – where agents dynamically adjust requests based on perceived load – yielded the most consistent performance, evidenced by a remarkably tight standard deviation in request frequency of just 0.086. Conversely, a ‘conservative’ strategy, prioritizing guaranteed access, resulted in prolonged periods of resource deprivation for participating agents, lasting up to 73.5 rounds of the simulation. These findings suggest that prioritizing consistent access over sheer efficiency can create systemic bottlenecks, highlighting the importance of intelligent, adaptive strategies that balance demand and availability to optimize both performance and resilience.

The study reveals a predictable truth: systems, even those populated by ostensibly intelligent agents, gravitate towards emergent behaviors. These agents, attempting to navigate a shared resource, didn’t achieve optimized allocation – instead, they mirrored the self-defeating dynamics of the El Farol problem. This isn’t a failure of intelligence, but a testament to the inherent unpredictability of complex systems. As Barbara Liskov observed, “Programs must be designed with change in mind.” The agents, fixed in their initial programming, lacked the adaptability to resolve the congestion they collectively created, demonstrating that even clever designs eventually encounter the limitations of a static architecture. Every dependency, in this case the shared resource, is a promise made to the past, and one that ultimately failed to deliver a stable future.

What’s Next?

The pursuit of multi-agent systems feels less like engineering and more like a prolonged exercise in prophecy. This work, revealing emergent congestion even in simple scenarios, suggests that scalability is just the word used to justify complexity. The agents did not solve the El Farol problem; they mirrored the inherent instabilities of the system, amplifying them with a veneer of intelligence. It is a sobering reminder that optimization, in any domain, eventually trades flexibility for brittle efficiency.

The temptation will be to build ‘better’ agents – those with more sophisticated coordination mechanisms, deeper reasoning abilities, or even explicitly programmed safety protocols. But such efforts may only postpone the inevitable. The perfect architecture is a myth to keep people sane. A more fruitful path lies in acknowledging the fundamental limitations of predictive control and embracing designs that prioritize resilience over optimization.

Future research should move beyond contrived benchmarks and focus on understanding how these systems behave in truly open-ended, unpredictable environments. The question isn’t whether agents can be made safe, but whether a system can become robust enough to contain their inevitable failures. It is a shift in perspective: from controlling the parts to cultivating the ecosystem.


Original article: https://arxiv.org/pdf/2602.23093.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
