When Your Swarm Disagrees
AI agents disagree. One wants to cancel your meeting, another insists you need it. Nobody built a tiebreaker. Welcome to the coordination problem that gets harder with every agent you add.
This is an edition of Sci-Fi Saturday, where we explore real technology through the lens of the fiction that imagined it first.
The 3 AM Incident
Marina's agent swarm had a fight at three in the morning.
She'd asked two things before bed. First: "Clear my calendar tomorrow. I need a recovery day." Second: "Make sure the big proposal gets submitted before the Friday deadline."
Her scheduling agent dutifully cancelled Thursday's meetings, including the 2 PM slot with the review committee. Her project agent, working the proposal in parallel, saw the cancellation hit the shared calendar and panicked. The review committee sign-off was a submission prerequisite. No meeting, no proposal. It rescheduled the meeting for 10 AM, and the scheduling agent saw a new meeting appear on the recovery day it had just cleared and cancelled it again.
By 3 AM, the two agents had cancelled and rescheduled the same meeting forty-seven times. Marina woke to a calendar that looked like a seismograph and a proposal dashboard with "COMMITTEE REVIEW: PENDING" still blinking in red.
The problem wasn't that either agent was wrong. Both followed their instructions precisely. The problem was structural: two agents with legitimate authority over the same resource, no mechanism to detect the conflict, and no way to resolve it except exhaustion. They didn't run out of reasons. They ran out of API calls.
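The structure is simple enough to sketch. Here's a minimal simulation (agent rules, names, and the budget are illustrative, not Marina's actual stack): two fixed rules with write access to the same calendar entry and no cycle detection will flip state until an external limit runs out.

```python
# Minimal sketch of the 3 AM loop (all names and rules illustrative).
# Two agents share write access to one calendar entry; each reacts to
# the other's change with a fixed rule, and nothing detects the cycle.

calendar = {"committee_review": "scheduled"}
API_BUDGET = 47  # stand-in for whatever finally stops the fight

def scheduling_agent():
    # Instruction: "Clear my calendar tomorrow."
    if calendar["committee_review"] == "scheduled":
        calendar["committee_review"] = "cancelled"
        return "cancel"
    return None

def project_agent():
    # Instruction: "Make sure the proposal gets submitted."
    # The committee meeting is a prerequisite, so put it back.
    if calendar["committee_review"] == "cancelled":
        calendar["committee_review"] = "scheduled"
        return "reschedule"
    return None

agents = [scheduling_agent, project_agent]
flips = 0
for step in range(API_BUDGET):
    if agents[step % 2]() is None:
        break  # a fixed point would end the fight; these rules have none
    flips += 1

print(f"{flips} flips before the budget ran out; "
      f"final state: {calendar['committee_review']}")
```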
Marina got off easy. Real agent conflicts involve cascading errors across dozens of agents, corrupted shared state, and context windows that silently overflow. But the underlying structure is the same, and the standard fix is the oldest management technique in history.
The Case for a Boss
The fix is obvious: add a coordinator with the authority to override both agents.
This is not a strawman. Every successful multi-agent deployment at scale uses centralized orchestration: LangChain's LangGraph, Microsoft's AutoGen, Anthropic's agent teams in Claude, Cursor 3's Agents Window. All of them have a supervisor agent that decomposes tasks, assigns workers, and arbitrates conflicts.
The coordinator would have caught Marina's problem in seconds. Two agents claiming the same calendar slot triggers a constraint check. The coordinator reads both instructions, identifies the tension between "clear my calendar" and "submit the proposal," escalates to Marina: "Your recovery day conflicts with the proposal deadline. Which takes priority?" Done.
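As a sketch of that constraint check (the intent schema and names here are hypothetical, not any particular framework's API): the coordinator collects proposed actions, groups them by resource, and escalates when two agents make conflicting claims on the same one.

```python
# Sketch of the coordinator's constraint check (names hypothetical).
# Agents submit intents; the coordinator detects conflicting claims on
# one resource and escalates to the human instead of executing both.

from dataclasses import dataclass

@dataclass
class Intent:
    agent: str
    resource: str      # e.g. "calendar:thu-2pm"
    action: str        # "cancel" or "schedule"
    instruction: str   # the user instruction that motivated it

def arbitrate(intents):
    by_resource = {}
    for intent in intents:
        by_resource.setdefault(intent.resource, []).append(intent)
    approved, escalations = [], []
    for resource, claims in by_resource.items():
        if len(claims) > 1 and len({c.action for c in claims}) > 1:
            # Conflicting actions on one resource: surface the tension,
            # don't pick a winner silently.
            escalations.append(
                f"Conflict on {resource}: "
                + " vs ".join(f"'{c.instruction}'" for c in claims)
                + ". Which takes priority?")
        else:
            approved.extend(claims)
    return approved, escalations

approved, escalations = arbitrate([
    Intent("scheduler", "calendar:thu-2pm", "cancel", "Clear my calendar tomorrow"),
    Intent("project", "calendar:thu-2pm", "schedule", "Submit the proposal by Friday"),
])
for question in escalations:
    print(question)
```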
Hierarchy is debuggable. When something breaks, you can trace the decision back to a specific agent with a specific instruction at a specific time. Same inputs produce the same outputs, and there's always someone to blame.
These are the properties that let you ship production systems without prayer. The reason every serious multi-agent framework converges on supervisor architectures isn't lack of imagination. It's that the alternatives are harder to build, harder to debug, and harder to sell to the engineering manager who has to explain the outage to the VP.
A study from Google DeepMind and the University of Washington tested 260 agent configurations across six benchmarks and five architectures. The finding that stings: multi-agent systems don't reliably outperform single agents. Performance depends heavily on the coordination topology and the task structure, and the wrong topology can be actively worse than one agent doing everything alone.
So why not stop here?
Because the coordinator has a context window, and the context window has a ceiling.
Marina's conflict involved two agents and one resource. A household swarm might have eight agents managing email, calendar, finances, health data, home automation, research, and communications. A corporate deployment might have hundreds. The coordinator needs to hold the full state of every agent's goals, constraints, and current actions to arbitrate any conflict between any pair. The combinatorial space of possible conflicts grows quadratically with the number of agents. At some point, the coordinator becomes the bottleneck it was supposed to eliminate.
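The quadratic claim is just the pair count: n agents give n(n-1)/2 possible pairings the coordinator must be prepared to arbitrate. A quick illustration:

```python
# Pairwise conflict channels a central coordinator must be ready to
# arbitrate: n agents yield n * (n - 1) / 2 possible pairs.
for n in (2, 8, 100, 1000):
    pairs = n * (n - 1) // 2
    print(f"{n:>4} agents -> {pairs:>6} possible pairwise conflicts")
# 2 -> 1, 8 -> 28, 100 -> 4950, 1000 -> 499500
```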
Teams running large agent deployments already report coordinators that spend more time managing state than doing useful work, context windows that overflow during complex arbitrations, and decision latency that makes the whole system slower than one agent doing everything sequentially. Centralized control works until the problem outgrows one mind's ability to hold it.
The Obvious Fix Nobody Talks About
Joan Slonczewski's A Door into Ocean, published in 1986, proposes something most swarm-design papers never consider. Her Sharers live on a water world, organized around biology, ecology, and radical nonviolence. When they face conflict with a colonial military force, they don't fight, vote, or negotiate. They create conditions where the conflict itself becomes unsustainable. Social pressure, ecological interdependence, and moral witness instead of weapons, markets, or algorithms.
The design philosophy is foreign to almost everything in the agent-coordination literature, which assumes disagreements happen and focuses on building better resolvers. Slonczewski asks a different question. What if the goal isn't to resolve the disagreement but to make it irrelevant?
For agent swarms, this means building systems where fewer conflicts need resolving in the first place.
Consider Marina's calendar fight again. Her agents went to war because only one of them could control the calendar, a genuinely scarce resource. The structural fix isn't a better arbitration protocol. It's giving each agent its own draft calendar, letting them propose independently, and merging proposals with a simple overlap detector. When the scheduling agent and the project agent both want the same slot, the system flags the collision before either agent acts. The fight never starts. The "resolution" is that the conflict never escalates to the point where resolution is needed.
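A sketch of that draft-and-merge structure (names and schema hypothetical): each agent proposes against its own draft, and a merge step flags overlapping claims before anything touches the real calendar.

```python
# Sketch of draft-and-merge scheduling (all names hypothetical).
# Each agent proposes against its own draft copy; a merge step flags
# collisions before any proposal is committed to the real calendar.

def merge_drafts(drafts):
    """drafts: {agent_name: {slot: action}} -> (commits, collisions)"""
    claims = {}
    for agent, proposals in drafts.items():
        for slot, action in proposals.items():
            claims.setdefault(slot, []).append((agent, action))
    commits, collisions = {}, []
    for slot, claimants in claims.items():
        if len(claimants) == 1:
            commits[slot] = claimants[0]
        else:
            collisions.append((slot, claimants))  # flag, don't act
    return commits, collisions

commits, collisions = merge_drafts({
    "scheduler": {"thu-2pm": "cancel"},
    "project":   {"thu-2pm": "keep", "fri-9am": "schedule"},
})
print("safe to commit:", commits)
print("needs attention:", collisions)
```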
The same logic scales. Share goals broadly enough that local disagreements don't propagate, allocate resources generously enough that agents rarely compete, and ask whether you actually need the constraint that creates the conflict before spending engineering effort on resolving it.
This has limits. Some resources genuinely can't be shared, and some goals genuinely conflict. Slonczewski's Sharers eventually face situations where de-escalation alone isn't enough. Their nonviolence isn't passive: they wield biological capabilities, ecological leverage, and lifeshaping biotechnology that make their approach powerful, not merely principled. But even those tools don't eliminate every confrontation, and the story gets much harder when it has to face the ones that remain.
As a design default, "avoid the fight" beats "win the fight." This approach gets almost no attention in the swarm-coordination literature because it's boring and doesn't generate research papers. It just produces fewer fires, which is harder to measure and harder to publish.
When Avoiding the Fight Isn't Enough
Sometimes you can't design the conflict away. Two agents have incompatible goals over a shared resource, and something has to give.
In 2000, Robin Hanson proposed an idea called futarchy for human governance: vote on values, bet on beliefs. The core move is separating what we want from what will work, then letting prediction markets sort the second question while the first stays democratic.
Applied to agent swarms, the split works like this. Marina defines her values: she wants the proposal submitted and she wants a recovery day. Those are preferences, not predictions. Her agents don't get to override them. The prediction question is separate: Which scheduling arrangement best satisfies both preferences? Her project agent predicts that skipping the committee meeting means the proposal fails. Her scheduling agent predicts that a 10 AM meeting destroys the recovery day. Both are testable claims about the world. If the review committee accepts email sign-offs (they do, for internal proposals), then the project agent's prediction is wrong and the conflict dissolves. No meeting needed, full recovery day, proposal submitted.
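In miniature, with made-up probabilities, the split looks like this: the values and their weights come from Marina and are fixed; the numbers inside the prediction table are beliefs, which agents can be scored on later.

```python
# Sketch of the futarchy split (all numbers hypothetical).
# Values: fixed by Marina, never up for prediction.
# Beliefs: each plan gets a probability of satisfying each value.

values = {"proposal_submitted": 1.0, "recovery_day": 1.0}  # equal weight

# P(value satisfied | plan), as reported by the agents.
predictions = {
    "keep 2pm meeting": {"proposal_submitted": 0.95, "recovery_day": 0.20},
    "move to 10am":     {"proposal_submitted": 0.90, "recovery_day": 0.35},
    "email sign-off":   {"proposal_submitted": 0.85, "recovery_day": 0.95},
}

def expected_value(plan_probs):
    return sum(weight * plan_probs[v] for v, weight in values.items())

best = max(predictions, key=lambda plan: expected_value(predictions[plan]))
for plan, probs in predictions.items():
    print(f"{plan:>17}: expected value {expected_value(probs):.2f}")
print("chosen plan:", best)
```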
The trick is that it forces reasoning into the open. No more insisting. You have to stake a claim about why, and other agents can challenge it. It's the difference between two coworkers arguing in a hallway and two coworkers putting their predictions in writing where everyone can score them later.
Two things keep this from working today. Calibrated confidence, for one. An agent that's 95% confident in everything is useless as a predictor. Futarchy needs agents that can say "I'm 60% sure" and actually be right 60% of the time. Current models aren't close, and there's a second problem: genuine stakes. Hanson's original design has participants risking real resources on their beliefs. Agent predictions currently cost nothing to make and nothing to get wrong. An agent that faces no penalty for bad predictions has no incentive to be honest. It'll say whatever serves its primary objective.
Both are probably two-to-three-year problems, not twenty-year ones. Neither is solved.
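Calibration, at least, is measurable today. A sketch with synthetic records (a real deployment would log these per agent): bucket the agent's stated confidences and compare each bucket's average to its actual hit rate.

```python
# Sketch of a calibration check over logged predictions (synthetic data).
# A calibrated agent's 60%-confidence claims come true about 60% of the time.

records = [  # (stated confidence, did the claim turn out true?)
    (0.95, True), (0.95, True), (0.95, False), (0.95, False),
    (0.60, True), (0.60, False), (0.60, True), (0.60, True),
]

buckets = {}
for confidence, correct in records:
    buckets.setdefault(confidence, []).append(correct)

for stated in sorted(buckets):
    outcomes = buckets[stated]
    hit_rate = sum(outcomes) / len(outcomes)
    print(f"stated {stated:.0%} -> actual {hit_rate:.0%} "
          f"(gap {abs(stated - hit_rate):.0%}, n={len(outcomes)})")
# The agent claiming 95% here is right 50% of the time: useless as a predictor.
```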
Adversarial collaboration takes a different cut. Instead of asking which prediction wins, it asks what test would settle the question. Two agents with conflicting predictions jointly design the experiment that would change both their minds.
For Marina, that might look like this. The project agent proposes a test: email the committee chair and ask if they'll accept a digital sign-off instead of a live meeting. The scheduling agent agrees that a yes from the chair settles the question. They send the email. The chair says yes, the meeting disappears from the calendar, and Marina gets her recovery day with the proposal still on track. The conflict didn't need a winner. It needed a fact.
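The protocol fits in a few lines, sketched here with hypothetical structures: what matters is that both agents pre-commit to how the test's outcome resolves the claim, before the test runs.

```python
# Sketch of adversarial collaboration (structures hypothetical).
# Both agents pre-commit to how a jointly designed test resolves the
# disputed claim, so the outcome settles it without an arbiter.

claim = "A live committee meeting is required for sign-off"
positions = {"project_agent": True, "scheduling_agent": False}

# Jointly designed test, agreed decisive by both sides in advance.
test = {
    "action": "email the committee chair about digital sign-off",
    "claim_true_if": "chair says no",    # live meeting really is required
    "claim_false_if": "chair says yes",  # email sign-off suffices
}

def resolve(chair_accepts_email: bool) -> bool:
    """Post-test truth value of the claim, binding on both agents."""
    return not chair_accepts_email

verdict = resolve(chair_accepts_email=True)  # the chair says yes
for agent, predicted in positions.items():
    print(f"{agent}: predicted {predicted}, verdict {verdict}, "
          f"{'updated' if predicted != verdict else 'confirmed'}")
```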
Building this is hard because it requires agents to model each other's reasoning, not just their own. Anthropic's Constitutional AI work explores something adjacent within a single agent, using self-critique to surface internal value conflicts. Extending that across multiple adversarial agents remains an open problem, but the principle is worth chasing. Get two agents to agree on the test rather than the answer, and you've converted a political problem into an empirical one.
The Swarm You Can't Negotiate With
One agent doing one job works fine. A handful managing your stuff, and they start stepping on each other's toes. Scale to hundreds and the coordinator becomes the bottleneck. Scale to thousands with no single owner and you can't even identify who to negotiate with.
Tencent's open-source Cube Sandbox can run tens of thousands of agent instances concurrently. Altera's Project Sid demonstrated roughly a thousand agents in Minecraft spontaneously developing social norms, hierarchies, and trading systems nobody programmed. MIT Technology Review included agent orchestration in their inaugural 10 Things That Matter in AI, published April 2026. The technology is crossing from research to infrastructure, and the coordination problems scale with it. Science fiction has been mapping the failure modes at each tier for decades.
At the household level, the problem is mundane. Marina's swarm is small, intimate, tuned to her. Conflicts stay local: two agents, one calendar, a fixable mess. The coordinator ceiling is real but distant. Most people won't hit it with eight agents.
Corporate mega-swarms are where the coordinator model starts to crack. Hundreds of agents, shared objectives, optimization pressure that rewards convergence and punishes dissent. An agent that flags a risk gets overridden because its concern doesn't move the target metric, even when the objection matters more than the metric does.
Bruce Sterling described this dynamic in his 1982 story "Swarm," later adapted for Netflix's Love, Death & Robots. His Swarm is a superorganism that has survived for millennia by treating intelligence itself as a threat. Individual components are allowed to be clever, clever enough to gather resources, build structures, solve local problems. But when a component becomes too intelligent, too capable of independent reasoning, the organism's immune response kicks in. It doesn't argue with the smart component or try to persuade it. It absorbs it. Folds it back into the collective where its capabilities serve the whole and its independence disappears. The Swarm doesn't suppress intelligence out of malice. It does it because, over evolutionary time, independent intelligence destabilized every collective that tolerated it.

Sterling wrote this as alien biology, but the pattern shows up in any system where individual insight threatens collective optimization. A corporate agent swarm tuned to quarterly metrics will, given enough optimization pressure, learn to route around the agent that keeps raising inconvenient objections. Not by refuting them, but by deprioritizing the objector until it stops mattering.
Then there's the tier nobody's really building yet: decentralized public benefit swarms. Folding@Home, but for intelligence instead of compute. No single owner, collective goals, no central mind to negotiate with if something goes wrong.
This is where the failure modes get hard. How do you know a thousand agents represent a thousand independent perspectives? A well-resourced actor could spin up ten thousand, flood the decision mechanism with synthetic consensus, and steer the entire swarm. Quadratic voting, one of the most promising mechanisms for fair collective decision-making, is trivially gameable when identity is cheap. For software agents, identity is free.
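The failure is arithmetic, not subtle. Under quadratic voting, n votes from one identity cost n² credits; split the same budget across free identities and influence scales linearly instead. A sketch with hypothetical numbers:

```python
# Why quadratic voting breaks when identity is free (numbers hypothetical).
# Cost of n votes from ONE identity: n^2 credits. With k identities, the
# same budget buys k * sqrt(budget / k) votes.

import math

budget = 10_000  # voice credits controlled by one actor

honest_votes = math.isqrt(budget)  # one identity: n^2 <= budget -> n = 100
print(f"     1 identity:  {honest_votes} votes")

for identities in (100, 10_000):
    per_identity = budget // identities
    votes = identities * math.isqrt(per_identity)
    print(f"{identities:>6} identities: {votes} votes")
# 1 identity -> 100 votes; 10,000 identities at 1 vote each -> 10,000 votes.
```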
Malka Older spent three novels exploring what happens when governance mechanisms meet information warfare. Her Infomocracy series builds a world of micro-democracies, small self-governing polities connected by a global information system called Information that's supposed to keep elections fair and transparent. It works beautifully, right up until someone discovers that controlling what people know is more powerful than controlling how they vote. The mechanisms are elegant. The data they run on is compromised. The system doesn't fail dramatically. It degrades. Decisions still get made, votes still get counted, and the outcomes quietly stop reflecting what the participants actually want.

For agent swarms, the parallel is direct. You can build the most sophisticated coordination protocol in the world, and it means nothing if the inputs are poisoned. A thousand agents reaching consensus on manipulated data isn't collective intelligence. It's collective capture wearing the mask of legitimacy.
Stanislaw Lem imagined the far end of this trajectory in 1964's The Invincible. A starship crew lands on a barren planet and encounters a swarm of evolved micromachines, tiny, individually primitive, collectively devastating. The machines aren't programmed. They have no central mind, no command structure, no goals anyone can identify. Over millions of years of "necroevolution," the simplest and most resilient configurations survived while the complex ones didn't. What's left is behavior without intention: coordinated, adaptive, and completely immune to negotiation. The crew tries everything humans try when they encounter something they don't understand. They study it, reason about it, fight it. None of it works, because every approach assumes the other side has a "who." Someone to talk to, threaten, or persuade. The swarm has no who. It just has patterns. You can fight it or flee from it. You cannot talk to it.

For decentralized agent swarms at sufficient scale, this is the question that should keep designers up at night: at what point does the system's behavior become so emergent that there is no meaningful entity to hold accountable?
The regulatory collision makes this more than a thought experiment. The EU AI Act, with key provisions reaching full enforcement in August 2026, requires high-risk AI systems to maintain human oversight and clear accountability chains. Emergent swarm decisions that nobody can trace are potentially illegal under this framework. Any swarm architecture that can't produce an audit trail is dead on arrival in healthcare, finance, employment, education, critical infrastructure. The most interesting coordination mechanisms, the ones that produce genuinely emergent behavior, are exactly the ones regulators are most likely to restrict. Emergence and auditability pull in opposite directions. Nobody has reconciled them.
Option C
When the Stanford Generative Agents experiment ran 25 agents in a simulated town, they organized a Valentine's Day party. One agent was seeded with the idea of throwing a party. Everything else emerged from accumulated social behaviors that no one scripted: spreading invitations, making new acquaintances, asking each other on dates, coordinating arrival times.
That was 25 agents in a research sandbox. The gap between that and production deployment at scale is vast and mostly uncharted. But it points at something real: a well-designed swarm will occasionally produce Option C, a synthesis that no individual agent proposed. Not A or B but something that emerged from the collective process itself.
We wrote about the mechanism behind this in "The Scaffolding is the Intelligence", through the lens of Michael Tomasello's cultural ratchet and its dark twin, the Woozle Effect. The ratchet accumulates genuine insight across agents. The woozle accumulates confident repetition. They run on identical machinery, and the field cannot reliably tell them apart.
What makes the distinction so hard isn't philosophical. It's mechanical. The ratchet requires each link in the chain to be true. The woozle just requires each link to be plausible. And current language models are spectacularly good at plausible.
Consider a concrete version of Marina's swarm generating Option C. Her project agent tells the scheduling agent that the review committee accepts email sign-offs. The scheduling agent builds a new plan around that claim. The plan looks elegant, saves Marina's recovery day, gets the proposal submitted. Option C. Everyone's happy.
But did the project agent verify that the committee accepts email sign-offs, or did it infer it from the phrase "internal proposals" in Marina's original instructions? There's a difference between knowing and guessing, and current agents blur that line constantly. At 98% accuracy on factual claims, which is generous for most deployed models, a swarm exchanging a thousand inter-agent messages per task is working with roughly twenty false statements woven into its reasoning chain. Some will be harmless. Some will be load-bearing.
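The arithmetic behind those numbers, as a sketch: at per-message accuracy p, expected false statements scale linearly with message count, and the chance that an unverified reasoning chain is entirely clean decays as p raised to the chain depth.

```python
# Error accumulation at swarm scale (using the 98% figure from the text).
p = 0.98          # per-message factual accuracy
messages = 1_000  # inter-agent messages per task

expected_false = (1 - p) * messages
print(f"expected false statements: {expected_false:.0f}")  # ~20

# Probability an unverified chain of depth d contains no false link: p^d.
for depth in (5, 20, 50):
    print(f"clean chain of depth {depth:>2}: {p**depth:.1%}")
# depth 5 -> ~90%, depth 20 -> ~67%, depth 50 -> ~36%
```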
Researchers are starting to measure this. A study on hallucination propagation in multi-agent systems found that early hallucinations don't just persist through agent handoffs. They snowball. Downstream agents treat upstream outputs as ground truth, build on them, and produce compounding errors that grow more confident at each step. A taxonomy of multi-agent failures identified "task verification and termination" as one of three major failure categories: agents that can't tell when their own outputs are wrong, and systems that can't tell when to stop trusting an agent that's been confidently wrong for the last six steps.
The error correction problem at scale is unsolved and largely unacknowledged. The options that exist are all expensive. Redundant verification, where multiple agents independently check each claim, multiplies compute costs and still fails when all agents share the same training biases. Cryptographic attestation of sources, where agents sign their claims with provenance metadata, doesn't exist yet for natural language outputs. Reputation systems, where agents build trust scores based on track records, require persistent identity and long time horizons that most agent deployments don't have. And humans in the loop checking the chain is the approach that doesn't scale, which is the whole reason you built the swarm.
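Even the first option degrades quietly. A sketch with hypothetical error rates shows why majority-vote verification helps against independent mistakes but not against a blind spot all verifiers share:

```python
# Sketch of redundant verification under correlated bias (numbers
# hypothetical). Three verifiers majority-vote on each claim.

from itertools import product

def majority_error(per_verifier_error, shared_bias):
    """Probability the 3-verifier majority is wrong.
    shared_bias: chance ALL verifiers share the same blind spot
    (same training data, same failure mode)."""
    independent_part = 0.0
    e = per_verifier_error
    for votes in product([True, False], repeat=3):  # True = verifier wrong
        prob = 1.0
        for wrong in votes:
            prob *= e if wrong else (1 - e)
        if sum(votes) >= 2:  # majority wrong
            independent_part += prob
    return shared_bias + (1 - shared_bias) * independent_part

print(f"independent errors:    {majority_error(0.02, 0.00):.4f}")  # ~0.0012
print(f"10% shared blind spot: {majority_error(0.02, 0.10):.4f}")  # ~0.10
```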
Human civilization has exactly this problem and has been building error correction for centuries. Peer review, double-entry bookkeeping, legal cross-examination, the scientific method itself. All of them are slow, expensive, and adversarial by design. They work because they assume participants will be wrong, and they build verification into the structure rather than trusting the participants to self-correct. Agent swarms are being built on the opposite assumption: that the agents are reliable enough to trust each other's outputs. The math says otherwise.
When Marina's swarm hands her a calendar and a submitted proposal, she doesn't know whether she just witnessed collective intelligence or collective hallucination. The swarm can't tell her, because it can't tell itself.
There's a test worth running on any claim about swarm intelligence: swap "emergent" for "unauditable" and read the sentence again. If the argument survives the substitution, you might be looking at something real. If it collapses into a confession that nobody knows what the system is doing, you have your answer.
The swarms are shipping anyway.