Multi-agent AI systems face a coordination problem. As frameworks scale from a handful of agents to dozens, orchestration overhead dominates actual work. Current approaches (AutoGen’s conversation managers, MetaGPT’s role hierarchies, CrewAI’s explicit backstories) all import human organizational patterns that create bottlenecks.
But nature solved this problem millions of years ago. Ant colonies coordinate thousands of agents without managers. Immune systems respond to threats without message buses. Exploring these systems led me to stigmergic coordination: coordination through shared environment modification rather than explicit orchestration.
Three principles emerged that transfer beyond multi-agent AI:
- Constraint over Orchestration: Design constraints that make coordination unnecessary rather than protocols that manage it
- Locality as a Feature: Information hiding isn’t just software hygiene; it’s a coordination mechanism
- Stability Through Continuous Pressure: Maintain equilibrium against decay rather than solving once and stopping
What follows is how I arrived at these principles.
The Problem
Human organizations coordinate through hierarchies because humans face specific constraints: limited communication bandwidth, cognitive load, the need to verify trust. We invented managers to handle these constraints. We built org charts to route information efficiently. We created planning roles to decompose complex goals into executable tasks.
These patterns work for humans. But importing them into agent systems introduces artificial bottlenecks. Central planners become serialization points. Global state synchronization creates contention. Hierarchical message passing adds latency proportional to tree depth. When a manager agent fails, all dependent workers stall.
I noticed this pattern across frameworks: coordination costs scale poorly. Add more agents, get diminishing returns. More agents should mean more capacity, not more overhead.
The constraint wasn’t the agents. It was the coordination mechanism.
What Nature Taught Me
Natural systems that coordinate at scale use a radically different approach: they coordinate through environment modification, not message passing.
Ant colonies are the classic example. When a foraging ant finds food, it doesn’t send a message to other ants. Instead, it deposits a chemical trail (pheromone) on its path back to the nest. Other ants sense the pheromone concentration and follow stronger trails. More ants following a path deposit more pheromone, reinforcing the trail. Shorter paths accumulate pheromone faster because ants complete the round trip more quickly. The colony converges on optimal routes without any ant knowing the global map or coordinating with other ants.
No manager ant assigns tasks. Each ant makes purely local decisions based on chemical gradients. Coordination emerges from the environment itself.
This principle was first identified by Pierre-Paul Grassé in 1959, studying termite nest-building. He called it stigmergy: indirect coordination through environment modification. The word comes from the Greek "stigma" (mark) and "ergon" (work): coordination through marks left by work.
Marco Dorigo and colleagues later formalized these insights into Ant Colony Optimization, demonstrating that simple local rules plus environmental feedback could solve combinatorial optimization problems like the traveling salesman. The key mechanisms: positive feedback (reinforcing good paths), negative feedback (pheromone evaporation), and purely local decision-making.
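The feedback structure is simple enough to sketch in a few lines. Here's a toy Rust version (nothing like Dorigo's full algorithm) showing all three mechanisms: evaporation, reinforcement weighted by path length, and a path choice that reads only local pheromone levels. All names are illustrative.

```rust
/// One simulation step over a set of candidate paths.
fn step(pheromone: &mut [f64], path_lengths: &[f64], evaporation: f64) {
    // Negative feedback: every trail loses a fraction of its pheromone.
    for p in pheromone.iter_mut() {
        *p *= 1.0 - evaporation;
    }
    // Positive feedback: shorter paths complete round trips faster,
    // so they receive proportionally more deposit per step.
    for (p, len) in pheromone.iter_mut().zip(path_lengths) {
        *p += 1.0 / *len;
    }
}

/// Purely local decision: pick a path with probability proportional
/// to its pheromone level. `r` is a uniform random draw in [0, 1).
fn choose(pheromone: &[f64], r: f64) -> usize {
    let total: f64 = pheromone.iter().sum();
    let mut acc = 0.0;
    for (i, p) in pheromone.iter().enumerate() {
        acc += p / total;
        if r < acc {
            return i;
        }
    }
    pheromone.len() - 1
}
```

Run `step` in a loop and the shortest path's pheromone share grows until `choose` almost always picks it; raise `evaporation` and the system forgets faster. That trade-off between reinforcement and forgetting shows up again below.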
Stigmergy has three properties:
Local decisions suffice. Agents observe only their immediate environment. No global knowledge required.
The environment stores coordination state. Pheromone trails, chemical gradients, price signals: the medium itself carries information about what to do next.
Stability requires continuous maintenance. Pheromone evaporates. Chemical signals diffuse. Prices drift. Coordination isn’t achieved once and locked in. It’s maintained through ongoing activity.
This last point was crucial. I’d been thinking about multi-agent coordination as an optimization problem: find the best solution and stop. But natural systems treat coordination as a dynamical process: maintain stability against continuous decay.
What I Built
I designed a coordination kernel based on stigmergic principles. The core idea: treat the work artifact (whatever the agents are improving) as a shared environment. Instead of agents messaging each other about what to do, they read quality signals from the artifact itself and act to reduce "pressure" in high-badness regions.
The artifact is divided into regions. In a codebase, regions might be functions. In a shell script, individual commands or blocks. Each region has measurable properties (I call these signals). For shell scripts, signals include error counts, warnings, and style violations (via shellcheck). For code, signals might include cyclomatic complexity, test coverage, or formatting consistency.
Each region also has a pressure value: a scalar measure of "badness" computed from its signals. High pressure means the region needs attention. Low pressure means it’s in good shape.
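To make this concrete, here's a minimal Rust sketch of the region model. The names (Signal, Region, the weighted-sum pressure) are illustrative choices for this post, not a published API, and discounting pressure by fitness is one plausible wiring, not the only one.

```rust
/// A measurable property of a region, e.g. a shellcheck error count.
struct Signal {
    name: &'static str,
    value: f64,  // current measurement
    weight: f64, // how strongly this signal contributes to pressure
}

/// A unit of the artifact: a function, a command block, etc.
struct Region {
    id: u64,
    content: String,
    signals: Vec<Signal>,
    fitness: f64,    // boosted when a patch lands, decays over time
    confidence: f64, // trust in the current signals, decays over time
    inhibited: bool, // cooldown flag set right after a patch is applied
}

impl Region {
    /// Pressure: a weighted sum of badness signals, discounted by how
    /// "fit" the region currently is (an assumption for illustration).
    fn pressure(&self) -> f64 {
        let raw: f64 = self.signals.iter().map(|s| s.value * s.weight).sum();
        raw * (1.0 - self.fitness.clamp(0.0, 1.0))
    }
}
```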
Agents (I call them actors) observe these pressure gradients and propose changes to reduce pressure. An actor seeing a shell script with SC2086 warnings (unquoted variables) might propose adding proper quoting. Critically, actors observe only local state: the region’s content, its signals, its pressure. They don’t see other regions, don’t know what other actors are doing, and don’t communicate with each other.
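That locality constraint can be structural rather than conventional: give actors a view type that simply cannot reference anything outside the region. A sketch along the same lines (type names again illustrative):

```rust
/// Everything an actor is allowed to observe: strictly local state.
struct LocalView<'a> {
    content: &'a str,
    signals: &'a [Signal],
    pressure: f64,
}

/// A proposed change, later ranked by how much pressure it removes.
struct Patch {
    region_id: u64,
    new_content: String,
}

trait Actor {
    /// The signature admits no global state, so locality holds by
    /// construction no matter how many actors join.
    fn propose(&self, region_id: u64, view: &LocalView<'_>) -> Option<Patch>;
}
```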
The key mechanism is temporal decay. Each region’s fitness and confidence values decay with a configurable half-life (I’m using 10 minutes for fitness, 5 minutes for confidence). Even regions that were recently improved gradually drift back toward baseline. This decay serves two purposes. First, it prevents premature convergence: the system never locks in the assumption that a region is perfect forever. Second, it drives continuous re-evaluation: a region that was low-pressure yesterday might need attention today if its context changed.
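Half-life decay is just exponential decay reparameterized, so the per-tick update is a one-liner (function name illustrative):

```rust
/// Exponential half-life decay: after one half-life the value is
/// halved, after two it is quartered, and so on.
fn decay(value: f64, elapsed_secs: f64, half_life_secs: f64) -> f64 {
    value * 0.5f64.powf(elapsed_secs / half_life_secs)
}

// With the half-lives mentioned above:
//   fitness    = decay(fitness, dt, 10.0 * 60.0); // 10-minute half-life
//   confidence = decay(confidence, dt, 5.0 * 60.0); // 5-minute half-life
```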
The coordination loop runs in phases:
Phase 1: Decay. Apply half-life decay to all regions’ fitness and confidence. Stability must be maintained, not achieved once.
Phase 2: Measure and Propose. For each region where pressure exceeds an activation threshold, ask actors to propose patches. Actors see only local signals and propose changes based purely on local pressure gradients.
Phase 3: Validate and Select. Proposed patches are materialized as separate validation files (copies of the artifact with the patch applied) and tested there. Patches that pass validation are ranked by pressure reduction. The system greedily selects the top non-conflicting patches.
Phase 4: Apply and Reinforce. Apply the selected patches. Boost fitness and confidence of affected regions. Mark these regions as "inhibited" for a cooldown period (I’m using 1 minute) so they aren’t immediately re-modified.
Then the cycle repeats. Decay erodes fitness, pressure gradients emerge, actors propose changes, patches are applied, reinforcement provides temporary stability.
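Tying the phases together, one tick looks roughly like this. It reuses the sketches above; `validate_and_rank` and `apply` are hypothetical stand-ins for the validation-file machinery and patch application, and the activation threshold is a placeholder, not a tuned value.

```rust
const ACTIVATION: f64 = 0.5; // placeholder, not a tuned value

fn tick(regions: &mut Vec<Region>, actors: &[Box<dyn Actor>], dt: f64) {
    // Phase 1: decay. Fitness and confidence erode toward baseline.
    for r in regions.iter_mut() {
        r.fitness = decay(r.fitness, dt, 10.0 * 60.0);
        r.confidence = decay(r.confidence, dt, 5.0 * 60.0);
    }

    // Phase 2: measure and propose. Only regions over the activation
    // threshold (and not in cooldown) solicit patches.
    let mut proposals = Vec::new();
    for r in regions.iter().filter(|r| !r.inhibited && r.pressure() > ACTIVATION) {
        let view = LocalView {
            content: &r.content,
            signals: &r.signals,
            pressure: r.pressure(),
        };
        for actor in actors {
            if let Some(p) = actor.propose(r.id, &view) {
                proposals.push(p);
            }
        }
    }

    // Phase 3: validate in isolation, rank by pressure reduction, and
    // greedily keep the best non-conflicting patches.
    let winners = validate_and_rank(proposals); // hypothetical helper

    // Phase 4: apply winners, boost fitness/confidence, start cooldown.
    for patch in winners {
        apply(regions, patch); // hypothetical helper; also sets `inhibited`
    }
}
```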
It worked. Coordination emerged without orchestration. Adding agents increased throughput without increasing coordination overhead. Agents could fail, restart, or join dynamically without protocol changes.
The crucial trade-off: this only works when actions are reasonably independent. If every change to region A invalidates regions B through Z, the system thrashes. When locality holds, stigmergic coordination wins. When coupling is high, you need explicit planning.
What I Found
The theoretical prediction: pressure-field coordination should scale linearly with agent count while hierarchical coordination plateaus as the manager becomes a bottleneck.
I’m validating this on a shell script quality task: improving scripts using LLM actors (qwen2.5-coder via Ollama). Signals come from shellcheck: error counts, warnings, info messages, and style issues. Baselines are hierarchical coordination (a manager agent delegates to workers), sequential processing (one agent at a time), and random selection (pick regions randomly instead of by pressure gradient).
The key metric isn’t final quality. It’s quality per unit of coordination overhead. Hierarchical systems might achieve the same final quality, but at what cost? How many messages did agents exchange? How much time was spent waiting for manager responses?
Pressure-field coordination has constant coordination overhead: zero inter-agent messages. Agents share state only through the artifact itself. Adding agents increases the number of patches proposed per tick, but doesn’t increase communication cost. In early runs, I’m seeing exactly this pattern: adding agents improves throughput linearly until we hit the validation bottleneck (writing patched artifacts to disk for testing has I/O costs).
The approach fails predictably. When coupling is high (fixing one issue requires changing multiple regions simultaneously to maintain consistency), greedy local action doesn’t work. Agents thrash, making conflicting changes that undo each other. The inhibition period helps by preventing immediate re-modification, but doesn’t solve the fundamental problem: some tasks require global planning.
Sparse signals are another failure mode. If you can’t measure region quality locally, pressure gradients don’t reflect true badness. Agents optimize for the metric instead of the underlying goal (Goodhart’s Law in action). This taught me that sensor design is critical. You need signals that actually correlate with quality, and you need them to be locally computable.
The failures taught me as much as the successes. Stigmergic coordination isn’t universal. It’s a specific tool for a specific class of problems. But for problems with locality and measurable signals, it’s dramatically simpler than hierarchical alternatives.
Transferable Design Principles
Three principles emerged that extend beyond multi-agent AI.
Constraint over Orchestration. Instead of designing explicit coordination (managers, message protocols, planning algorithms), design constraints that make coordination unnecessary. In the pressure-field kernel, the constraint is locality: actors see only local state. This constraint forces emergent coordination because agents can’t explicitly plan together. Constraints shape behavior more powerfully than instructions.
Locality as a Feature. We usually treat information hiding as a software engineering nicety. But locality is a coordination mechanism. When agents can’t see global state, they can’t create global dependencies. When each agent operates on independent regions, parallelism is automatic. Design systems where local views suffice for local decisions.
Stability Through Continuous Pressure. Natural systems don’t "solve" problems and stop. They maintain equilibrium against decay. Pheromone evaporates, forcing ants to continuously reinforce good paths. Fitness decays, forcing agents to continuously re-evaluate regions. This prevents premature convergence and adapts to changing conditions. Design for continuous maintenance, not one-time achievement.
These aren’t just AI principles. They’re systems thinking principles. Software systems with clear module boundaries coordinate better than monoliths with shared global state. Teams with well-defined interfaces scale better than teams that constantly negotiate responsibilities. Markets with price signals allocate resources better than central planning committees.
The pattern repeats: constrain local behavior, let global coordination emerge.
What’s Next
Multi-agent coordination doesn’t require managers, planners, or message buses. It requires well-designed pressure landscapes that align local incentives with global goals.
Stigmergy (coordination through shared environment modification) offers an alternative to hierarchical orchestration. Natural systems have used this approach for millions of years, coordinating at scales and speeds we’re only beginning to replicate in software.
I’m currently cleaning up the Rust implementation and finishing the experimental validation. Once the results are compiled, I plan to publish the code and a more detailed write-up. The kernel is built on acton-reactive, an actor framework I maintain, which provides the supervision and message-passing primitives that make this architecture natural to express.
For product design, these principles suggest opportunities: systems that maintain pressure maps over complex artifacts (codebases, documentation, infrastructure) where agents, human or AI, observe gradients and prioritize work naturally. Coordination emerges from shared visibility into quality signals, not from task assignment systems.
The question isn’t whether we can build better planners. It’s whether coordination requires planning at all.