What is Swarm Mode?

Swarm mode deploys multiple persistent, autonomous agents that work in parallel to solve complex tasks. An LLM-powered Lead Agent dynamically coordinates the team — planning tasks, spawning agents, reassigning work, and making decisions based on real-time events. Unlike static orchestration where tasks are pre-decomposed and never change, the Lead Agent continuously monitors progress and adapts: creating new tasks, canceling redundant work, and reassigning idle agents as the situation evolves.

How It Works

Shannon’s swarm workflow is built around a Lead Agent event loop:

Lead Agent Event Loop

The Lead Agent wakes on specific events and decides what to do next:
Events that wake the Lead:
├── agent_idle       — An agent finished its current task or became available
├── agent_completed  — An agent produced final output
├── help_request     — An agent requested the Lead to spawn a helper
├── checkpoint       — Periodic timer (every 120s) to review overall progress
└── human_input      — User sent new instructions mid-execution
On each wake, the Lead receives the full team status (agents, tasks, budget) and chooses one or more actions.
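The wake-and-decide cycle above can be sketched in a few lines. This is a minimal illustration, not Shannon's implementation: the event names come from the list above, while `TeamStatus`, `decide`, and the rule-based logic are hypothetical stand-ins for the LLM-driven decision.

```python
from dataclasses import dataclass, field

# Event names are taken from the list above; everything else is hypothetical.
WAKE_EVENTS = {"agent_idle", "agent_completed", "help_request",
               "checkpoint", "human_input"}


@dataclass
class TeamStatus:
    """Simplified team snapshot handed to the Lead on each wake."""
    idle_agents: list[str] = field(default_factory=list)
    pending_tasks: list[str] = field(default_factory=list)


def decide(event: str, status: TeamStatus) -> list[dict]:
    """Toy rule-based stand-in for the Lead's LLM decision.

    Returns one or more actions, falling back to noop when nothing applies.
    """
    if event not in WAKE_EVENTS:
        raise ValueError(f"unknown wake event: {event}")
    actions: list[dict] = []
    if event == "help_request":
        actions.append({"action": "spawn_agent"})
    # Pair idle agents with pending tasks, whichever event woke the Lead.
    while status.idle_agents and status.pending_tasks:
        actions.append({
            "action": "assign_task",
            "agent": status.idle_agents.pop(0),
            "task": status.pending_tasks.pop(0),
        })
    return actions or [{"action": "noop"}]
```

In the real system the decision is made by an LLM with access to the full team status; the sketch only shows the shape of the loop: an event arrives, the Lead inspects state, and a list of actions comes back.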

Lifecycle Overview

  1. Initial Planning — The Lead receives the user query and creates an initial set of tasks with optional dependency chains
  2. Agent Spawning — The Lead spawns agents and assigns tasks, respecting dependency order
  3. Event-Driven Coordination — As agents complete work, report idle, or hit checkpoints, the Lead dynamically reassigns tasks, revises the plan, or spawns new agents
  4. Synthesis — When all tasks are complete, the Lead can spawn a dedicated synthesis agent or declare done to produce the final response

Task Dependencies (DAG)

Tasks can declare dependencies on other tasks, forming a directed acyclic graph (DAG):
Tasks:
├── task-1: "Research US AI chip market"        (depends_on: [])
├── task-2: "Research Japan AI chip market"     (depends_on: [])
├── task-3: "Research South Korea AI chip market" (depends_on: [])
└── task-4: "Write comparative analysis"        (depends_on: [task-1, task-2, task-3])
The system enforces dependency order — task-4 cannot be assigned until all three research tasks are complete. The Lead can dynamically create new dependency chains via revise_plan.
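The dependency rule above can be illustrated with a small sketch. It assumes tasks are represented as a mapping from task ID to its `depends_on` list; the function names `validate_dag` and `ready_tasks` are hypothetical and not part of Shannon's API.

```python
from graphlib import CycleError, TopologicalSorter


def validate_dag(tasks: dict[str, list[str]]) -> None:
    """Raise if a dependency is unknown or the graph contains a cycle."""
    for task_id, deps in tasks.items():
        for dep in deps:
            if dep not in tasks:
                raise KeyError(f"{task_id} depends on unknown task {dep}")
    # prepare() raises graphlib.CycleError when the graph is not acyclic.
    TopologicalSorter(tasks).prepare()


def ready_tasks(tasks: dict[str, list[str]], completed: set[str]) -> list[str]:
    """Return tasks that are not yet complete and whose dependencies all are."""
    return [
        task_id
        for task_id, deps in tasks.items()
        if task_id not in completed and all(d in completed for d in deps)
    ]


tasks = {
    "task-1": [],
    "task-2": [],
    "task-3": [],
    "task-4": ["task-1", "task-2", "task-3"],
}
validate_dag(tasks)
print(ready_tasks(tasks, completed=set()))
print(ready_tasks(tasks, completed={"task-1", "task-2", "task-3"}))
```

With nothing completed, only the three research tasks are assignable; task-4 becomes assignable once all three are done.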

Lead Agent Actions

Each time the Lead wakes, it selects one or more actions:
Action          Description
spawn_agent     Create a new agent for a specific task
assign_task     Assign a pending task to an idle agent
revise_plan     Dynamically create new tasks or cancel existing ones
send_message    Send a message to a specific agent
broadcast       Send a message to all agents
file_read       Read a workspace file (zero LLM cost, max 3 rounds)
shutdown_agent  Terminate a specific agent
interim_reply   Push a progress update to the user
noop            Do nothing (no action needed right now)
done            Declare all work complete, proceed to closing phase
reply           Return the final response directly to the user (closing phase only)
synthesize      Trigger the synthesis pipeline instead of replying directly

Agent Actions

Each iteration, an agent chooses exactly one action:
Action        Description
tool_call     Execute a tool (web search, file read, etc.)
publish_data  Share findings with the team via the workspace
send_message  Send a direct message to a specific teammate
request_help  Ask the Lead to spawn a new helper agent
idle          Signal that the current task is complete and await reassignment
done          Return final response (auto-converts to idle)
Agents cannot self-exit. When an agent returns done, it automatically converts to idle status. Only the Lead Agent can terminate agents via shutdown_agent. This ensures the Lead maintains full control over team composition.

Inter-Agent Communication

Swarm agents collaborate through two mechanisms:

P2P Messaging

Agents send direct messages to specific teammates through Redis-backed mailboxes. Message types include request, offer, accept, delegation, and info. Before each LLM call, the agent’s mailbox is checked for new messages. Incoming messages appear in the agent’s prompt context.
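The mailbox pattern described above can be sketched with an in-memory stand-in. Shannon uses Redis-backed mailboxes; the `Mailboxes` class, the `Message` fields, and the drain-on-read behavior here are assumptions made for illustration only. The message types are the ones listed above.

```python
from collections import defaultdict, deque
from dataclasses import dataclass

# Message types listed in the text above.
MESSAGE_TYPES = {"request", "offer", "accept", "delegation", "info"}


@dataclass(frozen=True)
class Message:
    sender: str
    recipient: str
    type: str
    body: str


class Mailboxes:
    """In-memory stand-in for per-agent mailboxes (hypothetical)."""

    def __init__(self) -> None:
        self._boxes: dict[str, deque] = defaultdict(deque)

    def send(self, msg: Message) -> None:
        if msg.type not in MESSAGE_TYPES:
            raise ValueError(f"unsupported message type: {msg.type}")
        self._boxes[msg.recipient].append(msg)

    def drain(self, agent: str) -> list[Message]:
        """Return and clear the agent's pending messages, oldest first.

        An agent loop would call this before each LLM call and place
        the result in the prompt context.
        """
        box = self._boxes[agent]
        messages = list(box)
        box.clear()
        return messages
```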

Shared Workspace

Agents publish findings to topic-based workspace lists. Before each iteration, every agent fetches recent workspace entries from all topics, so the entire team stays aware of collective progress.
Shared Workspace:
├── Topic: "findings"
│   ├── Agent-Takao: "NVIDIA dominates US with 80% market share..."
│   └── Agent-Mitaka: "Japan focuses on edge AI chips..."
└── Topic: "sources"
    └── Agent-Kichijoji: "Samsung foundry plans announced..."
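A topic-based workspace like the one shown above might look roughly as follows. This is a sketch under assumptions: the `Workspace` class and its methods are hypothetical, and the defaults of 5 entries and 800 characters mirror the `swarm.workspace_max_entries` and `swarm.workspace_snippet_chars` settings, here assumed to apply across all topics combined.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Entry:
    seq: int      # global publish order across all topics
    topic: str
    agent: str
    text: str


class Workspace:
    """In-memory stand-in for topic-based workspace lists (hypothetical)."""

    def __init__(self) -> None:
        self._topics: dict[str, list[Entry]] = {}
        self._seq = count()

    def publish(self, topic: str, agent: str, text: str) -> None:
        entry = Entry(next(self._seq), topic, agent, text)
        self._topics.setdefault(topic, []).append(entry)

    def recent(self, max_entries: int = 5, snippet_chars: int = 800) -> list[str]:
        """Return the newest entries from all topics, truncated for the prompt."""
        if max_entries <= 0:
            return []
        entries = sorted(
            (e for topic_entries in self._topics.values() for e in topic_entries),
            key=lambda e: e.seq,
        )
        return [
            f"[{e.topic}] {e.agent}: {e.text[:snippet_chars]}"
            for e in entries[-max_entries:]
        ]
```

Truncating each entry keeps the per-iteration prompt size bounded no matter how much the team has published.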

Knowledge Deduplication

Shannon prevents redundant work across agents with three layers of deduplication:
  1. Each agent caches URLs it has already fetched. If the same URL is requested again within the same agent loop, the cached content is returned without a network call.
  2. URL metadata (title, summary, key facts) is shared across all agents in the team. When Agent B tries to fetch a URL that Agent A already processed, it receives the cached metadata instead of re-fetching, saving both time and tokens.
  3. URLs discovered by search results are tracked across all agents. When a new search returns URLs where 70% or more have already been discovered by other agents, the system injects a warning to find new angles. Additionally, a search saturation detector compares recent queries using Jaccard word-level similarity (threshold 0.7, window of 3 queries) to flag repetitive searches.
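The two search-level checks can be sketched as follows. The thresholds (70% URL overlap, Jaccard 0.7, window of 3) come from the text; the function names, the whitespace tokenization, and the choice to flag a query when it matches any query in the window are assumptions.

```python
def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity: |A ∩ B| / |A ∪ B| over lowercase words."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty queries are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


def is_saturated(query: str, history: list[str],
                 threshold: float = 0.7, window: int = 3) -> bool:
    """Flag a query that closely repeats any of the last `window` queries."""
    return any(jaccard(query, past) >= threshold for past in history[-window:])


def overlap_ratio(result_urls: list[str], seen_urls: set[str]) -> float:
    """Fraction of search results already discovered by other agents."""
    if not result_urls:
        return 0.0
    return sum(url in seen_urls for url in result_urls) / len(result_urls)


def should_warn(result_urls: list[str], seen_urls: set[str],
                threshold: float = 0.7) -> bool:
    """True when enough results are already known that a new angle is needed."""
    return overlap_ratio(result_urls, seen_urls) >= threshold
```

For example, "us ai chip market share" and "japan ai chip market share" share 4 of 6 distinct words, giving a similarity of about 0.67, just under the 0.7 threshold, so the second query would not be flagged.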

Convergence Detection

Three mechanisms prevent agents from running indefinitely:
  1. If an agent takes 3 consecutive iterations with no meaningful action (empty or unrecognized actions), it is considered converged and transitions to idle status. Note that tool_call, send_message, and publish_data all reset this counter.
  2. If 3 consecutive permanent tool errors occur (not transient errors like rate limits), the agent aborts and reports the failure.
  3. On the last iteration, if the agent has not called done or idle, the workflow forces completion and builds a summary from the most recent iterations.
Transient errors (rate limits, timeouts, 503s) trigger automatic retry with escalating backoff (5s increments, max 30s) and do not count toward the abort threshold.
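The counters and the retry delay described above can be sketched as below. The limits (3 iterations, 3 errors, 5s increments, 30s cap) come from the text. The class and function names are hypothetical, and three details are assumptions: the backoff grows linearly, a successful tool call resets the error counter, and recognized actions outside the reset list leave the no-progress counter unchanged.

```python
from typing import Optional

RESETTING_ACTIONS = {"tool_call", "send_message", "publish_data"}
RECOGNIZED_ACTIONS = RESETTING_ACTIONS | {"request_help", "idle", "done"}


def backoff_seconds(attempt: int, step: int = 5, cap: int = 30) -> int:
    """Delay before retry number `attempt` (1-based): 5, 10, ... capped at 30."""
    return min(step * attempt, cap)


class ConvergenceTracker:
    NO_PROGRESS_LIMIT = 3
    PERMANENT_ERROR_LIMIT = 3

    def __init__(self) -> None:
        self.no_progress = 0
        self.permanent_errors = 0

    def record_action(self, action: Optional[str]) -> bool:
        """Return True once the agent should be treated as converged."""
        if action in RESETTING_ACTIONS:
            self.no_progress = 0
        elif not action or action not in RECOGNIZED_ACTIONS:
            self.no_progress += 1  # empty or unrecognized action
        return self.no_progress >= self.NO_PROGRESS_LIMIT

    def record_tool_result(self, ok: bool, transient: bool = False) -> bool:
        """Return True once the agent should abort and report failure."""
        if ok:
            self.permanent_errors = 0
        elif not transient:
            self.permanent_errors += 1  # transient errors are retried, not counted
        return self.permanent_errors >= self.PERMANENT_ERROR_LIMIT
```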

Global Budget Control

Swarm execution is bounded by three budget layers that prevent runaway costs:
Budget Layer            Default    Description
max_total_llm_calls     200        Maximum LLM calls across all agents
max_total_tokens        1,000,000  Maximum tokens consumed across all agents
max_wall_clock_minutes  30         Maximum wall-clock time for the entire swarm
The Lead Agent receives budget information (remaining calls, tokens, time) in its context, enabling it to make cost-aware decisions — such as shutting down low-priority agents or skipping optional tasks when budget is tight.
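A three-layer budget check of this kind can be sketched as follows. The field names and defaults match the table above; the `SwarmBudget` class, its methods, and passing elapsed time in as an argument are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SwarmBudget:
    # Defaults mirror the budget table above.
    max_total_llm_calls: int = 200
    max_total_tokens: int = 1_000_000
    max_wall_clock_minutes: float = 30
    llm_calls: int = 0
    tokens: int = 0

    def record(self, calls: int = 0, tokens: int = 0) -> None:
        """Add usage from any agent or from the Lead itself."""
        self.llm_calls += calls
        self.tokens += tokens

    def remaining(self, elapsed_minutes: float) -> dict:
        """Snapshot that could be placed in the Lead's context."""
        return {
            "llm_calls": self.max_total_llm_calls - self.llm_calls,
            "tokens": self.max_total_tokens - self.tokens,
            "minutes": self.max_wall_clock_minutes - elapsed_minutes,
        }

    def exhausted(self, elapsed_minutes: float) -> bool:
        """True as soon as any one of the three layers is used up."""
        return any(v <= 0 for v in self.remaining(elapsed_minutes).values())
```

Each layer is checked independently, so the swarm stops at whichever limit is reached first.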

When to Use Swarm vs Other Workflows

Scenario                                                 Recommended Workflow
Simple Q&A, single-step tasks                            Simple / DAG
Multi-step research with citations                       Research Workflow
Multi-agent code review, testing, and fixes              Swarm
Financial analysis with multiple analyst perspectives    Swarm
Data processing pipelines with Python/Bash execution     Swarm
Tasks where agents need to share intermediate findings   Swarm
Long-running exploration with dynamic subtask discovery  Swarm
Tasks with complex dependency chains between subtasks    Swarm
Swarm mode uses more tokens than standard workflows because each agent runs multiple LLM iterations and the Lead Agent consumes tokens for coordination decisions. Use it for tasks that genuinely benefit from persistent, collaborative multi-agent execution.

Configuration

Swarm behavior is controlled via config/features.yaml:
Parameter                       Default  Description
swarm.enabled                   true     Enable/disable swarm workflow
swarm.max_agents                10       Maximum total agents (initial + dynamic)
swarm.max_iterations_per_agent  25       Max reason-act loops per agent
swarm.agent_timeout_seconds     1800     Per-agent timeout (30 minutes)
swarm.max_messages_per_agent    20       P2P message cap per agent
swarm.workspace_snippet_chars   800      Max chars per workspace entry in prompt
swarm.workspace_max_entries     5        Max recent entries shown to each agent
swarm.max_total_llm_calls       200      Global LLM call budget for the entire swarm
swarm.max_total_tokens          1000000  Global token budget for the entire swarm
swarm.max_wall_clock_minutes    30       Maximum wall-clock time for the swarm

Streaming Events

Swarm workflows emit SSE events for real-time monitoring:
Event Type          Agent ID                       When
WORKFLOW_STARTED    swarm-supervisor               Workflow begins
PROGRESS            swarm-lead / swarm-supervisor  Planning, spawning, reassigning
LEAD_DECISION       swarm-lead                     Lead made a planning decision (spawn, assign, revise, etc.)
TASKLIST_UPDATED    swarm-lead                     Task dependency graph changed (tasks created or canceled)
TEAM_STATUS         swarm-lead                     Team composition changed (agent spawned or shut down)
AGENT_STARTED       Agent name                     Agent begins first iteration
AGENT_COMPLETED     Agent name                     Agent finishes
WORKFLOW_COMPLETED  swarm-supervisor               Final synthesis complete
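A client consuming these events needs to split the stream into frames. The sketch below parses generic SSE framing (`event:` and `data:` fields, blank line as frame terminator). It assumes the event type travels in the `event:` field and the payload is JSON with an `agent_id` key; neither detail is confirmed by this page, so check the Streaming documentation for the actual wire format.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_type, payload) pairs from raw SSE lines.

    Assumes each frame carries a JSON object in its data field(s).
    """
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current frame.
            if data:
                yield event or "message", json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        else:
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if name == "event":
                event = value
            elif name == "data":
                data.append(value)


# Hypothetical sample frames; the payload shape is an assumption.
sample = [
    "event: LEAD_DECISION\n",
    'data: {"agent_id": "swarm-lead"}\n',
    "\n",
    ": keep-alive\n",
    "event: AGENT_STARTED\n",
    'data: {"agent_id": "Agent-Takao"}\n',
    "\n",
]
for event_type, payload in parse_sse(sample):
    print(event_type, payload["agent_id"])
```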

Next Steps

Swarm Tutorial

Step-by-step guide to running swarm workflows

Workflows & Patterns

Other workflow types and cognitive patterns

Streaming

Real-time event streaming

Cost Control

Budget management for multi-agent tasks