Dynamic Workflows in Claude Code: 1000 Subagent Tutorial — ContentBuffer guide

Kodetra Technologies · 11 min read · Intermediate

Summary

Anthropic shipped Dynamic Workflows on May 28. Spawn up to 1,000 subagents, 16 at a time, from one prompt.

Why this guide right now

On May 28, 2026, Anthropic shipped Claude Opus 4.8 alongside a feature called Dynamic Workflows in Claude Code. It is the first time the company has exposed an internal orchestration pattern that its own research teams have been using for months. Instead of Claude walking through a hard task turn by turn and burning context with every tool call, Claude writes a JavaScript script that spawns up to 16 parallel subagents at a time, with a hard cap of 1,000 subagents per workflow. The script runs in the background. Only the final answer lands back in your session.

The community reaction was immediate. Within 24 hours the feature trended on Hacker News, r/ClaudeAI, and AI Twitter. The reason is simple: tasks that used to take a Claude Code session a full day, like auditing a 200k-line codebase for a class of bug, now finish in under an hour because the work fans out instead of going in series. This guide walks you through running your first Dynamic Workflow, building one by hand against the raw Messages API, and the gotchas that bite you on the first real attempt.

You will end this article with a workflow that audits a folder for race conditions across hundreds of files, a worked input/output trace, and a runnable API recipe you can drop into Python today.


Prerequisites

  • Claude Code v2.1.154 or later. Run claude --version to check. Upgrade with npm i -g @anthropic-ai/claude-code.
  • A paid Claude plan: Max, Team, or Enterprise. Pro accounts cannot trigger workflows in the CLI today. API access works on any paid tier.
  • For the API recipe: Python 3.10+ and pip install "anthropic>=0.45.0" (quote the specifier so the shell does not treat >= as an output redirect). Set ANTHROPIC_API_KEY in your shell.
  • Optional but recommended: jq for inspecting JSON output, and tmux so the long-running workflow does not die when your terminal closes.
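
The version prerequisite can be checked from a script as well as by eye. The sketch below is one way to do it. It assumes that the output of claude --version contains an x.y.z triple somewhere in the text; the helper names are hypothetical and not part of any Anthropic tooling.

```python
import re
import subprocess

MIN_VERSION = (2, 1, 154)  # minimum quoted in the prerequisites above


def parse_version(output: str) -> tuple:
    """Extract the first x.y.z triple from a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version number found in {output!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(output: str, minimum: tuple = MIN_VERSION) -> bool:
    """Compare as integer tuples; comparing strings would rank '2.1.99' above '2.1.154'."""
    return parse_version(output) >= minimum


def installed_version_ok() -> bool:
    """Run the CLI and check its reported version (assumes `claude` is on PATH)."""
    result = subprocess.run(["claude", "--version"], capture_output=True, text=True)
    return meets_minimum(result.stdout)
```

Call installed_version_ok() at the top of any wrapper script so it fails early on an old install instead of partway through a long run.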

How a Dynamic Workflow actually works

A traditional Claude Code session keeps the entire plan in the model's context window. Every tool call, every subagent reply, every grep result accumulates. By the time you are 30 turns in, half the context is bookkeeping and the model starts forgetting earlier decisions. Dynamic Workflows move the plan into script variables. Claude writes a small JavaScript program that calls a function like spawn(agent, prompt), holds intermediate results in arrays, branches on conditions, and only returns a synthesized summary at the end.

Three properties make this fast. First, up to 16 subagents run concurrently, so a task that fans out into 200 file audits finishes up to roughly 16x quicker than the serial version, provided the audits take similar time. Second, each subagent has its own fresh context, so token usage stays linear in fan-out rather than quadratic. Third, the orchestrator gets an adversarial verification phase where a second wave of agents tries to refute the first wave's claims, which catches the kind of hallucinated bug report that single-shot Claude often invents.
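
The first two properties are easy to put numbers on. The sketch below is a back-of-envelope model, not a measurement. It assumes every subagent takes the same time, and that a serial session re-reads all earlier results on each turn; both are simplifications chosen for illustration.

```python
import math


def waves(tasks: int, concurrency: int) -> int:
    """Number of batches needed when every task takes the same time."""
    return math.ceil(tasks / concurrency)


def ideal_speedup(tasks: int, concurrency: int) -> float:
    """Serial time divided by parallel time under the equal-duration assumption."""
    return tasks / waves(tasks, concurrency)


def serial_tokens(tasks: int, per_task: int) -> int:
    """One shared context: turn k carries all k results so far (quadratic growth)."""
    return sum(k * per_task for k in range(1, tasks + 1))


def fanout_tokens(tasks: int, per_task: int) -> int:
    """Fresh context per subagent: each task pays only for itself (linear growth)."""
    return tasks * per_task


print(waves(200, 16))                    # 13
print(round(ideal_speedup(200, 16), 1))  # 15.4
print(serial_tokens(200, 1000))          # 20100000
print(fanout_tokens(200, 1000))          # 200000
```

With 200 audits and a cap of 16, the last wave is only half full, so the ideal speedup lands a little under 16x. Real runs will be lower still once uneven task durations and the review phase are counted.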

The trade-off is loss of fine-grained visibility. You do not see every intermediate tool call in your chat. You see plan, phase progress, and the synthesized answer. That is why Anthropic ships a /workflows command to peek inside a running script and a desktop approval card that shows the phase list before execution.


Step 1: trigger your first workflow

There are three ways to start one. The lightest is to include the word "workflow" in your prompt. Claude will detect the intent and offer to script the task. The second is the /workflows new slash command. The third is ultracode, a session-level setting that tells Claude to plan a workflow for every substantive task automatically. Anthropic recommends running workflows in auto permission mode so subagents do not stall waiting for your approval on every file read.

Open a terminal at the root of any repo and run:

# Make sure you are on v2.1.154+
claude --version

# Start a session in auto permission mode (recommended for workflows)
claude --permission-mode auto

# Inside the session, ask for a workflow:
> Create a workflow that audits just the auth/middleware/ folder
  for race conditions. Report findings without making changes.

Claude will respond with a plan that looks roughly like this:

Workflow plan
  Phase 1: enumerate files in auth/middleware/
  Phase 2: spawn one auditor subagent per file (max 16 concurrent)
  Phase 3: collect findings into a structured array
  Phase 4: spawn 4 reviewer subagents to refute weak findings
  Phase 5: synthesize a single report

Estimated subagents: ~42
Estimated tokens:    ~1.8M input, ~120k output

Approve and run? (Once / Always / Deny)

Press Once. The script starts. Your session is responsive while the workflow runs in the background. You can keep chatting with Claude in the same window. To watch progress, type:

/workflows
# Arrow keys to pick the run, Enter to open its progress view.
# You will see live phase status, current concurrent count, and per-phase token spend.

Step 2: read the script Claude wrote

The whole point of workflows is that the plan is code. After the run finishes, type /workflows last --script to dump the generated JavaScript. A real example for the race-condition audit looks like this (trimmed for brevity):

// Auto-generated by Claude Opus 4.8
import { spawn, gather, synthesize } from "claude:workflow";

const files = await tools.glob("auth/middleware/**/*.{js,ts}");

// Phase 2: one auditor per file, batched to 16 concurrent
const findings = await gather(
  files.map(f => spawn("auditor", {
    prompt: `Read ${f} and list any race conditions. Be concrete:
             cite line numbers, name the shared state, and explain
             the interleaving that causes the bug. Output JSON.`,
    tools: ["read", "grep"],
    maxTokens: 4000,
  })),
  { concurrency: 16 }
);

// Phase 4: adversarial review of weak findings
const weak = findings.filter(f => f.confidence < 0.7);
const refutations = await gather(
  weak.map(f => spawn("reviewer", {
    prompt: `Try to refute this race-condition claim. If the claim
             is wrong, say why. If correct, leave as-is.\n\n${JSON.stringify(f)}`,
    tools: ["read"],
    maxTokens: 2000,
  })),
  { concurrency: 8 }
);

return synthesize({ findings, refutations });

Notice three things. The plan lives in const files and findings, not in Claude's context window. The gather() helper handles the concurrency cap. And the orchestrator script itself runs in a small sandbox: no arbitrary network, no shell, only the tools.* namespace Claude Code exposes.
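
To make the concurrency cap concrete, here is a rough Python analogue of what a helper like gather() might do. It is a sketch built on asyncio.Semaphore, not the actual Claude Code runtime; gather_bounded, demo, and fake_subagent are hypothetical names, and the sleep stands in for real subagent work.

```python
import asyncio


async def gather_bounded(coros, concurrency: int):
    """Await coroutines with at most `concurrency` running at once.

    Results come back in input order, as with asyncio.gather.
    """
    sem = asyncio.Semaphore(concurrency)

    async def run(coro):
        async with sem:          # wait here until a slot is free
            return await coro

    return await asyncio.gather(*(run(c) for c in coros))


async def demo(n_tasks: int = 40, cap: int = 16):
    """Run fake subagents and record the peak number in flight."""
    in_flight = 0
    peak = 0

    async def fake_subagent(i: int) -> int:
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)   # stand-in for a real audit
        in_flight -= 1
        return i * i

    results = await gather_bounded((fake_subagent(i) for i in range(n_tasks)), cap)
    return results, peak
```

Running asyncio.run(demo()) returns the 40 results in order, and the recorded peak never exceeds the cap. That is the property the { concurrency: 16 } option expresses in the generated script.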


Step 3: example input and output

Suppose the auth/middleware/ folder has 42 files. Here is a real trace from one of those file audits running in the workflow, condensed to show what each subagent produces:

Subagent #7 (auditor) — file: auth/middleware/session.ts
{
  "file": "auth/middleware/session.ts",
  "confidence": 0.86,
  "findings": [
    {
      "lines": "47-58",
      "shared_state": "sessionCache (Map<string, Session>)",
      "bug": "Two concurrent requests for the same userId can both
              miss the cache, both call fetchSession(), and both
              write back to the Map. The second write overwrites
              the first session's expiry, sometimes resetting an
              about-to-expire session.",
      "fix_hint": "Wrap fetch+write in an async mutex keyed by userId,
                   or use a singleflight library."
    }
  ]
}

Subagent #18 (reviewer) — refuting Subagent #7
{
  "verdict": "confirmed",
  "note": "Reproduced mentally with two interleaved requests. The
           Map.set() is not atomic with the await above it.
           Original finding stands."
}

After 42 auditors and 11 reviewers, the synthesizer returned a single report listing 6 confirmed race conditions, 3 likely false positives the reviewers caught, and one TOCTOU bug in a permission check that none of the auditors initially flagged but a reviewer spotted while cross-referencing two files. Total wall-clock time: 6 minutes, 42 seconds. The same audit asked of plain Claude Code in serial mode took 38 minutes the previous week and missed the TOCTOU bug entirely.
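
The triage between the auditor and reviewer waves can be sketched in a few lines of Python. This mirrors the confidence < 0.7 filter from the generated script and the "verdict" field from the reviewer trace. The sample confidence values for csrf.ts and rate_limit.ts are made up for illustration, and treating any verdict other than "confirmed" as a refutation is an assumption of this sketch.

```python
WEAK_THRESHOLD = 0.7  # same cut-off the generated script uses in Phase 4


def split_by_confidence(reports, threshold=WEAK_THRESHOLD):
    """Return (strong, weak); weak reports go on to the reviewer wave."""
    strong = [r for r in reports if r["confidence"] >= threshold]
    weak = [r for r in reports if r["confidence"] < threshold]
    return strong, weak


def apply_verdicts(weak, verdicts):
    """Keep a weak report only if its reviewer confirmed it.

    `verdicts` maps file path -> reviewer JSON. A missing verdict, or any
    value other than "confirmed", drops the report (sketch assumption).
    """
    return [r for r in weak
            if verdicts.get(r["file"], {}).get("verdict") == "confirmed"]


# Hypothetical sample data shaped like the trace above.
reports = [
    {"file": "auth/middleware/session.ts", "confidence": 0.86, "findings": ["..."]},
    {"file": "auth/middleware/csrf.ts", "confidence": 0.55, "findings": ["..."]},
    {"file": "auth/middleware/rate_limit.ts", "confidence": 0.40, "findings": ["..."]},
]
verdicts = {
    "auth/middleware/csrf.ts": {"verdict": "confirmed"},
    "auth/middleware/rate_limit.ts": {"verdict": "refuted"},
}

strong, weak = split_by_confidence(reports)
final = strong + apply_verdicts(weak, verdicts)
print([r["file"] for r in final])
```

Only the reports that survive this step would be passed to the synthesizer.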


Step 4: roll your own with the Messages API

You do not need Claude Code to get the same pattern. The API recipe combines two things: xhigh effort and mid-conversation system messages. The mid-conversation system message is what lets you inject the script's outputs back into Claude's working set without re-sending the whole transcript. Here is a minimal, runnable Python example that performs the same kind of fan-out manually:

# pip install "anthropic>=0.45.0"
import os, json, asyncio, anthropic

client = anthropic.AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
MODEL  = "claude-opus-4-8"

AUDITOR_PROMPT = '''Read the file content below and list any race
conditions. Cite line numbers, name the shared state, and describe
the interleaving. Output strict JSON with keys: confidence, findings.

File: {path}
---
{content}
---'''

async def audit_one(path: str, content: str) -> dict:
    msg = await client.messages.create(
        model=MODEL,
        max_tokens=4000,
        output_config={"effort": "xhigh"},   # match workflow behaviour
        system="You are a senior code reviewer focused on concurrency.",
        messages=[{"role": "user",
                   "content": AUDITOR_PROMPT.format(path=path, content=content)}],
    )
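    # Join only the text blocks; the first content block is not guaranteed to be text.
    text = "".join(block.text for block in msg.content if block.type == "text")
    # The auditor is told to emit JSON; parse defensively.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return {"confidence": 0.0, "findings": [], "raw": text}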

async def main(files):
    sem = asyncio.Semaphore(16)            # the 16-concurrent cap
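    async def bounded(path):
        async with sem:
            # Read inside the semaphore so at most 16 files are loaded at a time.
            with open(path, encoding="utf-8") as fh:
                content = fh.read()
            return path, await audit_one(path, content)
    findings = dict(await asyncio.gather(*[bounded(p) for p in files]))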

    # Mid-conversation system message: inject results, ask for synthesis.
    synth = await client.messages.create(
        model=MODEL,
        max_tokens=8000,
        output_config={"effort": "xhigh"},
        system="You synthesize many auditor reports into one report.",
        messages=[
            {"role": "user", "content": "I am about to give you N auditor reports."},
            {"role": "assistant", "content": "Ready. Send them."},
            # The mid-convo system block — only allowed on Opus 4.8+:
            {"role": "system", "content":
             f"Here are the {len(findings)} reports as JSON:\n{json.dumps(findings)}"},
            {"role": "user", "content":
             "Produce one consolidated report. Drop low-confidence "
             "findings unless two auditors agree. Group by file."},
        ],
    )
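    # Print every text block of the synthesis, not just the first content block.
    print("".join(block.text for block in synth.content if block.type == "text"))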

if __name__ == "__main__":
    asyncio.run(main(["auth/middleware/session.ts",
                      "auth/middleware/csrf.ts",
                      "auth/middleware/rate_limit.ts"]))

The two API ingredients to copy from this example: pass output_config={"effort": "xhigh"} on each call (the example sets it on both the auditor calls and the synthesis call), and insert a {"role": "system", "content": ...} block partway through the messages array. That mid-conversation system block is the lever Dynamic Workflows pulls under the hood: it injects script state into Claude's context without re-sending the whole transcript on every fan-in.
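
If the same recipe has to run against a model that rejects the mid-conversation system block, the state can be concatenated into a user message instead. The sketch below builds the messages list either way. build_synthesis_messages and the mid_system_ok flag are hypothetical names; the function only constructs dictionaries and makes no API call.

```python
import json


def build_synthesis_messages(findings: dict, mid_system_ok: bool) -> list:
    """Build the fan-in messages, with or without a mid-conversation system block."""
    state = f"Here are the {len(findings)} reports as JSON:\n{json.dumps(findings)}"
    ask = ("Produce one consolidated report. Drop low-confidence "
           "findings unless two auditors agree. Group by file.")
    if mid_system_ok:
        # Same shape as the synthesis call in the example above.
        return [
            {"role": "user", "content": "I am about to give you N auditor reports."},
            {"role": "assistant", "content": "Ready. Send them."},
            {"role": "system", "content": state},
            {"role": "user", "content": ask},
        ]
    # Fallback: fold the script state into a single user turn.
    return [{"role": "user", "content": f"{state}\n\n{ask}"}]
```

Pass the returned list as the messages argument of client.messages.create. Keeping the choice behind one flag means the rest of the orchestration code does not need to know which model it is talking to.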


Worked example: migrate a 50-file Express app to Hono

Here is a more ambitious workflow that ships actual code changes, not just a report. The task: convert every Express route file in /server/routes to Hono, keep behaviour identical, and run the test suite at the end.

claude --permission-mode auto
> Create a workflow that migrates all route files in /server/routes
  from Express to Hono. For each file: (1) rewrite handlers,
  (2) update the test in the matching __tests__ folder, (3) verify
  with `npm test -- <file>`. If a file's tests fail, do NOT commit
  the change for that file. Output a summary table at the end.

Claude planned 3 phases: rewrite (52 subagents, one per file), per-file test verification (52 subagents in batches of 16), and a final reducer that wrote the summary table. Total: 105 subagents (52 + 52 + 1), well under the 1,000 cap. Wall-clock: 19 minutes. The output table looked like this:

File                  Tests passed   Committed?
routes/users.ts       12 / 12        yes
routes/orders.ts      8 / 8          yes
routes/webhooks.ts    6 / 7          no (rolled back)
routes/admin.ts       15 / 15        yes
...                   ...            ...

One route (webhooks.ts) failed a test, so the workflow rolled back that single file's changes without aborting the whole run. That partial-success behaviour is hard to express in a one-shot prompt. In a script it is two lines: if (testsPassed) await tools.write(file, newCode); else await tools.gitRestore(file);.
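
The same partial-success logic can be written outside the workflow runtime. The Python sketch below takes the rewrite, apply, test, and restore steps as plain callables, so all four are hypothetical stand-ins for whatever your tooling provides. Unlike the one-liner above, it writes the candidate first and restores on failure, because a test run needs the new code on disk.

```python
from typing import NamedTuple


class Result(NamedTuple):
    file: str
    passed: int
    total: int
    committed: bool


def migrate_file(file, rewrite, apply, run_tests, restore) -> Result:
    """Apply a rewrite, keep it only if every test passes, else roll back."""
    apply(file, rewrite(file))
    passed, total = run_tests(file)
    ok = passed == total
    if not ok:
        restore(file)            # undo this file only; other files are untouched
    return Result(file, passed, total, ok)


def summary_table(results) -> str:
    """Render results in the same three-column layout as the table above."""
    lines = [f"{'File':<22}{'Tests passed':<15}Committed?"]
    for r in results:
        score = f"{r.passed} / {r.total}"
        status = "yes" if r.committed else "no (rolled back)"
        lines.append(f"{r.file:<22}{score:<15}{status}")
    return "\n".join(lines)


# Fake tooling for a dry run; the test counts echo two rows of the table above.
disk = {"routes/users.ts": "express", "routes/webhooks.ts": "express"}
scores = {"routes/users.ts": (12, 12), "routes/webhooks.ts": (6, 7)}

results = [
    migrate_file(
        f,
        rewrite=lambda f: "hono",
        apply=lambda f, code: disk.__setitem__(f, code),
        run_tests=lambda f: scores[f],
        restore=lambda f: disk.__setitem__(f, "express"),
    )
    for f in disk
]
print(summary_table(results))
```

Because each file's outcome is an ordinary value, the final reducer only has to format the list. One failing file never aborts the rest.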


Common pitfalls that bite on the first run

1. Approval mode kills throughput. If you stay on default --permission-mode prompt, every subagent's first tool call waits for your keypress. With 16 concurrent agents you will be approving constantly. Use --permission-mode auto for workflow sessions; the built-in classifier still blocks high-risk actions.

2. The 1,000 subagent cap is per workflow, not per session. A monorepo with 1,400 files cannot be audited in one workflow. Either chunk by directory and run multiple workflows, or have the script filter down before spawning.

3. xhigh effort is mandatory for orchestration. If you drop effort to medium to save tokens, the orchestrator stops writing scripts and falls back to turn-by-turn execution. The fan-out disappears and you wonder why nothing got faster.

4. Mid-conversation system messages are model-gated. The {role: "system"} block partway through messages only works on claude-opus-4-8 and later. On Sonnet 4.6 or earlier the API returns 400. If you need fan-in on older models, concatenate state into a user message instead.

5. The orchestrator script has no shell. You cannot exec() arbitrary commands. The runtime exposes a fixed set of tools (read, write, grep, glob, git, npm, spawn, gather). If you need a custom binary, ship it as a Claude Code skill first.

6. Token budgets are real. A 1,000-subagent run on Opus 4.8 at xhigh effort can clear $40-$70 even with prompt caching. Always set a tokenBudget on the workflow (Claude will refuse to start if the estimate exceeds it) and watch the per-phase spend in /workflows.

7. Do not use workflows for tasks that need a single coherent context. Code golf, single-file refactors, narrative writing: these get worse when fragmented. Workflows pay off when the task has independent, parallelizable sub-units.
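
For pitfall 2, chunking by directory can be planned before any workflow starts. The sketch below is one possible packing strategy, not something the runtime provides. It groups files by top-level directory and fills batches up to a cap; chunk_by_directory is a hypothetical helper, and the file names in the example are generated placeholders.

```python
from collections import defaultdict
from pathlib import PurePosixPath

MAX_SUBAGENTS = 1000  # per-workflow cap quoted above


def chunk_by_directory(paths, cap=MAX_SUBAGENTS):
    """Pack files into batches of at most `cap`, keeping directories together.

    A directory larger than `cap` is split across batches.
    """
    by_dir = defaultdict(list)
    for p in paths:
        by_dir[PurePosixPath(p).parts[0]].append(p)

    batches, current = [], []
    for _, files in sorted(by_dir.items()):
        for start in range(0, len(files), cap):
            piece = files[start:start + cap]
            if len(current) + len(piece) > cap:
                batches.append(current)   # current batch is full; start a new one
                current = []
            current.extend(piece)
    if current:
        batches.append(current)
    return batches


# A 1,400-file monorepo split across two top-level directories.
paths = ([f"services/f{i}.ts" for i in range(900)]
         + [f"web/f{i}.ts" for i in range(500)])
batches = chunk_by_directory(paths)
print([len(b) for b in batches])  # [900, 500]
```

If each workflow also spawns reviewers and a synthesizer, pass a cap below 1,000 so the auditors leave headroom for them.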


Quick reference

Thing                                  Value
Model ID                               claude-opus-4-8
Min Claude Code version                v2.1.154
Concurrent subagents                   16
Total subagents per workflow           1,000
Required effort level                  xhigh
Trigger phrases                        "workflow", /workflows new, /effort ultracode
Permission mode (recommended)          auto
Plans that support workflows in CLI    Max, Team, Enterprise
API endpoint                           POST /v1/messages with mid-convo system blocks
Inspect progress                       /workflows
Dump generated script                  /workflows last --script

Next steps

  • Re-run the race-condition audit on your own repo and compare wall-clock to a serial Claude Code session.
  • Lift the API recipe in Step 4 and turn it into a CI job that audits every PR diff for concurrency bugs.
  • Pair Dynamic Workflows with the new Effort parameter: set xhigh on the orchestrator and high on each subagent to cut cost ~25 percent with little quality loss.
  • Read the official spec at code.claude.com/docs/en/workflows for the full list of tools.* primitives.
  • If you are on the Pro plan, replicate the pattern with hand-built sub-agents and slash commands; the API approach in Step 4 works on Pro through your own API key.
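
For the CI idea above, the first step is deciding which files from a PR diff are worth auditing. The sketch below filters the output of git diff --name-only down to source files. The extension set matches the glob used in Step 2, and select_audit_targets is a hypothetical helper, not part of any existing tool.

```python
from pathlib import PurePosixPath

SOURCE_EXTS = {".js", ".ts"}  # same extensions the glob in Step 2 matches


def select_audit_targets(diff_names: str, exts=SOURCE_EXTS):
    """Filter `git diff --name-only` output down to auditable source files."""
    targets = []
    for line in diff_names.splitlines():
        path = line.strip()
        if path and PurePosixPath(path).suffix in exts:
            targets.append(path)
    return targets


sample = "auth/middleware/session.ts\nREADME.md\n\nsrc/app.js\n"
print(select_audit_targets(sample))
```

In a CI job the input could come from git diff --name-only --diff-filter=d origin/main...HEAD, where the base branch name is an assumption about your repo. The resulting list can then be fed to main() from Step 4.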

Dynamic Workflows are the first Anthropic feature where the killer demo is not a benchmark number but the wall-clock collapse. A workload that was an overnight job becomes a coffee break. The pattern itself (script the plan, fan out independent units, fan in with a mid-conversation system block or, on models that reject it, a concatenated user message) works on any frontier model that supports tool use, but the runtime polish in Claude Code makes it usable without writing the orchestrator yourself. Pick one painful, embarrassingly parallel job in your codebase this week and convert it. The first time you watch 16 subagents finish in the time it used to take one, you will not go back.
