MacleodLabs.ai

Dynamic Workflows: When Your Agent Writes Its Own Orchestrator

Agentic AI just learned to delegate at scale—and the architecture is simpler than you'd think.

Anthropic shipped dynamic workflows in Claude Code, and the core idea inverts how we've been thinking about multi-agent systems. Instead of an LLM orchestrating subagents turn by turn through prompt engineering, Claude now writes a JavaScript workflow script that becomes the orchestrator. The runtime executes the script while your session stays responsive. The lead session's context window only has to hold the final answer; all the intermediate state lives in script variables.

This isn't just a performance hack. It's a fundamental architectural shift from context-driven orchestration to code-driven orchestration—and it unlocks agentic work patterns that were impractical before.

The Context Window Ceiling

Per arXiv:2512.04123 ("Measuring Agents in Production"), production agents rarely run long: 68% of agents in the study execute ≤10 steps before human intervention. Teams use bounded autonomy as a practical control measure; the study identifies reliability as the top challenge and documents human-in-the-loop checkpoints as the dominant production pattern. Context is part of why the bound is hard to lift: a single agent that keeps going carries every intermediate result in its own context window. Dynamic workflows get past this ceiling by decoupling orchestration (the script) from agent autonomy (each subagent's bounded scope). The script can coordinate 50+ phases without any single agent accumulating unbounded context.

The standard workaround—hierarchical agents with handoffs—introduces new problems: how do you route partial results? How do you recover from mid-chain failures? How do you audit what happened when Agent 3's output contradicted Agent 1's assumptions?

Dynamic workflows solve this by moving the plan into code.

How It Works

When you trigger ultracode mode (via /effort ultracode or the keyword ultracode: in your prompt), Claude analyzes the task and decides whether it needs a workflow. If yes:

  1. Claude writes a JavaScript orchestration script tailored to your task. The script defines phases, specifies which subagents to spawn, and encodes the coordination logic (fan-out, cross-check, aggregate).
  2. The workflow runtime executes the script, spawning subagents as parallel processes. Each subagent is a fresh Claude instance with its own context, its own tool access, and a narrowly scoped prompt.
  3. The script manages state. Variables hold intermediate results. Control flow handles branching (if Agent A found issue X, spawn Agents B and C to verify). Coordination logic is explicit: "wait for all three to finish, then compare findings."
  4. Only the synthesis lands in your context. The lead Claude session doesn't see the 47 file diffs, the 12 API calls, or the 200K tokens of intermediate reasoning. It sees the script's return value: a summary, a report, or a structured result.
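
The branching and wait-then-compare logic from step 3 can be sketched in plain JavaScript. This is a minimal sketch, not the Claude Code runtime API: `runAgent` and the agent names are hypothetical stand-ins that return canned results, so only the control flow is real.

```javascript
// Hypothetical stand-in for a subagent call. A real runtime would start a
// fresh model instance; here each "agent" resolves to a canned result.
async function runAgent(name, task) {
  const canned = {
    scanner: { issue: "missing-auth-check", file: "src/routes/admin.js" },
    verifierB: { agrees: true },
    verifierC: { agrees: false },
  };
  return { name, task, ...canned[name] };
}

// Orchestration logic lives in ordinary control flow and variables.
async function orchestrate() {
  const scan = await runAgent("scanner", "look for auth issues");

  // Branch: only spawn verifiers when the first agent found something.
  if (!scan.issue) {
    return { status: "clean", votes: [] };
  }

  // Fan out: both verifiers run concurrently; wait for all, then compare.
  const votes = await Promise.all([
    runAgent("verifierB", `verify ${scan.issue} in ${scan.file}`),
    runAgent("verifierC", `verify ${scan.issue} in ${scan.file}`),
  ]);
  const agreeing = votes.filter((v) => v.agrees).length;

  // Only this small summary is returned to the caller.
  return {
    status: agreeing === votes.length ? "confirmed" : "disputed",
    issue: scan.issue,
    votes: votes.map((v) => ({ name: v.name, agrees: v.agrees })),
  };
}
```

With the canned data the two verifiers disagree, so the summary comes back as disputed. The full `scan` and `votes` objects never leave the function; the caller sees only the compact result.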

The workflow is resumable. If you pause mid-run (or Claude Code crashes), you can restart from the last completed phase. The script is inspectable: from /workflows, select a run and press s to save it as a custom command you can re-run or modify.
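
One way to picture resumability is a checkpoint per phase. The sketch below is an assumption about how such a mechanism could work, not a description of the actual runtime: `createRunner`, `phase`, and the in-memory `Map` are all hypothetical, and a real system would need a durable store.

```javascript
// Hypothetical checkpoint helper: a phase that already has a stored result
// is skipped on re-run. A Map stands in for a durable store.
function createRunner(checkpoints = new Map()) {
  return async function phase(name, work) {
    if (checkpoints.has(name)) {
      return checkpoints.get(name); // completed earlier: reuse the result
    }
    const result = await work();
    checkpoints.set(name, result); // record only after the phase succeeds
    return result;
  };
}

// Example workflow: three phases, each consuming the previous result.
// opts.failAnalyze simulates a crash in the middle of the run.
async function workflow(phase, log, opts = {}) {
  const files = await phase("discover", async () => {
    log.push("discover");
    return ["a.ts", "b.ts"];
  });
  const findings = await phase("analyze", async () => {
    log.push("analyze");
    if (opts.failAnalyze) throw new Error("simulated crash");
    return files.map((file) => ({ file, ok: true }));
  });
  return phase("report", async () => {
    log.push("report");
    return `${findings.length} files analyzed`;
  });
}
```

Running `workflow` once with `failAnalyze: true` and then again with the same `checkpoints` map re-executes only the phases that never completed; the discovery phase is served from the checkpoint.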

Per the official docs at https://code.claude.com/docs/en/workflows, key use cases include codebase-wide bug sweeps, 500-file migrations, research requiring cross-checked sources, and drafting plans from multiple independent angles.

Parallel Subtask Decomposition: The Research Foundation

Dynamic workflows mirror the parallel orchestration pattern from arXiv:2604.17009 ("Small Model as Master Orchestrator"), but replace the trained ParaManager model with Claude-written JavaScript orchestration code. That removes the post-training step, at the price of running frontier models as subagents instead of specialized small models.

The paper proposes a unified parallel orchestration paradigm via a lightweight orchestrator called ParaManager. It abstracts agents and tools into a standardized action space with protocol normalization and explicit state feedback, enabling parallel tool calls at each orchestration round. The orchestrator emits a set of parallel calls per round—each with a tool identifier and parameters—and the system provides explicit state feedback enabling "recoverable closed-loop reasoning."

Below is a schematic of the ParaManager architecture as described in the paper:

         state_t (current task state + history)
               │
               ▼
        ┌──────────────┐
        │  ParaManager │  (lightweight orchestrator)
        │  (SFT + RL)  │
        └──────┬───────┘
               │ emits parallel action set at round t
               │ (up to n_max parallel branches)
       ┌───────┴───────┐
       │               │
       ▼               ▼
  [Agent/Tool 1]  [Agent/Tool 2]  ...  [Agent/Tool n]
       │               │
       └───────┬───────┘
               │ structured state feedback per branch
               ▼
        ┌──────────────┐
        │ Aggregation  │  cross-validation / synthesis
        └──────┬───────┘
               │
               ▼
          state_{t+1}  (updated; loop continues or terminates)
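
The round-based loop in the schematic can be sketched as follows. This is not the paper's implementation: `policy` is a hand-written function standing in for the trained ParaManager model, and the `tools` registry, `runRound`, and `orchestrateRounds` are hypothetical names chosen for illustration.

```javascript
// Hypothetical tool registry: every tool shares one call shape (params in,
// value out), standing in for a normalized action space.
const tools = {
  search: async ({ query }) => `results for ${query}`,
  calc: async ({ a, b }) => a + b,
  flaky: async () => { throw new Error("tool unavailable"); },
};

// Run one round: execute every call in the action set in parallel and turn
// each outcome, success or failure, into structured feedback.
async function runRound(actions) {
  const settled = await Promise.allSettled(
    actions.map(async ({ tool, params }) => tools[tool](params))
  );
  return settled.map((s, i) => ({
    tool: actions[i].tool,
    ok: s.status === "fulfilled",
    value: s.status === "fulfilled" ? s.value : undefined,
    error: s.status === "rejected" ? s.reason.message : undefined,
  }));
}

// Closed loop: the policy sees the state, emits up to nMax parallel actions,
// and the feedback is folded into the next state until it emits none.
async function orchestrateRounds(policy, task, { nMax = 4, maxRounds = 5 } = {}) {
  let state = { task, history: [] };
  for (let round = 0; round < maxRounds; round++) {
    const actions = policy(state).slice(0, nMax);
    if (actions.length === 0) break; // policy signals termination
    const feedback = await runRound(actions);
    state = { ...state, history: [...state.history, feedback] };
  }
  return state;
}

// Example policy: two parallel calls first, then one recovery call if any
// branch failed in the first round.
function examplePolicy(state) {
  if (state.history.length === 0) {
    return [
      { tool: "search", params: { query: state.task } },
      { tool: "flaky", params: {} },
    ];
  }
  const last = state.history[state.history.length - 1];
  const failed = last.some((f) => !f.ok);
  return failed && state.history.length === 1
    ? [{ tool: "calc", params: { a: 2, b: 3 } }]
    : [];
}
```

Because a failed branch shows up as feedback rather than an exception, the policy can react to it in the next round. That is the sense in which explicit state feedback makes the loop recoverable.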

Dynamic workflows implement this general pattern with two key differences:

  1. The orchestrator is code, not a trained model. Claude writes the loop; the runtime executes it. This eliminates the need for post-training an orchestration policy.
  2. Subagents are frontier models, not specialized small models. Each subagent is a full Claude instance with reasoning, tool use, and planning capability.

The Cost-Quality Trade-Off

Parallel execution burns more tokens than a single sequential agent, and in exchange it buys wall-clock speed and an independent context for each subtask. Teams should structure tasks so the work fans out wherever possible; otherwise the token spend is hard to justify.
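
A back-of-the-envelope model makes the trade-off concrete. Every number below is a placeholder, not a measurement, and the model is deliberately simple: tasks are assumed equal in size, subagents run in fixed-size waves, and each subagent pays a flat token overhead for re-reading shared instructions. `estimateRun` is a hypothetical helper.

```javascript
// Rough cost model. All inputs are illustrative assumptions.
function estimateRun({ tasks, secondsPerTask, tokensPerTask, concurrency, overheadTokensPerAgent }) {
  const waves = Math.ceil(tasks / concurrency); // batches of parallel subagents
  return {
    wallClockSeconds: waves * secondsPerTask,
    totalTokens: tasks * (tokensPerTask + overheadTokensPerAgent),
  };
}

// Same 40 tasks, run one at a time versus ten at a time.
const sequential = estimateRun({
  tasks: 40, secondsPerTask: 30, tokensPerTask: 8000,
  concurrency: 1, overheadTokensPerAgent: 0,
});
const parallel = estimateRun({
  tasks: 40, secondsPerTask: 30, tokensPerTask: 8000,
  concurrency: 10, overheadTokensPerAgent: 2000,
});

console.log(sequential); // { wallClockSeconds: 1200, totalTokens: 320000 }
console.log(parallel);   // { wallClockSeconds: 120, totalTokens: 400000 }
```

Under these assumptions the parallel run finishes in a tenth of the time for 25% more tokens. The ratio moves quickly with the overhead term, which is why tasks that fan out cleanly are the ones worth running this way.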

When does this make sense?

  • When parallelism beats cost. Workflows shift the trade-off: you pay more tokens but gain wall-clock speed and capability at scale that single-agent approaches cannot achieve, especially when developer time is the binding constraint.
  • When adversarial verification matters. Dynamic workflows enable quality patterns that single agents can't achieve: have Agent A draft a plan, Agents B and C review it from opposing angles, then Agent D synthesizes. This "independent agents adversarially reviewing each other's findings" pattern (per Anthropic docs) is how you catch hallucinations and logic errors before they ship.

Figure: workflow showing task decomposition into parallel subagents with a cross-validation phase.
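
The draft, review, and synthesize pattern described above can be sketched like this. `askAgent` and the role names are hypothetical stand-ins with canned responses, so this shows the shape of the pipeline rather than any real agent API.

```javascript
// Hypothetical stand-in for a subagent call; returns canned output per role
// so the pipeline shape can be run without any model.
async function askAgent(role, input) {
  const canned = {
    drafter: "Plan: migrate auth module first, then session store.",
    "reviewer-risk": { objections: ["No rollback step for the session store."] },
    "reviewer-scope": { objections: [] },
  };
  if (role === "synthesizer") {
    const { draft, reviews } = input;
    const objections = reviews.flatMap((r) => r.objections);
    return { draft, objections, approved: objections.length === 0 };
  }
  return canned[role];
}

async function adversarialReview(task) {
  // Agent A drafts.
  const draft = await askAgent("drafter", task);

  // Agents B and C review the same draft concurrently and independently;
  // neither sees the other's output.
  const reviews = await Promise.all([
    askAgent("reviewer-risk", draft),
    askAgent("reviewer-scope", draft),
  ]);

  // Agent D reconciles the draft with both reviews.
  return askAgent("synthesizer", { draft, reviews });
}
```

The design choice that matters is the isolation between reviewers: because each one only receives the draft, an error in one review cannot leak into the other before synthesis.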

Production Findings: Bounded Autonomy Meets Unbounded Workflows

Per arXiv:2512.04123's study of 86 practitioners with deployed systems across 26 domains (plus 20 case studies), production agents are deliberately constrained:

  • Teams use bounded autonomy as a practical control measure for production stability
  • 68% of agents execute ≤10 steps before human intervention

Dynamic workflows don't violate this principle—they scale it. Each subagent still operates with bounded autonomy (narrow prompt, specific task, limited tool access). But the workflow script coordinates dozens or hundreds of these bounded agents. The entire run is subject to approval gates and resumability for inspection—control is maintained through the script's explicit coordination logic and the ability to pause, inspect, and restart at any phase, not by shrinking agent count.
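
As a sketch of how bounded subagents and an approval gate could compose, consider the following. Everything here is hypothetical: `runBounded`, `runAll`, `exampleStep`, and the `approve` callback are illustrative names, and the default budget of 10 steps simply echoes the figure from the study.

```javascript
// Hypothetical bounded subagent: runs step() until it reports done or the
// step budget is spent, then hands control back instead of continuing.
async function runBounded(task, step, maxSteps = 10) {
  let state = { task, steps: 0, done: false };
  while (!state.done && state.steps < maxSteps) {
    state = { ...(await step(state)), steps: state.steps + 1 };
  }
  return { ...state, needsHuman: !state.done };
}

// The script coordinates many bounded agents, then stops at a gate.
async function runAll(tasks, step, approve) {
  const results = await Promise.all(tasks.map((t) => runBounded(t, step)));
  const escalated = results.filter((r) => r.needsHuman);
  // Approval gate: nothing counts as accepted unless the reviewer callback
  // accepts the batch.
  const approved = await approve({ total: results.length, escalated });
  return { approved, results, escalated };
}

// Example step: each call does one unit of work on a task of a given size.
const exampleStep = async (state) => {
  const remaining = (state.remaining ?? state.task.size) - 1;
  return { ...state, remaining, done: remaining <= 0 };
};
```

A task of size 3 finishes in three steps. A task of size 25 runs out of budget after ten and is escalated, while the rest of the batch is unaffected. The agent count can grow without loosening the bound on any single agent.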

The comparison matrix from the docs clarifies when to use workflows vs. other patterns:

| Feature | Subagents | Skills | Agent Teams | Workflows |
| --- | --- | --- | --- | --- |
| Who decides next | Claude, turn by turn | Claude, following prompt | Lead agent, turn by turn | The script (the plan is upfront, but execution is inspectable and resumable) |
| Scale | Few delegated tasks | Same as subagents | Handful of peers | Dozens to hundreds |
| Interruption | Restarts turn | Restarts turn | Teammates keep running | Resumable in same session |

Activating Workflows: Two Paths

Explicit: Include ultracode: in your prompt, or use natural language ("use a workflow", "run a workflow").

Session-wide: Run /effort ultracode to combine xhigh reasoning effort with automatic workflow orchestration. Claude decides when each task warrants a workflow. Drop back with /effort high for routine work. The setting resets each session.

Pseudo-Code: What a Workflow Script Looks Like

The following is conceptual pseudo-code illustrating the orchestration pattern—not literal Claude Code API syntax. The actual workflow runtime handles spawning and coordination transparently; this shows orchestration intent, not a callable spawnAgent() function.

// Phase 1: Discovery — fan out across all TypeScript files
const files = await findAllMatchingFiles("src/**/*.ts");

// Conceptually: one analysis task per file, executed in parallel by the runtime
const agents = files.map(f => spawnAgent({
  prompt: `Analyze ${f} for auth bypass vulnerabilities. Return JSON:
    { file, severity: "high"|"medium"|"low"|"none", finding, lineRange }`,
  tools: ["read_file", "search_files"]
}));
const findings = await Promise.all(agents); // runtime executes in parallel

// Phase 2: Cross-check: send high- and medium-severity findings for independent review
const suspicious = findings.filter(f => f.severity === "high" || f.severity === "medium");

const crossCheckAgents = suspicious.map(finding => spawnAgent({
  prompt: `A prior agent flagged this potential vulnerability:
    File: ${finding.file}, Lines: ${finding.lineRange}
    Finding: ${finding.finding}
    Independently verify: is this a real auth bypass? Check call sites and access control.
    Return JSON: { confirmed: boolean, confidence: "high"|"medium"|"low", reasoning }`,
  tools: ["read_file", "search_files"]
}));
const crossChecked = await Promise.all(crossCheckAgents);

// Phase 3: Synthesis — produce final report from confirmed findings only
const confirmed = suspicious.filter((_, i) => crossChecked[i].confirmed);

const report = await spawnAgent({
  prompt: `You are a security analyst. Synthesize these confirmed vulnerabilities into
    an executive summary with remediation priorities.
    Input: ${JSON.stringify(confirmed, null, 2)}
    Output: markdown report with severity table and recommended fixes.`,
  tools: []
});

return {
  totalFilesScanned: files.length,
  findingsReviewed: suspicious.length,
  confirmedVulnerabilities: confirmed.length,
  report // the synthesis agent's markdown, returned directly like the JSON results above
};

Note: spawnAgent() above is illustrative. The actual Claude Code workflow runtime infers parallelism from independent tasks in the script; consult the official docs at https://code.claude.com/docs/en/workflows for the real API contract.

Key structural points: Phase 1 fans out (one agent per file), Phase 2 narrows (cross-checks only escalated findings), Phase 3 synthesizes. State lives in script variables—Claude's session context never accumulates the intermediate findings.
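
One detail worth hardening in a fan-out like Phase 1: `Promise.all` rejects as soon as any one promise rejects, so a single failed subagent would discard the whole batch. A minimal sketch using `Promise.allSettled` keeps the partial results instead. `analyzeFile`, `fanOut`, and `fanOutWithRetry` are hypothetical helpers, and the stub fails on purpose for one file.

```javascript
// Hypothetical stand-in for a per-file subagent; one file fails on purpose.
async function analyzeFile(file) {
  if (file.endsWith("broken.ts")) throw new Error(`could not analyze ${file}`);
  return { file, severity: file.includes("auth") ? "high" : "none" };
}

// Fan out over all files, keep whatever succeeded, and report failures
// separately so they can be retried instead of aborting the run.
async function fanOut(files, worker) {
  const settled = await Promise.allSettled(files.map(async (f) => worker(f)));
  const findings = [];
  const failures = [];
  settled.forEach((s, i) => {
    if (s.status === "fulfilled") findings.push(s.value);
    else failures.push({ file: files[i], reason: s.reason.message });
  });
  return { findings, failures };
}

// One retry pass over the failed files only.
async function fanOutWithRetry(files, worker) {
  const first = await fanOut(files, worker);
  const retry = await fanOut(first.failures.map((f) => f.file), worker);
  return {
    findings: [...first.findings, ...retry.findings],
    failures: retry.failures,
  };
}
```

`Promise.allSettled` preserves input order, which is what lets the code map each failure back to its file by index. The same property is what the pseudo-code's `crossChecked[i]` lookup relies on with `Promise.all`.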

Sources & further reading

  • arXiv:2512.04123 — "Measuring Agents in Production" (Pan et al., UC Berkeley, ICML 2026)
  • arXiv:2604.17009 — "Small Model as Master Orchestrator: Learning Unified Agent-Tool Orchestration with Parallel Subtask Decomposition"
  • https://code.claude.com/docs/en/workflows — Official Claude Code dynamic workflows documentation
