How Subagents are Built

Subagents are one of the most important primitives in harness engineering, so let's peruse what exists and compare the implementations.

Anson

21 Jun 2026 • 6 min read

Photo by Stephane Gagnon / Unsplash

Watching the development of AI harnesses is really interesting for a ton of reasons. It is ground zero for agentic development, a large amount of the work is open source, and every major player thinks they are in a winner-takes-all race, so the engineering teams are large and incredibly talented. Claude Code alone will probably be processing $100bn a month in tokens within the next year, so every single choice has large consequences: if they aren't executing better than everyone else they will quickly get left behind. Because so much of it is open source, this is also the best view we have into how engineering is happening at these labs, which are likely a few months ahead of everyone else when it comes to the latest software development practices.

For a long time I really thought that all of the AI labs would start to converge on implementations across their stack, but it's actually been the complete opposite. Models all behave so differently that I can pretty consistently call out the lab of origin and size for any model when I run openrouter random through my OpenClaw. Agent harnesses are no different: pretty much every detail varies, especially under the hood on the agent side, where tools, text formats, and prompts are all completely heterogeneous.

The reason we are here today is mostly because Pi (pi.dev/) doesn't ship with subagents. Pi is a really incredible agent harness, and I've been forcing myself to use it without extensions for about a month now to kind of cleanse my palate of all the incredible bells and whistles Claude Code ships with, in an attempt to figure out what I really need. By far the thing I miss the most is subagents. Subagents are critical for cost, context management, and speed. A single agent does surprisingly well at a lot of tasks I thought it would struggle with alone, and going without subagents put me back in the hot seat as the orchestrator. For example, if I need to compare the subagent implementations of OpenCode, Hermes, Codex, and Claude Code, right now I just open 4 terminal tabs, paste the same question into each, then copy the answers into one conversation and work from there. It is kind of fun getting a layer closer to the engineering again :P

Subagent Primitives

Now if subagents are just as important as your agent being able to run bash, then why didn't Pi bundle some minimal subagent tooling? It turns out it's because everyone is building them differently and there is no "correct" implementation.

As I see it, there are 3 different tools here:

tool handoff
- Especially bash, but really any tool should be capable of being handed off to a subagent, especially if your build system spits out 100k tokens. The tool's form should roughly be agent_handoff(tool, args, prompt)
  - agent_handoff(bash, 'make build', 'on failure tell me what the error is')
- No turns, tools, or context. Subagent is just fed the tool call, its output, and what we want to know about the output.
subagent
- This agent usually has the same tools and environment as the agent you are interfacing with, but it has to one-shot its problem. This is pretty much just calling pi -p "{prompt}"
- No communication with other agents or its host agent, but it also isn't too much of a stretch to keep subagents alive so that they can take multiple turns, receive additional instructions, or elaborate on their response.
orchestrator
- Claude calls these agent teams, and Claude Code is the only harness that fully ships this out of the gate. Your agent becomes a manager and spins up persistent subagents that can talk not only to your orchestrator but even to each other.
- There is a lot of infrastructure that needs to work well for this to work. I think the biggest hurdle is prompting, since all agents currently just want to optimize for user satisfaction, which really falls apart when you are collaborating on a large task. You need some sort of tension and push/pull between the agents that current models just aren't good at.
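
The tool handoff primitive above is small enough to sketch in shell. This is only an illustration under a few assumptions: agent_handoff is a hypothetical helper rather than anything Pi ships, the subagent is a one-shot `pi --print --no-tools` call, and it can read the captured output through an @path argument (both borrowed from the Pi invocation later in this post). The prompt comes first here, instead of last as in agent_handoff(tool, args, prompt), simply so the command can take any number of arguments.

```shell
# Hypothetical helper, not a pi feature: agent_handoff PROMPT CMD [ARGS...]
# Assumption: the subagent is a one-shot `pi --print --no-tools` call that can
# read a file passed as @path. Override HANDOFF_AGENT to swap in another CLI.
HANDOFF_AGENT=${HANDOFF_AGENT:-"pi --print --no-tools"}

agent_handoff() {
  local prompt log status agent_status
  prompt="$1"
  shift
  log=$(mktemp) || return 1
  status=0
  agent_status=0

  # Run the tool and capture everything it prints so the caller never sees it.
  "$@" >"$log" 2>&1 || status=$?

  # The subagent gets the call, its exit status, and its output. No turns,
  # no tools, no parent context. HANDOFF_AGENT is left unquoted on purpose so
  # that its flags split into separate words.
  $HANDOFF_AGENT "@$log" "Command: $* (exit status $status). $prompt" || agent_status=$?

  rm -f "$log"
  return "$agent_status"
}

# Usage (not run here):
#   agent_handoff 'on failure tell me what the error is' make build
```

The point of the shape is that only the subagent's answer lands in the parent's context; the 100k tokens of build output stay in the temp file.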

Subagent Implementations

Now back to every harness doing this differently. Nobody is really converging on one idea of a subagent, which is exactly why pi isn't shipping them. There is no obvious “correct” primitive yet.

Claude Code

Claude Code is the most ambitious of the bunch and definitely has the most developed subagent implementation. What makes it different is that it supports multiple delegation modes instead of one: fresh specialist agents that start from zero context, forked workers that inherit parent context, background execution, resumable sidechain transcripts, and coordinator-style team workflows. Claude Code is treating subagents as a real runtime for collaborative agent work. The important caveat is that this power comes with complexity. Once you move into persistent sidechains, background notifications, resume semantics, and team coordination, you are no longer just spawning a worker. You are managing a whole system, which means it's easy to end up in a Gas Town scenario where you are 100x'ing your token spend to 1.1x your productivity.

Dynamic workflows came out right as I started writing this and threw my original conclusion for a loop. I was going to say Pi will end up being a leader in this space because it is moldable and dynamic enough to become whatever workflow you need on the fly. I think dynamic workflows are going to become that inside Claude Code. Hopefully Anthropic keeps writing high-quality blog posts, since their releases are slowly moving deeper behind the veil and getting more abstracted from the user as time goes on.

Codex

I think Codex is focusing on building out infrastructure and hasn't landed on a golden path yet. On paper it isn't that far behind Claude Code, and it should be able to handle things like agent teams, but in practice I never see Codex use subagents on its own. I have to very explicitly tell it to break a task down along a specific axis and assign a subagent to each unit.

Hermes

Hermes just does a pure subagent like I described in the primitives section above. They are new sessions that inherit their parent's permissions and tools and have to one-shot their task. Hermes is still pretty new so I'm sure this will expand, but it's also probably totally fine for a personal-assistant type of tool.

Something I think they could build that would be really cool is not just searching the transcripts of previous conversations, but being able to talk to those conversations as well and keep the threads going.

OpenCode

OpenCode spawns an addressable child process with limited permissions and tools, but otherwise the children are pretty durable and can handle multiple turns. This is probably the obvious path for building subagents with Pi: you just let your agent call pi in json/rpc mode and let it cook.

Pi

So I actually tried the approach I mentioned in the OpenCode section, and Pi did it really well.

can you call multiple pi sessions to have them all review my resume at once with different prespectives and give me the results?

 $ set -euo pipefail
 mkdir -p /tmp/pi-resume-review
 PROMPT_BASE='Review the attached resume/index.html / rendered resume content for Anson Biggs. Do not edit files. Return concise, evidence-backed feedback with: strengths, top concerns, concrete changes, and any
 suggested rewritten bullets. Focus only on your assigned perspective.'

 pi --print --no-tools --name resume-review-ats @resume/index.html "$PROMPT_BASE Perspective: ATS/recruiter screening for Lead/Staff Generative AI, autonomy, aerospace/defense roles. Consider keyword fit,
 scanability, titles, dates, metrics, and red flags." > /tmp/pi-resume-review/ats.txt 2>/tmp/pi-resume-review/ats.err &
 P1=$!
 pi --print --no-tools --name resume-review-hm @resume/index.html "$PROMPT_BASE Perspective: technical hiring manager for AI platform/agentic engineering. Evaluate credibility, technical depth, impact, scope,
 architecture signals, and interview hooks." > /tmp/pi-resume-review/hm.txt 2>/tmp/pi-resume-review/hm.err &
 P2=$!
 pi --print --no-tools --name resume-review-defense @resume/index.html "$PROMPT_BASE Perspective: aerospace/defense cleared programs recruiter/manager. Evaluate clearance wording, restricted data/CUI/ITAR
 positioning, mission relevance, and compliance/security concerns." > /tmp/pi-resume-review/defense.txt 2>/tmp/pi-resume-review/defense.err &
 P3=$!
 pi --print --no-tools --name resume-review-exec @resume/index.html "$PROMPT_BASE Perspective: executive/CTO reviewing for senior leadership potential. Evaluate strategic narrative, business outcomes, leadership
 scope, influence, and whether the resume sells seniority." > /tmp/pi-resume-review/exec.txt 2>/tmp/pi-resume-review/exec.err &
 P4=$!
 FAIL=0
 for p in $P1 $P2 $P3 $P4; do if ! wait "$p"; then FAIL=1; fi; done
 if [ "$FAIL" -ne 0 ]; then
   echo "One or more pi sessions failed" >&2
   for f in /tmp/pi-resume-review/*.err; do echo "--- $f ---" >&2; tail -50 "$f" >&2 || true; done
   exit 1
 fi
 printf 'Outputs saved in /tmp/pi-resume-review\n'

You don't even need a special mode, just -p does most of what you would need. Now, I've never seen Pi do this on its own, so I don't think this is truly a solution, but it does mean the gap might be small enough that a skill is all you need to "implement" subagents.
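
If that pattern did get wrapped up for reuse, the four near-identical calls could collapse into one small helper. The sketch below is my own generalization of the script Pi wrote, not something Pi produced or ships: fan_out and REVIEW_AGENT are hypothetical names, and it assumes the agent CLI accepts the same --name, @file, and prompt arguments used above.

```shell
# Hypothetical helper: fan_out OUTDIR FILE BASE_PROMPT NAME=PERSPECTIVE...
# Assumption: REVIEW_AGENT accepts the same flags as the pi calls above.
REVIEW_AGENT=${REVIEW_AGENT:-"pi --print --no-tools"}

fan_out() {
  local outdir="$1" file="$2" base="$3"
  shift 3
  local spec name pid pids="" fail=0
  mkdir -p "$outdir"

  for spec in "$@"; do
    name=${spec%%=*}   # text before the first "=" names the session and files
    # One background session per perspective, each with its own output files.
    $REVIEW_AGENT --name "review-$name" "@$file" "$base Perspective: ${spec#*=}" \
      >"$outdir/$name.txt" 2>"$outdir/$name.err" &
    pids="$pids $!"
  done

  # Wait on every session and remember whether any of them failed.
  for pid in $pids; do
    wait "$pid" || fail=1
  done
  return "$fail"
}

# Usage (not run here):
#   fan_out /tmp/pi-resume-review resume/index.html "$PROMPT_BASE" \
#     ats='ATS/recruiter screening' hm='technical hiring manager'
```

Each session still one-shots its task, so this stays firmly in the plain subagent category from the primitives section; nothing here lets the sessions talk to each other.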

Conclusion

All of this for me to just glaze Pi and Claude Code as usual. I do think all software will start to take the Pi approach going forward, now that LLM improvements are pushing the cost of code generation toward zero. Why should I have to settle for the way Anthropic wants my world to look if I can have my software generated on the fly, not just to meet my specific wants, but to be the exact bridge that gets me through my immediate task?