Agent frameworks

Framework	Interaction	License	Multi-model	Agent teams	Free tier
Claude Code · SDK	Interactive CLI + API	Proprietary	Claude only	Subagents + hooks	No
Pi	Interactive terminal	MIT	10+ providers	Via extensions	Own API key
Codex	Async cloud	Proprietary	GPT-5 only	Parallel sandboxes	Limited
Gemini CLI	Interactive terminal	Apache 2.0	Gemini only	Limited	1,000 req/day
Google ADK	Build-your-own	Apache 2.0	Any model	Graph orchestration	Framework free

Bold values indicate a differentiating advantage. Sources: Anthropic · Pi (GitHub) · OpenAI · Google ADK — May 2026.

SWE-bench performance

Agentic task resolution (SWE-bench Verified): 87.6 · 88.7 · 76.2

Terminal task completion (Terminal-Bench 2.0): 65.4 · 64.7 · 56.2

System prompt footprint (tokens; lower = leaner): ~10K · <1K · ~8K

SWE-bench and Terminal-Bench scores reflect the underlying model (Claude Opus 4.7, GPT-5.5, Gemini 2.5 Pro). Pi’s score depends on the model, because it runs any provider. System prompt sizes, in tokens: Pi <1,000 and Claude Code ~10,000, both per mariozechner.at (Nov 2025). Sources: Anthropic system card · Pi blog post.

Pick the right framework

Agent teams that need compliance, audit trails, and subagent control

Use Claude Code + Agent SDK — first-class subagent support with permissioned tool sets, PreToolUse/PostToolUse hooks for change-control logging, plan mode (read-only), and session branching. The only framework with a programmatic API that mirrors the interactive CLI exactly. Best for regulated, multi-agent workflows where you need full observability.
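
As a rough illustration of change-control logging, the sketch below shows the kind of script a PreToolUse or PostToolUse hook command could call. It assumes the hook receives a JSON payload on standard input with fields such as `hook_event_name`, `session_id`, `tool_name`, and `tool_input`; check the exact payload against Anthropic's hooks documentation before relying on it. The function names, log path, and record layout are hypothetical.

```python
import json
from datetime import datetime, timezone


def build_record(payload: dict) -> dict:
    """Reduce a hook payload to the fields worth keeping in an audit trail."""
    return {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        # The field names below are assumptions about the hook payload.
        "event": payload.get("hook_event_name", "unknown"),
        "session": payload.get("session_id", "unknown"),
        "tool": payload.get("tool_name", "unknown"),
        "input": payload.get("tool_input", {}),
    }


def append_audit_line(raw_payload: str, log_path: str) -> dict:
    """Parse one hook payload and append it to a JSON Lines audit log."""
    record = build_record(json.loads(raw_payload))
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record
```

A hook entry point would pass `sys.stdin.read()` and a log path to `append_audit_line`, then exit with status 0, on the assumption that a zero exit status lets the tool call proceed.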

Background task queues — fire tasks, collect PRs

Use Codex — the only framework built for async, parallel cloud execution. Submit 20 tasks simultaneously across different repos; each runs in an isolated sandbox and returns a PR. RL-trained on real software engineering tasks to produce clean, human-style diffs. Best when you want agents working in the background while the team does other things.
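
The fire-and-collect workflow can be pictured as a simple fan-out. The sketch below is not the Codex API: `run_task` is a hypothetical stub that simulates a background task and returns a placeholder PR URL, and the repo names are made up. It only shows the shape of submitting many tasks at once and gathering their results.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class TaskResult:
    repo: str
    pr_url: str


async def run_task(repo: str, prompt: str) -> TaskResult:
    """Hypothetical stand-in for one background task.

    A real integration would call the provider's API here; this stub only
    simulates the wait and returns a placeholder PR URL.
    """
    await asyncio.sleep(0.01)
    return TaskResult(repo=repo, pr_url=f"https://example.invalid/{repo}/pull/1")


async def fan_out(tasks: list[tuple[str, str]], limit: int = 20) -> list[TaskResult]:
    """Run all tasks concurrently, with at most `limit` in flight at once."""
    gate = asyncio.Semaphore(limit)

    async def guarded(repo: str, prompt: str) -> TaskResult:
        async with gate:
            return await run_task(repo, prompt)

    # gather returns results in the same order the tasks were submitted.
    return await asyncio.gather(*(guarded(r, p) for r, p in tasks))
```

Calling `asyncio.run(fan_out([("repo-a", "fix flaky test"), ("repo-b", "bump deps")]))` returns one `TaskResult` per submitted task.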

Cost-sensitive, local, or air-gapped deployments

Use Pi: MIT licensed, model-agnostic (Anthropic, OpenAI, Gemini, Mistral, Ollama, and more), and a <1,000-token system prompt, which is ~10× less system-prompt overhead than Claude Code's ~10,000 tokens. Fork it, embed it, run it fully offline with Ollama. Best when data sovereignty matters, budget is tight, or you need full control over the framework itself.
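
To make the overhead figure concrete, here is a small worked calculation using the prompt sizes quoted above (about 10,000 tokens for Claude Code, under 1,000 for Pi). It assumes the full system prompt is sent with every request and ignores prompt caching; the 50-turn session length is a hypothetical value chosen for illustration.

```python
def prompt_overhead(prompt_tokens: int, turns: int) -> int:
    """Tokens spent sending the system prompt over a session of `turns` requests."""
    return prompt_tokens * turns


CLAUDE_CODE_PROMPT = 10_000  # "~10,000" per the figures above
PI_PROMPT = 1_000            # upper bound of "<1,000"
TURNS = 50                   # hypothetical session length

ratio = CLAUDE_CODE_PROMPT / PI_PROMPT
saved = prompt_overhead(CLAUDE_CODE_PROMPT, TURNS) - prompt_overhead(PI_PROMPT, TURNS)
print(f"{ratio:.0f}x smaller prompt, {saved:,} fewer prompt tokens over {TURNS} turns")
# prints: 10x smaller prompt, 450,000 fewer prompt tokens over 50 turns
```

Because Pi's figure is an upper bound, the real ratio is at least this large under the same assumptions.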

Custom multi-agent systems on Google Cloud

Use Google ADK — a full graph-based agent orchestration framework (Python, TypeScript, Go, Java) with 100+ enterprise connectors (SAP, Salesforce, Workday, BigQuery), an A2A (agent-to-agent) communication protocol, and native Vertex AI deployment. Best when building bespoke agent workflows on GCP with existing enterprise system integrations.
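
Graph orchestration, in general terms, means running agents as nodes in a dependency graph. The sketch below illustrates that idea with the Python standard library only; it does not use ADK or mirror its API, and the agent names and state layout are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Callable

# Each "agent" is a plain function from shared state to an update of that state.
Agent = Callable[[dict], dict]


def run_graph(agents: dict[str, Agent], deps: dict[str, set[str]], state: dict) -> dict:
    """Run agents in dependency order, merging each agent's output into the state."""
    for name in TopologicalSorter(deps).static_order():
        state = {**state, **agents[name](state)}
    return state


agents = {
    "fetch": lambda s: {"rows": [3, 1, 2]},
    "sort": lambda s: {"rows": sorted(s["rows"])},
    "report": lambda s: {"summary": f"{len(s['rows'])} rows, max {s['rows'][-1]}"},
}
# Map each node to the set of nodes it depends on.
deps = {"fetch": set(), "sort": {"fetch"}, "report": {"sort"}}

result = run_graph(agents, deps, {})
print(result["summary"])  # prints: 3 rows, max 3
```

A production framework adds branching, retries, and model calls on top of this ordering step, but the dependency-first execution is the core of the pattern.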

Going deeper

Claude Code & SDK
Anthropic's CLI + SDK: subagents, hooks, MCP, compliance-ready.

Pi
MIT-licensed, multi-model, lean system prompt. We use this.

Codex
OpenAI's async cloud agent: parallel tasks, GitHub-native.

Google ADK · Jules · CLI
Four Google products: ADK for custom agents, Jules for async tasks.
