Stop prompting agents. Build the loop that prompts them.
You shouldn’t be prompting coding agents anymore. You should be designing loops that prompt your agents.
I don’t prompt Claude anymore. I have loops running. They’re the ones prompting Claude and figuring out what to do. My job is to write loops.
The newest floor in a four-story stack
Each layer makes the one below it more useful. Prompt, context, and harness make a single run better. The loop makes recurring work reliable — when the human is no longer in the inner loop.
Seven steps, then a decision
A loop is not a cron job. A cron job repeats blindly. A loop discovers, verifies, persists, and decides what is next, then stops for an honest reason.
A loop that cannot stop is a bug. Build the stop conditions first. They are what let you walk away.
Five building blocks, plus memory
The same parts show up whether you build in Claude Code or Codex. Only the names of the levers change.
| Block | Its job | Claude Code | Codex |
|---|---|---|---|
| Automations | the heartbeat that fires the loop | /loop, /goal, hooks | Automations tab |
| Worktrees | isolate parallel agents | git worktree | worktree per thread |
| Skills | codify project knowledge | SKILL.md | SKILL.md |
| Connectors | touch real tools and data | MCP + plugins | Connectors (MCP) |
| Sub-agents | keep maker separate from checker | .claude/agents/ | .codex/agents/ |
| + Memory | done and next, on disk | markdown, issue tracker | markdown, issue tracker |
The command that makes a loop a loop
Run the agent non-interactively, with pre-approved tools and machine-readable output. A loop is just this command driven by something other than you — a while, a cron, a CI run — with a gate that skips when work is already green, and a cap.
/loop · repeat on a cadence
Run again on a schedule, roughly like cron. Useful, but it does not know whether it succeeded.
/goal · run until verified
Repeat until a verifiable condition is true, checked by a separate model each turn. The workhorse of serious loops.
# Claude Code: one-shot, machine-readable, pre-approved tools claude -p "fix the cause of these failures; don't weaken tests" \ --allowedTools "Read,Edit,Bash" --output-format json # returns the result and total_cost_usd # Codex: read-only by default; opt up explicitly; JSONL + resume codex exec --json "summarize the failing tests and fix the cause" codex exec resume --last "now fix the race condition you found"
The verifier is the asset; the maker is a commodity. Makers improve for free with every model release. The encoded, executable definition of correct for us is the thing you actually own.
Read the hands-on Claude Code & Codex guide → · let an agent design one for you, with the skill →
What it asks of you, and what it costs
The point of a loop is to automate the inner loop — the per-turn prompting — while you stay the engineer at the outer loop: verification, comprehension, and intent.
- goal metthe verifier confirms the done-condition
- budget spentan iteration, token, time, or dollar ceiling tripped
- stalledthe same failure twice, with no new evidence
- needs a humanhigh-risk or ambiguous, so it escalates — a success state, not a failure
- Fix only the cause. Do not widen scope.
- Never weaken or delete a test to make it pass.
- The maker does not declare itself done. A separate checker does.
- Write progress to a file each turn; read it first the next turn.
- Keep an explicit stop, so the loop can halt without you.
Money. An N-step loop trends toward O(N²), because each turn re-bills the history. Anthropic reports agents use roughly 4× the tokens of chat, and multi-agent systems roughly 15×. Bound it with budgets, gating, caching the repeated context, compaction, and right-sized models.
You. The orchestration tax. You can only merge as many diffs as you can read. Scale the fleet to your review rate — usually a low single digit — not to the tool’s lane count.
| Debt | Lives in | Agent can pay it down? |
|---|---|---|
| Technical | the code | Yes |
| Comprehension | people | Recoverable: ask it to explain |
| Intent (the why) | artifacts | No. It must come from a human |
Watch for cognitive surrender: when the model’s output quietly becomes your answer and you stop forming an independent view.
Make the loop cheap to trust.
The separate checker, the disk-backed memory, the budgets, the gate, the honest stop — every part is in service of that one sentence. If a step does not make the loop’s output cheaper to trust, it is decoration.