loop engineering
A field guide

Stop prompting agents. Build the loop that prompts them.

In their words
“You shouldn’t be prompting coding agents anymore. You should be designing loops that prompt your agents.”
Peter Steinberger, creator of OpenClaw

“I don’t prompt Claude anymore. I have loops running. They’re the ones prompting Claude and figuring out what to do. My job is to write loops.”
Boris Cherny, creator of Claude Code
Where it sits

The newest floor in a four-story stack

Each layer makes the one below it more useful. Prompt, context, and harness make a single run better. The loop makes recurring work reliable once the human is no longer in the inner loop.

PROMPT · Prompt engineering: the words you give the model. Asks: What should I say to the model?
CONTEXT · Context engineering (Karpathy): the state and knowledge it sees. Asks: What should the model see?
HARNESS · The tools, permissions, tests, and sandbox around one agent run. Asks: What should surround one run?
LOOP · The recurring system that fires, verifies, remembers, and decides. Asks: What system runs the work for me?

The shape of a loop

Seven steps, then a decision

A loop is not a cron job. A cron job repeats blindly. A loop discovers, verifies, persists, and decides what is next, then stops for an honest reason.

1. Trigger. A schedule or an event fires it. The heartbeat, not you.
2. Discover. Read CI failures, issues, alerts, or a task list to find the work.
3. Delegate. The maker proposes the smallest change that could be right.
4. Act. In an isolated worktree or sandbox, so parallel agents do not collide.
5. Verify. A separate checker grades the result. The maker does not get to mark its own work.
   ✖ fail → feed the evidence back and retry
   ✓ pass → persist state, then decide
6. Persist. Write done and next to disk. Read it first on the following turn.
7. Decide. Choose the next action, or stop: goal met, budget spent, stalled, needs a human.

A loop that cannot stop is a bug. Build the stop conditions first. They are what let you walk away.
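
The seven steps can be sketched as a small driver. This is a minimal sketch under stated assumptions, not a reference implementation: `run_loop`, `discover`, `make`, `check`, `Verdict`, and the JSON layout of the state file are all hypothetical names, and the maker and checker are plain callables standing in for agent runs. The trigger (step 1) is whatever calls `run_loop`.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class Verdict:
    passed: bool
    evidence: str  # what the checker saw; fed back to the maker on failure


def run_loop(
    discover: Callable[[], Optional[str]],   # step 2: find work, or None if there is none
    make: Callable[[str, str], str],         # steps 3-4: (task, feedback) -> proposed change
    check: Callable[[str, str], Verdict],    # step 5: a separate checker grades the change
    state_file: Path,                        # step 6: done/next, on disk
    max_turns: int = 5,                      # the budget, fixed before anything runs
) -> str:
    """Run one firing of the loop and return the reason it stopped."""
    # Read persisted state first, so unfinished work from last time is resumed.
    if state_file.exists():
        state = json.loads(state_file.read_text())
    else:
        state = {"done": [], "next": None}

    task = state["next"] or discover()
    if task is None:
        return "goal met"  # nothing to do, so no agent run is spent

    feedback = ""
    for _ in range(max_turns):
        change = make(task, feedback)
        verdict = check(task, change)
        if verdict.passed:
            state["done"].append(task)
            state["next"] = None
            state_file.write_text(json.dumps(state))
            return "goal met"
        # Failed: keep the evidence for the retry and record the task as unfinished.
        feedback = verdict.evidence
        state["next"] = task
        state_file.write_text(json.dumps(state))
    return "budget spent"
```

Only two stop reasons appear here to keep the sketch short; a fuller version would also detect a stall and escalate to a human.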

What it is made of

Five building blocks, plus memory

The same parts show up whether you build in Claude Code or Codex. Only the names of the levers change.

Block      | Its job                           | Claude Code            | Codex
-----------|-----------------------------------|------------------------|------------------------
Automations| the heartbeat that fires the loop | /loop, /goal, hooks    | Automations tab
Worktrees  | isolate parallel agents           | git worktree           | worktree per thread
Skills     | codify project knowledge          | SKILL.md               | SKILL.md
Connectors | touch real tools and data         | MCP + plugins          | Connectors (MCP)
Sub-agents | keep maker separate from checker  | .claude/agents/        | .codex/agents/
+ Memory   | done and next, on disk            | markdown, issue tracker| markdown, issue tracker
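
As one illustration of the Worktrees row, here is a minimal sketch of giving each agent its own checkout. `git worktree add -b <branch> <path> [<commit-ish>]` is standard git; the helper names `worktree_add_cmd` and `create_worktree`, the `-wt-` directory suffix, and the `agent/` branch prefix are assumptions made for the example.

```python
import subprocess
from pathlib import Path


def worktree_add_cmd(repo: Path, task_id: str, base: str = "HEAD") -> list[str]:
    """Build the git command that creates an isolated checkout for one agent."""
    path = repo.parent / f"{repo.name}-wt-{task_id}"  # hypothetical naming scheme
    branch = f"agent/{task_id}"                       # hypothetical branch prefix
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


def create_worktree(repo: Path, task_id: str) -> None:
    # check=True raises CalledProcessError if git refuses,
    # for example when the branch or the directory already exists.
    subprocess.run(worktree_add_cmd(repo, task_id), check=True)
```

Building the command separately from running it keeps the naming scheme easy to inspect before any agent touches the repository.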

Build it headless

The command that makes a loop a loop

Run the agent non-interactively, with pre-approved tools and machine-readable output. A loop is just this command driven by something other than you (a while loop, a cron entry, a CI run), with a gate that skips the run when the work is already green and a cap on iterations or spend.

/loop · repeat on a cadence

Run again on a schedule, roughly like cron. Useful, but it does not know whether it succeeded.

/goal · run until verified

Repeat until a verifiable condition is true, checked by a separate model each turn. The workhorse of serious loops.

# Claude Code: one-shot, machine-readable, pre-approved tools
claude -p "fix the cause of these failures; don't weaken tests" \
  --allowedTools "Read,Edit,Bash" --output-format json
  # returns the result and total_cost_usd

# Codex: read-only by default; opt up explicitly; JSONL + resume
codex exec --json "summarize the failing tests and fix the cause"
codex exec resume --last "now fix the race condition you found"
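
The Claude Code command above can be driven from a script instead of a terminal. The sketch below is one hedged way to do it: it reuses the flags shown above and reads `total_cost_usd` from the JSON output, while `drive`, `run_claude_once`, `is_green`, and the default caps are hypothetical names and values chosen for the example.

```python
import json
import subprocess
from typing import Callable

# Same one-shot invocation as the command above.
CLAUDE_CMD = [
    "claude", "-p", "fix the cause of these failures; don't weaken tests",
    "--allowedTools", "Read,Edit,Bash",
    "--output-format", "json",
]


def run_claude_once() -> dict:
    """Run the one-shot command and parse its JSON output."""
    proc = subprocess.run(CLAUDE_CMD, capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)


def drive(
    is_green: Callable[[], bool],                   # the gate, e.g. "does CI pass?"
    run_once: Callable[[], dict] = run_claude_once,
    max_runs: int = 3,                              # cap on iterations (assumed default)
    max_usd: float = 2.00,                          # cap on spend (assumed default)
) -> float:
    """Call the agent until the gate is green or a cap trips; return dollars spent."""
    spent = 0.0
    for _ in range(max_runs):
        if is_green():        # gate: skip the run when the work is already green
            break
        if spent >= max_usd:  # cap: do not start another run once the budget is used
            break
        out = run_once()
        spent += float(out.get("total_cost_usd", 0.0))
    return spent
```

Passing `run_once` as a parameter lets you exercise the gate and the caps with a fake agent before pointing the loop at a real one.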

The verifier is the asset; the maker is a commodity. Makers improve for free with every model release. The encoded, executable definition of “correct for us” is the thing you actually own.
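
One simple form of an executable definition of correct is a checker that runs the project's own test command and grades by exit code. This is a minimal sketch: `verify` and `Verdict` are hypothetical names, and the 2000-character evidence tail and the timeout are arbitrary choices.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class Verdict:
    passed: bool
    evidence: str  # output the maker sees on its next attempt


def verify(command: list[str], timeout_s: int = 600) -> Verdict:
    """Run the project's check and grade by exit code, not by the maker's claim."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return Verdict(False, f"timed out after {timeout_s}s")
    # Keep only the end of the output, where test failures are usually reported.
    tail = (proc.stdout + proc.stderr)[-2000:]
    return Verdict(proc.returncode == 0, tail)
```

A call such as `verify(["pytest", "-q"])` would grade a Python project, assuming pytest is what that project uses; the command is the part that encodes what correct means for you.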

Stay honest

What it asks of you, and what it costs

The point of a loop is to automate the inner loop — the per-turn prompting — while you stay the engineer at the outer loop: verification, comprehension, and intent.

The four honest stops
  • goal met: the verifier confirms the done-condition
  • budget spent: an iteration, token, time, or dollar ceiling tripped
  • stalled: the same failure twice, with no new evidence
  • needs a human: high-risk or ambiguous, so it escalates; this is a success state, not a failure
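
The four stops can be written down as one small decision function. This is a sketch under assumptions: `Stop`, `decide`, and the `high_risk` flag are hypothetical, "stalled" is approximated as two identical failure messages in a row, and only a turn count stands in for the budget.

```python
from enum import Enum
from typing import Optional


class Stop(Enum):
    GOAL_MET = "goal met"
    BUDGET_SPENT = "budget spent"
    STALLED = "stalled"
    NEEDS_HUMAN = "needs a human"


def decide(
    passed: bool,          # the separate checker's verdict for this turn
    turns_used: int,
    max_turns: int,
    failures: list[str],   # failure evidence from each failed turn, oldest first
    high_risk: bool,       # set by whatever policy flags risky or ambiguous work
) -> Optional[Stop]:
    """Return the reason to stop, or None to run another turn."""
    if high_risk:
        return Stop.NEEDS_HUMAN   # escalate first, so a passing check cannot bypass review
    if passed:
        return Stop.GOAL_MET
    if len(failures) >= 2 and failures[-1] == failures[-2]:
        return Stop.STALLED       # same failure twice, no new evidence
    if turns_used >= max_turns:
        return Stop.BUDGET_SPENT
    return None
```

Checking `high_risk` before `passed` is a design choice for this sketch; a loop that trusts its verifier on risky work could order them the other way.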
Bake these into every maker
  • Fix only the cause. Do not widen scope.
  • Never weaken or delete a test to make it pass.
  • The maker does not declare itself done. A separate checker does.
  • Write progress to a file each turn; read it first the next turn.
  • Keep an explicit stop, so the loop can halt without you.
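
The progress-file rule above can be as small as two functions. A minimal sketch, assuming a JSON file with `done` and `next` lists (the guide itself suggests markdown or an issue tracker; JSON is used here only because it is easy to parse); `read_progress` and `write_progress` are hypothetical names.

```python
import json
import os
from pathlib import Path


def read_progress(path: Path) -> dict:
    """Read done/next at the start of a turn; start empty if no file exists yet."""
    if not path.exists():
        return {"done": [], "next": []}
    return json.loads(path.read_text(encoding="utf-8"))


def write_progress(path: Path, done: list[str], next_up: list[str]) -> None:
    """Write via a temp file so a crash mid-write cannot leave a truncated file."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps({"done": done, "next": next_up}, indent=2), encoding="utf-8")
    os.replace(tmp, path)  # replaces the old file in one step on the same filesystem
```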
The two ceilings

Money. An N-step loop trends toward O(N²) in tokens, because each turn re-bills the history so far. Anthropic reports that agents use roughly 4× the tokens of chat, and multi-agent systems roughly 15×. Bound it with budgets, gating, caching the repeated context, compaction, and right-sized models.
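
The quadratic trend is easy to see with a little arithmetic. The sketch below assumes every turn adds the same number of tokens and that the whole history is re-sent as input each turn; it ignores output tokens and caching discounts. `billed_tokens` and its `window` parameter (a crude stand-in for compaction) are hypothetical.

```python
from typing import Optional


def billed_tokens(turns: int, tokens_per_turn: int, window: Optional[int] = None) -> int:
    """Total input tokens billed when each turn re-sends the history so far.

    window, if set, caps how many turns of history are kept.
    """
    total = 0
    for k in range(1, turns + 1):
        kept = k if window is None else min(k, window)
        total += kept * tokens_per_turn
    return total


print(billed_tokens(10, 2000))            # 110000
print(billed_tokens(20, 2000))            # 420000: twice the turns, about 3.8x the tokens
print(billed_tokens(10, 2000, window=3))  # 54000
```

Without a window the total is tokens_per_turn × N(N+1)/2, which is where the O(N²) comes from; capping the history makes growth linear once the window is full.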

You. The orchestration tax. You can only merge as many diffs as you can read. Scale the fleet to your review rate — usually a low single digit — not to the tool’s lane count.

The debt trilogy
Debt             | Lives in  | Agent can pay it down?
-----------------|-----------|--------------------------------
Technical        | the code  | Yes
Comprehension    | people    | Recoverable: ask it to explain
Intent (the why) | artifacts | No. It must come from a human

Watch for cognitive surrender: when the model’s output quietly becomes your answer and you stop forming an independent view.

The shortest version

Make the loop cheap to trust.

The separate checker, the disk-backed memory, the budgets, the gate, the honest stop — every part is in service of that one sentence. If a step does not make the loop’s output cheaper to trust, it is decoration.