Recovery and resume

Orcho recovery is state-driven. The correct next action depends on why the run stopped.

Resume is not a generic “try again” button. It is one recovery mode among several.

| Situation | Usual next action |
| --- | --- |
| Worker process interrupted | Resume the run. |
| Provider access failed | Fix runtime access, then resume with the right runtime. |
| Phase awaits handoff | Record a handoff decision, then resume. |
| Final acceptance rejected | Start a correction follow-up. |
| Delivery decision parked | Decide the delivery gate. |
| Scope or provenance blocker | Inspect evidence before overriding. |

Orcho separates checkpoint restore from real follow-up work.

| Mode | Trigger | Keeps | Changes |
| --- | --- | --- | --- |
| CHECKPOINT | orcho run --resume <run-id> with no new task | The same run directory, checkpoint store, completed phase records, and original task. | Starts a new process to continue the same run. |
| FOLLOWUP | orcho run --resume <run-id> --task ... or a typed follow-up action | Parent run id, parent status, base task, available session seeds, evidence, and retained context. | Creates a new child run with a new task and new run directory. |
| from-run-plan | orcho run --from-run-plan <run-id> | The parent parsed_plan.json and inherited project/task metadata when omitted. | Starts a new run after the planning block. |

All three preserve continuity. Only checkpoint resume keeps the same run as the active subject.
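
The trigger rules above can be sketched as a small shell function. This is an illustrative helper, not part of Orcho: the name recovery_mode and the new-run fallback label are assumptions, and the function only inspects the three flags named on this page (--resume, --task, --from-run-plan) without calling orcho. An interactive session may still offer a choice between resuming and starting a follow-up, so treat the result as the default intent of the command line rather than a guarantee.

```shell
#!/bin/sh
# Hypothetical helper: classify an `orcho run` argument list into the
# recovery mode it would normally trigger. It never invokes orcho.
recovery_mode() {
    resume=0
    task=0
    from_plan=0
    for arg in "$@"; do
        case "$arg" in
            --resume)        resume=1 ;;
            --task)          task=1 ;;
            --from-run-plan) from_plan=1 ;;
        esac
    done

    if [ "$from_plan" -eq 1 ]; then
        echo "from-run-plan"   # new run that starts after the planning block
    elif [ "$resume" -eq 1 ] && [ "$task" -eq 1 ]; then
        echo "FOLLOWUP"        # new child run carrying a new task
    elif [ "$resume" -eq 1 ]; then
        echo "CHECKPOINT"      # same run, continued by a new process
    else
        echo "new-run"         # assumed label: no recovery flag present
    fi
}

# Example: the checkpoint-resume arguments used on this page.
recovery_mode --resume 20260628_125026 --output live   # prints: CHECKPOINT
```

The ordering of the checks mirrors the table: the presence of --task is what turns a resume into a follow-up.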

Use checkpoint resume when a run stopped before its lifecycle was finished.

Terminal window
orcho status --workspace ~/www/my-workspace/.orcho
orcho run \
--workspace ~/www/my-workspace/.orcho \
--resume 20260628_125026 \
--output live

The resumed process loads the existing checkpoint and skips phases that already completed. If the run pauses again on a handoff, decide the handoff first and then resume.

This sanitized excerpt is from a real interrupted feature run. The important signal is not just the word "resumed"; it is the checkpoint line and the pipeline DSL: completed phases are marked with ✓, the active phase is marked with ▶, phases not yet started are marked with ·, and Orcho continues from the next unfinished phase.

recovery mode: checkpoint restore
orcho run \
--resume 20260629_231549 \
--workspace /repo/workspace-orchestrator

Run 20260629_231549 did not finish (status: interrupted).
What do you want to do?
1) Resume from checkpoint  [default]
   Continue the same run from saved checkpoints.
2) Start a follow-up using this run as context
   Start a new run with parent context.
3) Exit
Choice [1/2/3]: 1

Orcho Run  20260629_231549  feature  resumed

State
session     auto  rounds=1  plan=yes
checkpoint  5 phases completed: plan, validate_plan, plan, validate_plan, implement
output      /repo/workspace/runspace/runs/20260629_231549/output.log
events      /repo/workspace/runspace/runs/20260629_231549/events.jsonl

Pipeline
⟳² (✓ plan [Claude] → ✓ validate_plan [Codex]) → ✓ implement [Claude]
  → ⟳² (▶ review_changes [Codex] → · repair_changes [Claude])
  → · final_acceptance [Codex]

worktree: retained retry subject /repo/workspace/runspace/worktrees/wt_20260629_231549/checkout
✓ Resuming from checkpoint: 5 phases completed

[PLAN] PLAN -- architect creates MD artifacts
↳ skipped: completed earlier in this run (resumed)

[VALIDATE_PLAN] VALIDATE PLAN -- reviewer audits the plan
↳ skipped: completed earlier in this run (resumed)

[IMPLEMENT] IMPLEMENT -- developer applies the change
↳ skipped: completed earlier in this run (resumed)

[REVIEW_CHANGES] review_changes -- Round 1
→ runtime=codex · model=gpt-5.5 · mode=read · session=fresh

What survived: the same run directory, retained worktree, original task, checkpoint store, completed phase records, output log, and event stream.

What restarted: a new process continued the lifecycle; unfinished phases can use fresh provider sessions while Orcho preserves the run context.

This is why checkpoint restore is different from a follow-up. The subject is still the same run. The already completed phases are evidence, not work to repeat.

Use a follow-up when the previous run reached a real decision and the next step is a new correction task.

Terminal window
orcho evidence \
--workspace ~/www/my-workspace/.orcho \
--format md
orcho run \
--workspace ~/www/my-workspace/.orcho \
--resume 20260628_125026 \
--task "Address final acceptance blocker R1: add the missing contract test and rerun the focused suite." \
--output live

This is intentionally not the same as checkpoint restore. The parent stays as evidence; the child run carries the correction.

Use a plan follow-up when planning already succeeded and the implementation run should inherit that plan.

Terminal window
orcho run \
--workspace ~/www/my-workspace/.orcho \
--from-run-plan 20260628_125026 \
--profile feature \
--output live

This creates a new run that skips the parent planning block and starts from the first downstream phase.

When inspecting a run before choosing a recovery mode, use the highest-level surface available:

  • CLI: orcho status, orcho evidence;
  • MCP: orcho_run_status, orcho_run_diagnose, orcho_delivery_gate;
  • artifacts: meta.json, events.jsonl, receipts, and diff.patch.

The rule of thumb: do not infer from terminal text when a typed run-control surface already exists.

Ask one question first: did the same run stop mid-lifecycle, or did it finish with a decision that requires new work?

  • Mid-lifecycle interruption: checkpoint resume.
  • Recorded handoff decision: resume after the decision is written.
  • Rejected final acceptance: correction follow-up.
  • Persisted plan that should become implementation: from-run-plan.
  • Unknown or unsafe state: diagnose before launching anything.
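
The checklist above amounts to a lookup from situation to recovery action, which can be sketched as a case statement. The function name next_action and the situation labels (interrupted, handoff-recorded, acceptance-rejected, plan-persisted) are hypothetical shorthand for the bullets above, not Orcho status values. The fallback branch encodes the last bullet: anything unrecognized is diagnosed rather than launched.

```shell
#!/bin/sh
# Hypothetical lookup: map a situation label (assumed names, not Orcho
# status values) to the recovery action listed on this page.
next_action() {
    case "$1" in
        interrupted)         echo "checkpoint resume" ;;
        handoff-recorded)    echo "resume after the decision is written" ;;
        acceptance-rejected) echo "correction follow-up" ;;
        plan-persisted)      echo "from-run-plan" ;;
        *)                   echo "diagnose before launching anything" ;;
    esac
}

# Example: a rejected final acceptance calls for a follow-up, not a resume.
next_action acceptance-rejected   # prints: correction follow-up
```

Keeping the default branch conservative means a new or misspelled label can never select a launch action by accident.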

For rejected delivery states, read Correction follow-ups. For paused decision points, read Handoffs and advisors.