Plan contract and DAG implementation
Large Orcho runs should not collapse into one giant agent transcript.
The plan is a delivery contract. It names what must be true, which files are in scope, which commands prove the work, which risks matter, what reviewers must audit, and how implementation is split into dependent subtasks.
Read a large run from left to right. The task becomes a contract, the contract is reviewed before implementation, implementation runs as a dependency graph, and final acceptance checks whether the whole delivery is actually shippable.
Why this matters
One real feature run had a large blast radius: MCP schemas, status projection, diagnosis, evidence, summary, docs, tests, generated schema snapshots, and verification receipts.
The useful part is not only that an agent edited files. The useful part is that Orcho made the shape of the work visible before letting implementation proceed.
Orcho Run 20260629_083410 feature
Task MCP provider-pressure projection
Plugin orcho-mcp
Verification
mode pro
envs mcp-local-core
policy declared in contract (require, warn)
effect require receipts; missing/failed resolved at gate time
Contract
acceptance 9
owned files 16
commands 5
risks 8
review focus 7
tasks 4
This is the signal to slow down. A run with many owned files and required receipts needs a plan contract, not a free-form coding session.
Plan sections
A good Orcho plan is not a narrative promise. It is a structured agreement.
| Section | What it controls |
|---|---|
| Goal | The one outcome the run must produce. |
| Acceptance criteria | The observable facts that must be true at the end. |
| Owned files | The allowed blast radius and review target. |
| Commands | The verification receipts the run must produce. |
| Risks | The assumptions and forbidden shortcuts reviewers should watch. |
| Review focus | The checks reviewers must prioritize over generic style review. |
| Tasks | The executable decomposition, including dependencies. |
For a large run, the task list is the bridge from planning to implementation. It should say which subtask owns which surface and which earlier subtask must finish before it can start.
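
One way to picture the sections above is as a typed record with a cheap structural check. The sketch below is illustrative only: `PlanContract`, `PlanTask`, `structural_problems`, and every field name are hypothetical and are not Orcho's actual plan schema.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PlanTask:
    """One executable subtask (hypothetical shape, not Orcho's schema)."""
    task_id: str
    owns: tuple[str, ...]
    depends_on: tuple[str, ...] = ()
    done_criteria: tuple[str, ...] = ()


@dataclass(frozen=True)
class PlanContract:
    """The plan sections from the table above, as plain fields."""
    goal: str
    acceptance_criteria: tuple[str, ...]
    owned_files: tuple[str, ...]
    commands: tuple[str, ...]
    risks: tuple[str, ...]
    review_focus: tuple[str, ...]
    tasks: tuple[PlanTask, ...]

    def structural_problems(self) -> list[str]:
        """Return structural findings; an empty list means none were found."""
        problems: list[str] = []
        known = {task.task_id for task in self.tasks}
        if not self.acceptance_criteria:
            problems.append("no acceptance criteria")
        for task in self.tasks:
            for dep in task.depends_on:
                if dep not in known:
                    problems.append(f"{task.task_id} depends on unknown task {dep}")
            for path in task.owns:
                if path not in self.owned_files:
                    problems.append(f"{task.task_id} owns {path}, which is outside owned files")
        return problems
```

A check like this only catches shape errors, such as a subtask that claims a file the plan does not own. It does not judge whether the plan is a good one.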
Implementation execution policy
The approved plan does not automatically mean “run every subtask as a separate agent turn.” Orcho needs an implementation execution policy.
This policy answers one specific question:
How should the implement phase consume the approved task plan?
It is separate from runtime selection and verification policy:
- runtime policy chooses which worker runtime/model handles a phase or subtask;
- verification policy chooses which gates and receipts are required;
- implementation execution policy chooses the shape of implementation itself.
The reader-facing mental model is:
| Policy shape | What happens |
|---|---|
| linear | The implement phase runs as one controlled worker turn. The plan may still contain tasks, but they are guidance inside one implementation context. |
| compact_dag | The implement phase turns planned subtasks into controlled worker turns with dependencies, receipts, and attestations. |
Current compact_dag execution is still sequential. The graph is not a promise
of parallel work yet. Dependencies are useful today because they define order,
compact context, upstream facts, and delivery blocking rules.
Implementation execution policy
linear
implement once
one worker turn consumes the approved plan
one implementation result goes to review
compact_dag
parse task list into a dependency graph
invoke each ready subtask as a focused worker turn
require subtask receipts and done-criteria attestation
block declared dependents when an upstream subtask is incomplete
run sequentially today (concurrency=1)
This is the missing control plane between “the plan has subtasks” and “the implement phase actually runs those subtasks as separate controlled units.”
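
The difference between the two policy shapes can be sketched as a small dispatcher. This is a minimal illustration under stated assumptions, not Orcho's implementation: `run_implement_phase`, the `run_worker` callback, and the status strings `met`, `incomplete`, and `blocked` are all hypothetical names. It runs one worker turn at a time, matching the sequential behavior described above.

```python
from typing import Callable


def run_implement_phase(
    policy: str,
    tasks: dict[str, list[str]],
    run_worker: Callable[[list[str]], bool],
) -> dict[str, str]:
    """Consume an approved task plan according to the execution policy.

    tasks maps task id -> depends_on list. run_worker receives the task ids
    for one worker turn and returns True when the work is attested as met.
    """
    if policy == "linear":
        # One worker turn sees the whole plan; tasks are guidance only.
        ok = run_worker(list(tasks))
        return {task_id: ("met" if ok else "incomplete") for task_id in tasks}
    if policy != "compact_dag":
        raise ValueError(f"unknown policy: {policy!r}")

    status: dict[str, str] = {}
    pending = dict(tasks)
    while pending:
        progressed = False
        for task_id, deps in list(pending.items()):
            if any(status.get(dep) in ("incomplete", "blocked") for dep in deps):
                status[task_id] = "blocked"  # an upstream subtask did not complete
            elif all(status.get(dep) == "met" for dep in deps):
                # Ready: run it as its own focused worker turn (concurrency=1).
                status[task_id] = "met" if run_worker([task_id]) else "incomplete"
            else:
                continue  # dependencies are not resolved yet
            del pending[task_id]
            progressed = True
        if not progressed:
            raise ValueError("dependency cycle or unknown dependency")
    return status
```

In this sketch a failed subtask never triggers its dependents, so a broken upstream result cannot leak into later worker turns.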
Isolation policy
Implementation execution policy answers how the plan is consumed. Isolation policy answers where the worker edits code.
This matters because Orcho has two different path concepts:
| Path | Meaning |
|---|---|
| {project} | The canonical target repository the run is about. |
| {checkout} | The Orcho-provided checkout that commands and workers should treat as the current subject. |
When per-run worktree isolation is active, {checkout} is a run-owned worktree
and {project} remains the source repository. When isolation is off, the two
collapse into the same checkout and the run edits the repository directly.
Isolation policy
per_run worktree
create a run-owned checkout under the workspace runspace
run agents and verification against {checkout}
preserve the diff as the review and repair subject
apply or deliver back to {project} only through the delivery boundary
off / direct checkout
{checkout} == {project}
run agents directly in the target repository
simpler for current-diff or small intentional edits
lower protection against accidental source-checkout mutation
This policy is part of the risk model. A large feature run wants a retained worktree subject so review, repair, receipts, and delivery all point at the same diff. A current-diff audit or small direct edit may intentionally use the project checkout.
If the source checkout is dirty and an isolated run would start from HEAD,
Orcho can ask how that dirty state should feed the run:
Pre-run intake - uncommitted changes in checkout
1) include seed the isolated run worktree with the current diff
2) exclude start the run from HEAD and leave local changes untouched
3) commit commit the checkout first, then start from that commit
4) halt stop before the run starts
For expert readers, this is the policy layer that prevents confusion between “the repo this task is about” and “the checkout this run is allowed to mutate.”
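
The relationship between {project} and {checkout} can be made concrete with a short sketch. Everything named here is an assumption for illustration: the `per_run` and `off` policy strings, the `RunPaths` record, the directory layout under the runspace, and the `INTAKE_EFFECTS` table are hypothetical, not Orcho's configuration surface.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class RunPaths:
    project: PurePosixPath   # canonical target repository
    checkout: PurePosixPath  # where workers and commands operate


def resolve_paths(project: str, isolation: str, runspace: str, run_id: str) -> RunPaths:
    """Resolve {project} and {checkout} for a run (assumed layout)."""
    project_path = PurePosixPath(project)
    if isolation == "per_run":
        # Run-owned worktree; the source repository stays untouched.
        return RunPaths(project_path, PurePosixPath(runspace) / run_id / "worktree")
    if isolation == "off":
        # Direct checkout: both paths collapse into the same directory.
        return RunPaths(project_path, project_path)
    raise ValueError(f"unknown isolation policy: {isolation!r}")


# Intake choice -> (start the run, seed worktree with the diff, commit first)
INTAKE_EFFECTS = {
    "include": (True, True, False),
    "exclude": (True, False, False),
    "commit": (True, False, True),
    "halt": (False, False, False),
}


def needs_intake(paths: RunPaths, checkout_is_dirty: bool) -> bool:
    """Intake is only a question for an isolated run with a dirty source."""
    return checkout_is_dirty and paths.checkout != paths.project
```

The point of keeping two fields even when they are equal is that commands can always be written against the checkout, whichever policy is active.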
Budgeted plan validation
validate_plan exists to reject a weak plan before code is changed.
In the same run, the first plan versions were rejected. The important part is how the failure stayed controlled: automatic plan rounds were bounded, then a phase handoff asked the operator whether to continue, retry, halt, waive, or ask for advice.
Plan validation
verdict REJECTED
finding F1 P1 RunStatus builder is not owned by the plan
Handoff (fired): validate_plan automatic round 2/2
policy human_feedback_on_reject
action retry_feedback
round validate_plan human retry 1 after REJECTED verdict
Plan validation
verdict REJECTED
finding F1 P2 evidence is missing from the AC7 consistency check
Handoff (fired): validate_plan human retry 1 rejected
action retry_feedback
round validate_plan human retry 2 after REJECTED verdict
Plan validation
verdict APPROVED
summary plan defines one source, four public surfaces, and a cross-surface test
The operator did not have to accept a bad plan. The run paused at the decision point, recorded the choice, fed the reviewer findings back into planning, and continued only after approval.
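
The control flow in that transcript, a bounded number of automatic rounds followed by operator-driven retries, can be sketched as a loop. This is a simplified illustration, not Orcho's code: `validate_with_budget` and its callbacks are hypothetical, and only the `retry_feedback` action is modelled. Any other operator action simply ends the loop here, although a real handoff offers more choices.

```python
from typing import Callable, Optional


def validate_with_budget(
    plan: str,
    validate: Callable[[str], tuple[str, list[str]]],
    revise: Callable[[str, list[str]], str],
    ask_operator: Callable[[list[str]], str],
    max_auto_rounds: int = 2,
) -> tuple[Optional[str], list[str]]:
    """Return (approved plan or None, log of rounds and handoffs)."""
    if max_auto_rounds < 1:
        raise ValueError("max_auto_rounds must be at least 1")
    log: list[str] = []
    findings: list[str] = []
    for round_no in range(1, max_auto_rounds + 1):
        verdict, findings = validate(plan)
        log.append(f"auto {round_no}/{max_auto_rounds}: {verdict}")
        if verdict == "APPROVED":
            return plan, log
        if round_no < max_auto_rounds:
            plan = revise(plan, findings)  # feed findings back into planning

    human_retry = 0
    while True:
        # Budget exhausted: pause and let the operator decide.
        action = ask_operator(findings)
        log.append(f"handoff: {action}")
        if action != "retry_feedback":
            return None, log
        human_retry += 1
        plan = revise(plan, findings)
        verdict, findings = validate(plan)
        log.append(f"human retry {human_retry}: {verdict}")
        if verdict == "APPROVED":
            return plan, log
```

The automatic budget keeps a weak plan from looping unattended, and every retry after that is an explicit, recorded operator decision.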
Subtask DAG
Once the plan is approved, implementation can run as a dependency graph instead of one mixed pile of edits, but only when the implementation execution policy selects the compact DAG path.
T1-projection-schema
owns: shared projection source, next-actions helper, wire model
T2-diagnose
depends_on: T1-projection-schema
owns: diagnose condition and safe next actions
T3-status-evidence-summary
depends_on: T1-projection-schema, T2-diagnose
owns: status, evidence, summary, live card, schema snapshot
T4-future-shape-blocker-doc
depends_on: T2-diagnose, T3-status-evidence-summary
owns: future-state fixtures and architecture documentation
The dependencies matter. They keep the worker from implementing summary before the projection source exists, or documenting a future condition before the observable public surfaces agree.
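
The order these four subtasks must run in follows directly from their `depends_on` lists. As a small illustration, Python's standard-library `graphlib` can derive it. This shows the ordering idea only and makes no claim about how Orcho computes it; the `DEPENDS_ON` mapping and `execution_order` helper are names chosen for the sketch.

```python
from graphlib import TopologicalSorter

# Task id -> tasks that must finish first, copied from the plan above.
DEPENDS_ON = {
    "T1-projection-schema": [],
    "T2-diagnose": ["T1-projection-schema"],
    "T3-status-evidence-summary": ["T1-projection-schema", "T2-diagnose"],
    "T4-future-shape-blocker-doc": ["T2-diagnose", "T3-status-evidence-summary"],
}


def execution_order(depends_on: dict[str, list[str]]) -> list[str]:
    """Return an order in which every task comes after its dependencies.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())


print(execution_order(DEPENDS_ON))
```

For this plan the dependencies form a chain, so there is exactly one valid order: T1, T2, T3, T4.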
Controlled worker runs
In compact_dag implementation, every subtask is its own controlled runtime
turn. It carries a goal, dependencies, done criteria, prompt size, session
policy, and upstream count.
ORCHO subtask 3/4 START: T3-status-evidence-summary
goal: align status, evidence, diagnose, and summary from one source
runtime: claude
model: claude-opus-*
depends_on: T1-projection-schema, T2-diagnose
done_criteria: 7
prompt_chars: 17082
current_only: true
execution_context: compact_dag
prompt_turn: true
upstream_deps: 2
This is the main difference from asking one worker to “fix everything.” The subtask knows its local contract and its upstream dependencies.
The implementation output also reports how each subtask was rendered:
Subtask renders
T1-projection-schema: full (continue_session=false)
T2-diagnose: delta (continue_session=true)
T3-status-evidence-summary: delta (continue_session=true)
T4-future-shape-blocker-doc: delta (continue_session=true)
That makes session strategy visible. The first subtask receives the full local contract; later subtasks receive compact deltas plus their dependency context.
Self-attestation
After a subtask completes, the worker attests against its own done criteria. This is not final approval. It is the worker’s structured claim of completion.
ORCHO subtask 3/4 DONE: T3-status-evidence-summary
attestation: met
[EXIT code=0 duration=4288.90s]
ORCHO subtask 3/4 ATTESTATION (met): T3-status-evidence-summary
7/7 done-criteria met
1. RunStatus.provider_pressure is filled by the real status builder.
2. Evidence carries the same typed provider-pressure object.
3. Summary carries provider_pressure without breaking legacy next_actions.
4. AC7 compares all four surfaces: status, evidence, diagnose, summary.
5. Generic failures remain provider_pressure == None on all four surfaces.
6. The MCP schema snapshot was regenerated and tested.
7. Unit, architecture, ruff, and diff checks were run.
The attestation gives reviewers a checklist. It does not replace review; it makes the review target sharper.
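
An attestation like the one above is essentially a list of per-criterion claims plus a roll-up. The sketch below shows one possible shape. The `Attestation` class, its methods, and the `unmet` status value are hypothetical, since the transcript only shows the `met` case.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    """A worker's structured claim of completion (hypothetical shape)."""
    task_id: str
    criteria: tuple[tuple[str, bool], ...]  # (done criterion, worker claims met)

    @property
    def status(self) -> str:
        # "met" only when there is at least one criterion and all are claimed met.
        all_met = bool(self.criteria) and all(ok for _, ok in self.criteria)
        return "met" if all_met else "unmet"

    def summary_line(self) -> str:
        met = sum(1 for _, ok in self.criteria if ok)
        return f"{met}/{len(self.criteria)} done-criteria met"

    def review_checklist(self) -> list[str]:
        """Turn the worker's claims into items a reviewer still has to check."""
        return [
            f"verify: {text}" if ok else f"unmet: {text}"
            for text, ok in self.criteria
        ]
```

Note that every claimed criterion still becomes a "verify" item here: the claim narrows what the reviewer looks at without counting as evidence on its own.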
Review, repair, and final acceptance
Orcho separates implementation success from delivery readiness.
In the same run, review approved the implementation, but final acceptance still rejected the delivery because required receipts were missing or stale.
Review
verdict APPROVED
summary no substantial defects found; surfaces share one projection/helper
Final acceptance
verdict REJECTED
ship_ready no
summary required receipts are missing or stale
Correction gate
missing required receipts: mcp-mock-smoke
stale required receipts: env-provenance, lint
Contract status
task_contract incomplete
interfaces compatible
tests weak
This is exactly why final acceptance is its own gate. A code reviewer can be satisfied while the delivery protocol still blocks release.
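
A receipt gate of this kind can be sketched as a pure function over the required receipt names. The sketch rests on assumptions: `Receipt`, `correction_gate`, and the `subject` field are hypothetical, and "stale" is taken to mean that a receipt was produced against a different diff than the one being delivered. The source does not define staleness.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    name: str
    passed: bool
    subject: str  # assumed: identifier of the diff the command ran against


def correction_gate(
    required: list[str],
    receipts: dict[str, Receipt],
    current_subject: str,
) -> dict[str, object]:
    """Classify required receipts at gate time; review verdicts play no part."""
    missing = [name for name in required if name not in receipts]
    failed = [
        name for name in required
        if name in receipts and not receipts[name].passed
    ]
    stale = [
        name for name in required
        if name in receipts
        and receipts[name].passed
        and receipts[name].subject != current_subject
    ]
    return {
        "missing": missing,
        "failed": failed,
        "stale": stale,
        "ship_ready": not (missing or failed or stale),
    }
```

Because the function never looks at the review verdict, an approved review and a blocked delivery can coexist, which is the situation shown in the transcript.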
What to copy into your own tasks
For complex work, ask for a plan that names:
- acceptance criteria as observable facts;
- owned files and explicit out-of-scope boundaries;
- required verification commands and receipt policy;
- risks and falsifiers;
- review focus;
- subtasks with depends_on;
- done criteria per subtask;
- what counts as release-ready evidence.
For small work, this much structure is unnecessary. Use a lighter profile and keep the live run readable. Orcho’s point is not to make every task heavy; it is to make the amount of control match the blast radius.
Related
- Run anatomy explains the whole live output shape.
- Handoffs and advisors explains operator decision points.
- Verification receipts explains why receipts can block delivery.
- Cost accounting explains why large DAG subtasks should be measured.