False-ready delivery
Trust boundary
False-ready delivery is the state where a worker appears finished, but the delivery system cannot prove the result is ready. Orcho exists to keep that state visible instead of letting it become a confident final message.
Worker claim
A worker runtime can finish its implementation phase and still leave the delivery unready.
[IMPLEMENT] edited api/auth.py and tests
That line is useful, but it is not a release decision. It says work happened. It does not prove that review passed, required checks ran, or final acceptance approved the result.
Review gate
The review gate is where false-ready delivery becomes visible.
[REVIEW_CHANGES] verdict=REJECTED blocker missing negative-path test fix add regression coverage for invalid input
A normal agent transcript can drift from implementation into a confident final message. Orcho keeps the gate separate. The run can say:
- the worker made a change;
- the change is not delivery-ready yet;
- this exact blocker must be fixed before the run can proceed.
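The separation described above can be sketched as a small parser that reads a gate line and treats nothing except an explicit approval as ready. This is a minimal illustration in Python, assuming the `[PHASE] ... verdict=VALUE ...` line shape seen in the examples on this page; `GateLine`, `parse_gate_line`, and `is_approval` are hypothetical names, not part of Orcho.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed shape, taken from the example lines on this page: "[PHASE] text",
# optionally containing "verdict=VALUE". Not an official Orcho format.
_LINE = re.compile(r"^\[(?P<phase>[A-Z_]+)\]\s*(?P<rest>.*)$")
_VERDICT = re.compile(r"\bverdict=(?P<verdict>[A-Z]+)\b")


@dataclass
class GateLine:
    phase: str
    verdict: Optional[str]  # None when the line carries no verdict
    detail: str


def parse_gate_line(line: str) -> Optional[GateLine]:
    """Return the phase, verdict, and remaining text, or None if no phase tag."""
    match = _LINE.match(line.strip())
    if match is None:
        return None
    rest = match.group("rest")
    verdict = _VERDICT.search(rest)
    return GateLine(
        phase=match.group("phase"),
        verdict=verdict.group("verdict") if verdict else None,
        detail=rest,
    )


def is_approval(gate: GateLine) -> bool:
    """Only an explicit APPROVED verdict counts; a worker claim has none."""
    return gate.verdict == "APPROVED"
```

Under these assumptions, a worker claim such as `[IMPLEMENT] edited api/auth.py and tests` parses with no verdict at all, so it can never be mistaken for an approval.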
Repair
Repair is not a new vague prompt. It is a continuation of the delivery contract.
[REPAIR_CHANGES] added missing negative-path test
The point is not that rejection is good. The point is that rejection becomes a controlled state with a next action instead of hidden operator anxiety.
When the correction needs a new run, use Correction follow-ups to carry the parent context, blocker, diff, and evidence forward.
Final gate
Final acceptance is the last readiness decision.
[FINAL_ACCEPTANCE] verdict=APPROVED ship_ready yes
If final acceptance rejects, the run should not look ready. It should preserve the blocker and tell the operator whether to repair, follow up, resume, or halt.
Read Run anatomy for where review and final acceptance appear in the live stream.
When code review passes but delivery is still blocked
The strongest version of false-ready is not “the agent looked done.” It is “review approved the code, and delivery was still blocked.” Review and final acceptance are separate gates for exactly this reason: a reviewer can be satisfied with the diff while required verification is missing or stale.
Review
  verdict APPROVED
  summary no substantial defects found
Final acceptance
  verdict REJECTED
  ship_ready no
  summary required receipts are missing or stale
Correction gate
  missing required receipts: mcp-mock-smoke
  stale required receipts: env-provenance, lint
Contract status
  task_contract incomplete
  tests weak
A confident final message would have shipped this. Orcho holds it: the code was fine, but the proof that it works was not there.
Read Plan contract and DAG for the same gate inside a full run.
Evidence
For a rejected or blocked run, inspect:
- final acceptance verdict;
- release blockers;
- verification receipts;
- diff summary;
- delivery gate state;
- recommended correction or recovery action.
The durable proof surface lives in artifacts such as:
output.log
events.jsonl
plan.md
review.json
diff.patch
receipts/
A receipt is what turns “tests passed” from a sentence into something inspectable. Each one records where and how a check actually ran, so final acceptance can tell apart passed, failed, ran in the wrong place, and never ran:
[
  {
    "name": "lint",
    "status": "stale",
    "command": "ruff check api/",
    "working_dir": "/repo/app",
    "interpreter": "python3.12",
    "source": "api/auth.py",
    "result": "passed against an earlier diff; re-run required",
    "provenance": "ran before the last edit"
  },
  {
    "name": "mcp-mock-smoke",
    "status": "missing",
    "result": "required by the profile, but no receipt was produced"
  },
  {
    "name": "verification-unit",
    "status": "passed",
    "command": "pytest tests/test_auth.py",
    "working_dir": "/repo/app",
    "interpreter": "python3.12",
    "source": "tests/test_auth.py",
    "result": "ran in the expected checkout and passed"
  }
]
The values are sanitized, but the shape is the point: missing and stale receipts are why the run above blocked delivery even though the code review approved the diff.
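As a rough sketch of how a gate could read receipts like these, the Python below sorts required checks by the reason they block delivery. The `name` and `status` fields come from the example; the `required` list and the `blocking_receipts` helper are assumptions made for illustration, not Orcho's implementation.

```python
import json

# Trimmed copy of the example receipts (only the fields used here).
RECEIPTS_JSON = """
[
  {"name": "lint", "status": "stale"},
  {"name": "mcp-mock-smoke", "status": "missing"},
  {"name": "verification-unit", "status": "passed"}
]
"""


def blocking_receipts(receipts, required):
    """Group required checks that are not in a 'passed' state by status.

    A required check with no receipt entry at all is treated as missing.
    """
    by_name = {r["name"]: r for r in receipts}
    blocked = {}
    for name in required:
        status = by_name.get(name, {}).get("status", "missing")
        if status != "passed":
            blocked.setdefault(status, []).append(name)
    return blocked


receipts = json.loads(RECEIPTS_JSON)
# Hypothetical required set; a real profile would define this.
required = ["lint", "mcp-mock-smoke", "verification-unit"]
blocked = blocking_receipts(receipts, required)
# Delivery is ready only when nothing required is blocked.
ship_ready = not blocked
```

With the sample data, `lint` is grouped under `stale` and `mcp-mock-smoke` under `missing`, so `ship_ready` is false even though the unit tests passed. Treating an absent receipt as missing is a deliberate choice in this sketch: silence is never read as success.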
Use Evidence bundle for the artifact model and Verification receipts when the key question is whether checks ran in the right environment.
What false-ready usually looks like
False-ready delivery can happen when:
- required tests were not run;
- review found a blocker;
- final acceptance rejected;
- verification ran against the wrong tree;
- the diff touched files outside the declared scope;
- a delivery decision is still parked.
The point is not to celebrate rejection. The point is to keep the operator from shipping a result that only looked complete in the worker’s last message.
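The conditions listed above can be sketched as a single check that returns every reason a run must not be reported as ready. This is an illustrative Python sketch: `RunState`, its field names, and `false_ready_reasons` are hypothetical and do not describe Orcho's data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RunState:
    """Hypothetical snapshot of a run; field names are illustrative only."""
    required_tests_ran: bool = True
    review_blockers: List[str] = field(default_factory=list)
    final_acceptance_approved: bool = True
    verified_tree_matches: bool = True
    out_of_scope_files: List[str] = field(default_factory=list)
    decision_parked: bool = False


def false_ready_reasons(run: RunState) -> List[str]:
    """Return every reason the run must not be reported as ready."""
    reasons = []
    if not run.required_tests_ran:
        reasons.append("required tests were not run")
    if run.review_blockers:
        reasons.append("review found a blocker: " + "; ".join(run.review_blockers))
    if not run.final_acceptance_approved:
        reasons.append("final acceptance rejected")
    if not run.verified_tree_matches:
        reasons.append("verification ran against the wrong tree")
    if run.out_of_scope_files:
        reasons.append(
            "diff touched files outside the declared scope: "
            + ", ".join(run.out_of_scope_files)
        )
    if run.decision_parked:
        reasons.append("a delivery decision is still parked")
    return reasons
```

An empty list is the only result that would justify a ready message. Collecting all reasons, instead of stopping at the first, keeps each blocker visible to the operator.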
Related
- Run anatomy shows the deeper run stream.
- Evidence bundle explains durable artifacts.
- Verification receipts explains proof that checks ran.
- Correction follow-ups explains recovery after rejection.