Gates and verification
Orcho does not trust an agent’s claim that work is ready. It trusts declared checks, executed in a declared environment, recorded as durable receipts.
profile declares a gate → engine runs it after the phase
gate produces a typed result → fail strategy applied as data
contract declares the proof → receipts written on disk
final acceptance reads receipts, not the worker's last message

This page explains that boundary: what a gate is, what counts as proof, and why a confident final message is never enough.
Gates are data, not code
A quality gate is a registered post-phase check that produces a typed
QualityGateResult and applies a declarative fail policy to the run state.
The profile declares which gates run after which phase; the engine executes
them and persists the result into the phase log.
The fail strategy is also data. The engine reads the strategy enum and mutates state accordingly — there is no ad-hoc branching code per gate. Gates run after the phase handler, and a halting gate short-circuits the remaining gates after the halted phase is persisted.
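A minimal sketch of that idea, under stated assumptions: only `QualityGateResult` and the strategy names come from this page. The registry and profile shapes, the `lint` gate, and `run_gates_after` are hypothetical names chosen for illustration, not Orcho's actual API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class QualityGateResult:
    """Typed result of one gate run (fields are illustrative)."""
    gate: str
    passed: bool


# Hypothetical registry: gate name -> check. Real checks would run tools;
# these stand-ins return fixed results so the sketch is deterministic.
REGISTRY: dict[str, Callable[[], QualityGateResult]] = {
    "tests": lambda: QualityGateResult("tests", passed=False),
    "lint": lambda: QualityGateResult("lint", passed=True),
}

# Hypothetical profile: plain data saying which gates run after which phase.
PROFILE = {
    "implement": [
        {"gate": "tests", "on_fail": "HALT"},
        {"gate": "lint", "on_fail": "INFORMATIONAL"},
    ],
}


def run_gates_after(phase, profile, registry, phase_log):
    """Run the gates declared for `phase`, in order, after its handler.

    Each result is appended to `phase_log` before the fail policy is
    consulted, so a halting gate is recorded and then skips the rest.
    Returns False when the run should stop.
    """
    for declaration in profile.get(phase, []):
        result = registry[declaration["gate"]]()
        phase_log.append(result)
        if not result.passed and declaration["on_fail"] == "HALT":
            return False
    return True


phase_log = []
keep_going = run_gates_after("implement", PROFILE, REGISTRY, phase_log)
```

Here the failing `tests` gate is logged and halts the run, so the `lint` gate never executes. Adding or reordering gates means editing the profile data, not the loop.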
Two kinds of gate
| Kind | Examples | Cost | Scheduling |
|---|---|---|---|
| computational | shell exit-code checks: tests, lint, type check, compile, format check | wall-clock and CPU time; near-zero monetary cost | inline; blocks the phase |
| inferential | LLM judges: security review, spec compliance, code review | wall-clock plus tokens | batched; can be parallelised |
The distinction matters for cost accounting — cost_usd is only meaningful
for inferential gates — and for scheduling. Computational gates often share a
project-level lock; inferential gates can run in waves.
The built-in registry ships one computational gate, tests. Inferential
gates are an extension surface for third-party plugins.
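As a rough illustration of the computational kind, the sketch below runs a command and treats its exit code as the verdict. `GateOutcome` and `run_computational_gate` are hypothetical names; only `cost_usd` appears on this page, and it is left unset here because no tokens are spent.

```python
import subprocess
import sys
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GateOutcome:
    """Hypothetical outcome record for one gate run."""
    kind: str
    passed: bool
    wall_clock_s: float
    cost_usd: Optional[float]  # only meaningful for inferential gates


def run_computational_gate(argv):
    """Run a native command inline and pass if and only if it exits 0."""
    started = time.monotonic()
    completed = subprocess.run(argv, capture_output=True, text=True)
    return GateOutcome(
        kind="computational",
        passed=completed.returncode == 0,
        wall_clock_s=time.monotonic() - started,
        cost_usd=None,
    )


# Stand-in commands: the current interpreter exiting with a chosen code.
ok = run_computational_gate([sys.executable, "-c", "raise SystemExit(0)"])
bad = run_computational_gate([sys.executable, "-c", "raise SystemExit(3)"])
```

An inferential gate would fill the same record from a model's judgement and report a non-empty `cost_usd`; that side is not sketched because it depends on a plugin interface this page does not describe.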
Four fail strategies
QualityGate.on_fail is a FailStrategy enum:
| Strategy | Effect | When it fits |
|---|---|---|
| HALT | The run stops immediately. | Failures that make continuing pointless. |
| FEED_INTO_NEXT | The next phase consumes the failure as input. | Test failures that should become the fix prompt. |
| TRIGGER_REPLAN | The failure is treated as critique; the round counter advances and a replan prompt fires next iteration. | Failures that mean the plan, not the patch, is wrong. |
| INFORMATIONAL | Logged and persisted for audit; the run continues unchanged. | Signals worth recording that should never block. |
A HALT gate ends the run. It is distinct from a phase handoff, which pauses
the run at a declared decision point and resumes from an operator decision.
The two mechanisms are independent and do not overlap.
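One way to picture the four strategies is a single function that maps the enum value to a state change. This is a sketch under assumptions: the `FailStrategy` member names come from the table, while `RunState`, its fields, and `apply_fail_strategy` are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class FailStrategy(Enum):
    HALT = "halt"
    FEED_INTO_NEXT = "feed_into_next"
    TRIGGER_REPLAN = "trigger_replan"
    INFORMATIONAL = "informational"


@dataclass
class RunState:
    """Hypothetical run state; the real fields are not described here."""
    halted: bool = False
    next_phase_input: Optional[str] = None
    round: int = 0
    replan_pending: bool = False
    audit_log: List[str] = field(default_factory=list)


def apply_fail_strategy(state: RunState, strategy: FailStrategy, failure: str) -> RunState:
    """Mutate `state` according to the declared strategy for a failed gate."""
    # Every failure is recorded, whatever the strategy.
    state.audit_log.append(f"{strategy.name}: {failure}")
    if strategy is FailStrategy.HALT:
        state.halted = True
    elif strategy is FailStrategy.FEED_INTO_NEXT:
        state.next_phase_input = failure
    elif strategy is FailStrategy.TRIGGER_REPLAN:
        state.round += 1
        state.replan_pending = True
    # INFORMATIONAL: recorded above, nothing else changes.
    return state
```

The branching is on the strategy, never on the gate's identity, which is what lets a new gate reuse an existing policy without new engine code.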
The verification contract
A gate says what must be true. That is not enough on its own: a test can pass against the wrong checkout, the wrong interpreter, or a stale tree, and still print green. The verification contract is project-level configuration that says what counts as proof and in which environment that proof is valid.
The core rule:
Agents may run any native tools while debugging.
Readiness is proven only by declared verification commands executed in the
declared verification environment.

The contract keeps three concepts deliberately distinct:

Quality gate = what must be true.
Verification environment = where and against what it is valid.
Receipt = proof that the native command ran in that environment.

A verification environment names the subject under test: which interpreter, working directory, paths, and dependency checkouts make a result meaningful. Its declared assertions (import path checks, file and command existence, version checks) can be executed on demand, producing an env-assertion receipt. This ends the dispute where an implementer and a reviewer are “both right against different subjects” because each ran a host command against a different checkout.
Declared verification commands are not a new test framework. They are native
commands — argv, environment, assertions — whose execution is recorded as a
durable command receipt. A verification.required list names the commands
that form the required gate.
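To make the idea of a command receipt concrete, the sketch below runs one declared command (argv, environment, one assertion) and writes the outcome to disk as JSON. Every field name, the file layout, and `run_declared_command` are assumptions for illustration; the real on-disk shape is documented separately and may differ.

```python
import json
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_declared_command(name, argv, env, expect_in_stdout, cwd, receipts_dir):
    """Run a native command and persist a receipt describing what ran.

    `expect_in_stdout` stands in for a declared assertion; the receipt
    fields are hypothetical, not the documented receipt format.
    """
    completed = subprocess.run(
        argv,
        cwd=cwd,
        env={**os.environ, **env},  # declared variables layered over the host's
        capture_output=True,
        text=True,
    )
    receipt = {
        "command": name,
        "argv": list(argv),
        "cwd": str(cwd),
        "exit_code": completed.returncode,
        "assertions_ok": expect_in_stdout in completed.stdout,
    }
    path = Path(receipts_dir) / f"{name}.json"
    path.write_text(json.dumps(receipt, indent=2))
    return path


with tempfile.TemporaryDirectory() as tmp:
    receipt_path = run_declared_command(
        name="tests",
        argv=[sys.executable, "-c", "import os; print(os.environ['SUBJECT'])"],
        env={"SUBJECT": "checkout-a"},
        expect_in_stdout="checkout-a",
        cwd=tmp,
        receipts_dir=tmp,
    )
    stored = json.loads(receipt_path.read_text())
```

The point of the sketch is that the proof is the stored record, which a later reader can inspect without re-running anything.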
Missing, failed, and stale are different facts
Receipt classification distinguishes states that a plain pass/fail check collapses:
| Status | Meaning |
|---|---|
| present | The receipt exists, passed, and matches the current checkout. |
| missing | No receipt on disk. The check never ran; nothing failed. |
| failed | The receipt records a non-zero exit or a failed declared assertion. |
| stale | The receipt passed, but the checkout has changed since, or a dependency checkout it relies on has moved. |
This is the difference between “the tests failed” and “nobody can prove the tests ran against this diff”. Both block a required gate, but they call for different next actions: a failed check needs a fix, a missing or stale receipt needs a re-run.
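The four statuses can be sketched as a small classifier. The receipt fields (`exit_code`, `assertions_ok`, `checkout`, `deps`) and the function name are hypothetical, and checking for failure before staleness is an assumption drawn from the table's wording that a stale receipt is one that passed.

```python
from typing import Dict, Optional


def classify_receipt(
    receipt: Optional[dict],
    current_checkout: str,
    current_deps: Dict[str, str],
) -> str:
    """Classify one required receipt as present, missing, failed, or stale."""
    if receipt is None:
        return "missing"  # nothing on disk: the check never ran
    if receipt["exit_code"] != 0 or not receipt.get("assertions_ok", True):
        return "failed"
    if receipt["checkout"] != current_checkout:
        return "stale"  # passed, but against a different tree
    recorded_deps = receipt.get("deps", {})
    if any(current_deps.get(name) != rev for name, rev in recorded_deps.items()):
        return "stale"  # a dependency checkout moved since the run
    return "present"


passing = {"exit_code": 0, "checkout": "abc123", "deps": {"libfoo": "r1"}}
```

For example, `classify_receipt(passing, "def456", {"libfoo": "r1"})` returns `"stale"` because the checkout identifier no longer matches, while `classify_receipt(None, "abc123", {})` returns `"missing"`.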
Final acceptance reads receipts
The final_acceptance reviewer receives a readiness summary built from the
receipts on disk: environment status, the delivery-relevant gates, and the
required receipts classified present, missing, failed, or stale. Its policy
is explicit:
Readiness blockers should be based on missing/failed/stale/invalid declared
receipts, not only an ad-hoc host command mismatch.

Exploratory commands the agent ran while debugging are counted and labelled as not authoritative. Reviewers read receipts, not re-runs; a narrative “tests pass” is not proof.
The summary is advisory, but the classification behind it is load-bearing: after parsing the release verdict, the engine merges its own computed gaps — one per required delivery command classified missing, failed, or stale — and forces the acceptance to a rejection. A reviewer that omits an unproven required gate cannot produce a green acceptance.
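The merge step described above might look like the following sketch. The verdict shape (`accepted`, `gaps`), the gap string format, and `enforce_acceptance` are hypothetical; the only behaviour taken from this page is that each required command classified missing, failed, or stale adds a gap and forces a rejection.

```python
BLOCKING = {"missing", "failed", "stale"}


def enforce_acceptance(verdict: dict, required: list, statuses: dict) -> dict:
    """Merge engine-computed gaps into the reviewer's parsed verdict.

    A required command with no recorded status is treated as missing.
    """
    computed = []
    for name in required:
        status = statuses.get(name, "missing")
        if status in BLOCKING:
            computed.append(f"{name}: {status}")

    merged = dict(verdict)
    known = list(verdict.get("gaps", []))
    merged["gaps"] = known + [gap for gap in computed if gap not in known]
    if computed:
        merged["accepted"] = False  # the reviewer cannot override an unproven gate
    return merged


reviewer_verdict = {"accepted": True, "gaps": []}
final = enforce_acceptance(
    reviewer_verdict,
    required=["tests", "lint"],
    statuses={"tests": "stale", "lint": "present"},
)
```

In this run the reviewer said yes, but the stale `tests` receipt turns the result into a rejection with one recorded gap.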
For the on-disk receipt shape and worked examples, read Verification receipts.
Deep reference
The canonical engineering docs live with the code: