The engine model

Everything the operator sees — phases, gates, handoffs, retries — comes from one small model. A profile is projected into ordered steps, each step runs through a deterministic lifecycle, and each decision in that lifecycle has exactly one owner.

[Figure: engine, four concepts in one nesting]

Four first-class concepts

Each concept answers one question, and only that question:

Concept	Question it answers
`Profile`	What pipeline shape achieves this goal?
`PhaseStep`	What does this single step do (phase + role + skill)?
`LoopStep`	When does this PhaseStep retry?
`ExecutionMode`	How does this phase actually run internally?

The split exists so each decision has one owner. Profile authors change workflow shape. Phase plugins change step behavior. Runtime adapters change the model backend. None of them can silently override the others.

On top of the four, cross-cutting concerns are opt-in and first-class: quality gates (a registered post-phase check plus a fail policy), human review (a blocking interactive checkpoint), attachments (multimodal prompt context), and skills (portable instruction packages).

This model replaced an earlier design built on a mode enum, a flat list of phase names, and imperative run loops. In the current engine, loops are declarative rather than imperative, and gates and review are first-class instead of ad hoc.

A run is a projected profile

A `Profile` carries a name, kind, variant, description, and steps. Kinds are `FULL_CYCLE`, `SCOPED`, and `CUSTOM`. The profile namespace is flat, and every name resolves through the same loader via `orcho run --profile <name>`.

The `ProfileExecutor` walks the steps in order. A `PhaseStep` runs once. A `LoopStep` re-runs its inner steps round by round until its `until` predicate is satisfied. This is how the plan/validate loop works, with the validation verdict as the predicate. Retry is a property of the profile shape, not something a phase improvises.
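The walk described above can be sketched in a few lines. This is an illustration under stated assumptions, not the engine's implementation: the `run` handler field, the `max_rounds` bound, the dict-based state, and the `walk` function are all hypothetical names introduced here. Only the `PhaseStep` and `LoopStep` concepts and the `until` predicate come from the text.

```python
from dataclasses import dataclass
from typing import Callable, Union

State = dict  # hypothetical: shared run state passed between steps


@dataclass
class PhaseStep:
    phase: str
    role: str
    skill: str
    run: Callable[[State], None]  # hypothetical handler, for this sketch only


@dataclass
class LoopStep:
    steps: list["Step"]
    until: Callable[[State], bool]  # predicate checked after every round
    max_rounds: int = 3             # assumed safety bound, not from the docs


Step = Union[PhaseStep, LoopStep]


def walk(steps: list[Step], state: State) -> None:
    """Walk steps in order: a PhaseStep runs once, a LoopStep repeats."""
    for step in steps:
        if isinstance(step, PhaseStep):
            step.run(state)
            continue
        for _ in range(step.max_rounds):
            walk(step.steps, state)      # one round of the inner steps
            if step.until(state):        # retry is decided by the profile shape
                break


# Usage: a plan/validate loop whose predicate is the validation verdict.
def plan(state: State) -> None:
    state["rounds"] = state.get("rounds", 0) + 1


def validate(state: State) -> None:
    state["approved"] = state["rounds"] >= 2  # approve on the second round


def implement(state: State) -> None:
    state["implemented"] = True


state: State = {}
walk(
    [
        LoopStep(
            steps=[
                PhaseStep("plan", "planner", "planning", plan),
                PhaseStep("validate", "reviewer", "plan-review", validate),
            ],
            until=lambda s: s["approved"],
        ),
        PhaseStep("implement", "developer", "coding", implement),
    ],
    state,
)
```

Neither `plan` nor `validate` knows it is inside a loop. The repetition lives entirely in the `LoopStep`, which is the point the paragraph makes.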

The phase lifecycle FSM

Inside a single PhaseStep, execution flows through a fixed state machine. The ordering is deterministic, so gates and human review plug into reserved seats rather than re-deriving when they run:

phase lifecycle · fixed stage order

Stage	Owner
`entry`	runtime
`before_review`	HumanReview backend
`execute`	ExecutionMode dispatcher
`gates`	QualityGateRunner
`after_review`	HumanReview backend
`adapter`	SessionAdapterRegistry
`checkpoint`	runtime
`metrics`	runtime

Each step produces one StepOutcome:

Outcome	Meaning
`COMPLETED`	All stages passed; the runtime advances to the next step.
`SKIPPED`	Explicit skip; the next step proceeds with state unchanged.
`RETRY_REQUESTED`	The loop dispatcher re-executes the current step, carrying a retry payload.
`HALTED`	Terminal stop with a reason; the profile walker stops.
`FAILED`	An exception was captured; treated like `HALTED` but classified separately in the run summary.

Error policy is also stage-owned. A handler exception in `execute` becomes `FAILED` and is captured in the run summary. A gate handler bug surfaces as a failed gate verdict rather than a crash. Errors in `checkpoint` and `metrics` log a warning and continue, because those stages are observability, not control flow.
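A minimal sketch of the fixed stage order and the stage-owned error policy follows. The stage names and outcome names are taken from the tables above; everything else is an assumption made for illustration: the `run_lifecycle` function, the handler dictionary, treating a failed gate verdict as `HALTED` (the real result depends on the gate's fail policy), and mapping exceptions in stages the text does not mention to `FAILED`. For brevity the sketch also returns early without running the remaining stages.

```python
from enum import Enum
from typing import Callable, Optional


class StepOutcome(Enum):
    COMPLETED = "completed"
    SKIPPED = "skipped"
    RETRY_REQUESTED = "retry_requested"
    HALTED = "halted"
    FAILED = "failed"


# Fixed order, as listed in the lifecycle table.
STAGES = (
    "entry", "before_review", "execute", "gates",
    "after_review", "adapter", "checkpoint", "metrics",
)
OBSERVABILITY = {"checkpoint", "metrics"}

Handler = Callable[[], Optional[StepOutcome]]


def run_lifecycle(handlers: dict[str, Handler], warnings: list[str]) -> StepOutcome:
    """Run the registered handlers in the fixed stage order (hypothetical API)."""
    for stage in STAGES:
        handler = handlers.get(stage)
        if handler is None:
            continue  # nothing registered for this seat
        try:
            early = handler()
        except Exception as exc:
            if stage in OBSERVABILITY:
                warnings.append(f"{stage} failed: {exc}")  # warn and continue
                continue
            if stage == "gates":
                # Assumption: a gate bug counts as a failed verdict, and this
                # sketch applies a halting fail policy to it.
                return StepOutcome.HALTED
            return StepOutcome.FAILED  # captured, never re-raised
        if early is not None:
            return early  # e.g. SKIPPED or RETRY_REQUESTED
    return StepOutcome.COMPLETED


# Usage: a checkpoint error does not change the outcome.
def broken_checkpoint() -> None:
    raise RuntimeError("disk full")


def broken_execute() -> None:
    raise ValueError("handler bug")


seen: list[str] = []
warnings: list[str] = []
ok = run_lifecycle(
    {
        "execute": lambda: seen.append("execute"),
        "checkpoint": broken_checkpoint,
        "metrics": lambda: seen.append("metrics"),
    },
    warnings,
)
failed = run_lifecycle({"execute": broken_execute}, [])
```

Here `ok` is `StepOutcome.COMPLETED` with one recorded warning, while `failed` is `StepOutcome.FAILED`.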

Handoff is a pause, not an error

A phase handoff is not a lifecycle stage. It lives one layer up, in the loop dispatcher, driven by a declarative per-step PhaseHandoffPolicy.

After the FSM returns `COMPLETED`, the dispatcher inspects the step’s verdict. When the policy’s trigger fires (for example, a rejected verdict on the final automatic round), the run pauses with a canonical status of `awaiting_phase_handoff`, a typed payload of available actions, and an emitted handoff event.

Two properties keep this honest:

`available_actions` is runtime-produced. The dispatcher decides which subset of `continue`, `retry_feedback`, `halt`, and `continue_with_waiver` is valid for this pause, and an action outside the published list is refused.
Decide and resume are split. The decision is written as a durable artifact and never spawns a process; resuming the run is a separate operation that reads the decision and applies its semantics.
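The two properties can be illustrated with a small sketch. It assumes a JSON file as the durable artifact; the file name `handoff_decision.json` and the `decide` and `resume` functions are hypothetical and are not the engine's API. Only the action names come from the text.

```python
import json
import tempfile
from pathlib import Path

DECISION_FILE = "handoff_decision.json"  # hypothetical artifact name


def decide(run_dir: Path, available_actions: list[str], action: str) -> Path:
    """Record a decision durably. Starts no process and resumes nothing."""
    if action not in available_actions:
        # Actions outside the published list are refused.
        raise ValueError(f"{action!r} is not available for this pause")
    path = run_dir / DECISION_FILE
    path.write_text(json.dumps({"action": action}))
    return path


def resume(run_dir: Path) -> str:
    """Separate operation: read the recorded decision and report what to apply."""
    decision = json.loads((run_dir / DECISION_FILE).read_text())
    return decision["action"]


# Usage: the dispatcher published only two of the four actions for this pause.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    published = ["retry_feedback", "halt"]
    decide(run_dir, published, "halt")
    applied = resume(run_dir)
    try:
        decide(run_dir, published, "continue_with_waiver")
        refused = False
    except ValueError:
        refused = True
```

Because `decide` only writes a file, it can be called from any interface without side effects on the run, and `resume` can be retried safely since it re-reads the same artifact.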

The FSM describes what happens inside a phase. The handoff contract describes when the engine deliberately stops between phases and asks.

Where DAG execution fits

`ExecutionMode` owns how a step runs internally. The built-in mode is `linear` (one handler call), and plugins can register additional modes. Subtask DAG execution is not a separate mode: it is the `implementation_execution=subtask_dag` policy, which changes how the `implement` step consumes the approved plan (one focused worker turn per ready subtask instead of one whole-plan turn).

This is why DAG execution is invisible to the rest of the model. The profile still names the same `implement` step, the lifecycle FSM still runs the same stages, and gates still run in the same seat. Only the inside of `execute` changes shape.
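The phrase "one focused worker turn per ready subtask" can be sketched with the standard library's `graphlib.TopologicalSorter`. This is an assumption about the general technique, not a description of how the engine schedules subtasks: the `run_subtask_dag` function, the dependency mapping, and the subtask names are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Callable


def run_subtask_dag(
    deps: dict[str, set[str]],
    worker: Callable[[str], None],
) -> list[str]:
    """Give each subtask one worker turn as soon as its dependencies are done.

    `deps` maps a subtask to the set of subtasks it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the plan has a cycle
    completed: list[str] = []
    while sorter.is_active():
        # Sorted only to make the order deterministic in this sketch.
        for subtask in sorted(sorter.get_ready()):
            worker(subtask)       # one focused turn for this subtask
            sorter.done(subtask)  # may unlock dependents for the next pass
            completed.append(subtask)
    return completed


# Usage: a hypothetical approved plan with four subtasks.
plan_deps = {
    "schema": set(),
    "api": {"schema"},
    "docs": {"schema"},
    "ui": {"api"},
}
turns: list[str] = []
order = run_subtask_dag(plan_deps, turns.append)
```

From the caller's side this is still a single `execute` stage: the function takes a plan and returns when every subtask has had its turn.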

Deep reference

The canonical engineering docs live with the code: