
The engine model

Everything the operator sees — phases, gates, handoffs, retries — comes from one small model. A profile is projected into ordered steps, each step runs through a deterministic lifecycle, and each decision in that lifecycle has exactly one owner.

[Figure: engine · four concepts, one nesting. Profile is the pipeline shape; the ProfileExecutor walks the steps in order. PhaseStep is one semantic step. LoopStep is a bounded retry: a PhaseStep re-run round by round until its predicate is satisfied. Each step executes through the ExecutionMode dispatcher (linear, plus plugin-registered modes) and an IAgentRuntime (claude, codex, gemini, …).]

Each concept answers one question, and only that question:

| Concept | Question it answers |
| --- | --- |
| Profile | What pipeline shape achieves this goal? |
| PhaseStep | What does this single step do (phase + role + skill)? |
| LoopStep | When does this PhaseStep retry? |
| ExecutionMode | How does this phase actually run internally? |

The split exists so each decision has one owner. Profile authors change workflow shape. Phase plugins change step behavior. Runtime adapters change the model backend. None of them can silently override the others.

On top of the four, cross-cutting concerns are opt-in and first-class: quality gates (a registered post-phase check plus a fail policy), human review (a blocking interactive checkpoint), attachments (multimodal prompt context), and skills (portable instruction packages).

This model replaced an earlier design built on a mode enum, a flat list of phase names, and imperative run loops. In the current engine, loops are declarative rather than imperative, and gates and review are first-class instead of ad hoc.

A Profile carries a name, kind, variant, description, and steps. Kinds are FULL_CYCLE, SCOPED, and CUSTOM; the profile namespace is flat, and every name resolves through the same loader via orcho run --profile <name>.

The ProfileExecutor walks the steps in order. A PhaseStep runs once. A LoopStep re-runs its inner steps round by round until its until predicate is satisfied — this is how the plan/validate loop works, with the validation verdict as the predicate. Retry is a property of the profile shape, not something a phase improvises.
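
The walk described above can be sketched in a few lines. This is a minimal illustration, not the engine's code: the dataclass fields, the `max_rounds` bound, the `run_profile` function, and the dict-based state are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

State = Dict[str, object]  # hypothetical shared run state


@dataclass
class PhaseStep:
    name: str
    handler: Callable[[State], None]  # does the step's work on the state


@dataclass
class LoopStep:
    steps: List[PhaseStep]
    until: Callable[[State], bool]  # checked after every round
    max_rounds: int = 3             # assumed name for the retry bound


@dataclass
class Profile:
    name: str
    steps: List[Union[PhaseStep, LoopStep]]


def run_profile(profile: Profile, state: State) -> List[str]:
    """Walk the steps in order and return the names of the phases that ran."""
    trace: List[str] = []
    for step in profile.steps:
        if isinstance(step, PhaseStep):
            step.handler(state)  # a PhaseStep runs once
            trace.append(step.name)
            continue
        for _ in range(step.max_rounds):  # a LoopStep runs round by round
            for inner in step.steps:
                inner.handler(state)
                trace.append(inner.name)
            if step.until(state):
                break
    return trace


def plan(state: State) -> None:
    state["drafts"] = int(state.get("drafts", 0)) + 1


def validate(state: State) -> None:
    # Toy verdict: the second draft is the first one approved.
    state["approved"] = int(state["drafts"]) >= 2


def implement(state: State) -> None:
    state["implemented"] = True


profile = Profile(
    name="demo",
    steps=[
        LoopStep(
            steps=[PhaseStep("plan", plan), PhaseStep("validate", validate)],
            until=lambda s: bool(s["approved"]),
        ),
        PhaseStep("implement", implement),
    ],
)
state: State = {}
trace = run_profile(profile, state)
```

In this toy run the plan/validate pair executes twice before `implement` runs once. The retry lives entirely in the `LoopStep` declaration; neither handler knows it is being looped.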

Inside a single PhaseStep, execution flows through a fixed state machine. The ordering is deterministic, so gates and human review plug into reserved seats rather than re-deriving when they run:

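[Figure: phase lifecycle · fixed stage order: entry → before_review → execute → gates → after_review → adapter → checkpoint → metrics. before_review and after_review are the human review seats, execute is the execution mode, gates are the quality gates, and checkpoint and metrics are observability (errors log, never crash).]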
| Stage | Owner |
| --- | --- |
| entry | runtime |
| before_review | HumanReview backend |
| execute | ExecutionMode dispatcher |
| gates | QualityGateRunner |
| after_review | HumanReview backend |
| adapter | SessionAdapterRegistry |
| checkpoint | runtime |
| metrics | runtime |
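
One way to picture "reserved seats" is a lifecycle whose stage order is a constant, with handlers plugged into named stages. The sketch below is illustrative only: the `Lifecycle` class and its `plug` and `run` methods are hypothetical names, and only the stage names come from the list of stages above.

```python
from typing import Callable, Dict, List

# Stage names as listed above; the order is fixed and never derived at run time.
STAGE_ORDER = (
    "entry",
    "before_review",
    "execute",
    "gates",
    "after_review",
    "adapter",
    "checkpoint",
    "metrics",
)


class Lifecycle:
    """Hypothetical holder for one handler per reserved seat."""

    def __init__(self) -> None:
        self._seats: Dict[str, Callable[[List[str]], None]] = {}

    def plug(self, stage: str, handler: Callable[[List[str]], None]) -> None:
        if stage not in STAGE_ORDER:
            raise ValueError(f"unknown stage: {stage}")
        self._seats[stage] = handler

    def run(self) -> List[str]:
        log: List[str] = []
        for stage in STAGE_ORDER:  # always the same order
            handler = self._seats.get(stage)
            if handler is not None:
                handler(log)
        return log


lifecycle = Lifecycle()
# Registration order is deliberately scrambled.
lifecycle.plug("gates", lambda log: log.append("gates"))
lifecycle.plug("execute", lambda log: log.append("execute"))
lifecycle.plug("before_review", lambda log: log.append("before_review"))
order = lifecycle.run()
```

Whatever order the handlers are registered in, they run in stage order, so a gate or review backend never has to work out when it should fire.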

Each step produces one StepOutcome:

| Outcome | Meaning |
| --- | --- |
| COMPLETED | All stages passed; the runtime advances to the next step. |
| SKIPPED | Explicit skip; the next step proceeds with state unchanged. |
| RETRY_REQUESTED | The loop dispatcher re-executes the current step, carrying a retry payload. |
| HALTED | Terminal stop with a reason; the profile walker stops. |
| FAILED | An exception was captured; treated like HALTED but classified separately in the run summary. |

Error policy is also stage-owned. A handler exception in execute becomes FAILED and is captured in the run summary. A gate handler bug surfaces as a failed gate verdict rather than a crash. Checkpoint and metrics errors log a warning and continue, because they are observability, not control flow.
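
The stage-owned error policy can be sketched as follows. This is an assumption-laden illustration: the `run_step` function, its parameters, and the `on_gate_fail` default are hypothetical, and the real gate fail policy is configurable in ways this sketch does not model. Only the outcome names come from the list of outcomes above.

```python
import logging
from enum import Enum, auto
from typing import Callable, Optional, Tuple

logger = logging.getLogger("engine_sketch")


class StepOutcome(Enum):
    COMPLETED = auto()
    SKIPPED = auto()
    RETRY_REQUESTED = auto()
    HALTED = auto()
    FAILED = auto()


def run_step(
    execute: Callable[[], None],
    gate: Optional[Callable[[], bool]] = None,
    checkpoint: Optional[Callable[[], None]] = None,
    on_gate_fail: StepOutcome = StepOutcome.HALTED,  # assumed fail policy
) -> Tuple[StepOutcome, Optional[bool]]:
    """Return (outcome, gate verdict); the verdict is None when no gate ran."""
    try:
        execute()
    except Exception:
        # A handler exception in execute is captured as FAILED.
        return StepOutcome.FAILED, None

    verdict: Optional[bool] = None
    if gate is not None:
        try:
            verdict = bool(gate())
        except Exception:
            # A gate handler bug surfaces as a failed verdict, not a crash.
            verdict = False

    if checkpoint is not None:
        try:
            checkpoint()
        except Exception as exc:
            # Observability errors log a warning and never change control flow.
            logger.warning("checkpoint failed: %s", exc)

    if verdict is False:
        return on_gate_fail, verdict
    return StepOutcome.COMPLETED, verdict


def boom() -> None:
    raise RuntimeError("simulated bug")


failed = run_step(boom)
gate_bug = run_step(lambda: None, gate=boom)
checkpoint_bug = run_step(lambda: None, checkpoint=boom)
```

The three calls at the end exercise one error path each: `failed` comes back as FAILED, `gate_bug` carries a failed verdict, and `checkpoint_bug` still completes.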

A phase handoff is not a lifecycle stage. It lives one layer up, in the loop dispatcher, driven by a declarative per-step PhaseHandoffPolicy.

After the FSM returns COMPLETED, the dispatcher inspects the step’s verdict. When the policy’s trigger fires — for example, a rejected verdict on the final automatic round — the run pauses with a canonical status of awaiting_phase_handoff, a typed payload of available actions, and an emitted handoff event.

Two properties keep this honest:

  • available_actions is runtime-produced. The dispatcher decides which subset of continue, retry_feedback, halt, and continue_with_waiver is valid for this pause, and an action outside the published list is refused.
  • Decide and resume are split. The decision is written as a durable artifact and never spawns a process; resuming the run is a separate operation that reads the decision and applies its semantics.
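
The two properties above can be illustrated with a small sketch. The function names, the JSON file layout, and the dict-shaped pause payload are assumptions for the example; only the status string and the four action names come from the text.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List

ALL_ACTIONS = ("continue", "retry_feedback", "halt", "continue_with_waiver")


def publish_pause(valid_actions: List[str]) -> Dict[str, object]:
    """Runtime side: publish the subset of actions valid for this pause."""
    unknown = [a for a in valid_actions if a not in ALL_ACTIONS]
    if unknown:
        raise ValueError(f"unknown actions: {unknown}")
    return {
        "status": "awaiting_phase_handoff",
        "available_actions": list(valid_actions),
    }


def decide(pause: Dict[str, object], action: str, decision_path: Path) -> None:
    """Write the decision as a durable artifact; nothing is resumed here."""
    if action not in pause["available_actions"]:
        raise ValueError(f"{action!r} is not in the published action list")
    decision_path.write_text(json.dumps({"action": action}))


def resume(decision_path: Path) -> str:
    """Separate operation: read the recorded decision and return the action."""
    return json.loads(decision_path.read_text())["action"]


with tempfile.TemporaryDirectory() as tmp:
    decision_file = Path(tmp) / "handoff_decision.json"  # hypothetical file name
    pause = publish_pause(["retry_feedback", "halt"])

    try:
        decide(pause, "continue_with_waiver", decision_file)
        refused = False
    except ValueError:
        refused = True  # not published for this pause, so it is refused

    decide(pause, "retry_feedback", decision_file)
    applied = resume(decision_file)
```

Because `decide` only writes a file, a decision can be recorded, inspected, or replaced without anything starting to run; `resume` is the only place where the decision takes effect.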

The FSM describes what happens inside a phase. The handoff contract describes when the engine deliberately stops between phases and asks.

ExecutionMode owns how a step runs internally. The built-in mode is linear — one handler call — and plugins can register additional modes. Subtask DAG execution is not a separate mode: it is the implementation_execution=subtask_dag policy, which changes how the implement step consumes the approved plan (one focused worker turn per ready subtask instead of one whole-plan turn).

This is why DAG execution is invisible to the rest of the model. The profile still names the same implement step, the lifecycle FSM still runs the same stages, and gates still run in the same seat. Only the inside of execute changes shape.
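
A rough sketch of both ideas follows, under stated assumptions: the `MODES` registry, the `register_mode` decorator, the `run_implement` function, and the plan-as-dependency-dict format are all hypothetical. Only the mode name `linear` and the policy value `subtask_dag` come from the text. The dependency ordering uses the standard-library `graphlib.TopologicalSorter`.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Optional

Handler = Callable[[str], str]
MODES: Dict[str, Callable[[Handler, str], str]] = {}  # hypothetical registry


def register_mode(name: str):
    """Plugins could register additional modes the same way."""
    def decorator(fn: Callable[[Handler, str], str]):
        MODES[name] = fn
        return fn
    return decorator


@register_mode("linear")
def linear(handler: Handler, prompt: str) -> str:
    return handler(prompt)  # the built-in mode: one handler call


def run_implement(
    plan: Dict[str, List[str]],  # subtask -> subtasks it depends on
    worker: Handler,
    implementation_execution: Optional[str] = None,
) -> List[str]:
    """Same implement step either way; only the inside changes shape."""
    if implementation_execution == "subtask_dag":
        sorter = TopologicalSorter(plan)
        sorter.prepare()
        turns: List[str] = []
        while sorter.is_active():
            # One focused worker turn per ready subtask.
            for subtask in sorted(sorter.get_ready()):
                turns.append(MODES["linear"](worker, subtask))
                sorter.done(subtask)
        return turns
    # Default in this sketch: one whole-plan turn.
    return [MODES["linear"](worker, ", ".join(sorted(plan)))]


def worker(task: str) -> str:
    return f"worker: {task}"


plan = {"schema": [], "api": ["schema"], "docs": ["schema"], "ui": ["api"]}
whole_turns = run_implement(plan, worker)
dag_turns = run_implement(plan, worker, implementation_execution="subtask_dag")
```

The caller sees the same `run_implement` call and the same step name in both cases. The policy only decides whether the worker gets one whole-plan turn or one turn per subtask in dependency order.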

The canonical engineering docs live with the code: