
MCP control plane anatomy

The MCP server is a control plane: a thin protocol adapter over focused implementation domains, exposing a run’s durable state as typed capabilities. This page is the design view; the workflow pages show how to drive it.

| Capability | Tools | Effect |
|---|---|---|
| act | run_start · run_resume · run_cancel | Drives a subprocess. |
| observe | run_status · run_events_tail · run_metrics · run_history | Reads evolving state; never mutates. |
| decide | phase_handoff_decide · delivery_decide | Writes a decision artifact; never spawns. |
| inspect | run_evidence · run_diagnose · delivery_gate | Projects durable artifacts into typed slices. |

A run flows through all four: spawn, observe, decide the pause, inspect.

Run lifecycle and state transitions are deliberately separate concerns. orcho_run_start, orcho_run_resume, and orcho_run_cancel drive subprocesses: start spawns a detached process group and returns immediately. Observation (orcho_run_status, orcho_run_events_tail, orcho_run_metrics, orcho_run_history) reads the run’s evolving state and mutates nothing, so it is safe to poll in a tight loop.
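As a rough illustration of "spawns a detached process group and returns immediately", here is a minimal POSIX sketch using only the standard library. The function name `spawn_detached` and the argv are assumptions made for the example; this is not the server's actual supervisor code.

```python
import os
import subprocess
import sys


def spawn_detached(argv: list[str]) -> subprocess.Popen:
    """Launch argv in its own session and process group, without waiting for it."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,   # keep the child away from the protocol streams
        stderr=subprocess.DEVNULL,
        start_new_session=True,      # POSIX: the child calls setsid() and leads a new group
    )


if __name__ == "__main__":
    proc = spawn_detached([sys.executable, "-c", "import time; time.sleep(0.2)"])
    # Popen has already returned while the child is still running.
    print("still running:", proc.poll() is None)
    print("leads its own group:", os.getpgid(proc.pid) == proc.pid)
    proc.wait()
```

Because the child leads its own process group, a later cancel can signal the whole group rather than a single pid.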

Decisions are state transitions, not process actions: orcho_phase_handoff_decide resolves a run paused on awaiting_phase_handoff, orcho_delivery_decide resolves a parked delivery or correction gate, and neither ever spawns a process.

Inspection reads existing artifacts and projects them into typed slices: orcho_run_evidence for plan, findings, commands, artifacts, and errors; orcho_run_diagnose for a read-only resume-situation verdict; orcho_delivery_gate for the delivery gate’s available and blocked actions. Control-loop clients use these instead of raw logs.

mcp · the control loop
[Diagram: orcho_run_start (act · returns at once) → pipeline runs detached (own process group) → observe: status · events · metrics (safe to poll) → on awaiting_phase_handoff: phase_handoff_decide (writes a durable decision), then run_resume (fresh subprocess); on terminal: run_diagnose · run_evidence (inspect · read-only).]
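The loop above can be sketched from a client's point of view. The tool names come from this page; everything else is an assumption for illustration: the `call_tool` callable, the argument and result field names (`run_id`, `status`, `handoff_id`, `decision`), and the terminal status strings.

```python
import time
from typing import Any, Callable

# Hypothetical transport: takes a tool name and arguments, returns the tool result.
ToolCall = Callable[[str, dict[str, Any]], dict[str, Any]]

# Assumed terminal status names; the page only says runs finish, fail, or get cancelled.
TERMINAL = {"done", "failed", "cancelled"}


def drive_run(call_tool: ToolCall, start_args: dict[str, Any], poll_s: float = 1.0) -> dict[str, Any]:
    """Act, observe, decide the pause, then inspect. Payload fields are hypothetical."""
    run_id = call_tool("orcho_run_start", start_args)["run_id"]  # returns at once
    while True:
        status = call_tool("orcho_run_status", {"run_id": run_id})  # read-only poll
        state = status["status"]
        if state == "awaiting_phase_handoff":
            # The decision is a state transition; it never spawns a process.
            call_tool(
                "orcho_phase_handoff_decide",
                {"run_id": run_id, "handoff_id": status["handoff_id"], "decision": "approve"},
            )
            # Resuming is a separate act step that starts a fresh subprocess.
            call_tool("orcho_run_resume", {"run_id": run_id})
        elif state in TERMINAL:
            # Inspect through typed slices instead of raw logs.
            return call_tool("orcho_run_evidence", {"run_id": run_id})
        else:
            time.sleep(poll_s)
```

A real client would make the decision itself (or ask a human) rather than hard-coding "approve"; the point is only the ordering of act, observe, decide, and inspect.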

The package keeps the MCP wire surface thin and the behavior focused. Adapter layers register with FastMCP and delegate; they contain no business logic. Implementation domains each own a coherent slice.

| Adapter layer | Role |
|---|---|
| tools.py | @mcp.tool registration, one-line delegation to domain modules |
| resources/ | MCP resource URI handlers; no SDK access, no direct file IO |
| schemas/ | Pydantic models that define the MCP wire shape |
| instance.py | The shared FastMCP instance every adapter imports |

| Domain | Owns |
|---|---|
| services/ | SDK-backed reads, run-projection read-models, error mapping |
| observe/ | Event summaries, watch, handoff hints |
| run_control/ | Start / resume / cancel / handoff-decision implementation |
| inspection/ | Evidence and diff reads |
| authoring/ | Plan validation and prompt resolution |
| supervisor/ | Subprocess lifecycle: spawn, state IO, recovery, cancel, reap |

Import rules keep the layers apart, and they are enforced by executable architecture guards, not convention:

  • tools.py must not import the SDK; handlers delegate to the matching domain.
  • resources/ reads go through services/, which is the canonical home for SDK access.
  • The supervisor owns its own state file (mcp_supervisor.json), while meta.json stays the pipeline’s contract.
  • Domain modules never import from the adapter layers above them.
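One way such an executable guard could look is a test that parses a module's source and fails on forbidden imports. The sketch below uses only the standard `ast` module; the package name `orcho_sdk` and the helper names are hypothetical, and the project's real guards may work differently.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return every module name imported anywhere in the given Python source."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module)
    return found


def violations(source: str, forbidden_prefixes: tuple[str, ...]) -> list[str]:
    """List imports that equal, or sit underneath, a forbidden package."""
    return sorted(
        name
        for name in imported_modules(source)
        if any(name == p or name.startswith(p + ".") for p in forbidden_prefixes)
    )
```

A guard test would read tools.py from disk and assert that `violations(text, ("orcho_sdk",))` is empty, so a stray SDK import fails the build instead of relying on review.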

Every request travels the same spine:

mcp · the request spine
[Diagram: client / stdin (JSON-RPC frames only) → FastMCP stdio loop (protocol adapter) → tools.py handler (thin shim · no SDK) → domain module (owns the behavior) → SDK read (no subprocess, no lock) or supervisor spawn (own process group) → schemas/ wire model (typed Pydantic shape) → client / stdout (logging goes to stderr).]

A read like orcho_run_status goes through services/ to the SDK, merges pipeline state with supervisor truth, and returns a Pydantic wire model — no subprocess, no lock. A spawn like orcho_run_start goes through run_control/ to the supervisor, which builds argv via the SDK and launches the pipeline in its own process group before responding.

stdin and stdout are reserved for JSON-RPC frames. No print() is allowed in handler code paths — a single rogue write to stdout corrupts the next protocol frame. Logging goes to stderr.
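A minimal sketch of the stderr-only logging rule, using the standard `logging` module. The logger name and the function `configure_logging` are hypothetical; the server's real logging setup is not shown on this page.

```python
import logging
import sys


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Route all log output to stderr so stdout carries protocol frames only."""
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger = logging.getLogger("mcp_server_sketch")  # hypothetical logger name
    logger.handlers[:] = [handler]   # replace any handler that might target stdout
    logger.setLevel(level)
    logger.propagate = False         # keep records away from root handlers
    return logger
```

Handler code would then call `logger.info(...)` where it might otherwise be tempted to `print()`.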

Whether the server can mutate a run is a durable, on-disk fact, not a session guess. A run this server spawned carries mcp_supervisor.json and is mcp_controllable: resume and handoff decisions proceed. A foreign or CLI-started run has only meta.json and is inspect_only.

Inspection always works on both. On an inspect_only run the mutation tools refuse safely — before any spawn or decision write — by raising a typed InspectOnlyControlError whose payload carries read-only next steps. orcho_run_diagnose exposes the same classification as typed control and control_reason fields, so a client can branch before ever attempting a mutation.
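The classification and the safe refusal can be sketched as follows. The file names and the `InspectOnlyControlError` name come from this page; the error's constructor, the `next_steps` contents, and the helper functions are assumptions for illustration.

```python
import tempfile
from pathlib import Path


class InspectOnlyControlError(Exception):
    """Sketch of the typed refusal; the real payload shape is not shown on this page."""

    def __init__(self, reason: str, next_steps: list[str]):
        super().__init__(reason)
        self.reason = reason
        self.next_steps = next_steps


def classify_control(run_dir: Path) -> str:
    """Derive control from what is on disk, not from session memory."""
    if (run_dir / "mcp_supervisor.json").is_file():
        return "mcp_controllable"
    return "inspect_only"


def require_controllable(run_dir: Path) -> None:
    """Call first in every mutating tool, before any spawn or decision write."""
    if classify_control(run_dir) == "inspect_only":
        raise InspectOnlyControlError(
            "run has no mcp_supervisor.json",
            next_steps=["orcho_run_status", "orcho_run_evidence", "orcho_run_diagnose"],
        )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp)
        (run_dir / "meta.json").write_text("{}")  # a foreign or CLI-started run
        print(classify_control(run_dir))
```

Putting the check at the top of each mutating handler is what keeps the refusal side-effect free: nothing has been spawned or written by the time the error is raised.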

Phase-handoff decisions are recorded artifacts: <run_dir>/phase_handoff_decisions/{safe_handoff_id}.json persists who decided what and why. Replaying the exact same decision payload returns the persisted record unchanged — retries and double-submits cannot drift the audit text — while a diverging payload for the same handoff is a conflict, not a toggle.
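The replay-versus-conflict rule can be illustrated with a small first-writer-wins sketch. The directory layout follows the path above; the payload fields, the `DecisionConflict` name, and the function itself are assumptions, not the project's implementation.

```python
import json
import tempfile
from pathlib import Path


class DecisionConflict(Exception):
    """A different payload was already recorded for this handoff (hypothetical name)."""


def record_decision(run_dir: Path, safe_handoff_id: str, payload: dict) -> dict:
    """Persist a decision once; exact replays return the stored record, divergence raises."""
    path = run_dir / "phase_handoff_decisions" / f"{safe_handoff_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    try:
        # Mode "x" fails if the file already exists, so the first writer wins.
        with path.open("x", encoding="utf-8") as fh:
            json.dump(payload, fh, sort_keys=True)
        return payload
    except FileExistsError:
        stored = json.loads(path.read_text(encoding="utf-8"))
        if stored != payload:
            raise DecisionConflict(f"handoff {safe_handoff_id} is already decided") from None
        return stored  # replay: hand back the persisted record unchanged


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        decision = {"decision": "approve", "by": "alice", "why": "plan looks sound"}
        first = record_decision(Path(tmp), "h1", decision)
        again = record_decision(Path(tmp), "h1", dict(decision))
        print(first == again)
```

This sketch does not guard against a reader seeing a half-written file; a production version would typically write to a temporary file and link it into place.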

Observation is the disposable half: the reliable path is cursor-based replay from durable run state, so a dropped or timed-out orcho_run_watch is observer loss, never a run failure. A client reconnects with its last-seen event cursor and misses nothing.
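Cursor-based replay can be pictured with a toy in-memory log. The integer `seq` cursor, the `Event` shape, and `replay_after` are assumptions made for the sketch; the real cursor format is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    seq: int   # assumed: monotonically increasing position in the durable log
    kind: str


def replay_after(log: list[Event], cursor: int) -> tuple[list[Event], int]:
    """Return the events with seq greater than cursor, plus the cursor to remember next."""
    fresh = [event for event in log if event.seq > cursor]
    new_cursor = fresh[-1].seq if fresh else cursor
    return fresh, new_cursor
```

A client that lost its watch after seeing event 2 simply calls `replay_after(log, 2)` on reconnect; the run itself was never affected by the dropped observer.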

Two behaviors the control plane deliberately does not provide:

  • Arbitrary pause/unpause. Runs finish, fail, get cancelled, or pause only when a phase’s declared handoff policy fires. The one declarative pause point is orcho_phase_handoff_decide over awaiting_phase_handoff.
  • Smart resume. orcho_run_resume is rerun-with-checkpoint-context: a fresh subprocess against the existing run directory, respecting the effective profile. It is not “automatically skip already-done phases”.

The canonical engineering docs live with the code: