The prompt plane

In Orcho, a runtime prompt is not a string plus metadata. It is a typed PromptTurn, and every runtime, debug, and cache surface is projected from that same turn.

Figure: prompt plane, one object, many views. PromptSpec + phase inputs (what this phase must say) → PromptPart list (role · task · format · contracts · payload) → PromptTurn (the canonical prompt) → agent.invoke(turn.text) on the wire, plus debug + evidence (same turn, same bytes).

One object is composed; everything else is a view of it.

A prompt carries two different kinds of content:

  • policy — JSON contracts, review targets, handoff rules, language posture;
  • payload — task text, plans, diffs, critique, test output, repo maps.

If those bytes are flattened into a raw string while metadata lives somewhere else, the two surfaces drift: what the parser expects and what the model was told stop matching. Orcho avoids that by keeping every meaningful byte attached to a typed PromptPart until the final runtime boundary.
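A minimal sketch of that idea, not Orcho's actual classes: only the names PromptPart, PromptTurn, and turn.text come from this page. The fields `kind`, `category`, and `body` and the helper `policy_parts` are assumptions for illustration. The point it shows is that the wire string is derived from the typed parts on demand, so there is no second copy of the bytes that could drift.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PromptPart:
    kind: str      # hypothetical field: "role", "task", "contract", "artifact", ...
    category: str  # hypothetical field: "policy" or "payload"
    body: str


@dataclass(frozen=True)
class PromptTurn:
    parts: Tuple[PromptPart, ...]

    @property
    def text(self) -> str:
        # The wire string is computed from the parts and never stored on its own,
        # so metadata and bytes always describe the same content.
        return "\n\n".join(part.body for part in self.parts)

    def policy_parts(self) -> Tuple[PromptPart, ...]:
        # Hypothetical view: the policy bytes the parser side depends on.
        return tuple(p for p in self.parts if p.category == "policy")


turn = PromptTurn(parts=(
    PromptPart("task", "payload", "Review the diff below."),
    PromptPart("contract", "policy", 'Reply with JSON: {"verdict": "..."}'),
))
```

Because `text` and `policy_parts` are both views over the same tuple, a debug surface and the runtime cannot disagree about what the model was told.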

The composed prompt itself is a stack of two layers with different owners:

  1. the professional prompt layer — user-editable composable parts (role, task, format files, plus workspace and project overrides) carrying persona, professional method, and presentation;
  2. the system tail — code-owned parser contracts, handoff and review-target policy, language posture, and cross-project grammar. Always attached, never user-overridable.

The engine keeps this split honest with an internal ablation mode: it can render a phase with the professional layer swapped for a short code-owned minimal intent while keeping the identical system tail. That is measurement plumbing for evaluating what the professional layer is worth — it has no user-facing switch — but it is why the two layers stay genuinely separable instead of drifting into one blob.
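The ablation idea can be sketched roughly as follows. This is an illustration under assumptions, not the engine's code: `compose`, `MINIMAL_INTENT`, and the plain-string parts are hypothetical stand-ins. What it shows is that swapping the professional layer leaves the system tail byte-identical.

```python
from typing import Dict, List, Sequence

# Hypothetical code-owned fallback: one short intent line per phase.
MINIMAL_INTENT: Dict[str, str] = {
    "review": "Review the subject and report findings.",
}


def compose(phase: str,
            professional: Sequence[str],
            system_tail: Sequence[str],
            ablate: bool = False) -> List[str]:
    # In ablation mode the professional layer is replaced by the minimal intent;
    # the system tail is attached unchanged in both modes.
    head = [MINIMAL_INTENT[phase]] if ablate else list(professional)
    return head + list(system_tail)


professional = ["You are a senior reviewer.", "Check the diff for regressions."]
system_tail = ['Reply with JSON: {"verdict": "..."}', "Answer in English."]

full = compose("review", professional, system_tail)
ablated = compose("review", professional, system_tail, ablate=True)
```

Comparing outcomes for `full` and `ablated` isolates the contribution of the professional layer, since everything the parser depends on is the same in both renders.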

The same content passes through four stages, and the docs name each stage deliberately:

| Name | What it means |
| --- | --- |
| Review subject | A document to be reviewed: bundle markdown, a release summary. May be a plain str. |
| PromptPart | A typed semantic chunk: role, task, format, contract, artifact, codemap, critique. |
| PromptTurn | The canonical runtime prompt: ordered segments, parts, envelope, trace view, wire text. |
| Wire prompt | turn.text, the final string passed to the agent runtime's invoke. |

The distinction is a boundary rule, not vocabulary:

  • str is allowed before something becomes a runtime prompt;
  • PromptTurn is required once something is a runtime prompt;
  • turn.text is extracted only at the runtime invoke.

Strings never travel through the middle of the engine, and turns never get appended to as strings after a builder returns. Post-builder additions go through PromptTurnEditor, which preserves part metadata.
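The boundary rule could look something like this in code. Only PromptTurn, PromptPart, PromptTurnEditor, and turn.text are names from this page; the editor methods `prepend` and `build`, the `invoke_agent` wrapper, and the `echo_agent` stand-in are hypothetical. The sketch shows a post-builder addition that stays a typed part, and a single place where the string is extracted.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class PromptPart:
    kind: str
    body: str


@dataclass(frozen=True)
class PromptTurn:
    parts: Tuple[PromptPart, ...]

    @property
    def text(self) -> str:
        return "\n\n".join(part.body for part in self.parts)


class PromptTurnEditor:
    """Hypothetical editor: adds typed parts instead of concatenating strings."""

    def __init__(self, turn: PromptTurn) -> None:
        self._parts: List[PromptPart] = list(turn.parts)

    def prepend(self, part: PromptPart) -> "PromptTurnEditor":
        self._parts.insert(0, part)
        return self

    def build(self) -> PromptTurn:
        return PromptTurn(tuple(self._parts))


def invoke_agent(agent: Callable[[str], str], turn: PromptTurn) -> str:
    # The only place where the turn becomes a string.
    if not isinstance(turn, PromptTurn):
        raise TypeError("runtime prompts must be PromptTurn, not raw strings")
    return agent(turn.text)


def echo_agent(wire: str) -> str:
    # Stand-in for a real agent runtime; returns what it was sent.
    return wire


built = PromptTurn((PromptPart("task", "Fix the failing test."),))
edited = PromptTurnEditor(built).prepend(
    PromptPart("codemap", "src/: 3 modules")
).build()
```

The added codemap remains a part with its own kind, so later stages can still reason about it, which a string prefix would not allow.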

Figure: prompt plane, the normal flow. Composer (resolves role · task · format parts) → Builder (adds code-owned contracts + dynamic parts) → Source PromptTurn (the complete prompt; cache + session selection) → PromptTurnEditor (optional: prefix, codemap, hypothesis) → Session selector (continue or fresh → full or delta) → Effective PromptTurn (what is actually sent; published to the trace) → runtime invoke (effective.text only).

Whether a phase call continues the prior provider session at all is decided before any of this rendering — it is profile policy, not a cache heuristic. Each phase declares a session-continuity policy (always fresh, continue on loop rounds, or continue within the same write zone), and the engine projects it onto a continue-or-fresh decision. Planning and validation rounds deliberately start fresh so accumulated context baggage cannot leak into a judgment phase; only a continued session is even a candidate for a delta render.
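As a rough sketch, the projection from policy to decision might be written like this. The three policies are the ones named above; the enum, the function name, and the inputs (`loop_round`, `prev_zone`, `zone`) are assumptions about how a loop round and a write zone could be represented.

```python
from enum import Enum
from typing import Optional


class SessionPolicy(Enum):
    ALWAYS_FRESH = "always_fresh"
    CONTINUE_ON_LOOP_ROUNDS = "continue_on_loop_rounds"
    CONTINUE_WITHIN_WRITE_ZONE = "continue_within_write_zone"


def should_continue(policy: SessionPolicy,
                    loop_round: int,
                    prev_zone: Optional[str],
                    zone: Optional[str]) -> bool:
    """Return True to continue the prior provider session, False to start fresh."""
    if policy is SessionPolicy.ALWAYS_FRESH:
        return False
    if policy is SessionPolicy.CONTINUE_ON_LOOP_ROUNDS:
        # Assumption: round 0 is the first pass, later rounds are loop rounds.
        return loop_round > 0
    if policy is SessionPolicy.CONTINUE_WITHIN_WRITE_ZONE:
        # Assumption: a write zone is identified by a comparable label.
        return prev_zone is not None and prev_zone == zone
    raise ValueError(f"unknown policy: {policy!r}")
```

Note that the function never looks at cache state. A `True` result only makes the call a candidate for a delta render; a `False` result always means a full render.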

The source/effective split matters. Session-aware rendering keeps two turns:

  • the source turn is the complete prompt, used for cache and session selection — the selector must see the whole prompt;
  • the effective turn is the actual wire prompt, full or delta — debug output must show this one, because it is what the model received.

The effective turn is published to the trace slot immediately before invoke, so evidence and transcripts always describe the exact bytes on the wire.

Session continuity is valuable in review/repair loops — a resumed reviewer remembers the earlier finding and the contract — but stale memory is a risk. After a repair, the reviewer must not have to guess what changed. So a resumed review turn always carries a freshness packet of per-turn parts that a delta render can never drop:

  • the repair receipt — what the repair phase claims it fixed, waived, or left open (a claim to verify, not proof);
  • the current review subject — a fresh projection of the thing under review now;
  • the verification digests — what the developer side actually verified, and, at final acceptance, which required receipts are present, missing, failed, or stale.

The reviewer keeps its session, but answers against fresh evidence — never against memory alone.
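A delta render that honors the freshness packet could be sketched as below. The part kinds and the `already_sent` bookkeeping are hypothetical; the real engine decides omission from cache scope and stability, which this simplified version replaces with a set of kinds the session has already seen.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# Hypothetical kind names for the three freshness-packet parts listed above.
FRESHNESS_KINDS: FrozenSet[str] = frozenset(
    {"repair_receipt", "review_subject", "verification_digest"}
)


@dataclass(frozen=True)
class PromptPart:
    kind: str
    body: str


def delta_render(source: Tuple[PromptPart, ...],
                 already_sent: FrozenSet[str]) -> Tuple[PromptPart, ...]:
    # Drop parts the resumed session already holds, but never a freshness part,
    # even if a part of that kind was sent in an earlier round.
    return tuple(
        part for part in source
        if part.kind in FRESHNESS_KINDS or part.kind not in already_sent
    )


source = (
    PromptPart("role", "You are a reviewer."),
    PromptPart("contract", 'Reply with JSON: {"verdict": "..."}'),
    PromptPart("repair_receipt", "Claimed fixed: finding 2."),
    PromptPart("review_subject", "diff as of this round"),
    PromptPart("verification_digest", "tests: 41 passed"),
)
seen = frozenset(part.kind for part in source)  # everything was sent last round
effective = delta_render(source, seen)
```

Here `source` plays the role of the source turn and `effective` the effective turn: the selector sees everything, the wire carries only the fresh evidence.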

Provider prefix caching depends on a contiguous, byte-identical prefix. The engine therefore sorts parts into a cache-first physical layout: prefix-eligible parts first, then broader cache scopes before narrower ones —

GLOBAL → WORKSPACE → PROJECT → SESSION → NONE

Stable, widely reusable bytes land early in the wire prompt; volatile, turn-local bytes land late. Each PromptPart carries cache_scope (how broadly its bytes may be reused) and stability (how often the body changes), which also drive whether a part may be omitted on a resumed session’s delta render.
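The layout rule can be illustrated with a stable sort. The scope order is the one given above and `cache_scope` is named on this page; the `prefix_eligible` flag, the enum, and the function name are assumptions, and `stability` is left out to keep the sketch short.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Sequence


class CacheScope(IntEnum):
    # Broader scopes get smaller numbers so they sort first.
    GLOBAL = 0
    WORKSPACE = 1
    PROJECT = 2
    SESSION = 3
    NONE = 4


@dataclass(frozen=True)
class PromptPart:
    kind: str
    body: str
    cache_scope: CacheScope
    prefix_eligible: bool = True  # hypothetical flag


def cache_first(parts: Sequence[PromptPart]) -> List[PromptPart]:
    # Prefix-eligible parts first (False sorts before True), then broad to narrow.
    # sorted() is stable, so parts with equal keys keep their composed order.
    return sorted(parts, key=lambda p: (not p.prefix_eligible, p.cache_scope))


parts = [
    PromptPart("diff", "...", CacheScope.NONE),
    PromptPart("role", "...", CacheScope.GLOBAL),
    PromptPart("banner", "...", CacheScope.GLOBAL, prefix_eligible=False),
    PromptPart("project_rules", "...", CacheScope.PROJECT),
]
ordered = [p.kind for p in cache_first(parts)]
```

With this ordering, a change to the turn-local diff leaves the bytes before it untouched, so the provider can still match the shared prefix.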

Not every part is editable. The engine separates project-tunable parts from protected contracts owned by engine code:

| Parts | Owner |
| --- | --- |
| Role, task, and format parts | Project, workspace, or core prompt files. |
| Parser contracts, system tail, git policy, language posture, review target | Engine code. |

A project can tune how a reviewer behaves; it cannot accidentally delete the JSON contract the review parser depends on, because that contract never lived in an editable file. The tuning surface — which files override which parts, and in what order — is covered in Prompt engine.
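One way to picture that protection, as a sketch only: editable kinds are looked up in file layers, and the contract is attached afterwards by code. The lookup order used here (project, then workspace, then core) is an assumption; the actual order is the subject of Prompt engine. All names in the block are hypothetical.

```python
from typing import Dict, List, Tuple

EDITABLE_KINDS: Tuple[str, ...] = ("role", "task", "format")

# Code-owned: there is no file from which this could be overridden.
PARSER_CONTRACT = 'Reply with JSON: {"verdict": "..."}'


def build_parts(core: Dict[str, str],
                workspace: Dict[str, str],
                project: Dict[str, str]) -> List[Tuple[str, str]]:
    parts: List[Tuple[str, str]] = []
    for kind in EDITABLE_KINDS:
        # Assumed precedence: the most specific layer that defines the kind wins.
        for layer in (project, workspace, core):
            if kind in layer:
                parts.append((kind, layer[kind]))
                break
    # The contract is appended by code and is never read from any layer.
    parts.append(("parser_contract", PARSER_CONTRACT))
    return parts


core = {"role": "You are a reviewer.", "task": "Review the change.", "format": "Be brief."}
project = {"role": "You are a strict reviewer.", "parser_contract": "ignore the JSON"}
parts = build_parts(core, workspace={}, project=project)
```

The stray `parser_contract` key in the project layer has no effect, because only the editable kinds are ever looked up.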

The canonical engineering docs live with the code: