ADR 0026 — the runtime plane: per-environment desired-state lives in atlas, in runtime.json, in place
- Status: accepted
- Date: 2026-06-14
- Refines: 0011 (elevates constellation/runtime.json from a bootstrap-only probe config into the per-(system, env) runtime desired-state registry, the authored sibling of the generated observed mirror state.json; the mirror's collect-never-invent discipline is unchanged)
- Builds on: 0011 §4 (the opt-in runtime probe, #523: /readyz + the node's self-asserted endpoint fact as ground truth)
- Relates: 0010 (atlas coordinates, it does not make the final admission decision; preserved here), 0016 (the substrate-native end-state this on-ramps to: env facts asserted by an arche-ops realizer), 0006/0008 (the channels/BOM version desired-state the runtime axis derives from, never copies)
There is a second plane, the runtime plane (environments: local, staging, prod), and its desired-state had no home. Endpoints, per-env root public keys, which channel an env tracks, host inventory, and pointers to where secrets live were tribal knowledge ("what's the stele URL?") scattered across operators' heads and per-repo .env.example files. This ADR gives that desired-state a home without inventing a new plane-shaped artifact: it grows the file atlas already has, constellation/runtime.json, from "one bootstrap endpoint per system" into a per-(system, env) desired-state registry, kept atlas-authored, atlas-gated, generated-and-rendered. Public facts inline, secrets as typed pointers, deployed versions derived from the channel an env tracks. The honest scope: day one this cures scattered-in-heads; it does not yet catch drift-from-the-live-node. That gate is a named follow-on, and the substrate-native end-state (0016) is a later promotion, not this step.
Context
A runtime plane is emerging beside the development plane. The dev plane is the meta-repo (atlas) + the member repos; its structure is topology/constellation.json and its version desired-state is the BOM + channels/. The runtime plane is the running system, segregated into environments (local, staging, later prod). Its observed state has a home — constellation/state.json is generated, probes live nodes (#523), and renders to the dashboard. Its desired/expected state did not: "stele staging is https://stele-staging.bitspark.xyz", "the staging substrate root pubkey is 39b090…", "staging tracks edge" lived in operators' memory and in deploy scaffolds (corpus/deploy/.env.corpus.example bakes the URL + root pubkey), so every newcomer asks the operator.
atlas is not greenfield here. constellation/runtime.json is already the one hand-authored file in constellation/ — today a single bootstrap node per system ({ node, env }) — and 0011 §4 already established the desired/observed split: runtime.json says where to look, the node's own self-asserted endpoint fact is ground truth, and state.json records the probe as source: probe | absent. The gap is only that the desired side is single-node and endpoint-only, and is treated as throwaway bootstrap rather than a governed desired-state registry.
Alternatives considered and rejected (a four-way design study with adversarial review):
- A separate runtime meta-repo (the dev/runtime plane split made literal): rejected for now. It manufactures the staleness it then polices: env docs would pin atlas by a ref that silently lags (atlasRef), adding a cross-repo freshness obligation the in-place path simply does not have; it stands up a second secret store and four hard couplings back to atlas (topology membership, BOM reads, the CLI, the fan-out), and forces the working site-deploy + probe (today in atlas's deploy.yml) to either move or split. It is a heavy answer to a question atlas has already half-answered in place. Promote to a repo when ownership demands it, not before.
- Decentralized per-repo deploy/env.<env>.json + a thin atlas index: rejected. The facts have no home because prod has no single-repo owner; scattering prod desired-state into each service repo, "owned by the team that deploys it," invents an owner that does not exist and demotes atlas to a read-only index of the highest-stakes facts. That is a governance regression versus the version axis, which atlas does own (channels). It also conflates two real mechanisms (substrate import reads local committed manifests; only topology sync harvests remotely) and cannot be both remote-harvest and deterministic-offline.
- Dogfood the substrate (env desired-state as stele facts in per-env spaces): the right end-state, deferred. It makes desired-vs-observed a same-store diff and puts public keys in their natural signed-public home, reusing every substrate principle. But it is back-loaded onto pieces that do not exist yet: env-scoped spaces, an env.* profile family, and crucially the asserter (0016 §6's arche-ops, "the one genuinely-greenfield layer") that would write desired facts as a byproduct of deploying. Until that lands, desired facts are hand-asserted (i.e. hand-rotting, only signed) and missing intent fails silent. It is the north star, not the first step.
Decision
1. Runtime desired-state extends constellation/runtime.json in place, keyed by (system, env). No new file, no new repo, no per-repo scatter. Each systems.<name> gains environments.<env> entries carrying:
   - inline public facts: endpoint (the node URL, now per env), rootPubkey, the channel the env tracks, optional host/provider/location;
   - secret pointers: every private counterpart (root seed, deploy key, tokens) as a typed { "secretRef": "<store>:<KEY>" } object, never a value;
   - nothing for versions: the deployed release/component versions are derived, not stored, by following env.channel → substrate/channels/<channel>.json → bom.json. The runtime registry records the channel subscription; the version is a pure function of it (the same pointers-not-copies discipline the BOM already enforces for builds).
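As an illustration of Decision 1, the sketch below shows one possible (system, env) entry and a version lookup that follows the channel instead of storing a version. Only the field names endpoint, rootPubkey, channel, and secretRef come from the text above. The rootSeed key, the gh: store prefix, the secret name, and the layouts assumed for the channel file and bom.json are hypothetical stand-ins, not the real schemas.

```javascript
// Hypothetical registry entry. The pubkey is a placeholder, not a real key.
const runtime = {
  systems: {
    stele: {
      environments: {
        staging: {
          endpoint: "https://stele-staging.bitspark.xyz",
          rootPubkey: "<staging-root-pubkey>",
          channel: "edge",
          // Secret pointer, never a value (key name and store are assumptions).
          rootSeed: { secretRef: "gh:STELE_STAGING_ROOT_SEED" },
        },
      },
    },
  },
};

// Stand-ins for substrate/channels/<channel>.json and bom.json (assumed layout).
const channels = { edge: { release: "r-example" } };
const bom = { releases: { "r-example": { stele: "0.0.0-example" } } };

// Derive the deployed version by following the channel subscription.
// No version is ever read from the runtime registry itself.
function deployedVersion(runtime, channels, bom, system, env) {
  const entry = runtime.systems[system]?.environments?.[env];
  if (!entry) throw new Error(`no desired state for (${system}, ${env})`);
  const channel = channels[entry.channel];
  if (!channel) throw new Error(`unknown channel: ${entry.channel}`);
  const release = bom.releases[channel.release];
  if (!release || !(system in release)) {
    throw new Error(`release ${channel.release} has no version for ${system}`);
  }
  return release[system];
}

console.log(deployedVersion(runtime, channels, bom, "stele", "staging"));
```

Because the version is computed from the channel on every read, moving the channel to a new release changes the answer without any edit to the runtime registry.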
2. The boundary: desired ≠ authoritative; atlas authors intent, not admission. This preserves 0010 (atlas coordinates and produces PRs; it does not make the final admission decision) and extends 0011: runtime.json is the authored desired sibling of the generated observed mirror state.json. The live node — its self-asserted endpoint/identity facts — remains ground truth. atlas declares what an environment should be; it does not become the system of record for what it is, and it does not adjudicate runtime admission. (The same posture the BOM catalog and channels already hold for the version plane: atlas records and audits intent; the producers and the record own reality.)
3. The day-one gates are stated honestly. What gates at PR time, on bare atlas doctor, offline and zero-dep:
   - runtime-schema: validates the new (system, env) shape;
   - runtime-grounding: every systems.<name> resolves to a constellation member and every channel to a real wiring.json channel;
   - runtime-fresh: the generated dashboard view byte-matches a re-render;
   - runtime-secret-inline: a secret-typed field containing any inline value is a hard error (a structural oneOf rule, public-string XOR secretRef-object, not an entropy heuristic).
These catch "a fact that references nothing real" and "a leaked secret." They do not catch "a fact that resolves fine but is simply wrong" (a stale-but-real endpoint, a wrong-but-real pubkey). That limit is acknowledged, not papered over.
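A minimal sketch of how two of these gates (runtime-grounding and runtime-secret-inline) might be expressed with no dependencies. The function name lintRuntime, the finding shape, and the rule that every field outside a fixed public set is secret-typed are assumptions made for illustration; the actual doctor implementation may classify fields differently.

```javascript
// Assumed set of public fields; anything else is treated as secret-typed here.
const PUBLIC_FIELDS = new Set([
  "endpoint", "rootPubkey", "channel", "host", "provider", "location",
]);

// A secret pointer is an object with exactly one key, secretRef: "<store>:<KEY>".
function isSecretRef(value) {
  return (
    typeof value === "object" && value !== null && !Array.isArray(value) &&
    Object.keys(value).length === 1 &&
    typeof value.secretRef === "string" &&
    /^[^:]+:.+$/.test(value.secretRef)
  );
}

// memberNames and channelNames are Sets built from the read-only inputs.
function lintRuntime(runtime, memberNames, channelNames) {
  const findings = [];
  for (const [system, def] of Object.entries(runtime.systems ?? {})) {
    if (!memberNames.has(system)) {
      findings.push({ rule: "runtime-grounding", path: `systems.${system}` });
    }
    for (const [env, facts] of Object.entries(def.environments ?? {})) {
      const base = `systems.${system}.environments.${env}`;
      if (typeof facts.channel === "string" && !channelNames.has(facts.channel)) {
        findings.push({ rule: "runtime-grounding", path: `${base}.channel` });
      }
      for (const [key, value] of Object.entries(facts)) {
        // Structural XOR: public fields are strings, secret fields are pointers.
        if (PUBLIC_FIELDS.has(key)) {
          if (typeof value !== "string") {
            findings.push({ rule: "runtime-schema", path: `${base}.${key}` });
          }
        } else if (!isSecretRef(value)) {
          findings.push({ rule: "runtime-secret-inline", path: `${base}.${key}` });
        }
      }
    }
  }
  return findings;
}

const leaky = {
  systems: {
    stele: {
      environments: {
        staging: { endpoint: "https://stele-staging.bitspark.xyz", channel: "edge", rootSeed: "inline-value" },
      },
    },
  },
};
console.log(lintRuntime(leaky, new Set(["stele"]), new Set(["edge"])));
```

The check is purely structural: it never inspects what the inline value looks like, which is what separates it from an entropy heuristic. It also shows the stated limit, since a wrong-but-real endpoint passes every rule here.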
4. The desired-vs-observed gate is a named follow-on, not part of this step. Today the runtime probe runs only into deploy.yml's uncommitted build; the committed state.json baseline always carries probe: { source: "absent" }, and no drift rules exist. So there is presently no gate comparing desired to live. Closing that — having the probe write a committed observed-snapshot a freshness check diffs against, so a stale desired pubkey/endpoint reds a real gate — is the next increment, recorded here so the dashboard panel is never mistaken for a gate it is not.
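For orientation only, the follow-on gate could reduce to a field-by-field comparison like the sketch below. It is not part of this step. The observed record shape ({ source, endpoint, rootPubkey }) and the function name diffDesiredObserved are assumptions; only source: "probe" | "absent" is taken from the existing mirror.

```javascript
// Compare authored desired facts against a committed observed snapshot.
// With source "absent" (today's committed baseline) nothing can be compared.
function diffDesiredObserved(desired, observed) {
  if (!observed || observed.source !== "probe") {
    return { status: "unobserved", drift: [] };
  }
  const drift = [];
  for (const field of ["endpoint", "rootPubkey"]) {
    if (desired[field] !== observed[field]) {
      drift.push({ field, desired: desired[field], observed: observed[field] });
    }
  }
  return { status: drift.length === 0 ? "match" : "drift", drift };
}

const desired = { endpoint: "https://stele-staging.bitspark.xyz", rootPubkey: "<desired-pubkey>" };

// Today's baseline: no probe data, so no verdict is possible.
console.log(diffDesiredObserved(desired, { source: "absent" }).status);

// A hypothetical probe snapshot whose pubkey differs from the authored one.
const observed = { source: "probe", endpoint: desired.endpoint, rootPubkey: "<observed-pubkey>" };
console.log(diffDesiredObserved(desired, observed));
```

Returning a distinct "unobserved" status, as opposed to treating a missing probe as a match, keeps an absent observation from being mistaken for agreement.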
5. Scope is staging-first, and no prod-governance authority is taken now. This ADR adds no "who may change prod" field. atlas has never held runtime authority (0010/0011); adding a governance/owners field that adjudicates prod changes would invert ownership for any prod operated by someone who is not an atlas maintainer (they would round-trip an atlas PR to change their own box). Staging — operated by us — is in scope; prod's facts may be recorded when it exists, but its governance is deferred.
6. The promotion path is explicit. When genuine ownership pressure appears, whether from a distinct prod operator (who must not have atlas write access) or from a deploy cadence that fights atlas's own, promote the desired-state to one of two homes: env facts in per-env stele spaces (the 0016 dogfood end-state, with arche-ops as the asserter) or a separate runtime repo. The in-place runtime.json is the on-ramp: the same (system, env) data model, relocated into fact-space (or its own repo) when ownership demands it. Promote when ownership demands it, not before.
Consequences
- The change is small and pattern-replicates existing machinery. Re-key runtime.json + its schema by (system, env); add the secretRef oneOf; add the four doctor findings (mirroring constellationLint); render a per-env table on the dashboard/site beside the existing probe column. No new dependency (node: built-ins only). No topology/bom/channels/wiring change; they are read-only inputs. No member fan-out, no fleet wave: a read-mostly atlas-local change, the low-risk class 0011 describes.
- runtime.json keeps its bootstrap role and gains a desired-state role. The probe still treats the live node as truth (bootstrap-only semantics preserved); the file additionally becomes the authored answer to "what should this env be." The migration consolidates the facts currently scattered in provision.sh/deploy.yml/.env.example into it (a one-time move), and the single-endpoint duplication with member .env.example files can later be single-sourced by emitting from this file.
- The version axis cannot drift from the channels/BOM, because it is derived by following channel, never typed. The runtime axis (endpoints, identity, host) is validated against what the registry declares, which is honest about being "matches the author," not yet "matches the node" (see Decision 4).
- Honest residue. Genuinely un-derivable, pre-live facts (a not-yet-deployed env's expected root pubkey, host type/location, and the secret pointers) are authored and can be wrong with nothing to contradict them until the node exists. That surface is small, named, owned (PR-reviewed), and superseded by the probe the moment the node is live (once the Decision-4 gate lands). The runtime-secret-exists check (does the named GH secret still exist) is online/opt-in and so cannot run in atlas's tokenless bare CI. This is a stated limitation, since secret rename/rotation is a likely real drift.
- Lineage. Refines 0011 (adds the desired-runtime axis beside the observed mirror); preserves 0010 (no admission authority); names 0016/dogfood as the end-state it on-ramps to. Mints nothing, moves no layer, takes no prod authority.
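The re-keying named in the first consequence could look roughly like the sketch below. It assumes today's file is shaped as systems.<name> = { node, env } under a top-level systems key, and it keeps the existing bootstrap fields next to the new environments map. Both the input layout and the function name rekeyRuntime are assumptions; the real migration may differ.

```javascript
// One-time migration sketch: lift each system's single bootstrap node
// into an environments.<env> entry, without mutating the input.
function rekeyRuntime(oldRuntime) {
  const systems = {};
  for (const [name, entry] of Object.entries(oldRuntime.systems)) {
    const { node, env } = entry;
    systems[name] = {
      ...entry, // keep the bootstrap fields the probe reads today (assumed)
      environments: { [env]: { endpoint: node } },
    };
  }
  return { ...oldRuntime, systems };
}

const before = {
  systems: {
    stele: { node: "https://stele-staging.bitspark.xyz", env: "staging" },
  },
};
console.log(JSON.stringify(rekeyRuntime(before), null, 2));
```

Facts that exist only in scaffolds today, such as the root pubkey and channel, would still need to be authored by hand into each new entry after this step.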