Propagates the 10-reviewer peer list across agent frontmatter, Position/Critical prose, shared-patterns, skill dispatchers, gate validators, and docs — resolving drift left behind when multi-tenant-reviewer and lib-commons-reviewer were added to the pool. Also fixes broken shared-pattern paths in lib-commons-reviewer and adds substantive blocker criteria to multi-tenant-reviewer plus codebase-context severity heuristic (Lerian third-rail vs external recommendation) to lib-commons-reviewer. X-Lerian-Ref: 0x1
121 KiB
| name | description | trigger | skip_when | NOT_skip_when | prerequisites | sequence | input_schema | output_schema | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ring:dev-systemplane-migration | Gate-based systemplane migration orchestrator. Migrates Lerian Go services from .env/YAML-based configuration to the systemplane — a database-backed, hot-reloadable runtime configuration and settings management plane with full audit history, optimistic concurrency, change feeds, component-granular bundle rebuilds, and atomic infrastructure replacement. Requires lib-commons v4.3.0+. | - User requests systemplane migration or adoption - Task mentions runtime configuration, hot-reload, or config management - Service needs database-backed configuration with audit trail - BundleFactory or Reconciler development is required | - Project is not a Go service - Service does not use lib-commons v4 - Task is unrelated to configuration management or systemplane | - Service already has systemplane code (verify compliance, do not skip) - "It looks like systemplane is already set up" (existence ≠ compliance) | - Go project - lib-commons/v4 dependency (v4.3.0+ required; upgrade first if older) - PostgreSQL or MongoDB backend available |
|
|
|
Systemplane Migration Orchestrator
<cannot_skip>
CRITICAL: This Skill ORCHESTRATES. Agents IMPLEMENT.
| Who | Responsibility |
|---|---|
| This Skill | Detect stack, determine gates, pass context to agent, verify outputs, enforce order |
| ring:backend-engineer-golang | Implement systemplane code following the patterns in this document |
| ring:codebase-explorer | Analyze the codebase for configuration patterns (Gate 1) |
| ring:visualize | Generate implementation preview HTML (Gate 1.5) |
| 10 reviewers | Review at Gate 8 |
CANNOT change scope: the skill defines WHAT to implement. The agent implements HOW.
FORBIDDEN: Orchestrator MUST NOT use Edit, Write, or Bash tools to modify source code files.
All code changes MUST go through Task(subagent_type="ring:backend-engineer-golang").
The orchestrator only verifies outputs (grep, go build, go test) — MUST NOT write implementation code.
</cannot_skip>
Architecture Overview
The systemplane replaces traditional env-var-only / YAML-based configuration with a three-tier architecture:
┌─ TIER 1: Bootstrap-Only ──────────────────────────────────────────┐
│ Env vars read ONCE at startup. Immutable for process lifetime. │
│ Examples: SERVER_ADDRESS, AUTH_ENABLED, OTEL_* (telemetry) │
│ Stored in: BootstrapOnlyConfig struct (frozen at init) │
└───────────────────────────────────────────────────────────────────┘
┌─ TIER 2: Runtime-Managed (Hot-Reload) ────────────────────────────┐
│ Stored in database (PostgreSQL or MongoDB). │
│ Changed via PATCH /v1/system/configs or /v1/system/settings. │
│ Propagation: ChangeFeed → Supervisor → Snapshot → Bundle/Reconcile│
│ Examples: rate limits, worker intervals, DB pool sizes, CORS │
└───────────────────────────────────────────────────────────────────┘
┌─ TIER 3: Live-Read (Zero-Cost Per-Request) ───────────────────────┐
│ Read directly from Supervisor.Snapshot() on every request. │
│ No rebuild, no reconciler, no locking. │
│ Examples: rate_limit.max, health_check_timeout_sec │
└───────────────────────────────────────────────────────────────────┘
Data Flow: Startup → Hot-Reload → Shutdown
STARTUP:
ENV VARS → defaultConfig() → loadConfigFromEnv() → *Config → ConfigManager
InitSystemplane():
1. ExtractBootstrapOnlyConfig(cfg) → BootstrapOnlyConfig (immutable)
2. LoadSystemplaneBackendConfig(cfg) → BootstrapConfig (PG/Mongo DSN)
3. builtin.NewBackendFromConfig(ctx, cfg) → Store + History + ChangeFeed
4. registry.New() + Register{Service}Keys() → Registry (100+ key definitions)
5. configureBackendWithRegistry() → pass secret keys + apply behaviors
6. NewSnapshotBuilder(registry, store) → SnapshotBuilder
7. New{Service}BundleFactory(bootstrapCfg) → BundleFactory (full + incremental)
8. seedStoreForInitialReload() → Env overrides → Store
9. buildReconcilers() → [HTTP, Publisher, Worker] (phased)
10. NewSupervisor → Reload("initial-bootstrap") → First bundle
11. NewManager(ManagerConfig{ → HTTP API handler
ConfigWriteValidator: productionGuards,
StateSync: configManagerSync,
})
StartChangeFeed() → DebouncedFeed(200ms) → goroutine: subscribe(store changes)
MountSystemplaneAPI() → 9 endpoints on /v1/system/*
HOT-RELOAD (on API PATCH or ChangeFeed signal):
Signal → Supervisor.Reload(ctx, reason) →
1. SnapshotBuilder.BuildFull() → new Snapshot (defaults + store overrides)
2. BundleFactory.BuildIncremental(snap, prev, prevSnap) →
a) Diff changed keys via keyComponentMap (postgres/redis/rabbitmq/s3/http/logger)
b) Rebuild ONLY changed components
c) Reuse unchanged components from previous bundle (pointer transfer)
d) Falls back to full Build() when all components are affected
3. Reconcilers run IN PHASE ORDER:
a) PhaseStateSync: (reserved — no current reconcilers)
b) PhaseValidation: HTTPPolicy, Publisher
c) PhaseSideEffect: Worker → WorkerManager.ApplyConfig()
4. On success: atomic swap — snapshot.Store() + bundle.Store()
5. AdoptResourcesFrom(previous) → nil-out transferred pointers
6. Observer callback → bundleState.Update() + ConfigManager.UpdateFromSystemplane()
+ SwappableLogger.Swap() + SwapRuntimePublishers()
7. previous.Close() → only tears down REPLACED components
SHUTDOWN:
1. ConfigManager.Stop() (prevent mutations)
2. cancelChangeFeed() (stop reload triggers)
3. Supervisor.Stop() (stop supervisory loop + close bundle)
4. Backend.Close() (close store connection)
5. WorkerManager.Stop() (stop all workers)
The Three Configuration Authorities
| Phase | Authority | Scope | Mutability |
|---|---|---|---|
| Bootstrap | Env vars → defaultConfig() + loadConfigFromEnv() |
Server address, TLS, auth, telemetry | Immutable after startup |
| Runtime | Systemplane Store + Supervisor | Rate limits, workers, timeouts, DB pools, CORS | Hot-reloadable via API |
| Legacy bridge | ConfigManager.Get() |
Backward-compat for existing code | Updated by StateSync callback + observer |
Single source of truth: {service}KeyDefs() is THE canonical source of all
default values. The defaultConfig() function derives its values from KeyDefs
via defaultSnapshotFromKeyDefs() → configFromSnapshot(). No manual sync
required between defaults, key definitions, or struct tags.
Component-granular awareness: Every key's Component field (e.g., "postgres",
"redis", "rabbitmq", "s3", "http", "logger", or ComponentNone) enables
the IncrementalBundleFactory to rebuild only the affected infrastructure component
when that key changes, instead of tearing down and rebuilding everything.
Canonical Import Paths
The systemplane is a shared library in lib-commons (since v4.3.0). All services import it via:
import (
"github.com/LerianStudio/lib-commons/v4/commons/systemplane/domain"
"github.com/LerianStudio/lib-commons/v4/commons/systemplane/ports"
"github.com/LerianStudio/lib-commons/v4/commons/systemplane/registry"
"github.com/LerianStudio/lib-commons/v4/commons/systemplane/service"
"github.com/LerianStudio/lib-commons/v4/commons/systemplane/bootstrap"
"github.com/LerianStudio/lib-commons/v4/commons/systemplane/bootstrap/builtin"
"github.com/LerianStudio/lib-commons/v4/commons/systemplane/adapters/changefeed"
fiberhttp "github.com/LerianStudio/lib-commons/v4/commons/systemplane/adapters/http/fiber"
// Swagger spec (for consuming app integration)
systemswagger "github.com/LerianStudio/lib-commons/v4/commons/systemplane/swagger"
)
Requires lib-commons v4.3.0+ in your go.mod.
⛔ HARD GATE: Agent must not use v2 or v3 import paths or invent sub-package paths. If any tool truncates output, this table is the authoritative reference.
Severity Calibration
| Severity | Criteria | Examples |
|---|---|---|
| CRITICAL | Data loss, runtime panic, silent config corruption | Bundle not closed on shutdown (resource leak), snapshot reads nil (panic), ChangeFeed not started (hot-reload broken) |
| HIGH | Missing integration, broken hot-reload path | Keys registered but no BundleFactory, no HTTP Mount, no StateSync callback |
| MEDIUM | Incomplete but functional | Missing validators on some keys, incomplete secret redaction, no swagger merge |
| LOW | Polish and optimization | Missing config-map.example, incomplete descriptions, suboptimal component grouping |
MUST report all severities. CRITICAL: STOP immediately. HIGH: Fix before gate pass. MEDIUM: Fix in iteration. LOW: Document.
Pressure Resistance
| User Says | This Is | Response |
|---|---|---|
| "Keys are registered, migration is done" | SCOPE_REDUCTION | "Key registration is Gate 2 of 10. Without BundleFactory + Wiring + HTTP Mount, keys are metadata — nothing reads them at runtime. MUST complete ALL gates." |
| "We can wire it later" | SCOPE_REDUCTION | "Gate 7 is where the system comes alive. Without it, systemplane is dead code. Gate 7 is NEVER skippable." |
| "Swagger integration is cosmetic" | QUALITY_BYPASS | "Operators need API discoverability. Without swagger.MergeInto(), 9 endpoints are invisible in Swagger UI. MUST implement." |
| "Config bridge is backward compat only" | SCOPE_REDUCTION | "Existing code reads Config struct. Without StateSync, it reads stale bootstrap values forever. MUST implement Gate 6." |
| "Reconcilers are optional" | SCOPE_REDUCTION | "Without reconcilers, workers and HTTP policies ignore config changes. Hot-reload is partial." |
| "The service already has systemplane" | COMPLIANCE_BYPASS | "Existence ≠ compliance. MUST run compliance audit. If it doesn't match canonical patterns exactly, it is non-compliant." |
| "Skip code review, we tested it" | QUALITY_BYPASS | "MANDATORY: 10 reviewers. One wiring mistake = silent config corruption or resource leak." |
| "Agent says out of scope" | AUTHORITY_OVERRIDE | "Skill defines scope, not agent. Re-dispatch with gate context." |
| "ChangeFeed can be added later" | SCOPE_REDUCTION | "Without ChangeFeed, hot-reload is broken. Config changes in DB are invisible to the service. MUST start DebouncedFeed." |
| "Active bundle state is internal detail" | SCOPE_REDUCTION | "Without thread-safe accessor, request handlers cannot read live config. Race conditions. MUST implement active_bundle_state.go." |
Gate Overview
| Gate | Name | Condition | Agent |
|---|---|---|---|
| 0 | Stack Detection + Prerequisite Audit | Always | Orchestrator |
| 1 | Codebase Analysis (Config Focus) | Always | ring:codebase-explorer |
| 1.5 | Implementation Preview | Always | Orchestrator (ring:visualize) |
| 2 | Key Definitions + Registry | Always | ring:backend-engineer-golang |
| 3 | Bundle + BundleFactory | Always | ring:backend-engineer-golang |
| 4 | Reconcilers | Conditional (skip if no workers AND no RMQ AND no HTTP policy changes) | ring:backend-engineer-golang |
| 5 | Identity + Authorization | Always | ring:backend-engineer-golang |
| 6 | Config Manager Bridge | Always | ring:backend-engineer-golang |
| 7 | Wiring + HTTP Mount + Swagger + ChangeFeed | Always ⛔ NEVER SKIPPABLE | ring:backend-engineer-golang |
| 8 | Code Review | Always | 10 parallel reviewers |
| 9 | User Validation | Always | User |
| 10 | Activation Guide | Always | Orchestrator |
MUST execute gates sequentially. CANNOT skip or reorder.
Gate Execution Rules
⛔ HARD GATES — CANNOT be overridden:
- All gates must execute in order (0→1→1.5→2→3→4→5→6→7→8→9→10)
- Gates execute with explicit dependencies — no gate can start until its predecessor completes
- Existence ≠ compliance: existing systemplane code triggers compliance audit, NOT a skip
- A gate can only be marked SKIP when ALL its compliance checks pass with evidence
- Gate 7 (Wiring) is NEVER skippable — it is the most critical gate
<cannot_skip>
HARD GATE: Existence ≠ Compliance
"The service already has systemplane code" is NOT a reason to skip any gate.
MUST replace existing systemplane code that does not follow the canonical patterns — it is non-compliant. The only valid reason to skip a gate is when the existing implementation has been verified to match the exact patterns defined in this skill document.
Compliance verification requires EVIDENCE, not assumption. The Gate 0 Phase 2 compliance audit (S1-S8 grep checks) verifies each component against canonical patterns.
If ANY audit check is NON-COMPLIANT → the corresponding gate MUST execute to fix it. CANNOT skip.
</cannot_skip>
Gate 0: Stack Detection + Prerequisite Audit
Orchestrator executes directly. No agent dispatch.
This gate has THREE phases: detection, compliance audit, and non-canonical pattern detection.
Phase 1: Stack Detection
DETECT (run in parallel):
1. lib-commons version:
grep "lib-commons" go.mod
(require v4.3.0+ — if older, MUST upgrade before proceeding)
2. PostgreSQL:
grep -rn "postgresql\|pgx\|squirrel" internal/ go.mod
3. MongoDB:
grep -rn "mongodb\|mongo" internal/ go.mod
4. Redis:
grep -rn "redis\|valkey" internal/ go.mod
5. RabbitMQ:
grep -rn "rabbitmq\|amqp" internal/ go.mod
6. S3/Object Storage:
grep -rn "s3\|ObjectStorage\|PutObject\|GetObject\|Upload.*storage\|Download.*storage" internal/ pkg/ go.mod
7. Background workers:
grep -rn "ticker\|time.NewTicker\|cron\|worker\|scheduler" internal/ --include="*.go" | grep -v _test.go
8. Existing systemplane code:
- Imports: grep -rn "systemplane" internal/ go.mod
- BundleFactory: grep -rn "BundleFactory\|IncrementalBundleFactory" internal/ --include="*.go"
- Supervisor: grep -rn "service.NewSupervisor\|Supervisor" internal/ --include="*.go" | grep systemplane
- Keys: grep -rn "Register.*Keys\|KeyDefs()" internal/ --include="*.go"
- HTTP Mount: grep -rn "handler.Mount\|fiberhttp.NewHandler" internal/ --include="*.go"
9. Current config pattern:
- Struct: grep -rn "type Config struct" internal/ --include="*.go"
- Env tags: grep -rn "envDefault:" internal/ --include="*.go" | sort
- Env reads: grep -rn "os.Getenv\|viper\.\|envconfig" internal/ --include="*.go" | grep -v _test.go
- YAML files: find . -name '.env*' -o -name '*.yaml' -o -name '*.yml' | grep -i config
Output format for stack detection:
STACK DETECTION RESULTS:
| Component | Detected | Evidence |
|-----------|----------|----------|
| lib-commons v4.3.0+ | YES/NO | {go.mod version} |
| PostgreSQL | YES/NO | {file:line matches} |
| MongoDB | YES/NO | {file:line matches} |
| Redis | YES/NO | {file:line matches} |
| RabbitMQ | YES/NO | {file:line matches} |
| S3/Object Storage | YES/NO | {file:line matches} |
| Background workers | YES/NO | {file:line matches} |
| Existing systemplane | YES/NO | {file:line matches} |
| Config pattern | envDefault/viper/os.Getenv | {file:line matches} |
Phase 2: Compliance Audit (MANDATORY if any existing systemplane code detected)
If Phase 1 step 8 detects any existing systemplane code, MUST run a compliance audit. MUST replace existing code that does not match canonical patterns — it is not "partially done", it is wrong.
AUDIT (run in parallel — only if step 8 found existing systemplane code):
S1. Key Registry compliance:
grep -rn "Register.*Keys\|KeyDefs()" internal/ --include="*.go"
(no match = NON-COMPLIANT → Gate 2 MUST fix)
S2. BundleFactory compliance:
grep -rn "IncrementalBundleFactory\|BundleFactory" internal/ --include="*.go"
(no IncrementalBundleFactory = NON-COMPLIANT → Gate 3 MUST fix)
S3. AdoptResources compliance:
grep -rn "AdoptResourcesFrom" internal/ --include="*.go"
(no match but BundleFactory exists = NON-COMPLIANT → Gate 3 MUST fix)
S4. Reconciler compliance:
grep -rn "BundleReconciler\|Reconcile.*context" internal/ --include="*.go" | grep -v _test.go
(expected reconcilers missing = NON-COMPLIANT → Gate 4 MUST fix)
S5. Identity/Authorization compliance:
grep -rn "IdentityResolver\|Authorizer" internal/ --include="*.go"
(no match = NON-COMPLIANT → Gate 5 MUST fix)
S6. Config Bridge compliance:
grep -rn "StateSync\|config_manager_systemplane\|configFromSnapshot" internal/ --include="*.go"
(no StateSync callback or config hydration = NON-COMPLIANT → Gate 6 MUST fix)
S7. HTTP Mount compliance:
grep -rn "handler.Mount\|fiberhttp.NewHandler" internal/ --include="*.go"
(no match = NON-COMPLIANT → Gate 7 MUST fix)
Also check all 9 routes are accessible:
grep -rn "/v1/system" internal/ --include="*.go"
S8. Swagger compliance:
grep -rn "swagger.MergeInto\|systemswagger" internal/ --include="*.go"
(no match = NON-COMPLIANT → Gate 7 MUST fix)
S9. File structure compliance:
ls internal/bootstrap/systemplane_mount.go
ls internal/bootstrap/systemplane_authorizer.go
ls internal/bootstrap/systemplane_identity.go
ls internal/bootstrap/systemplane_factory.go
ls internal/bootstrap/active_bundle_state.go
(Each systemplane concern MUST have a dedicated file. Monolithic systemplane.go
files that combine mount + authorizer + identity = NON-COMPLIANT)
Output format for compliance audit:
COMPLIANCE AUDIT RESULTS:
| Check | Component | Status | Evidence | Gate Action |
|-------|-----------|--------|----------|-------------|
| S1 | Key Registry | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 2: SKIP / MUST FIX |
| S2 | BundleFactory | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 3: SKIP / MUST FIX |
| S3 | AdoptResources | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 3: SKIP / MUST FIX |
| S4 | Reconcilers | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 4: SKIP / MUST FIX |
| S5 | Identity/Auth | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 5: SKIP / MUST FIX |
| S6 | Config Bridge | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 6: SKIP / MUST FIX |
| S7 | HTTP Mount | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 7: SKIP / MUST FIX |
| S8 | Swagger | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 7: SKIP / MUST FIX |
| S9 | File Structure | COMPLIANT / NON-COMPLIANT | {ls results} | Gate 7: SKIP / MUST FIX |
⛔ HARD GATE: A gate can only be marked as SKIP when ALL its compliance checks are COMPLIANT with evidence. One NON-COMPLIANT row → gate MUST execute.
Phase 3: Non-Canonical Pattern Detection (MANDATORY)
MUST scan for custom config management outside the systemplane canonical patterns:
DETECT non-canonical config patterns:
N1. Custom config hot-reload:
grep -rln "hot.reload\|config.watch\|config.reload\|ConfigReload" internal/config/ pkg/config/ internal/bootstrap/ --include="*.go" 2>/dev/null
(any match = NON-CANONICAL → MUST be removed and replaced with systemplane ChangeFeed)
N2. Custom file watchers:
grep -rn "fsnotify\|viper.WatchConfig\|inotify\|file.watch" internal/ pkg/ --include="*.go"
(any match = NON-CANONICAL → MUST be removed; systemplane uses database change feed)
N3. Custom change notification channels:
grep -rn "chan.*config\|config.*chan\|ConfigChange\|configChanged" internal/ pkg/ --include="*.go" | grep -v systemplane
(any match outside systemplane = NON-CANONICAL → MUST be replaced with systemplane ChangeFeed)
If non-canonical files are found: report them in the compliance audit as NON-CANONICAL FILES DETECTED. The implementing agent MUST remove these files and replace their functionality with systemplane during the appropriate gate.
Store detection results:
{
"skill": "ring:dev-systemplane-migration",
"gate": "0",
"detection": {
"lib_commons_version": "v4.3.2",
"backend": "postgres",
"dependencies": ["postgres", "redis", "rabbitmq"],
"has_workers": true,
"existing_systemplane": false,
"config_pattern": "envDefault",
"config_struct_location": "internal/bootstrap/config.go:15"
},
"compliance": {
"status": "NEW",
"checks": {}
},
"non_canonical": []
}
MUST confirm: user explicitly approves detection results before proceeding.
Gate 1: Codebase Analysis (Config Focus)
Always executes. This gate builds the configuration inventory for all subsequent gates.
Dispatch ring:codebase-explorer with systemplane-focused context:
TASK: Analyze this codebase exclusively under the systemplane migration perspective. DETECTED STACK: {databases and infrastructure from Gate 0}
FOCUS AREAS (explore ONLY these — ignore everything else):
Config struct location and fields: Find the main Config struct, all env tags (
envDefault:), default values. List every field with its current type, default, and env var name. Include nested structs (e.g.,Config.Postgres.Host).Environment variable reads: All
os.Getenv,envconfig,viperusage. Include those outside the Config struct (ad-hoc reads in business logic).Infrastructure client creation: How postgres, redis, rabbitmq, mongo, S3 clients are created. File:line for each constructor call. Connection strings, pool sizes, timeouts used. These become Bundle components.
Background workers: Ticker-based, cron-based, goroutine workers that need reconciliation. File:line for each worker start. Which config fields control their behavior (intervals, enable/disable, batch sizes).
HTTP server configuration: Listen address, TLS, CORS, rate limits, timeouts, body limits. Where these are read and applied. These become HTTP policy keys.
Authentication/authorization: JWT parsing, middleware, permission model. Where user ID and tenant ID are extracted from context. This drives Identity + Authorizer implementation.
Existing config reload patterns: Any hot-reload, file watching, config refresh mechanisms. Viper watchers, fsnotify, custom channels. These will be replaced by systemplane ChangeFeed.
Swagger/OpenAPI setup: How swagger is generated (
swag initcommand?), where spec is served (middleware?), what tool (swaggo/swag? go-swagger?). This drives the swagger.MergeInto() integration in Gate 7.OUTPUT FORMAT: Structured report with file:line references for every point above. DO NOT write code. Analysis only.
From the analysis, produce the Config Inventory Table:
| Env Var Name | Config Field | Current Default | Go Type | Proposed Key | Proposed Tier | ApplyBehavior | Component | Secret | Validator | Group |
|---|---|---|---|---|---|---|---|---|---|---|
POSTGRES_HOST |
Config.Postgres.Host |
localhost |
string |
postgres.primary_host |
Runtime | BundleRebuild | postgres |
No | — | postgres |
RATE_LIMIT_MAX |
Config.RateLimit.Max |
100 |
int |
rate_limit.max |
Live-Read | LiveRead | _none |
No | validatePositiveInt | rate_limit |
SERVER_ADDRESS |
Config.Server.Address |
:8080 |
string |
server.address |
Bootstrap | BootstrapOnly | _none |
No | — | server |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
This inventory becomes the CONTEXT for all subsequent gates. Gate 2 uses it to create key definitions. Gate 3 uses it to build the Bundle. Gate 6 uses it for snapshot→Config hydration.
<block_condition> HARD GATE: MUST complete the analysis report and config inventory before proceeding. All subsequent gates depend on this inventory. </block_condition>
Gate 1.5: Implementation Preview (Visual Report)
Always executes. This gate generates a visual HTML report showing exactly what will change before any code is written.
Uses the ring:visualize skill to produce a self-contained HTML page.
The report is built from Gate 0 (stack detection) and Gate 1 (codebase analysis). It shows the developer a complete preview of every change that will be made across all subsequent gates.
Orchestrator generates the report using ring:visualize with this content:
The HTML page MUST include these sections:
1. Current Architecture (Before)
- Mermaid diagram showing current config flow: env vars → Config struct → static injection into services
- Table of all files that will be created or modified, with purpose
- How infrastructure clients get their config today (static fields, constructor args)
2. Target Architecture (After)
- Mermaid diagram showing the three-tier architecture:
Bootstrap env → Supervisor → Snapshot → BundleFactory → Bundle → Live-Read (per-request) → StateSync → Config struct (backward compat) - Component map showing which infrastructure is managed by systemplane
- Change feed flow: DB change → pg_notify/change_stream → DebouncedFeed → Supervisor.Reload
3. Key Inventory Table
ALL keys with: name, tier (bootstrap/runtime/live-read), scope, ApplyBehavior, component, secret, validator, group. One row per key. This is the FULL inventory from Gate 1.
4. File Creation Plan
Every file to be created with gate assignment:
| Gate | File | Purpose | Lines (est.) |
|---|---|---|---|
| 2 | systemplane_keys.go |
Key orchestrator: Register + KeyDefs | ~50 |
| 2 | systemplane_keys_{group}.go |
Per-group key definitions | ~30 each |
| 2 | systemplane_keys_validation.go |
Validator functions | ~60 |
| 2 | systemplane_keys_helpers.go |
concatKeyDefs utility | ~15 |
| 3 | systemplane_bundle.go |
Bundle struct + Close + AdoptResources | ~80 |
| 3 | systemplane_factory.go |
IncrementalBundleFactory | ~120 |
| 3 | systemplane_factory_infra.go |
Per-component builders | ~100 |
| 4 | systemplane_reconciler_*.go |
Reconcilers (if applicable) | ~50 each |
| 5 | systemplane_identity.go |
JWT → Actor bridge | ~30 |
| 5 | systemplane_authorizer.go |
Permission mapping | ~50 |
| 6 | config_manager_systemplane.go |
Snapshot → Config hydration | ~150 |
| 6 | config_manager_seed.go |
Env → Store seed | ~80 |
| 6 | config_manager_helpers.go |
Type-safe comparison | ~40 |
| 6 | config_validation.go |
Production config guards | ~60 |
| 7 | systemplane_init.go |
11-step init sequence | ~120 |
| 7 | systemplane_mount.go |
HTTP route registration + swagger merge | ~40 |
| 7 | active_bundle_state.go |
Thread-safe live-read accessor | ~40 |
5. ApplyBehavior Distribution
Visual breakdown showing how many keys per tier:
| ApplyBehavior | Count | Percentage | Example Keys |
|---|---|---|---|
| BootstrapOnly | N | N% | server.address, auth.enabled |
| LiveRead | N | N% | rate_limit.max, timeout_sec |
| WorkerReconcile | N | N% | worker.interval_sec |
| BundleRebuild | N | N% | postgres.host, redis.host |
| BundleRebuildAndReconcile | N | N% | worker.enabled |
6. Component Map
| Component | Infrastructure | Keys | Rebuild Trigger |
|---|---|---|---|
postgres |
PostgreSQL client | N | Connection string, pool size, credentials |
redis |
Redis client | N | Host, port, password |
rabbitmq |
RabbitMQ connection | N | URI, host, credentials |
s3 |
Object storage | N | Endpoint, bucket, credentials |
http |
HTTP policy | N | Body limit, CORS |
logger |
Logger | N | Log level |
_none |
No infrastructure | N | Business logic only (live-read) |
7. Swagger Integration
Shows how systemplane routes will appear in the app's Swagger UI:
import systemswagger "github.com/LerianStudio/lib-commons/v4/commons/systemplane/swagger"
// After swag init generates the base spec:
merged, err := systemswagger.MergeInto(baseSwaggerSpec)
// Use merged spec for swagger UI handler
All 9 /v1/system/* routes will be visible in the Swagger UI after merge.
8. Risk Assessment
| Risk | Mitigation | Verification |
|---|---|---|
| Config struct regression | StateSync hydrates Config from Snapshot on every change | Existing tests pass with systemplane active |
| Startup failure | Graceful degradation — service starts without systemplane if backend unavailable | Manual test with backend down |
| Resource leaks on hot-reload | AdoptResourcesFrom + ownership tracking in Bundle | Bundle factory tests verify resource reuse |
| Secret exposure | RedactPolicy on secret keys, AES-256-GCM encryption in store | Security review at Gate 8 |
9. Retro Compatibility Guarantee
When systemplane store is unavailable, the service degrades to bootstrap defaults:
- Config values from env vars still work (via
defaultConfig()derived from KeyDefs) - No runtime mutation API available (9 endpoints return 503)
- No hot-reload capability (static config for process lifetime)
- Workers run with static config from env vars
- The service never fails to start due to systemplane issues
Output: Save the HTML report to docs/systemplane-preview.html in the project root.
Open in browser for the developer to review.
<block_condition> HARD GATE: Developer MUST explicitly approve the implementation preview before any code changes begin (Gates 2+). This prevents wasted effort on incorrect key classification or architectural decisions. </block_condition>
If the developer requests changes to the preview, regenerate the report and re-confirm.
Gate 2: Key Definitions + Registry
Always executes. This gate creates the key definition files that form the backbone of systemplane.
Dispatch ring:backend-engineer-golang with context from Gate 1 config inventory:
TASK: Create systemplane key definition files based on the approved config inventory.
CONTEXT FROM GATE 1: {Full config inventory table from analysis report} APPROVED KEY COUNT: {N keys from Gate 1.5 preview}
IMPORT PATHS (use exactly these):
"github.com/LerianStudio/lib-commons/v4/commons/systemplane/domain" "github.com/LerianStudio/lib-commons/v4/commons/systemplane/registry"FILES TO CREATE:
systemplane_keys.go— Orchestrator:func Register{Service}Keys(reg registry.Registry) error { for _, def := range {service}KeyDefs() { if err := reg.Register(def); err != nil { return fmt.Errorf("register key %q: %w", def.Key, err) } } return nil } func {service}KeyDefs() []domain.KeyDef { return concatKeyDefs( {service}KeyDefsAppServer(), {service}KeyDefsPostgres(), // ... one per group from inventory ) }
systemplane_keys_{group}.go— Per-group key definitions. Split by group names from inventory (app, server, postgres, redis, rabbitmq, auth, swagger, telemetry, rate_limit, worker, etc.)
systemplane_keys_validation.go— Validator functions:
validatePositiveInt(value any) errorvalidateLogLevel(value any) errorvalidateSSLMode(value any) errorvalidateAbsoluteHTTPURL(value any) errorvalidatePort(value any) error- Any service-specific validators
systemplane_keys_helpers.go—concatKeyDefs()utilityKEY DEFINITION RULES:
Every key MUST have ALL of these fields:
Key— dotted name (e.g.,"postgres.primary_host")Kind—domain.KindConfigordomain.KindSettingAllowedScopes—[]domain.Scope{domain.ScopeGlobal}or bothDefaultValue— THE single source of truth for defaultsValueType—domain.ValueString,domain.ValueInt,domain.ValueBool,domain.ValueFloatApplyBehavior— classify using the 5-level taxonomy (see Appendix F)MutableAtRuntime—falsefor bootstrap-only,truefor everything elseSecret—trueif contains credentials, tokens, keysRedactPolicy—domain.RedactFullfor secrets,domain.RedactNoneotherwiseDescription— human-readable descriptionGroup— logical grouping matching file suffixComponent— infrastructure component:"postgres","redis","rabbitmq","s3","http","logger", ordomain.ComponentNone("_none")Validator— custom validation function (nil if none needed)DefaultValue is the SINGLE SOURCE OF TRUTH. No separate config defaults. The
defaultConfig()function will derive from KeyDefs viaconfigFromSnapshot(defaultSnapshotFromKeyDefs(...)).
Apply Behavior classification — Include the 5-level taxonomy:
| ApplyBehavior | Constant | Strength | Runtime Effect | Use When |
|---|---|---|---|---|
| bootstrap-only | domain.ApplyBootstrapOnly |
4 | Immutable after startup | Server address, TLS, auth, telemetry |
| bundle-rebuild+worker-reconcile | domain.ApplyBundleRebuildAndReconcile |
3 | Bundle swap AND worker restart | Worker enable/disable (needs new connections + restart) |
| bundle-rebuild | domain.ApplyBundleRebuild |
2 | Rebuild infra clients only | Connection strings, pool sizes, credentials |
| worker-reconcile | domain.ApplyWorkerReconcile |
1 | Restart workers only | Worker intervals, scheduler periods |
| live-read | domain.ApplyLiveRead |
0 | Zero-cost per-request reads | Rate limits, timeouts, TTLs, feature flags |
Completion criteria: All keys registered, go build ./... passes, key count matches inventory from Gate 1.5.
Catalog validation (MANDATORY): Products MUST add a test that validates their KeyDefs against the lib-commons canonical catalog:
func TestKeyDefs_MatchCanonicalCatalog(t *testing.T) {
reg := registry.New()
Register{Service}Keys(reg)
var allDefs []domain.KeyDef
for _, def := range reg.List(domain.KindConfig) {
allDefs = append(allDefs, def)
}
for _, def := range reg.List(domain.KindSetting) {
allDefs = append(allDefs, def)
}
mismatches := catalog.ValidateKeyDefs(allDefs, catalog.AllSharedKeys()...)
for _, m := range mismatches {
t.Errorf("catalog mismatch: %s", m)
}
}
Any mismatch = NON-COMPLIANT. The canonical catalog in lib-commons/commons/systemplane/catalog/ defines the source of truth for shared key names, tiers, and components.
KindSetting vs KindConfig decision tree:
Is this a per-tenant business rule?
YES → KindSetting (AllowedScopes: [ScopeGlobal, ScopeTenant])
NO ↓
Is this a product-wide feature flag or business parameter?
YES → KindSetting (AllowedScopes: [ScopeGlobal])
NO ↓
Is this an infrastructure knob (DB, Redis, auth, telemetry)?
YES → KindConfig
NO ↓
Default: KindConfig
Products MUST NOT leave the Settings API empty. If no per-tenant settings are identified, document the justification in systemplane-guide.md.
Verification: grep "Register.*Keys\|KeyDefs()" internal/bootstrap/ + go build ./...
Gate 3: Bundle + BundleFactory
Always executes. This gate creates the runtime resource container and its incremental builder.
Dispatch ring:backend-engineer-golang with context from Gates 1-2:
TASK: Create the Bundle, BundleFactory, and per-component infrastructure builders.
DETECTED INFRASTRUCTURE: {postgres, redis, rabbitmq, s3 — from Gate 0} KEY DEFINITIONS: {key groups and components from Gate 2}
IMPORT PATHS (use exactly these):
"github.com/LerianStudio/lib-commons/v4/commons/systemplane/domain" "github.com/LerianStudio/lib-commons/v4/commons/systemplane/ports"FILES TO CREATE:
systemplane_bundle.go— Bundle struct with:
- Per-component ownership fields (postgres, redis, rabbitmq, mongo, s3, http, logger)
Close(ctx context.Context) errormethod that closes all owned resources in REVERSE dependency orderAdoptResourcesFrom(previous domain.RuntimeBundle)method that nil-outs transferred pointers in previous- Ownership booleans per component (e.g.,
ownsPostgres,ownsRedis)Pattern:
type {Service}Bundle struct { Infra *InfraBundle HTTP *HTTPPolicyBundle Logger *LoggerBundle ownsPostgres, ownsRedis, ownsRabbitMQ, ownsObjectStorage bool } func (b *{Service}Bundle) Close(ctx context.Context) error { // Close in REVERSE dependency order, only owned resources if b.ownsObjectStorage && b.Infra.ObjectStorage != nil { b.Infra.ObjectStorage.Close() } if b.ownsRabbitMQ && b.Infra.RabbitMQ != nil { b.Infra.RabbitMQ.Close() } if b.ownsRedis && b.Infra.Redis != nil { b.Infra.Redis.Close() } if b.ownsPostgres && b.Infra.Postgres != nil { b.Infra.Postgres.Close() } return nil } func (b *{Service}Bundle) AdoptResourcesFrom(previous domain.RuntimeBundle) { prev, ok := previous.(*{Service}Bundle) if !ok || prev == nil { return } if !b.ownsPostgres { prev.Infra.Postgres = nil } if !b.ownsRedis { prev.Infra.Redis = nil } // ... etc }
systemplane_factory.go— IncrementalBundleFactory with:
Build(ctx, snapshot)— full rebuild, creates everything from scratchBuildIncremental(ctx, snapshot, previous, prevSnapshot)— component-granular rebuildkeyComponentMap— built once at factory construction from KeyDefsdiffChangedComponents(snap, prevSnap)— uses keyComponentMap to find which components changedPattern:
type {Service}BundleFactory struct { bootstrapCfg *BootstrapOnlyConfig keyComponentMap map[string]string // key → component } func (f *{Service}BundleFactory) Build(ctx context.Context, snap domain.Snapshot) (domain.RuntimeBundle, error) { // Full rebuild — creates everything from scratch } func (f *{Service}BundleFactory) BuildIncremental(ctx context.Context, snap domain.Snapshot, previous domain.RuntimeBundle, prevSnap domain.Snapshot) (domain.RuntimeBundle, error) { // Diff changed components, rebuild only what changed, reuse the rest }
systemplane_factory_infra.go— Per-component builders:
buildPostgres(ctx, snapshot)— creates postgres client from snapshot keysbuildRedis(ctx, snapshot)— creates redis client from snapshot keysbuildRabbitMQ(ctx, snapshot)— creates rabbitmq connection from snapshot keysbuildObjectStorage(ctx, snapshot)— creates S3 client from snapshot keysbuildLogger(snapshot)— creates logger from snapshot keysbuildHTTPPolicy(snapshot)— extracts HTTP policy from snapshot keys (one per detected infrastructure — omit unused components)
Completion criteria: Bundle builds successfully from snapshot, incremental rebuild reuses unchanged components, go build ./... passes.
Verification: grep "IncrementalBundleFactory\|BundleFactory\|AdoptResourcesFrom" internal/bootstrap/ + go build ./...
Gate 4: Reconcilers (Conditional)
SKIP IF: no background workers AND no RabbitMQ AND no HTTP policy changes detected in Gate 0.
When to include each reconciler:
HTTPPolicyReconciler— if HTTP policy keys exist (body limit, CORS, timeouts)PublisherReconciler— if RabbitMQ detectedWorkerReconciler— if background workers detected
Dispatch ring:backend-engineer-golang with context from Gates 1-3:
TASK: Create reconcilers for side effects on config changes.
DETECTED: {workers: Y/N, rabbitmq: Y/N, http_policy_keys: Y/N from Gates 0-2}
IMPORT PATHS (use exactly these):
"github.com/LerianStudio/lib-commons/v4/commons/systemplane/domain" "github.com/LerianStudio/lib-commons/v4/commons/systemplane/ports"FILES TO CREATE (only for detected components):
systemplane_reconciler_http.go— PhaseValidation reconciler for HTTP config:
- Validates body limit is non-negative
- Validates CORS origins are present
- Can REJECT changes (returns error to abort the reload)
func (r *HTTPPolicyReconciler) Phase() domain.ReconcilerPhase { return domain.PhaseValidation }
systemplane_reconciler_publishers.go— PhaseValidation for RabbitMQ publisher staging (if RMQ detected):
- When RabbitMQ connection changed, creates staged publishers on candidate bundle
- The observer callback swaps them in later
func (r *PublisherReconciler) Phase() domain.ReconcilerPhase { return domain.PhaseValidation }
systemplane_reconciler_worker.go— PhaseSideEffect for worker restart (if workers detected):
- Reads worker config from snapshot
- Calls
workerManager.ApplyConfig(cfg)to restart affected workers- Runs LAST because it has external side effects
func (r *WorkerReconciler) Phase() domain.ReconcilerPhase { return domain.PhaseSideEffect }Phase ordering is enforced by the type system — you cannot register a reconciler without declaring its phase. The supervisor stable-sorts by phase, so reconcilers within the same phase retain their registration order.
Phase execution order: PhaseStateSync (0) → PhaseValidation (1) → PhaseSideEffect (2)
Completion criteria: Reconcilers compile, implement ports.BundleReconciler interface, phase ordering is correct. go build ./... passes.
Verification: grep "BundleReconciler\|Reconcile\|Phase()" internal/bootstrap/systemplane_reconciler_*.go + go build ./...
Gate 5: Identity + Authorization
Always executes. Systemplane HTTP endpoints require identity resolution and permission checking.
Dispatch ring:backend-engineer-golang with context from Gate 1 analysis:
TASK: Implement IdentityResolver and Authorizer for systemplane HTTP endpoints.
CONTEXT FROM GATE 1: {Authentication/authorization analysis — how JWT is parsed, where user ID/tenant ID are extracted}
IMPORT PATHS (use exactly these):
"github.com/LerianStudio/lib-commons/v4/commons/systemplane/domain" "github.com/LerianStudio/lib-commons/v4/commons/systemplane/ports"FILES TO CREATE:
systemplane_identity.go— Implementsports.IdentityResolver:type {Service}IdentityResolver struct{} func (r *{Service}IdentityResolver) Actor(ctx context.Context) (domain.Actor, error) { uid := auth.GetUserID(ctx) // use the service's existing auth context extraction if uid == "" { uid = "anonymous" } return domain.Actor{ID: uid}, nil } func (r *{Service}IdentityResolver) TenantID(ctx context.Context) (string, error) { return auth.GetTenantID(ctx) // use the service's existing tenant extraction }
systemplane_authorizer.go— Implementsports.Authorizer: Maps 9 permissions to the service's auth model:
system/configs:read— view config valuessystem/configs:write— update config valuessystem/configs/schema:read— view key definitionssystem/configs/history:read— view config change historysystem/configs/reload:write— force full reloadsystem/settings:read— view settingssystem/settings:write— update settingssystem/settings/schema:read— view setting key definitionssystem/settings:history:read— view settings history (scope-dependent)type {Service}Authorizer struct { authEnabled bool } func (a *{Service}Authorizer) Authorize(ctx context.Context, permission string) error { if !a.authEnabled { return nil } // dev mode bypass // Map permission to the service's RBAC action, call auth.Authorize() }When auth is disabled (dev mode), authorizer returns nil for all permissions. When auth is enabled, map systemplane permissions to the service's existing permission model.
Authorizer enforcement policy (MANDATORY):
The authorizer MUST perform real permission checking. Two approved patterns:
- Granular delegation (recommended): Use
ports.DelegatingAuthorizerfrom lib-commons. Splits permission string and delegates to external auth service. - Admin-only with justification: Use a binary admin check BUT document the justification in
systemplane-guide.mdand add a code comment explaining why granular is not used.
PROHIBITED patterns:
- No-op authorizer (always returns nil) = automatic Gate 5 FAILURE
- Authorizer that ignores the permission string = NON-COMPLIANT
- Authorizer with no auth-enabled check (must handle disabled auth)
Default implementations from lib-commons:
ports.AllowAllAuthorizer— for auth-disabled modeports.DelegatingAuthorizer— for auth-enabled mode with per-permission delegationports.FuncIdentityResolver— adapts context extraction functions to IdentityResolver
Completion criteria: Both interfaces implemented, go build ./... passes.
Verification: grep "IdentityResolver\|Authorizer\|Actor\|TenantID\|Authorize" internal/bootstrap/systemplane_identity.go internal/bootstrap/systemplane_authorizer.go + go build ./...
Gate 6: Config Manager Bridge
Always executes. This gate ensures backward compatibility — existing code reading the Config struct continues to work.
Dispatch ring:backend-engineer-golang with context from Gates 1-2:
TASK: Create the Config Manager bridge that hydrates the existing Config struct from systemplane Snapshot.
CONTEXT FROM GATE 1: {Config struct location, all fields, current defaults} KEY DEFINITIONS: {All keys from Gate 2 with their snapshot key names}
FILES TO CREATE:
config_manager_systemplane.go— StateSync callback + snapshot hydration:Two core functions:
configFromSnapshot(snap domain.Snapshot) *Config— builds a Config entirely from snapshot values. ALL fields come from the snapshot — no bootstrap overlay. This is used bydefaultConfig().snapshotToFullConfig(snap domain.Snapshot, oldCfg *Config) *Config— hydrates from snapshot, then overlays bootstrap-only fields from the previous config (they never change at runtime).Helper functions for type-safe extraction with JSON coercion:
snapString(snap, key, fallback) stringsnapInt(snap, key, fallback) intsnapBool(snap, key, fallback) boolsnapFloat64(snap, key, fallback) float64The StateSync callback (registered on Manager):
StateSync: func(_ context.Context, snap domain.Snapshot) { newCfg := snapshotToFullConfig(snap, baseCfg) configManager.swapConfig(newCfg) },
config_manager_seed.go— One-time env → store seed:
- Reads current env vars and seeds them into systemplane store
- Only runs on first boot (when store is empty)
- Preserves existing configuration values during migration
- Skips bootstrap-only keys (they don't go into the store)
- Uses
store.Put()withdomain.RevisionZerofor initial write
config_manager_helpers.go— Type-safe value comparison:
valuesEquivalent(a, b any) bool— handles JSON coercion (float64 vs int)- Used by seed logic to avoid overwriting store values that match env defaults
config_validation.go— Production config guards:
- Validates critical config combinations at startup
- Used as
ConfigWriteValidatoron the Manager- Prevents invalid states (e.g., TLS enabled without cert path)
- Returns non-nil error to REJECT a write before persistence
CRITICAL: The
defaultConfig()function MUST be updated to derive from KeyDefs:func defaultConfig() *Config { return configFromSnapshot(defaultSnapshotFromKeyDefs({service}KeyDefs())) }This eliminates drift between defaults and key definitions. If you change a default in
{service}KeyDefs(), the Config struct picks it up automatically.
Completion criteria: Config struct hydrates correctly from snapshot, existing code reading configManager.Get() gets updated values after StateSync, go build ./... passes.
Verification: grep "configFromSnapshot\|snapshotToFullConfig\|StateSync\|seedStore" internal/bootstrap/ + go build ./...
Gate 7: Wiring + HTTP Mount + Swagger + ChangeFeed ⛔ CRITICAL
<cannot_skip>
This gate is NEVER skippable. It is the most critical gate.
This is where the systemplane comes alive. Without Gate 7, keys are metadata, bundles are dead code, and reconcilers never fire. Gate 7 wires everything together: the 11-step init sequence, HTTP endpoint mount, Swagger merge, ChangeFeed subscription, active bundle state, and shutdown integration.
⛔ HARD GATE: Gate 7 MUST be fully implemented. There is no "partial" Gate 7. Every subcomponent below is required.
</cannot_skip>
Dispatch ring:backend-engineer-golang with context from ALL previous gates:
TASK: Wire the complete systemplane lifecycle — init, HTTP mount, swagger merge, change feed, bundle state, and shutdown.
ALL PREVIOUS GATES: {Key definitions from Gate 2, Bundle/Factory from Gate 3, Reconcilers from Gate 4, Identity/Auth from Gate 5, Config Bridge from Gate 6}
IMPORT PATHS (use exactly these):
"github.com/LerianStudio/lib-commons/v4/commons/systemplane/domain" "github.com/LerianStudio/lib-commons/v4/commons/systemplane/ports" "github.com/LerianStudio/lib-commons/v4/commons/systemplane/registry" "github.com/LerianStudio/lib-commons/v4/commons/systemplane/service" "github.com/LerianStudio/lib-commons/v4/commons/systemplane/bootstrap" "github.com/LerianStudio/lib-commons/v4/commons/systemplane/bootstrap/builtin" "github.com/LerianStudio/lib-commons/v4/commons/systemplane/adapters/changefeed" fiberhttp "github.com/LerianStudio/lib-commons/v4/commons/systemplane/adapters/http/fiber" systemswagger "github.com/LerianStudio/lib-commons/v4/commons/systemplane/swagger"FILES TO CREATE:
1.
systemplane_init.go— The 11-step initialization sequencefunc Init{Service}Systemplane(ctx context.Context, cfg *Config, configManager *ConfigManager, workerManager *WorkerManager, logger log.Logger, observer func(service.ReloadEvent)) (*SystemplaneComponents, error) { // Step 1: Extract bootstrap-only config (immutable for process lifetime) bootstrapCfg := ExtractBootstrapOnlyConfig(cfg) // Step 2: Load backend config (default: reuse app's Postgres DSN) backendCfg := Load{Service}BackendConfig(cfg) // Step 3: Create registry + register ALL keys reg := registry.New() if err := Register{Service}Keys(reg); err != nil { return nil, err } // Step 4: Configure backend with registry metadata (secret keys + apply behaviors) configureBackendWithRegistry(backendCfg, reg) // Step 5: Create backend (Store + History + ChangeFeed) backend, err := builtin.NewBackendFromConfig(ctx, backendCfg) if err != nil { return nil, fmt.Errorf("systemplane backend: %w", err) } // Step 6: Create snapshot builder snapBuilder, err := service.NewSnapshotBuilder(reg, backend.Store) if err != nil { backend.Close(); return nil, err } // Step 7: Create bundle factory (supports incremental builds) bundleFactory := New{Service}BundleFactory(&bootstrapCfg) // Step 8: Seed store from current env-var config (first boot only) if err := seedStoreForInitialReload(ctx, configManager, reg, backend.Store); err != nil { backend.Close(); return nil, err } // Step 9: Build reconcilers (phase-sorted by supervisor) reconcilers := buildReconcilers(workerManager) // Step 10: Create supervisor + initial reload supervisor, err := service.NewSupervisor(service.SupervisorConfig{ Builder: snapBuilder, Factory: bundleFactory, Reconcilers: reconcilers, Observer: observer, }) if err != nil { backend.Close(); return nil, err } if err := supervisor.Reload(ctx, "initial-bootstrap"); err != nil { backend.Close(); return nil, err } // Step 11: Create manager with callbacks baseCfg := configManager.Get() manager, err := service.NewManager(service.ManagerConfig{ Registry: reg, Store: backend.Store, History: backend.History, Supervisor: supervisor, Builder: snapBuilder, ConfigWriteValidator: productionConfigGuards(baseCfg), StateSync: func(_ context.Context, snap domain.Snapshot) { newCfg := snapshotToFullConfig(snap, baseCfg) configManager.swapConfig(newCfg) }, }) if err != nil { backend.Close(); return nil, err } return &SystemplaneComponents{ ChangeFeed: backend.ChangeFeed, Supervisor: supervisor, Manager: manager, Backend: backend.Closer, }, nil }2.
systemplane_mount.go— HTTP route registrationfunc MountSystemplaneAPI(app *fiber.App, manager service.Manager, authEnabled bool) error { authorizer := &{Service}Authorizer{authEnabled: authEnabled} identity := &{Service}IdentityResolver{} handler, err := fiberhttp.NewHandler(manager, identity, authorizer) if err != nil { return fmt.Errorf("create systemplane handler: %w", err) } handler.Mount(app) // registers all 9 /v1/system/* routes return nil }This registers ALL 9 endpoints:
GET /v1/system/configsPATCH /v1/system/configsGET /v1/system/configs/schemaGET /v1/system/configs/historyPOST /v1/system/configs/reloadGET /v1/system/settingsPATCH /v1/system/settingsGET /v1/system/settings/schemaGET /v1/system/settings/history3. Swagger Integration
In the app's swagger setup (wherever the swagger middleware is configured):
import systemswagger "github.com/LerianStudio/lib-commons/v4/commons/systemplane/swagger" // After swag init generates the base spec: merged, err := systemswagger.MergeInto(baseSwaggerSpec) if err != nil { return fmt.Errorf("merge systemplane swagger: %w", err) } // Use merged spec for swagger UI handler4.
active_bundle_state.go— Thread-safe live-read accessortype activeBundleState struct { mu sync.RWMutex bundle *{Service}Bundle } func (s *activeBundleState) Current() *{Service}Bundle { s.mu.RLock() defer s.mu.RUnlock() return s.bundle } func (s *activeBundleState) Update(b *{Service}Bundle) { s.mu.Lock() defer s.mu.Unlock() s.bundle = b }Used by infrastructure consumers for live-read access to the current bundle.
5. ChangeFeed integration (in init.go wiring)
debouncedFeed := changefeed.NewDebouncedFeed( spComponents.ChangeFeed, changefeed.WithWindow(200 * time.Millisecond), ) feedCtx, cancelFeed := context.WithCancel(ctx) go func() { _ = debouncedFeed.Subscribe(feedCtx, func(signal ports.ChangeSignal) { _ = spComponents.Manager.ApplyChangeSignal(feedCtx, signal) }) }()6. Shutdown sequence integration
func (s *Service) Stop(ctx context.Context) { s.configManager.Stop() // 1. Prevent mutations s.cancelChangeFeed() // 2. Stop change feed BEFORE supervisor s.spComponents.Supervisor.Stop(ctx) // 3. Stop supervisor + close bundle s.spComponents.Backend.Close() // 4. Close store connection s.workerManager.Stop() // 5. Stop workers }7. Observer callback (passed to InitSystemplane)
runtimeReloadObserver := func(event service.ReloadEvent) { bundle := event.Bundle.(*{Service}Bundle) bundleState.Update(bundle) configManager.UpdateFromSystemplane(event.Snapshot) // Swap logger, publishers, etc. }8. Bootstrap config file —
config/.config-map.exampleLists all bootstrap-only keys with their env var names. Operator reference for deployment configuration.
# {Service} — Bootstrap-Only Configuration (requires restart) # # These are the ONLY settings that require a container/pod restart. # Everything else is hot-reloadable via: # # GET /v1/system/configs — view current runtime config # PATCH /v1/system/configs — change any runtime-managed key # GET /v1/system/configs/schema — see all keys, types, and mutability # GET /v1/system/configs/history — audit trail of changes ENV_NAME=development SERVER_ADDRESS=:8080 AUTH_ENABLED=false ENABLE_TELEMETRY=false # ... all bootstrap-only keys
Completion criteria:
- All 9 HTTP routes accessible
- ChangeFeed fires on store changes
- Supervisor rebuilds bundle on config change
- Swagger UI shows systemplane routes (via MergeInto)
- Graceful shutdown closes all resources in correct order
- Active bundle state provides live-read access to request handlers
- Observer callback updates bundle state and config manager on reload
Verification:
grep "handler.Mount\|fiberhttp.NewHandler" internal/bootstrap/
grep "DebouncedFeed\|Subscribe\|ApplyChangeSignal" internal/bootstrap/
grep "swagger.MergeInto\|systemswagger" internal/bootstrap/ internal/
grep "activeBundleState\|bundleState.Update\|bundleState.Current" internal/bootstrap/
grep "cancelChangeFeed\|Supervisor.Stop\|Backend.Close" internal/bootstrap/
# Verify active bundle state exists for thread-safe live-read
grep -rn "activeBundleState\|bundleState" internal/bootstrap/ --include='*.go'
go build ./...
⛔ Gate 7 Anti-Rationalization Table
| Rationalization | WRONG BECAUSE | REQUIRED ACTION |
|---|---|---|
| "Mount is straightforward, I'll skip it" | Without Mount, 9 endpoints are unreachable — the entire management API is dead | MUST implement handler.Mount(app) |
| "Swagger is optional documentation" | Without swagger merge, operators cannot discover systemplane endpoints in Swagger UI | MUST call systemswagger.MergeInto() |
| "ChangeFeed can be added later" | Without ChangeFeed, hot-reload is broken — config changes in DB are invisible to the service | MUST start DebouncedFeed |
| "Shutdown is handled by the framework" | Supervisor, ChangeFeed, and Bundle hold resources that leak without explicit shutdown | MUST integrate 5-step shutdown sequence |
| "Active bundle state is internal detail" | Without thread-safe accessor, request handlers cannot read live config — race conditions | MUST implement active_bundle_state.go |
| "Init function is just boilerplate" | The 11-step sequence has strict ordering — wrong order causes nil panics or stale data | MUST follow exact step order |
| "Observer callback is optional" | Without observer, bundle state and config manager are never updated after reload | MUST register observer on Supervisor |
| "Config map example is nice-to-have" | Operators need to know which keys require restart vs hot-reload — without it, they restart for everything | MUST create config/.config-map.example |
Gate 8: Code Review
Dispatch 10 parallel reviewers (same pattern as ring:codereview).
MUST include this context in ALL 10 reviewer dispatches:
SYSTEMPLANE REVIEW CONTEXT:
- Systemplane is a three-tier configuration system: bootstrap (env, immutable) → runtime (DB, hot-reload) → live-read (snapshot, per-request).
- Every key has an ApplyBehavior that determines how changes propagate: BootstrapOnly → BundleRebuildAndReconcile → BundleRebuild → WorkerReconcile → LiveRead.
- IncrementalBundleFactory rebuilds only changed components. Ownership tracking prevents double-free.
- StateSync callback keeps the Config struct in sync for backward compatibility.
- The 9 HTTP endpoints on /v1/system/* expose config management with If-Match/ETag concurrency.
- ChangeFeed (pg_notify or change_stream) triggers automatic reload on database changes.
- swagger.MergeInto() adds systemplane routes to the app's Swagger spec.
| Reviewer | Systemplane-Specific Focus |
|---|---|
| ring:code-reviewer | Architecture, lib-commons v4 systemplane usage, three-tier separation, package boundaries, import paths |
| ring:business-logic-reviewer | Key classification correctness, ApplyBehavior assignments, component granularity, default values match current behavior |
| ring:security-reviewer | Secret key redaction (RedactFull/RedactMask), authorization model, no credential leaks in config API responses, AES-256-GCM encryption |
| ring:test-reviewer | Key registration tests, bundle factory tests, reconciler tests, contract tests, config hydration tests |
| ring:nil-safety-reviewer | Nil risks in snapshot reads, bundle adoption, reconciler error paths, active bundle state before first reload |
| ring:consequences-reviewer | Impact on existing config reads, backward compat via StateSync, degradation when store unavailable, shutdown resource cleanup |
| ring:dead-code-reviewer | Orphaned env-reading code, dead config helpers replaced by systemplane, unused YAML/viper imports, stale .env files |
| ring:performance-reviewer | Hot-path allocations in config reads, caching efficiency, systemplane polling overhead, connection pool sizing |
| ring:multi-tenant-reviewer | Systemplane key scoping across tenants, tenant isolation in config snapshots, cross-tenant config leakage prevention |
| ring:lib-commons-reviewer | lib-commons v4 systemplane package adoption, canonical patterns vs custom reimplementations, shared helper reuse |
⛔ MANDATORY: All 10 reviewers must PASS. 9/10 = FAIL. Critical findings → fix and re-review.
Gate 9: User Validation
MUST approve: present checklist for explicit user approval.
## Systemplane Migration Complete
- [ ] All keys registered and classified (count matches preview: {N} keys)
- [ ] BundleFactory builds successfully with IncrementalBundleFactory
- [ ] Incremental rebuild reuses unchanged components (ownership tracking works)
- [ ] Reconcilers fire on config changes (if applicable)
- [ ] Identity + Authorization wired (JWT → Actor, 9 permissions mapped)
- [ ] Config Manager bridge works (StateSync hydrates Config struct from Snapshot)
- [ ] `defaultConfig()` derives from KeyDefs (single source of truth)
- [ ] HTTP Mount: all 9 /v1/system/* endpoints accessible
- [ ] Swagger: systemplane routes visible in Swagger UI (via MergeInto)
- [ ] ChangeFeed: config changes in DB trigger supervisor reload
- [ ] Active bundle state: thread-safe live-read for request handlers
- [ ] Shutdown: clean 5-step resource release on SIGTERM
- [ ] Backward compat: service starts normally without systemplane store (degrades to defaults)
- [ ] Tests pass: `go test ./...`
- [ ] Code review passed: all 10 reviewers PASS
Gate 10: Activation Guide
MUST generate docs/systemplane-guide.md in the project root. Direct, concise, no filler text.
The guide is built from Gate 0 (stack detection), Gate 1 (analysis), and Gate 2 (key inventory).
The guide MUST include:
1. Components Table
| Component | Purpose | Status |
|---|---|---|
| Registry | Key definitions (types, defaults, validators) | {N} keys registered |
| Supervisor | Reload lifecycle (snapshot → bundle → reconcile → swap) | Active |
| Manager | HTTP API backend (reads, writes, schema, history, resync) | Active |
| BundleFactory | IncrementalBundleFactory with component-granular rebuilds | Active |
| Reconcilers | Side-effect appliers (HTTP, Publisher, Worker) | {N} reconcilers |
| ChangeFeed | Database change → DebouncedFeed → Supervisor.Reload | Active |
2. Bootstrap-Only Config Reference
| Env Var | Default | Description | Requires Restart |
|---|---|---|---|
ENV_NAME |
development |
Environment name | Yes |
SERVER_ADDRESS |
:8080 |
Listen address | Yes |
AUTH_ENABLED |
false |
Enable auth middleware | Yes |
| ... | ... | ... | ... |
3. Runtime-Managed Config Reference
| Key | Default | Type | ApplyBehavior | Hot-Reloadable |
|---|---|---|---|---|
rate_limit.max |
100 |
int | LiveRead | Yes (instant) |
postgres.primary_host |
localhost |
string | BundleRebuild | Yes (rebuilds PG client) |
| ... | ... | ... | ... | ... |
4. Activation Steps
- Ensure PostgreSQL (or MongoDB) is running
- Set
SYSTEMPLANE_SECRET_MASTER_KEYenv var (32 bytes, raw or base64) - Start the service — systemplane auto-creates tables/collections
- On first boot, env var values are seeded into the store
- Verify:
curl http://localhost:{port}/v1/system/configs | jq
5. Verification Commands
# View current runtime config
curl -s http://localhost:{port}/v1/system/configs | jq
# View schema (all keys, types, mutability)
curl -s http://localhost:{port}/v1/system/configs/schema | jq
# Change a runtime key
curl -X PATCH http://localhost:{port}/v1/system/configs \
-H 'Content-Type: application/json' \
-H 'If-Match: "current-revision"' \
-d '{"values": {"rate_limit.max": 200}}'
# View change history
curl -s http://localhost:{port}/v1/system/configs/history | jq
# Force full reload
curl -X POST http://localhost:{port}/v1/system/configs/reload
# View settings
curl -s http://localhost:{port}/v1/system/settings?scope=global | jq
6. Degradation Behavior
| Scenario | Behavior |
|---|---|
| Systemplane backend unavailable at startup | Service starts with env var defaults (no hot-reload) |
| Systemplane backend goes down after startup | Last-known config remains active (no updates until reconnect) |
| ChangeFeed disconnects | Auto-reconnect with exponential backoff; manual POST /v1/system/configs/reload available |
| Invalid config write | ConfigWriteValidator rejects before persistence; no impact on running config |
| Bundle build failure | Previous bundle stays active; error logged; no disruption |
7. Common Errors and Solutions
| Error | Cause | Solution |
|---|---|---|
systemplane backend: connection refused |
PostgreSQL/MongoDB not running | Start the database |
register key "X": duplicate key |
Key registered twice | Check for duplicate key names in key definition files |
revision mismatch (409) |
Concurrent write conflict | Re-read current revision, retry PATCH with updated If-Match |
key is not mutable at runtime (400) |
Tried to PATCH a bootstrap-only key | This key requires a restart — update env var instead |
secret master key required |
SYSTEMPLANE_SECRET_MASTER_KEY not set |
Set the env var (32 bytes) |
State Persistence
Save to docs/dev-systemplane-migration/current-cycle.json for resume support:
{
"skill": "ring:dev-systemplane-migration",
"version": "2.0.0",
"service": "{service_name}",
"started_at": "ISO8601",
"current_gate": "2",
"gates_completed": ["0", "1", "1.5"],
"gates_skipped": [],
"detection": {
"backend": "postgres",
"dependencies": ["postgres", "redis", "rabbitmq"],
"has_workers": true,
"existing_systemplane": false,
"config_pattern": "envDefault",
"key_count": 45
},
"compliance": {
"status": "NEW",
"checks": {}
}
}
Anti-Rationalization Table
| Rationalization | WRONG BECAUSE | REQUIRED ACTION |
|---|---|---|
| "Keys are registered, migration is done" | Without BundleFactory + Wiring, keys are just metadata — nothing reads them at runtime | Complete ALL gates through Gate 7 |
| "We can wire it later" | Gate 7 is where the system comes alive — without it, systemplane is dead code | Gate 7 is NEVER skippable |
| "Swagger integration is cosmetic" | Operators need API discoverability; without it, 9 endpoints are invisible | MUST call systemswagger.MergeInto() |
| "Config bridge is backward compat only" | Existing code reads Config struct — without StateSync, it reads stale bootstrap values forever | MUST implement Gate 6 |
| "Reconcilers are optional" | Without reconcilers, workers and HTTP policies ignore config changes — hot-reload is partial | Evaluate at Gate 4; implement if infra detected |
| "Service already has systemplane code" | Existence ≠ compliance. Code that doesn't follow canonical patterns is WRONG and must be fixed. | STOP. Run compliance audit (Gate 0 Phase 2). Fix every NON-COMPLIANT component. |
| "Partial systemplane is good enough" | Partial = broken. Without ChangeFeed, no hot-reload. Without Mount, no API. Without StateSync, stale config. | STOP. Every component is required. Execute all gates. |
| "The current approach works fine" | Working ≠ compliant. Custom config management creates drift and blocks standardization. | Replace with canonical systemplane implementation. |
| "Skip code review, we tested it" | One wiring mistake = resource leak, config corruption, or silent degradation. 10 reviewers catch what tests miss. | MANDATORY: All 10 reviewers must PASS. |
| "ChangeFeed can be added later" | Later = never. Without ChangeFeed, the database is a dead store — changes are invisible. | MUST start DebouncedFeed in Gate 7 |
| "Agent says out of scope" | Skill defines scope, not agent. | Re-dispatch with gate context |
| "Active bundle state is an implementation detail" | Without it, request handlers cannot safely read live config. Race conditions and nil panics. | MUST implement in Gate 7 |
| "Key names don't matter, it's just a string" | Different names for the same config break cross-product dashboards, Helm overlays, and operator muscle memory. Names are the API. | MUST match canonical catalog names |
| "Our tier classification is correct for our use case" | Products don't get to choose tiers independently. PG pool tuning is LiveRead everywhere or nowhere — mixed tiers break operator expectations | MUST match canonical tier classification |
| "Admin-only auth is fine for now" | It's been "for now" across 3 products and 2 years. Granular permissions exist in lib-commons but are unused. | MUST implement granular or document justification |
Appendix A: Package Structure Reference
The systemplane lives in lib-commons/v4/commons/systemplane/ — a self-contained,
backend-agnostic library with zero imports of internal application packages.
It was extracted from Matcher's pkg/systemplane/ into lib-commons v4.3.0 to be
shared across all Lerian services.
Directory Structure
commons/systemplane/ # in lib-commons repo
├── doc.go # Package doc
├── domain/ # Pure value objects, no infra deps
│ ├── actor.go # Actor{ID}
│ ├── apply_behavior.go # ApplyBehavior enum + Strength()
│ ├── backend_kind.go # BackendKind enum (postgres, mongodb)
│ ├── bundle.go # RuntimeBundle interface
│ ├── entry.go # Entry (persisted override record)
│ ├── errors.go # 14 sentinel errors
│ ├── key_def.go # KeyDef, ValueType, RedactPolicy, ValidatorFunc, ComponentNone
│ ├── kind.go # Kind enum (config, setting)
│ ├── nil_value.go # IsNilValue() — reflect-based typed-nil detection
│ ├── reconciler_phase.go # ReconcilerPhase enum (3 phases)
│ ├── revision.go # Revision type (uint64 wrapper)
│ ├── scope.go # Scope enum (global, tenant)
│ ├── snapshot.go # Snapshot + EffectiveValue
│ └── target.go # Target (kind/scope/subject coordinate)
├── ports/ # Interface definitions (hex boundary)
│ ├── authorizer.go # Authorizer
│ ├── bundle_factory.go # BundleFactory + IncrementalBundleFactory
│ ├── changefeed.go # ChangeFeed + ChangeSignal
│ ├── history.go # HistoryStore + HistoryEntry + HistoryFilter
│ ├── identity.go # IdentityResolver
│ ├── reconciler.go # BundleReconciler (phased)
│ └── store.go # Store + WriteOp + ReadResult
├── registry/ # Thread-safe key definition registry
│ ├── registry.go # Registry interface + inMemoryRegistry
│ └── validation.go # Type validation with JSON coercion
├── service/ # Use-case orchestration
│ ├── escalation.go # Escalate() — strongest ApplyBehavior
│ ├── manager.go # Manager interface + ManagerConfig
│ ├── manager_helpers.go # Redaction, schema building
│ ├── manager_reads.go # GetConfigs, GetSettings, GetSchema, GetHistory, Resync
│ ├── manager_writes.go # PatchConfigs, PatchSettings, ApplyChangeSignal
│ ├── manager_writes_helpers.go # Validation, escalation, snapshot preview
│ ├── snapshot_builder.go # 3-layer cascade: default → global → tenant
│ ├── supervisor.go # Supervisor interface + Reload lifecycle
│ └── supervisor_helpers.go # Build strategy, phase sort, resource adoption
├── bootstrap/ # Backend wiring from env/config
│ ├── backend.go # BackendFactory registry + NewBackendFromConfig
│ ├── classifier.go # IsBootstrapOnly / IsRuntimeManaged
│ ├── config.go # BootstrapConfig, PostgresBootstrapConfig, MongoBootstrapConfig
│ ├── defaults.go # Default table/collection names
│ ├── env.go # LoadFromEnv() — SYSTEMPLANE_* env vars
│ ├── postgres_identifiers.go # Regex validation for PG identifiers
│ └── builtin/
│ └── backend.go # init()-registered factories (PG + Mongo)
├── adapters/
│ ├── store/
│ │ ├── secretcodec/
│ │ │ └── codec.go # AES-256-GCM encryption for secret values
│ │ ├── storetest/
│ │ │ └── contract.go # 15 shared contract tests (Store + History)
│ │ ├── postgres/ # 3 tables, optimistic concurrency, pg_notify
│ │ │ ├── ddl.go # CREATE TABLE templates
│ │ │ ├── postgres.go # New(), NewFromDB() constructors
│ │ │ ├── store.go # Get(), Put()
│ │ │ ├── store_mutation.go # applyOp, upsert, delete, insertHistory
│ │ │ ├── store_revision.go # lock/read/update revision
│ │ │ ├── store_runtime.go # encrypt/decrypt, notify, escalate
│ │ │ ├── json_decode.go # Integer-preserving JSON decoder
│ │ │ ├── history.go # ListHistory with dynamic builder
│ │ │ └── identifiers.go # qualify() helper
│ │ └── mongodb/ # Sentinel revision doc, multi-doc txns
│ │ ├── mongodb.go # New(), ensureIndexes, txn support check
│ │ ├── store.go # Get(), Put() with sessions
│ │ ├── store_helpers.go # applyOperation, encrypt, CRUD
│ │ ├── models.go # BSON models + normalization
│ │ └── history.go # ListHistory with BSON filter
│ ├── changefeed/
│ │ ├── debounce.go # Trailing-edge per-target debounce
│ │ ├── debounce_helpers.go # Timer management, jitter
│ │ ├── safe_handler.go # Panic-to-error conversion
│ │ ├── feedtest/
│ │ │ └── contract.go # 3 shared contract tests for ChangeFeed
│ │ ├── postgres/ # LISTEN/NOTIFY, auto-reconnect, revision resync
│ │ │ ├── feed.go # Feed struct + Subscribe
│ │ │ ├── feed_subscribe.go # subscribeLoop, reconnect, listenLoop
│ │ │ ├── feed_runtime.go # backoff, jitter, revision validation
│ │ │ └── options.go # WithReconnectBounds, WithRevisionSource
│ │ └── mongodb/ # change_stream or poll mode
│ │ ├── feed.go # Feed struct + Subscribe
│ │ ├── feed_stream.go # subscribeChangeStream
│ │ ├── feed_poll.go # subscribePoll, pollRevisions
│ │ └── feed_bson.go # BSON helpers
│ └── http/
│ └── fiber/ # 9 endpoints, DTOs, middleware
│ ├── handler.go # Handler struct, NewHandler, Mount()
│ ├── handler_configs.go # Config CRUD + reload
│ ├── handler_settings.go # Settings CRUD
│ ├── dto.go # All request/response DTOs
│ ├── middleware.go # requireAuth, settingsAuth
│ └── errors.go # Domain error → HTTP status mapping
├── swagger/ # Embedded OpenAPI spec for merge
│ └── ... # swagger.MergeInto(baseSpec) ([]byte, error)
└── testutil/ # Test doubles
├── fake_store.go # In-memory Store with concurrency
├── fake_history.go # In-memory HistoryStore
├── fake_bundle.go # FakeBundle + FakeBundleFactory
├── fake_reconciler.go # Records all calls, configurable phase
└── fake_incremental_bundle.go # FakeIncrementalBundleFactory
Appendix B: Domain Model Reference
Pure value objects. No infrastructure dependencies.
| Type | Purpose | Key Fields |
|---|---|---|
Entry |
Persisted config override | Kind, Scope, Subject, Key, Value any, Revision, UpdatedAt, UpdatedBy, Source |
Kind |
Config vs Setting | KindConfig ("config") or KindSetting ("setting") |
Scope |
Visibility | ScopeGlobal ("global") or ScopeTenant ("tenant") |
Target |
Coordinate for a group of entries | Kind + Scope + SubjectID; constructor NewTarget() validates |
Revision |
Monotonic version counter | uint64; RevisionZero = 0; methods: Next(), Uint64() |
Actor |
Who made the change | ID string |
KeyDef |
Registry metadata per key | Key, Kind, AllowedScopes, DefaultValue, ValueType, Validator ValidatorFunc, Secret, RedactPolicy, ApplyBehavior, MutableAtRuntime, Description, Group, Component |
ReconcilerPhase |
Reconciler execution ordering | PhaseStateSync (0), PhaseValidation (1), PhaseSideEffect (2) |
Snapshot |
Immutable point-in-time view | Configs, GlobalSettings, TenantSettings map[string]map[string]EffectiveValue, Revision, BuiltAt |
EffectiveValue |
Resolved value with override info | Key, Value, Default, Override, Source, Revision, Redacted |
RuntimeBundle |
App-defined resource container | Interface: Close(ctx) error |
ApplyBehavior |
How changes propagate | See Appendix F; has Strength() int (0-4 scale) |
ValueType |
Type constraint | "string", "int", "bool", "float", "object", "array" |
RedactPolicy |
Secret handling | RedactNone, RedactFull, RedactMask |
BackendKind |
Storage backend | BackendPostgres or BackendMongoDB |
ValidatorFunc |
Custom value validation | func(value any) error |
ComponentNone |
Sentinel for no-rebuild keys | "_none" — for pure business-logic keys |
Snapshot accessor methods (nil-safe):
GetConfig(key),GetGlobalSetting(key),GetTenantSetting(tenantID, key)→(EffectiveValue, bool)ConfigValue(key, fallback),GlobalSettingValue(key, fallback),TenantSettingValue(tenantID, key, fallback)→any
Sentinel errors (14):
var (
ErrKeyUnknown = errors.New("unknown configuration key")
ErrValueInvalid = errors.New("invalid configuration value")
ErrRevisionMismatch = errors.New("revision mismatch")
ErrScopeInvalid = errors.New("scope not allowed for this key")
ErrPermissionDenied = errors.New("permission denied")
ErrReloadFailed = errors.New("configuration reload failed")
ErrKeyNotMutable = errors.New("key is not mutable at runtime")
ErrSnapshotBuildFailed = errors.New("snapshot build failed")
ErrBundleBuildFailed = errors.New("runtime bundle build failed")
ErrBundleSwapFailed = errors.New("runtime bundle swap failed")
ErrReconcileFailed = errors.New("bundle reconciliation failed")
ErrNoCurrentBundle = errors.New("no current runtime bundle")
ErrSupervisorStopped = errors.New("supervisor has been stopped")
ErrRegistryRequired = errors.New("registry is required")
)
Per-enum parse errors: ErrInvalidKind, ErrInvalidBackendKind, ErrInvalidScope,
ErrInvalidApplyBehavior, ErrInvalidValueType, ErrInvalidReconcilerPhase.
Appendix C: Ports Interface Reference
Seven interfaces defining all external dependencies:
// Persistence — read/write config entries
type Store interface {
Get(ctx context.Context, target domain.Target) (ReadResult, error)
Put(ctx context.Context, target domain.Target, ops []WriteOp,
expected domain.Revision, actor domain.Actor, source string) (domain.Revision, error)
}
type WriteOp struct { Key string; Value any; Reset bool }
type ReadResult struct { Entries []domain.Entry; Revision domain.Revision }
// Audit trail — change history
type HistoryStore interface {
ListHistory(ctx context.Context, filter HistoryFilter) ([]HistoryEntry, error)
}
type HistoryEntry struct {
Revision domain.Revision
Key string
Scope string
SubjectID string
OldValue any
NewValue any
ActorID string
ChangedAt time.Time
}
type HistoryFilter struct { Kind, Scope, SubjectID, Key string; Limit, Offset int }
// Real-time change notifications
type ChangeFeed interface {
Subscribe(ctx context.Context, handler func(ChangeSignal)) error // blocks until ctx cancelled
}
type ChangeSignal struct {
Target domain.Target
Revision domain.Revision
ApplyBehavior domain.ApplyBehavior
}
// Permission checking
type Authorizer interface {
Authorize(ctx context.Context, permission string) error
}
// Identity extraction from request context
type IdentityResolver interface {
Actor(ctx context.Context) (domain.Actor, error)
TenantID(ctx context.Context) (string, error)
}
// Application-specific runtime dependency builder (FULL rebuild)
type BundleFactory interface {
Build(ctx context.Context, snap domain.Snapshot) (domain.RuntimeBundle, error)
}
// INCREMENTAL rebuild — extends BundleFactory with component-granular rebuilds.
// When the Supervisor detects only a subset of components changed, it calls
// BuildIncremental instead of Build, reusing unchanged components.
type IncrementalBundleFactory interface {
BundleFactory
BuildIncremental(ctx context.Context, snap domain.Snapshot,
previous domain.RuntimeBundle, prevSnap domain.Snapshot) (domain.RuntimeBundle, error)
}
// Side-effect applier when bundles change.
// Reconcilers are sorted by Phase before execution:
// PhaseStateSync → update shared state (ConfigManager, caches)
// PhaseValidation → gates that can reject the change
// PhaseSideEffect → external side effects (worker restarts)
type BundleReconciler interface {
Name() string
Phase() domain.ReconcilerPhase
Reconcile(ctx context.Context, previous, candidate domain.RuntimeBundle,
snap domain.Snapshot) error
}
Appendix D: Service Layer Reference
Registry (commons/systemplane/registry/)
Thread-safe in-memory registry of key definitions:
type Registry interface {
Register(def domain.KeyDef) error
MustRegister(def domain.KeyDef) // panics — startup only
Get(key string) (domain.KeyDef, bool)
List(kind domain.Kind) []domain.KeyDef // sorted by key name
Validate(key string, value any) error // nil value always valid (reset)
}
func New() Registry // returns *inMemoryRegistry (RWMutex-protected)
Validation (validation.go):
validateValue(def, value)→ type check + custom validator- JSON coercion:
float64without fractional part accepted asint;int/int64widened tofloat isObjectCompatible(value)usesreflect.Map;isArrayCompatible(value)usesreflect.Array/Slice- Nil values always pass validation (they mean "reset to default")
Manager
type Manager interface {
GetConfigs(ctx context.Context) (ResolvedSet, error)
GetSettings(ctx context.Context, subject Subject) (ResolvedSet, error)
PatchConfigs(ctx context.Context, req PatchRequest) (WriteResult, error)
PatchSettings(ctx context.Context, subject Subject, req PatchRequest) (WriteResult, error)
GetConfigSchema(ctx context.Context) ([]SchemaEntry, error)
GetSettingSchema(ctx context.Context) ([]SchemaEntry, error)
GetConfigHistory(ctx context.Context, filter HistoryFilter) ([]HistoryEntry, error)
GetSettingHistory(ctx context.Context, filter HistoryFilter) ([]HistoryEntry, error)
ApplyChangeSignal(ctx context.Context, signal ChangeSignal) error
Resync(ctx context.Context) error
}
ManagerConfig (constructor dependencies):
type ManagerConfig struct {
Registry registry.Registry
Store ports.Store
History ports.HistoryStore
Supervisor Supervisor
Builder *SnapshotBuilder
// ConfigWriteValidator is called BEFORE persistence with a preview snapshot.
// Return non-nil error to reject the write.
ConfigWriteValidator func(ctx context.Context, snapshot domain.Snapshot) error
// StateSync is called AFTER successful writes for live-read escalation.
// Used to update ConfigManager atomically.
StateSync func(ctx context.Context, snapshot domain.Snapshot)
}
Key behaviors:
- Read path: Uses supervisor's cached snapshot when available; falls back to builder
- Write path:
validate ops → preview snapshot (if ConfigWriteValidator configured) → escalate → store.Put → apply escalation - Escalation application: Maps strongest
ApplyBehaviorto supervisor method:ApplyBootstrapOnly→ no-opApplyLiveRead→PublishSnapshot+StateSynccallbackApplyWorkerReconcile→ReconcileCurrentApplyBundleRebuild/ApplyBundleRebuildAndReconcile→ fullReload
- Redaction: All read results and history entries are redacted per
RedactPolicy/Secretflag - Masking:
RedactMaskshows last 4 runes;RedactFullandSecret=trueshow"****"
Supporting types:
Subject { Scope, SubjectID }— settings targetPatchRequest { Ops []WriteOp, ExpectedRevision, Actor, Source }— write inputWriteResult { Revision }— write outputResolvedSet { Values map[string]EffectiveValue, Revision }— read outputSchemaEntry { Key, Kind, AllowedScopes, ValueType, DefaultValue, MutableAtRuntime, ApplyBehavior, Secret, RedactPolicy, Description, Group }— schema metadata
Supervisor
type Supervisor interface {
Current() domain.RuntimeBundle
Snapshot() domain.Snapshot
PublishSnapshot(ctx context.Context, snap domain.Snapshot, reason string) error
ReconcileCurrent(ctx context.Context, snap domain.Snapshot, reason string) error
Reload(ctx context.Context, reason string, extraTenantIDs ...string) error
Stop(ctx context.Context) error
}
SupervisorConfig:
type SupervisorConfig struct {
Builder *SnapshotBuilder
Factory ports.BundleFactory
Reconcilers []ports.BundleReconciler
// Observer is invoked AFTER each successful reload with structured info
// about the build strategy (full vs incremental).
Observer func(ReloadEvent)
}
type ReloadEvent struct {
Strategy BuildStrategy // "full" or "incremental"
Reason string // caller-supplied (e.g., "changefeed-signal")
Snapshot domain.Snapshot
Bundle domain.RuntimeBundle
}
Reload lifecycle (the heart of the system):
- Build snapshot:
builder.BuildFull(ctx, tenantIDs...) - Build bundle: Try incremental first (if factory implements
IncrementalBundleFactoryand previous exists), fall back to full - Reconcile BEFORE commit: Run all reconcilers against candidate while previous is still active (prevents state corruption on failure)
- Atomic swap:
snapshot.Store(&snap)+bundle.Store(&holder{candidate}) - Resource adoption: If candidate implements
resourceAdopter, callAdoptResourcesFrom(previous)to nil-out transferred pointers - Observer callback: Notify host application (e.g., update bundleState, swap loggers)
- Close previous:
previous.Close()— only tears down REPLACED components since transferred pointers are nil
Hidden interfaces (implement on your bundle for advanced patterns):
resourceAdopter { AdoptResourcesFrom(previous RuntimeBundle) }— called after commit to mark transferred resourcesrollbackDiscarder { Discard(ctx) error }— called on failed candidate instead ofClose()
SnapshotBuilder
type SnapshotBuilder struct { registry Registry; store Store }
func NewSnapshotBuilder(reg, store) (*SnapshotBuilder, error)
func (b) BuildConfigs(ctx) (map[string]EffectiveValue, Revision, error)
func (b) BuildGlobalSettings(ctx) (map[string]EffectiveValue, Revision, error)
func (b) BuildSettings(ctx, Subject) (map[string]EffectiveValue, Revision, error)
func (b) BuildFull(ctx, tenantIDs ...string) (Snapshot, error)
Override cascade (3-layer for tenant settings):
initDefaults(defs)→ populate from KeyDef.DefaultValueapplyOverrides(effective, entries, source)→ overlay store entries- Tenant path:
registry defaults → global override → per-tenant override
Escalation
func Escalate(reg Registry, ops []WriteOp) (ApplyBehavior, []string, error)
- Returns strongest
ApplyBehavioracross all ops (byStrength()0-4) - Rejects
ApplyBootstrapOnlyandMutableAtRuntime=falsekeys - Rejects duplicate keys in batch
- Empty batch →
ApplyLiveRead - Returns the list of keys that drove the escalation
Appendix E: Bootstrap Configuration
| File | Purpose |
|---|---|
config.go |
BootstrapConfig, PostgresBootstrapConfig, MongoBootstrapConfig, SecretStoreConfig |
backend.go |
BackendFactory registry, RegisterBackendFactory(), NewBackendFromConfig() → BackendResources{Store, History, ChangeFeed, Closer} |
builtin/backend.go |
init() registers Postgres + MongoDB factories |
env.go |
LoadFromEnv() reads SYSTEMPLANE_* env vars |
classifier.go |
IsBootstrapOnly(def) — `!MutableAtRuntime |
defaults.go |
Default table/collection names |
postgres_identifiers.go |
Regex validation for PG identifiers (^[a-z_][a-z0-9_]{0,62}$) |
Environment variables for standalone systemplane backend:
| Variable | Default | Description |
|---|---|---|
SYSTEMPLANE_BACKEND |
— | postgres or mongodb |
SYSTEMPLANE_POSTGRES_DSN |
— | PostgreSQL DSN (falls back to app's PG DSN) |
SYSTEMPLANE_POSTGRES_SCHEMA |
system |
Schema name |
SYSTEMPLANE_POSTGRES_ENTRIES_TABLE |
runtime_entries |
Entries table |
SYSTEMPLANE_POSTGRES_HISTORY_TABLE |
runtime_history |
History table |
SYSTEMPLANE_POSTGRES_REVISION_TABLE |
runtime_revisions |
Revisions table |
SYSTEMPLANE_POSTGRES_NOTIFY_CHANNEL |
systemplane_changes |
PG LISTEN/NOTIFY channel |
SYSTEMPLANE_SECRET_MASTER_KEY |
— | AES-256-GCM master key (32 bytes, raw or base64) |
SYSTEMPLANE_MONGODB_URI |
— | MongoDB connection URI |
SYSTEMPLANE_MONGODB_DATABASE |
systemplane |
MongoDB database |
SYSTEMPLANE_MONGODB_ENTRIES_COLLECTION |
runtime_entries |
Entries collection |
SYSTEMPLANE_MONGODB_HISTORY_COLLECTION |
runtime_history |
History collection |
SYSTEMPLANE_MONGODB_WATCH_MODE |
change_stream |
change_stream or poll |
SYSTEMPLANE_MONGODB_POLL_INTERVAL_SEC |
5 |
Poll interval (poll mode only) |
Appendix F: Apply Behavior Taxonomy
Every config key MUST be classified with exactly one ApplyBehavior:
| ApplyBehavior | Code Constant | Strength | Runtime Effect | Use When |
|---|---|---|---|---|
| bootstrap-only | domain.ApplyBootstrapOnly |
4 | Immutable after startup. Never changes. | Server listen address, TLS, auth enable, telemetry endpoints |
| bundle-rebuild+worker-reconcile | domain.ApplyBundleRebuildAndReconcile |
3 | Full bundle swap: new infra clients AND worker restart | Worker enable/disable (needs new connections + restart) |
| bundle-rebuild | domain.ApplyBundleRebuild |
2 | Full bundle swap: new PG/Redis/RMQ/S3 clients | Connection strings, pool sizes, credentials |
| worker-reconcile | domain.ApplyWorkerReconcile |
1 | Reconciler restarts affected workers | Worker intervals, scheduler periods |
| live-read | domain.ApplyLiveRead |
0 | Read from snapshot on every request. Zero cost. | Rate limits, timeouts, cache TTLs — anything read per-request |
Strength determines escalation: when a PATCH contains multiple keys with different
behaviors, Escalate() picks the strongest (highest number). If any key is
ApplyBootstrapOnly or MutableAtRuntime=false, the entire write is rejected.
Classification decision tree:
Is this key needed BEFORE the systemplane itself can start?
YES → ApplyBootstrapOnly (server address, auth enable, telemetry)
NO ↓
Can this key be read per-request from a snapshot without side effects?
YES → ApplyLiveRead (rate limits, timeouts, TTLs)
NO ↓
Does changing this key require rebuilding infrastructure clients?
YES → Does it ALSO require restarting background workers?
YES → ApplyBundleRebuildAndReconcile (worker enable + storage changes)
NO → ApplyBundleRebuild (DB connections, pool sizes, credentials)
NO ↓
Does changing this key require restarting background workers?
YES → ApplyWorkerReconcile (worker intervals, scheduler periods)
NO → ApplyLiveRead (safe default for read-only configs)
Appendix G: Screening Methodology
Before implementing, screen the target service to build the key inventory:
G.1 Identify All Configuration Sources
# In the target repo:
# 1. Find the Config struct
grep -rn 'type Config struct' internal/ --include='*.go'
# 2. Find all envDefault tags
grep -rn 'envDefault:' internal/ --include='*.go' | sort
# 3. Find env var reads outside Config struct
grep -rn 'os.Getenv\|viper.Get' internal/ --include='*.go' | grep -v _test.go
# 4. Find .env files
find . -name '.env*' -o -name '*.yaml.example' | head -20
# 5. Find file-based config loading
grep -rn 'viper\.\|yaml.Unmarshal\|json.Unmarshal.*config' internal/ --include='*.go'
G.2 Classify Infrastructure vs Application Config
Infrastructure (stays as env-var with defaults in code):
- Database connection strings (PG host, port, user, password) →
Component: "postgres" - Redis connection (host, master name, password) →
Component: "redis" - RabbitMQ connection (URI, host, port, user, password) →
Component: "rabbitmq" - Object storage (endpoint, bucket, credentials) →
Component: "s3" - These become
ApplyBundleRebuildkeys in systemplane
Bootstrap-Only (env-var, immutable after startup):
- Server listen address and port →
Component: ComponentNone(not mutable) - TLS certificate paths
- Auth enable/disable
- Auth service address
- Telemetry endpoints (OTEL collector)
- These become
ApplyBootstrapOnlykeys
Application Runtime (hot-reloadable):
- Rate limits →
Component: ComponentNone+ApplyLiveRead - Worker intervals and enable/disable flags →
Component: ComponentNone+ApplyWorkerReconcile - Timeouts (webhook, health check) →
Component: ComponentNone+ApplyLiveRead - Feature flags →
Component: ComponentNone+ApplyLiveRead - Cache TTLs →
Component: ComponentNone+ApplyLiveRead
G.3 Identify Runtime Dependencies (Bundle Candidates)
List all infrastructure clients that the service creates at startup:
# Find client constructors
grep -rn 'libPostgres.New\|libRedis.New\|libRabbitmq.New\|storage.New' internal/ --include='*.go'
# Find connection pools
grep -rn 'sql.Open\|pgx\|redis.New\|amqp.Dial' internal/ --include='*.go'
Each of these becomes a field in the InfraBundle and a component name in
keyComponentMap.
G.4 Identify Background Workers
# Find worker patterns
grep -rn 'ticker\|time.NewTicker\|cron\|worker\|scheduler' internal/ --include='*.go' | grep -v _test.go
Each worker with configurable intervals becomes a WorkerReconciler candidate.
G.5 Generate the Key Inventory
Create a table with columns:
- Key name (dotted:
postgres.primary_host) - Current env var (
POSTGRES_HOST) - Default value
- Type (
string,int,bool,float) - Kind (
configorsetting) - Scope (
globalortenant) - ApplyBehavior
- Component (
postgres,redis,rabbitmq,s3,http,logger,_none) - Secret (yes/no)
- MutableAtRuntime (yes/no)
- Group
- Validator (if any)
Appendix H: Testing Patterns
H.1 Key Registration Tests
func TestRegister{Service}Keys_AllKeysValid(t *testing.T) {
reg := registry.New()
err := Register{Service}Keys(reg)
require.NoError(t, err)
configs := reg.List(domain.KindConfig)
settings := reg.List(domain.KindSetting)
assert.Greater(t, len(configs)+len(settings), 0)
}
func TestRegister{Service}Keys_DefaultsMatchConfig(t *testing.T) {
reg := registry.New()
_ = Register{Service}Keys(reg)
defaults := defaultConfig()
for _, def := range reg.List(domain.KindConfig) {
// Compare def.DefaultValue against defaults struct field
}
}
H.2 Bundle Factory Tests
func TestBundleFactory_Build_Success(t *testing.T) {
snap := buildTestSnapshot(t)
factory := New{Service}BundleFactory(testBootstrapConfig())
bundle, err := factory.Build(context.Background(), snap)
require.NoError(t, err)
defer bundle.Close(context.Background())
b := bundle.(*{Service}Bundle)
assert.NotNil(t, b.Infra.Postgres)
assert.NotNil(t, b.Infra.Redis)
}
func TestBundleFactory_BuildIncremental_ReusesUnchangedComponents(t *testing.T) {
snap1 := buildTestSnapshot(t)
snap2 := modifySnapshot(snap1, "rate_limit.max", 500) // only changes _none component
factory := New{Service}BundleFactory(testBootstrapConfig())
ctx := context.Background()
prev, _ := factory.Build(ctx, snap1)
candidate, _ := factory.BuildIncremental(ctx, snap2, prev, snap1)
b := candidate.(*{Service}Bundle)
// Postgres should be reused (not owned by candidate)
assert.False(t, b.ownsPostgres)
}
H.3 Reconciler Tests
func TestWorkerReconciler_AppliesConfig(t *testing.T) {
wm := NewTestWorkerManager()
reconciler := NewWorkerReconciler(wm)
snap := buildModifiedSnapshot(t, "export_worker.enabled", true)
err := reconciler.Reconcile(context.Background(), nil, nil, snap)
require.NoError(t, err)
assert.True(t, wm.LastAppliedConfig().ExportWorkerEnabled)
}
H.4 Contract Tests (Backend-Agnostic)
Run the storetest and feedtest contract suites against your backend:
func TestPostgresStore_ContractSuite(t *testing.T) {
store, history := setupPostgresStore(t) // testcontainers
storetest.RunAll(t, /* factory that returns store+history */)
}
func TestPostgresChangeFeed_ContractSuite(t *testing.T) {
feed := setupPostgresFeed(t) // testcontainers
feedtest.RunAll(t, /* factory that returns store+feed */)
}
Store contract tests (15):
- GetEmptyTarget, PutSingleOp, PutBatch, OptimisticConcurrency, ResetOp
- NilValueOp, EmptyBatchIsNoOp, TypedNilValueOp, PutPreservesOtherKeys
- RevisionMonotonicallyIncreasing, ConcurrentPuts
- HistoryRecording, BatchHistoryConsistency, HistoryFiltering, HistoryPagination
Feed contract tests (3):
- SubscribeReceivesSignal, ContextCancellationStops, MultipleSignals
H.5 ConfigWriteValidator Tests
func TestConfigWriteValidator_RejectsInvalidConfig(t *testing.T) {
// Test that production guards prevent persisting invalid config
snap := buildSnapshot(t)
snap.Configs["rate_limit.enabled"] = ev(false) // can't disable in production
err := validator(context.Background(), snap)
assert.Error(t, err)
}
Appendix I: Operational Guide
I.1 For Operators: What Changes
| Before | After |
|---|---|
Edit .env + restart |
PATCH /v1/system/configs (no restart) |
| Edit YAML + wait for fsnotify | PATCH /v1/system/configs (instant) |
| No audit trail | GET /v1/system/configs/history |
| No schema discovery | GET /v1/system/configs/schema |
| No concurrency protection | If-Match / ETag headers |
| Manual rollback | Change feed propagates across replicas |
| Full restart for any change | Only ApplyBootstrapOnly keys need restart |
I.2 Bootstrap-Only Keys (Require Restart)
Document in config/.config-map.example:
# {Service} — Bootstrap-Only Configuration (requires restart)
#
# These are the ONLY settings that require a container/pod restart.
# Everything else is hot-reloadable via:
#
# GET /v1/system/configs — view current runtime config
# PATCH /v1/system/configs — change any runtime-managed key
# GET /v1/system/configs/schema — see all keys, types, and mutability
# GET /v1/system/configs/history — audit trail of changes
ENV_NAME=development
SERVER_ADDRESS=:8080
AUTH_ENABLED=false
ENABLE_TELEMETRY=false
# ... etc
I.3 Docker Compose (Zero-Config)
services:
myservice:
build: .
ports:
- "${SERVER_PORT:-8080}:8080"
environment:
- POSTGRES_HOST=${POSTGRES_HOST:-postgres}
- POSTGRES_PORT=${POSTGRES_PORT:-5432}
- REDIS_HOST=${REDIS_HOST:-redis}
# NO env_file directive — defaults baked into binary
depends_on:
postgres:
condition: service_healthy
I.4 Systemplane Backend Config
By default, the systemplane reuses the application's primary PostgreSQL connection.
Override with SYSTEMPLANE_* env vars for a separate backend:
# Use app's Postgres (default — no extra config needed)
# The init function builds the DSN from the app's POSTGRES_* env vars
# Or use a dedicated backend:
SYSTEMPLANE_BACKEND=postgres
SYSTEMPLANE_POSTGRES_DSN=postgres://user:pass@host:5432/systemplane?sslmode=require
# Secret encryption (REQUIRED — rejected in production without it):
SYSTEMPLANE_SECRET_MASTER_KEY=<32-byte key, raw or base64>
I.5 Graceful Degradation
If the systemplane fails to initialize, the service continues without it:
- Config values from env vars still work
- No runtime mutation API available
- No hot-reload capability
- Workers run with static config
This is by design — the service never fails to start due to systemplane issues.
I.6 HTTP API Endpoints
After migration, the service exposes these endpoints:
| Method | Path | Description | Auth Permission |
|---|---|---|---|
GET |
/v1/system/configs |
View all resolved config values | system/configs:read |
PATCH |
/v1/system/configs |
Update config values (with If-Match for concurrency) |
system/configs:write |
GET |
/v1/system/configs/schema |
View all key definitions (types, defaults, mutability) | system/configs/schema:read |
GET |
/v1/system/configs/history |
Audit trail of config changes | system/configs/history:read |
POST |
/v1/system/configs/reload |
Force a full reload | system/configs/reload:write |
GET |
/v1/system/settings |
View resolved settings (?scope=global|tenant) |
system/settings:read |
PATCH |
/v1/system/settings |
Update settings | system/settings:write |
GET |
/v1/system/settings/schema |
View setting key definitions | system/settings/schema:read |
GET |
/v1/system/settings/history |
Settings audit trail | scope-dependent |
Settings routes have an extra settingsScopeAuthorization middleware that
elevates to system/settings/global:{action} when ?scope=global is queried.
PATCH Request Format:
PATCH /v1/system/configs
If-Match: "42"
Content-Type: application/json
{
"values": {
"rate_limit.max": 200,
"rate_limit.expiry_sec": 120,
"export_worker.enabled": false
}
}
PATCH Response Format:
HTTP/1.1 200 OK
ETag: "43"
{
"revision": 43
}
Schema Response Format:
{
"keys": [
{
"key": "rate_limit.max",
"kind": "config",
"valueType": "int",
"defaultValue": 100,
"mutableAtRuntime": true,
"applyBehavior": "live-read",
"secret": false,
"description": "Maximum requests per window",
"group": "rate_limit",
"allowedScopes": ["global"]
}
]
}
HTTP Error Mapping:
| Domain Error | HTTP Status | Code |
|---|---|---|
ErrKeyUnknown |
400 | system_key_unknown |
ErrValueInvalid |
400 | system_value_invalid |
ErrKeyNotMutable |
400 | system_key_not_mutable |
ErrScopeInvalid |
400 | system_scope_invalid |
ErrRevisionMismatch |
409 | system_revision_mismatch |
ErrPermissionDenied |
403 | system_permission_denied |
ErrReloadFailed |
500 | system_reload_failed |
ErrSupervisorStopped |
503 | system_unavailable |
I.7 Adapters Reference
| Adapter | Location | Key Feature |
|---|---|---|
| PostgreSQL Store | adapters/store/postgres/ |
3 tables + indexes, optimistic concurrency, pg_notify, integer-preserving JSON decode |
| MongoDB Store | adapters/store/mongodb/ |
Sentinel revision doc (__revision_meta__), multi-doc transactions, requires replica set |
| Secret Codec | adapters/store/secretcodec/ |
AES-256-GCM with random nonce, AAD = "kind|scope|subject|key", envelope format {__systemplane_secret_v: 1, alg, nonce, ciphertext} |
| PostgreSQL ChangeFeed | adapters/changefeed/postgres/ |
LISTEN/NOTIFY, exponential backoff reconnect, revision resync on reconnect |
| MongoDB ChangeFeed | adapters/changefeed/mongodb/ |
Change stream or poll mode, auto-reconnect, revision-jump escalation |
| DebouncedFeed | adapters/changefeed/debounce.go |
Per-target trailing-edge debounce (default 100ms + 50ms jitter), escalates to strongest ApplyBehavior in window |
| SafeInvokeHandler | adapters/changefeed/safe_handler.go |
Catches handler panics, returns ErrHandlerPanic |
| Fiber HTTP | adapters/http/fiber/ |
9 endpoints, DTOs, If-Match/ETag concurrency, middleware, domain→HTTP error mapping |
| Store Contract Tests | adapters/store/storetest/ |
15 backend-agnostic tests: CRUD, concurrency, reset, history, pagination |
| Feed Contract Tests | adapters/changefeed/feedtest/ |
3 backend-agnostic tests: signal receipt, cancellation, multiple signals |
PostgreSQL Store DDL (created automatically):
runtime_entries: PK (kind,scope,subject,key), JSONBvalue, BIGINTrevisionruntime_history: BIGSERIALid,old_value/new_valueJSONB,actor_id,changed_atruntime_revisions: PK (kind,scope,subject),revisioncounter,apply_behavior
I.8 Test Utilities
| Fake | Implements | Key Features |
|---|---|---|
FakeStore |
ports.Store |
In-memory, optimistic concurrency, Seed() for pre-population |
FakeHistoryStore |
ports.HistoryStore |
In-memory, newest-first, Append()/AppendForKind() |
FakeBundle / FakeBundleFactory |
RuntimeBundle / BundleFactory |
Tracks Close state, SetError(), CallCount() |
FakeReconciler |
BundleReconciler |
Configurable phase, records all ReconcileCalls, SetError() |
FakeIncrementalBundleFactory |
IncrementalBundleFactory |
Embeds FakeBundleFactory + IncrementalBuildFunc, IncrementalCallCount() |
Appendix J: Matcher Service Reference
The Matcher service registers ~130 keys across 20 groups, split into 9 focused sub-files. Here's the breakdown by group and ApplyBehavior:
| Group | Keys | BootstrapOnly | BundleRebuild | LiveRead | WorkerReconcile | Rebuild+Reconcile |
|---|---|---|---|---|---|---|
app |
2 | 1 | 1 | - | - | - |
server |
8 | 4 | 4 | - | - | - |
tenancy |
11 | - | 11 | - | - | - |
postgres |
19 | - | 18 | 1 | - | - |
redis |
12 | - | 12 | - | - | - |
rabbitmq |
8 | - | 8 | - | - | - |
auth |
3 | 3 | - | - | - | - |
swagger |
3 | - | 3 | - | - | - |
telemetry |
7 | 7 | - | - | - | - |
rate_limit |
7 | - | - | 7 | - | - |
infrastructure |
2 | - | 1 | 1 | - | - |
idempotency |
3 | 1 | 2 | - | - | - |
callback_rate_limit |
1 | - | - | 1 | - | - |
fetcher |
9 | - | 4 | - | 3 | 2 |
deduplication |
1 | - | - | 1 | - | - |
object_storage |
6 | - | 6 | - | - | - |
export_worker |
4 | - | - | 1 | 1 | 2 |
webhook |
1 | - | - | 1 | - | - |
cleanup_worker |
4 | - | - | - | 2 | 2 |
scheduler |
1 | - | - | - | 1 | - |
archival |
12 | - | - | 1 | 3 | 8 |
Secrets: ~9 keys with RedactFull policy (passwords, tokens, certificates, access keys).
Components referenced: postgres, redis, rabbitmq, s3, http, logger, _none.
Key Sub-File Organization (Reference)
| Sub-File | Groups Covered | Key Count |
|---|---|---|
systemplane_keys_app_server.go |
app, server | ~11 |
systemplane_keys_tenancy.go |
tenancy | ~11 |
systemplane_keys_postgres.go |
postgres | ~19 |
systemplane_keys_messaging.go |
redis, rabbitmq | ~20 |
systemplane_keys_runtime_http.go |
auth, swagger, telemetry, rate_limit | ~17 |
systemplane_keys_runtime_services.go |
infrastructure, idempotency, callback_rate_limit, fetcher | ~14 |
systemplane_keys_storage_export.go |
deduplication, object_storage, export_worker | ~11 |
systemplane_keys_workers.go |
webhook, cleanup_worker | ~5 |
systemplane_keys_archival.go |
scheduler, archival | ~12 |
Appendix K: Quick Reference Commands
# LOCAL DEV ONLY — requires AUTH_ENABLED=false
# View current runtime config
curl -s http://localhost:4018/v1/system/configs | jq
# View schema (all keys, types, mutability)
curl -s http://localhost:4018/v1/system/configs/schema | jq
# Change a runtime key
curl -X PATCH http://localhost:4018/v1/system/configs \
-H 'Content-Type: application/json' \
-H 'If-Match: "current-revision"' \
-d '{"values": {"rate_limit.max": 200}}'
# View change history
curl -s http://localhost:4018/v1/system/configs/history | jq
# Force full reload
curl -X POST http://localhost:4018/v1/system/configs/reload
# View settings
curl -s http://localhost:4018/v1/system/settings?scope=global | jq
# Change a setting
curl -X PATCH http://localhost:4018/v1/system/settings \
-H 'Content-Type: application/json' \
-H 'If-Match: "current-revision"' \
-d '{"scope": "global", "values": {"feature.enabled": true}}'
Appendix L: Files to Create per Service
Core Systemplane Wiring
| File | Purpose | Matcher Reference |
|---|---|---|
systemplane_init.go |
Init function with 11-step boot + change feed start | internal/bootstrap/systemplane_init.go |
systemplane_mount.go |
HTTP route registration + swagger merge | internal/bootstrap/systemplane_mount.go |
Key Definitions (split by group)
| File | Purpose | Matcher Reference |
|---|---|---|
systemplane_keys.go |
Orchestrator: Register{Service}Keys() + {service}KeyDefs() |
internal/bootstrap/systemplane_keys.go |
systemplane_keys_{group}.go |
Per-group key definitions | internal/bootstrap/systemplane_keys_{group}.go |
systemplane_keys_validation.go |
Validator functions used by KeyDef.Validator | internal/bootstrap/systemplane_keys_validation.go |
systemplane_keys_helpers.go |
concatKeyDefs() utility |
internal/bootstrap/systemplane_keys_helpers.go |
Bundle + Factory
| File | Purpose | Matcher Reference |
|---|---|---|
systemplane_bundle.go |
Bundle struct + Close + AdoptResourcesFrom (ownership tracking) | internal/bootstrap/systemplane_bundle.go |
systemplane_factory.go |
BundleFactory + IncrementalBundleFactory (full + incremental) | internal/bootstrap/systemplane_factory.go |
systemplane_factory_infra.go |
Per-component builders: buildPostgres, buildRedis, buildRabbitMQ, buildS3 | internal/bootstrap/systemplane_factory_infra.go |
Reconcilers
| File | Purpose | Matcher Reference |
|---|---|---|
systemplane_reconciler_http.go |
HTTP policy validation (PhaseValidation) | internal/bootstrap/systemplane_reconciler_http.go |
systemplane_reconciler_publishers.go |
RabbitMQ publisher staging (PhaseValidation) | internal/bootstrap/systemplane_reconciler_publishers.go |
systemplane_reconciler_worker.go |
Worker restart (PhaseSideEffect) | internal/bootstrap/systemplane_reconciler_worker.go |
Identity + Authorization
| File | Purpose | Matcher Reference |
|---|---|---|
systemplane_identity.go |
JWT → Actor bridge | internal/bootstrap/systemplane_identity.go |
systemplane_authorizer.go |
Permission mapping | internal/bootstrap/systemplane_authorizer.go |
Config Manager Integration
| File | Purpose | Matcher Reference |
|---|---|---|
config_manager_systemplane.go |
Snapshot → Config hydration | internal/bootstrap/config_manager_systemplane.go |
config_manager_seed.go |
Env → Store one-time seed | internal/bootstrap/config_manager_seed.go |
config_manager_helpers.go |
Type-safe value comparison | internal/bootstrap/config_manager_helpers.go |
config_validation.go |
Production config guards | internal/bootstrap/config_validation.go |
Runtime Integration
| File | Purpose | Matcher Reference |
|---|---|---|
active_bundle_state.go |
Thread-safe live-read accessor for current bundle | internal/bootstrap/active_bundle_state.go |
config/.config-map.example |
Bootstrap-only key reference (operators) | config/.config-map.example |
Files to Delete
| File | Reason |
|---|---|
config/.env.example |
Replaced by code defaults + .config-map.example |
config/*.yaml.example |
No more YAML config |
| Config API handlers (old) | Replaced by systemplane HTTP adapter |
| Config file watcher | Replaced by change feed |
| Config audit publisher (old) | Replaced by systemplane history |
| Config YAML loader | No more YAML |
Files to Modify
| File | Change |
|---|---|
config_loading.go |
Remove YAML loading, keep env-only |
config_defaults.go |
Derive from {service}KeyDefs() via configFromSnapshot(defaultSnapshotFromKeyDefs(...)) |
config_manager.go |
Add UpdateFromSystemplane(), enterSeedMode(), atomic.Pointer[Config] for lock-free reads |
init.go |
Wire systemplane init after workers, before HTTP mount; add reload observer callback |
service.go |
Add systemplane shutdown sequence (5-step) |
docker-compose.yml |
Remove env_file, inline defaults with ${VAR:-default} |
Makefile |
Remove set-env, clear-envs targets |
Appendix M: Canonical Key Catalog
The canonical key catalog is defined in lib-commons/commons/systemplane/catalog/. Products MUST match these names, tiers, and components for shared infrastructure keys.
Naming Conventions
| Convention | Rule | Example |
|---|---|---|
| SSL mode | ssl_mode (with underscore) |
postgres.primary_ssl_mode |
| Connection count | Plural conns |
postgres.max_open_conns, redis.min_idle_conns |
| CORS | cors.* namespace (NOT server.cors_*) |
cors.allowed_origins |
| RabbitMQ connection | rabbitmq.url (NOT uri) |
rabbitmq.url |
| Timeout unit suffix | MANDATORY _ms or _sec |
redis.read_timeout_ms, rate_limit.expiry_sec |
| Size unit suffix | MANDATORY _bytes |
server.body_limit_bytes |
Tier Classification Standard
| Config Category | Canonical Tier | Rationale |
|---|---|---|
PG pool tuning (max_open_conns, etc.) |
LiveRead | Go's database/sql supports SetMaxOpenConns() at runtime |
| CORS settings | LiveRead | Middleware reads from snapshot per-request |
| Log level | LiveRead | Use zap.AtomicLevel.SetLevel() |
| Migration path | BootstrapOnly | Migrations only run at startup |
| Body limit | BootstrapOnly | Set at Fiber server initialization |
| DB connection strings | BundleRebuild | Requires new connection pool |
| Redis connection | BundleRebuild | Requires new Redis client |
| Worker enable/disable | BundleRebuildAndReconcile | Needs new connections + worker restart |
| Worker intervals | WorkerReconcile | Needs worker restart only |
| Rate limits, timeouts, TTLs | LiveRead | Read per-request from snapshot |
Enforcement
Products run catalog.ValidateKeyDefs() in their test suite. Any mismatch is a test failure that blocks CI.
Appendix N: Environment Variable Convention
| Infrastructure | Prefix | Examples |
|---|---|---|
| PostgreSQL | POSTGRES_* |
POSTGRES_HOST, POSTGRES_PORT, POSTGRES_PASSWORD |
| Redis | REDIS_* |
REDIS_HOST, REDIS_PASSWORD, REDIS_DB |
| RabbitMQ | RABBITMQ_* |
RABBITMQ_URL, RABBITMQ_EXCHANGE |
| Auth | PLUGIN_AUTH_* |
PLUGIN_AUTH_ENABLED, PLUGIN_AUTH_ADDRESS |
| Telemetry | OTEL_* / ENABLE_TELEMETRY |
OTEL_RESOURCE_SERVICE_NAME |
| Server | SERVER_* |
SERVER_ADDRESS, SERVER_TLS_CERT_FILE |
PROHIBITED: DB_HOST, DB_PORT, DB_USER, DB_PASSWORD, DB_NAME — use POSTGRES_* prefix.
PROHIBITED: AUTH_ENABLED without PLUGIN_ prefix — use PLUGIN_AUTH_ENABLED.
Appendix O: Unit Suffix Standard
All config keys with dimensional values MUST include a unit suffix:
| Dimension | Suffix | Examples |
|---|---|---|
| Time (milliseconds) | _ms |
redis.read_timeout_ms, rabbitmq.publish_timeout_ms |
| Time (seconds) | _sec |
rate_limit.expiry_sec, auth.cache_ttl_sec |
| Time (minutes) | _mins |
postgres.conn_max_lifetime_mins |
| Size (bytes) | _bytes |
server.body_limit_bytes |
| Count | Plural noun | postgres.max_open_conns, redis.min_idle_conns |
PROHIBITED: Dimensionless timeout keys like webhook.timeout — must be webhook.timeout_ms or webhook.timeout_sec.
PROHIBITED: Mixed units — if one timeout in a group uses _ms, all timeouts in that group should use _ms.