ring/dev-team/skills/dev-systemplane-migration/SKILL.md
Fred Amaral 9cb5a72737
fix(codereview): align reviewer references and harden lib-commons/multi-tenant agents
Propagates the 10-reviewer peer list across agent frontmatter, Position/Critical prose, shared-patterns, skill dispatchers, gate validators, and docs — resolving drift left behind when multi-tenant-reviewer and lib-commons-reviewer were added to the pool. Also fixes broken shared-pattern paths in lib-commons-reviewer and adds substantive blocker criteria to multi-tenant-reviewer plus codebase-context severity heuristic (Lerian third-rail vs external recommendation) to lib-commons-reviewer.

X-Lerian-Ref: 0x1
2026-04-18 20:18:16 -03:00

---
name: ring:dev-systemplane-migration
description: >
  Gate-based systemplane migration orchestrator. Migrates Lerian Go services from
  .env/YAML-based configuration to the systemplane — a database-backed, hot-reloadable
  runtime configuration and settings management plane with full audit history, optimistic
  concurrency, change feeds, component-granular bundle rebuilds, and atomic infrastructure
  replacement. Requires lib-commons v4.3.0+.
trigger: |
  - User requests systemplane migration or adoption
  - Task mentions runtime configuration, hot-reload, or config management
  - Service needs database-backed configuration with audit trail
  - BundleFactory or Reconciler development is required
skip_when: |
  - Project is not a Go service
  - Service does not use lib-commons v4
  - Task is unrelated to configuration management or systemplane
NOT_skip_when: |
  - Service already has systemplane code (verify compliance, do not skip)
  - "It looks like systemplane is already set up" (existence ≠ compliance)
prerequisites: |
  - Go project
  - lib-commons/v4 dependency (v4.3.0+ required; upgrade first if older)
  - PostgreSQL or MongoDB backend available
sequence:
  after: ["ring:dev-cycle"]
input_schema:
  type: object
  properties:
    execution_mode:
      type: string
      enum: ["FULL", "SCOPED"]
      description: "FULL = fresh migration. SCOPED = compliance audit of existing."
    detected_backend:
      type: string
      enum: ["postgres", "mongodb"]
    detected_dependencies:
      type: array
      items: { type: string }
      description: "Infrastructure detected: postgres, mongodb, redis, rabbitmq, s3"
    service_has_workers:
      type: boolean
    existing_systemplane:
      type: boolean
output_schema:
  format: markdown
  required_sections:
    - name: "Migration Summary"
      pattern: "^## Migration Summary"
      required: true
    - name: "Gates Completed"
      pattern: "^## Gates Completed"
      required: true
    - name: "Compliance Status"
      pattern: "^## Compliance Status"
      required: true
    - name: "Files Created"
      pattern: "^## Files Created"
      required: true
---
# Systemplane Migration Orchestrator
<cannot_skip>
## CRITICAL: This Skill ORCHESTRATES. Agents IMPLEMENT.
| Who | Responsibility |
|-----|----------------|
| **This Skill** | Detect stack, determine gates, pass context to agent, verify outputs, enforce order |
| **ring:backend-engineer-golang** | Implement systemplane code following the patterns in this document |
| **ring:codebase-explorer** | Analyze the codebase for configuration patterns (Gate 1) |
| **ring:visualize** | Generate implementation preview HTML (Gate 1.5) |
| **10 reviewers** | Review at Gate 8 |
**CANNOT change scope:** the skill defines WHAT to implement. The agent implements HOW.
**FORBIDDEN: Orchestrator MUST NOT use Edit, Write, or Bash tools to modify source code files.**
All code changes MUST go through `Task(subagent_type="ring:backend-engineer-golang")`.
The orchestrator only verifies outputs (grep, go build, go test) — MUST NOT write implementation code.
</cannot_skip>
---
## Architecture Overview
The systemplane replaces traditional env-var-only / YAML-based configuration with a
three-tier architecture:
```
┌─ TIER 1: Bootstrap-Only ──────────────────────────────────────────┐
│ Env vars read ONCE at startup. Immutable for process lifetime. │
│ Examples: SERVER_ADDRESS, AUTH_ENABLED, OTEL_* (telemetry) │
│ Stored in: BootstrapOnlyConfig struct (frozen at init) │
└───────────────────────────────────────────────────────────────────┘
┌─ TIER 2: Runtime-Managed (Hot-Reload) ────────────────────────────┐
│ Stored in database (PostgreSQL or MongoDB). │
│ Changed via PATCH /v1/system/configs or /v1/system/settings. │
│ Propagation: ChangeFeed → Supervisor → Snapshot → Bundle/Reconcile│
│ Examples: rate limits, worker intervals, DB pool sizes, CORS │
└───────────────────────────────────────────────────────────────────┘
┌─ TIER 3: Live-Read (Zero-Cost Per-Request) ───────────────────────┐
│ Read directly from Supervisor.Snapshot() on every request. │
│ No rebuild, no reconciler, no locking. │
│ Examples: rate_limit.max, health_check_timeout_sec │
└───────────────────────────────────────────────────────────────────┘
```
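The Tier 3 live-read path can be illustrated with a minimal standard-library sketch. The `Snapshot` and `liveConfig` types below are hypothetical stand-ins, not the lib-commons `Supervisor` API; the sketch only shows why a per-request read can be lock-free when writers publish an immutable snapshot by swapping a pointer.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Snapshot is a hypothetical immutable view of runtime values.
// It is never mutated after being published.
type Snapshot struct {
	values map[string]int
}

// Int returns the value for key, or fallback when the snapshot
// is nil or the key is absent.
func (s *Snapshot) Int(key string, fallback int) int {
	if s == nil {
		return fallback
	}
	if v, ok := s.values[key]; ok {
		return v
	}
	return fallback
}

// liveConfig holds the current snapshot. Writers swap the pointer,
// readers load it; neither side takes a lock.
type liveConfig struct {
	current atomic.Pointer[Snapshot]
}

// Snapshot returns the most recently published snapshot (may be nil).
func (c *liveConfig) Snapshot() *Snapshot { return c.current.Load() }

// Swap publishes a new snapshot for all subsequent readers.
func (c *liveConfig) Swap(next *Snapshot) { c.current.Store(next) }

func main() {
	cfg := &liveConfig{}
	cfg.Swap(&Snapshot{values: map[string]int{"rate_limit.max": 100}})
	fmt.Println("before change:", cfg.Snapshot().Int("rate_limit.max", 10))

	// A later reload publishes a fresh snapshot; readers see it on their next call.
	cfg.Swap(&Snapshot{values: map[string]int{"rate_limit.max": 250}})
	fmt.Println("after change:", cfg.Snapshot().Int("rate_limit.max", 10))
}
```

A request handler in this sketch would call `cfg.Snapshot().Int(...)` on every request instead of caching the value, which is what makes the key take effect without a rebuild.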
### Data Flow: Startup → Hot-Reload → Shutdown
```
STARTUP:
  ENV VARS → defaultConfig() → loadConfigFromEnv() → *Config → ConfigManager
  InitSystemplane():
    1. ExtractBootstrapOnlyConfig(cfg) → BootstrapOnlyConfig (immutable)
    2. LoadSystemplaneBackendConfig(cfg) → BootstrapConfig (PG/Mongo DSN)
    3. builtin.NewBackendFromConfig(ctx, cfg) → Store + History + ChangeFeed
    4. registry.New() + Register{Service}Keys() → Registry (100+ key definitions)
    5. configureBackendWithRegistry() → pass secret keys + apply behaviors
    6. NewSnapshotBuilder(registry, store) → SnapshotBuilder
    7. New{Service}BundleFactory(bootstrapCfg) → BundleFactory (full + incremental)
    8. seedStoreForInitialReload() → Env overrides → Store
    9. buildReconcilers() → [HTTP, Publisher, Worker] (phased)
    10. NewSupervisor → Reload("initial-bootstrap") → First bundle
    11. NewManager(ManagerConfig{ → HTTP API handler
          ConfigWriteValidator: productionGuards,
          StateSync: configManagerSync,
        })
  StartChangeFeed() → DebouncedFeed(200ms) → goroutine: subscribe(store changes)
  MountSystemplaneAPI() → 9 endpoints on /v1/system/*

HOT-RELOAD (on API PATCH or ChangeFeed signal):
  Signal → Supervisor.Reload(ctx, reason) →
    1. SnapshotBuilder.BuildFull() → new Snapshot (defaults + store overrides)
    2. BundleFactory.BuildIncremental(snap, prev, prevSnap) →
       a) Diff changed keys via keyComponentMap (postgres/redis/rabbitmq/s3/http/logger)
       b) Rebuild ONLY changed components
       c) Reuse unchanged components from previous bundle (pointer transfer)
       d) Falls back to full Build() when all components are affected
    3. Reconcilers run IN PHASE ORDER:
       a) PhaseStateSync: (reserved — no current reconcilers)
       b) PhaseValidation: HTTPPolicy, Publisher
       c) PhaseSideEffect: Worker → WorkerManager.ApplyConfig()
    4. On success: atomic swap — snapshot.Store() + bundle.Store()
    5. AdoptResourcesFrom(previous) → nil-out transferred pointers
    6. Observer callback → bundleState.Update() + ConfigManager.UpdateFromSystemplane()
       + SwappableLogger.Swap() + SwapRuntimePublishers()
    7. previous.Close() → only tears down REPLACED components

SHUTDOWN:
  1. ConfigManager.Stop() (prevent mutations)
  2. cancelChangeFeed() (stop reload triggers)
  3. Supervisor.Stop() (stop supervisory loop + close bundle)
  4. Backend.Close() (close store connection)
  5. WorkerManager.Stop() (stop all workers)
```
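Steps 3 and 4 of the hot-reload flow above (reconcilers in phase order, then a swap only on success) can be sketched as follows. This is an illustration under stated assumptions: `Reconciler`, `Runtime`, and the string-labelled "bundle" are hypothetical simplifications, not the lib-commons types, and only the phase names are taken from the flow.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Phase orders reconcilers; lower values run first.
type Phase int

const (
	PhaseStateSync Phase = iota
	PhaseValidation
	PhaseSideEffect
)

// errRejected is a sample error a validation-phase reconciler might return.
var errRejected = errors.New("candidate rejected")

// Reconciler is a hypothetical stand-in for a phased reconciler.
type Reconciler struct {
	Name  string
	Phase Phase
	Apply func(candidate string) error
}

// Runtime holds the active bundle, simplified here to a string label.
type Runtime struct {
	active      string
	reconcilers []Reconciler
}

// Reload runs every reconciler in phase order against the candidate and
// swaps the active bundle only if all of them succeed.
func (r *Runtime) Reload(candidate string) error {
	ordered := append([]Reconciler(nil), r.reconcilers...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Phase < ordered[j].Phase
	})
	for _, rec := range ordered {
		if err := rec.Apply(candidate); err != nil {
			// The previous bundle stays active; nothing was swapped.
			return fmt.Errorf("reconciler %s: %w", rec.Name, err)
		}
	}
	r.active = candidate
	return nil
}

func main() {
	rt := &Runtime{active: "bundle-v1", reconcilers: []Reconciler{
		{Name: "worker", Phase: PhaseSideEffect, Apply: func(string) error { return nil }},
		{Name: "http-policy", Phase: PhaseValidation, Apply: func(c string) error {
			if c == "bundle-bad" {
				return errRejected
			}
			return nil
		}},
	}}
	fmt.Println(rt.Reload("bundle-v2"), rt.active)
	fmt.Println(rt.Reload("bundle-bad"), rt.active)
}
```

Running validation before side effects means a rejected candidate never reaches the worker reconciler, so a failed reload leaves both the bundle and the workers on the previous configuration.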
### The Three Configuration Authorities
| Phase | Authority | Scope | Mutability |
|-------|-----------|-------|------------|
| **Bootstrap** | Env vars → `defaultConfig()` + `loadConfigFromEnv()` | Server address, TLS, auth, telemetry | Immutable after startup |
| **Runtime** | Systemplane Store + Supervisor | Rate limits, workers, timeouts, DB pools, CORS | Hot-reloadable via API |
| **Legacy bridge** | `ConfigManager.Get()` | Backward-compat for existing code | Updated by StateSync callback + observer |
**Single source of truth**: `{service}KeyDefs()` is THE canonical source of all
default values. The `defaultConfig()` function derives its values from KeyDefs
via `defaultSnapshotFromKeyDefs()` → `configFromSnapshot()`. No manual sync
required between defaults, key definitions, or struct tags.
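A minimal sketch of this derivation chain, assuming a trimmed-down `KeyDef` with only `Key` and `DefaultValue` and a two-field `Config`. The function names follow the text, but the bodies and the sample keys are illustrative assumptions rather than the service's actual code.

```go
package main

import "fmt"

// KeyDef is a hypothetical, trimmed-down key definition.
type KeyDef struct {
	Key          string
	DefaultValue any
}

// Config is a hypothetical legacy config struct read by existing code.
type Config struct {
	PostgresHost string
	RateLimitMax int
}

// keyDefs is the only place where defaults are written down.
func keyDefs() []KeyDef {
	return []KeyDef{
		{Key: "postgres.primary_host", DefaultValue: "localhost"},
		{Key: "rate_limit.max", DefaultValue: 100},
	}
}

// defaultSnapshotFromKeyDefs collects every DefaultValue into a key-to-value map.
func defaultSnapshotFromKeyDefs(defs []KeyDef) map[string]any {
	snap := make(map[string]any, len(defs))
	for _, d := range defs {
		snap[d.Key] = d.DefaultValue
	}
	return snap
}

// configFromSnapshot hydrates Config from a snapshot, leaving zero values
// for keys that are missing or have an unexpected type.
func configFromSnapshot(snap map[string]any) Config {
	cfg := Config{}
	if v, ok := snap["postgres.primary_host"].(string); ok {
		cfg.PostgresHost = v
	}
	if v, ok := snap["rate_limit.max"].(int); ok {
		cfg.RateLimitMax = v
	}
	return cfg
}

// defaultConfig has no literals of its own; it is derived from keyDefs.
func defaultConfig() Config {
	return configFromSnapshot(defaultSnapshotFromKeyDefs(keyDefs()))
}

func main() {
	fmt.Printf("%+v\n", defaultConfig())
}
```

Because `configFromSnapshot` is the same function used for runtime snapshots in this sketch, changing a default in `keyDefs()` changes both the bootstrap default and the fallback value seen after a store override is removed.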
**Component-granular awareness**: Every key's `Component` field (e.g., `"postgres"`,
`"redis"`, `"rabbitmq"`, `"s3"`, `"http"`, `"logger"`, or `ComponentNone`) enables
the `IncrementalBundleFactory` to rebuild only the affected infrastructure component
when that key changes, instead of tearing down and rebuilding everything.
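The component-granular rebuild can be sketched with plain maps. Everything below is a simplified assumption: `keyComponent` is a hypothetical subset of the key-to-component mapping, a "client" is just a string handle, and the real `IncrementalBundleFactory` is not shown.

```go
package main

import (
	"fmt"
	"sort"
)

// keyComponent maps each key to the component it configures
// (hypothetical subset; "_none" means no infrastructure is involved).
var keyComponent = map[string]string{
	"postgres.primary_host": "postgres",
	"redis.host":            "redis",
	"rate_limit.max":        "_none",
}

// changedComponents returns the sorted set of infrastructure components
// that own at least one key whose value differs between prev and next.
func changedComponents(prev, next map[string]string) []string {
	seen := map[string]bool{}
	for key, component := range keyComponent {
		if component == "_none" {
			continue // live-read keys never trigger a rebuild
		}
		if prev[key] != next[key] {
			seen[component] = true
		}
	}
	out := make([]string, 0, len(seen))
	for c := range seen {
		out = append(out, c)
	}
	sort.Strings(out)
	return out
}

// Bundle maps a component name to its client handle (a label in this sketch).
type Bundle map[string]string

// buildIncremental carries over every handle from prev and calls build
// only for the components listed in changed.
func buildIncremental(prev Bundle, changed []string, build func(component string) string) Bundle {
	next := make(Bundle, len(prev))
	for c, handle := range prev {
		next[c] = handle
	}
	for _, c := range changed {
		next[c] = build(c)
	}
	return next
}

func main() {
	prevSnap := map[string]string{"postgres.primary_host": "db1", "redis.host": "cache1", "rate_limit.max": "100"}
	nextSnap := map[string]string{"postgres.primary_host": "db1", "redis.host": "cache2", "rate_limit.max": "200"}

	changed := changedComponents(prevSnap, nextSnap)
	bundle := buildIncremental(Bundle{"postgres": "pg#1", "redis": "redis#1"}, changed,
		func(c string) string { return c + "#2" })
	fmt.Println(changed, bundle)
}
```

In this example only the Redis handle is rebuilt; the PostgreSQL handle is reused as-is, and the `rate_limit.max` change causes no rebuild at all because its component is `_none`.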
---
## Canonical Import Paths
The systemplane is a shared library in lib-commons (since v4.3.0). All services
import it via:
```go
import (
    "github.com/LerianStudio/lib-commons/v4/commons/systemplane/domain"
    "github.com/LerianStudio/lib-commons/v4/commons/systemplane/ports"
    "github.com/LerianStudio/lib-commons/v4/commons/systemplane/registry"
    "github.com/LerianStudio/lib-commons/v4/commons/systemplane/service"
    "github.com/LerianStudio/lib-commons/v4/commons/systemplane/bootstrap"
    "github.com/LerianStudio/lib-commons/v4/commons/systemplane/bootstrap/builtin"
    "github.com/LerianStudio/lib-commons/v4/commons/systemplane/adapters/changefeed"
    fiberhttp "github.com/LerianStudio/lib-commons/v4/commons/systemplane/adapters/http/fiber"

    // Swagger spec (for consuming app integration)
    systemswagger "github.com/LerianStudio/lib-commons/v4/commons/systemplane/swagger"
)
```
Requires `lib-commons v4.3.0+` in your `go.mod`.
**⛔ HARD GATE:** Agent MUST NOT use v2 or v3 import paths or invent sub-package paths. If any tool truncates output, the import list above is the authoritative reference.
---
## Severity Calibration
| Severity | Criteria | Examples |
|----------|----------|----------|
| **CRITICAL** | Data loss, runtime panic, silent config corruption | Bundle not closed on shutdown (resource leak), snapshot reads nil (panic), ChangeFeed not started (hot-reload broken) |
| **HIGH** | Missing integration, broken hot-reload path | Keys registered but no BundleFactory, no HTTP Mount, no StateSync callback |
| **MEDIUM** | Incomplete but functional | Missing validators on some keys, incomplete secret redaction, no swagger merge |
| **LOW** | Polish and optimization | Missing config-map.example, incomplete descriptions, suboptimal component grouping |
MUST report all severities. CRITICAL: STOP immediately. HIGH: Fix before gate pass. MEDIUM: Fix in iteration. LOW: Document.
---
## Pressure Resistance
| User Says | This Is | Response |
|-----------|---------|----------|
| "Keys are registered, migration is done" | SCOPE_REDUCTION | "Key registration is Gate 2 of 10. Without BundleFactory + Wiring + HTTP Mount, keys are metadata — nothing reads them at runtime. MUST complete ALL gates." |
| "We can wire it later" | SCOPE_REDUCTION | "Gate 7 is where the system comes alive. Without it, systemplane is dead code. Gate 7 is NEVER skippable." |
| "Swagger integration is cosmetic" | QUALITY_BYPASS | "Operators need API discoverability. Without swagger.MergeInto(), 9 endpoints are invisible in Swagger UI. MUST implement." |
| "Config bridge is backward compat only" | SCOPE_REDUCTION | "Existing code reads Config struct. Without StateSync, it reads stale bootstrap values forever. MUST implement Gate 6." |
| "Reconcilers are optional" | SCOPE_REDUCTION | "Without reconcilers, workers and HTTP policies ignore config changes. Hot-reload is partial." |
| "The service already has systemplane" | COMPLIANCE_BYPASS | "Existence ≠ compliance. MUST run compliance audit. If it doesn't match canonical patterns exactly, it is non-compliant." |
| "Skip code review, we tested it" | QUALITY_BYPASS | "MANDATORY: 10 reviewers. One wiring mistake = silent config corruption or resource leak." |
| "Agent says out of scope" | AUTHORITY_OVERRIDE | "Skill defines scope, not agent. Re-dispatch with gate context." |
| "ChangeFeed can be added later" | SCOPE_REDUCTION | "Without ChangeFeed, hot-reload is broken. Config changes in DB are invisible to the service. MUST start DebouncedFeed." |
| "Active bundle state is internal detail" | SCOPE_REDUCTION | "Without thread-safe accessor, request handlers cannot read live config. Race conditions. MUST implement active_bundle_state.go." |
---
## Gate Overview
| Gate | Name | Condition | Agent |
|------|------|-----------|-------|
| 0 | Stack Detection + Prerequisite Audit | Always | Orchestrator |
| 1 | Codebase Analysis (Config Focus) | Always | ring:codebase-explorer |
| 1.5 | Implementation Preview | Always | Orchestrator (ring:visualize) |
| 2 | Key Definitions + Registry | Always | ring:backend-engineer-golang |
| 3 | Bundle + BundleFactory | Always | ring:backend-engineer-golang |
| 4 | Reconcilers | Conditional (skip if no workers AND no RMQ AND no HTTP policy changes) | ring:backend-engineer-golang |
| 5 | Identity + Authorization | Always | ring:backend-engineer-golang |
| 6 | Config Manager Bridge | Always | ring:backend-engineer-golang |
| 7 | Wiring + HTTP Mount + Swagger + ChangeFeed | Always ⛔ NEVER SKIPPABLE | ring:backend-engineer-golang |
| 8 | Code Review | Always | 10 parallel reviewers |
| 9 | User Validation | Always | User |
| 10 | Activation Guide | Always | Orchestrator |
MUST execute gates sequentially. CANNOT skip or reorder.
### Gate Execution Rules
**HARD GATES — CANNOT be overridden:**
- All gates must execute in order (0→1→1.5→2→3→4→5→6→7→8→9→10)
- Gates execute with explicit dependencies — no gate can start until its predecessor completes
- Existence ≠ compliance: existing systemplane code triggers compliance audit, NOT a skip
- A gate can only be marked SKIP when ALL its compliance checks pass with evidence
- Gate 7 (Wiring) is NEVER skippable — it is the most critical gate
<cannot_skip>
### HARD GATE: Existence ≠ Compliance
**"The service already has systemplane code" is NOT a reason to skip any gate.**
MUST replace existing systemplane code that does not follow the canonical patterns — it is **non-compliant**. The only valid reason to skip a gate is when the existing implementation has been **verified** to match the exact patterns defined in this skill document.
**Compliance verification requires EVIDENCE, not assumption.** The Gate 0 Phase 2 compliance audit (S1-S8 grep checks plus the S9 file-structure check) verifies each component against canonical patterns.
**If ANY audit check is NON-COMPLIANT → the corresponding gate MUST execute to fix it. CANNOT skip.**
</cannot_skip>
---
## Gate 0: Stack Detection + Prerequisite Audit
**Orchestrator executes directly. No agent dispatch.**
**This gate has THREE phases: detection, compliance audit, and non-canonical pattern detection.**
### Phase 1: Stack Detection
```text
DETECT (run in parallel):
1. lib-commons version:
grep "lib-commons" go.mod
(require v4.3.0+ — if older, MUST upgrade before proceeding)
2. PostgreSQL:
grep -rn "postgresql\|pgx\|squirrel" internal/ go.mod
3. MongoDB:
grep -rn "mongodb\|mongo" internal/ go.mod
4. Redis:
grep -rn "redis\|valkey" internal/ go.mod
5. RabbitMQ:
grep -rn "rabbitmq\|amqp" internal/ go.mod
6. S3/Object Storage:
grep -rn "s3\|ObjectStorage\|PutObject\|GetObject\|Upload.*storage\|Download.*storage" internal/ pkg/ go.mod
7. Background workers:
grep -rn "ticker\|time.NewTicker\|cron\|worker\|scheduler" internal/ --include="*.go" | grep -v _test.go
8. Existing systemplane code:
- Imports: grep -rn "systemplane" internal/ go.mod
- BundleFactory: grep -rn "BundleFactory\|IncrementalBundleFactory" internal/ --include="*.go"
- Supervisor: grep -rn "service.NewSupervisor\|Supervisor" internal/ --include="*.go" | grep systemplane
- Keys: grep -rn "Register.*Keys\|KeyDefs()" internal/ --include="*.go"
- HTTP Mount: grep -rn "handler.Mount\|fiberhttp.NewHandler" internal/ --include="*.go"
9. Current config pattern:
- Struct: grep -rn "type Config struct" internal/ --include="*.go"
- Env tags: grep -rn "envDefault:" internal/ --include="*.go" | sort
- Env reads: grep -rn "os.Getenv\|viper\.\|envconfig" internal/ --include="*.go" | grep -v _test.go
- YAML files: find . -name '.env*' -o -name '*.yaml' -o -name '*.yml' | grep -i config
```
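The version requirement in step 1 can also be checked programmatically. The sketch below is an assumption-laden helper, not part of the skill: it scans `go.mod` text for the `lib-commons/v4` module line and compares the version against v4.3.0 with a hand-rolled parser that drops any pre-release or pseudo-version suffix.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits "v4.3.2" into [4 3 2]. Any pre-release or build
// suffix (after '-' or '+') is dropped, which is a simplification.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	v = strings.TrimPrefix(v, "v")
	if i := strings.IndexAny(v, "-+"); i >= 0 {
		v = v[:i]
	}
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("unexpected version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("unexpected version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// atLeast reports whether have >= want, comparing major, minor, patch in order.
func atLeast(have, want [3]int) bool {
	for i := range have {
		if have[i] != want[i] {
			return have[i] > want[i]
		}
	}
	return true
}

// libCommonsOK scans go.mod text for the lib-commons/v4 requirement and
// reports whether its version is v4.3.0 or newer.
func libCommonsOK(goMod string) (bool, error) {
	const module = "github.com/LerianStudio/lib-commons/v4"
	for _, line := range strings.Split(goMod, "\n") {
		fields := strings.Fields(line)
		for i, f := range fields {
			if f == module && i+1 < len(fields) {
				have, err := parseVersion(fields[i+1])
				if err != nil {
					return false, err
				}
				return atLeast(have, [3]int{4, 3, 0}), nil
			}
		}
	}
	return false, fmt.Errorf("%s not found in go.mod", module)
}

func main() {
	goMod := "module example.com/svc\n\nrequire (\n\tgithub.com/LerianStudio/lib-commons/v4 v4.3.2\n)\n"
	ok, err := libCommonsOK(goMod)
	fmt.Println(ok, err)
}
```

A `false` result corresponds to the "MUST upgrade before proceeding" branch; an error means the dependency line was not found, which maps to the `skip_when` condition for services that do not use lib-commons v4.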
**Output format for stack detection:**
```text
STACK DETECTION RESULTS:
| Component | Detected | Evidence |
|-----------|----------|----------|
| lib-commons v4.3.0+ | YES/NO | {go.mod version} |
| PostgreSQL | YES/NO | {file:line matches} |
| MongoDB | YES/NO | {file:line matches} |
| Redis | YES/NO | {file:line matches} |
| RabbitMQ | YES/NO | {file:line matches} |
| S3/Object Storage | YES/NO | {file:line matches} |
| Background workers | YES/NO | {file:line matches} |
| Existing systemplane | YES/NO | {file:line matches} |
| Config pattern | envDefault/viper/os.Getenv | {file:line matches} |
```
### Phase 2: Compliance Audit (MANDATORY if any existing systemplane code detected)
If Phase 1 step 8 detects any existing systemplane code, MUST run a compliance audit. MUST replace existing code that does not match canonical patterns — it is not "partially done", it is **wrong**.
```text
AUDIT (run in parallel — only if step 8 found existing systemplane code):
S1. Key Registry compliance:
grep -rn "Register.*Keys\|KeyDefs()" internal/ --include="*.go"
(no match = NON-COMPLIANT → Gate 2 MUST fix)
S2. BundleFactory compliance:
grep -rn "IncrementalBundleFactory\|BundleFactory" internal/ --include="*.go"
(no IncrementalBundleFactory = NON-COMPLIANT → Gate 3 MUST fix)
S3. AdoptResources compliance:
grep -rn "AdoptResourcesFrom" internal/ --include="*.go"
(no match but BundleFactory exists = NON-COMPLIANT → Gate 3 MUST fix)
S4. Reconciler compliance:
grep -rn "BundleReconciler\|Reconcile.*context" internal/ --include="*.go" | grep -v _test.go
(expected reconcilers missing = NON-COMPLIANT → Gate 4 MUST fix)
S5. Identity/Authorization compliance:
grep -rn "IdentityResolver\|Authorizer" internal/ --include="*.go"
(no match = NON-COMPLIANT → Gate 5 MUST fix)
S6. Config Bridge compliance:
grep -rn "StateSync\|config_manager_systemplane\|configFromSnapshot" internal/ --include="*.go"
(no StateSync callback or config hydration = NON-COMPLIANT → Gate 6 MUST fix)
S7. HTTP Mount compliance:
grep -rn "handler.Mount\|fiberhttp.NewHandler" internal/ --include="*.go"
(no match = NON-COMPLIANT → Gate 7 MUST fix)
Also check all 9 routes are accessible:
grep -rn "/v1/system" internal/ --include="*.go"
S8. Swagger compliance:
grep -rn "swagger.MergeInto\|systemswagger" internal/ --include="*.go"
(no match = NON-COMPLIANT → Gate 7 MUST fix)
S9. File structure compliance:
ls internal/bootstrap/systemplane_mount.go
ls internal/bootstrap/systemplane_authorizer.go
ls internal/bootstrap/systemplane_identity.go
ls internal/bootstrap/systemplane_factory.go
ls internal/bootstrap/active_bundle_state.go
(Each systemplane concern MUST have a dedicated file. Monolithic systemplane.go
files that combine mount + authorizer + identity = NON-COMPLIANT)
```
**Output format for compliance audit:**
```text
COMPLIANCE AUDIT RESULTS:
| Check | Component | Status | Evidence | Gate Action |
|-------|-----------|--------|----------|-------------|
| S1 | Key Registry | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 2: SKIP / MUST FIX |
| S2 | BundleFactory | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 3: SKIP / MUST FIX |
| S3 | AdoptResources | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 3: SKIP / MUST FIX |
| S4 | Reconcilers | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 4: SKIP / MUST FIX |
| S5 | Identity/Auth | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 5: SKIP / MUST FIX |
| S6 | Config Bridge | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 6: SKIP / MUST FIX |
| S7 | HTTP Mount | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 7: SKIP / MUST FIX |
| S8 | Swagger | COMPLIANT / NON-COMPLIANT | {grep results} | Gate 7: SKIP / MUST FIX |
| S9 | File Structure | COMPLIANT / NON-COMPLIANT | {ls results} | Gate 7: SKIP / MUST FIX |
```
**⛔ HARD GATE: A gate can only be marked as SKIP when ALL its compliance checks are COMPLIANT with evidence. One NON-COMPLIANT row → gate MUST execute.**
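The skip rule can be expressed as a small decision function. This sketch is illustrative: the `Check` type is hypothetical, the check-to-gate mapping mirrors the audit table above, and the unconditional execution of Gate 7 follows the Gate Execution Rules rather than the audit table.

```go
package main

import "fmt"

// Check is one row of the compliance audit (hypothetical shape).
type Check struct {
	ID        string // e.g. "S2"
	Gate      int    // gate that must fix a non-compliant result
	Compliant bool
}

// gatesToExecute returns, per gate, whether the gate must execute.
// A gate may be skipped (false) only when every one of its checks is
// compliant. Gate 7 always executes because it is never skippable.
func gatesToExecute(checks []Check) map[int]bool {
	mustRun := map[int]bool{}
	for _, c := range checks {
		mustRun[c.Gate] = mustRun[c.Gate] || !c.Compliant
	}
	mustRun[7] = true
	return mustRun
}

func main() {
	audit := []Check{
		{ID: "S1", Gate: 2, Compliant: true},
		{ID: "S2", Gate: 3, Compliant: true},
		{ID: "S3", Gate: 3, Compliant: false}, // one bad row forces Gate 3
		{ID: "S7", Gate: 7, Compliant: true},
	}
	for gate, run := range gatesToExecute(audit) {
		fmt.Printf("gate %d must execute: %v\n", gate, run)
	}
}
```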
### Phase 3: Non-Canonical Pattern Detection (MANDATORY)
MUST scan for custom config management outside the systemplane canonical patterns:
```text
DETECT non-canonical config patterns:
N1. Custom config hot-reload:
grep -rln "hot.reload\|config.watch\|config.reload\|ConfigReload" internal/config/ pkg/config/ internal/bootstrap/ --include="*.go" 2>/dev/null
(any match = NON-CANONICAL → MUST be removed and replaced with systemplane ChangeFeed)
N2. Custom file watchers:
grep -rn "fsnotify\|viper.WatchConfig\|inotify\|file.watch" internal/ pkg/ --include="*.go"
(any match = NON-CANONICAL → MUST be removed; systemplane uses database change feed)
N3. Custom change notification channels:
grep -rn "chan.*config\|config.*chan\|ConfigChange\|configChanged" internal/ pkg/ --include="*.go" | grep -v systemplane
(any match outside systemplane = NON-CANONICAL → MUST be replaced with systemplane ChangeFeed)
```
**If non-canonical files are found:** report them in the compliance audit as `NON-CANONICAL FILES DETECTED`. The implementing agent MUST remove these files and replace their functionality with systemplane during the appropriate gate.
**Store detection results:**
```json
{
  "skill": "ring:dev-systemplane-migration",
  "gate": "0",
  "detection": {
    "lib_commons_version": "v4.3.2",
    "backend": "postgres",
    "dependencies": ["postgres", "redis", "rabbitmq"],
    "has_workers": true,
    "existing_systemplane": false,
    "config_pattern": "envDefault",
    "config_struct_location": "internal/bootstrap/config.go:15"
  },
  "compliance": {
    "status": "NEW",
    "checks": {}
  },
  "non_canonical": []
}
```
MUST confirm that the user explicitly approves the detection results before proceeding.
---
## Gate 1: Codebase Analysis (Config Focus)
**Always executes. This gate builds the configuration inventory for all subsequent gates.**
**Dispatch `ring:codebase-explorer` with systemplane-focused context:**
> TASK: Analyze this codebase exclusively under the systemplane migration perspective.
> DETECTED STACK: {databases and infrastructure from Gate 0}
>
> FOCUS AREAS (explore ONLY these — ignore everything else):
>
> 1. **Config struct location and fields**: Find the main Config struct, all env tags (`envDefault:`), default values. List every field with its current type, default, and env var name. Include nested structs (e.g., `Config.Postgres.Host`).
>
> 2. **Environment variable reads**: All `os.Getenv`, `envconfig`, `viper` usage. Include those outside the Config struct (ad-hoc reads in business logic).
>
> 3. **Infrastructure client creation**: How postgres, redis, rabbitmq, mongo, S3 clients are created. File:line for each constructor call. Connection strings, pool sizes, timeouts used. These become Bundle components.
>
> 4. **Background workers**: Ticker-based, cron-based, goroutine workers that need reconciliation. File:line for each worker start. Which config fields control their behavior (intervals, enable/disable, batch sizes).
>
> 5. **HTTP server configuration**: Listen address, TLS, CORS, rate limits, timeouts, body limits. Where these are read and applied. These become HTTP policy keys.
>
> 6. **Authentication/authorization**: JWT parsing, middleware, permission model. Where user ID and tenant ID are extracted from context. This drives Identity + Authorizer implementation.
>
> 7. **Existing config reload patterns**: Any hot-reload, file watching, config refresh mechanisms. Viper watchers, fsnotify, custom channels. These will be replaced by systemplane ChangeFeed.
>
> 8. **Swagger/OpenAPI setup**: How swagger is generated (`swag init` command?), where spec is served (middleware?), what tool (swaggo/swag? go-swagger?). This drives the swagger.MergeInto() integration in Gate 7.
>
> OUTPUT FORMAT: Structured report with file:line references for every point above.
> DO NOT write code. Analysis only.
**From the analysis, produce the Config Inventory Table:**
| Env Var Name | Config Field | Current Default | Go Type | Proposed Key | Proposed Tier | ApplyBehavior | Component | Secret | Validator | Group |
|-------------|-------------|-----------------|---------|-------------|--------------|---------------|-----------|--------|-----------|-------|
| `POSTGRES_HOST` | `Config.Postgres.Host` | `localhost` | `string` | `postgres.primary_host` | Runtime | BundleRebuild | `postgres` | No | — | `postgres` |
| `RATE_LIMIT_MAX` | `Config.RateLimit.Max` | `100` | `int` | `rate_limit.max` | Live-Read | LiveRead | `_none` | No | validatePositiveInt | `rate_limit` |
| `SERVER_ADDRESS` | `Config.Server.Address` | `:8080` | `string` | `server.address` | Bootstrap | BootstrapOnly | `_none` | No | — | `server` |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
**This inventory becomes the CONTEXT for all subsequent gates.** Gate 2 uses it to create key definitions. Gate 3 uses it to build the Bundle. Gate 6 uses it for snapshot→Config hydration.
<block_condition>
HARD GATE: MUST complete the analysis report and config inventory before proceeding. All subsequent gates depend on this inventory.
</block_condition>
---
## Gate 1.5: Implementation Preview (Visual Report)
**Always executes. This gate generates a visual HTML report showing exactly what will change before any code is written.**
**Uses the `ring:visualize` skill to produce a self-contained HTML page.**
The report is built from Gate 0 (stack detection) and Gate 1 (codebase analysis). It shows the developer a complete preview of every change that will be made across all subsequent gates.
**Orchestrator generates the report using `ring:visualize` with this content:**
The HTML page MUST include these sections:
### 1. Current Architecture (Before)
- Mermaid diagram showing current config flow: env vars → Config struct → static injection into services
- Table of all files that will be created or modified, with purpose
- How infrastructure clients get their config today (static fields, constructor args)
### 2. Target Architecture (After)
- Mermaid diagram showing the three-tier architecture:
```
Bootstrap env → Supervisor → Snapshot → BundleFactory → Bundle
→ Live-Read (per-request)
→ StateSync → Config struct (backward compat)
```
- Component map showing which infrastructure is managed by systemplane
- Change feed flow: DB change → pg_notify/change_stream → DebouncedFeed → Supervisor.Reload
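
The debouncing step in the change feed flow can be illustrated with channels and a timer. This is a generic debounce sketch, not the lib-commons `DebouncedFeed` implementation; the 200ms window is taken from the startup flow, and the choice to flush a pending signal when the input closes is an assumption made to keep the example deterministic.

```go
package main

import (
	"fmt"
	"time"
)

// defaultWindow mirrors the 200ms debounce window named in the data flow.
var defaultWindow = 200 * time.Millisecond

// debounce emits one signal on the returned channel after in has been
// quiet for window. Bursts of signals collapse into a single reload
// trigger. A pending signal is flushed when in is closed.
func debounce(in <-chan struct{}, window time.Duration) <-chan struct{} {
	out := make(chan struct{}, 1)
	go func() {
		defer close(out)
		var timer *time.Timer
		var fire <-chan time.Time // nil while no signal is pending
		emit := func() {
			select {
			case out <- struct{}{}:
			default: // a trigger is already queued; coalesce
			}
		}
		for {
			select {
			case _, ok := <-in:
				if timer != nil {
					timer.Stop()
				}
				if !ok {
					if fire != nil {
						emit()
					}
					return
				}
				// Restart the quiet period on every incoming change.
				timer = time.NewTimer(window)
				fire = timer.C
			case <-fire:
				fire = nil
				emit()
			}
		}
	}()
	return out
}

func main() {
	changes := make(chan struct{})
	reloads := debounce(changes, defaultWindow)

	for i := 0; i < 5; i++ {
		changes <- struct{}{} // five store changes in quick succession
	}
	close(changes)

	count := 0
	for range reloads {
		count++
	}
	fmt.Println("reload triggers:", count)
}
```

Coalescing matters here because a single PATCH that touches several keys can produce several store notifications, and each reload may rebuild infrastructure clients.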
### 3. Key Inventory Table
ALL keys with: name, tier (bootstrap/runtime/live-read), scope, ApplyBehavior, component, secret, validator, group. One row per key. This is the FULL inventory from Gate 1.
### 4. File Creation Plan
Every file to be created with gate assignment:
| Gate | File | Purpose | Lines (est.) |
|------|------|---------|-------------|
| 2 | `systemplane_keys.go` | Key orchestrator: Register + KeyDefs | ~50 |
| 2 | `systemplane_keys_{group}.go` | Per-group key definitions | ~30 each |
| 2 | `systemplane_keys_validation.go` | Validator functions | ~60 |
| 2 | `systemplane_keys_helpers.go` | concatKeyDefs utility | ~15 |
| 3 | `systemplane_bundle.go` | Bundle struct + Close + AdoptResources | ~80 |
| 3 | `systemplane_factory.go` | IncrementalBundleFactory | ~120 |
| 3 | `systemplane_factory_infra.go` | Per-component builders | ~100 |
| 4 | `systemplane_reconciler_*.go` | Reconcilers (if applicable) | ~50 each |
| 5 | `systemplane_identity.go` | JWT → Actor bridge | ~30 |
| 5 | `systemplane_authorizer.go` | Permission mapping | ~50 |
| 6 | `config_manager_systemplane.go` | Snapshot → Config hydration | ~150 |
| 6 | `config_manager_seed.go` | Env → Store seed | ~80 |
| 6 | `config_manager_helpers.go` | Type-safe comparison | ~40 |
| 6 | `config_validation.go` | Production config guards | ~60 |
| 7 | `systemplane_init.go` | 11-step init sequence | ~120 |
| 7 | `systemplane_mount.go` | HTTP route registration + swagger merge | ~40 |
| 7 | `active_bundle_state.go` | Thread-safe live-read accessor | ~40 |
### 5. ApplyBehavior Distribution
Visual breakdown showing how many keys fall under each ApplyBehavior:
| ApplyBehavior | Count | Percentage | Example Keys |
|---------------|-------|-----------|-------------|
| BootstrapOnly | N | N% | server.address, auth.enabled |
| LiveRead | N | N% | rate_limit.max, timeout_sec |
| WorkerReconcile | N | N% | worker.interval_sec |
| BundleRebuild | N | N% | postgres.host, redis.host |
| BundleRebuildAndReconcile | N | N% | worker.enabled |
### 6. Component Map
| Component | Infrastructure | Keys | Rebuild Trigger |
|-----------|---------------|------|-----------------|
| `postgres` | PostgreSQL client | N | Connection string, pool size, credentials |
| `redis` | Redis client | N | Host, port, password |
| `rabbitmq` | RabbitMQ connection | N | URI, host, credentials |
| `s3` | Object storage | N | Endpoint, bucket, credentials |
| `http` | HTTP policy | N | Body limit, CORS |
| `logger` | Logger | N | Log level |
| `_none` | No infrastructure | N | Business logic only (live-read) |
### 7. Swagger Integration
Shows how systemplane routes will appear in the app's Swagger UI:
```go
import systemswagger "github.com/LerianStudio/lib-commons/v4/commons/systemplane/swagger"
// After swag init generates the base spec:
merged, err := systemswagger.MergeInto(baseSwaggerSpec)
// Use merged spec for swagger UI handler
```
All 9 `/v1/system/*` routes will be visible in the Swagger UI after merge.
### 8. Risk Assessment
| Risk | Mitigation | Verification |
|------|-----------|-------------|
| Config struct regression | StateSync hydrates Config from Snapshot on every change | Existing tests pass with systemplane active |
| Startup failure | Graceful degradation — service starts without systemplane if backend unavailable | Manual test with backend down |
| Resource leaks on hot-reload | AdoptResourcesFrom + ownership tracking in Bundle | Bundle factory tests verify resource reuse |
| Secret exposure | RedactPolicy on secret keys, AES-256-GCM encryption in store | Security review at Gate 8 |
### 9. Retro Compatibility Guarantee
**When systemplane store is unavailable**, the service degrades to bootstrap defaults:
- Config values from env vars still work (via `defaultConfig()` derived from KeyDefs)
- No runtime mutation API available (9 endpoints return 503)
- No hot-reload capability (static config for process lifetime)
- Workers run with static config from env vars
- **The service never fails to start due to systemplane issues**
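
The degradation behavior listed above can be sketched as a startup helper that never returns an error. All names here (`Config`, `initConfig`, `errBackendDown`) are hypothetical; the sketch only shows the fallback shape, not the real `InitSystemplane()` sequence.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical runtime config.
type Config struct {
	RateLimitMax int
	Source       string
}

// errBackendDown simulates an unreachable systemplane store.
var errBackendDown = errors.New("systemplane backend unavailable")

// defaultConfig stands in for the defaults derived from key definitions.
func defaultConfig() Config {
	return Config{RateLimitMax: 100, Source: "bootstrap-defaults"}
}

// initConfig tries to load config from the store. On failure it returns
// the bootstrap defaults and reports that hot-reload is unavailable.
// It never fails, so startup cannot be blocked by the store.
func initConfig(load func() (Config, error)) (cfg Config, hotReload bool) {
	cfg, err := load()
	if err != nil {
		return defaultConfig(), false
	}
	return cfg, true
}

func main() {
	down := func() (Config, error) { return Config{}, errBackendDown }
	up := func() (Config, error) { return Config{RateLimitMax: 250, Source: "store"}, nil }

	cfg, hot := initConfig(down)
	fmt.Printf("%+v hotReload=%v\n", cfg, hot)

	cfg, hot = initConfig(up)
	fmt.Printf("%+v hotReload=%v\n", cfg, hot)
}
```

The `hotReload` flag in this sketch is what a service would consult to decide whether to mount the mutation API or answer 503 on those routes.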
**Output:** Save the HTML report to `docs/systemplane-preview.html` in the project root.
**Open in browser** for the developer to review.
<block_condition>
HARD GATE: Developer MUST explicitly approve the implementation preview before any code changes begin (Gates 2+). This prevents wasted effort on incorrect key classification or architectural decisions.
</block_condition>
**If the developer requests changes to the preview, regenerate the report and re-confirm.**
---
## Gate 2: Key Definitions + Registry
**Always executes.** This gate creates the key definition files that form the backbone of systemplane.
**Dispatch `ring:backend-engineer-golang` with context from Gate 1 config inventory:**
> TASK: Create systemplane key definition files based on the approved config inventory.
>
> CONTEXT FROM GATE 1: {Full config inventory table from analysis report}
> APPROVED KEY COUNT: {N keys from Gate 1.5 preview}
>
> **IMPORT PATHS (use exactly these):**
> ```go
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/domain"
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/registry"
> ```
>
> **FILES TO CREATE:**
>
> 1. **`systemplane_keys.go`** — Orchestrator:
> ```go
> func Register{Service}Keys(reg registry.Registry) error {
>     for _, def := range {service}KeyDefs() {
>         if err := reg.Register(def); err != nil {
>             return fmt.Errorf("register key %q: %w", def.Key, err)
>         }
>     }
>     return nil
> }
>
> func {service}KeyDefs() []domain.KeyDef {
>     return concatKeyDefs(
>         {service}KeyDefsAppServer(),
>         {service}KeyDefsPostgres(),
>         // ... one per group from inventory
>     )
> }
> ```
>
> 2. **`systemplane_keys_{group}.go`** — Per-group key definitions. Split by group names from inventory (app, server, postgres, redis, rabbitmq, auth, swagger, telemetry, rate_limit, worker, etc.)
>
> 3. **`systemplane_keys_validation.go`** — Validator functions:
> - `validatePositiveInt(value any) error`
> - `validateLogLevel(value any) error`
> - `validateSSLMode(value any) error`
> - `validateAbsoluteHTTPURL(value any) error`
> - `validatePort(value any) error`
> - Any service-specific validators
>
> 4. **`systemplane_keys_helpers.go`** — `concatKeyDefs()` utility
>
> **KEY DEFINITION RULES:**
>
> Every key MUST have ALL of these fields:
> - `Key` — dotted name (e.g., `"postgres.primary_host"`)
> - `Kind` — `domain.KindConfig` or `domain.KindSetting`
> - `AllowedScopes` — `[]domain.Scope{domain.ScopeGlobal}`, or `[]domain.Scope{domain.ScopeGlobal, domain.ScopeTenant}` when per-tenant overrides are allowed
> - `DefaultValue` — THE single source of truth for defaults
> - `ValueType` — `domain.ValueString`, `domain.ValueInt`, `domain.ValueBool`, `domain.ValueFloat`
> - `ApplyBehavior` — classify using the 5-level taxonomy (see Appendix F)
> - `MutableAtRuntime` — `false` for bootstrap-only, `true` for everything else
> - `Secret` — `true` if contains credentials, tokens, keys
> - `RedactPolicy` — `domain.RedactFull` for secrets, `domain.RedactNone` otherwise
> - `Description` — human-readable description
> - `Group` — logical grouping matching file suffix
> - `Component` — infrastructure component: `"postgres"`, `"redis"`, `"rabbitmq"`, `"s3"`, `"http"`, `"logger"`, or `domain.ComponentNone` (`"_none"`)
> - `Validator` — custom validation function (nil if none needed)
>
> **DefaultValue is the SINGLE SOURCE OF TRUTH.** No separate config defaults. The `defaultConfig()` function will derive from KeyDefs via `configFromSnapshot(defaultSnapshotFromKeyDefs(...))`.
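The helper and validator files listed above can be sketched as follows. This is a minimal illustration, not the lib-commons implementation: `KeyDef` is a trimmed local stand-in for `domain.KeyDef`, `toInt` is a hypothetical helper, and the key names in `main` are examples only. The validator accepts `float64` because values that pass through JSON decoding arrive as floats.

```go
package main

import (
	"fmt"
	"math"
)

// KeyDef is a local stand-in for domain.KeyDef, trimmed to two fields.
type KeyDef struct {
	Key          string
	DefaultValue any
}

// concatKeyDefs flattens per-group slices into one slice, preserving order.
func concatKeyDefs(groups ...[]KeyDef) []KeyDef {
	total := 0
	for _, g := range groups {
		total += len(g)
	}
	out := make([]KeyDef, 0, total)
	for _, g := range groups {
		out = append(out, g...)
	}
	return out
}

// toInt (hypothetical helper) accepts the integer shapes a value may take
// after JSON decoding; fractional or infinite floats are rejected.
func toInt(value any) (int, bool) {
	switch v := value.(type) {
	case int:
		return v, true
	case int64:
		return int(v), true
	case float64:
		if math.IsInf(v, 0) || v != math.Trunc(v) {
			return 0, false
		}
		return int(v), true
	default:
		return 0, false
	}
}

// validatePort rejects anything that is not an integer in 1..65535.
func validatePort(value any) error {
	n, ok := toInt(value)
	if !ok {
		return fmt.Errorf("port must be an integer, got %T", value)
	}
	if n < 1 || n > 65535 {
		return fmt.Errorf("port %d out of range 1-65535", n)
	}
	return nil
}

func main() {
	defs := concatKeyDefs(
		[]KeyDef{{Key: "server.port", DefaultValue: 8080}},
		[]KeyDef{{Key: "postgres.primary_port", DefaultValue: 5432}},
	)
	for _, d := range defs {
		fmt.Println(d.Key, validatePort(d.DefaultValue))
	}
}
```

Running every `DefaultValue` through its own validator, as `main` does, is a cheap way to catch defaults that would be rejected on the first write.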
**Apply Behavior classification** — Include the 5-level taxonomy:
| ApplyBehavior | Constant | Strength | Runtime Effect | Use When |
|---------------|----------|----------|----------------|----------|
| **bootstrap-only** | `domain.ApplyBootstrapOnly` | 4 | Immutable after startup | Server address, TLS, auth, telemetry |
| **bundle-rebuild+worker-reconcile** | `domain.ApplyBundleRebuildAndReconcile` | 3 | Bundle swap AND worker restart | Worker enable/disable (needs new connections + restart) |
| **bundle-rebuild** | `domain.ApplyBundleRebuild` | 2 | Rebuild infra clients only | Connection strings, pool sizes, credentials |
| **worker-reconcile** | `domain.ApplyWorkerReconcile` | 1 | Restart workers only | Worker intervals, scheduler periods |
| **live-read** | `domain.ApplyLiveRead` | 0 | Zero-cost per-request reads | Rate limits, timeouts, TTLs, feature flags |
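One plausible reading of the Strength column is that, when a change touches several keys, the strongest behavior among them decides how the change is applied. The sketch below illustrates that reading only; the type, constants, and `strongest` function are local stand-ins rather than the lib-commons API, and `worker.interval` is a hypothetical key.

```go
package main

import "fmt"

// ApplyBehavior is a local stand-in; the numeric values mirror the
// Strength column of the table above (0 = weakest, 4 = strongest).
type ApplyBehavior int

const (
	LiveRead ApplyBehavior = iota
	WorkerReconcile
	BundleRebuild
	BundleRebuildAndReconcile
	BootstrapOnly
)

// strongest returns the highest-strength behavior among the changed keys.
// Keys missing from the behaviors map are ignored.
func strongest(behaviors map[string]ApplyBehavior, changed []string) ApplyBehavior {
	top := LiveRead
	for _, key := range changed {
		if b, ok := behaviors[key]; ok && b > top {
			top = b
		}
	}
	return top
}

func main() {
	behaviors := map[string]ApplyBehavior{
		"rate_limit.max":        LiveRead,
		"worker.interval":       WorkerReconcile, // hypothetical key
		"postgres.primary_host": BundleRebuild,
	}
	changed := []string{"rate_limit.max", "postgres.primary_host"}
	fmt.Println(strongest(behaviors, changed) == BundleRebuild) // prints "true"
}
```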
**Completion criteria:** All keys registered, `go build ./...` passes, key count matches inventory from Gate 1.5.
**Catalog validation (MANDATORY):**
Products MUST add a test that validates their KeyDefs against the lib-commons canonical catalog:
```go
func TestKeyDefs_MatchCanonicalCatalog(t *testing.T) {
	reg := registry.New()
	if err := Register{Service}Keys(reg); err != nil {
		t.Fatalf("register keys: %v", err)
	}

	// Collect both kinds so configs and settings are validated together.
	var allDefs []domain.KeyDef
	for _, def := range reg.List(domain.KindConfig) {
		allDefs = append(allDefs, def)
	}
	for _, def := range reg.List(domain.KindSetting) {
		allDefs = append(allDefs, def)
	}

	mismatches := catalog.ValidateKeyDefs(allDefs, catalog.AllSharedKeys()...)
	for _, m := range mismatches {
		t.Errorf("catalog mismatch: %s", m)
	}
}
```
Any mismatch = NON-COMPLIANT. The canonical catalog in `lib-commons/commons/systemplane/catalog/` defines the source of truth for shared key names, tiers, and components.
**KindSetting vs KindConfig decision tree:**
```
Is this a per-tenant business rule?
YES → KindSetting (AllowedScopes: [ScopeGlobal, ScopeTenant])
NO ↓
Is this a product-wide feature flag or business parameter?
YES → KindSetting (AllowedScopes: [ScopeGlobal])
NO ↓
Is this an infrastructure knob (DB, Redis, auth, telemetry)?
YES → KindConfig
NO ↓
Default: KindConfig
```
Products MUST NOT leave the Settings API empty. If no per-tenant settings are identified, document the justification in `systemplane-guide.md`.
**Verification:** `grep -rn "Register.*Keys\|KeyDefs()" internal/bootstrap/` + `go build ./...`
---
## Gate 3: Bundle + BundleFactory
**Always executes.** This gate creates the runtime resource container and its incremental builder.
**Dispatch `ring:backend-engineer-golang` with context from Gates 1-2:**
> TASK: Create the Bundle, BundleFactory, and per-component infrastructure builders.
>
> DETECTED INFRASTRUCTURE: {postgres, redis, rabbitmq, s3 — from Gate 0}
> KEY DEFINITIONS: {key groups and components from Gate 2}
>
> **IMPORT PATHS (use exactly these):**
> ```go
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/domain"
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/ports"
> ```
>
> **FILES TO CREATE:**
>
> 1. **`systemplane_bundle.go`** — Bundle struct with:
> - Per-component ownership fields (postgres, redis, rabbitmq, mongo, s3, http, logger)
> - `Close(ctx context.Context) error` method that closes all **owned** resources in REVERSE dependency order
> - `AdoptResourcesFrom(previous domain.RuntimeBundle)` method that nil-outs transferred pointers in previous
> - Ownership booleans per component (e.g., `ownsPostgres`, `ownsRedis`)
>
> Pattern:
> ```go
> type {Service}Bundle struct {
>     Infra  *InfraBundle
>     HTTP   *HTTPPolicyBundle
>     Logger *LoggerBundle
>     ownsPostgres, ownsRedis, ownsRabbitMQ, ownsObjectStorage bool
> }
>
> func (b *{Service}Bundle) Close(ctx context.Context) error {
>     if b == nil || b.Infra == nil { return nil } // nothing was built
>     // Close in REVERSE dependency order, only owned resources
>     if b.ownsObjectStorage && b.Infra.ObjectStorage != nil { b.Infra.ObjectStorage.Close() }
>     if b.ownsRabbitMQ && b.Infra.RabbitMQ != nil { b.Infra.RabbitMQ.Close() }
>     if b.ownsRedis && b.Infra.Redis != nil { b.Infra.Redis.Close() }
>     if b.ownsPostgres && b.Infra.Postgres != nil { b.Infra.Postgres.Close() }
>     return nil
> }
>
> func (b *{Service}Bundle) AdoptResourcesFrom(previous domain.RuntimeBundle) {
>     prev, ok := previous.(*{Service}Bundle)
>     if !ok || prev == nil || prev.Infra == nil { return }
>     // Nil out transferred pointers so closing prev cannot close them
>     if !b.ownsPostgres { prev.Infra.Postgres = nil }
>     if !b.ownsRedis { prev.Infra.Redis = nil }
>     // ... etc
> }
> ```
>
> 2. **`systemplane_factory.go`** — IncrementalBundleFactory with:
> - `Build(ctx, snapshot)` — full rebuild, creates everything from scratch
> - `BuildIncremental(ctx, snapshot, previous, prevSnapshot)` — component-granular rebuild
> - `keyComponentMap` — built once at factory construction from KeyDefs
> - `diffChangedComponents(snap, prevSnap)` — uses keyComponentMap to find which components changed
>
> Pattern:
> ```go
> type {Service}BundleFactory struct {
> bootstrapCfg *BootstrapOnlyConfig
> keyComponentMap map[string]string // key → component
> }
>
> func (f *{Service}BundleFactory) Build(ctx context.Context, snap domain.Snapshot) (domain.RuntimeBundle, error) {
> // Full rebuild — creates everything from scratch
> }
>
> func (f *{Service}BundleFactory) BuildIncremental(ctx context.Context, snap domain.Snapshot,
> previous domain.RuntimeBundle, prevSnap domain.Snapshot) (domain.RuntimeBundle, error) {
> // Diff changed components, rebuild only what changed, reuse the rest
> }
> ```
>
> 3. **`systemplane_factory_infra.go`** — Per-component builders:
> - `buildPostgres(ctx, snapshot)` — creates postgres client from snapshot keys
> - `buildRedis(ctx, snapshot)` — creates redis client from snapshot keys
> - `buildRabbitMQ(ctx, snapshot)` — creates rabbitmq connection from snapshot keys
> - `buildObjectStorage(ctx, snapshot)` — creates S3 client from snapshot keys
> - `buildLogger(snapshot)` — creates logger from snapshot keys
> - `buildHTTPPolicy(snapshot)` — extracts HTTP policy from snapshot keys
> (one per detected infrastructure — omit unused components)
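The `keyComponentMap` and `diffChangedComponents` pieces of the factory can be sketched as below. This is an illustration under stated assumptions: `Snapshot` is modeled as a plain map and `KeyDef` as a two-field struct (both local stand-ins for the `domain` types), `redis.host` is a hypothetical key, and keys tagged `"_none"` are assumed never to trigger a rebuild.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// Snapshot and KeyDef are local stand-ins for the domain types.
type Snapshot map[string]any

type KeyDef struct {
	Key       string
	Component string
}

const componentNone = "_none"

// buildKeyComponentMap is computed once, at factory construction.
func buildKeyComponentMap(defs []KeyDef) map[string]string {
	m := make(map[string]string, len(defs))
	for _, d := range defs {
		m[d.Key] = d.Component
	}
	return m
}

// diffChangedComponents returns the sorted set of components that own at
// least one key whose value differs between the two snapshots.
func diffChangedComponents(keyComponent map[string]string, snap, prev Snapshot) []string {
	changed := map[string]struct{}{}
	for key, component := range keyComponent {
		if component == componentNone {
			continue // assumed: no infrastructure to rebuild for these keys
		}
		newVal, inNew := snap[key]
		oldVal, inOld := prev[key]
		if inNew != inOld || !reflect.DeepEqual(newVal, oldVal) {
			changed[component] = struct{}{}
		}
	}
	out := make([]string, 0, len(changed))
	for c := range changed {
		out = append(out, c)
	}
	sort.Strings(out)
	return out
}

func main() {
	km := buildKeyComponentMap([]KeyDef{
		{Key: "postgres.primary_host", Component: "postgres"},
		{Key: "redis.host", Component: "redis"},
		{Key: "rate_limit.max", Component: componentNone},
	})
	prev := Snapshot{"postgres.primary_host": "db1", "redis.host": "cache1", "rate_limit.max": 100}
	snap := Snapshot{"postgres.primary_host": "db1", "redis.host": "cache2", "rate_limit.max": 200}
	fmt.Println(diffChangedComponents(km, snap, prev)) // prints "[redis]"
}
```

`BuildIncremental` would then rebuild only the returned components and carry the rest over from the previous bundle.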
**Completion criteria:** Bundle builds successfully from snapshot, incremental rebuild reuses unchanged components, `go build ./...` passes.
**Verification:** `grep -rn "IncrementalBundleFactory\|BundleFactory\|AdoptResourcesFrom" internal/bootstrap/` + `go build ./...`
---
## Gate 4: Reconcilers (Conditional)
**SKIP IF:** no background workers AND no RabbitMQ AND no HTTP policy changes detected in Gate 0.
**When to include each reconciler:**
- `HTTPPolicyReconciler` — if HTTP policy keys exist (body limit, CORS, timeouts)
- `PublisherReconciler` — if RabbitMQ detected
- `WorkerReconciler` — if background workers detected
**Dispatch `ring:backend-engineer-golang` with context from Gates 1-3:**
> TASK: Create reconcilers for side effects on config changes.
>
> DETECTED: {workers: Y/N, rabbitmq: Y/N, http_policy_keys: Y/N from Gates 0-2}
>
> **IMPORT PATHS (use exactly these):**
> ```go
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/domain"
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/ports"
> ```
>
> **FILES TO CREATE (only for detected components):**
>
> 1. **`systemplane_reconciler_http.go`** — PhaseValidation reconciler for HTTP config:
> - Validates body limit is non-negative
> - Validates CORS origins are present
> - Can REJECT changes (returns error to abort the reload)
> ```go
> func (r *HTTPPolicyReconciler) Phase() domain.ReconcilerPhase { return domain.PhaseValidation }
> ```
>
> 2. **`systemplane_reconciler_publishers.go`** — PhaseValidation for RabbitMQ publisher staging (if RMQ detected):
> - When RabbitMQ connection changed, creates staged publishers on candidate bundle
> - The observer callback swaps them in later
> ```go
> func (r *PublisherReconciler) Phase() domain.ReconcilerPhase { return domain.PhaseValidation }
> ```
>
> 3. **`systemplane_reconciler_worker.go`** — PhaseSideEffect for worker restart (if workers detected):
> - Reads worker config from snapshot
> - Calls `workerManager.ApplyConfig(cfg)` to restart affected workers
> - Runs LAST because it has external side effects
> ```go
> func (r *WorkerReconciler) Phase() domain.ReconcilerPhase { return domain.PhaseSideEffect }
> ```
>
> **Phase ordering is enforced by the type system** — you cannot register a reconciler without declaring its phase. The supervisor stable-sorts by phase, so reconcilers within the same phase retain their registration order.
>
> Phase execution order: PhaseStateSync (0) → PhaseValidation (1) → PhaseSideEffect (2)
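The stable phase sort described above can be sketched as follows. The `Phase` type and `reconciler` struct are local stand-ins for the `domain` and `ports` types, used only to show that a stable sort keeps registration order within a phase.

```go
package main

import (
	"fmt"
	"sort"
)

// Phase mirrors the documented order: StateSync (0), Validation (1), SideEffect (2).
type Phase int

const (
	PhaseStateSync Phase = iota
	PhaseValidation
	PhaseSideEffect
)

type reconciler struct {
	name  string
	phase Phase
}

// sortByPhase returns a copy ordered by phase; reconcilers that share a
// phase keep their registration order because the sort is stable.
func sortByPhase(rs []reconciler) []reconciler {
	out := append([]reconciler(nil), rs...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].phase < out[j].phase })
	return out
}

func main() {
	ordered := sortByPhase([]reconciler{
		{"worker", PhaseSideEffect},
		{"http-policy", PhaseValidation},
		{"publisher", PhaseValidation},
	})
	for _, r := range ordered {
		fmt.Println(r.name) // prints http-policy, publisher, worker (one per line)
	}
}
```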
**Completion criteria:** Reconcilers compile, implement `ports.BundleReconciler` interface, phase ordering is correct. `go build ./...` passes.
**Verification:** `grep "BundleReconciler\|Reconcile\|Phase()" internal/bootstrap/systemplane_reconciler_*.go` + `go build ./...`
---
## Gate 5: Identity + Authorization
**Always executes.** Systemplane HTTP endpoints require identity resolution and permission checking.
**Dispatch `ring:backend-engineer-golang` with context from Gate 1 analysis:**
> TASK: Implement IdentityResolver and Authorizer for systemplane HTTP endpoints.
>
> CONTEXT FROM GATE 1: {Authentication/authorization analysis — how JWT is parsed, where user ID/tenant ID are extracted}
>
> **IMPORT PATHS (use exactly these):**
> ```go
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/domain"
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/ports"
> ```
>
> **FILES TO CREATE:**
>
> 1. **`systemplane_identity.go`** — Implements `ports.IdentityResolver`:
> ```go
> type {Service}IdentityResolver struct{}
>
> func (r *{Service}IdentityResolver) Actor(ctx context.Context) (domain.Actor, error) {
> uid := auth.GetUserID(ctx) // use the service's existing auth context extraction
> if uid == "" { uid = "anonymous" }
> return domain.Actor{ID: uid}, nil
> }
>
> func (r *{Service}IdentityResolver) TenantID(ctx context.Context) (string, error) {
> return auth.GetTenantID(ctx) // use the service's existing tenant extraction
> }
> ```
>
> 2. **`systemplane_authorizer.go`** — Implements `ports.Authorizer`:
> Maps 9 permissions to the service's auth model:
> - `system/configs:read` — view config values
> - `system/configs:write` — update config values
> - `system/configs/schema:read` — view key definitions
> - `system/configs/history:read` — view config change history
> - `system/configs/reload:write` — force full reload
> - `system/settings:read` — view settings
> - `system/settings:write` — update settings
> - `system/settings/schema:read` — view setting key definitions
> - `system/settings/history:read` — view settings history (scope-dependent)
>
> ```go
> type {Service}Authorizer struct {
>     authEnabled bool
> }
>
> func (a *{Service}Authorizer) Authorize(ctx context.Context, permission string) error {
>     if !a.authEnabled { return nil } // dev mode bypass
>     // Map permission to the service's RBAC action, call auth.Authorize()
>     // Deny by default until the mapping above is implemented.
>     return fmt.Errorf("systemplane: permission %q is not mapped", permission)
> }
> ```
>
> **When auth is disabled** (dev mode), authorizer returns nil for all permissions.
> **When auth is enabled**, map systemplane permissions to the service's existing permission model.
**Authorizer enforcement policy (MANDATORY):**
The authorizer MUST perform real permission checking. Two approved patterns:
1. **Granular delegation** (recommended): Use `ports.DelegatingAuthorizer` from lib-commons. Splits permission string and delegates to external auth service.
2. **Admin-only with justification**: Use a binary admin check BUT document the justification in `systemplane-guide.md` and add a code comment explaining why granular is not used.
**PROHIBITED patterns:**
- No-op authorizer (always returns nil) = automatic Gate 5 FAILURE
- Authorizer that ignores the permission string = NON-COMPLIANT
- Authorizer with no auth-enabled check (must handle disabled auth)
**Default implementations from lib-commons:**
- `ports.AllowAllAuthorizer` — for auth-disabled mode
- `ports.DelegatingAuthorizer` — for auth-enabled mode with per-permission delegation
- `ports.FuncIdentityResolver` — adapts context extraction functions to IdentityResolver
**Completion criteria:** Both interfaces implemented, `go build ./...` passes.
**Verification:** `grep "IdentityResolver\|Authorizer\|Actor\|TenantID\|Authorize" internal/bootstrap/systemplane_identity.go internal/bootstrap/systemplane_authorizer.go` + `go build ./...`
---
## Gate 6: Config Manager Bridge
**Always executes.** This gate ensures backward compatibility — existing code reading the Config struct continues to work.
**Dispatch `ring:backend-engineer-golang` with context from Gates 1-2:**
> TASK: Create the Config Manager bridge that hydrates the existing Config struct from systemplane Snapshot.
>
> CONTEXT FROM GATE 1: {Config struct location, all fields, current defaults}
> KEY DEFINITIONS: {All keys from Gate 2 with their snapshot key names}
>
> **FILES TO CREATE:**
>
> 1. **`config_manager_systemplane.go`** — StateSync callback + snapshot hydration:
>
> Two core functions:
> - `configFromSnapshot(snap domain.Snapshot) *Config` — builds a Config entirely from snapshot values. ALL fields come from the snapshot — no bootstrap overlay. This is used by `defaultConfig()`.
> - `snapshotToFullConfig(snap domain.Snapshot, oldCfg *Config) *Config` — hydrates from snapshot, then overlays bootstrap-only fields from the previous config (they never change at runtime).
>
> Helper functions for type-safe extraction with JSON coercion:
> - `snapString(snap, key, fallback) string`
> - `snapInt(snap, key, fallback) int`
> - `snapBool(snap, key, fallback) bool`
> - `snapFloat64(snap, key, fallback) float64`
>
> The StateSync callback (registered on Manager):
> ```go
> StateSync: func(_ context.Context, snap domain.Snapshot) {
> newCfg := snapshotToFullConfig(snap, baseCfg)
> configManager.swapConfig(newCfg)
> },
> ```
>
> 2. **`config_manager_seed.go`** — One-time env → store seed:
> - Reads current env vars and seeds them into systemplane store
> - Only runs on first boot (when store is empty)
> - Preserves existing configuration values during migration
> - Skips bootstrap-only keys (they don't go into the store)
> - Uses `store.Put()` with `domain.RevisionZero` for initial write
>
> 3. **`config_manager_helpers.go`** — Type-safe value comparison:
> - `valuesEquivalent(a, b any) bool` — handles JSON coercion (float64 vs int)
> - Used by seed logic to avoid overwriting store values that match env defaults
>
> 4. **`config_validation.go`** — Production config guards:
> - Validates critical config combinations at startup
> - Used as `ConfigWriteValidator` on the Manager
> - Prevents invalid states (e.g., TLS enabled without cert path)
> - Returns non-nil error to REJECT a write before persistence
>
> **CRITICAL: The `defaultConfig()` function MUST be updated to derive from KeyDefs:**
> ```go
> func defaultConfig() *Config {
> return configFromSnapshot(defaultSnapshotFromKeyDefs({service}KeyDefs()))
> }
> ```
> This eliminates drift between defaults and key definitions. If you change a default in `{service}KeyDefs()`, the Config struct picks it up automatically.
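The JSON coercion that `snapInt` and `valuesEquivalent` must handle can be sketched as follows. `Snapshot` is modeled as a plain map and `toFloat` is a hypothetical helper; the real `domain.Snapshot` accessor may differ, so treat this as an illustration of the coercion rule only.

```go
package main

import (
	"fmt"
	"math"
	"reflect"
)

// Snapshot is a local stand-in for domain.Snapshot.
type Snapshot map[string]any

// toFloat (hypothetical helper) normalizes the numeric shapes a value can
// take: native ints from KeyDef defaults, float64 from JSON decoding.
func toFloat(v any) (float64, bool) {
	switch n := v.(type) {
	case int:
		return float64(n), true
	case int64:
		return float64(n), true
	case float64:
		return n, true
	}
	return 0, false
}

// snapInt returns the key as an int, or fallback when the key is missing,
// not numeric, or not a whole number.
func snapInt(snap Snapshot, key string, fallback int) int {
	f, ok := toFloat(snap[key])
	if !ok || f != math.Trunc(f) || math.IsInf(f, 0) {
		return fallback
	}
	return int(f)
}

// valuesEquivalent treats 5432 and 5432.0 as equal; non-numeric values
// fall back to a deep comparison.
func valuesEquivalent(a, b any) bool {
	fa, aIsNum := toFloat(a)
	fb, bIsNum := toFloat(b)
	if aIsNum && bIsNum {
		return fa == fb
	}
	return reflect.DeepEqual(a, b)
}

func main() {
	snap := Snapshot{"rate_limit.max": float64(200)} // as decoded from JSON
	fmt.Println(snapInt(snap, "rate_limit.max", 100))  // prints "200"
	fmt.Println(snapInt(snap, "missing.key", 100))     // prints "100"
	fmt.Println(valuesEquivalent(5432, float64(5432))) // prints "true"
}
```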
**Completion criteria:** Config struct hydrates correctly from snapshot, existing code reading `configManager.Get()` gets updated values after StateSync, `go build ./...` passes.
**Verification:** `grep -rn "configFromSnapshot\|snapshotToFullConfig\|StateSync\|seedStore" internal/bootstrap/` + `go build ./...`
---
## Gate 7: Wiring + HTTP Mount + Swagger + ChangeFeed ⛔ CRITICAL
<cannot_skip>
**This gate is NEVER skippable. It is the most critical gate.**
This is where the systemplane comes alive. Without Gate 7, keys are metadata, bundles are dead code, and reconcilers never fire. Gate 7 wires everything together: the 11-step init sequence, HTTP endpoint mount, Swagger merge, ChangeFeed subscription, active bundle state, and shutdown integration.
**⛔ HARD GATE: Gate 7 MUST be fully implemented. There is no "partial" Gate 7. Every subcomponent below is required.**
</cannot_skip>
**Dispatch `ring:backend-engineer-golang` with context from ALL previous gates:**
> TASK: Wire the complete systemplane lifecycle — init, HTTP mount, swagger merge, change feed, bundle state, and shutdown.
>
> ALL PREVIOUS GATES: {Key definitions from Gate 2, Bundle/Factory from Gate 3, Reconcilers from Gate 4, Identity/Auth from Gate 5, Config Bridge from Gate 6}
>
> **IMPORT PATHS (use exactly these):**
> ```go
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/domain"
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/ports"
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/registry"
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/service"
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/bootstrap"
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/bootstrap/builtin"
> "github.com/LerianStudio/lib-commons/v4/commons/systemplane/adapters/changefeed"
> fiberhttp "github.com/LerianStudio/lib-commons/v4/commons/systemplane/adapters/http/fiber"
> systemswagger "github.com/LerianStudio/lib-commons/v4/commons/systemplane/swagger"
> ```
>
> **FILES TO CREATE:**
>
> ### 1. `systemplane_init.go` — The 11-step initialization sequence
>
> ```go
> func Init{Service}Systemplane(ctx context.Context, cfg *Config, configManager *ConfigManager,
> workerManager *WorkerManager, logger log.Logger,
> observer func(service.ReloadEvent)) (*SystemplaneComponents, error) {
>
> // Step 1: Extract bootstrap-only config (immutable for process lifetime)
> bootstrapCfg := ExtractBootstrapOnlyConfig(cfg)
>
> // Step 2: Load backend config (default: reuse app's Postgres DSN)
> backendCfg := Load{Service}BackendConfig(cfg)
>
> // Step 3: Create registry + register ALL keys
> reg := registry.New()
> if err := Register{Service}Keys(reg); err != nil {
> return nil, err
> }
>
> // Step 4: Configure backend with registry metadata (secret keys + apply behaviors)
> configureBackendWithRegistry(backendCfg, reg)
>
> // Step 5: Create backend (Store + History + ChangeFeed)
> backend, err := builtin.NewBackendFromConfig(ctx, backendCfg)
> if err != nil { return nil, fmt.Errorf("systemplane backend: %w", err) }
>
> // Step 6: Create snapshot builder
> snapBuilder, err := service.NewSnapshotBuilder(reg, backend.Store)
> if err != nil { backend.Close(); return nil, err }
>
> // Step 7: Create bundle factory (supports incremental builds)
> bundleFactory := New{Service}BundleFactory(&bootstrapCfg)
>
> // Step 8: Seed store from current env-var config (first boot only)
> if err := seedStoreForInitialReload(ctx, configManager, reg, backend.Store); err != nil {
> backend.Close(); return nil, err
> }
>
> // Step 9: Build reconcilers (phase-sorted by supervisor)
> reconcilers := buildReconcilers(workerManager)
>
> // Step 10: Create supervisor + initial reload
> supervisor, err := service.NewSupervisor(service.SupervisorConfig{
> Builder: snapBuilder,
> Factory: bundleFactory,
> Reconcilers: reconcilers,
> Observer: observer,
> })
> if err != nil { backend.Close(); return nil, err }
>
> if err := supervisor.Reload(ctx, "initial-bootstrap"); err != nil {
> backend.Close(); return nil, err
> }
>
> // Step 11: Create manager with callbacks
> baseCfg := configManager.Get()
> manager, err := service.NewManager(service.ManagerConfig{
> Registry: reg,
> Store: backend.Store,
> History: backend.History,
> Supervisor: supervisor,
> Builder: snapBuilder,
> ConfigWriteValidator: productionConfigGuards(baseCfg),
> StateSync: func(_ context.Context, snap domain.Snapshot) {
> newCfg := snapshotToFullConfig(snap, baseCfg)
> configManager.swapConfig(newCfg)
> },
> })
> if err != nil { backend.Close(); return nil, err }
>
> return &SystemplaneComponents{
> ChangeFeed: backend.ChangeFeed,
> Supervisor: supervisor,
> Manager: manager,
> Backend: backend.Closer,
> }, nil
> }
> ```
>
> ### 2. `systemplane_mount.go` — HTTP route registration
>
> ```go
> func MountSystemplaneAPI(app *fiber.App, manager service.Manager,
> authEnabled bool) error {
>
> authorizer := &{Service}Authorizer{authEnabled: authEnabled}
> identity := &{Service}IdentityResolver{}
>
> handler, err := fiberhttp.NewHandler(manager, identity, authorizer)
> if err != nil {
> return fmt.Errorf("create systemplane handler: %w", err)
> }
>
> handler.Mount(app) // registers all 9 /v1/system/* routes
> return nil
> }
> ```
>
> This registers ALL 9 endpoints:
> - `GET /v1/system/configs`
> - `PATCH /v1/system/configs`
> - `GET /v1/system/configs/schema`
> - `GET /v1/system/configs/history`
> - `POST /v1/system/configs/reload`
> - `GET /v1/system/settings`
> - `PATCH /v1/system/settings`
> - `GET /v1/system/settings/schema`
> - `GET /v1/system/settings/history`
>
> ### 3. Swagger Integration
>
> In the app's swagger setup (wherever the swagger middleware is configured):
> ```go
> import systemswagger "github.com/LerianStudio/lib-commons/v4/commons/systemplane/swagger"
>
> // After swag init generates the base spec:
> merged, err := systemswagger.MergeInto(baseSwaggerSpec)
> if err != nil {
> return fmt.Errorf("merge systemplane swagger: %w", err)
> }
> // Use merged spec for swagger UI handler
> ```
>
> ### 4. `active_bundle_state.go` — Thread-safe live-read accessor
>
> ```go
> type activeBundleState struct {
>     mu     sync.RWMutex
>     bundle *{Service}Bundle
> }
>
> func (s *activeBundleState) Current() *{Service}Bundle {
>     s.mu.RLock()
>     defer s.mu.RUnlock()
>     return s.bundle
> }
>
> func (s *activeBundleState) Update(b *{Service}Bundle) {
>     s.mu.Lock()
>     defer s.mu.Unlock()
>     s.bundle = b
> }
> ```
>
> Used by infrastructure consumers for live-read access to the current bundle.
>
> ### 5. ChangeFeed integration (in init.go wiring)
>
> ```go
> debouncedFeed := changefeed.NewDebouncedFeed(
> spComponents.ChangeFeed,
> changefeed.WithWindow(200 * time.Millisecond),
> )
>
> feedCtx, cancelFeed := context.WithCancel(ctx)
> go func() {
> _ = debouncedFeed.Subscribe(feedCtx, func(signal ports.ChangeSignal) {
> _ = spComponents.Manager.ApplyChangeSignal(feedCtx, signal)
> })
> }()
> ```
>
> ### 6. Shutdown sequence integration
>
> ```go
> func (s *Service) Stop(ctx context.Context) {
> s.configManager.Stop() // 1. Prevent mutations
> s.cancelChangeFeed() // 2. Stop change feed BEFORE supervisor
> s.spComponents.Supervisor.Stop(ctx) // 3. Stop supervisor + close bundle
> s.spComponents.Backend.Close() // 4. Close store connection
> s.workerManager.Stop() // 5. Stop workers
> }
> ```
>
> ### 7. Observer callback (passed to `Init{Service}Systemplane`)
>
> ```go
> runtimeReloadObserver := func(event service.ReloadEvent) {
>     bundle, ok := event.Bundle.(*{Service}Bundle)
>     if !ok || bundle == nil {
>         return // unexpected bundle type: keep current state instead of panicking
>     }
>     bundleState.Update(bundle)
>     configManager.UpdateFromSystemplane(event.Snapshot)
>     // Swap logger, publishers, etc.
> }
> ```
>
> ### 8. Bootstrap config file — `config/.config-map.example`
>
> Lists all bootstrap-only keys with their env var names. Operator reference for deployment configuration.
>
> ```
> # {Service} — Bootstrap-Only Configuration (requires restart)
> #
> # These are the ONLY settings that require a container/pod restart.
> # Everything else is hot-reloadable via:
> #
> # GET /v1/system/configs — view current runtime config
> # PATCH /v1/system/configs — change any runtime-managed key
> # GET /v1/system/configs/schema — see all keys, types, and mutability
> # GET /v1/system/configs/history — audit trail of changes
>
> ENV_NAME=development
> SERVER_ADDRESS=:8080
> AUTH_ENABLED=false
> ENABLE_TELEMETRY=false
> # ... all bootstrap-only keys
> ```
**Completion criteria:**
- All 9 HTTP routes accessible
- ChangeFeed fires on store changes
- Supervisor rebuilds bundle on config change
- Swagger UI shows systemplane routes (via MergeInto)
- Graceful shutdown closes all resources in correct order
- Active bundle state provides live-read access to request handlers
- Observer callback updates bundle state and config manager on reload
**Verification:**
```bash
grep -rn "handler.Mount\|fiberhttp.NewHandler" internal/bootstrap/
grep -rn "DebouncedFeed\|Subscribe\|ApplyChangeSignal" internal/bootstrap/
grep -rn "swagger.MergeInto\|systemswagger" internal/
# Verify active bundle state exists for thread-safe live-read
grep -rn "activeBundleState\|bundleState.Update\|bundleState.Current" internal/bootstrap/ --include='*.go'
grep -rn "cancelChangeFeed\|Supervisor.Stop\|Backend.Close" internal/bootstrap/
go build ./...
```
### ⛔ Gate 7 Anti-Rationalization Table
| Rationalization | WRONG BECAUSE | REQUIRED ACTION |
|-----------------|---------------|-----------------|
| "Mount is straightforward, I'll skip it" | Without Mount, 9 endpoints are unreachable — the entire management API is dead | MUST implement `handler.Mount(app)` |
| "Swagger is optional documentation" | Without swagger merge, operators cannot discover systemplane endpoints in Swagger UI | MUST call `systemswagger.MergeInto()` |
| "ChangeFeed can be added later" | Without ChangeFeed, hot-reload is broken — config changes in DB are invisible to the service | MUST start `DebouncedFeed` |
| "Shutdown is handled by the framework" | Supervisor, ChangeFeed, and Bundle hold resources that leak without explicit shutdown | MUST integrate 5-step shutdown sequence |
| "Active bundle state is internal detail" | Without thread-safe accessor, request handlers cannot read live config — race conditions | MUST implement `active_bundle_state.go` |
| "Init function is just boilerplate" | The 11-step sequence has strict ordering — wrong order causes nil panics or stale data | MUST follow exact step order |
| "Observer callback is optional" | Without observer, bundle state and config manager are never updated after reload | MUST register observer on Supervisor |
| "Config map example is nice-to-have" | Operators need to know which keys require restart vs hot-reload — without it, they restart for everything | MUST create `config/.config-map.example` |
---
## Gate 8: Code Review
**Dispatch 10 parallel reviewers (same pattern as ring:codereview).**
MUST include this context in ALL 10 reviewer dispatches:
> **SYSTEMPLANE REVIEW CONTEXT:**
> - Systemplane is a three-tier configuration system: bootstrap (env, immutable) → runtime (DB, hot-reload) → live-read (snapshot, per-request).
> - Every key has an ApplyBehavior that determines how changes propagate: BootstrapOnly → BundleRebuildAndReconcile → BundleRebuild → WorkerReconcile → LiveRead.
> - IncrementalBundleFactory rebuilds only changed components. Ownership tracking prevents double-free.
> - StateSync callback keeps the Config struct in sync for backward compatibility.
> - The 9 HTTP endpoints on /v1/system/* expose config management with If-Match/ETag concurrency.
> - ChangeFeed (pg_notify or change_stream) triggers automatic reload on database changes.
> - swagger.MergeInto() adds systemplane routes to the app's Swagger spec.
| Reviewer | Systemplane-Specific Focus |
|----------|---------------------------|
| ring:code-reviewer | Architecture, lib-commons v4 systemplane usage, three-tier separation, package boundaries, import paths |
| ring:business-logic-reviewer | Key classification correctness, ApplyBehavior assignments, component granularity, default values match current behavior |
| ring:security-reviewer | Secret key redaction (RedactFull/RedactMask), authorization model, no credential leaks in config API responses, AES-256-GCM encryption |
| ring:test-reviewer | Key registration tests, bundle factory tests, reconciler tests, contract tests, config hydration tests |
| ring:nil-safety-reviewer | Nil risks in snapshot reads, bundle adoption, reconciler error paths, active bundle state before first reload |
| ring:consequences-reviewer | Impact on existing config reads, backward compat via StateSync, degradation when store unavailable, shutdown resource cleanup |
| ring:dead-code-reviewer | Orphaned env-reading code, dead config helpers replaced by systemplane, unused YAML/viper imports, stale .env files |
| ring:performance-reviewer | Hot-path allocations in config reads, caching efficiency, systemplane polling overhead, connection pool sizing |
| ring:multi-tenant-reviewer | Systemplane key scoping across tenants, tenant isolation in config snapshots, cross-tenant config leakage prevention |
| ring:lib-commons-reviewer | lib-commons v4 systemplane package adoption, canonical patterns vs custom reimplementations, shared helper reuse |
**⛔ MANDATORY:** All 10 reviewers must PASS. 9/10 = FAIL. Critical findings → fix and re-review.
---
## Gate 9: User Validation
**The user MUST explicitly approve the migration.** Present the checklist below and wait for explicit user approval.
```markdown
## Systemplane Migration Complete
- [ ] All keys registered and classified (count matches preview: {N} keys)
- [ ] BundleFactory builds successfully with IncrementalBundleFactory
- [ ] Incremental rebuild reuses unchanged components (ownership tracking works)
- [ ] Reconcilers fire on config changes (if applicable)
- [ ] Identity + Authorization wired (JWT → Actor, 9 permissions mapped)
- [ ] Config Manager bridge works (StateSync hydrates Config struct from Snapshot)
- [ ] `defaultConfig()` derives from KeyDefs (single source of truth)
- [ ] HTTP Mount: all 9 /v1/system/* endpoints accessible
- [ ] Swagger: systemplane routes visible in Swagger UI (via MergeInto)
- [ ] ChangeFeed: config changes in DB trigger supervisor reload
- [ ] Active bundle state: thread-safe live-read for request handlers
- [ ] Shutdown: clean 5-step resource release on SIGTERM
- [ ] Backward compat: service starts normally without systemplane store (degrades to defaults)
- [ ] Tests pass: `go test ./...`
- [ ] Code review passed: all 10 reviewers PASS
```
---
## Gate 10: Activation Guide
**MUST generate `docs/systemplane-guide.md` in the project root.** Direct, concise, no filler text.
The guide is built from Gate 0 (stack detection), Gate 1 (analysis), and Gate 2 (key inventory).
The guide MUST include:
### 1. Components Table
| Component | Purpose | Status |
|-----------|---------|--------|
| Registry | Key definitions (types, defaults, validators) | {N} keys registered |
| Supervisor | Reload lifecycle (snapshot → bundle → reconcile → swap) | Active |
| Manager | HTTP API backend (reads, writes, schema, history, resync) | Active |
| BundleFactory | IncrementalBundleFactory with component-granular rebuilds | Active |
| Reconcilers | Side-effect appliers (HTTP, Publisher, Worker) | {N} reconcilers |
| ChangeFeed | Database change → DebouncedFeed → Supervisor.Reload | Active |
### 2. Bootstrap-Only Config Reference
| Env Var | Default | Description | Requires Restart |
|---------|---------|-------------|-----------------|
| `ENV_NAME` | `development` | Environment name | Yes |
| `SERVER_ADDRESS` | `:8080` | Listen address | Yes |
| `AUTH_ENABLED` | `false` | Enable auth middleware | Yes |
| ... | ... | ... | ... |
### 3. Runtime-Managed Config Reference
| Key | Default | Type | ApplyBehavior | Hot-Reloadable |
|-----|---------|------|---------------|---------------|
| `rate_limit.max` | `100` | int | LiveRead | Yes (instant) |
| `postgres.primary_host` | `localhost` | string | BundleRebuild | Yes (rebuilds PG client) |
| ... | ... | ... | ... | ... |
### 4. Activation Steps
1. Ensure PostgreSQL (or MongoDB) is running
2. Set `SYSTEMPLANE_SECRET_MASTER_KEY` env var (32 bytes, raw or base64)
3. Start the service — systemplane auto-creates tables/collections
4. On first boot, env var values are seeded into the store
5. Verify: `curl http://localhost:{port}/v1/system/configs | jq`
### 5. Verification Commands
```bash
# View current runtime config
curl -s http://localhost:{port}/v1/system/configs | jq
# View schema (all keys, types, mutability)
curl -s http://localhost:{port}/v1/system/configs/schema | jq
# Change a runtime key
curl -X PATCH http://localhost:{port}/v1/system/configs \
  -H 'Content-Type: application/json' \
  -H 'If-Match: "current-revision"' \
  -d '{"values": {"rate_limit.max": 200}}'
# View change history
curl -s http://localhost:{port}/v1/system/configs/history | jq
# Force full reload
curl -X POST http://localhost:{port}/v1/system/configs/reload
# View settings
curl -s 'http://localhost:{port}/v1/system/settings?scope=global' | jq
```
### 6. Degradation Behavior
| Scenario | Behavior |
|----------|----------|
| Systemplane backend unavailable at startup | Service starts with env var defaults (no hot-reload) |
| Systemplane backend goes down after startup | Last-known config remains active (no updates until reconnect) |
| ChangeFeed disconnects | Auto-reconnect with exponential backoff; manual `POST /v1/system/configs/reload` available |
| Invalid config write | ConfigWriteValidator rejects before persistence; no impact on running config |
| Bundle build failure | Previous bundle stays active; error logged; no disruption |
### 7. Common Errors and Solutions
| Error | Cause | Solution |
|-------|-------|----------|
| `systemplane backend: connection refused` | PostgreSQL/MongoDB not running | Start the database |
| `register key "X": duplicate key` | Key registered twice | Check for duplicate key names in key definition files |
| `revision mismatch` (409) | Concurrent write conflict | Re-read current revision, retry PATCH with updated `If-Match` |
| `key is not mutable at runtime` (400) | Tried to PATCH a bootstrap-only key | This key requires a restart — update env var instead |
| `secret master key required` | `SYSTEMPLANE_SECRET_MASTER_KEY` not set | Set the env var (32 bytes) |
---
## State Persistence
Save to `docs/dev-systemplane-migration/current-cycle.json` for resume support:
```json
{
  "skill": "ring:dev-systemplane-migration",
  "version": "2.0.0",
  "service": "{service_name}",
  "started_at": "ISO8601",
  "current_gate": "2",
  "gates_completed": ["0", "1", "1.5"],
  "gates_skipped": [],
  "detection": {
    "backend": "postgres",
    "dependencies": ["postgres", "redis", "rabbitmq"],
    "has_workers": true,
    "existing_systemplane": false,
    "config_pattern": "envDefault",
    "key_count": 45
  },
  "compliance": {
    "status": "NEW",
    "checks": {}
  }
}
```
---
## Anti-Rationalization Table
| Rationalization | WRONG BECAUSE | REQUIRED ACTION |
|-----------------|---------------|-----------------|
| "Keys are registered, migration is done" | Without BundleFactory + Wiring, keys are just metadata — nothing reads them at runtime | Complete ALL gates through Gate 7 |
| "We can wire it later" | Gate 7 is where the system comes alive — without it, systemplane is dead code | Gate 7 is NEVER skippable |
| "Swagger integration is cosmetic" | Operators need API discoverability; without it, 9 endpoints are invisible | MUST call `systemswagger.MergeInto()` |
| "Config bridge is backward compat only" | Existing code reads Config struct — without StateSync, it reads stale bootstrap values forever | MUST implement Gate 6 |
| "Reconcilers are optional" | Without reconcilers, workers and HTTP policies ignore config changes — hot-reload is partial | Evaluate at Gate 4; implement if infra detected |
| "Service already has systemplane code" | Existence ≠ compliance. Code that doesn't follow canonical patterns is WRONG and must be fixed. | STOP. Run compliance audit (Gate 0 Phase 2). Fix every NON-COMPLIANT component. |
| "Partial systemplane is good enough" | Partial = broken. Without ChangeFeed, no hot-reload. Without Mount, no API. Without StateSync, stale config. | STOP. Every component is required. Execute all gates. |
| "The current approach works fine" | Working ≠ compliant. Custom config management creates drift and blocks standardization. | Replace with canonical systemplane implementation. |
| "Skip code review, we tested it" | One wiring mistake = resource leak, config corruption, or silent degradation. 10 reviewers catch what tests miss. | MANDATORY: All 10 reviewers must PASS. |
| "ChangeFeed can be added later" | Later = never. Without ChangeFeed, the database is a dead store — changes are invisible. | MUST start DebouncedFeed in Gate 7 |
| "Agent says out of scope" | Skill defines scope, not agent. | Re-dispatch with gate context |
| "Active bundle state is an implementation detail" | Without it, request handlers cannot safely read live config. Race conditions and nil panics. | MUST implement in Gate 7 |
| "Key names don't matter, it's just a string" | Different names for the same config break cross-product dashboards, Helm overlays, and operator muscle memory. Names are the API. | MUST match canonical catalog names |
| "Our tier classification is correct for our use case" | Products don't get to choose tiers independently. PG pool tuning is LiveRead everywhere or nowhere — mixed tiers break operator expectations | MUST match canonical tier classification |
| "Admin-only auth is fine for now" | It's been "for now" across 3 products and 2 years. Granular permissions exist in lib-commons but are unused. | MUST implement granular or document justification |
---
## Appendix A: Package Structure Reference
The systemplane lives in `lib-commons/v4/commons/systemplane/` — a self-contained,
backend-agnostic library with **zero imports of internal application packages**.
It was extracted from Matcher's `pkg/systemplane/` into lib-commons v4.3.0 to be
shared across all Lerian services.
### Directory Structure
```
commons/systemplane/                  # in lib-commons repo
├── doc.go                            # Package doc
├── domain/                           # Pure value objects, no infra deps
│   ├── actor.go                      # Actor{ID}
│   ├── apply_behavior.go             # ApplyBehavior enum + Strength()
│   ├── backend_kind.go               # BackendKind enum (postgres, mongodb)
│   ├── bundle.go                     # RuntimeBundle interface
│   ├── entry.go                      # Entry (persisted override record)
│   ├── errors.go                     # 14 sentinel errors
│   ├── key_def.go                    # KeyDef, ValueType, RedactPolicy, ValidatorFunc, ComponentNone
│   ├── kind.go                       # Kind enum (config, setting)
│   ├── nil_value.go                  # IsNilValue(): reflect-based typed-nil detection
│   ├── reconciler_phase.go           # ReconcilerPhase enum (3 phases)
│   ├── revision.go                   # Revision type (uint64 wrapper)
│   ├── scope.go                      # Scope enum (global, tenant)
│   ├── snapshot.go                   # Snapshot + EffectiveValue
│   └── target.go                     # Target (kind/scope/subject coordinate)
├── ports/                            # Interface definitions (hex boundary)
│   ├── authorizer.go                 # Authorizer
│   ├── bundle_factory.go             # BundleFactory + IncrementalBundleFactory
│   ├── changefeed.go                 # ChangeFeed + ChangeSignal
│   ├── history.go                    # HistoryStore + HistoryEntry + HistoryFilter
│   ├── identity.go                   # IdentityResolver
│   ├── reconciler.go                 # BundleReconciler (phased)
│   └── store.go                      # Store + WriteOp + ReadResult
├── registry/                         # Thread-safe key definition registry
│   ├── registry.go                   # Registry interface + inMemoryRegistry
│   └── validation.go                 # Type validation with JSON coercion
├── service/                          # Use-case orchestration
│   ├── escalation.go                 # Escalate(): strongest ApplyBehavior
│   ├── manager.go                    # Manager interface + ManagerConfig
│   ├── manager_helpers.go            # Redaction, schema building
│   ├── manager_reads.go              # GetConfigs, GetSettings, GetSchema, GetHistory, Resync
│   ├── manager_writes.go             # PatchConfigs, PatchSettings, ApplyChangeSignal
│   ├── manager_writes_helpers.go     # Validation, escalation, snapshot preview
│   ├── snapshot_builder.go           # 3-layer cascade: default → global → tenant
│   ├── supervisor.go                 # Supervisor interface + Reload lifecycle
│   └── supervisor_helpers.go         # Build strategy, phase sort, resource adoption
├── bootstrap/                        # Backend wiring from env/config
│   ├── backend.go                    # BackendFactory registry + NewBackendFromConfig
│   ├── classifier.go                 # IsBootstrapOnly / IsRuntimeManaged
│   ├── config.go                     # BootstrapConfig, PostgresBootstrapConfig, MongoBootstrapConfig
│   ├── defaults.go                   # Default table/collection names
│   ├── env.go                        # LoadFromEnv(): SYSTEMPLANE_* env vars
│   ├── postgres_identifiers.go       # Regex validation for PG identifiers
│   └── builtin/
│       └── backend.go                # init()-registered factories (PG + Mongo)
├── adapters/
│   ├── store/
│   │   ├── secretcodec/
│   │   │   └── codec.go              # AES-256-GCM encryption for secret values
│   │   ├── storetest/
│   │   │   └── contract.go           # 15 shared contract tests (Store + History)
│   │   ├── postgres/                 # 3 tables, optimistic concurrency, pg_notify
│   │   │   ├── ddl.go                # CREATE TABLE templates
│   │   │   ├── postgres.go           # New(), NewFromDB() constructors
│   │   │   ├── store.go              # Get(), Put()
│   │   │   ├── store_mutation.go     # applyOp, upsert, delete, insertHistory
│   │   │   ├── store_revision.go     # lock/read/update revision
│   │   │   ├── store_runtime.go      # encrypt/decrypt, notify, escalate
│   │   │   ├── json_decode.go        # Integer-preserving JSON decoder
│   │   │   ├── history.go            # ListHistory with dynamic builder
│   │   │   └── identifiers.go        # qualify() helper
│   │   └── mongodb/                  # Sentinel revision doc, multi-doc txns
│   │       ├── mongodb.go            # New(), ensureIndexes, txn support check
│   │       ├── store.go              # Get(), Put() with sessions
│   │       ├── store_helpers.go      # applyOperation, encrypt, CRUD
│   │       ├── models.go             # BSON models + normalization
│   │       └── history.go            # ListHistory with BSON filter
│   ├── changefeed/
│   │   ├── debounce.go               # Trailing-edge per-target debounce
│   │   ├── debounce_helpers.go       # Timer management, jitter
│   │   ├── safe_handler.go           # Panic-to-error conversion
│   │   ├── feedtest/
│   │   │   └── contract.go           # 3 shared contract tests for ChangeFeed
│   │   ├── postgres/                 # LISTEN/NOTIFY, auto-reconnect, revision resync
│   │   │   ├── feed.go               # Feed struct + Subscribe
│   │   │   ├── feed_subscribe.go     # subscribeLoop, reconnect, listenLoop
│   │   │   ├── feed_runtime.go       # backoff, jitter, revision validation
│   │   │   └── options.go            # WithReconnectBounds, WithRevisionSource
│   │   └── mongodb/                  # change_stream or poll mode
│   │       ├── feed.go               # Feed struct + Subscribe
│   │       ├── feed_stream.go        # subscribeChangeStream
│   │       ├── feed_poll.go          # subscribePoll, pollRevisions
│   │       └── feed_bson.go          # BSON helpers
│   └── http/
│       └── fiber/                    # 9 endpoints, DTOs, middleware
│           ├── handler.go            # Handler struct, NewHandler, Mount()
│           ├── handler_configs.go    # Config CRUD + reload
│           ├── handler_settings.go   # Settings CRUD
│           ├── dto.go                # All request/response DTOs
│           ├── middleware.go         # requireAuth, settingsAuth
│           └── errors.go             # Domain error → HTTP status mapping
├── swagger/                          # Embedded OpenAPI spec for merge
│   └── ...                           # swagger.MergeInto(baseSpec) ([]byte, error)
└── testutil/                         # Test doubles
    ├── fake_store.go                 # In-memory Store with concurrency
    ├── fake_history.go               # In-memory HistoryStore
    ├── fake_bundle.go                # FakeBundle + FakeBundleFactory
    ├── fake_reconciler.go            # Records all calls, configurable phase
    └── fake_incremental_bundle.go    # FakeIncrementalBundleFactory
```
---
## Appendix B: Domain Model Reference
Pure value objects. No infrastructure dependencies.
| Type | Purpose | Key Fields |
|------|---------|------------|
| `Entry` | Persisted config override | `Kind`, `Scope`, `Subject`, `Key`, `Value any`, `Revision`, `UpdatedAt`, `UpdatedBy`, `Source` |
| `Kind` | Config vs Setting | `KindConfig` (`"config"`) or `KindSetting` (`"setting"`) |
| `Scope` | Visibility | `ScopeGlobal` (`"global"`) or `ScopeTenant` (`"tenant"`) |
| `Target` | Coordinate for a group of entries | `Kind` + `Scope` + `SubjectID`; constructor `NewTarget()` validates |
| `Revision` | Monotonic version counter | `uint64`; `RevisionZero = 0`; methods: `Next()`, `Uint64()` |
| `Actor` | Who made the change | `ID string` |
| `KeyDef` | Registry metadata per key | `Key`, `Kind`, `AllowedScopes`, `DefaultValue`, `ValueType`, `Validator ValidatorFunc`, `Secret`, `RedactPolicy`, `ApplyBehavior`, `MutableAtRuntime`, `Description`, `Group`, `Component` |
| `ReconcilerPhase` | Reconciler execution ordering | `PhaseStateSync` (0), `PhaseValidation` (1), `PhaseSideEffect` (2) |
| `Snapshot` | Immutable point-in-time view | `Configs`, `GlobalSettings`, `TenantSettings map[string]map[string]EffectiveValue`, `Revision`, `BuiltAt` |
| `EffectiveValue` | Resolved value with override info | `Key`, `Value`, `Default`, `Override`, `Source`, `Revision`, `Redacted` |
| `RuntimeBundle` | App-defined resource container | Interface: `Close(ctx) error` |
| `ApplyBehavior` | How changes propagate | See [Appendix F](#appendix-f-apply-behavior-taxonomy); has `Strength() int` (0-4 scale) |
| `ValueType` | Type constraint | `"string"`, `"int"`, `"bool"`, `"float"`, `"object"`, `"array"` |
| `RedactPolicy` | Secret handling | `RedactNone`, `RedactFull`, `RedactMask` |
| `BackendKind` | Storage backend | `BackendPostgres` or `BackendMongoDB` |
| `ValidatorFunc` | Custom value validation | `func(value any) error` |
| `ComponentNone` | Sentinel for no-rebuild keys | `"_none"` — for pure business-logic keys |
**Snapshot accessor methods** (nil-safe):
- `GetConfig(key)`, `GetGlobalSetting(key)`, `GetTenantSetting(tenantID, key)` → `(EffectiveValue, bool)`
- `ConfigValue(key, fallback)`, `GlobalSettingValue(key, fallback)`, `TenantSettingValue(tenantID, key, fallback)` → `any`
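The nil-safe accessor contract can be illustrated with a minimal sketch. The `Snapshot` and `EffectiveValue` types here are simplified local stand-ins, not the lib-commons types, and the pointer receiver and the "nil value falls back" rule are assumptions made for the example.

```go
package main

import "fmt"

// EffectiveValue and Snapshot are simplified stand-ins for illustration only.
type EffectiveValue struct {
	Key   string
	Value any
}

type Snapshot struct {
	Configs map[string]EffectiveValue
}

// GetConfig reports the value and whether the key exists.
// It is safe to call on a nil *Snapshot.
func (s *Snapshot) GetConfig(key string) (EffectiveValue, bool) {
	if s == nil {
		return EffectiveValue{}, false
	}
	ev, ok := s.Configs[key] // indexing a nil map is safe in Go
	return ev, ok
}

// ConfigValue returns the stored value, or fallback when the key is
// absent or its value is nil (assumed behavior).
func (s *Snapshot) ConfigValue(key string, fallback any) any {
	ev, ok := s.GetConfig(key)
	if !ok || ev.Value == nil {
		return fallback
	}
	return ev.Value
}

func main() {
	var empty *Snapshot
	fmt.Println(empty.ConfigValue("rate_limit.max", 100)) // 100

	snap := &Snapshot{Configs: map[string]EffectiveValue{
		"rate_limit.max": {Key: "rate_limit.max", Value: 200},
	}}
	fmt.Println(snap.ConfigValue("rate_limit.max", 100)) // 200
}
```

Request handlers can therefore read a key without first checking whether a snapshot has been published.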
**Sentinel errors** (14):
```go
var (
	ErrKeyUnknown          = errors.New("unknown configuration key")
	ErrValueInvalid        = errors.New("invalid configuration value")
	ErrRevisionMismatch    = errors.New("revision mismatch")
	ErrScopeInvalid        = errors.New("scope not allowed for this key")
	ErrPermissionDenied    = errors.New("permission denied")
	ErrReloadFailed        = errors.New("configuration reload failed")
	ErrKeyNotMutable       = errors.New("key is not mutable at runtime")
	ErrSnapshotBuildFailed = errors.New("snapshot build failed")
	ErrBundleBuildFailed   = errors.New("runtime bundle build failed")
	ErrBundleSwapFailed    = errors.New("runtime bundle swap failed")
	ErrReconcileFailed     = errors.New("bundle reconciliation failed")
	ErrNoCurrentBundle     = errors.New("no current runtime bundle")
	ErrSupervisorStopped   = errors.New("supervisor has been stopped")
	ErrRegistryRequired    = errors.New("registry is required")
)
```
Per-enum parse errors: `ErrInvalidKind`, `ErrInvalidBackendKind`, `ErrInvalidScope`,
`ErrInvalidApplyBehavior`, `ErrInvalidValueType`, `ErrInvalidReconcilerPhase`.
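Because these are sentinel values, callers can match them through wrapping with `errors.Is`. The sketch below redeclares three of the sentinels locally so it runs on its own. The 409 and 400 mappings follow the error table in the activation guide; the 403 mapping and the `statusFor` helper are assumptions, not the library's actual HTTP mapping.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Local copies of three sentinels so the example is self-contained.
var (
	ErrRevisionMismatch = errors.New("revision mismatch")
	ErrKeyNotMutable    = errors.New("key is not mutable at runtime")
	ErrPermissionDenied = errors.New("permission denied")
)

// statusFor is a hypothetical helper that maps a (possibly wrapped)
// domain error to an HTTP status code.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrRevisionMismatch):
		return http.StatusConflict // 409: re-read revision and retry
	case errors.Is(err, ErrKeyNotMutable):
		return http.StatusBadRequest // 400: bootstrap-only key
	case errors.Is(err, ErrPermissionDenied):
		return http.StatusForbidden // assumption: not stated in this document
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	wrapped := fmt.Errorf("patch configs: %w", ErrRevisionMismatch)
	fmt.Println(statusFor(wrapped)) // 409
}
```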
---
## Appendix C: Ports Interface Reference
Eight interfaces, spread across the seven files in `ports/`, define all external dependencies:
```go
// Persistence: read/write config entries
type Store interface {
	Get(ctx context.Context, target domain.Target) (ReadResult, error)
	Put(ctx context.Context, target domain.Target, ops []WriteOp,
		expected domain.Revision, actor domain.Actor, source string) (domain.Revision, error)
}

type WriteOp struct { Key string; Value any; Reset bool }
type ReadResult struct { Entries []domain.Entry; Revision domain.Revision }

// Audit trail: change history
type HistoryStore interface {
	ListHistory(ctx context.Context, filter HistoryFilter) ([]HistoryEntry, error)
}

type HistoryEntry struct {
	Revision  domain.Revision
	Key       string
	Scope     string
	SubjectID string
	OldValue  any
	NewValue  any
	ActorID   string
	ChangedAt time.Time
}

type HistoryFilter struct { Kind, Scope, SubjectID, Key string; Limit, Offset int }

// Real-time change notifications
type ChangeFeed interface {
	Subscribe(ctx context.Context, handler func(ChangeSignal)) error // blocks until ctx cancelled
}

type ChangeSignal struct {
	Target        domain.Target
	Revision      domain.Revision
	ApplyBehavior domain.ApplyBehavior
}

// Permission checking
type Authorizer interface {
	Authorize(ctx context.Context, permission string) error
}

// Identity extraction from request context
type IdentityResolver interface {
	Actor(ctx context.Context) (domain.Actor, error)
	TenantID(ctx context.Context) (string, error)
}

// Application-specific runtime dependency builder (FULL rebuild)
type BundleFactory interface {
	Build(ctx context.Context, snap domain.Snapshot) (domain.RuntimeBundle, error)
}

// INCREMENTAL rebuild: extends BundleFactory with component-granular rebuilds.
// When the Supervisor detects only a subset of components changed, it calls
// BuildIncremental instead of Build, reusing unchanged components.
type IncrementalBundleFactory interface {
	BundleFactory
	BuildIncremental(ctx context.Context, snap domain.Snapshot,
		previous domain.RuntimeBundle, prevSnap domain.Snapshot) (domain.RuntimeBundle, error)
}

// Side-effect applier when bundles change.
// Reconcilers are sorted by Phase before execution:
//   PhaseStateSync  → update shared state (ConfigManager, caches)
//   PhaseValidation → gates that can reject the change
//   PhaseSideEffect → external side effects (worker restarts)
type BundleReconciler interface {
	Name() string
	Phase() domain.ReconcilerPhase
	Reconcile(ctx context.Context, previous, candidate domain.RuntimeBundle,
		snap domain.Snapshot) error
}
```
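The `expected domain.Revision` argument of `Store.Put` is what enforces optimistic concurrency. A minimal in-memory sketch of that contract follows; `memStore`, its single-key `Put`, and `putWithRetry` are hypothetical simplifications and do not mirror the real adapters.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrRevisionMismatch = errors.New("revision mismatch")

// memStore is a hypothetical single-target store used only for illustration.
type memStore struct {
	mu       sync.Mutex
	revision uint64
	values   map[string]any
}

func newMemStore() *memStore { return &memStore{values: map[string]any{}} }

// Put applies the write only when expected equals the current revision,
// then bumps the revision by one.
func (s *memStore) Put(key string, value any, expected uint64) (uint64, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if expected != s.revision {
		return s.revision, ErrRevisionMismatch
	}
	s.values[key] = value
	s.revision++
	return s.revision, nil
}

// Revision returns the current revision under the lock.
func (s *memStore) Revision() uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.revision
}

// putWithRetry re-reads the revision and retries after a mismatch.
func putWithRetry(s *memStore, key string, value any, attempts int) (uint64, error) {
	for i := 0; i < attempts; i++ {
		rev, err := s.Put(key, value, s.Revision())
		if err == nil {
			return rev, nil
		}
		if !errors.Is(err, ErrRevisionMismatch) {
			return 0, err
		}
	}
	return 0, fmt.Errorf("gave up after %d attempts: %w", attempts, ErrRevisionMismatch)
}

func main() {
	s := newMemStore()
	rev, err := s.Put("rate_limit.max", 100, 0)
	fmt.Println(rev, err) // 1 <nil>

	_, err = s.Put("rate_limit.max", 200, 0) // stale expected revision
	fmt.Println(errors.Is(err, ErrRevisionMismatch)) // true

	rev, err = putWithRetry(s, "rate_limit.max", 200, 3)
	fmt.Println(rev, err) // 2 <nil>
}
```

A blind retry like this overwrites whatever the competing writer stored, so a real client would usually re-read the values and decide whether the write still makes sense.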
---
## Appendix D: Service Layer Reference
### Registry (`commons/systemplane/registry/`)
Thread-safe in-memory registry of key definitions:
```go
type Registry interface {
	Register(def domain.KeyDef) error
	MustRegister(def domain.KeyDef)        // panics; startup only
	Get(key string) (domain.KeyDef, bool)
	List(kind domain.Kind) []domain.KeyDef // sorted by key name
	Validate(key string, value any) error  // nil value always valid (reset)
}

func New() Registry // returns *inMemoryRegistry (RWMutex-protected)
```
**Validation** (`validation.go`):
- `validateValue(def, value)` → type check + custom validator
- **JSON coercion**: `float64` without fractional part accepted as `int`; `int`/`int64` widened to `float`
- `isObjectCompatible(value)` uses `reflect.Map`; `isArrayCompatible(value)` uses `reflect.Array`/`Slice`
- Nil values always pass validation (they mean "reset to default")
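The JSON coercion rule exists because `encoding/json` decodes every number into `float64` when the target is `any`. A minimal sketch of the two coercions follows; `coerceInt` and `coerceFloat` are hypothetical names, not the library's functions, and overflow of very large floats is not handled here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// coerceInt accepts ints and float64 values without a fractional part.
func coerceInt(value any) (int, bool) {
	switch v := value.(type) {
	case int:
		return v, true
	case int64:
		return int(v), true
	case float64:
		// NaN fails the equality check; infinities are excluded explicitly.
		if v == math.Trunc(v) && !math.IsInf(v, 0) {
			return int(v), true
		}
	}
	return 0, false
}

// coerceFloat widens int and int64 to float64.
func coerceFloat(value any) (float64, bool) {
	switch v := value.(type) {
	case float64:
		return v, true
	case int:
		return float64(v), true
	case int64:
		return float64(v), true
	}
	return 0, false
}

func main() {
	var decoded any
	if err := json.Unmarshal([]byte(`200`), &decoded); err != nil {
		panic(err)
	}
	n, ok := coerceInt(decoded) // decoded holds float64(200)
	fmt.Println(n, ok)          // 200 true

	_, ok = coerceInt(1.5)
	fmt.Println(ok) // false

	f, ok := coerceFloat(3)
	fmt.Println(f, ok) // 3 true
}
```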
### Manager
```go
type Manager interface {
	GetConfigs(ctx context.Context) (ResolvedSet, error)
	GetSettings(ctx context.Context, subject Subject) (ResolvedSet, error)
	PatchConfigs(ctx context.Context, req PatchRequest) (WriteResult, error)
	PatchSettings(ctx context.Context, subject Subject, req PatchRequest) (WriteResult, error)
	GetConfigSchema(ctx context.Context) ([]SchemaEntry, error)
	GetSettingSchema(ctx context.Context) ([]SchemaEntry, error)
	GetConfigHistory(ctx context.Context, filter HistoryFilter) ([]HistoryEntry, error)
	GetSettingHistory(ctx context.Context, filter HistoryFilter) ([]HistoryEntry, error)
	ApplyChangeSignal(ctx context.Context, signal ChangeSignal) error
	Resync(ctx context.Context) error
}
```
**`ManagerConfig`** (constructor dependencies):
```go
type ManagerConfig struct {
	Registry   registry.Registry
	Store      ports.Store
	History    ports.HistoryStore
	Supervisor Supervisor
	Builder    *SnapshotBuilder

	// ConfigWriteValidator is called BEFORE persistence with a preview snapshot.
	// Return non-nil error to reject the write.
	ConfigWriteValidator func(ctx context.Context, snapshot domain.Snapshot) error

	// StateSync is called AFTER successful writes for live-read escalation.
	// Used to update ConfigManager atomically.
	StateSync func(ctx context.Context, snapshot domain.Snapshot)
}
```
**Key behaviors:**
- **Read path**: Uses supervisor's cached snapshot when available; falls back to builder
- **Write path**: `validate ops → preview snapshot (if ConfigWriteValidator configured) → escalate → store.Put → apply escalation`
- **Escalation application**: Maps strongest `ApplyBehavior` to supervisor method:
  - `ApplyBootstrapOnly` → no-op
  - `ApplyLiveRead` → `PublishSnapshot` + `StateSync` callback
  - `ApplyWorkerReconcile` → `ReconcileCurrent`
  - `ApplyBundleRebuild` / `ApplyBundleRebuildAndReconcile` → full `Reload`
- **Redaction**: All read results and history entries are redacted per `RedactPolicy`/`Secret` flag
- **Masking**: `RedactMask` shows last 4 runes; `RedactFull` and `Secret=true` show `"****"`
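The masking rule counts runes, not bytes, so multi-byte characters are not split. A minimal sketch, assuming a `"****"` prefix before the visible tail and full masking for values of four runes or fewer (both assumptions; `maskLast4` is a hypothetical name):

```go
package main

import "fmt"

const fullMask = "****"

// maskLast4 keeps only the last four runes visible.
// Values of four runes or fewer are fully masked so nothing leaks.
func maskLast4(s string) string {
	r := []rune(s)
	if len(r) <= 4 {
		return fullMask
	}
	return fullMask + string(r[len(r)-4:])
}

func main() {
	fmt.Println(maskLast4("sk_live_abcd1234")) // ****1234
	fmt.Println(maskLast4("abc"))              // ****
}
```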
**Supporting types:**
- `Subject { Scope, SubjectID }` — settings target
- `PatchRequest { Ops []WriteOp, ExpectedRevision, Actor, Source }` — write input
- `WriteResult { Revision }` — write output
- `ResolvedSet { Values map[string]EffectiveValue, Revision }` — read output
- `SchemaEntry { Key, Kind, AllowedScopes, ValueType, DefaultValue, MutableAtRuntime, ApplyBehavior, Secret, RedactPolicy, Description, Group }` — schema metadata
### Supervisor
```go
type Supervisor interface {
	Current() domain.RuntimeBundle
	Snapshot() domain.Snapshot
	PublishSnapshot(ctx context.Context, snap domain.Snapshot, reason string) error
	ReconcileCurrent(ctx context.Context, snap domain.Snapshot, reason string) error
	Reload(ctx context.Context, reason string, extraTenantIDs ...string) error
	Stop(ctx context.Context) error
}
```
**`SupervisorConfig`**:
```go
type SupervisorConfig struct {
	Builder     *SnapshotBuilder
	Factory     ports.BundleFactory
	Reconcilers []ports.BundleReconciler

	// Observer is invoked AFTER each successful reload with structured info
	// about the build strategy (full vs incremental).
	Observer func(ReloadEvent)
}

type ReloadEvent struct {
	Strategy BuildStrategy // "full" or "incremental"
	Reason   string        // caller-supplied (e.g., "changefeed-signal")
	Snapshot domain.Snapshot
	Bundle   domain.RuntimeBundle
}
```
**Reload lifecycle** (the heart of the system):
1. **Build snapshot**: `builder.BuildFull(ctx, tenantIDs...)`
2. **Build bundle**: Try incremental first (if factory implements `IncrementalBundleFactory` and previous exists), fall back to full
3. **Reconcile BEFORE commit**: Run all reconcilers against candidate while previous is still active (prevents state corruption on failure)
4. **Atomic swap**: `snapshot.Store(&snap)` + `bundle.Store(&holder{candidate})`
5. **Resource adoption**: If candidate implements `resourceAdopter`, call `AdoptResourcesFrom(previous)` to nil-out transferred pointers
6. **Observer callback**: Notify host application (e.g., update bundleState, swap loggers)
7. **Close previous**: `previous.Close()` — only tears down REPLACED components since transferred pointers are nil
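The ordering of reconcile, swap, and close is the important part of this lifecycle. The sketch below reduces it to a toy `supervisor` built on `atomic.Pointer` (Go 1.19+). All types are hypothetical; resource adoption and the observer callback are left out.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// bundle is a toy stand-in for a RuntimeBundle.
type bundle struct {
	name   string
	closed bool
}

func (b *bundle) Close() { b.closed = true }

// supervisor holds the active bundle behind an atomic pointer.
type supervisor struct {
	current   atomic.Pointer[bundle]
	build     func() (*bundle, error)
	reconcile func(previous, candidate *bundle) error
}

// Reload builds a candidate, reconciles it while the previous bundle is
// still active, swaps atomically, and only then closes the previous one.
func (s *supervisor) Reload() error {
	previous := s.current.Load()
	candidate, err := s.build()
	if err != nil {
		return fmt.Errorf("build: %w", err)
	}
	if err := s.reconcile(previous, candidate); err != nil {
		candidate.Close() // discard the failed candidate; previous stays active
		return fmt.Errorf("reconcile: %w", err)
	}
	s.current.Store(candidate)
	if previous != nil {
		previous.Close()
	}
	return nil
}

func main() {
	n := 0
	s := &supervisor{
		build: func() (*bundle, error) {
			n++
			return &bundle{name: fmt.Sprintf("v%d", n)}, nil
		},
		reconcile: func(_, _ *bundle) error { return nil },
	}
	if err := s.Reload(); err != nil {
		panic(err)
	}
	first := s.current.Load()

	// A rejecting reconciler must leave the first bundle active and open.
	s.reconcile = func(_, _ *bundle) error { return errors.New("rejected") }
	err := s.Reload()
	fmt.Println(err != nil, s.current.Load() == first, first.closed) // true true false
}
```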
**Hidden interfaces** (implement on your bundle for advanced patterns):
- `resourceAdopter { AdoptResourcesFrom(previous RuntimeBundle) }` — called after commit to mark transferred resources
- `rollbackDiscarder { Discard(ctx) error }` — called on failed candidate instead of `Close()`
### SnapshotBuilder
```go
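// Signature summary (bodies omitted); qualified type names are inferred
// from the surrounding reference.
type SnapshotBuilder struct {
	registry registry.Registry
	store    ports.Store
}

func NewSnapshotBuilder(reg registry.Registry, store ports.Store) (*SnapshotBuilder, error)
func (b *SnapshotBuilder) BuildConfigs(ctx context.Context) (map[string]domain.EffectiveValue, domain.Revision, error)
func (b *SnapshotBuilder) BuildGlobalSettings(ctx context.Context) (map[string]domain.EffectiveValue, domain.Revision, error)
func (b *SnapshotBuilder) BuildSettings(ctx context.Context, subject Subject) (map[string]domain.EffectiveValue, domain.Revision, error)
func (b *SnapshotBuilder) BuildFull(ctx context.Context, tenantIDs ...string) (domain.Snapshot, error)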
```
**Override cascade** (3-layer for tenant settings):
1. `initDefaults(defs)` → populate from KeyDef.DefaultValue
2. `applyOverrides(effective, entries, source)` → overlay store entries
3. **Tenant path**: `registry defaults → global override → per-tenant override`
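A minimal sketch of the cascade, using plain maps in place of `EffectiveValue`. The `resolve` function is hypothetical, and ignoring stored keys that have no registry default is an assumption made for the example.

```go
package main

import "fmt"

// resolve layers registry defaults, then global overrides, then
// per-tenant overrides. Later layers win.
func resolve(defaults, global, tenant map[string]any) map[string]any {
	effective := make(map[string]any, len(defaults))
	for k, v := range defaults {
		effective[k] = v
	}
	for _, layer := range []map[string]any{global, tenant} {
		for k, v := range layer {
			// Assumption: overrides for unregistered keys are ignored.
			if _, known := effective[k]; known {
				effective[k] = v
			}
		}
	}
	return effective
}

func main() {
	defaults := map[string]any{"ui.theme": "light", "ui.page_size": 20}
	global := map[string]any{"ui.page_size": 50}
	tenant := map[string]any{"ui.theme": "dark"}

	got := resolve(defaults, global, tenant)
	fmt.Println(got["ui.theme"], got["ui.page_size"]) // dark 50
}
```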
### Escalation
```go
func Escalate(reg Registry, ops []WriteOp) (ApplyBehavior, []string, error)
```
- Returns strongest `ApplyBehavior` across all ops (by `Strength()` 0-4)
- Rejects `ApplyBootstrapOnly` and `MutableAtRuntime=false` keys
- Rejects duplicate keys in batch
- Empty batch → `ApplyLiveRead`
- Returns the list of keys that drove the escalation
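These rules can be sketched in a few lines. The types below are local simplifications: the behavior strings mirror the labels in Appendix F, `errDuplicateKey` is a hypothetical error, and treating "keys that drove the escalation" as the keys at the strongest level is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

type ApplyBehavior string

const (
	ApplyLiveRead                  ApplyBehavior = "live-read"
	ApplyWorkerReconcile           ApplyBehavior = "worker-reconcile"
	ApplyBundleRebuild             ApplyBehavior = "bundle-rebuild"
	ApplyBundleRebuildAndReconcile ApplyBehavior = "bundle-rebuild+worker-reconcile"
	ApplyBootstrapOnly             ApplyBehavior = "bootstrap-only"
)

// strength mirrors the 0-4 scale from the taxonomy table.
var strength = map[ApplyBehavior]int{
	ApplyLiveRead: 0, ApplyWorkerReconcile: 1, ApplyBundleRebuild: 2,
	ApplyBundleRebuildAndReconcile: 3, ApplyBootstrapOnly: 4,
}

type keyDef struct {
	Behavior         ApplyBehavior
	MutableAtRuntime bool
}

var (
	errKeyUnknown    = errors.New("unknown configuration key")
	errKeyNotMutable = errors.New("key is not mutable at runtime")
	errDuplicateKey  = errors.New("duplicate key in batch") // hypothetical
)

// escalate returns the strongest behavior in the batch and the keys at that level.
func escalate(reg map[string]keyDef, keys []string) (ApplyBehavior, []string, error) {
	strongest := ApplyLiveRead // an empty batch resolves to live-read
	var drivers []string
	seen := make(map[string]bool, len(keys))
	for _, k := range keys {
		if seen[k] {
			return "", nil, fmt.Errorf("%q: %w", k, errDuplicateKey)
		}
		seen[k] = true
		def, ok := reg[k]
		if !ok {
			return "", nil, fmt.Errorf("%q: %w", k, errKeyUnknown)
		}
		if !def.MutableAtRuntime || def.Behavior == ApplyBootstrapOnly {
			return "", nil, fmt.Errorf("%q: %w", k, errKeyNotMutable)
		}
		switch {
		case strength[def.Behavior] > strength[strongest]:
			strongest, drivers = def.Behavior, []string{k}
		case strength[def.Behavior] == strength[strongest]:
			drivers = append(drivers, k)
		}
	}
	return strongest, drivers, nil
}

func main() {
	reg := map[string]keyDef{
		"rate_limit.max":        {ApplyLiveRead, true},
		"postgres.primary_host": {ApplyBundleRebuild, true},
		"server.address":        {ApplyBootstrapOnly, false},
	}
	b, drivers, err := escalate(reg, []string{"rate_limit.max", "postgres.primary_host"})
	fmt.Println(b, drivers, err) // bundle-rebuild [postgres.primary_host] <nil>

	_, _, err = escalate(reg, []string{"server.address"})
	fmt.Println(errors.Is(err, errKeyNotMutable)) // true
}
```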
---
## Appendix E: Bootstrap Configuration
| File | Purpose |
|------|---------|
| `config.go` | `BootstrapConfig`, `PostgresBootstrapConfig`, `MongoBootstrapConfig`, `SecretStoreConfig` |
| `backend.go` | `BackendFactory` registry, `RegisterBackendFactory()`, `NewBackendFromConfig()` → `BackendResources{Store, History, ChangeFeed, Closer}` |
| `builtin/backend.go` | `init()` registers Postgres + MongoDB factories |
| `env.go` | `LoadFromEnv()` reads `SYSTEMPLANE_*` env vars |
| `classifier.go` | `IsBootstrapOnly(def)` → `!MutableAtRuntime` or `ApplyBootstrapOnly`; `IsRuntimeManaged(def)` is the inverse |
| `defaults.go` | Default table/collection names |
| `postgres_identifiers.go` | Regex validation for PG identifiers (`^[a-z_][a-z0-9_]{0,62}$`) |
**Environment variables** for standalone systemplane backend:
| Variable | Default | Description |
|----------|---------|-------------|
| `SYSTEMPLANE_BACKEND` | — | `postgres` or `mongodb` |
| `SYSTEMPLANE_POSTGRES_DSN` | — | PostgreSQL DSN (falls back to app's PG DSN) |
| `SYSTEMPLANE_POSTGRES_SCHEMA` | `system` | Schema name |
| `SYSTEMPLANE_POSTGRES_ENTRIES_TABLE` | `runtime_entries` | Entries table |
| `SYSTEMPLANE_POSTGRES_HISTORY_TABLE` | `runtime_history` | History table |
| `SYSTEMPLANE_POSTGRES_REVISION_TABLE` | `runtime_revisions` | Revisions table |
| `SYSTEMPLANE_POSTGRES_NOTIFY_CHANNEL` | `systemplane_changes` | PG LISTEN/NOTIFY channel |
| `SYSTEMPLANE_SECRET_MASTER_KEY` | — | AES-256-GCM master key (32 bytes, raw or base64) |
| `SYSTEMPLANE_MONGODB_URI` | — | MongoDB connection URI |
| `SYSTEMPLANE_MONGODB_DATABASE` | `systemplane` | MongoDB database |
| `SYSTEMPLANE_MONGODB_ENTRIES_COLLECTION` | `runtime_entries` | Entries collection |
| `SYSTEMPLANE_MONGODB_HISTORY_COLLECTION` | `runtime_history` | History collection |
| `SYSTEMPLANE_MONGODB_WATCH_MODE` | `change_stream` | `change_stream` or `poll` |
| `SYSTEMPLANE_MONGODB_POLL_INTERVAL_SEC` | `5` | Poll interval (poll mode only) |
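`SYSTEMPLANE_SECRET_MASTER_KEY` accepts 32 bytes either raw or base64-encoded. One way such a value could be decoded is sketched here; `decodeMasterKey`, the raw-first precedence, and the use of standard (padded) base64 are assumptions, not the library's documented behavior.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

const keyLen = 32 // AES-256 key size in bytes

// decodeMasterKey accepts either exactly 32 raw bytes or a base64 string
// that decodes to 32 bytes. A 32-character base64 string decodes to only
// 24 bytes, so checking the raw length first is unambiguous.
func decodeMasterKey(value string) ([]byte, error) {
	if len(value) == keyLen {
		return []byte(value), nil
	}
	decoded, err := base64.StdEncoding.DecodeString(value)
	if err != nil {
		return nil, fmt.Errorf("master key is neither %d raw bytes nor valid base64: %w", keyLen, err)
	}
	if len(decoded) != keyLen {
		return nil, fmt.Errorf("master key must decode to %d bytes, got %d", keyLen, len(decoded))
	}
	return decoded, nil
}

func main() {
	raw := "0123456789abcdef0123456789abcdef" // 32 ASCII bytes, example only
	encoded := base64.StdEncoding.EncodeToString([]byte(raw))

	k1, err1 := decodeMasterKey(raw)
	k2, err2 := decodeMasterKey(encoded)
	fmt.Println(len(k1), err1, len(k2), err2) // 32 <nil> 32 <nil>

	_, err := decodeMasterKey("too-short")
	fmt.Println(err != nil) // true
}
```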
---
## Appendix F: Apply Behavior Taxonomy
Every config key MUST be classified with exactly one `ApplyBehavior`:
| ApplyBehavior | Code Constant | Strength | Runtime Effect | Use When |
|---------------|---------------|----------|----------------|----------|
| **bootstrap-only** | `domain.ApplyBootstrapOnly` | 4 | Immutable after startup. Never changes. | Server listen address, TLS, auth enable, telemetry endpoints |
| **bundle-rebuild+worker-reconcile** | `domain.ApplyBundleRebuildAndReconcile` | 3 | Full bundle swap: new infra clients AND worker restart | Worker enable/disable (needs new connections + restart) |
| **bundle-rebuild** | `domain.ApplyBundleRebuild` | 2 | Full bundle swap: new PG/Redis/RMQ/S3 clients | Connection strings, pool sizes, credentials |
| **worker-reconcile** | `domain.ApplyWorkerReconcile` | 1 | Reconciler restarts affected workers | Worker intervals, scheduler periods |
| **live-read** | `domain.ApplyLiveRead` | 0 | Read from snapshot on every request. Zero cost. | Rate limits, timeouts, cache TTLs — anything read per-request |
**Strength** determines escalation: when a PATCH contains multiple keys with different
behaviors, `Escalate()` picks the strongest (highest number). If any key is
`ApplyBootstrapOnly` or `MutableAtRuntime=false`, the entire write is rejected.
**Classification decision tree**:
```
Is this key needed BEFORE the systemplane itself can start?
  YES → ApplyBootstrapOnly (server address, auth enable, telemetry)
  NO  ↓
Can this key be read per-request from a snapshot without side effects?
  YES → ApplyLiveRead (rate limits, timeouts, TTLs)
  NO  ↓
Does changing this key require rebuilding infrastructure clients?
  YES → Does it ALSO require restarting background workers?
          YES → ApplyBundleRebuildAndReconcile (worker enable + storage changes)
          NO  → ApplyBundleRebuild (DB connections, pool sizes, credentials)
  NO  ↓
Does changing this key require restarting background workers?
  YES → ApplyWorkerReconcile (worker intervals, scheduler periods)
  NO  → ApplyLiveRead (safe default for read-only configs)
```
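The decision tree translates directly into an ordered `switch`. The sketch below is only an aid for reviewing a key inventory; the `traits` struct and `classify` function are hypothetical and return constant names as strings.

```go
package main

import "fmt"

// traits captures the four questions of the decision tree for one key.
type traits struct {
	NeededBeforeStart bool // required before the systemplane can start
	ReadPerRequest    bool // safe to read from a snapshot per request
	RebuildsClients   bool // change requires new infrastructure clients
	RestartsWorkers   bool // change requires restarting background workers
}

// classify walks the questions in the same order as the tree.
func classify(t traits) string {
	switch {
	case t.NeededBeforeStart:
		return "ApplyBootstrapOnly"
	case t.ReadPerRequest:
		return "ApplyLiveRead"
	case t.RebuildsClients && t.RestartsWorkers:
		return "ApplyBundleRebuildAndReconcile"
	case t.RebuildsClients:
		return "ApplyBundleRebuild"
	case t.RestartsWorkers:
		return "ApplyWorkerReconcile"
	default:
		return "ApplyLiveRead" // safe default for read-only configs
	}
}

func main() {
	fmt.Println(classify(traits{NeededBeforeStart: true}))                      // ApplyBootstrapOnly
	fmt.Println(classify(traits{RebuildsClients: true}))                        // ApplyBundleRebuild
	fmt.Println(classify(traits{RebuildsClients: true, RestartsWorkers: true})) // ApplyBundleRebuildAndReconcile
	fmt.Println(classify(traits{RestartsWorkers: true}))                        // ApplyWorkerReconcile
}
```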
---
## Appendix G: Screening Methodology
Before implementing, screen the target service to build the key inventory:
### G.1 Identify All Configuration Sources
```bash
# In the target repo:
# 1. Find the Config struct
grep -rn 'type Config struct' internal/ --include='*.go'
# 2. Find all envDefault tags
grep -rn 'envDefault:' internal/ --include='*.go' | sort
# 3. Find env var reads outside Config struct
grep -rn 'os.Getenv\|viper.Get' internal/ --include='*.go' | grep -v _test.go
# 4. Find .env files
find . -name '.env*' -o -name '*.yaml.example' | head -20
# 5. Find file-based config loading
grep -rn 'viper\.\|yaml.Unmarshal\|json.Unmarshal.*config' internal/ --include='*.go'
```
### G.2 Classify Infrastructure vs Application Config
**Infrastructure (stays as env-var with defaults in code)**:
- Database connection strings (PG host, port, user, password) → `Component: "postgres"`
- Redis connection (host, master name, password) → `Component: "redis"`
- RabbitMQ connection (URI, host, port, user, password) → `Component: "rabbitmq"`
- Object storage (endpoint, bucket, credentials) → `Component: "s3"`
- These become `ApplyBundleRebuild` keys in systemplane
**Bootstrap-Only (env-var, immutable after startup)**:
- Server listen address and port → `Component: ComponentNone` (not mutable)
- TLS certificate paths
- Auth enable/disable
- Auth service address
- Telemetry endpoints (OTEL collector)
- These become `ApplyBootstrapOnly` keys
**Application Runtime (hot-reloadable)**:
- Rate limits → `Component: ComponentNone` + `ApplyLiveRead`
- Worker intervals and enable/disable flags → `Component: ComponentNone` + `ApplyWorkerReconcile`
- Timeouts (webhook, health check) → `Component: ComponentNone` + `ApplyLiveRead`
- Feature flags → `Component: ComponentNone` + `ApplyLiveRead`
- Cache TTLs → `Component: ComponentNone` + `ApplyLiveRead`
### G.3 Identify Runtime Dependencies (Bundle Candidates)
List all infrastructure clients that the service creates at startup:
```bash
# Find client constructors
grep -rn 'libPostgres.New\|libRedis.New\|libRabbitmq.New\|storage.New' internal/ --include='*.go'
# Find connection pools
grep -rn 'sql.Open\|pgx\|redis.New\|amqp.Dial' internal/ --include='*.go'
```
Each of these becomes a field in the `InfraBundle` and a component name in
`keyComponentMap`.
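To show how a key-to-component map can drive incremental rebuilds, here is a minimal sketch that diffs two sets of values and reports which components need rebuilding. Only the `keyComponentMap` name and the `_none` sentinel come from this document; the `redis.host` key, the flat value maps, and `changedComponents` are hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

const componentNone = "_none"

// keyComponentMap assigns each key to the infrastructure component it configures.
var keyComponentMap = map[string]string{
	"postgres.primary_host": "postgres",
	"redis.host":            "redis", // hypothetical key name
	"rate_limit.max":        componentNone,
}

// changedComponents returns the sorted components whose keys differ
// between prev and next. Keys mapped to _none never trigger a rebuild.
func changedComponents(prev, next map[string]any) []string {
	set := map[string]struct{}{}
	for key, component := range keyComponentMap {
		if component == componentNone {
			continue
		}
		// DeepEqual avoids a panic on uncomparable values such as slices.
		if !reflect.DeepEqual(prev[key], next[key]) {
			set[component] = struct{}{}
		}
	}
	out := make([]string, 0, len(set))
	for c := range set {
		out = append(out, c)
	}
	sort.Strings(out)
	return out
}

func main() {
	prev := map[string]any{"postgres.primary_host": "localhost", "redis.host": "localhost", "rate_limit.max": 100}
	next := map[string]any{"postgres.primary_host": "db.internal", "redis.host": "localhost", "rate_limit.max": 500}
	fmt.Println(changedComponents(prev, next)) // [postgres]
}
```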
### G.4 Identify Background Workers
```bash
# Find worker patterns
grep -rn 'ticker\|time.NewTicker\|cron\|worker\|scheduler' internal/ --include='*.go' | grep -v _test.go
```
Each worker with configurable intervals becomes a `WorkerReconciler` candidate.
### G.5 Generate the Key Inventory
Create a table with columns:
- Key name (dotted: `postgres.primary_host`)
- Current env var (`POSTGRES_HOST`)
- Default value
- Type (`string`, `int`, `bool`, `float`)
- Kind (`config` or `setting`)
- Scope (`global` or `tenant`)
- ApplyBehavior
- Component (`postgres`, `redis`, `rabbitmq`, `s3`, `http`, `logger`, `_none`)
- Secret (yes/no)
- MutableAtRuntime (yes/no)
- Group
- Validator (if any)
---
## Appendix H: Testing Patterns
### H.1 Key Registration Tests
```go
func TestRegister{Service}Keys_AllKeysValid(t *testing.T) {
	reg := registry.New()
	err := Register{Service}Keys(reg)
	require.NoError(t, err)

	configs := reg.List(domain.KindConfig)
	settings := reg.List(domain.KindSetting)
	assert.Greater(t, len(configs)+len(settings), 0)
}

func TestRegister{Service}Keys_DefaultsMatchConfig(t *testing.T) {
	reg := registry.New()
	require.NoError(t, Register{Service}Keys(reg)) // fail fast instead of ignoring the error

	defaults := defaultConfig()
	for _, def := range reg.List(domain.KindConfig) {
		// Compare def.DefaultValue against defaults struct field
	}
}
```
### H.2 Bundle Factory Tests
```go
func TestBundleFactory_Build_Success(t *testing.T) {
	snap := buildTestSnapshot(t)
	factory := New{Service}BundleFactory(testBootstrapConfig())

	bundle, err := factory.Build(context.Background(), snap)
	require.NoError(t, err)
	defer bundle.Close(context.Background())

	b := bundle.(*{Service}Bundle)
	assert.NotNil(t, b.Infra.Postgres)
	assert.NotNil(t, b.Infra.Redis)
}

func TestBundleFactory_BuildIncremental_ReusesUnchangedComponents(t *testing.T) {
	snap1 := buildTestSnapshot(t)
	snap2 := modifySnapshot(snap1, "rate_limit.max", 500) // only changes _none component
	factory := New{Service}BundleFactory(testBootstrapConfig())
	ctx := context.Background()

	// Check build errors so a failed build does not surface as a nil-pointer panic.
	prev, err := factory.Build(ctx, snap1)
	require.NoError(t, err)
	candidate, err := factory.BuildIncremental(ctx, snap2, prev, snap1)
	require.NoError(t, err)

	b := candidate.(*{Service}Bundle)
	// Postgres should be reused (not owned by candidate)
	assert.False(t, b.ownsPostgres)
}
```
### H.3 Reconciler Tests
```go
func TestWorkerReconciler_AppliesConfig(t *testing.T) {
wm := NewTestWorkerManager()
reconciler := NewWorkerReconciler(wm)
snap := buildModifiedSnapshot(t, "export_worker.enabled", true)
err := reconciler.Reconcile(context.Background(), nil, nil, snap)
require.NoError(t, err)
assert.True(t, wm.LastAppliedConfig().ExportWorkerEnabled)
}
```
### H.4 Contract Tests (Backend-Agnostic)
Run the `storetest` and `feedtest` contract suites against your backend:
```go
func TestPostgresStore_ContractSuite(t *testing.T) {
store, history := setupPostgresStore(t) // testcontainers
storetest.RunAll(t, /* factory that returns store+history */)
}
func TestPostgresChangeFeed_ContractSuite(t *testing.T) {
feed := setupPostgresFeed(t) // testcontainers
feedtest.RunAll(t, /* factory that returns store+feed */)
}
```
**Store contract tests (15):**
- GetEmptyTarget, PutSingleOp, PutBatch, OptimisticConcurrency, ResetOp
- NilValueOp, EmptyBatchIsNoOp, TypedNilValueOp, PutPreservesOtherKeys
- RevisionMonotonicallyIncreasing, ConcurrentPuts
- HistoryRecording, BatchHistoryConsistency, HistoryFiltering, HistoryPagination
**Feed contract tests (3):**
- SubscribeReceivesSignal, ContextCancellationStops, MultipleSignals
### H.5 ConfigWriteValidator Tests
```go
func TestConfigWriteValidator_RejectsInvalidConfig(t *testing.T) {
// Test that production guards prevent persisting invalid config
snap := buildSnapshot(t)
snap.Configs["rate_limit.enabled"] = ev(false) // can't disable in production
err := validator(context.Background(), snap)
assert.Error(t, err)
}
```
---
## Appendix I: Operational Guide
### I.1 For Operators: What Changes
| Before | After |
|--------|-------|
| Edit `.env` + restart | `PATCH /v1/system/configs` (no restart) |
| Edit YAML + wait for fsnotify | `PATCH /v1/system/configs` (instant) |
| No audit trail | `GET /v1/system/configs/history` |
| No schema discovery | `GET /v1/system/configs/schema` |
| No concurrency protection | `If-Match` / `ETag` headers |
| Manual rollback | Change feed propagates across replicas |
| Full restart for any change | Only `ApplyBootstrapOnly` keys need restart |
### I.2 Bootstrap-Only Keys (Require Restart)
Document in `config/.config-map.example`:
```
# {Service} — Bootstrap-Only Configuration (requires restart)
#
# These are the ONLY settings that require a container/pod restart.
# Everything else is hot-reloadable via:
#
# GET /v1/system/configs — view current runtime config
# PATCH /v1/system/configs — change any runtime-managed key
# GET /v1/system/configs/schema — see all keys, types, and mutability
# GET /v1/system/configs/history — audit trail of changes
ENV_NAME=development
SERVER_ADDRESS=:8080
PLUGIN_AUTH_ENABLED=false
ENABLE_TELEMETRY=false
# ... etc
```
### I.3 Docker Compose (Zero-Config)
```yaml
services:
  myservice:
    build: .
    ports:
      - "${SERVER_PORT:-8080}:8080"
    environment:
      - POSTGRES_HOST=${POSTGRES_HOST:-postgres}
      - POSTGRES_PORT=${POSTGRES_PORT:-5432}
      - REDIS_HOST=${REDIS_HOST:-redis}
    # NO env_file directive: defaults baked into binary
    depends_on:
      postgres:
        condition: service_healthy
```
### I.4 Systemplane Backend Config
By default, the systemplane reuses the application's primary PostgreSQL connection.
Override with `SYSTEMPLANE_*` env vars for a separate backend:
```bash
# Use app's Postgres (default — no extra config needed)
# The init function builds the DSN from the app's POSTGRES_* env vars
# Or use a dedicated backend:
SYSTEMPLANE_BACKEND=postgres
SYSTEMPLANE_POSTGRES_DSN=postgres://user:pass@host:5432/systemplane?sslmode=require
# Secret encryption (REQUIRED — rejected in production without it):
SYSTEMPLANE_SECRET_MASTER_KEY=<32-byte key, raw or base64>
```
### I.5 Graceful Degradation
If the systemplane fails to initialize, the service continues without it:
- Config values from env vars still work
- No runtime mutation API available
- No hot-reload capability
- Workers run with static config
This is by design — the service never fails to start due to systemplane issues.
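The degradation path amounts to treating the systemplane init error as a warning rather than a fatal condition. A minimal sketch under stated assumptions: `boot`, `startupMode`, and `errBackendDown` are hypothetical names, and the real init function takes more arguments than the bare `func() error` used here.

```go
package main

import (
	"errors"
	"log"
)

// errBackendDown stands in for any systemplane initialization failure.
var errBackendDown = errors.New("systemplane backend unreachable")

// startupMode describes what the service offers after boot.
type startupMode struct {
	RuntimeAPI bool // mutation API mounted
	HotReload  bool // change feed active
}

// boot runs the (hypothetical) init function and degrades instead of failing:
// on error it logs and returns a mode with every runtime feature switched off.
func boot(initSystemplane func() error) startupMode {
	if err := initSystemplane(); err != nil {
		log.Printf("systemplane unavailable, using static env config: %v", err)
		return startupMode{}
	}
	return startupMode{RuntimeAPI: true, HotReload: true}
}
```

Callers would consult the returned mode before mounting the `/v1/system/*` routes, so a degraded service simply never exposes them.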
### I.6 HTTP API Endpoints
After migration, the service exposes these endpoints:
| Method | Path | Description | Auth Permission |
|--------|------|-------------|-----------------|
| `GET` | `/v1/system/configs` | View all resolved config values | `system/configs:read` |
| `PATCH` | `/v1/system/configs` | Update config values (with `If-Match` for concurrency) | `system/configs:write` |
| `GET` | `/v1/system/configs/schema` | View all key definitions (types, defaults, mutability) | `system/configs/schema:read` |
| `GET` | `/v1/system/configs/history` | Audit trail of config changes | `system/configs/history:read` |
| `POST` | `/v1/system/configs/reload` | Force a full reload | `system/configs/reload:write` |
| `GET` | `/v1/system/settings` | View resolved settings (`?scope=global\|tenant`) | `system/settings:read` |
| `PATCH` | `/v1/system/settings` | Update settings | `system/settings:write` |
| `GET` | `/v1/system/settings/schema` | View setting key definitions | `system/settings/schema:read` |
| `GET` | `/v1/system/settings/history` | Settings audit trail | scope-dependent |
Settings routes have an extra `settingsScopeAuthorization` middleware that
elevates to `system/settings/global:{action}` when `?scope=global` is queried.
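The scope elevation can be pictured as a small pure function. This is a sketch only, assuming the permission string is assembled from the action and scope exactly as described above; `settingsPermission` is a hypothetical helper, not the actual middleware.

```go
package main

import "fmt"

// settingsPermission is a hypothetical helper, not part of lib-commons.
// It returns the permission a caller needs for a settings request:
// global-scope requests are elevated to system/settings/global:{action},
// everything else uses the base system/settings:{action} permission.
func settingsPermission(action, scope string) string {
	if scope == "global" {
		return fmt.Sprintf("system/settings/global:%s", action)
	}
	return fmt.Sprintf("system/settings:%s", action)
}
```

For example, `settingsPermission("write", "global")` yields `system/settings/global:write`, while a tenant-scoped write stays at `system/settings:write`.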
**PATCH Request Format:**
```http
PATCH /v1/system/configs
If-Match: "42"
Content-Type: application/json

{
  "values": {
    "rate_limit.max": 200,
    "rate_limit.expiry_sec": 120,
    "export_worker.enabled": false
  }
}
```
**PATCH Response Format:**
```http
HTTP/1.1 200 OK
ETag: "43"

{
  "revision": 43
}
```
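From a Go client, the revision round trip looks roughly like the sketch below. It uses only the standard library; `newConfigPatch` and `revisionFromETag` are hypothetical helper names, and the base URL is whatever address the service listens on.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

// newConfigPatch builds a PATCH /v1/system/configs request guarded by If-Match.
// The helper name and baseURL parameter are illustrative assumptions.
func newConfigPatch(baseURL string, revision int64, values map[string]any) (*http.Request, error) {
	body, err := json.Marshal(map[string]any{"values": values})
	if err != nil {
		return nil, fmt.Errorf("encode body: %w", err)
	}
	req, err := http.NewRequest(http.MethodPatch, baseURL+"/v1/system/configs", bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	// The revision travels as a quoted entity tag, e.g. If-Match: "42".
	req.Header.Set("If-Match", strconv.Quote(strconv.FormatInt(revision, 10)))
	return req, nil
}

// revisionFromETag turns an ETag such as "43" (quotes included) into 43.
func revisionFromETag(etag string) (int64, error) {
	return strconv.ParseInt(strings.Trim(etag, `"`), 10, 64)
}
```

A caller would send the request with an `http.Client`, read the `ETag` header from a 200 response, and feed the parsed revision into the next PATCH. A 409 means another writer got there first, so the client should re-read the config before retrying.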
**Schema Response Format:**
```json
{
"keys": [
{
"key": "rate_limit.max",
"kind": "config",
"valueType": "int",
"defaultValue": 100,
"mutableAtRuntime": true,
"applyBehavior": "live-read",
"secret": false,
"description": "Maximum requests per window",
"group": "rate_limit",
"allowedScopes": ["global"]
}
]
}
```
**HTTP Error Mapping:**
| Domain Error | HTTP Status | Code |
|-------------|-------------|------|
| `ErrKeyUnknown` | 400 | `system_key_unknown` |
| `ErrValueInvalid` | 400 | `system_value_invalid` |
| `ErrKeyNotMutable` | 400 | `system_key_not_mutable` |
| `ErrScopeInvalid` | 400 | `system_scope_invalid` |
| `ErrRevisionMismatch` | 409 | `system_revision_mismatch` |
| `ErrPermissionDenied` | 403 | `system_permission_denied` |
| `ErrReloadFailed` | 500 | `system_reload_failed` |
| `ErrSupervisorStopped` | 503 | `system_unavailable` |
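A table like this is typically implemented as an ordered lookup with `errors.Is`, so wrapped errors still map correctly. The sketch below is not the Fiber adapter's code: the sentinels are local stand-ins for the domain errors, and the fallback code `internal_error` for unmapped errors is an assumption.

```go
package main

import (
	"errors"
	"net/http"
)

// Local stand-ins for the domain sentinels named in the table.
var (
	ErrKeyUnknown        = errors.New("key unknown")
	ErrValueInvalid      = errors.New("value invalid")
	ErrKeyNotMutable     = errors.New("key not mutable")
	ErrScopeInvalid      = errors.New("scope invalid")
	ErrRevisionMismatch  = errors.New("revision mismatch")
	ErrPermissionDenied  = errors.New("permission denied")
	ErrReloadFailed      = errors.New("reload failed")
	ErrSupervisorStopped = errors.New("supervisor stopped")
)

// errorTable pairs each sentinel with its HTTP status and response code.
var errorTable = []struct {
	target error
	status int
	code   string
}{
	{ErrKeyUnknown, http.StatusBadRequest, "system_key_unknown"},
	{ErrValueInvalid, http.StatusBadRequest, "system_value_invalid"},
	{ErrKeyNotMutable, http.StatusBadRequest, "system_key_not_mutable"},
	{ErrScopeInvalid, http.StatusBadRequest, "system_scope_invalid"},
	{ErrRevisionMismatch, http.StatusConflict, "system_revision_mismatch"},
	{ErrPermissionDenied, http.StatusForbidden, "system_permission_denied"},
	{ErrReloadFailed, http.StatusInternalServerError, "system_reload_failed"},
	{ErrSupervisorStopped, http.StatusServiceUnavailable, "system_unavailable"},
}

// mapError resolves err (possibly wrapped) to a status and code.
// The fallback for unknown errors is an assumption of this sketch.
func mapError(err error) (int, string) {
	for _, row := range errorTable {
		if errors.Is(err, row.target) {
			return row.status, row.code
		}
	}
	return http.StatusInternalServerError, "internal_error"
}
```

Clients can branch on the `code` string rather than the status, since four distinct validation failures share status 400.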
### I.7 Adapters Reference
| Adapter | Location | Key Feature |
|---------|----------|-------------|
| PostgreSQL Store | `adapters/store/postgres/` | 3 tables + indexes, optimistic concurrency, `pg_notify`, integer-preserving JSON decode |
| MongoDB Store | `adapters/store/mongodb/` | Sentinel revision doc (`__revision_meta__`), multi-doc transactions, requires replica set |
| Secret Codec | `adapters/store/secretcodec/` | AES-256-GCM with random nonce, AAD = `"kind\|scope\|subject\|key"`, envelope format `{__systemplane_secret_v: 1, alg, nonce, ciphertext}` |
| PostgreSQL ChangeFeed | `adapters/changefeed/postgres/` | LISTEN/NOTIFY, exponential backoff reconnect, revision resync on reconnect |
| MongoDB ChangeFeed | `adapters/changefeed/mongodb/` | Change stream or poll mode, auto-reconnect, revision-jump escalation |
| DebouncedFeed | `adapters/changefeed/debounce.go` | Per-target trailing-edge debounce (default 100ms + 50ms jitter), escalates to strongest ApplyBehavior in window |
| SafeInvokeHandler | `adapters/changefeed/safe_handler.go` | Catches handler panics, returns `ErrHandlerPanic` |
| Fiber HTTP | `adapters/http/fiber/` | 9 endpoints, DTOs, `If-Match`/`ETag` concurrency, middleware, domain→HTTP error mapping |
| Store Contract Tests | `adapters/store/storetest/` | 15 backend-agnostic tests: CRUD, concurrency, reset, history, pagination |
| Feed Contract Tests | `adapters/changefeed/feedtest/` | 3 backend-agnostic tests: signal receipt, cancellation, multiple signals |
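To make the Secret Codec row concrete, here is a minimal standard-library sketch of AES-256-GCM with a random nonce and the `kind|scope|subject|key` string as additional authenticated data. It illustrates the technique only: the function names are hypothetical, and the envelope serialization used by the real codec is not reproduced.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
	"strings"
)

// aad binds a ciphertext to its entry coordinates: "kind|scope|subject|key".
func aad(kind, scope, subject, key string) []byte {
	return []byte(strings.Join([]string{kind, scope, subject, key}, "|"))
}

// newGCM builds an AES-256-GCM AEAD from a 32-byte master key.
func newGCM(masterKey []byte) (cipher.AEAD, error) {
	if len(masterKey) != 32 {
		return nil, fmt.Errorf("master key must be 32 bytes, got %d", len(masterKey))
	}
	block, err := aes.NewCipher(masterKey)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext under a fresh random nonce and the given AAD.
func seal(masterKey, plaintext, additional []byte) (nonce, ciphertext []byte, err error) {
	gcm, err := newGCM(masterKey)
	if err != nil {
		return nil, nil, err
	}
	nonce = make([]byte, gcm.NonceSize())
	if _, err = rand.Read(nonce); err != nil {
		return nil, nil, fmt.Errorf("generate nonce: %w", err)
	}
	return nonce, gcm.Seal(nil, nonce, plaintext, additional), nil
}

// open decrypts and authenticates; it fails if the AAD differs from seal time.
func open(masterKey, nonce, ciphertext, additional []byte) ([]byte, error) {
	gcm, err := newGCM(masterKey)
	if err != nil {
		return nil, err
	}
	return gcm.Open(nil, nonce, ciphertext, additional)
}
```

Because the AAD is authenticated, a ciphertext copied from one key's row to another fails to decrypt even under the same master key.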
**PostgreSQL Store DDL** (created automatically):
- `runtime_entries`: PK (`kind`, `scope`, `subject`, `key`), JSONB `value`, BIGINT `revision`
- `runtime_history`: BIGSERIAL `id`, `old_value`/`new_value` JSONB, `actor_id`, `changed_at`
- `runtime_revisions`: PK (`kind`, `scope`, `subject`), `revision` counter, `apply_behavior`
### I.8 Test Utilities
| Fake | Implements | Key Features |
|------|-----------|--------------|
| `FakeStore` | `ports.Store` | In-memory, optimistic concurrency, `Seed()` for pre-population |
| `FakeHistoryStore` | `ports.HistoryStore` | In-memory, newest-first, `Append()`/`AppendForKind()` |
| `FakeBundle` / `FakeBundleFactory` | `RuntimeBundle` / `BundleFactory` | Tracks Close state, `SetError()`, `CallCount()` |
| `FakeReconciler` | `BundleReconciler` | Configurable phase, records all `ReconcileCall`s, `SetError()` |
| `FakeIncrementalBundleFactory` | `IncrementalBundleFactory` | Embeds `FakeBundleFactory` + `IncrementalBuildFunc`, `IncrementalCallCount()` |
---
## Appendix J: Matcher Service Reference
The Matcher service registers **~130 keys** across 21 groups, split into 9 focused
sub-files. Here's the breakdown by group and ApplyBehavior:
| Group | Keys | BootstrapOnly | BundleRebuild | LiveRead | WorkerReconcile | Rebuild+Reconcile |
|-------|------|---------------|---------------|----------|-----------------|-------------------|
| `app` | 2 | 1 | 1 | - | - | - |
| `server` | 8 | 4 | 4 | - | - | - |
| `tenancy` | 11 | - | 11 | - | - | - |
| `postgres` | 19 | - | 18 | 1 | - | - |
| `redis` | 12 | - | 12 | - | - | - |
| `rabbitmq` | 8 | - | 8 | - | - | - |
| `auth` | 3 | 3 | - | - | - | - |
| `swagger` | 3 | - | 3 | - | - | - |
| `telemetry` | 7 | 7 | - | - | - | - |
| `rate_limit` | 7 | - | - | 7 | - | - |
| `infrastructure` | 2 | - | 1 | 1 | - | - |
| `idempotency` | 3 | 1 | 2 | - | - | - |
| `callback_rate_limit` | 1 | - | - | 1 | - | - |
| `fetcher` | 9 | - | 4 | - | 3 | 2 |
| `deduplication` | 1 | - | - | 1 | - | - |
| `object_storage` | 6 | - | 6 | - | - | - |
| `export_worker` | 4 | - | - | 1 | 1 | 2 |
| `webhook` | 1 | - | - | 1 | - | - |
| `cleanup_worker` | 4 | - | - | - | 2 | 2 |
| `scheduler` | 1 | - | - | - | 1 | - |
| `archival` | 12 | - | - | 1 | 3 | 8 |
**Secrets**: ~9 keys with `RedactFull` policy (passwords, tokens, certificates, access keys).
**Components referenced**: `postgres`, `redis`, `rabbitmq`, `s3`, `http`, `logger`, `_none`.
### Key Sub-File Organization (Reference)
| Sub-File | Groups Covered | Key Count |
|----------|---------------|-----------|
| `systemplane_keys_app_server.go` | app, server | ~11 |
| `systemplane_keys_tenancy.go` | tenancy | ~11 |
| `systemplane_keys_postgres.go` | postgres | ~19 |
| `systemplane_keys_messaging.go` | redis, rabbitmq | ~20 |
| `systemplane_keys_runtime_http.go` | auth, swagger, telemetry, rate_limit | ~17 |
| `systemplane_keys_runtime_services.go` | infrastructure, idempotency, callback_rate_limit, fetcher | ~14 |
| `systemplane_keys_storage_export.go` | deduplication, object_storage, export_worker | ~11 |
| `systemplane_keys_workers.go` | webhook, cleanup_worker | ~5 |
| `systemplane_keys_archival.go` | scheduler, archival | ~12 |
---
## Appendix K: Quick Reference Commands
```bash
# LOCAL DEV ONLY: requires PLUGIN_AUTH_ENABLED=false
# View current runtime config
curl -s http://localhost:4018/v1/system/configs | jq
# View schema (all keys, types, mutability)
curl -s http://localhost:4018/v1/system/configs/schema | jq
# Change a runtime key (replace current-revision with the revision from the latest ETag)
curl -X PATCH http://localhost:4018/v1/system/configs \
-H 'Content-Type: application/json' \
-H 'If-Match: "current-revision"' \
-d '{"values": {"rate_limit.max": 200}}'
# View change history
curl -s http://localhost:4018/v1/system/configs/history | jq
# Force full reload
curl -X POST http://localhost:4018/v1/system/configs/reload
# View settings
curl -s 'http://localhost:4018/v1/system/settings?scope=global' | jq
# Change a setting (replace current-revision with the revision from the latest ETag)
curl -X PATCH http://localhost:4018/v1/system/settings \
-H 'Content-Type: application/json' \
-H 'If-Match: "current-revision"' \
-d '{"scope": "global", "values": {"feature.enabled": true}}'
```
---
## Appendix L: Files to Create per Service
### Core Systemplane Wiring
| File | Purpose | Matcher Reference |
|------|---------|-------------------|
| `systemplane_init.go` | Init function with 11-step boot + change feed start | `internal/bootstrap/systemplane_init.go` |
| `systemplane_mount.go` | HTTP route registration + swagger merge | `internal/bootstrap/systemplane_mount.go` |
### Key Definitions (split by group)
| File | Purpose | Matcher Reference |
|------|---------|-------------------|
| `systemplane_keys.go` | Orchestrator: `Register{Service}Keys()` + `{service}KeyDefs()` | `internal/bootstrap/systemplane_keys.go` |
| `systemplane_keys_{group}.go` | Per-group key definitions | `internal/bootstrap/systemplane_keys_{group}.go` |
| `systemplane_keys_validation.go` | Validator functions used by KeyDef.Validator | `internal/bootstrap/systemplane_keys_validation.go` |
| `systemplane_keys_helpers.go` | `concatKeyDefs()` utility | `internal/bootstrap/systemplane_keys_helpers.go` |
### Bundle + Factory
| File | Purpose | Matcher Reference |
|------|---------|-------------------|
| `systemplane_bundle.go` | Bundle struct + Close + AdoptResourcesFrom (ownership tracking) | `internal/bootstrap/systemplane_bundle.go` |
| `systemplane_factory.go` | BundleFactory + IncrementalBundleFactory (full + incremental) | `internal/bootstrap/systemplane_factory.go` |
| `systemplane_factory_infra.go` | Per-component builders: buildPostgres, buildRedis, buildRabbitMQ, buildS3 | `internal/bootstrap/systemplane_factory_infra.go` |
### Reconcilers
| File | Purpose | Matcher Reference |
|------|---------|-------------------|
| `systemplane_reconciler_http.go` | HTTP policy validation (PhaseValidation) | `internal/bootstrap/systemplane_reconciler_http.go` |
| `systemplane_reconciler_publishers.go` | RabbitMQ publisher staging (PhaseValidation) | `internal/bootstrap/systemplane_reconciler_publishers.go` |
| `systemplane_reconciler_worker.go` | Worker restart (PhaseSideEffect) | `internal/bootstrap/systemplane_reconciler_worker.go` |
### Identity + Authorization
| File | Purpose | Matcher Reference |
|------|---------|-------------------|
| `systemplane_identity.go` | JWT → Actor bridge | `internal/bootstrap/systemplane_identity.go` |
| `systemplane_authorizer.go` | Permission mapping | `internal/bootstrap/systemplane_authorizer.go` |
### Config Manager Integration
| File | Purpose | Matcher Reference |
|------|---------|-------------------|
| `config_manager_systemplane.go` | Snapshot → Config hydration | `internal/bootstrap/config_manager_systemplane.go` |
| `config_manager_seed.go` | Env → Store one-time seed | `internal/bootstrap/config_manager_seed.go` |
| `config_manager_helpers.go` | Type-safe value comparison | `internal/bootstrap/config_manager_helpers.go` |
| `config_validation.go` | Production config guards | `internal/bootstrap/config_validation.go` |
### Runtime Integration
| File | Purpose | Matcher Reference |
|------|---------|-------------------|
| `active_bundle_state.go` | Thread-safe live-read accessor for current bundle | `internal/bootstrap/active_bundle_state.go` |
| `config/.config-map.example` | Bootstrap-only key reference (operators) | `config/.config-map.example` |
### Files to Delete
| File | Reason |
|------|--------|
| `config/.env.example` | Replaced by code defaults + `.config-map.example` |
| `config/*.yaml.example` | No more YAML config |
| Config API handlers (old) | Replaced by systemplane HTTP adapter |
| Config file watcher | Replaced by change feed |
| Config audit publisher (old) | Replaced by systemplane history |
| Config YAML loader | No more YAML |
### Files to Modify
| File | Change |
|------|--------|
| `config_loading.go` | Remove YAML loading, keep env-only |
| `config_defaults.go` | Derive from `{service}KeyDefs()` via `configFromSnapshot(defaultSnapshotFromKeyDefs(...))` |
| `config_manager.go` | Add `UpdateFromSystemplane()`, `enterSeedMode()`, `atomic.Pointer[Config]` for lock-free reads |
| `init.go` | Wire systemplane init after workers, before HTTP mount; add reload observer callback |
| `service.go` | Add systemplane shutdown sequence (5-step) |
| `docker-compose.yml` | Remove `env_file`, inline defaults with `${VAR:-default}` |
| `Makefile` | Remove `set-env`, `clear-envs` targets |
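The `atomic.Pointer[Config]` change to `config_manager.go` follows a common copy-on-write pattern: readers load a pointer to an immutable snapshot, and writers swap the pointer. A minimal sketch, assuming Go 1.19+ and a trimmed stand-in `Config`; the real `UpdateFromSystemplane()` takes a systemplane snapshot rather than a ready-made struct.

```go
package main

import "sync/atomic"

// Config is a trimmed stand-in for the service's config struct.
type Config struct {
	RateLimitMax int
}

// ConfigManager publishes immutable Config snapshots through an atomic pointer.
type ConfigManager struct {
	current atomic.Pointer[Config]
}

// NewConfigManager stores the initial snapshot so Get never returns nil.
func NewConfigManager(initial Config) *ConfigManager {
	m := &ConfigManager{}
	m.current.Store(&initial)
	return m
}

// Get returns the current snapshot without taking a lock.
// Callers must treat the returned value as read-only.
func (m *ConfigManager) Get() *Config {
	return m.current.Load()
}

// UpdateFromSystemplane swaps in a new snapshot. Readers that already hold
// the previous pointer keep a consistent view of the old values.
func (m *ConfigManager) UpdateFromSystemplane(next Config) {
	m.current.Store(&next)
}
```

The design trades a small allocation per update for lock-free reads, which suits config that is read on every request and written rarely.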
---
## Appendix M: Canonical Key Catalog
The canonical key catalog is defined in `lib-commons/commons/systemplane/catalog/`. Products MUST match these names, tiers, and components for shared infrastructure keys.
### Naming Conventions
| Convention | Rule | Example |
|-----------|------|---------|
| SSL mode | `ssl_mode` (with underscore) | `postgres.primary_ssl_mode` |
| Connection count | Plural `conns` | `postgres.max_open_conns`, `redis.min_idle_conns` |
| CORS | `cors.*` namespace (NOT `server.cors_*`) | `cors.allowed_origins` |
| RabbitMQ connection | `rabbitmq.url` (NOT `uri`) | `rabbitmq.url` |
| Timeout unit suffix | MANDATORY `_ms` or `_sec` | `redis.read_timeout_ms`, `rate_limit.expiry_sec` |
| Size unit suffix | MANDATORY `_bytes` | `server.body_limit_bytes` |
### Tier Classification Standard
| Config Category | Canonical Tier | Rationale |
|----------------|---------------|-----------|
| PG pool tuning (`max_open_conns`, etc.) | **LiveRead** | Go's `database/sql` supports `SetMaxOpenConns()` at runtime |
| CORS settings | **LiveRead** | Middleware reads from snapshot per-request |
| Log level | **LiveRead** | Use `zap.AtomicLevel.SetLevel()` |
| Migration path | **BootstrapOnly** | Migrations only run at startup |
| Body limit | **BootstrapOnly** | Set at Fiber server initialization |
| DB connection strings | **BundleRebuild** | Requires new connection pool |
| Redis connection | **BundleRebuild** | Requires new Redis client |
| Worker enable/disable | **BundleRebuildAndReconcile** | Needs new connections + worker restart |
| Worker intervals | **WorkerReconcile** | Needs worker restart only |
| Rate limits, timeouts, TTLs | **LiveRead** | Read per-request from snapshot |
### Enforcement
Products run `catalog.ValidateKeyDefs()` in their test suite. Any mismatch is a test failure that blocks CI.
---
## Appendix N: Environment Variable Convention
| Infrastructure | Prefix | Examples |
|---------------|--------|---------|
| PostgreSQL | `POSTGRES_*` | `POSTGRES_HOST`, `POSTGRES_PORT`, `POSTGRES_PASSWORD` |
| Redis | `REDIS_*` | `REDIS_HOST`, `REDIS_PASSWORD`, `REDIS_DB` |
| RabbitMQ | `RABBITMQ_*` | `RABBITMQ_URL`, `RABBITMQ_EXCHANGE` |
| Auth | `PLUGIN_AUTH_*` | `PLUGIN_AUTH_ENABLED`, `PLUGIN_AUTH_ADDRESS` |
| Telemetry | `OTEL_*` / `ENABLE_TELEMETRY` | `OTEL_RESOURCE_SERVICE_NAME` |
| Server | `SERVER_*` | `SERVER_ADDRESS`, `SERVER_TLS_CERT_FILE` |
**PROHIBITED:** `DB_HOST`, `DB_PORT`, `DB_USER`, `DB_PASSWORD`, `DB_NAME` — use `POSTGRES_*` prefix.
**PROHIBITED:** `AUTH_ENABLED` without `PLUGIN_` prefix — use `PLUGIN_AUTH_ENABLED`.
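These prohibitions are easy to enforce mechanically, for example in a unit test that walks the env vars a service reads. The sketch below is one possible check, not an existing lib-commons function; `checkEnvName` is a hypothetical name, and it covers only the names listed above.

```go
package main

// prohibitedEnvNames lists the names ruled out above, with the advice to give.
var prohibitedEnvNames = map[string]string{
	"DB_HOST":      "use the POSTGRES_* prefix",
	"DB_PORT":      "use the POSTGRES_* prefix",
	"DB_USER":      "use the POSTGRES_* prefix",
	"DB_PASSWORD":  "use the POSTGRES_* prefix",
	"DB_NAME":      "use the POSTGRES_* prefix",
	"AUTH_ENABLED": "use PLUGIN_AUTH_ENABLED",
}

// checkEnvName returns advice when name is prohibited, or "" when it is allowed.
func checkEnvName(name string) string {
	return prohibitedEnvNames[name]
}
```

A test can then fail on the first non-empty result and print the advice alongside the offending name.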
---
## Appendix O: Unit Suffix Standard
All config keys with dimensional values MUST include a unit suffix:
| Dimension | Suffix | Examples |
|-----------|--------|---------|
| Time (milliseconds) | `_ms` | `redis.read_timeout_ms`, `rabbitmq.publish_timeout_ms` |
| Time (seconds) | `_sec` | `rate_limit.expiry_sec`, `auth.cache_ttl_sec` |
| Time (minutes) | `_mins` | `postgres.conn_max_lifetime_mins` |
| Size (bytes) | `_bytes` | `server.body_limit_bytes` |
| Count | Plural noun | `postgres.max_open_conns`, `redis.min_idle_conns` |
**PROHIBITED:** Dimensionless timeout keys like `webhook.timeout` — must be `webhook.timeout_ms` or `webhook.timeout_sec`.
**PROHIBITED:** Mixed units within a group: if one timeout in a group uses `_ms`, all timeouts in that group must use `_ms`.
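Both timeout rules can be linted over a list of key names. A minimal sketch under stated assumptions: a key counts as a timeout when the part after the group contains `timeout` (a heuristic chosen for this example), the group is the text before the first dot, and `lintTimeoutKeys` is a hypothetical helper rather than part of `catalog.ValidateKeyDefs()`.

```go
package main

import "strings"

// timeUnit returns the time suffix of name ("_ms", "_sec", "_mins") or "".
func timeUnit(name string) string {
	for _, suffix := range []string{"_ms", "_sec", "_mins"} {
		if strings.HasSuffix(name, suffix) {
			return suffix
		}
	}
	return ""
}

// lintTimeoutKeys returns one message per violation of the two rules:
// timeout keys need a unit suffix, and a group must not mix time units.
func lintTimeoutKeys(keys []string) []string {
	var problems []string
	unitByGroup := map[string]string{}
	for _, key := range keys {
		group, name, _ := strings.Cut(key, ".")
		if !strings.Contains(name, "timeout") {
			continue // heuristic: only timeout-like keys are checked
		}
		unit := timeUnit(name)
		if unit == "" {
			problems = append(problems, key+": timeout key has no unit suffix")
			continue
		}
		if prev, seen := unitByGroup[group]; seen && prev != unit {
			problems = append(problems, key+": group "+group+" mixes "+prev+" and "+unit)
			continue
		}
		unitByGroup[group] = unit
	}
	return problems
}
```

The first unit seen in a group becomes that group's reference, so the order of the key list decides which key is reported when units conflict.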