Fixes three tightly-coupled bugs that made web approval gates unusable:
1. orchestrator-agent did not pass parentConversationId to executeWorkflow
for any web-dispatched foreground / interactive / resumable run. Without
that field, findResumableRunByParentConversation (the machinery the CLI
relies on for resume) couldn't find the paused run from the same
conversation on a follow-up message, and the approve/reject API handlers
had no conversation to dispatch back to.
2. POST /api/workflows/runs/:runId/{approve,reject} recorded the decision
and returned "Send a message to continue the workflow." — the workflow
never actually resumed. Added tryAutoResumeAfterGate() that mirrors what
workflowApproveCommand / workflowRejectCommand already do on the CLI:
look up the parent conversation, dispatch `/workflow run <name>
<userMessage>` back through dispatchToOrchestrator. Failures are
non-fatal — the user can still send a manual message as a fallback.
3. The during-streaming cancel-check in dag-executor aborted any streaming
node whenever the run status left 'running', including the legitimate
transition to 'paused' that an approval node performs. A concurrent AI
node in the same DAG layer now tolerates 'paused' and finishes its own
stream; only truly terminal / unknown states (null, cancelled, failed,
completed) abort the in-flight stream.
Web UI: ConfirmRunActionDialog gains an optional reasonInput prop (label +
placeholder) that renders a textarea and passes the trimmed value to
onConfirm. WorkflowRunCard (dashboard) and WorkflowProgressCard (chat)
both use it for Reject now — the chat card was still on window.confirm,
which was both inconsistent with the dashboard and couldn't collect a
reason. The trimmed reason threads through to $REJECTION_REASON in the
workflow's on_reject prompt.
Supersedes #1147. @jonasvanderhaegen surfaced the root cause and shape of
the fix; that PR was 87 commits stale and pre-dated the reject-UX upgrade
(#1261 area), so this is a fresh re-do on current dev.
Tests:
- packages/server/src/routes/api.workflow-runs.test.ts — 5 new cases:
approve with parent dispatches; approve without parent returns "Send a
message"; approve with deleted parent conversation skips safely; reject
dispatches on-reject flows; reject that cancels (no on_reject) does NOT
dispatch.
- packages/core/src/orchestrator/orchestrator.test.ts — updated the two
synthesizedPrompt-dispatch tests for the new executeWorkflow arity.
Closes#1131.
Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>
* feat(paths,workflows): unify ~/.archon/{workflows,commands,scripts} + drop globalSearchPath
Collapses the awkward `~/.archon/.archon/workflows/` convention to a direct
`~/.archon/workflows/` child (matching `workspaces/`, `archon.db`, etc.), adds
home-scoped commands and scripts with the same loading story, and kills the
opt-in `globalSearchPath` parameter so every call site gets home-scope for free.
Closes#1136 (supersedes @jonasvanderhaegen's tactical fix — the bug was the
primitive itself: an easy-to-forget parameter that five of six call sites on
dev dropped).
Primitive changes:
- Home paths are direct children of `~/.archon/`. New helpers in `@archon/paths`:
`getHomeWorkflowsPath()`, `getHomeCommandsPath()`, `getHomeScriptsPath()`,
and `getLegacyHomeWorkflowsPath()` (detection-only for migration).
- `discoverWorkflowsWithConfig(cwd, loadConfig)` reads home-scope internally.
The old `{ globalSearchPath }` option is removed. Chat command handler, Web
UI workflow picker, orchestrator resolve path — all inherit home-scope for
free without maintainer patches at every new site.
- `discoverScriptsForCwd(cwd)` merges home + repo scripts (repo wins on name
collision). dag-executor and validator use it; the hardcoded
`resolve(cwd, '.archon', 'scripts')` single-scope path is gone.
- Command resolution is now walked-by-basename in each scope. `loadCommand`
and `resolveCommand` walk 1 subfolder deep and match by `.md` basename, so
`.archon/commands/triage/review.md` resolves as `review` — closes the
latent bug where subfolder commands were listed but unresolvable.
- All three (`workflows/`, `commands/`, `scripts/`) enforce a 1-level
subfolder cap (matches the existing `defaults/` convention). Deeper
nesting is silently skipped.
- `WorkflowSource` gains `'global'` alongside `'bundled'` and `'project'`.
Web UI node palette shows a dedicated "Global (~/.archon/commands/)"
section; badges updated.
Migration (clean cut — no fallback read):
- First use after upgrade: if `~/.archon/.archon/workflows/` exists, Archon
logs a one-time WARN per process with the exact `mv` command:
`mv ~/.archon/.archon/workflows ~/.archon/workflows && rmdir ~/.archon/.archon`
The legacy path is NOT read — users migrate manually. Rollback caveat
noted in CHANGELOG.
Tests:
- `@archon/paths/archon-paths.test.ts`: new helper tests (default HOME,
ARCHON_HOME override, Docker), plus regression guards for the double-`.archon/`
path.
- `@archon/workflows/loader.test.ts`: home-scoped workflows, precedence,
subfolder 1-depth cap, legacy-path deprecation warning fires exactly once
per process.
- `@archon/workflows/validator.test.ts`: home-scoped commands + subfolder
resolution.
- `@archon/workflows/script-discovery.test.ts`: depth cap + merge semantics
(repo wins, home-missing tolerance).
- Existing CLI + orchestrator tests updated to drop `globalSearchPath`
assertions.
E2E smoke (verified locally, before cleanup):
- `.archon/workflows/e2e-home-scope.yaml` + scratch repo at /tmp
- Home-scoped workflow discovered from an unrelated git repo
- Home-scoped script (`~/.archon/scripts/*.ts`) executes inside a script node
- 1-level subfolder workflow (`~/.archon/workflows/triage/*.yaml`) listed
- Legacy path warning fires with actionable `mv` command; workflows there
are NOT loaded
Docs: `CLAUDE.md`, `docs-web/guides/global-workflows.md` (full rewrite for
three-type scope + subfolder convention + migration), `docs-web/reference/
configuration.md` (directory tree), `docs-web/reference/cli.md`,
`docs-web/guides/authoring-workflows.md`.
Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>
* test(script-discovery): normalize path separators in mocks for Windows
The 4 new tests in `scanScriptDir depth cap` and `discoverScriptsForCwd —
merge repo + home with repo winning` compared incoming mock paths with
hardcoded forward-slash strings (`if (path === '/scripts/triage')`). On
Windows, `path.join('/scripts', 'triage')` produces `\scripts\triage`, so
those branches never matched, readdir returned `[]`, and the tests failed.
Added a `norm()` helper at module scope and wrapped the incoming `path`
argument in every `mockImplementation` before comparing. Stored paths go
through `normalizeSep()` in production code, so the existing equality
assertions on `script.path` remain OS-independent.
Fixes Windows CI job `test (windows-latest)` on PR #1315.
* address review feedback: home-scope error handling, depth cap, and tests
Critical fixes:
- api.ts: add `maxDepth: 1` to all 3 findMarkdownFilesRecursive calls in
GET /api/commands (bundled/home/project). Without this the UI palette
surfaced commands from deep subfolders that the executor (capped at 1)
could not resolve — silent "command not found" at runtime.
- validator.ts: wrap home-scope findMarkdownFilesRecursive and
resolveCommandInDir calls in try/catch so EACCES/EPERM on
~/.archon/commands/ doesn't crash the validator with a raw filesystem
error. ENOENT still returns [] via the underlying helper.
Error handling fixes:
- workflow-discovery.ts: maybeWarnLegacyHomePath now sets the
"warned-once" flag eagerly before `await access()`, so concurrent
discovery calls (server startup with parallel codebase resolution)
can't double-warn. Non-ENOENT probe errors (EACCES/EPERM) now log at
WARN instead of DEBUG so permission issues on the legacy dir are
visible in default operation.
- dag-executor.ts: wrap discoverScriptsForCwd in its own try/catch so
an EACCES on ~/.archon/scripts/ routes through safeSendMessage /
logNodeError with a dedicated "failed to discover scripts" message
instead of being mis-attributed by the outer catch's
"permission denied (check cwd permissions)" branch.
Tests:
- load-command-prompt.test.ts (new): 6 tests covering the executor's
command resolution hot path — home-scope resolves when repo misses,
repo shadows home, 1-level subfolder resolvable by basename, 2-level
rejected, not-found, empty-file. Runs in its own bun test batch.
- archon-paths.test.ts: add getHomeScriptsPath describe block to match
the existing getHomeCommandsPath / getHomeWorkflowsPath coverage.
Comment clarity:
- workflow-discovery.ts: MAX_DISCOVERY_DEPTH comment now leads with the
actual value (1) before describing what 0 would mean.
- script-discovery.ts: copy the "routing ambiguity" rationale from
MAX_DISCOVERY_DEPTH to MAX_SCRIPT_DISCOVERY_DEPTH.
Cleanup:
- Remove .archon/workflows/e2e-home-scope.yaml — one-off smoke test that
would ship permanently in every project's workflow list. Equivalent
coverage exists in loader.test.ts.
Addresses all blocking and important feedback from the multi-agent
review on PR #1315.
---------
Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>
* feat(providers/pi): interactive flag binds UIContext for extensions
Adds `interactive: true` opt-in to Pi provider (in `.archon/config.yaml`
under `assistants.pi`) that binds a minimal `ExtensionUIContext` stub to
each session. Without this, Pi's `ExtensionRunner.hasUI()` reports false,
causing extensions like `@plannotator/pi-extension` to silently auto-approve
every plan instead of opening their browser review UI.
Semantics: clamped to `enableExtensions: true` — no extensions loaded
means nothing would consume `hasUI`, so `interactive` alone is silently
dropped. Stub forwards `notify()` to Archon's event stream; interactive
dialogs (select/confirm/input/editor/custom) resolve to undefined/false;
TUI-only setters (widgets/headers/footers/themes) no-op. Theme access
throws with a clear diagnostic — Pi's theme singleton is coupled to its
own `Symbol.for()` registry which Archon doesn't own.
Trust boundary: only binds when the operator has explicitly enabled
both flags. Extensions gated on `ctx.hasUI` (plannotator and similar)
get a functional UI context; extensions that reach for TUI features
still fail loudly rather than rendering garbage.
Includes smoke-test workflow documenting the integration surface.
End-to-end plannotator UI rendering requires plan-mode activation
(Pi `--plan` CLI flag or `/plannotator` TUI slash command) which is
out of reach for programmatic Archon sessions — manual test only.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(providers/pi): end-to-end interactive extension UI
Three fixes that together get plannotator's browser review UI to actually
render from an Archon workflow and reach the reviewer's browser.
1. Call resourceLoader.reload() when enableExtensions is true.
createAgentSession's internal reload is gated on `!resourceLoader`, so
caller-supplied loaders must reload themselves. Without this,
getExtensions() returns the empty default, no ExtensionRunner is built,
and session.extensionRunner.setFlagValue() silently no-ops.
2. Set PLANNOTATOR_REMOTE=1 in interactive mode.
plannotator-browser.ts only calls ctx.ui.notify(url) when openBrowser()
returns { isRemote: true }; otherwise it spawns xdg-open/start on the
Archon server host — invisible to the user and untestable from bash
asserts. From the workflow runner's POV every Archon execution IS
remote; flipping the heuristic routes the URL through notify(), which
the ExtensionUIContext stub forwards into the event stream. Respect
explicit operator overrides.
3. notify() emits as assistant chunks, not system chunks.
The DAG executor's system-chunk filter only forwards warnings/MCP
prefixes, and only assistant chunks accumulate into $nodeId.output.
Emitting as assistant makes the URL available both in the user's
stream and in downstream bash/script nodes via output substitution.
Plus: extensionFlags config pass-through (equivalent to `pi --plan` on the
CLI) applied via ExtensionRunner.setFlagValue() BEFORE bindExtensions
fires session_start, so extensions reading flags in their startup handler
actually see them. Also bind extensions with an empty binding when
enableExtensions is on but interactive is off, so session_start still
fires for flag-driven but UI-less extensions.
Smoke test (.archon/workflows/e2e-plannotator-smoke.yaml) uses
openai-codex/gpt-5.4-mini (ChatGPT Plus OAuth compatible) and bumps
idle_timeout to 600000ms so plannotator's server survives while a human
approves in the browser.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* refactor(providers/pi): keep Archon extension-agnostic
Remove the plannotator-specific PLANNOTATOR_REMOTE=1 env var write from
the Pi provider. Archon's provider layer shouldn't know about any
specific extension's internals. Document the env var in the plannotator
smoke test instead — operators who use plannotator set it via their shell
or per-codebase env config.
Workflow smoke test updated with:
- Instructions for setting PLANNOTATOR_REMOTE=1 externally
- Simpler assertion (URL emission only) — validated in a real
reject-revise-approve run: reviewer annotated, clicked Send Feedback,
Pi received the feedback as a tool result, revised the plan (added
aria-label and WCAG contrast per the annotation), resubmitted, and
reviewer approved. Plannotator's tool result signals approval but
doesn't return the plan text, so the bash assertion now only checks
that the review URL reached the stream (not that plan content flowed
into \$nodeId.output — it can't).
- Known-limitation note documenting the tool-result shape so downstream
workflow authors know to Write the plan separately if they need it.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* chore(providers/pi): keep e2e-plannotator-smoke workflow local-only
The smoke test is plannotator-specific (calls plannotator_submit_plan,
expects PLAN.md on disk, requires PLANNOTATOR_REMOTE=1) and is better
kept out of the PR while the extension-agnostic infra lands.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* style(providers/pi): trim verbose inline comments
Collapse multi-paragraph SDK explanations to 1-2 line "why" notes across
provider.ts, types.ts, ui-context-stub.ts, and event-bridge.ts. No
behavior change.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(providers/pi): wire assistants.pi.env + theme-proxy identity
Two end-to-end fixes discovered while exercising the combined
plannotator + @pi-agents/loop smoke flow:
- PiProviderDefaults gains an optional `env` map; parsePiConfig picks
it up and the provider applies it to process.env at session start
(shell env wins, no override). Needed so extensions like plannotator
can read PLANNOTATOR_REMOTE=1 from config.yaml without requiring a
shell export before `archon workflow run`.
- ui-context-stub theme proxy returns identity decorators instead of
throwing on unknown methods. Styled strings flow into no-op
setStatus/setWidget sinks anyway, so the throw was blocking
plannotator_submit_plan after HTTP approval with no benefit.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(providers/pi): flush notify() chunks immediately in batch mode
Batch-mode adapters (CLI) accumulate assistant chunks and only flush on
node completion. That broke plannotator's review-URL flow: Pi's notify()
emitted the URL as an assistant chunk, but the user needed the URL to
POST /api/approve — which is what unblocks the node in the first place.
Adds an optional `flush` flag on assistant MessageChunks. notify() sets
it, and the DAG executor drains pending batched content before surfacing
the flushed chunk so ordering is preserved.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs: mention Pi alongside Claude and Codex in README + top-level docs
The AI assistants docs page already covers Pi in depth, but the README
architecture diagram + docs table, overview "Further Reading" section,
and local-deployment .env comment still listed only Claude/Codex.
Left feature-specific mentions alone where Pi genuinely lacks support
(e.g. structured output — Claude + Codex only).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs: note Pi structured output (best-effort) in matrix + workflow docs
Pi gained structured output support via prompt augmentation + JSON
extraction (see packages/providers/src/community/pi/capabilities.ts).
Unlike Claude/Codex, which use SDK-enforced JSON mode, Pi appends the
schema to the prompt and parses JSON out of the result text (bare or
fenced). Updates four stale references that still said Claude/Codex-only:
- ai-assistants.md capabilities matrix
- authoring-workflows.md (YAML example + field table)
- workflow-dag.md skill reference
- CLAUDE.md DAG-format node description
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(providers/pi): default extensions + interactive to on
Extensions (community packages like @plannotator/pi-extension and
user-authored ones) are a core reason users pick Pi. Defaulting
enableExtensions and interactive to false previously silenced installed
extensions with no signal, leading to "did my extension even load?"
confusion.
Opt out in .archon/config.yaml when you want the prior behavior:
assistants:
pi:
enableExtensions: false # skip extension discovery entirely
# interactive: false # load extensions, but no UI bridge
Docs gain a new "Extensions (on by default)" section in
getting-started/ai-assistants.md that documents the three config
surfaces (extensionFlags, env, workflow-level interactive) and uses
plannotator as a concrete walk-through example.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
* feat(workflows): inline sub-agent definitions on DAG nodes
Add `agents:` node field letting workflow YAML define Claude Agent SDK
sub-agents inline, keyed by kebab-case ID. The main agent can spawn
them via the Task tool — useful for map-reduce patterns where a cheap
model briefs items and a stronger model reduces.
Authors no longer need standalone `.claude/agents/*.md` files for
workflow-scoped helpers; the definitions live with the workflow.
Claude only. Codex and community providers without the capability
emit a capability warning and ignore the field. Merges with the
internal `dag-node-skills` wrapper when `skills:` is also set —
user-defined agents win on ID collision.
* fix(workflows): address PR #1276 review feedback
Critical:
- Re-export agentDefinitionSchema + AgentDefinition from schemas/index.ts
(matches the "schemas/index.ts re-exports all" convention).
Important:
- Surface user-override of internal 'dag-node-skills' wrapper: warn-level
provider log + platform message to the user when agents: redefines the
reserved ID alongside skills:. User-wins behavior preserved (by design)
but silent capability removal is now observable.
- Add validator test coverage for the agents-capability warning (codex
node with agents: → warning; claude node → no warning; no-agents
field → no warning).
- Strengthen NodeConfig.agents duplicate-type comment explaining the
intentional circular-dep avoidance and pointing at the Zod schema as
authoritative source. Actual extraction is follow-up work.
Simplifications:
- Drop redundant typeof check in validator (schema already enforces).
- Drop unreachable Object.keys(...).length > 0 check in dag-executor.
- Drop rot-prone "(out of v1 scope)" parenthetical.
- Drop WHAT-only comment on AGENT_ID_REGEX.
- Tighten AGENT_ID_REGEX to reject trailing/double hyphens
(/^[a-z0-9]+(-[a-z0-9]+)*$/).
Tests:
- parseWorkflow strips agents on script: and loop: nodes (parallel to
the existing bash: coverage).
- provider emits warn log on dag-node-skills collision; no warn on
non-colliding inline agents.
Docs:
- Renumber authoring-workflows Summary section (12b → 13; bump 13-19).
- Add Pi capability-table row for inline agents (❌, Claude-only).
- Add when-to-use guidance (agents: vs .claude/agents/*.md) in the
new "Inline sub-agents" section.
- Cross-link skills.md Related → inline-sub-agents.
- CHANGELOG [Unreleased] Added entry for #1276.
Previously, `dag-executor` only failed nodes/iterations when the SDK
returned an `error_max_budget_usd` result. Every other `isError: true`
subtype — including `error_during_execution` — was silently `break`ed
out of the stream with whatever partial output had accumulated, letting
failed runs masquerade as successful ones with empty output.
This is the most likely explanation for the "5-second crash" symptom in
#1208: iterations finish instantly with empty text, the loop keeps
going, and only the `claude.result_is_error` log tips the user off.
Changes:
- Capture the SDK's `errors: string[]` detail on result messages
(previously discarded) and surface it through `MessageChunk.errors`.
- Log `errors`, `stopReason` alongside `errorSubtype` in
`claude.result_is_error` so users can see what actually failed.
- Throw from both the general node path and the loop iteration path
on any `isError: true` result, including the subtype and SDK errors
detail in the thrown message.
Note: this does not implement auto-retry. See PR comments on #1121 and
the analysis on #1208 — a retry-with-fresh-session approach for loop
iterations is not obviously correct until we see what
`error_during_execution` actually carries in the reporter's env.
This change is the observability + fail-loud step that has to come
first so that signal is no longer silent.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Injects exit 1 into e2e-deterministic bash-echo node to prove the engine
fix (failWorkflowRun on anyFailed) propagates to a non-zero CLI exit code
and a red X in GitHub Actions. Will be reverted in the next commit.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: replace hardcoded provider factory with typed registry system
Replace the built-in-only factory switch with a typed ProviderRegistration
registry where entries carry metadata (displayName, capabilities,
isModelCompatible) alongside the factory function. This enables community
providers to register without modifying core code.
- Add ProviderRegistration and ProviderInfo types to contract layer
- Create registry.ts with register/get/list/clear API, delete factory.ts
- Bootstrap registerBuiltinProviders() at server and CLI entrypoints
- Widen provider unions from 'claude' | 'codex' to string across schemas,
config types, deps, executors, and API validation
- Replace hardcoded model-validation with registry-driven isModelCompatible
and inferProviderFromModel (built-in only inference)
- Add GET /api/providers endpoint returning registry metadata
- Dynamic provider dropdowns in Web UI (BuilderToolbar, NodeInspector,
WorkflowBuilder, SettingsPage) via useProviders hook
- Dynamic provider selection in CLI setup command
- Registry test suite covering full lifecycle
* feat: generalize assistant config and tighten registry validation
- Add ProviderDefaults/ProviderDefaultsMap generic types to contract layer
- Add index signatures to ClaudeProviderDefaults/CodexProviderDefaults
- Introduce AssistantDefaults/AssistantDefaultsConfig intersection types
that combine ProviderDefaultsMap with typed built-in entries
- Replace hardcoded claude/codex config merging with generic
mergeAssistantDefaults() that iterates all provider entries
- Replace hardcoded toSafeConfig projection with generic
toSafeAssistantDefaults() that strips server-internal fields
- Validate provider strings at all config-entry surfaces: env override,
global config, repo config all throw on unknown providers
- Validate provider on PATCH /api/config/assistants (400 on unknown)
- Move validator.ts from hardcoded Codex checks to capability-driven
warnings using registry getProviderCapabilities()
- Remove resolveProvider() default to 'claude' — returns undefined when
no provider is set, skipping capability warnings for unresolved nodes
- Widen config API schemas to generic Record<string, ProviderDefaults>
- Rewrite SettingsPage to iterate providers dynamically with built-in
specific UI for Claude/Codex and generic JSON view for community
- Extract bootstrap to provider-bootstrap modules in CLI and server
- Remove all as Record<...> casts from dag-executor, executor,
orchestrator — clean indexing via ProviderDefaultsMap intersection
* fix: remove remaining hardcoded provider assumptions and regenerate types
- Replace hardcoded 'claude' defaults in CLI setup with registry lookup
(getRegisteredProviders().find(p => p.builtIn)?.id)
- Replace hardcoded 'claude' default in clone.ts folder detection with
registry-driven fallback
- Update config YAML comment from "claude or codex" to "registered provider"
- Make bootstrap test assertions use toContain instead of exact toEqual
so they don't break when community providers are registered
- Widen validator.test.ts helper from 'claude' | 'codex' to string
- Remove unnecessary type casts in NodeInspector, WorkflowBuilder,
SettingsPage now that generated types use string
- Regenerate api.generated.d.ts from updated OpenAPI spec — all provider
fields are now string instead of 'claude' | 'codex' union
* fix: address PR review findings — consistency, tests, docs
Critical fixes:
- isModelCompatible now throws on unknown providers (fail-fast parity
with getProviderCapabilities) instead of silently returning true
- Schema provider fields use z.string().trim().min(1) to reject
whitespace-only values
- validator.ts resolveProvider accepts defaultProvider param so
capability warnings fire for config-inherited providers
- PATCH /api/config/assistants validates assistants keys against
registry (rejects unknown provider IDs in the map)
YAGNI cleanup:
- Delete provider-bootstrap.ts wrappers in CLI and server — call
registerBuiltinProviders() directly
- Remove no-op .map(provider => provider) in SettingsPage
Test coverage:
- Add GET /api/providers endpoint tests (shape, projection, capabilities)
- Add config-loader throw-path tests for unknown providers in env var,
global config, and repo config
- Add isModelCompatible throw test for unknown providers
Docs:
- CLAUDE.md: factory.ts → registry.ts in directory tree, add
GET /api/providers to API endpoints section
- .env.example: update DEFAULT_AI_ASSISTANT comment
- docs-web configuration reference: update provider constraint docs
UI:
- Settings default-assistant dropdown uses allProviderEntries fallback
(no longer silently empty on API failure)
- clearRegistry marked @internal in JSDoc
* fix: use registry defaults in getDefaults/registerProject, document type design
- getDefaults() initializes assistant defaults from registered providers
instead of hardcoding { claude: {}, codex: {} }
- getDefaults() uses first registered built-in as default assistant
instead of hardcoding 'claude'
- handleRegisterProject uses config.assistant instead of hardcoded 'claude'
for new codebase ai_assistant_type
- Document AssistantDefaults/AssistantDefaultsConfig intersection types:
built-in keys are typed for parseClaudeConfig/parseCodexConfig type
safety; community providers use the generic [string] index
- Document WorkflowConfig.assistants intersection type with same rationale
* docs: update stale provider references to reflect registry system
- architecture.md: DB schema comment now says 'registered provider'
- first-workflow.md: provider field accepts any registered provider
- quick-reference.md: provider type changed from enum to string
- authoring-workflows.md: provider type changed from enum to string
- title-generator.ts: @param doc updated from 'claude or codex' to
generic provider identifier
* docs: fix remaining stale provider references in quick-reference and authoring guide
- quick-reference.md: per-node provider type changed from enum to string
- quick-reference.md: model mismatch guidance updated for registry pattern
- authoring-workflows.md: provider comment says 'any registered provider'
* refactor: extract provider metadata seam for Phase 2 registry readiness
- Add static capability constants (capabilities.ts) for Claude and Codex
- Export getProviderCapabilities() from @archon/providers for capability
queries without provider instantiation
- Add inferProviderFromModel() to model-validation.ts, replacing three
copy-pasted inline inference blocks in executor.ts and dag-executor.ts
- Replace throwaway provider instantiation in dag-executor with static
capability lookup (getProviderCapabilities)
- Add orchestrator warning when env vars are configured but provider
doesn't support envInjection
* refactor: address LOW findings from code review
- Remove CLAUDE_CAPABILITIES/CODEX_CAPABILITIES from public index (YAGNI —
callers should use getProviderCapabilities(), not raw constants)
- Remove dead _deps parameter from resolveNodeProviderAndModel and its
two call-sites (no longer needed after static capability lookup refactor)
- Update factory.ts module JSDoc to mention both exported functions
- Add edge-case tests for getProviderCapabilities: empty string and
case-sensitive throws (parity with existing getAgentProvider tests)
- Add test for inferProviderFromModel with empty string (returns default,
documenting the falsy-string shortcut)
Remove the entire env-leak scanning/consent infrastructure: scanner,
allow_env_keys DB column usage, allow_target_repo_keys config, PATCH
consent route, --allow-env-keys CLI flag, and UI consent toggle.
The env-leak gate was the wrong primitive. Target repo .env protection
is already structural:
- stripCwdEnv() at boot removes Bun-auto-loaded CWD .env keys
- Archon loads its own env sources afterward (~/.archon/.env)
- process.env is clean before any subprocess spawns
- Managed env injection (config.yaml env: + DB vars) is unchanged
No scanning, no consent, no blocking. Any repo can be registered and
used. Subprocesses receive the already-clean process.env.
* refactor: extract providers from @archon/core into @archon/providers
Move Claude and Codex provider implementations, factory, and SDK
dependencies into a new @archon/providers package. This establishes a
clean boundary: providers own SDK translation, core owns business logic.
Key changes:
- New @archon/providers package with zero-dep contract layer (types.ts)
- @archon/workflows imports from @archon/providers/types — no mirror types
- dag-executor delegates option building to providers via nodeConfig
- IAgentProvider gains getCapabilities() for provider-agnostic warnings
- @archon/core no longer depends on SDK packages directly
- UnknownProviderError standardizes error shape across all surfaces
Zero user-facing changes — same providers, same config, same behavior.
* refactor: remove config type duplication and backward-compat re-exports
Address review findings:
- Move ClaudeProviderDefaults and CodexProviderDefaults to the
@archon/providers/types contract layer as the single source of truth.
@archon/core/config/config-types.ts now imports from there.
- Remove provider re-exports from @archon/core (index.ts and types/).
Consumers should import from @archon/providers directly.
- Update @archon/server to depend on @archon/providers for MessageChunk.
* refactor: move structured output validation into providers
Each provider now normalizes its own structured output semantics:
- Claude already yields structuredOutput from the SDK's native field
- Codex now parses inline agent_message text as JSON when outputFormat
is set, populating structuredOutput on the result chunk
This eliminates the last provider === 'codex' branch from dag-executor,
making it fully provider-agnostic. The dag-executor checks structuredOutput
uniformly regardless of provider.
Also removes the ClaudeCodexProviderDefaults deprecated alias — all
consumers now use ClaudeProviderDefaults directly.
* fix: address PR review — restore warnings, fix loop options, cleanup
Critical fixes:
- Restore MCP missing env vars user-facing warning (was silently dropped)
- Restore Haiku + MCP tool search warning
- Fix buildLoopNodeOptions to pass workflow-level nodeConfig (effort,
thinking, betas, sandbox were silently lost for loop nodes)
- Add TODO(#1135) comments documenting env-leak gate gap
Cleanup:
- Remove backward-compat type aliases from deps.ts (keep WorkflowTokenUsage)
- Remove 26 unnecessary eslint-disable comments from test files
- Trim internal helpers from providers barrel (withFirstMessageTimeout,
getProcessUid, loadMcpConfig, buildSDKHooksFromYAML)
- Add @archon/providers dep to CLI package.json
- Fix 8 stale documentation paths pointing to deleted core/src/providers/
- Add E2E smoke test workflows for both Claude and Codex providers
* fix: forward provider system warnings to users in dag-executor
The dag-executor only forwarded system chunks starting with
"MCP server connection failed:" — all other provider warnings
(missing env vars, Haiku+MCP, structured output issues) were
logged but never reached the user.
Now forwards all system chunks starting with ⚠️ (the prefix
providers use for user-actionable warnings).
* fix: add providers package to Dockerfile and fix CI module resolution
- Add packages/providers/ to all three Dockerfile stages (deps,
production package.json copy, production source copy)
- Replace wildcard export map (./*) with explicit subpath entries
to fix module resolution in CI (bun workspace linking)
* chore: update bun.lock for providers package exports
Three fixes for message duplication during live workflow execution:
1. dag-executor: Add missing `tool_call_formatted` category to loop iteration
tool messages. Without this, the web adapter sent tool text as both a regular
SSE text event AND a structured tool_call event, causing each tool to appear
twice (raw text + rendered card). Regular DAG nodes already had this metadata.
2. WorkflowLogs: Add text content dedup in SSE/DB merge. During live execution,
the same text (e.g. "Starting workflow...") can appear in both DB (REST fetch)
and SSE (event buffer replay). Collects DB text into a Set and skips matching
SSE text messages.
3. orchestrator-agent: Suppress remainingMessage re-send in stream mode. The
routing AI streams text chunks before /invoke-workflow is detected, then
retracts them. Without suppression, remainingMessage re-sends the same text.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add ScriptNode schema and type guards (US-001)
Implements US-001 from the script-nodes PRD.
Changes:
- Add scriptNodeSchema with script, runtime (bun|uv), deps, and timeout fields
- Add ScriptNode type with never fields for mutual exclusivity
- Add isScriptNode type guard
- Add SCRIPT_NODE_AI_FIELDS constant (same as BASH_NODE_AI_FIELDS)
- Update dagNodeSchema superRefine and transform to handle script: nodes
- Update DagNode union type to include ScriptNode
- Add script node dispatch stub in dag-executor.ts (fails fast until US-003)
- Export all new types and values from schemas/index.ts
- Add comprehensive schema tests for ScriptNode parsing and validation
* feat: script discovery from .archon/scripts/
Implements US-002 from PRD.
Changes:
- Add ScriptDefinition type and discoverScripts() in script-discovery.ts
- Auto-detect runtime from file extension (.ts/.js->bun, .py->uv)
- Handle duplicate script name conflicts across extensions
- Add bundled defaults infrastructure (empty) for scripts
- Add tests for discovery, naming, and runtime detection
* feat: script execution engine (inline + named)
Implements US-003 from PRD.
Changes:
- Add executeScriptNode() in dag-executor.ts following executeBashNode pattern
- Support inline bun (-e) and uv (run python -c) execution
- Support named scripts via bun run / uv run
- Wire ScriptNode dispatch replacing 'not yet implemented' stub
- Capture stdout as node output, stderr as warning
- Handle timeout and non-zero exit
- Pass env vars for variable substitution
- Add tests for inline/named/timeout/failure cases
* feat: runtime availability validation at load time
Implements US-004 from PRD.
Changes:
- Add checkRuntimeAvailable() utility for bun/uv binary detection
- Extend validator.ts with script file and runtime validation
- Integrate script validation into parseWorkflow flow in loader.ts
- Add tests for runtime availability detection
* feat: dependency installation for script nodes
Implements US-005 from PRD.
Changes:
- Support deps field for uv nodes: uvx --with dep1... for inline
- Support uv run --with dep1... for named uv scripts
- Bun deps are auto-installed at runtime via bun's native mechanism
- Empty/omitted deps field produces no extra flags
- Add tests for dep injection into both runtimes
* test: integration tests and validation for script nodes
Implements US-006 from PRD.
Changes:
- Fill test coverage gaps for script node feature
- Add script + command mutual exclusivity schema test
- Add env var substitution tests ($WORKFLOW_ID, $ARTIFACTS_DIR in scripts)
- Add stderr handling test (stderr sent to user as platform message)
- Add missing named script file validation tests to validator.test.ts
- Full bun run validate passes
* fix: address review findings in script nodes
- Extract isInlineScript to executor-shared.ts (was duplicated in
dag-executor.ts and validator.ts)
- Remove dead warnMissingScriptRuntimes from loader.ts (validator
already covers runtime checks)
- Remove path traversal fallback in executeScriptNode — error when
named script not found instead of executing arbitrary file paths
- Memoize checkRuntimeAvailable to avoid repeated subprocess spawns
- Add min(1) to scriptNodeSchema.script field for consistency
- Replace dynamic import with static import in validator.ts
* fix(workflows): address review findings for script node implementation
Critical fixes:
- Wrap discoverScripts() in try-catch inside executeScriptNode to prevent
unhandled rejections when script discovery fails (e.g. duplicate names)
- Add isScriptNode to isNonAiNode check in loader.ts so AI-specific fields
on script nodes emit warnings (activates SCRIPT_NODE_AI_FIELDS)
Important fixes:
- Surface script stderr in user-facing error messages on non-zero exit
- Replace uvx with uv run --with for inline uv scripts with deps
- Add z.string().min(1) validation on deps array items
- Remove unused ScriptDefinition.content field and readFile I/O
- Add logging in discoverAvailableScripts catch block
- Warn when deps is specified with bun runtime (silently ignored)
Simplifications:
- Merge BASH_DEFAULT_TIMEOUT and SCRIPT_DEFAULT_TIMEOUT into single
SUBPROCESS_DEFAULT_TIMEOUT constant
- Use scriptDef.runtime instead of re-deriving from extname()
- Extract shared formatValidationResult helper, deduplicate section comments
Tests:
- Add isInlineScript unit tests to executor-shared.test.ts
- Add named-script-not-found executor test to dag-executor.test.ts
- Update deps tests to expect uv instead of uvx
Docs:
- Add script: node type to CLAUDE.md node types and directory structure
- Add script: to .claude/rules/workflows.md DAG Node Types section
* feat: use 1M context model for implement nodes in bundled workflows (#1016)
Large codebases fill the 200k context window during implementation, triggering
SDK auto-compaction which loses context detail and slows execution. Setting
claude-opus-4-6[1m] on implementation nodes gives 5x more room before compaction.
Changes:
- Set model: claude-opus-4-6[1m] on implement nodes in 8 bundled workflows
- Fix loop nodes to respect per-node model overrides (previously only used
workflow-level model)
- Review/classify/report nodes stay on sonnet/haiku for cost efficiency
Fixes#1016
* fix: resolve per-node provider/model overrides for loop nodes
Move provider and model resolution for loop nodes to the call site,
matching the pattern used by command/prompt/bash nodes. This fixes:
- Loop nodes now respect per-node `provider` overrides (previously ignored)
- Model/provider compatibility is validated before execution
- JSDoc on buildLoopNodeOptions accurately describes the function
- Add tests for result chunk cost/stopReason/numTurns/modelUsage fields in claude.test.ts
- Add tests for rate_limit_event chunk (with and without rate_limit_info)
- Add tests for total_cost_usd accumulation in completeWorkflowRun (single-node, multi-node, no-cost)
- Add test for loop node cost accumulation across iterations
- Fix stale JSDoc in executeNodeInternal and executeLoopNode (NodeOutput -> NodeExecutionResult)
- Change stopReason guard from truthiness to !== undefined for consistency
- Add comment explaining totalCostUsd > 0 guard intent
- Add comment noting resume runs only accumulate cost for the current invocation
- Add rate_limit fallthrough comment in executeNodeInternal and executeLoopNode
Extract total_cost_usd, stop_reason, num_turns, and model_usage from
Claude SDK result messages. Accumulate per-node cost in dag-executor
(regular + loop nodes), aggregate to per-run total via completeWorkflowRun
metadata, and display cost in the Web UI WorkflowRunCard. Rate limit
events are logged as warnings and yielded as a new message chunk type.
Fixes#932
- Tighten workflow-level schema: add .min(1) to fallbackModel and .nonempty() to betas to match node-level validation
- Replace inline effort/thinking/sandbox types in deps.ts with EffortLevel/ThinkingConfig/SandboxSettings imported from ./schemas
- Extract WorkflowLevelOptions interface in dag-executor.ts to eliminate repeated inline type shape
- Add dag.node_budget_cap_exceeded structured log event before budget cap throw, replacing bare type assertion with nodeOptions?.maxBudgetUsd
- Update CLAUDE.md nodes: bullet to document new Claude-only options
- Add Claude SDK advanced options section and 7 table rows to authoring-workflows.md; update Summary list
- Add SDK option forwarding tests to claude.test.ts (effort, thinking, maxBudgetUsd, systemPrompt, fallbackModel, betas, sandbox)
- Add dag-executor tests: error_max_budget_usd failure path, per-node vs workflow-level precedence, Codex warning
Workflow authors can now configure 7 Claude SDK options per DAG node in YAML.
Five of these (effort, thinking, fallbackModel, betas, sandbox) are also
settable at workflow level as defaults that per-node values override.
Changes:
- Add effortLevelSchema, thinkingConfigSchema, sandboxSettingsSchema to dag-node.ts
- Add 7 optional fields to dagNodeBaseSchema with BASH_NODE_AI_FIELDS + transform
- Add 5 workflow-level defaults to workflowBaseSchema
- Extend WorkflowAssistantOptions (deps.ts) and AssistantRequestOptions (types/index.ts)
- Forward all 7 options in claude.ts with systemPrompt conditional override
- Add workflow-level options resolution in dag-executor.ts with per-node override
- Add error_max_budget_usd handling in result handler
- Consolidated Codex warning for all Claude-only options
- Add 16 schema parsing tests for new options
Fixes#931
Projects with docs outside `docs/` (e.g., `packages/docs-web/src/content/docs/`)
get broken bundled commands because the path is hardcoded. Add `docs.path` to
`.archon/config.yaml` and thread it through the workflow engine as `$DOCS_DIR`
(default: `docs/`), following the same pipeline as `$BASE_BRANCH`.
Changes:
- Add `docs.path` to RepoConfig and `docsPath` to MergedConfig/WorkflowConfig
- Thread `docsDir` through executor-shared, executor, and dag-executor
- Update bundled commands to use `$DOCS_DIR` instead of hardcoded `docs/`
- Add optional docs path prompt to `archon setup`
- Add variable reference and configuration documentation
- Resolve pre-existing merge conflicts in server/api.ts
Fixes#982
Add per-project environment variable management as a first-class config
primitive. Env vars defined in .archon/config.yaml or stored in DB via
Web UI are merged into Options.env on Claude SDK calls.
Three env var sources merge in priority order (later wins):
1. process.env — global, from ~/.archon/.env via dotenv
2. .archon/config.yaml env: section — file-based per-project
3. DB remote_agent_codebase_env_vars table — Web UI per-project
Changes:
- Add remote_agent_codebase_env_vars table (PG migration + SQLite schema)
- Add DB CRUD module (packages/core/src/db/env-vars.ts)
- Extend IWorkflowStore with getCodebaseEnvVars method
- Add env field to RepoConfig, MergedConfig, WorkflowConfig, WorkflowAssistantOptions, AssistantRequestOptions
- Merge DB env vars in executor after config load
- Inject env vars into Claude subprocess via Options.env
- Add 3 API routes (GET/PUT/DELETE /api/codebases/:id/env)
- Add EnvVarsPanel to Settings page with masked value display
Fixes#852
Approval nodes can now capture the reviewer's comment as $nodeId.output
(via capture_response: true) and optionally retry on rejection instead of
cancelling (via on_reject: { prompt, max_attempts }). This enables
iterative human-AI review cycles without needing interactive loop nodes.
Changes:
- Add approvalOnRejectSchema and extend approval node schema with
capture_response and on_reject fields
- Extend ApprovalContext with captureResponse, onRejectPrompt,
onRejectMaxAttempts (stored at pause time for reject handlers)
- Add $REJECTION_REASON variable to substituteWorkflowVariables
- Extract executeApprovalNode function with rejection resume logic
- Update all 4 approve handlers (CLI, command-handler, orchestrator,
server API) to use captureResponse and clear rejection state
- Update all 3 reject handlers (CLI, command-handler, server API) to
check onRejectPrompt and retry instead of cancel when configured
- Add 5 tests for approval node behavior (fresh pause, capture_response,
on_reject resume, max_attempts exhaustion, max_attempts=1)
Fixes#936
The `startsWith('error_')` check would misclassify any SDK error whose
subtype begins with `error_` (e.g. `error_max_turns`, `error_max_tokens`)
as credit exhaustion, giving users misleading "wait for credits" guidance.
The SDK returns credit exhaustion as assistant text anyway, so
`detectCreditExhaustion(nodeOutputText)` is the correct and complete
detection path. Remove the speculative SDK flag path and its associated
local variables.
Also adds an independent test for the 'insufficient credit' pattern in
CREDIT_EXHAUSTION_OUTPUT_PATTERNS, which was previously only incidentally
covered by the 'credit balance' test string.
The when: condition evaluator only supported == and != string equality,
forcing workflow authors to coerce numeric values to sentinel strings
and use extra gate nodes for multi-condition branching.
Changes:
- Add <, >, <=, >= numeric comparison operators (fail-closed on non-finite values)
- Add && (AND) and || (OR) compound expression support with standard precedence
- Extract evaluateAtom() and splitOutsideQuotes() helpers in condition-evaluator.ts
- Add 16 new tests covering numeric ops, compound logic, precedence, and fail-closed
- Update dag-executor parse-error message to mention new operators
- Update docs/authoring-workflows.md with full operator reference
Fixes#796
When Claude credits run out mid-workflow, the SDK returns the error as
normal assistant text rather than throwing. The DAG executor was marking
these nodes as completed with garbage output, preventing resume from
re-running them.
Changes:
- Add isError/errorSubtype to MessageChunk and WorkflowMessageChunk result types
- Propagate is_error/subtype from Claude SDK result message
- Add detectCreditExhaustion() to executor-shared for text pattern matching
- Gate node_completed in dag-executor with credit exhaustion check
- Mark credit-exhausted nodes as failed so resume re-runs them
Fixes#940
Removes the brittle approvalPhrases word matching (commit 555d5f4) and
re-enables AI completion signal detection for interactive loops. The AI
now determines when the user approves based on prompt instructions, not
a fixed list of 19 phrases.
Key changes:
- Remove approvalPhrases block from dag-executor.ts
- Re-enable completion signal detection (revert ac2cfc07 bypass)
- Move completion check before interactive gate (signal = exit, no signal = pause)
- Add first-iteration guard: fresh interactive loops always gate before exit
- Fix node_completed timing: only written on actual loop completion, not on every approve
- Strengthen PIV loop prompts with explicit signal emission rules
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
For interactive loops, the AI's completion signal is now fully ignored.
Exit is controlled exclusively by the user's input — if they send a
short approval phrase, the loop exits before running the AI. If they
send substantive feedback, the AI runs and always pauses after.
This prevents the AI from auto-approving by addressing feedback AND
emitting the signal in the same iteration.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Interactive loops now check the user's input BEFORE running the AI
iteration. If the user's response is a short approval phrase ("approved",
"looks good", "lgtm", etc.), the loop exits immediately without running
another AI iteration. If it's substantive feedback, the AI iteration
runs and always pauses after.
This fixes the core issue where the AI would address user feedback AND
emit the completion signal in the same iteration, causing the loop to
exit before the user could review the AI's changes. The completion
signal from the AI is now irrelevant for interactive loops — only the
user's explicit approval controls exit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously, interactive loops would exit immediately when the AI emitted
the completion signal, bypassing user review. This caused two problems:
- Plan refinement loops auto-approved without showing the revised plan
- Validation feedback loops auto-approved on first iteration (empty input)
Now interactive loops ALWAYS pause after every iteration, regardless of
whether the AI signaled completion. The user decides when to exit by
explicitly approving. The completion signal becomes the AI's suggestion
("I think this is ready") but the user still gets the final say.
Non-interactive loops are unchanged — they still exit on completion signal.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Loop nodes can now pause between iterations for user feedback when
`interactive: true` and `gate_message` are set. On pause, the user
provides feedback via `/workflow approve <id> <feedback>`, which is
injected as `$LOOP_USER_INPUT` in the next iteration's prompt. Session
context is preserved across pause/resume when `fresh_context: false`.
Changes:
- Extended `ApprovalContext` with type/iteration/sessionId fields
- Added `interactive` and `gate_message` to loop node schema with cross-field validation
- Updated `pauseWorkflowRun` signature to use typed `ApprovalContext`
- Added `$LOOP_USER_INPUT` variable substitution in executor-shared
- Implemented resume detection and interactive pause in executeLoopNode
- Updated approve handler to branch on interactive_loop gate type
- Added loader warning for interactive loops in non-interactive workflows
- Added 4 test cases covering pause, signal exit, resume, and regression
Fixes#937
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace 5-pass nodeOutputs iteration with single for-loop
- Derive anyCompleted/anyFailed from counts instead of separate .some() calls
- Add metadata payload assertion to loop completion test
- Add comment on failure path documenting missing node_counts
When a DAG workflow completes, compute node outcome counts (completed,
failed, skipped, total) from nodeOutputs and persist them into the
metadata JSONB column. The dashboard card now shows a compact summary
like "7/10 nodes succeeded · 2 failed · 1 skipped".
Changes:
- Extend completeWorkflowRun to accept optional metadata (store + db)
- Compute node counts in dag-executor at completion time
- Add NodeCountsSummary component to WorkflowRunCard
- Add tests for metadata merge and node counts propagation
Fixes#564
Rename `now` to `tickNow` at the top of the streaming loop to avoid
shadowing the pre-existing `const now = Date.now()` in the tool event
block further down. No behavioral change.
Split the 10s activity interval into separate cancel-check (read, 10s) and
heartbeat (write, 60s) timers. Reads don't contend under WAL mode, so cancel
responsiveness is unchanged while heartbeat writes drop by 6x.
Fixes#419
Workflow authors can now use `cancel: "reason string"` as a DAG node type
to terminate a running workflow from inside the DAG based on upstream
conditions, preventing wasted compute on branches that should be killed.
Changes:
- Add cancelNodeSchema, CancelNode type, isCancelNode guard to dag-node.ts
- Add cancel to mutual exclusivity checks (command/prompt/bash/loop/approval/cancel)
- Add cancel node dispatch in dag-executor (mirrors approval pattern)
- Add cancelWorkflowRun to IWorkflowStore interface and store-adapter
- Add workflow_cancelled event type and WorkflowCancelledEvent emitter event
- Add cancel to non-AI node detection in loader (AI field warnings)
- Handle workflow_cancelled in server SSE bridge (exhaustive switch)
- Add tests for schema, loader, store-adapter, and executor
Fixes#220
* fix(workflows): restore post-workflow summary in parent conversation (#900)
After PR #805 removed sequential execution and made DAG the sole format,
the terminal node output was never returned from executeDagWorkflow (void
return), so executor.ts always returned {success:true} with no summary
field, and orchestrator.ts silently skipped the summary send.
Changes:
- dag-executor.ts: Change executeDagWorkflow return type void → string|undefined
- dag-executor.ts: Compute terminal nodes (no dependents) and return first
completed non-empty output after emitter.unregisterRun
- executor.ts: Capture dagSummary from executeDagWorkflow and pass as
summary field in the completed WorkflowExecutionResult
Fixes#900
* test(isolation): update syncWorkspace spy assertions for resetAfterFetch arg
Five WorktreeProvider tests were failing because syncWorkspace gained a
third `{ resetAfterFetch: boolean }` argument; test assertions expected
only two arguments. Updated assertions to include `{ resetAfterFetch: false }`
(test paths are not managed clones under ~/.archon/workspaces).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings from PR #907
- Add doc comment on terminal node selection explaining first-match semantics
for multi-terminal DAGs (code-review LOW finding)
- Fix mock type signature for mockExecuteDagWorkflow from Promise<void> to
Promise<string | undefined> to match updated return type (test-coverage LOW)
- Add result.summary wiring tests in executor.test.ts: verifies that when
executeDagWorkflow returns a string it flows through to WorkflowExecutionResult.summary,
and that undefined is passed through correctly (test-coverage MEDIUM)
- Add terminal node selection return-value tests in dag-executor.test.ts: linear DAG
returns terminal output, empty terminal returns undefined, fan-in DAG returns only
the true terminal node (test-coverage MEDIUM + rating-5 gap)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: configurable Claude Code settingSources via repo config (#839)
Users who maintain a global ~/.claude/CLAUDE.md lose that context in Archon
sessions because settingSources was hardcoded to ['project']. This adds a
settingSources field to .archon/config.yaml so users can opt in to
['project', 'user'] per-repo or globally.
Changes:
- Add ClaudeAssistantDefaults interface with settingSources field
- Merge settingSources through global and repo config cascade
- Pass settingSources from config through orchestrator to Claude SDK
- Default remains ['project'] (no behavior change for existing users)
- Add tests for settingSources passthrough and default fallback
Fixes#839
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address review findings for settingSources feature
Fixed:
- Guard settingSources with assistant type check (claude only)
- Add config-loader tests for settingSources merge logic (4 tests)
- Add orchestrator tests for settingSources forwarding (2 tests)
- Document settingSources in CLAUDE.md assistant defaults example
- Document settingSources in docs/configuration.md (global, repo, subsection)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: propagate settingSources through workflow executor
Add settingSources to WorkflowAssistantOptions and WorkflowConfig.assistants.claude
in deps.ts so the field is part of the narrow workflow interface. Propagate it in
dag-executor.ts for both regular Claude DAG nodes and loop nodes, so .archon/config.yaml
settingSources setting takes effect when running workflows (not just chat sessions).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: eliminate duplicate loadConfig and revert SafeConfig settingSources leak
Load config once during discoverAllWorkflows and reuse the result in handleMessage
instead of calling loadConfig a second time for the same codebase path. Also restrict
SafeConfig.assistants.claude to Pick<ClaudeAssistantDefaults, 'model'> so server-internal
settingSources is never sent to web clients (YAGNI).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add approval gate node type for human-in-the-loop workflows
Add `approval` as a fifth DAG node type that pauses workflow execution
until a human approves or rejects. Built on existing resume infrastructure.
- New `approval` node type in YAML workflows with `message` field
- `paused` workflow status (non-terminal, resumable)
- Approve/reject via REST API, CLI, chat commands, and Web UI
- Approval comment available as `$node.output` in downstream nodes
- Dashboard shows amber pulsing badge with approval message for paused runs
- Path guard blocks new workflows on paused worktrees
* fix: resolve 7 bugs in approval gate implementation
Critical:
- Parse metadata JSON on SQLite (returned as TEXT string, not object)
- Suppress spurious "Workflow stopped (paused)" message after approval gates
Medium:
- CLI exits 0 on pause (not error code 1)
- workflow status shows paused runs alongside running
- Docs clarify auto-resume is CLI-only; API/chat mark resumable
Low:
- Skip completed_at when transitioning to failed for approval resume
- Warn on AI fields (model, hooks, etc.) set on approval nodes
- Add rowCount check to updateWorkflowRun
Also: extract ApprovalContext type, add .min(1) to approval.message,
fix stale "Only failed runs" error messages, fix log domain prefix,
add recovery instructions to CLI approve error message.
* fix(workflows): idle timeout too aggressive — break after result, reset on all messages (#854)
The idle timeout (5 min default) caused two problems: (1) after a node's AI
finished (result message), the loop waited for the subprocess to exit, wasting
5 min on hangs; (2) tool messages didn't reset the timer, so long Bash calls
(tests, builds) triggered false timeouts on actively working nodes.
Changes:
- Break out of the for-await loop immediately after receiving the result message
in both command/prompt and loop node paths — no more post-completion waste
- Remove shouldResetTimer predicate so all message types (including tool) reset
the timer — timeout only fires on complete silence
- Increase STEP_IDLE_TIMEOUT_MS from 5 min to 30 min — with every message
resetting the timer, this is a deadlock detector, not a work limiter
Fixes#854
* fix(workflows): update withIdleTimeout JSDoc to match new timer behavior
Remove the tool-exclusion example from the shouldResetTimer docs since
that pattern was just removed from all call sites. Clarify that most
callers should omit the parameter.
* fix(workflows): address review findings — log cleanup errors, add break tests, fix stale docs
- Log generator cleanup errors in withIdleTimeout instead of silently swallowing
- Add behavioral tests for break-after-result in both command/prompt and loop nodes
- Fix stale "5 minutes" default in docs/loop-nodes.md (now 30 minutes)
- Clarify shouldResetTimer test names and comments (utility API, not executor behavior)
- Extract effectiveIdleTimeout in loop node path (matches command/prompt pattern)
- Remove redundant iterResult alias in withIdleTimeout
* feat: add interrupted to WorkflowRunStatus schema
Implements US-001 from PRD.
Changes:
- Add 'interrupted' to workflowRunStatusSchema z.enum in packages/workflows/src/schemas/workflow-run.ts
- Add 'interrupted' to workflowRunStatusSchema in packages/server/src/routes/schemas/workflow.schemas.ts
- Add interrupted: z.number() to dashboardRunsResponseSchema counts object
- Add 'interrupted' to dashboardValidStatuses in API handler
- Add interrupted: 0 to DashboardRunsResult counts interface and runtime object in packages/core/src/db/workflows.ts
* feat: update IWorkflowStore interface & DB query implementations
Implements US-002 from PRD.
Changes:
- IWorkflowStore: rename getActiveWorkflowRun → getActiveWorkflowRunByPath(workingPath)
- IWorkflowStore: drop conversationId from findResumableRun signature
- IWorkflowStore: add interruptOrphanedRuns() method
- db/workflows: add getActiveWorkflowRunByPath querying status IN ('running', 'interrupted')
- db/workflows: update findResumableRun to query by workflow_name + working_path only, include 'interrupted' status
- db/workflows: add interruptOrphanedRuns() UPDATE SET status='interrupted' WHERE status='running'
- store-adapter: wire all three new/modified methods
- executor: update call sites to use renamed methods (type-check requirement)
- tests: update all mock stores and add new tests for getActiveWorkflowRunByPath and interruptOrphanedRuns
* feat: replace staleness guard with path-based lifecycle
Implements US-003 from PRD.
Changes:
- executor.ts: remove STALE_MINUTES staleness auto-kill; replace with
status-based guard — 'running' blocks, 'interrupted' offers resume/abandon
- server/src/index.ts: replace failStaleWorkflowRuns() with
createWorkflowStore().interruptOrphanedRuns() on startup
- executor-preamble.test.ts: replace staleness detection tests with
concurrent run guard tests covering 'running' and 'interrupted' cases
* feat: command handler — /workflow status, resume, and abandon
Implements US-004. Replaces time-based stale heuristics with explicit
lifecycle commands for workflow management.
Changes:
- Remove WORKFLOW_SLOW_THRESHOLD_MS and WORKFLOW_STALE_THRESHOLD_MS constants
- Replace /workflow status with global view: lists all running+interrupted runs
across all worktrees (ID, name, working path, status, started-at)
- Add /workflow resume <id>: validates state then calls resumeWorkflowRun
- Add /workflow abandon <id>: validates state then calls failWorkflowRun
- Add statuses[] filter to listWorkflowRuns for IN (...) queries
- Update /workflow help text and default case usage string
- Update /status command to remove stale warning that referenced removed constants
- Replace old /workflow status tests with new behavior coverage
- Add /workflow resume and /workflow abandon test coverage
* feat: CLI workflow status, resume, and abandon subcommands
Implements US-005 from PRD.
Changes:
- Implement workflowStatusCommand: lists all running+interrupted runs with ID, name, path, status, age; supports --json flag
- Add workflowResumeCommand: validates run state then calls resumeWorkflowRun
- Add workflowAbandonCommand: validates run state then calls failWorkflowRun('Abandoned by user')
- Replace findLastFailedRun usage in --resume path with findResumableRun(workflowName, cwd)
- Wire resume/abandon subcommands in cli.ts
- Update tests: replace "not implemented" test with status/resume/abandon coverage
* feat: Web UI interrupted status badge and dashboard support
Implements US-006 from PRD.
Changes:
- api.generated.d.ts: add 'interrupted' to WorkflowRunStatus enum and DashboardRunsResponse.counts
- api.ts: add interrupted field to DashboardCounts interface
- WorkflowExecution.tsx: add 'interrupted' to TERMINAL_STATUSES; add amber color to StatusBadge
- WorkflowRunCard.tsx: add amber dot and badge for interrupted status
- StatusSummaryBar.tsx: add 'interrupted' to STATUS_CHIPS filter list
- DashboardPage.tsx: include interrupted in activeRuns filter and counts default
* refactor: remove dead timer-based workflow staleness code
Implements US-007 from PRD.
Changes:
- Remove findLastFailedRun() from db/workflows.ts (CLI path unified on findResumableRun in US-005)
- Remove failStaleWorkflowRuns() from db/workflows.ts (replaced by interruptOrphanedRuns in US-002)
- Remove IDatabase import from db/workflows.ts (no longer needed)
- Remove failStaleWorkflowRuns tests from db/workflows.test.ts
grep -r 'STALE' packages/ (workflow-timer variant), grep -r 'findLastFailedRun' and
grep -r 'failStaleWorkflowRuns' all return zero matches.
* fix: address review feedback — truncated IDs, resume semantics, type safety
- Use full UUIDs in resume/abandon command suggestions (was .slice(0, 8))
- Add completed_at to interruptOrphanedRuns for correct duration metrics
- Fix resume commands: mark interrupted→failed to unblock path guard,
let next workflow invocation auto-resume via findResumableRun
- Collapse dual status/statuses fields into status?: T | T[]
- Extract TERMINAL/RESUMABLE/ACTIVE_WORKFLOW_STATUSES constants
- Add explicit else-if for interrupted status in executor path guard
- Add structured logging to CLI workflow commands
- Restore conversationId to cmd.workflow_status_failed log
- Add tests: listWorkflowRuns statuses filter, interrupted auto-resume,
DB error handling for resume/abandon
- Update docs: commands-reference, cli-user-guide, authoring-workflows, CLAUDE.md
* refactor: simplify CLI commands and status filter logic
- Extract getRunOrThrow helper for shared run lookup pattern
- Use WorkflowRun[] instead of Awaited<ReturnType<...>>
- Remove single-item special case in listWorkflowRuns (IN works for all)
- Use ?? instead of || for null-coalescing consistency
- Remove unused ACTIVE_WORKFLOW_STATUSES constant
- Add inline comment on completed_at for interrupted runs
* fix: handle SQLite string dates in formatAge
SQLite returns started_at as a string, not a Date object.
formatAge now accepts Date | string and converts accordingly.
Found during E2E testing against real SQLite database.
* feat: UX improvements — real resume, dashboard actions, cleanup command
- CLI resume now actually re-executes the workflow (calls workflowRunCommand
with --resume internally instead of just flipping DB status)
- Remove truncated IDs from executor guard messages (full ID in commands only)
- Add Resume/Abandon/Delete buttons to dashboard workflow run cards
- Add Delete button to history table rows
- Add API endpoints: POST resume, POST abandon, DELETE workflow run
- Add CLI workflow cleanup command (deletes terminal runs older than N days)
- Add deleteWorkflowRun and deleteOldWorkflowRuns DB functions
* refactor: simplify API handlers, dashboard actions, and log conventions
- Use RESUMABLE/TERMINAL_WORKFLOW_STATUSES constants in API handlers
(was inline string checks diverging from CLI/command-handler)
- Extract makeRunAction helper in DashboardPage (4 identical handlers → 1)
- Fix log event names to use domain prefix convention (api.workflow_run_*)
- Use Ban icon for Abandon to distinguish from Cancel's XCircle
- Use instanceof Date guard in formatAge for clarity
- Add comment on delete handler's active-status guard
* refactor: simplify workflow lifecycle — remove interrupted, single resume path
Rework the primitives for a clean foundation:
Status model: 5 statuses (pending, running, completed, failed, cancelled).
Remove 'interrupted' entirely — server restart now marks orphaned runs
as 'failed' directly (with metadata.failure_reason = 'server_restart').
Resume model: one path. The executor's implicit findResumableRun
detects prior failed runs and skips completed nodes. The CLI --resume
flag reuses the prior run's worktree but lets the executor handle
node-skipping (no more preCreatedRun bypass). Chat /workflow resume
tells the user to re-invoke (auto-resume kicks in).
Path guard: only blocks 'running' status (was running + interrupted).
Guards always run regardless of preCreatedRun.
Cancellation: generalized from status === 'cancelled' to
status !== 'running' at all 3 check points (streaming, loop iterations,
DAG layers). Ready for future 'paused' status with zero changes.
- interruptOrphanedRuns → failOrphanedRuns
- Remove preCreatedRun bypass from CLI --resume path
- Generalize 3 cancellation check points in dag-executor
- Update all API endpoints, command handlers, UI components
- Update all tests and documentation
* fix: cancel/complete race, abandon semantics, UTC dates
- Fix cancel/complete race condition: dag-executor now checks DB status
before calling completeWorkflowRun or failWorkflowRun, preventing a
cancel during the final layer from being overwritten to completed
- Abandon uses cancelWorkflowRun instead of failWorkflowRun, so
abandoned runs don't get auto-resumed by findResumableRun
- Fix formatAge UTC bug: SQLite dates without Z suffix now parsed as UTC
* fix: address PR review — SQL safety, transactions, error handling, docs, tests
- Validate olderThanDays before SQL interpolation in deleteOldWorkflowRuns
- Wrap multi-statement deletes in transactions (deleteOldWorkflowRuns, deleteWorkflowRun)
- Fix deleteWorkflowRun error double-wrap (don't re-wrap "not found" errors)
- Handle null getWorkflowRunStatus in DAG executor (treat as deleted, abort)
- Fix mock name mismatch: interruptOrphanedRuns → failOrphanedRuns in 3 test files
- Fix default mock getWorkflowRunStatus to return 'running' instead of null
- Add NaN guard to formatAge (returns 'unknown' on unparseable dates)
- Fix stale 'interrupted' references in route summary and delete comment
- Include working path in /workflow resume response
- Align deleteOldWorkflowRuns return type to { count } for consistency
- Document workflow cleanup command in CLAUDE.md, CLI user guide, commands reference
- Document new API endpoints (resume, abandon, delete) in CLAUDE.md
- Add tests for deleteOldWorkflowRuns, deleteWorkflowRun, workflowCleanupCommand
- Fix workflowAbandonCommand test to assert cancelWorkflowRun call
* refactor: simplify code per review — extract helper, cleaner date parsing, consistent guards
- Extract duplicated status-check blocks into skipIfStatusChanged helper in dag-executor
- Simplify formatAge to single-pass date parsing with Z suffix (ISO 8601)
- Use TERMINAL_WORKFLOW_STATUSES constant in delete route guard
- Rename cancelError → actionError in DashboardPage (covers 4 actions now)
- Fix merge conflict: add IDatabase import, getRunningWorkflows from dev
- Fix api.conversations.test.ts: add missing workflow mocks, fix Hono → OpenAPIHono
* fix: address review findings — double-rollback, missing guards, log context, tests
- Fix double-rollback in deleteWorkflowRun by removing inner rollback()
call and letting the outer catch handle it
- Add terminal-status guard inside deleteWorkflowRun itself, not just in
the route handler, to prevent deletion of running workflows
- Add rollback failure logging to the rollback() helper
- Add runId to error logs in resume/abandon/delete API route handlers
- Add workingPath to getActiveWorkflowRunByPath error log
- Add workflowRunId to dag-executor status-check warn logs
- Wrap workflowRunCommand in try/catch in workflowResumeCommand with
structured logging and null guard for user_message
- Clean up stale 'interrupted' references in JSDoc
- Fix missing / prefix on workflow cleanup in commands-reference.md
- Add API route tests for POST /resume, POST /abandon, DELETE /:runId
* refactor: apply code simplifications from review
- Replace fragile startsWith string matching in deleteWorkflowRun catch
with a typed WorkflowRunGuardError class
- Reorder listWorkflowRuns placeholder generation: capture startIdx
before pushing values for clarity
- Replace curried makeRunAction factory in DashboardPage with a plain
runAction helper function
- Move skipIfStatusChanged helper definition before its call sites in
dag-executor to match reading order
* refactor(workflows): eliminate types.ts re-export shim (step 2.1)
Move the 4 non-schema types (LoadCommandResult, WorkflowExecutionResult,
WorkflowLoadError, WorkflowLoadResult) into schemas/workflow.ts where
WorkflowDefinition already lives. Relocate the compile-time NodeOutput/
NodeState assertion to schemas/workflow-run.ts. Add DagWorkflow alias and
all 4 types to schemas/index.ts. Update index.ts barrel to re-export from
./schemas directly. Update all 21 import lines across 12 source files and
9 test files. Rename types.test.ts → schemas.test.ts. Delete types.ts.
The import chain is now one level shallower: caller → ./schemas →
specific schema file (was: caller → ./types → ./schemas → schema file).
External consumers of @archon/workflows are unaffected.
Fixes#842
* docs: address review findings from PR #844
- Update schemas/workflow.ts file header to reflect that it now also
contains non-Zod hand-written result types (LoadCommandResult,
WorkflowExecutionResult, WorkflowLoadError, WorkflowLoadResult)
- Remove stale types.ts entry from CLAUDE.md directory structure
listing (file was deleted in this PR)
* fix(docs): update stale references after types.ts deletion
- CLAUDE.md: wrong-example comment now notes the subpath no longer exists
(not just "don't do this in web")
- LoadCommandResult JSDoc: clarify non-empty content is enforced at load
time in executor-shared.ts, not by the type itself