Commit graph

84 commits

Author SHA1 Message Date
Rasmus Widing
3b16dd6c90 fix(server,web,workflows): web approval gates auto-resume + reject-with-reason dialog
Fixes three tightly-coupled bugs that made web approval gates unusable:

1. orchestrator-agent did not pass parentConversationId to executeWorkflow
   for any web-dispatched foreground / interactive / resumable run. Without
   that field, findResumableRunByParentConversation (the machinery the CLI
   relies on for resume) couldn't find the paused run from the same
   conversation on a follow-up message, and the approve/reject API handlers
   had no conversation to dispatch back to.

2. POST /api/workflows/runs/:runId/{approve,reject} recorded the decision
   and returned "Send a message to continue the workflow." — the workflow
   never actually resumed. Added tryAutoResumeAfterGate() that mirrors what
   workflowApproveCommand / workflowRejectCommand already do on the CLI:
   look up the parent conversation, dispatch `/workflow run <name>
   <userMessage>` back through dispatchToOrchestrator. Failures are
   non-fatal — the user can still send a manual message as a fallback.

3. The during-streaming cancel-check in dag-executor aborted any streaming
   node whenever the run status left 'running', including the legitimate
   transition to 'paused' that an approval node performs. A concurrent AI
   node in the same DAG layer now tolerates 'paused' and finishes its own
   stream; only truly terminal / unknown states (null, cancelled, failed,
   completed) abort the in-flight stream.
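
A minimal sketch of the relaxed cancel-check in item 3, assuming a simple
status union. `RunStatus` and `shouldAbortStream` are hypothetical names used
for illustration, not the identifiers in dag-executor.

```typescript
// Hypothetical status union; the real run-status type may differ.
type RunStatus = 'running' | 'paused' | 'cancelled' | 'failed' | 'completed';

// Returns true when an in-flight stream should be aborted.
// 'running' and 'paused' are tolerated; null (run not found) and the
// terminal states abort the stream.
function shouldAbortStream(status: RunStatus | null): boolean {
  return status !== 'running' && status !== 'paused';
}
```

With this shape, a concurrent AI node keeps streaming when an approval node
in the same layer moves the run to 'paused'.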

Web UI: ConfirmRunActionDialog gains an optional reasonInput prop (label +
placeholder) that renders a textarea and passes the trimmed value to
onConfirm. WorkflowRunCard (dashboard) and WorkflowProgressCard (chat)
both use it for Reject now — the chat card was still on window.confirm,
which was both inconsistent with the dashboard and couldn't collect a
reason. The trimmed reason threads through to $REJECTION_REASON in the
workflow's on_reject prompt.

Supersedes #1147. @jonasvanderhaegen surfaced the root cause and shape of
the fix; that PR was 87 commits stale and pre-dated the reject-UX upgrade
(#1261 area), so this is a fresh re-do on current dev.

Tests:
- packages/server/src/routes/api.workflow-runs.test.ts — 5 new cases:
  approve with parent dispatches; approve without parent returns "Send a
  message"; approve with deleted parent conversation skips safely; reject
  dispatches on-reject flows; reject that cancels (no on_reject) does NOT
  dispatch.
- packages/core/src/orchestrator/orchestrator.test.ts — updated the two
  synthesizedPrompt-dispatch tests for the new executeWorkflow arity.

Closes #1131.

Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>
2026-04-21 12:39:10 +03:00
Rasmus Widing
7be4d0a35e
feat(paths,workflows): unify ~/.archon/{workflows,commands,scripts} + drop globalSearchPath (closes #1136) (#1315)
* feat(paths,workflows): unify ~/.archon/{workflows,commands,scripts} + drop globalSearchPath

Collapses the awkward `~/.archon/.archon/workflows/` convention to a direct
`~/.archon/workflows/` child (matching `workspaces/`, `archon.db`, etc.), adds
home-scoped commands and scripts with the same loading story, and kills the
opt-in `globalSearchPath` parameter so every call site gets home-scope for free.

Closes #1136 (supersedes @jonasvanderhaegen's tactical fix — the bug was the
primitive itself: an easy-to-forget parameter that five of six call sites on
dev dropped).

Primitive changes:

- Home paths are direct children of `~/.archon/`. New helpers in `@archon/paths`:
  `getHomeWorkflowsPath()`, `getHomeCommandsPath()`, `getHomeScriptsPath()`,
  and `getLegacyHomeWorkflowsPath()` (detection-only for migration).
- `discoverWorkflowsWithConfig(cwd, loadConfig)` reads home-scope internally.
  The old `{ globalSearchPath }` option is removed. Chat command handler, Web
  UI workflow picker, orchestrator resolve path — all inherit home-scope for
  free without maintainer patches at every new site.
- `discoverScriptsForCwd(cwd)` merges home + repo scripts (repo wins on name
  collision). dag-executor and validator use it; the hardcoded
  `resolve(cwd, '.archon', 'scripts')` single-scope path is gone.
- Command resolution is now walked-by-basename in each scope. `loadCommand`
  and `resolveCommand` walk 1 subfolder deep and match by `.md` basename, so
  `.archon/commands/triage/review.md` resolves as `review` — closes the
  latent bug where subfolder commands were listed but unresolvable.
- All three (`workflows/`, `commands/`, `scripts/`) enforce a 1-level
  subfolder cap (matches the existing `defaults/` convention). Deeper
  nesting is silently skipped.
- `WorkflowSource` gains `'global'` alongside `'bundled'` and `'project'`.
  Web UI node palette shows a dedicated "Global (~/.archon/commands/)"
  section; badges updated.
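
The merge and resolution rules above can be sketched roughly as follows. This
is an illustration under assumptions: `ScriptEntry`, `mergeScripts`, and
`resolveByBasename` are hypothetical names, and the real helpers read the
filesystem rather than taking in-memory lists.

```typescript
// Illustrative only: names and shapes here are assumptions, not Archon's API.
interface ScriptEntry {
  name: string;
  path: string;
}

// Merge home-scope and repo-scope scripts; repo entries win on name collision.
function mergeScripts(home: ScriptEntry[], repo: ScriptEntry[]): Map<string, ScriptEntry> {
  const merged = new Map<string, ScriptEntry>();
  for (const s of home) merged.set(s.name, s);
  for (const s of repo) merged.set(s.name, s); // overwrites a home entry of the same name
  return merged;
}

// Resolve a command by `.md` basename among forward-slash relative paths,
// skipping anything nested more than one subfolder deep.
function resolveByBasename(relPaths: string[], name: string): string | undefined {
  return relPaths.find((p) => {
    const parts = p.split('/');
    return parts.length <= 2 && parts[parts.length - 1] === `${name}.md`;
  });
}
```

For example, `resolveByBasename(['triage/review.md'], 'review')` finds the
subfolder command, while a path such as `a/b/review.md` is skipped.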

Migration (clean cut — no fallback read):

- First use after upgrade: if `~/.archon/.archon/workflows/` exists, Archon
  logs a one-time WARN per process with the exact `mv` command:
  `mv ~/.archon/.archon/workflows ~/.archon/workflows && rmdir ~/.archon/.archon`
  The legacy path is NOT read — users migrate manually. Rollback caveat
  noted in CHANGELOG.
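
A one-time-per-process warning of this kind might look like the sketch below.
The names (`maybeWarnOnce`, `probe`, `warn`) are hypothetical, and setting the
flag before the first `await` is an assumption made here so that concurrent
callers cannot both warn.

```typescript
// Module-level flag: one warning per process.
let warned = false;

// `probe` reports whether the legacy directory exists; `warn` emits the log line.
async function maybeWarnOnce(
  probe: () => Promise<boolean>,
  warn: (msg: string) => void,
): Promise<void> {
  if (warned) return;
  warned = true; // set before awaiting so a concurrent caller returns early
  if (await probe()) {
    warn('legacy path detected; move it with the documented mv command');
  }
}
```

The trade-off is that a failed or negative probe is not retried later in the
same process.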

Tests:

- `@archon/paths/archon-paths.test.ts`: new helper tests (default HOME,
  ARCHON_HOME override, Docker), plus regression guards for the double-`.archon/`
  path.
- `@archon/workflows/loader.test.ts`: home-scoped workflows, precedence,
  subfolder 1-depth cap, legacy-path deprecation warning fires exactly once
  per process.
- `@archon/workflows/validator.test.ts`: home-scoped commands + subfolder
  resolution.
- `@archon/workflows/script-discovery.test.ts`: depth cap + merge semantics
  (repo wins, home-missing tolerance).
- Existing CLI + orchestrator tests updated to drop `globalSearchPath`
  assertions.

E2E smoke (verified locally, before cleanup):

- `.archon/workflows/e2e-home-scope.yaml` + scratch repo at /tmp
- Home-scoped workflow discovered from an unrelated git repo
- Home-scoped script (`~/.archon/scripts/*.ts`) executes inside a script node
- 1-level subfolder workflow (`~/.archon/workflows/triage/*.yaml`) listed
- Legacy path warning fires with actionable `mv` command; workflows there
  are NOT loaded

Docs: `CLAUDE.md`, `docs-web/guides/global-workflows.md` (full rewrite for
three-type scope + subfolder convention + migration), `docs-web/reference/
configuration.md` (directory tree), `docs-web/reference/cli.md`,
`docs-web/guides/authoring-workflows.md`.

Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>

* test(script-discovery): normalize path separators in mocks for Windows

The 4 new tests in `scanScriptDir depth cap` and `discoverScriptsForCwd —
merge repo + home with repo winning` compared incoming mock paths with
hardcoded forward-slash strings (`if (path === '/scripts/triage')`). On
Windows, `path.join('/scripts', 'triage')` produces `\scripts\triage`, so
those branches never matched, readdir returned `[]`, and the tests failed.

Added a `norm()` helper at module scope and wrapped the incoming `path`
argument in every `mockImplementation` before comparing. Stored paths go
through `normalizeSep()` in production code, so the existing equality
assertions on `script.path` remain OS-independent.
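
A helper like `norm()` could be as small as the sketch below. This is an
assumed implementation for illustration; the actual test helper may differ.

```typescript
// Convert Windows-style backslashes to forward slashes so mock comparisons
// against hardcoded '/scripts/...' strings behave the same on every OS.
function norm(p: string): string {
  return p.replace(/\\/g, '/');
}
```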

Fixes Windows CI job `test (windows-latest)` on PR #1315.

* address review feedback: home-scope error handling, depth cap, and tests

Critical fixes:
- api.ts: add `maxDepth: 1` to all 3 findMarkdownFilesRecursive calls in
  GET /api/commands (bundled/home/project). Without this the UI palette
  surfaced commands from deep subfolders that the executor (capped at 1)
  could not resolve — silent "command not found" at runtime.
- validator.ts: wrap home-scope findMarkdownFilesRecursive and
  resolveCommandInDir calls in try/catch so EACCES/EPERM on
  ~/.archon/commands/ doesn't crash the validator with a raw filesystem
  error. ENOENT still returns [] via the underlying helper.

Error handling fixes:
- workflow-discovery.ts: maybeWarnLegacyHomePath now sets the
  "warned-once" flag eagerly before `await access()`, so concurrent
  discovery calls (server startup with parallel codebase resolution)
  can't double-warn. Non-ENOENT probe errors (EACCES/EPERM) now log at
  WARN instead of DEBUG so permission issues on the legacy dir are
  visible in default operation.
- dag-executor.ts: wrap discoverScriptsForCwd in its own try/catch so
  an EACCES on ~/.archon/scripts/ routes through safeSendMessage /
  logNodeError with a dedicated "failed to discover scripts" message
  instead of being mis-attributed by the outer catch's
  "permission denied (check cwd permissions)" branch.

Tests:
- load-command-prompt.test.ts (new): 6 tests covering the executor's
  command resolution hot path — home-scope resolves when repo misses,
  repo shadows home, 1-level subfolder resolvable by basename, 2-level
  rejected, not-found, empty-file. Runs in its own bun test batch.
- archon-paths.test.ts: add getHomeScriptsPath describe block to match
  the existing getHomeCommandsPath / getHomeWorkflowsPath coverage.

Comment clarity:
- workflow-discovery.ts: MAX_DISCOVERY_DEPTH comment now leads with the
  actual value (1) before describing what 0 would mean.
- script-discovery.ts: copy the "routing ambiguity" rationale from
  MAX_DISCOVERY_DEPTH to MAX_SCRIPT_DISCOVERY_DEPTH.

Cleanup:
- Remove .archon/workflows/e2e-home-scope.yaml — one-off smoke test that
  would ship permanently in every project's workflow list. Equivalent
  coverage exists in loader.test.ts.

Addresses all blocking and important feedback from the multi-agent
review on PR #1315.

---------

Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>
2026-04-20 21:45:32 +03:00
Cole Medin
cb44b96f7b
feat(providers/pi): interactive flag binds UIContext for extensions (#1299)
* feat(providers/pi): interactive flag binds UIContext for extensions

Adds `interactive: true` opt-in to Pi provider (in `.archon/config.yaml`
under `assistants.pi`) that binds a minimal `ExtensionUIContext` stub to
each session. Without this, Pi's `ExtensionRunner.hasUI()` reports false,
causing extensions like `@plannotator/pi-extension` to silently auto-approve
every plan instead of opening their browser review UI.

Semantics: clamped to `enableExtensions: true` — no extensions loaded
means nothing would consume `hasUI`, so `interactive` alone is silently
dropped. Stub forwards `notify()` to Archon's event stream; interactive
dialogs (select/confirm/input/editor/custom) resolve to undefined/false;
TUI-only setters (widgets/headers/footers/themes) no-op. Theme access
throws with a clear diagnostic — Pi's theme singleton is coupled to its
own `Symbol.for()` registry which Archon doesn't own.

Trust boundary: only binds when the operator has explicitly enabled
both flags. Extensions gated on `ctx.hasUI` (plannotator and similar)
get a functional UI context; extensions that reach for TUI features
still fail loudly rather than rendering garbage.

Includes smoke-test workflow documenting the integration surface.
End-to-end plannotator UI rendering requires plan-mode activation
(Pi `--plan` CLI flag or `/plannotator` TUI slash command) which is
out of reach for programmatic Archon sessions — manual test only.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(providers/pi): end-to-end interactive extension UI

Three fixes that together get plannotator's browser review UI to actually
render from an Archon workflow and reach the reviewer's browser.

1. Call resourceLoader.reload() when enableExtensions is true.
   createAgentSession's internal reload is gated on `!resourceLoader`, so
   caller-supplied loaders must reload themselves. Without this,
   getExtensions() returns the empty default, no ExtensionRunner is built,
   and session.extensionRunner.setFlagValue() silently no-ops.

2. Set PLANNOTATOR_REMOTE=1 in interactive mode.
   plannotator-browser.ts only calls ctx.ui.notify(url) when openBrowser()
   returns { isRemote: true }; otherwise it spawns xdg-open/start on the
   Archon server host — invisible to the user and untestable from bash
   asserts. From the workflow runner's POV every Archon execution IS
   remote; flipping the heuristic routes the URL through notify(), which
   the ExtensionUIContext stub forwards into the event stream. Respect
   explicit operator overrides.

3. notify() emits as assistant chunks, not system chunks.
   The DAG executor's system-chunk filter only forwards warnings/MCP
   prefixes, and only assistant chunks accumulate into $nodeId.output.
   Emitting as assistant makes the URL available both in the user's
   stream and in downstream bash/script nodes via output substitution.

Plus: extensionFlags config pass-through (equivalent to `pi --plan` on the
CLI) applied via ExtensionRunner.setFlagValue() BEFORE bindExtensions
fires session_start, so extensions reading flags in their startup handler
actually see them. Also bind extensions with an empty binding when
enableExtensions is on but interactive is off, so session_start still
fires for flag-driven but UI-less extensions.

Smoke test (.archon/workflows/e2e-plannotator-smoke.yaml) uses
openai-codex/gpt-5.4-mini (ChatGPT Plus OAuth compatible) and bumps
idle_timeout to 600000ms so plannotator's server survives while a human
approves in the browser.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(providers/pi): keep Archon extension-agnostic

Remove the plannotator-specific PLANNOTATOR_REMOTE=1 env var write from
the Pi provider. Archon's provider layer shouldn't know about any
specific extension's internals. Document the env var in the plannotator
smoke test instead — operators who use plannotator set it via their shell
or per-codebase env config.

Workflow smoke test updated with:
- Instructions for setting PLANNOTATOR_REMOTE=1 externally
- Simpler assertion (URL emission only) — validated in a real
  reject-revise-approve run: reviewer annotated, clicked Send Feedback,
  Pi received the feedback as a tool result, revised the plan (added
  aria-label and WCAG contrast per the annotation), resubmitted, and
  reviewer approved. Plannotator's tool result signals approval but
  doesn't return the plan text, so the bash assertion now only checks
  that the review URL reached the stream (not that plan content flowed
  into $nodeId.output; it can't).
- Known-limitation note documenting the tool-result shape so downstream
  workflow authors know to Write the plan separately if they need it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(providers/pi): keep e2e-plannotator-smoke workflow local-only

The smoke test is plannotator-specific (calls plannotator_submit_plan,
expects PLAN.md on disk, requires PLANNOTATOR_REMOTE=1) and is better
kept out of the PR while the extension-agnostic infra lands.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style(providers/pi): trim verbose inline comments

Collapse multi-paragraph SDK explanations to 1-2 line "why" notes across
provider.ts, types.ts, ui-context-stub.ts, and event-bridge.ts. No
behavior change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(providers/pi): wire assistants.pi.env + theme-proxy identity

Two end-to-end fixes discovered while exercising the combined
plannotator + @pi-agents/loop smoke flow:

- PiProviderDefaults gains an optional `env` map; parsePiConfig picks
  it up and the provider applies it to process.env at session start
  (shell env wins, no override). Needed so extensions like plannotator
  can read PLANNOTATOR_REMOTE=1 from config.yaml without requiring a
  shell export before `archon workflow run`.

- ui-context-stub theme proxy returns identity decorators instead of
  throwing on unknown methods. Styled strings flow into no-op
  setStatus/setWidget sinks anyway, so the throw was blocking
  plannotator_submit_plan after HTTP approval with no benefit.
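
The "shell env wins, no override" rule for the `env` map can be sketched as
below. `applyEnvDefaults` is a hypothetical name, and a plain object stands in
for `process.env` so the example stays self-contained.

```typescript
// Copy config-provided values into `target` only where no value is set yet,
// so anything already exported in the shell takes precedence.
function applyEnvDefaults(
  target: Record<string, string | undefined>,
  defaults: Record<string, string>,
): void {
  for (const key of Object.keys(defaults)) {
    if (target[key] === undefined) {
      target[key] = defaults[key];
    }
  }
}
```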

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(providers/pi): flush notify() chunks immediately in batch mode

Batch-mode adapters (CLI) accumulate assistant chunks and only flush on
node completion. That broke plannotator's review-URL flow: Pi's notify()
emitted the URL as an assistant chunk, but the user needed the URL to
POST /api/approve — which is what unblocks the node in the first place.

Adds an optional `flush` flag on assistant MessageChunks. notify() sets
it, and the DAG executor drains pending batched content before surfacing
the flushed chunk so ordering is preserved.
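
A rough sketch of the drain-then-send ordering, assuming a simplified chunk
shape. `AssistantChunk` and `createBatchBuffer` are hypothetical names; the
real MessageChunk type and executor wiring are not shown in this commit
message.

```typescript
// Simplified stand-in for an assistant chunk with the optional flush flag.
interface AssistantChunk {
  content: string;
  flush?: boolean;
}

// Batch-mode buffer: accumulates content and only sends on drain, except
// that a flushed chunk forces pending content out first, then itself.
function createBatchBuffer(send: (text: string) => void) {
  let pending: string[] = [];

  const drain = (): void => {
    if (pending.length === 0) return;
    send(pending.join(''));
    pending = [];
  };

  const push = (chunk: AssistantChunk): void => {
    if (chunk.flush) {
      drain(); // earlier batched content goes out first, preserving order
      send(chunk.content);
      return;
    }
    pending.push(chunk.content);
  };

  return { push, drain };
}
```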

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: mention Pi alongside Claude and Codex in README + top-level docs

The AI assistants docs page already covers Pi in depth, but the README
architecture diagram + docs table, overview "Further Reading" section,
and local-deployment .env comment still listed only Claude/Codex.

Left feature-specific mentions alone where Pi genuinely lacks support
(e.g. structured output — Claude + Codex only).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: note Pi structured output (best-effort) in matrix + workflow docs

Pi gained structured output support via prompt augmentation + JSON
extraction (see packages/providers/src/community/pi/capabilities.ts).
Unlike Claude/Codex, which use SDK-enforced JSON mode, Pi appends the
schema to the prompt and parses JSON out of the result text (bare or
fenced). Updates four stale references that still said Claude/Codex-only:

- ai-assistants.md capabilities matrix
- authoring-workflows.md (YAML example + field table)
- workflow-dag.md skill reference
- CLAUDE.md DAG-format node description
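
The "bare or fenced" JSON extraction described above might be sketched like
this. `extractJson` is a hypothetical name and the regex is an assumption; the
code in capabilities.ts may parse differently. The fence marker is built with
`repeat` only to keep literal backtick runs out of the example.

```typescript
// Three backticks, built programmatically.
const FENCE = '`'.repeat(3);
// Matches the first fenced block, with an optional "json" language tag.
const FENCED_JSON = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)' + FENCE);

// Parse JSON from model output that is either bare JSON or wrapped in a
// fenced block. Throws SyntaxError if the candidate text is not valid JSON.
function extractJson(text: string): unknown {
  const match = FENCED_JSON.exec(text);
  const candidate = (match ? match[1] : text).trim();
  return JSON.parse(candidate);
}
```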

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(providers/pi): default extensions + interactive to on

Extensions (community packages like @plannotator/pi-extension and
user-authored ones) are a core reason users pick Pi. Defaulting
enableExtensions and interactive to false previously silenced installed
extensions with no signal, leading to "did my extension even load?"
confusion.

Opt out in .archon/config.yaml when you want the prior behavior:

  assistants:
    pi:
      enableExtensions: false   # skip extension discovery entirely
      # interactive: false       # load extensions, but no UI bridge

Docs gain a new "Extensions (on by default)" section in
getting-started/ai-assistants.md that documents the three config
surfaces (extensionFlags, env, workflow-level interactive) and uses
plannotator as a concrete walk-through example.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-20 07:37:40 -05:00
Rasmus Widing
60eeb00e42
feat(workflows): inline sub-agent definitions on DAG nodes (#1276)
* feat(workflows): inline sub-agent definitions on DAG nodes

Add `agents:` node field letting workflow YAML define Claude Agent SDK
sub-agents inline, keyed by kebab-case ID. The main agent can spawn
them via the Task tool — useful for map-reduce patterns where a cheap
model briefs items and a stronger model reduces.

Authors no longer need standalone `.claude/agents/*.md` files for
workflow-scoped helpers; the definitions live with the workflow.

Claude only. Codex and community providers without the capability
emit a capability warning and ignore the field. Merges with the
internal `dag-node-skills` wrapper when `skills:` is also set —
user-defined agents win on ID collision.

* fix(workflows): address PR #1276 review feedback

Critical:
- Re-export agentDefinitionSchema + AgentDefinition from schemas/index.ts
  (matches the "schemas/index.ts re-exports all" convention).

Important:
- Surface user-override of internal 'dag-node-skills' wrapper: warn-level
  provider log + platform message to the user when agents: redefines the
  reserved ID alongside skills:. User-wins behavior preserved (by design)
  but silent capability removal is now observable.
- Add validator test coverage for the agents-capability warning (codex
  node with agents: → warning; claude node → no warning; no-agents
  field → no warning).
- Strengthen NodeConfig.agents duplicate-type comment explaining the
  intentional circular-dep avoidance and pointing at the Zod schema as
  authoritative source. Actual extraction is follow-up work.

Simplifications:
- Drop redundant typeof check in validator (schema already enforces).
- Drop unreachable Object.keys(...).length > 0 check in dag-executor.
- Drop rot-prone "(out of v1 scope)" parenthetical.
- Drop WHAT-only comment on AGENT_ID_REGEX.
- Tighten AGENT_ID_REGEX to reject trailing/double hyphens
  (/^[a-z0-9]+(-[a-z0-9]+)*$/).
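
For reference, the tightened pattern behaves as follows on a few sample IDs.
The sample IDs and the `isValidAgentId` wrapper are illustrative only.

```typescript
// Pattern quoted from the commit message: lowercase alphanumeric segments
// joined by single hyphens, with no leading, trailing, or doubled hyphens.
const AGENT_ID_REGEX = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// Hypothetical wrapper for illustration.
function isValidAgentId(id: string): boolean {
  return AGENT_ID_REGEX.test(id);
}

// Accepted: 'reviewer', 'code-reviewer', 'step-2-summary'
// Rejected: 'trailing-', 'double--hyphen', '-leading', 'Upper-Case', ''
```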

Tests:
- parseWorkflow strips agents on script: and loop: nodes (parallel to
  the existing bash: coverage).
- provider emits warn log on dag-node-skills collision; no warn on
  non-colliding inline agents.

Docs:
- Renumber authoring-workflows Summary section (12b → 13; bump 13-19).
- Add Pi capability-table row for inline agents (Claude-only).
- Add when-to-use guidance (agents: vs .claude/agents/*.md) in the
  new "Inline sub-agents" section.
- Cross-link skills.md Related → inline-sub-agents.
- CHANGELOG [Unreleased] Added entry for #1276.
2026-04-19 09:16:01 +03:00
Cole Medin
4c6ddd994f
fix(workflows): fail loudly on SDK isError results (#1208) (#1291)
Previously, `dag-executor` only failed nodes/iterations when the SDK
returned an `error_max_budget_usd` result. Every other `isError: true`
subtype — including `error_during_execution` — was silently `break`ed
out of the stream with whatever partial output had accumulated, letting
failed runs masquerade as successful ones with empty output.

This is the most likely explanation for the "5-second crash" symptom in
#1208: iterations finish instantly with empty text, the loop keeps
going, and only the `claude.result_is_error` log tips the user off.

Changes:
- Capture the SDK's `errors: string[]` detail on result messages
  (previously discarded) and surface it through `MessageChunk.errors`.
- Log `errors`, `stopReason` alongside `errorSubtype` in
  `claude.result_is_error` so users can see what actually failed.
- Throw from both the general node path and the loop iteration path
  on any `isError: true` result, including the subtype and SDK errors
  detail in the thrown message.
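
A minimal sketch of the fail-loud check, assuming a simplified result shape.
`ResultChunk` and `assertResultOk` are hypothetical names, and the message
format is an assumption rather than the text dag-executor actually throws.

```typescript
// Simplified stand-in for the fields a result message carries.
interface ResultChunk {
  isError: boolean;
  errorSubtype?: string;
  errors?: string[];
}

// Throw on any isError result, including the subtype and the SDK's errors
// detail in the message, instead of silently breaking out of the stream.
function assertResultOk(result: ResultChunk): void {
  if (!result.isError) return;
  const subtype = result.errorSubtype ?? 'unknown';
  const detail =
    result.errors && result.errors.length > 0 ? `: ${result.errors.join('; ')}` : '';
  throw new Error(`SDK returned an error result (${subtype})${detail}`);
}
```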

Note: this does not implement auto-retry. See PR comments on #1121 and
the analysis on #1208 — a retry-with-fresh-session approach for loop
iterations is not obviously correct until we see what
`error_during_execution` actually carries in the reporter's env.
This change is the observability + fail-loud step that has to come
first so that signal is no longer silent.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 15:02:35 -05:00
Cole Medin
367de7a625 test(ci): inject deliberate failure to verify CI red X
Injects exit 1 into e2e-deterministic bash-echo node to prove the engine
fix (failWorkflowRun on anyFailed) propagates to a non-zero CLI exit code
and a red X in GitHub Actions. Will be reverted in the next commit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:40:55 -05:00
Rasmus Widing
d6e24f5075
feat: Phase 2 — community-friendly provider registry system (#1195)
* feat: replace hardcoded provider factory with typed registry system

Replace the built-in-only factory switch with a typed ProviderRegistration
registry where entries carry metadata (displayName, capabilities,
isModelCompatible) alongside the factory function. This enables community
providers to register without modifying core code.

- Add ProviderRegistration and ProviderInfo types to contract layer
- Create registry.ts with register/get/list/clear API, delete factory.ts
- Bootstrap registerBuiltinProviders() at server and CLI entrypoints
- Widen provider unions from 'claude' | 'codex' to string across schemas,
  config types, deps, executors, and API validation
- Replace hardcoded model-validation with registry-driven isModelCompatible
  and inferProviderFromModel (built-in only inference)
- Add GET /api/providers endpoint returning registry metadata
- Dynamic provider dropdowns in Web UI (BuilderToolbar, NodeInspector,
  WorkflowBuilder, SettingsPage) via useProviders hook
- Dynamic provider selection in CLI setup command
- Registry test suite covering full lifecycle
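
The register/get/list/clear shape can be illustrated with a small in-memory
registry. This is a sketch under assumptions: the field names follow the
metadata listed above, but the function names, the duplicate-ID check, and the
error text are hypothetical and not taken from registry.ts.

```typescript
// Assumed registration shape: metadata travels with the factory.
interface ProviderRegistration {
  id: string;
  displayName: string;
  builtIn: boolean;
  isModelCompatible: (model: string) => boolean;
  factory: () => unknown;
}

const registry = new Map<string, ProviderRegistration>();

function registerProvider(reg: ProviderRegistration): void {
  // Assumption: duplicate IDs are rejected rather than overwritten.
  if (registry.has(reg.id)) {
    throw new Error(`Provider already registered: ${reg.id}`);
  }
  registry.set(reg.id, reg);
}

function getRegistration(id: string): ProviderRegistration {
  const reg = registry.get(id);
  if (!reg) throw new Error(`Unknown provider: ${id}`);
  return reg;
}

function listProviders(): ProviderRegistration[] {
  return Array.from(registry.values());
}

// Intended for tests only.
function clearRegistry(): void {
  registry.clear();
}
```

Because callers look providers up by string ID, a community provider only
needs to call the register function at startup; no union type or switch has
to be edited.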

* feat: generalize assistant config and tighten registry validation

- Add ProviderDefaults/ProviderDefaultsMap generic types to contract layer
- Add index signatures to ClaudeProviderDefaults/CodexProviderDefaults
- Introduce AssistantDefaults/AssistantDefaultsConfig intersection types
  that combine ProviderDefaultsMap with typed built-in entries
- Replace hardcoded claude/codex config merging with generic
  mergeAssistantDefaults() that iterates all provider entries
- Replace hardcoded toSafeConfig projection with generic
  toSafeAssistantDefaults() that strips server-internal fields
- Validate provider strings at all config-entry surfaces: env override,
  global config, repo config all throw on unknown providers
- Validate provider on PATCH /api/config/assistants (400 on unknown)
- Move validator.ts from hardcoded Codex checks to capability-driven
  warnings using registry getProviderCapabilities()
- Remove resolveProvider() default to 'claude' — returns undefined when
  no provider is set, skipping capability warnings for unresolved nodes
- Widen config API schemas to generic Record<string, ProviderDefaults>
- Rewrite SettingsPage to iterate providers dynamically with built-in
  specific UI for Claude/Codex and generic JSON view for community
- Extract bootstrap to provider-bootstrap modules in CLI and server
- Remove all as Record<...> casts from dag-executor, executor,
  orchestrator — clean indexing via ProviderDefaultsMap intersection

* fix: remove remaining hardcoded provider assumptions and regenerate types

- Replace hardcoded 'claude' defaults in CLI setup with registry lookup
  (getRegisteredProviders().find(p => p.builtIn)?.id)
- Replace hardcoded 'claude' default in clone.ts folder detection with
  registry-driven fallback
- Update config YAML comment from "claude or codex" to "registered provider"
- Make bootstrap test assertions use toContain instead of exact toEqual
  so they don't break when community providers are registered
- Widen validator.test.ts helper from 'claude' | 'codex' to string
- Remove unnecessary type casts in NodeInspector, WorkflowBuilder,
  SettingsPage now that generated types use string
- Regenerate api.generated.d.ts from updated OpenAPI spec — all provider
  fields are now string instead of 'claude' | 'codex' union

* fix: address PR review findings — consistency, tests, docs

Critical fixes:
- isModelCompatible now throws on unknown providers (fail-fast parity
  with getProviderCapabilities) instead of silently returning true
- Schema provider fields use z.string().trim().min(1) to reject
  whitespace-only values
- validator.ts resolveProvider accepts defaultProvider param so
  capability warnings fire for config-inherited providers
- PATCH /api/config/assistants validates assistants keys against
  registry (rejects unknown provider IDs in the map)

YAGNI cleanup:
- Delete provider-bootstrap.ts wrappers in CLI and server — call
  registerBuiltinProviders() directly
- Remove no-op .map(provider => provider) in SettingsPage

Test coverage:
- Add GET /api/providers endpoint tests (shape, projection, capabilities)
- Add config-loader throw-path tests for unknown providers in env var,
  global config, and repo config
- Add isModelCompatible throw test for unknown providers

Docs:
- CLAUDE.md: factory.ts → registry.ts in directory tree, add
  GET /api/providers to API endpoints section
- .env.example: update DEFAULT_AI_ASSISTANT comment
- docs-web configuration reference: update provider constraint docs

UI:
- Settings default-assistant dropdown uses allProviderEntries fallback
  (no longer silently empty on API failure)
- clearRegistry marked @internal in JSDoc

* fix: use registry defaults in getDefaults/registerProject, document type design

- getDefaults() initializes assistant defaults from registered providers
  instead of hardcoding { claude: {}, codex: {} }
- getDefaults() uses first registered built-in as default assistant
  instead of hardcoding 'claude'
- handleRegisterProject uses config.assistant instead of hardcoded 'claude'
  for new codebase ai_assistant_type
- Document AssistantDefaults/AssistantDefaultsConfig intersection types:
  built-in keys are typed for parseClaudeConfig/parseCodexConfig type
  safety; community providers use the generic [string] index
- Document WorkflowConfig.assistants intersection type with same rationale

* docs: update stale provider references to reflect registry system

- architecture.md: DB schema comment now says 'registered provider'
- first-workflow.md: provider field accepts any registered provider
- quick-reference.md: provider type changed from enum to string
- authoring-workflows.md: provider type changed from enum to string
- title-generator.ts: @param doc updated from 'claude or codex' to
  generic provider identifier

* docs: fix remaining stale provider references in quick-reference and authoring guide

- quick-reference.md: per-node provider type changed from enum to string
- quick-reference.md: model mismatch guidance updated for registry pattern
- authoring-workflows.md: provider comment says 'any registered provider'
2026-04-13 21:27:11 +03:00
Rasmus Widing
b5c5f81c8a
refactor: extract provider metadata seam for Phase 2 registry readiness (#1185)
* refactor: extract provider metadata seam for Phase 2 registry readiness

- Add static capability constants (capabilities.ts) for Claude and Codex
- Export getProviderCapabilities() from @archon/providers for capability
  queries without provider instantiation
- Add inferProviderFromModel() to model-validation.ts, replacing three
  copy-pasted inline inference blocks in executor.ts and dag-executor.ts
- Replace throwaway provider instantiation in dag-executor with static
  capability lookup (getProviderCapabilities)
- Add orchestrator warning when env vars are configured but provider
  doesn't support envInjection

* refactor: address LOW findings from code review

- Remove CLAUDE_CAPABILITIES/CODEX_CAPABILITIES from public index (YAGNI —
  callers should use getProviderCapabilities(), not raw constants)
- Remove dead _deps parameter from resolveNodeProviderAndModel and its
  two call-sites (no longer needed after static capability lookup refactor)
- Update factory.ts module JSDoc to mention both exported functions
- Add edge-case tests for getProviderCapabilities: empty string and
  case-sensitive throws (parity with existing getAgentProvider tests)
- Add test for inferProviderFromModel with empty string (returns default,
  documenting the falsy-string shortcut)
2026-04-13 16:10:48 +03:00
Rasmus Widing
bf20063e5a
feat: propagate managed execution env to all workflow surfaces (#1161)
* Implement managed execution env propagation

* Address managed env review feedback
2026-04-13 15:21:57 +03:00
Rasmus Widing
a8ac3f057b
security: prevent target repo .env from leaking into subprocesses (#1135)
Remove the entire env-leak scanning/consent infrastructure: scanner,
allow_env_keys DB column usage, allow_target_repo_keys config, PATCH
consent route, --allow-env-keys CLI flag, and UI consent toggle.

The env-leak gate was the wrong primitive. Target repo .env protection
is already structural:
- stripCwdEnv() at boot removes Bun-auto-loaded CWD .env keys
- Archon loads its own env sources afterward (~/.archon/.env)
- process.env is clean before any subprocess spawns
- Managed env injection (config.yaml env: + DB vars) is unchanged

No scanning, no consent, no blocking. Any repo can be registered and
used. Subprocesses receive the already-clean process.env.
2026-04-13 13:46:24 +03:00
Rasmus Widing
c1ed76524b
refactor: extract providers from @archon/core into @archon/providers (#1137)
* refactor: extract providers from @archon/core into @archon/providers

Move Claude and Codex provider implementations, factory, and SDK
dependencies into a new @archon/providers package. This establishes a
clean boundary: providers own SDK translation, core owns business logic.

Key changes:
- New @archon/providers package with zero-dep contract layer (types.ts)
- @archon/workflows imports from @archon/providers/types — no mirror types
- dag-executor delegates option building to providers via nodeConfig
- IAgentProvider gains getCapabilities() for provider-agnostic warnings
- @archon/core no longer depends on SDK packages directly
- UnknownProviderError standardizes error shape across all surfaces

Zero user-facing changes — same providers, same config, same behavior.

* refactor: remove config type duplication and backward-compat re-exports

Address review findings:
- Move ClaudeProviderDefaults and CodexProviderDefaults to the
  @archon/providers/types contract layer as the single source of truth.
  @archon/core/config/config-types.ts now imports from there.
- Remove provider re-exports from @archon/core (index.ts and types/).
  Consumers should import from @archon/providers directly.
- Update @archon/server to depend on @archon/providers for MessageChunk.

* refactor: move structured output validation into providers

Each provider now normalizes its own structured output semantics:
- Claude already yields structuredOutput from the SDK's native field
- Codex now parses inline agent_message text as JSON when outputFormat
  is set, populating structuredOutput on the result chunk

This eliminates the last provider === 'codex' branch from dag-executor,
making it fully provider-agnostic. The dag-executor checks structuredOutput
uniformly regardless of provider.

Also removes the ClaudeCodexProviderDefaults deprecated alias — all
consumers now use ClaudeProviderDefaults directly.

* fix: address PR review — restore warnings, fix loop options, cleanup

Critical fixes:
- Restore MCP missing env vars user-facing warning (was silently dropped)
- Restore Haiku + MCP tool search warning
- Fix buildLoopNodeOptions to pass workflow-level nodeConfig (effort,
  thinking, betas, sandbox were silently lost for loop nodes)
- Add TODO(#1135) comments documenting env-leak gate gap

Cleanup:
- Remove backward-compat type aliases from deps.ts (keep WorkflowTokenUsage)
- Remove 26 unnecessary eslint-disable comments from test files
- Trim internal helpers from providers barrel (withFirstMessageTimeout,
  getProcessUid, loadMcpConfig, buildSDKHooksFromYAML)
- Add @archon/providers dep to CLI package.json
- Fix 8 stale documentation paths pointing to deleted core/src/providers/
- Add E2E smoke test workflows for both Claude and Codex providers

* fix: forward provider system warnings to users in dag-executor

The dag-executor only forwarded system chunks starting with
"MCP server connection failed:" — all other provider warnings
(missing env vars, Haiku+MCP, structured output issues) were
logged but never reached the user.

Now forwards all system chunks starting with ⚠️ (the prefix
providers use for user-actionable warnings).
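
The forwarding rule described above could be sketched roughly as follows. This is an illustration only: the `SystemChunk` shape, the `content` field, and the `shouldForwardToUser` name are assumptions, not taken from the dag-executor source.

```typescript
// Hypothetical chunk shape; the real provider chunk type may differ.
interface SystemChunk {
  type: "system";
  content: string;
}

// U+26A0 (warning sign). Matching on the base code point accepts the emoji
// with or without the trailing U+FE0F variation selector.
const USER_WARNING_PREFIX = "\u26A0";

// Forward only system chunks that carry the user-actionable warning prefix.
function shouldForwardToUser(chunk: SystemChunk): boolean {
  return chunk.content.startsWith(USER_WARNING_PREFIX);
}

const chunks: SystemChunk[] = [
  { type: "system", content: "\u26A0\uFE0F Missing env vars for MCP server" },
  { type: "system", content: "session initialized" },
  { type: "system", content: "\u26A0 Structured output could not be parsed" },
];

const forwarded = chunks.filter(shouldForwardToUser);
```

Here `forwarded` keeps the two warning chunks and drops the informational one.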

* fix: add providers package to Dockerfile and fix CI module resolution

- Add packages/providers/ to all three Dockerfile stages (deps,
  production package.json copy, production source copy)
- Replace wildcard export map (./*) with explicit subpath entries
  to fix module resolution in CI (bun workspace linking)

* chore: update bun.lock for providers package exports
2026-04-13 09:21:36 +03:00
Rasmus Widing
91c184af57 refactor: rename IAssistantClient to IAgentProvider
Rename the core AI provider interface and all related types, classes,
factory functions, and directory from clients/ to providers/.

Rename map:
- IAssistantClient → IAgentProvider
- ClaudeClient → ClaudeProvider
- CodexClient → CodexProvider
- getAssistantClient → getAgentProvider
- AssistantRequestOptions → AgentRequestOptions
- IWorkflowAssistantClient → IWorkflowAgentProvider
- AssistantClientFactory → AgentProviderFactory
- WorkflowAssistantOptions → WorkflowAgentOptions
- packages/core/src/clients/ → packages/core/src/providers/

NOT renamed (user-facing/DB-stored): assistant config key,
DEFAULT_AI_ASSISTANT env var, ai_assistant_type DB column.

No behavioral changes — purely naming.
2026-04-12 13:11:21 +03:00
Cole Medin
4e56c86dff fix: eliminate duplicate text and tool calls in workflow execution view
Three fixes for message duplication during live workflow execution:

1. dag-executor: Add missing `tool_call_formatted` category to loop iteration
   tool messages. Without this, the web adapter sent tool text as both a regular
   SSE text event AND a structured tool_call event, causing each tool to appear
   twice (raw text + rendered card). Regular DAG nodes already had this metadata.

2. WorkflowLogs: Add text content dedup in SSE/DB merge. During live execution,
   the same text (e.g. "Starting workflow...") can appear in both DB (REST fetch)
   and SSE (event buffer replay). Collects DB text into a Set and skips matching
   SSE text messages.

3. orchestrator-agent: Suppress remainingMessage re-send in stream mode. The
   routing AI streams text chunks before /invoke-workflow is detected, then
   retracts them. Without suppression, remainingMessage re-sends the same text.
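
The SSE/DB text dedup from item 2 can be sketched as below. The `LogMessage` shape and the `mergeMessages` name are hypothetical; the WorkflowLogs component itself is not shown in this log.

```typescript
// Hypothetical message shape, for illustration only.
interface LogMessage {
  source: "db" | "sse";
  text: string;
}

// Keep every DB message, then append only the SSE messages whose text
// has not already been seen in the DB set.
function mergeMessages(dbMessages: LogMessage[], sseMessages: LogMessage[]): LogMessage[] {
  const seenText = new Set(dbMessages.map((m) => m.text));
  const freshSse = sseMessages.filter((m) => !seenText.has(m.text));
  return [...dbMessages, ...freshSse];
}

const merged = mergeMessages(
  [{ source: "db", text: "Starting workflow..." }],
  [
    { source: "sse", text: "Starting workflow..." },
    { source: "sse", text: "Node plan completed" },
  ],
);
```

One trade-off of matching on text alone: two distinct messages that happen to carry identical text collapse into one.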

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 15:48:40 -05:00
Rasmus Widing
50f96f870e
feat: script node type for DAG workflows (bun/uv runtimes) (#999)
* feat: add ScriptNode schema and type guards (US-001)

Implements US-001 from the script-nodes PRD.

Changes:
- Add scriptNodeSchema with script, runtime (bun|uv), deps, and timeout fields
- Add ScriptNode type with never fields for mutual exclusivity
- Add isScriptNode type guard
- Add SCRIPT_NODE_AI_FIELDS constant (same as BASH_NODE_AI_FIELDS)
- Update dagNodeSchema superRefine and transform to handle script: nodes
- Update DagNode union type to include ScriptNode
- Add script node dispatch stub in dag-executor.ts (fails fast until US-003)
- Export all new types and values from schemas/index.ts
- Add comprehensive schema tests for ScriptNode parsing and validation

* feat: script discovery from .archon/scripts/

Implements US-002 from PRD.

Changes:
- Add ScriptDefinition type and discoverScripts() in script-discovery.ts
- Auto-detect runtime from file extension (.ts/.js->bun, .py->uv)
- Handle duplicate script name conflicts across extensions
- Add bundled defaults infrastructure (empty) for scripts
- Add tests for discovery, naming, and runtime detection
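
The extension-to-runtime mapping listed above (.ts/.js to bun, .py to uv) might look like the sketch below. The `detectRuntime` name and the case-insensitive match are assumptions; only the mapping itself comes from the commit message.

```typescript
type ScriptRuntime = "bun" | "uv";

// Hypothetical helper: map a script file name to its runtime by extension.
// Returns undefined for extensions the commit message does not mention.
function detectRuntime(fileName: string): ScriptRuntime | undefined {
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot).toLowerCase();
  if (ext === ".ts" || ext === ".js") return "bun";
  if (ext === ".py") return "uv";
  return undefined;
}
```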

* feat: script execution engine (inline + named)

Implements US-003 from PRD.

Changes:
- Add executeScriptNode() in dag-executor.ts following executeBashNode pattern
- Support inline bun (-e) and uv (run python -c) execution
- Support named scripts via bun run / uv run
- Wire ScriptNode dispatch replacing 'not yet implemented' stub
- Capture stdout as node output, stderr as warning
- Handle timeout and non-zero exit
- Pass env vars for variable substitution
- Add tests for inline/named/timeout/failure cases

* feat: runtime availability validation at load time

Implements US-004 from PRD.

Changes:
- Add checkRuntimeAvailable() utility for bun/uv binary detection
- Extend validator.ts with script file and runtime validation
- Integrate script validation into parseWorkflow flow in loader.ts
- Add tests for runtime availability detection

* feat: dependency installation for script nodes

Implements US-005 from PRD.

Changes:
- Support deps field for uv nodes: uvx --with dep1... for inline
- Support uv run --with dep1... for named uv scripts
- Bun deps are auto-installed at runtime via bun's native mechanism
- Empty/omitted deps field produces no extra flags
- Add tests for dep injection into both runtimes

* test: integration tests and validation for script nodes

Implements US-006 from PRD.

Changes:
- Fill test coverage gaps for script node feature
- Add script + command mutual exclusivity schema test
- Add env var substitution tests ($WORKFLOW_ID, $ARTIFACTS_DIR in scripts)
- Add stderr handling test (stderr sent to user as platform message)
- Add missing named script file validation tests to validator.test.ts
- Full bun run validate passes

* fix: address review findings in script nodes

- Extract isInlineScript to executor-shared.ts (was duplicated in
  dag-executor.ts and validator.ts)
- Remove dead warnMissingScriptRuntimes from loader.ts (validator
  already covers runtime checks)
- Remove path traversal fallback in executeScriptNode — error when
  named script not found instead of executing arbitrary file paths
- Memoize checkRuntimeAvailable to avoid repeated subprocess spawns
- Add min(1) to scriptNodeSchema.script field for consistency
- Replace dynamic import with static import in validator.ts

* fix(workflows): address review findings for script node implementation

Critical fixes:
- Wrap discoverScripts() in try-catch inside executeScriptNode to prevent
  unhandled rejections when script discovery fails (e.g. duplicate names)
- Add isScriptNode to isNonAiNode check in loader.ts so AI-specific fields
  on script nodes emit warnings (activates SCRIPT_NODE_AI_FIELDS)

Important fixes:
- Surface script stderr in user-facing error messages on non-zero exit
- Replace uvx with uv run --with for inline uv scripts with deps
- Add z.string().min(1) validation on deps array items
- Remove unused ScriptDefinition.content field and readFile I/O
- Add logging in discoverAvailableScripts catch block
- Warn when deps is specified with bun runtime (silently ignored)
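
After these fixes, the command line for a script node could be assembled roughly as in the sketch below. It follows the invocations named in this log (`bun -e`, `bun run`, `uv run python -c`, `uv run --with`); the `ScriptInvocation` type and `buildScriptCommand` name are hypothetical, and the real executeScriptNode may build its arguments differently.

```typescript
// Hypothetical input shape for illustration.
interface ScriptInvocation {
  runtime: "bun" | "uv";
  inline: boolean; // true: `script` is source code; false: `script` is a file path
  script: string;
  deps?: string[];
}

// Build an argv array for the subprocess.
function buildScriptCommand(inv: ScriptInvocation): string[] {
  if (inv.runtime === "bun") {
    // deps are ignored for bun, which installs imports at runtime
    return inv.inline ? ["bun", "-e", inv.script] : ["bun", "run", inv.script];
  }
  const withFlags: string[] = [];
  for (const dep of inv.deps ?? []) withFlags.push("--with", dep);
  return inv.inline
    ? ["uv", "run", ...withFlags, "python", "-c", inv.script]
    : ["uv", "run", ...withFlags, inv.script];
}

const argv = buildScriptCommand({
  runtime: "uv",
  inline: true,
  script: "print(1)",
  deps: ["requests"],
});
```

Returning an argv array rather than a single shell string avoids quoting problems when the inline script contains spaces or quotes.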

Simplifications:
- Merge BASH_DEFAULT_TIMEOUT and SCRIPT_DEFAULT_TIMEOUT into single
  SUBPROCESS_DEFAULT_TIMEOUT constant
- Use scriptDef.runtime instead of re-deriving from extname()
- Extract shared formatValidationResult helper, deduplicate section comments

Tests:
- Add isInlineScript unit tests to executor-shared.test.ts
- Add named-script-not-found executor test to dag-executor.test.ts
- Update deps tests to expect uv instead of uvx

Docs:
- Add script: node type to CLAUDE.md node types and directory structure
- Add script: to .claude/rules/workflows.md DAG Node Types section
2026-04-09 14:48:02 +03:00
Rasmus Widing
594d5daa2a
feat: use 1M context model for implement nodes in bundled workflows (#1018)
* feat: use 1M context model for implement nodes in bundled workflows (#1016)

Large codebases fill the 200k context window during implementation, triggering
SDK auto-compaction which loses context detail and slows execution. Setting
claude-opus-4-6[1m] on implementation nodes gives 5x more room before compaction.

Changes:
- Set model: claude-opus-4-6[1m] on implement nodes in 8 bundled workflows
- Fix loop nodes to respect per-node model overrides (previously only used
  workflow-level model)
- Review/classify/report nodes stay on sonnet/haiku for cost efficiency

Fixes #1016

* fix: resolve per-node provider/model overrides for loop nodes

Move provider and model resolution for loop nodes to the call site,
matching the pattern used by command/prompt/bash nodes. This fixes:

- Loop nodes now respect per-node `provider` overrides (previously ignored)
- Model/provider compatibility is validated before execution
- JSDoc on buildLoopNodeOptions accurately describes the function
2026-04-06 21:02:17 +03:00
Rasmus Widing
52b73b0ba4
Merge pull request #1013 from dynamous-community/archon/task-feat-sdk-result-capture
feat(workflows): capture SDK cost, usage, stop reason per node
2026-04-06 20:20:54 +03:00
Rasmus Widing
d227edbb08 Fix review findings: tests, JSDoc, and style consistency for cost tracking
- Add tests for result chunk cost/stopReason/numTurns/modelUsage fields in claude.test.ts
- Add tests for rate_limit_event chunk (with and without rate_limit_info)
- Add tests for total_cost_usd accumulation in completeWorkflowRun (single-node, multi-node, no-cost)
- Add test for loop node cost accumulation across iterations
- Fix stale JSDoc in executeNodeInternal and executeLoopNode (NodeOutput -> NodeExecutionResult)
- Change stopReason guard from truthiness to !== undefined for consistency
- Add comment explaining totalCostUsd > 0 guard intent
- Add comment noting resume runs only accumulate cost for the current invocation
- Add rate_limit fallthrough comment in executeNodeInternal and executeLoopNode
2026-04-06 20:03:47 +03:00
Rasmus Widing
bdb1be0a8d feat: capture SDK result data (cost, usage, stop reason) (#932)
Extract total_cost_usd, stop_reason, num_turns, and model_usage from
Claude SDK result messages. Accumulate per-node cost in dag-executor
(regular + loop nodes), aggregate to per-run total via completeWorkflowRun
metadata, and display cost in the Web UI WorkflowRunCard. Rate limit
events are logged as warnings and yielded as a new message chunk type.

Fixes #932
2026-04-06 19:39:18 +03:00
Rasmus Widing
6d606410bf simplify: drop redundant 'resolved' prefix on workflow option variables 2026-04-06 19:35:28 +03:00
Rasmus Widing
4cb4928794 simplify: reduce complexity in changed files 2026-04-06 19:02:26 +03:00
Rasmus Widing
c6bc695c35 fix: address review findings for Claude SDK node options PR
- Tighten workflow-level schema: add .min(1) to fallbackModel and .nonempty() to betas to match node-level validation
- Replace inline effort/thinking/sandbox types in deps.ts with EffortLevel/ThinkingConfig/SandboxSettings imported from ./schemas
- Extract WorkflowLevelOptions interface in dag-executor.ts to eliminate repeated inline type shape
- Add dag.node_budget_cap_exceeded structured log event before budget cap throw, replacing bare type assertion with nodeOptions?.maxBudgetUsd
- Update CLAUDE.md nodes: bullet to document new Claude-only options
- Add Claude SDK advanced options section and 7 table rows to authoring-workflows.md; update Summary list
- Add SDK option forwarding tests to claude.test.ts (effort, thinking, maxBudgetUsd, systemPrompt, fallbackModel, betas, sandbox)
- Add dag-executor tests: error_max_budget_usd failure path, per-node vs workflow-level precedence, Codex warning
2026-04-06 18:59:36 +03:00
Rasmus Widing
066289f8f0 feat: expose Claude SDK node options (effort, thinking, maxBudgetUsd, systemPrompt, fallbackModel, betas, sandbox) (#931)
Workflow authors can now configure 7 Claude SDK options per DAG node in YAML.
Five of these (effort, thinking, fallbackModel, betas, sandbox) are also
settable at workflow level as defaults that per-node values override.
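
A minimal sketch of the "per-node overrides workflow-level default" rule, shown for three of the five shared options. The `NodeOptions` shape and `resolveNodeOptions` name are hypothetical, and the option values are placeholders rather than documented settings.

```typescript
// Hypothetical subset of the shared options; real types live in the schemas.
interface NodeOptions {
  effort?: string;
  fallbackModel?: string;
  betas?: string[];
}

// A per-node value wins when it is set; otherwise the workflow default applies.
// `??` is used instead of object spread so that an explicit `undefined` on the
// node does not erase the workflow-level value.
function resolveNodeOptions(workflowDefaults: NodeOptions, nodeOverrides: NodeOptions): NodeOptions {
  return {
    effort: nodeOverrides.effort ?? workflowDefaults.effort,
    fallbackModel: nodeOverrides.fallbackModel ?? workflowDefaults.fallbackModel,
    betas: nodeOverrides.betas ?? workflowDefaults.betas,
  };
}

const resolved = resolveNodeOptions(
  { effort: "low", fallbackModel: "model-a" },
  { effort: "high", fallbackModel: undefined },
);
```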

Changes:
- Add effortLevelSchema, thinkingConfigSchema, sandboxSettingsSchema to dag-node.ts
- Add 7 optional fields to dagNodeBaseSchema with BASH_NODE_AI_FIELDS + transform
- Add 5 workflow-level defaults to workflowBaseSchema
- Extend WorkflowAssistantOptions (deps.ts) and AssistantRequestOptions (types/index.ts)
- Forward all 7 options in claude.ts with systemPrompt conditional override
- Add workflow-level options resolution in dag-executor.ts with per-node override
- Add error_max_budget_usd handling in result handler
- Consolidated Codex warning for all Claude-only options
- Add 16 schema parsing tests for new options

Fixes #931
2026-04-06 18:22:24 +03:00
Rasmus Widing
3b8857d977 feat: add $DOCS_DIR variable for configurable documentation path (#982)
Projects with docs outside `docs/` (e.g., `packages/docs-web/src/content/docs/`)
get broken bundled commands because the path is hardcoded. Add `docs.path` to
`.archon/config.yaml` and thread it through the workflow engine as `$DOCS_DIR`
(default: `docs/`), following the same pipeline as `$BASE_BRANCH`.

Changes:
- Add `docs.path` to RepoConfig and `docsPath` to MergedConfig/WorkflowConfig
- Thread `docsDir` through executor-shared, executor, and dag-executor
- Update bundled commands to use `$DOCS_DIR` instead of hardcoded `docs/`
- Add optional docs path prompt to `archon setup`
- Add variable reference and configuration documentation
- Resolve pre-existing merge conflicts in server/api.ts

Fixes #982
2026-04-06 16:26:59 +03:00
Rasmus Widing
fbfca227ba feat: per-project env var management via config and Web UI (#852)
Add per-project environment variable management as a first-class config
primitive. Env vars defined in .archon/config.yaml or stored in DB via
Web UI are merged into Options.env on Claude SDK calls.

Three env var sources merge in priority order (later wins):
1. process.env — global, from ~/.archon/.env via dotenv
2. .archon/config.yaml env: section — file-based per-project
3. DB remote_agent_codebase_env_vars table — Web UI per-project
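
The "later wins" ordering above amounts to a left-to-right merge. A minimal sketch, assuming plain string maps; `mergeEnvSources` and its parameter names are hypothetical, not the executor's actual code:

```typescript
type EnvMap = Record<string, string>;

// Hypothetical helper: sources are spread in priority order, so a key
// defined in a later source replaces the same key from an earlier one.
function mergeEnvSources(processEnv: EnvMap, configEnv: EnvMap, dbEnv: EnvMap): EnvMap {
  return { ...processEnv, ...configEnv, ...dbEnv };
}

const mergedEnv = mergeEnvSources(
  { API_URL: "https://global.example", LOG_LEVEL: "info" }, // 1. process.env
  { API_URL: "https://project.example" },                   // 2. config.yaml env:
  { LOG_LEVEL: "debug" },                                   // 3. DB (Web UI)
);
```

In this example `API_URL` resolves to the config.yaml value and `LOG_LEVEL` to the DB value.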

Changes:
- Add remote_agent_codebase_env_vars table (PG migration + SQLite schema)
- Add DB CRUD module (packages/core/src/db/env-vars.ts)
- Extend IWorkflowStore with getCodebaseEnvVars method
- Add env field to RepoConfig, MergedConfig, WorkflowConfig, WorkflowAssistantOptions, AssistantRequestOptions
- Merge DB env vars in executor after config load
- Inject env vars into Claude subprocess via Options.env
- Add 3 API routes (GET/PUT/DELETE /api/codebases/:id/env)
- Add EnvVarsPanel to Settings page with masked value display

Fixes #852
2026-04-06 15:30:45 +03:00
Rasmus Widing
4c3dc67bcb simplify: remove redundant isRejectionResume variable in executeApprovalNode 2026-04-02 12:15:10 +03:00
Rasmus Widing
e7efc4155e fix: address review findings for on_reject and capture_response
- emit workflow_cancelled event + SSE notification on max-attempts exhaustion
  in executeApprovalNode (dag-executor.ts)
- add missing Record<string, unknown> type annotation on metadataUpdate in
  orchestrator-agent.ts
- wrap CLI hasOnReject DB calls in try/catch matching the else-branch pattern
- add tests for $REJECTION_REASON substitution in executor-shared.test.ts
- add tests for command-handler reject on_reject branch and captureResponse
  approve behavior
- add tests for API approve/reject endpoints (on_reject routing, max attempts,
  captureResponse, 404/400 error cases)
- add tests for workflowRejectCommand (on_reject, working_path guard, max
  attempts, plain cancel)
- add approvalOnRejectSchema validation tests (empty prompt, out-of-range
  max_attempts)
- update docs/approval-nodes.md: capture_response opt-in behavior, new fields
  table, conditional rejection behavior, on_reject lifecycle section
- update docs/authoring-workflows.md: add $REJECTION_REASON and $LOOP_USER_INPUT
  to variable table, update Human-in-the-Loop pattern
- add $REJECTION_REASON to CLAUDE.md variable substitution list
2026-04-02 12:15:10 +03:00
Rasmus Widing
70c9671175 feat(workflows): add capture_response and on_reject retry to approval nodes (#936)
Approval nodes can now capture the reviewer's comment as $nodeId.output
(via capture_response: true) and optionally retry on rejection instead of
cancelling (via on_reject: { prompt, max_attempts }). This enables
iterative human-AI review cycles without needing interactive loop nodes.

Changes:
- Add approvalOnRejectSchema and extend approval node schema with
  capture_response and on_reject fields
- Extend ApprovalContext with captureResponse, onRejectPrompt,
  onRejectMaxAttempts (stored at pause time for reject handlers)
- Add $REJECTION_REASON variable to substituteWorkflowVariables
- Extract executeApprovalNode function with rejection resume logic
- Update all 4 approve handlers (CLI, command-handler, orchestrator,
  server API) to use captureResponse and clear rejection state
- Update all 3 reject handlers (CLI, command-handler, server API) to
  check onRejectPrompt and retry instead of cancel when configured
- Add 5 tests for approval node behavior (fresh pause, capture_response,
  on_reject resume, max_attempts exhaustion, max_attempts=1)

Fixes #936
2026-04-02 12:15:09 +03:00
Rasmus Widing
0d34cfda7d
Merge pull request #943 from dynamous-community/archon/task-feat-issue-796
feat(workflows): extend when: conditions with numeric comparisons and AND/OR
2026-04-02 10:49:44 +03:00
Rasmus Widing
cbfb15c9f6 fix(workflows): remove over-broad SDK error gate for credit exhaustion detection
The `startsWith('error_')` check would misclassify any SDK error whose
subtype begins with `error_` (e.g. `error_max_turns`, `error_max_tokens`)
as credit exhaustion, giving users misleading "wait for credits" guidance.

The SDK returns credit exhaustion as assistant text anyway, so
`detectCreditExhaustion(nodeOutputText)` is the correct and complete
detection path. Remove the speculative SDK flag path and its associated
local variables.

Also adds an independent test for the 'insufficient credit' pattern in
CREDIT_EXHAUSTION_OUTPUT_PATTERNS, which was previously only incidentally
covered by the 'credit balance' test string.
2026-04-02 10:32:34 +03:00
Rasmus Widing
de3453b7e3 feat(workflows): extend when: conditions with numeric comparisons and AND/OR (#796)
The when: condition evaluator only supported == and != string equality,
forcing workflow authors to coerce numeric values to sentinel strings
and use extra gate nodes for multi-condition branching.

Changes:
- Add <, >, <=, >= numeric comparison operators (fail-closed on non-finite values)
- Add && (AND) and || (OR) compound expression support with standard precedence
- Extract evaluateAtom() and splitOutsideQuotes() helpers in condition-evaluator.ts
- Add 16 new tests covering numeric ops, compound logic, precedence, and fail-closed
- Update dag-executor parse-error message to mention new operators
- Update docs/authoring-workflows.md with full operator reference
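
To illustrate the precedence and fail-closed rules, here is a small evaluator sketch. The helper names `splitOutsideQuotes` and `evaluateAtom` are borrowed from the list above, but the bodies are a guess, not the condition-evaluator.ts implementation. It assumes variables are already substituted, strings are single-quoted, and quoted strings contain no comparison operators.

```typescript
// Split on `sep`, ignoring separators that appear inside single quotes.
function splitOutsideQuotes(input: string, sep: string): string[] {
  const parts: string[] = [];
  let current = "";
  let inQuote = false;
  let i = 0;
  while (i < input.length) {
    const ch = input.charAt(i);
    if (ch === "'") inQuote = !inQuote;
    if (!inQuote && input.startsWith(sep, i)) {
      parts.push(current);
      current = "";
      i += sep.length;
    } else {
      current += ch;
      i += 1;
    }
  }
  parts.push(current);
  return parts;
}

const unquote = (s: string): string => s.trim().replace(/^'(.*)'$/, "$1");

// Empty strings become NaN so that numeric comparisons fail closed.
function toNumber(raw: string): number {
  const s = unquote(raw);
  return s === "" ? NaN : Number(s);
}

// Evaluate a single `lhs op rhs` comparison.
function evaluateAtom(atom: string): boolean {
  const m = /^(.*?)(==|!=|<=|>=|<|>)(.*)$/.exec(atom.trim());
  if (m === null) return false; // unparseable input fails closed
  const lhs = m[1];
  const op = m[2];
  const rhs = m[3];
  if (op === "==") return unquote(lhs) === unquote(rhs);
  if (op === "!=") return unquote(lhs) !== unquote(rhs);
  const a = toNumber(lhs);
  const b = toNumber(rhs);
  if (!Number.isFinite(a) || !Number.isFinite(b)) return false;
  switch (op) {
    case "<": return a < b;
    case ">": return a > b;
    case "<=": return a <= b;
    default: return a >= b;
  }
}

// `||` binds loosest, `&&` tighter: split on OR first, then AND within each part.
function evaluateCondition(expr: string): boolean {
  return splitOutsideQuotes(expr, "||").some((orPart) =>
    splitOutsideQuotes(orPart, "&&").every((atom) => evaluateAtom(atom)),
  );
}
```

Splitting on `||` before `&&` is what gives AND the higher precedence: `1 < 2 || 1 > 2 && 3 > 4` is read as `1 < 2 || (1 > 2 && 3 > 4)`.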

Fixes #796
2026-04-02 10:28:14 +03:00
Rasmus Widing
0f894d07b4 fix(workflows): detect credit exhaustion in node output and fail instead of completing (#940)
When Claude credits run out mid-workflow, the SDK returns the error as
normal assistant text rather than throwing. The DAG executor was marking
these nodes as completed with garbage output, preventing resume from
re-running them.

Changes:
- Add isError/errorSubtype to MessageChunk and WorkflowMessageChunk result types
- Propagate is_error/subtype from Claude SDK result message
- Add detectCreditExhaustion() to executor-shared for text pattern matching
- Gate node_completed in dag-executor with credit exhaustion check
- Mark credit-exhausted nodes as failed so resume re-runs them

Fixes #940
2026-04-02 10:18:30 +03:00
Cole Medin
e011b82612 fix(workflows): let AI decide interactive loop exit instead of hard-coded phrases
Removes the brittle approvalPhrases word matching (commit 555d5f4) and
re-enables AI completion signal detection for interactive loops. The AI
now determines when the user approves based on prompt instructions, not
a fixed list of 19 phrases.

Key changes:
- Remove approvalPhrases block from dag-executor.ts
- Re-enable completion signal detection (revert ac2cfc07 bypass)
- Move completion check before interactive gate (signal = exit, no signal = pause)
- Add first-iteration guard: fresh interactive loops always gate before exit
- Fix node_completed timing: only written on actual loop completion, not on every approve
- Strengthen PIV loop prompts with explicit signal emission rules
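
One possible reading of the resulting exit rules, written as a pure decision function. The names and the three-way result are hypothetical, and max-iteration handling is left out.

```typescript
type LoopAction = "exit" | "pause" | "continue";

// Hypothetical per-iteration state, for illustration only.
interface LoopIterationState {
  interactive: boolean;         // loop node has interactive: true
  completionSignal: boolean;    // AI emitted the completion signal this iteration
  firstFreshIteration: boolean; // first iteration of a non-resumed run
}

function nextLoopAction(state: LoopIterationState): LoopAction {
  if (!state.interactive) {
    // Non-interactive loops: exit on signal, otherwise keep iterating.
    return state.completionSignal ? "exit" : "continue";
  }
  // Interactive loops: the completion check runs before the gate, but a fresh
  // loop always gates at least once so the user sees the first result.
  if (state.completionSignal && !state.firstFreshIteration) return "exit";
  return "pause";
}
```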

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 16:24:13 -05:00
Cole Medin
ac2cfc07a5 fix(workflows): ignore completion signal in interactive loops entirely
For interactive loops, the AI's completion signal is now fully ignored.
Exit is controlled exclusively by the user's input — if they send a
short approval phrase, the loop exits before running the AI. If they
send substantive feedback, the AI runs and always pauses after.

This prevents the AI from auto-approving by addressing feedback AND
emitting the signal in the same iteration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 14:38:24 -05:00
Cole Medin
555d5f4945 fix(workflows): user-driven exit for interactive loops
Interactive loops now check the user's input BEFORE running the AI
iteration. If the user's response is a short approval phrase ("approved",
"looks good", "lgtm", etc.), the loop exits immediately without running
another AI iteration. If it's substantive feedback, the AI iteration
runs and always pauses after.

This fixes the core issue where the AI would address user feedback AND
emit the completion signal in the same iteration, causing the loop to
exit before the user could review the AI's changes. The completion
signal from the AI is now irrelevant for interactive loops — only the
user's explicit approval controls exit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 14:22:48 -05:00
Cole Medin
37846aa9fe fix(workflows): interactive loops always pause for user review
Previously, interactive loops would exit immediately when the AI emitted
the completion signal, bypassing user review. This caused two problems:
- Plan refinement loops auto-approved without showing the revised plan
- Validation feedback loops auto-approved on first iteration (empty input)

Now interactive loops ALWAYS pause after every iteration, regardless of
whether the AI signaled completion. The user decides when to exit by
explicitly approving. The completion signal becomes the AI's suggestion
("I think this is ready") but the user still gets the final say.

Non-interactive loops are unchanged — they still exit on completion signal.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 12:59:34 -05:00
Cole Medin
51a2702699 fix: address PR #938 review findings — interactive loop correctness and coverage
Fixed:
- H1: Return 'completed' (not 'failed') from interactive loop gate to prevent
  false "Some DAG nodes failed" warnings in multi-node workflows
- H2: Check safeSendMessage return value before pausing — fail the node with a
  clear error if the gate message failed to deliver, preventing orphaned paused runs
- H3: Extend isApprovalTransition guard in updateWorkflowRun to cover loop_user_input
  metadata key, preventing completed_at from being stamped on resumable loop runs
- M1: Add isApprovalContext() type guard in workflow-run.ts; replace unsafe casts
  in dag-executor.ts and command-handler.ts
- M4/L3: Update comments to accurately reflect completed-return semantics and
  metadata merge requirement
- L1: Pass '' instead of undefined for $LOOP_USER_INPUT on iterations after first
- L4: Update $LOOP_USER_INPUT docstring to clarify first-iteration-only scoping
- Gap6: Add archon-piv-loop to bundled-defaults.ts so it's available in binary builds

Tests added:
- H4: /workflow approve interactive_loop branch tests in command-handler.test.ts
  (routing, approval_received event, no node_completed, error cases)
- M2: superRefine validation tests in loader.test.ts (reject interactive without
  gate_message; accept valid interactive loop)
- M3: loader warning test for interactive loop in non-interactive workflow

Docs updated:
- H5: docs/loop-nodes.md — add interactive/gate_message fields, $LOOP_USER_INPUT
  variable, and interactive loop pattern section
- M5: docs/authoring-workflows.md — note interactive loops require workflow-level
  interactive: true
- L5: README.md — add archon-piv-loop row, update count 16→17
- L6: docs/authoring-workflows.md — update count 16→17
- L7: CLAUDE.md — add $LOOP_USER_INPUT to variable substitution table

Bundled defaults test updated: count 10→11 to reflect archon-piv-loop addition

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 11:17:56 -05:00
Cole Medin
8b4cfb4326 feat(workflows): add interactive loop node — pause for user input between iterations (#937)
Loop nodes can now pause between iterations for user feedback when
`interactive: true` and `gate_message` are set. On pause, the user
provides feedback via `/workflow approve <id> <feedback>`, which is
injected as `$LOOP_USER_INPUT` in the next iteration's prompt. Session
context is preserved across pause/resume when `fresh_context: false`.

Changes:
- Extended `ApprovalContext` with type/iteration/sessionId fields
- Added `interactive` and `gate_message` to loop node schema with cross-field validation
- Updated `pauseWorkflowRun` signature to use typed `ApprovalContext`
- Added `$LOOP_USER_INPUT` variable substitution in executor-shared
- Implemented resume detection and interactive pause in executeLoopNode
- Updated approve handler to branch on interactive_loop gate type
- Added loader warning for interactive loops in non-interactive workflows
- Added 4 test cases covering pause, signal exit, resume, and regression

Fixes #937

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-01 10:33:34 -05:00
Rasmus Widing
6f29e72eca
Merge pull request #927 from dynamous-community/archon/task-fix-issue-564
feat(workflows): store node outcome counts in workflow run metadata
2026-04-01 15:23:45 +03:00
Rasmus Widing
a98e71cb94
Merge pull request #923 from dynamous-community/archon/task-feat-cancel-node-type
feat(workflows): add cancel: node type for DAG-initiated run cancellation
2026-04-01 15:16:23 +03:00
Rasmus Widing
76aa8f833c fix: address review findings — single-pass iteration + test assertion
- Replace 5-pass nodeOutputs iteration with single for-loop
- Derive anyCompleted/anyFailed from counts instead of separate .some() calls
- Add metadata payload assertion to loop completion test
- Add comment on failure path documenting missing node_counts
2026-04-01 14:36:12 +03:00
Rasmus Widing
9ff684c96f feat: store node outcome counts in workflow run metadata (#564)
When a DAG workflow completes, compute node outcome counts (completed,
failed, skipped, total) from nodeOutputs and persist them into the
metadata JSONB column. The dashboard card now shows a compact summary
like "7/10 nodes succeeded · 2 failed · 1 skipped".
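
The counting and the summary string could be sketched as follows. The type and function names are hypothetical; only the four counters and the example summary format come from the description above.

```typescript
type NodeState = "completed" | "failed" | "skipped";

interface NodeCounts {
  completed: number;
  failed: number;
  skipped: number;
  total: number;
}

// Single pass over the node outputs, incrementing one counter per node.
function countNodeOutcomes(nodeOutputs: Map<string, { state: NodeState }>): NodeCounts {
  const counts: NodeCounts = { completed: 0, failed: 0, skipped: 0, total: 0 };
  nodeOutputs.forEach((output) => {
    counts[output.state] += 1;
    counts.total += 1;
  });
  return counts;
}

// Compact summary; zero-valued failed/skipped segments are omitted.
function formatNodeCounts(c: NodeCounts): string {
  const parts = [`${c.completed}/${c.total} nodes succeeded`];
  if (c.failed > 0) parts.push(`${c.failed} failed`);
  if (c.skipped > 0) parts.push(`${c.skipped} skipped`);
  return parts.join(" · ");
}

// Build the 7 completed / 2 failed / 1 skipped case from the example above.
const sampleOutputs = new Map<string, { state: NodeState }>();
for (let i = 0; i < 7; i++) sampleOutputs.set(`ok-${i}`, { state: "completed" });
for (let i = 0; i < 2; i++) sampleOutputs.set(`bad-${i}`, { state: "failed" });
sampleOutputs.set("skip-0", { state: "skipped" });

const summary = formatNodeCounts(countNodeOutcomes(sampleOutputs));
```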

Changes:
- Extend completeWorkflowRun to accept optional metadata (store + db)
- Compute node counts in dag-executor at completion time
- Add NodeCountsSummary component to WorkflowRunCard
- Add tests for metadata merge and node counts propagation

Fixes #564
2026-04-01 14:23:39 +03:00
Rasmus Widing
e14d7aea58 style: rename shadowed variable to avoid ambiguity in streaming loop
Rename `now` to `tickNow` at the top of the streaming loop to avoid
shadowing the pre-existing `const now = Date.now()` in the tool event
block further down. No behavioral change.
2026-04-01 14:12:21 +03:00
Rasmus Widing
963201ccd7 perf(db): reduce SQLite write contention for parallel workflows (#419)
Split the 10s activity interval into separate cancel-check (read, 10s) and
heartbeat (write, 60s) timers. Reads don't contend under WAL mode, so cancel
responsiveness is unchanged while heartbeat writes drop by 6x.
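
The split can be pictured as two independent deadlines checked on each streaming tick. A sketch under assumed names (`dueActivities`, `ActivityTimers`, and the constants are hypothetical; only the 10s and 60s intervals come from the message above):

```typescript
const CANCEL_CHECK_INTERVAL_MS = 10_000; // read-only status lookup
const HEARTBEAT_INTERVAL_MS = 60_000;    // DB write

interface ActivityTimers {
  lastCancelCheck: number;
  lastHeartbeat: number;
}

// Decide which activities are due at `now`, and advance the timers that fired.
function dueActivities(
  timers: ActivityTimers,
  now: number,
): { checkCancel: boolean; writeHeartbeat: boolean } {
  const checkCancel = now - timers.lastCancelCheck >= CANCEL_CHECK_INTERVAL_MS;
  const writeHeartbeat = now - timers.lastHeartbeat >= HEARTBEAT_INTERVAL_MS;
  if (checkCancel) timers.lastCancelCheck = now;
  if (writeHeartbeat) timers.lastHeartbeat = now;
  return { checkCancel, writeHeartbeat };
}

// Simulate two minutes of one-second ticks and count each activity.
const timers: ActivityTimers = { lastCancelCheck: 0, lastHeartbeat: 0 };
let cancelChecks = 0;
let heartbeats = 0;
for (let second = 1; second <= 120; second++) {
  const due = dueActivities(timers, second * 1000);
  if (due.checkCancel) cancelChecks += 1;
  if (due.writeHeartbeat) heartbeats += 1;
}
```

Over the simulated 120 seconds this yields 12 cancel checks and 2 heartbeat writes, matching the 6x reduction in writes.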

Fixes #419
2026-04-01 12:14:45 +03:00
Rasmus Widing
f366741c63 feat(workflows): add cancel: node type for DAG-initiated run cancellation (#220)
Workflow authors can now use `cancel: "reason string"` as a DAG node type
to terminate a running workflow from inside the DAG based on upstream
conditions, preventing wasted compute on branches that should be killed.

Changes:
- Add cancelNodeSchema, CancelNode type, isCancelNode guard to dag-node.ts
- Add cancel to mutual exclusivity checks (command/prompt/bash/loop/approval/cancel)
- Add cancel node dispatch in dag-executor (mirrors approval pattern)
- Add cancelWorkflowRun to IWorkflowStore interface and store-adapter
- Add workflow_cancelled event type and WorkflowCancelledEvent emitter event
- Add cancel to non-AI node detection in loader (AI field warnings)
- Handle workflow_cancelled in server SSE bridge (exhaustive switch)
- Add tests for schema, loader, store-adapter, and executor

Fixes #220
2026-04-01 12:02:57 +03:00
Cole Medin
9c686902ab fix(workflows): restore post-workflow summary in parent conversation (#907)
* fix(workflows): restore post-workflow summary in parent conversation (#900)

After PR #805 removed sequential execution and made DAG the sole format,
the terminal node output was never returned from executeDagWorkflow (void
return), so executor.ts always returned {success:true} with no summary
field, and orchestrator.ts silently skipped the summary send.

Changes:
- dag-executor.ts: Change executeDagWorkflow return type void → string|undefined
- dag-executor.ts: Compute terminal nodes (no dependents) and return first
  completed non-empty output after emitter.unregisterRun
- executor.ts: Capture dagSummary from executeDagWorkflow and pass as
  summary field in the completed WorkflowExecutionResult

Fixes #900
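
The terminal-node selection described in the changes above could be sketched like this. It is a hedged illustration only: the `DagNode` and `NodeResult` shapes, the `depends_on` field name, the `selectSummary` helper, and the whitespace-based notion of "non-empty" are assumptions, not the contents of dag-executor.ts.

```typescript
// Hypothetical shapes for illustration.
interface DagNode {
  id: string;
  depends_on?: string[];
}

interface NodeResult {
  state: "completed" | "failed" | "skipped";
  output: string;
}

// Return the output of the first terminal node that completed with non-empty output.
function selectSummary(
  nodes: DagNode[],
  results: Record<string, NodeResult>
): string | undefined {
  // A node is terminal when no other node lists it as a dependency.
  const hasDependents: Record<string, boolean> = {};
  for (const node of nodes) {
    for (const dep of node.depends_on ?? []) hasDependents[dep] = true;
  }
  for (const node of nodes) {
    if (hasDependents[node.id] === true) continue;
    const result = results[node.id];
    if (result && result.state === "completed" && result.output.trim() !== "") {
      return result.output;
    }
  }
  return undefined;
}
```

With first-match semantics, a DAG with several terminal nodes reports whichever qualifying one appears first in declaration order.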

* test(isolation): update syncWorkspace spy assertions for resetAfterFetch arg

Five WorktreeProvider tests were failing because syncWorkspace gained a
third `{ resetAfterFetch: boolean }` argument; test assertions expected
only two arguments. Updated assertions to include `{ resetAfterFetch: false }`
(test paths are not managed clones under ~/.archon/workspaces).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review findings from PR #907

- Add doc comment on terminal node selection explaining first-match semantics
  for multi-terminal DAGs (code-review LOW finding)
- Fix mock type signature for mockExecuteDagWorkflow from Promise<void> to
  Promise<string | undefined> to match updated return type (test-coverage LOW)
- Add result.summary wiring tests in executor.test.ts: verifies that when
  executeDagWorkflow returns a string it flows through to WorkflowExecutionResult.summary,
  and that undefined is passed through correctly (test-coverage MEDIUM)
- Add terminal node selection return-value tests in dag-executor.test.ts: linear DAG
  returns terminal output, empty terminal returns undefined, fan-in DAG returns only
  the true terminal node (test-coverage MEDIUM + rating-5 gap)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-30 17:49:40 -07:00
Cole Medin
59ba1d2954 feat: configurable Claude Code settingSources via repo config (#879)
* feat: configurable Claude Code settingSources via repo config (#839)

Users who maintain a global ~/.claude/CLAUDE.md lose that context in Archon
sessions because settingSources was hardcoded to ['project']. This adds a
settingSources field to .archon/config.yaml so users can opt in to
['project', 'user'] per-repo or globally.

Changes:
- Add ClaudeAssistantDefaults interface with settingSources field
- Merge settingSources through global and repo config cascade
- Pass settingSources from config through orchestrator to Claude SDK
- Default remains ['project'] (no behavior change for existing users)
- Add tests for settingSources passthrough and default fallback

Fixes #839
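
A minimal sketch of the cascade, assuming the repo-level value overrides the global one and both fall back to the default. `ClaudeAssistantDefaults` and `settingSources` are named in the message above; the `SettingSource` type, `resolveSettingSources`, and the exact precedence are assumptions.

```typescript
// Hypothetical type; only 'project' and 'user' are mentioned in the commit message.
type SettingSource = "project" | "user";

interface ClaudeAssistantDefaults {
  model?: string;
  settingSources?: SettingSource[];
}

// Default stays ['project'], so existing users see no behavior change.
const DEFAULT_SETTING_SOURCES: SettingSource[] = ["project"];

// Assumed precedence: repo config, then global config, then the default.
function resolveSettingSources(
  globalCfg?: ClaudeAssistantDefaults,
  repoCfg?: ClaudeAssistantDefaults
): SettingSource[] {
  return repoCfg?.settingSources ?? globalCfg?.settingSources ?? DEFAULT_SETTING_SOURCES;
}
```

Under this sketch, a user who wants their global CLAUDE.md context would set `settingSources` to `['project', 'user']` at either level.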

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address review findings for settingSources feature

Fixed:
- Guard settingSources with assistant type check (claude only)
- Add config-loader tests for settingSources merge logic (4 tests)
- Add orchestrator tests for settingSources forwarding (2 tests)
- Document settingSources in CLAUDE.md assistant defaults example
- Document settingSources in docs/configuration.md (global, repo, subsection)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: propagate settingSources through workflow executor

Add settingSources to WorkflowAssistantOptions and WorkflowConfig.assistants.claude
in deps.ts so the field is part of the narrow workflow interface. Propagate it in
dag-executor.ts for both regular Claude DAG nodes and loop nodes, so .archon/config.yaml
settingSources setting takes effect when running workflows (not just chat sessions).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate duplicate loadConfig and revert SafeConfig settingSources leak

Load config once during discoverAllWorkflows and reuse the result in handleMessage
instead of calling loadConfig a second time for the same codebase path. Also restrict
SafeConfig.assistants.claude to Pick<ClaudeAssistantDefaults, 'model'> so server-internal
settingSources is never sent to web clients (YAGNI).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 15:25:41 -07:00
Rasmus Widing
d2bfd7a9bd feat: add approval gate node type for human-in-the-loop workflows (#888)
* feat: add approval gate node type for human-in-the-loop workflows

Add `approval` as a fifth DAG node type that pauses workflow execution
until a human approves or rejects. Built on existing resume infrastructure.

- New `approval` node type in YAML workflows with `message` field
- `paused` workflow status (non-terminal, resumable)
- Approve/reject via REST API, CLI, chat commands, and Web UI
- Approval comment available as `$node.output` in downstream nodes
- Dashboard shows amber pulsing badge with approval message for paused runs
- Path guard blocks new workflows on paused worktrees

* fix: resolve 7 bugs in approval gate implementation

Critical:
- Parse metadata JSON on SQLite (returned as TEXT string, not object)
- Suppress spurious "Workflow stopped (paused)" message after approval gates

Medium:
- CLI exits 0 on pause (not error code 1)
- workflow status shows paused runs alongside running
- Docs clarify auto-resume is CLI-only; API/chat mark resumable

Low:
- Skip completed_at when transitioning to failed for approval resume
- Warn on AI fields (model, hooks, etc.) set on approval nodes
- Add rowCount check to updateWorkflowRun

Also: extract ApprovalContext type, add .min(1) to approval.message,
fix stale "Only failed runs" error messages, fix log domain prefix,
add recovery instructions to CLI approve error message.
2026-03-30 17:27:11 +03:00
Rasmus Widing
fea6812fde fix(workflows): idle timeout too aggressive on DAG nodes (#854) (#886)
* fix(workflows): idle timeout too aggressive — break after result, reset on all messages (#854)

The idle timeout (5 min default) caused two problems: (1) after a node's AI
finished (result message), the loop waited for the subprocess to exit, wasting
5 min on hangs; (2) tool messages didn't reset the timer, so long Bash calls
(tests, builds) triggered false timeouts on actively working nodes.

Changes:
- Break out of the for-await loop immediately after receiving the result message
  in both command/prompt and loop node paths — no more post-completion waste
- Remove shouldResetTimer predicate so all message types (including tool) reset
  the timer — timeout only fires on complete silence
- Increase STEP_IDLE_TIMEOUT_MS from 5 min to 30 min — with every message
  resetting the timer, this is a deadlock detector, not a work limiter

Fixes #854

* fix(workflows): update withIdleTimeout JSDoc to match new timer behavior

Remove the tool-exclusion example from the shouldResetTimer docs since
that pattern was just removed from all call sites. Clarify that most
callers should omit the parameter.

* fix(workflows): address review findings — log cleanup errors, add break tests, fix stale docs

- Log generator cleanup errors in withIdleTimeout instead of silently swallowing
- Add behavioral tests for break-after-result in both command/prompt and loop nodes
- Fix stale "5 minutes" default in docs/loop-nodes.md (now 30 minutes)
- Clarify shouldResetTimer test names and comments (utility API, not executor behavior)
- Extract effectiveIdleTimeout in loop node path (matches command/prompt pattern)
- Remove redundant iterResult alias in withIdleTimeout
2026-03-30 14:49:14 +03:00
Rasmus Widing
71f8591e90 feat: workflow lifecycle overhaul — path-based guards, interrupted status, resume/abandon (#871)
* feat: add interrupted to WorkflowRunStatus schema

Implements US-001 from PRD.

Changes:
- Add 'interrupted' to workflowRunStatusSchema z.enum in packages/workflows/src/schemas/workflow-run.ts
- Add 'interrupted' to workflowRunStatusSchema in packages/server/src/routes/schemas/workflow.schemas.ts
- Add interrupted: z.number() to dashboardRunsResponseSchema counts object
- Add 'interrupted' to dashboardValidStatuses in API handler
- Add interrupted: 0 to DashboardRunsResult counts interface and runtime object in packages/core/src/db/workflows.ts

* feat: update IWorkflowStore interface & DB query implementations

Implements US-002 from PRD.

Changes:
- IWorkflowStore: rename getActiveWorkflowRun → getActiveWorkflowRunByPath(workingPath)
- IWorkflowStore: drop conversationId from findResumableRun signature
- IWorkflowStore: add interruptOrphanedRuns() method
- db/workflows: add getActiveWorkflowRunByPath querying status IN ('running', 'interrupted')
- db/workflows: update findResumableRun to query by workflow_name + working_path only, include 'interrupted' status
- db/workflows: add interruptOrphanedRuns() UPDATE SET status='interrupted' WHERE status='running'
- store-adapter: wire all three new/modified methods
- executor: update call sites to use renamed methods (type-check requirement)
- tests: update all mock stores and add new tests for getActiveWorkflowRunByPath and interruptOrphanedRuns

* feat: replace staleness guard with path-based lifecycle

Implements US-003 from PRD.

Changes:
- executor.ts: remove STALE_MINUTES staleness auto-kill; replace with
  status-based guard — 'running' blocks, 'interrupted' offers resume/abandon
- server/src/index.ts: replace failStaleWorkflowRuns() with
  createWorkflowStore().interruptOrphanedRuns() on startup
- executor-preamble.test.ts: replace staleness detection tests with
  concurrent run guard tests covering 'running' and 'interrupted' cases

* feat: command handler — /workflow status, resume, and abandon

Implements US-004. Replaces time-based stale heuristics with explicit
lifecycle commands for workflow management.

Changes:
- Remove WORKFLOW_SLOW_THRESHOLD_MS and WORKFLOW_STALE_THRESHOLD_MS constants
- Replace /workflow status with global view: lists all running+interrupted runs
  across all worktrees (ID, name, working path, status, started-at)
- Add /workflow resume <id>: validates state then calls resumeWorkflowRun
- Add /workflow abandon <id>: validates state then calls failWorkflowRun
- Add statuses[] filter to listWorkflowRuns for IN (...) queries
- Update /workflow help text and default case usage string
- Update /status command to remove stale warning that referenced removed constants
- Replace old /workflow status tests with new behavior coverage
- Add /workflow resume and /workflow abandon test coverage

* feat: CLI workflow status, resume, and abandon subcommands

Implements US-005 from PRD.

Changes:
- Implement workflowStatusCommand: lists all running+interrupted runs with ID, name, path, status, age; supports --json flag
- Add workflowResumeCommand: validates run state then calls resumeWorkflowRun
- Add workflowAbandonCommand: validates run state then calls failWorkflowRun('Abandoned by user')
- Replace findLastFailedRun usage in --resume path with findResumableRun(workflowName, cwd)
- Wire resume/abandon subcommands in cli.ts
- Update tests: replace "not implemented" test with status/resume/abandon coverage

* feat: Web UI interrupted status badge and dashboard support

Implements US-006 from PRD.

Changes:
- api.generated.d.ts: add 'interrupted' to WorkflowRunStatus enum and DashboardRunsResponse.counts
- api.ts: add interrupted field to DashboardCounts interface
- WorkflowExecution.tsx: add 'interrupted' to TERMINAL_STATUSES; add amber color to StatusBadge
- WorkflowRunCard.tsx: add amber dot and badge for interrupted status
- StatusSummaryBar.tsx: add 'interrupted' to STATUS_CHIPS filter list
- DashboardPage.tsx: include interrupted in activeRuns filter and counts default

* refactor: remove dead timer-based workflow staleness code

Implements US-007 from PRD.

Changes:
- Remove findLastFailedRun() from db/workflows.ts (CLI path unified on findResumableRun in US-005)
- Remove failStaleWorkflowRuns() from db/workflows.ts (replaced by interruptOrphanedRuns in US-002)
- Remove IDatabase import from db/workflows.ts (no longer needed)
- Remove failStaleWorkflowRuns tests from db/workflows.test.ts

grep -r 'STALE' packages/ (workflow-timer variant), grep -r 'findLastFailedRun' and
grep -r 'failStaleWorkflowRuns' all return zero matches.

* fix: address review feedback — truncated IDs, resume semantics, type safety

- Use full UUIDs in resume/abandon command suggestions (was .slice(0, 8))
- Add completed_at to interruptOrphanedRuns for correct duration metrics
- Fix resume commands: mark interrupted→failed to unblock path guard,
  let next workflow invocation auto-resume via findResumableRun
- Collapse dual status/statuses fields into status?: T | T[]
- Extract TERMINAL/RESUMABLE/ACTIVE_WORKFLOW_STATUSES constants
- Add explicit else-if for interrupted status in executor path guard
- Add structured logging to CLI workflow commands
- Restore conversationId to cmd.workflow_status_failed log
- Add tests: listWorkflowRuns statuses filter, interrupted auto-resume,
  DB error handling for resume/abandon
- Update docs: commands-reference, cli-user-guide, authoring-workflows, CLAUDE.md

* refactor: simplify CLI commands and status filter logic

- Extract getRunOrThrow helper for shared run lookup pattern
- Use WorkflowRun[] instead of Awaited<ReturnType<...>>
- Remove single-item special case in listWorkflowRuns (IN works for all)
- Use ?? instead of || for null-coalescing consistency
- Remove unused ACTIVE_WORKFLOW_STATUSES constant
- Add inline comment on completed_at for interrupted runs

* fix: handle SQLite string dates in formatAge

SQLite returns started_at as a string, not a Date object.
formatAge now accepts Date | string and converts accordingly.
Found during E2E testing against real SQLite database.

* feat: UX improvements — real resume, dashboard actions, cleanup command

- CLI resume now actually re-executes the workflow (calls workflowRunCommand
  with --resume internally instead of just flipping DB status)
- Remove truncated IDs from executor guard messages (full ID in commands only)
- Add Resume/Abandon/Delete buttons to dashboard workflow run cards
- Add Delete button to history table rows
- Add API endpoints: POST resume, POST abandon, DELETE workflow run
- Add CLI workflow cleanup command (deletes terminal runs older than N days)
- Add deleteWorkflowRun and deleteOldWorkflowRuns DB functions

* refactor: simplify API handlers, dashboard actions, and log conventions

- Use RESUMABLE/TERMINAL_WORKFLOW_STATUSES constants in API handlers
  (was inline string checks diverging from CLI/command-handler)
- Extract makeRunAction helper in DashboardPage (4 identical handlers → 1)
- Fix log event names to use domain prefix convention (api.workflow_run_*)
- Use Ban icon for Abandon to distinguish from Cancel's XCircle
- Use instanceof Date guard in formatAge for clarity
- Add comment on delete handler's active-status guard

* refactor: simplify workflow lifecycle — remove interrupted, single resume path

Rework the primitives for a clean foundation:

Status model: 5 statuses (pending, running, completed, failed, cancelled).
Remove 'interrupted' entirely — server restart now marks orphaned runs
as 'failed' directly (with metadata.failure_reason = 'server_restart').

Resume model: one path. The executor's implicit findResumableRun
detects prior failed runs and skips completed nodes. The CLI --resume
flag reuses the prior run's worktree but lets the executor handle
node-skipping (no more preCreatedRun bypass). Chat /workflow resume
tells the user to re-invoke (auto-resume kicks in).

Path guard: only blocks 'running' status (was running + interrupted).
Guards always run regardless of preCreatedRun.

Cancellation: generalized from status === 'cancelled' to
status !== 'running' at all 3 check points (streaming, loop iterations,
DAG layers). Ready for future 'paused' status with zero changes.

- interruptOrphanedRuns → failOrphanedRuns
- Remove preCreatedRun bypass from CLI --resume path
- Generalize 3 cancellation check points in dag-executor
- Update all API endpoints, command handlers, UI components
- Update all tests and documentation
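
The generalized cancellation check can be illustrated with a small predicate. This is a sketch under stated assumptions: the status union mirrors the five statuses listed above, while `wasCancelled`, `shouldStop`, and the handling of a `null` status (row not found) are hypothetical.

```typescript
// The five statuses named in this commit.
type WorkflowRunStatus = "pending" | "running" | "completed" | "failed" | "cancelled";

// Old check: only an explicit cancel stops in-flight work.
function wasCancelled(status: WorkflowRunStatus | null): boolean {
  return status === "cancelled";
}

// Generalized check: anything other than 'running' stops in-flight work.
// A null status (assumed to mean the run row is gone) also stops.
function shouldStop(status: WorkflowRunStatus | null): boolean {
  return status !== "running";
}
```

The second form needs no changes when a new non-running status is introduced, which is the point made about a future 'paused' status.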

* fix: cancel/complete race, abandon semantics, UTC dates

- Fix cancel/complete race condition: dag-executor now checks DB status
  before calling completeWorkflowRun or failWorkflowRun, preventing a
  cancel during the final layer from being overwritten to completed
- Abandon uses cancelWorkflowRun instead of failWorkflowRun, so
  abandoned runs don't get auto-resumed by findResumableRun
- Fix formatAge UTC bug: SQLite dates without Z suffix now parsed as UTC
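
A sketch of the UTC handling, assuming SQLite returns timestamps as "YYYY-MM-DD HH:MM:SS" strings with no zone marker. The helper `parseDbDate`, the explicit `now` parameter, and the "m/h/d" output format are hypothetical; the repository's formatAge may differ.

```typescript
// Parse a DB timestamp, treating zone-less strings as UTC rather than local time.
function parseDbDate(value: Date | string): Date {
  if (value instanceof Date) return value;
  const iso = value.replace(" ", "T");
  const hasZone = /(Z|[+-]\d{2}:?\d{2})$/.test(iso);
  return new Date(hasZone ? iso : iso + "Z");
}

// Hypothetical age formatter; returns 'unknown' for unparseable input.
function formatAge(startedAt: Date | string, now: Date): string {
  const ms = now.getTime() - parseDbDate(startedAt).getTime();
  if (isNaN(ms)) return "unknown";
  const minutes = Math.floor(ms / 60000);
  if (minutes < 60) return minutes + "m";
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return hours + "h";
  return Math.floor(hours / 24) + "d";
}
```

Without the appended `Z`, JavaScript would interpret the string in the local time zone, so the reported age would be off by the machine's UTC offset.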

* fix: address PR review — SQL safety, transactions, error handling, docs, tests

- Validate olderThanDays before SQL interpolation in deleteOldWorkflowRuns
- Wrap multi-statement deletes in transactions (deleteOldWorkflowRuns, deleteWorkflowRun)
- Fix deleteWorkflowRun error double-wrap (don't re-wrap "not found" errors)
- Handle null getWorkflowRunStatus in DAG executor (treat as deleted, abort)
- Fix mock name mismatch: interruptOrphanedRuns → failOrphanedRuns in 3 test files
- Fix default mock getWorkflowRunStatus to return 'running' instead of null
- Add NaN guard to formatAge (returns 'unknown' on unparseable dates)
- Fix stale 'interrupted' references in route summary and delete comment
- Include working path in /workflow resume response
- Align deleteOldWorkflowRuns return type to { count } for consistency
- Document workflow cleanup command in CLAUDE.md, CLI user guide, commands reference
- Document new API endpoints (resume, abandon, delete) in CLAUDE.md
- Add tests for deleteOldWorkflowRuns, deleteWorkflowRun, workflowCleanupCommand
- Fix workflowAbandonCommand test to assert cancelWorkflowRun call
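
The olderThanDays validation mentioned in the list above might be sketched as follows. The helper names and the SQLite date-modifier string are assumptions for illustration; the actual query in deleteOldWorkflowRuns is not shown in this log.

```typescript
// Reject anything that is not a non-negative integer before it reaches SQL text.
function assertValidDays(value: unknown): number {
  if (
    typeof value !== "number" ||
    !isFinite(value) ||
    Math.floor(value) !== value ||
    value < 0
  ) {
    throw new Error("olderThanDays must be a non-negative integer");
  }
  return value;
}

// Hypothetical: build a SQLite date modifier such as "-7 days" from the validated number.
function sqliteAgeModifier(olderThanDays: unknown): string {
  return `-${assertValidDays(olderThanDays)} days`;
}
```

Validating first means only a plain integer can ever be interpolated, even if the caller passes untrusted input.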

* refactor: simplify code per review — extract helper, cleaner date parsing, consistent guards

- Extract duplicated status-check blocks into skipIfStatusChanged helper in dag-executor
- Simplify formatAge to single-pass date parsing with Z suffix (ISO 8601)
- Use TERMINAL_WORKFLOW_STATUSES constant in delete route guard
- Rename cancelError → actionError in DashboardPage (covers 4 actions now)
- Fix merge conflict: add IDatabase import, getRunningWorkflows from dev
- Fix api.conversations.test.ts: add missing workflow mocks, fix Hono → OpenAPIHono

* fix: address review findings — double-rollback, missing guards, log context, tests

- Fix double-rollback in deleteWorkflowRun by removing inner rollback()
  call and letting the outer catch handle it
- Add terminal-status guard inside deleteWorkflowRun itself, not just in
  the route handler, to prevent deletion of running workflows
- Add rollback failure logging to the rollback() helper
- Add runId to error logs in resume/abandon/delete API route handlers
- Add workingPath to getActiveWorkflowRunByPath error log
- Add workflowRunId to dag-executor status-check warn logs
- Wrap workflowRunCommand in try/catch in workflowResumeCommand with
  structured logging and null guard for user_message
- Clean up stale 'interrupted' references in JSDoc
- Fix missing / prefix on workflow cleanup in commands-reference.md
- Add API route tests for POST /resume, POST /abandon, DELETE /:runId

* refactor: apply code simplifications from review

- Replace fragile startsWith string matching in deleteWorkflowRun catch
  with a typed WorkflowRunGuardError class
- Reorder listWorkflowRuns placeholder generation: capture startIdx
  before pushing values for clarity
- Replace curried makeRunAction factory in DashboardPage with a plain
  runAction helper function
- Move skipIfStatusChanged helper definition before its call sites in
  dag-executor to match reading order
2026-03-30 13:36:53 +03:00
Rasmus Widing
43992363c0 refactor(workflows): eliminate types.ts re-export shim (step 2.1) (#844)
* refactor(workflows): eliminate types.ts re-export shim (step 2.1)

Move the 4 non-schema types (LoadCommandResult, WorkflowExecutionResult,
WorkflowLoadError, WorkflowLoadResult) into schemas/workflow.ts where
WorkflowDefinition already lives. Relocate the compile-time NodeOutput/
NodeState assertion to schemas/workflow-run.ts. Add DagWorkflow alias and
all 4 types to schemas/index.ts. Update index.ts barrel to re-export from
./schemas directly. Update all 21 import lines across 12 source files and
9 test files. Rename types.test.ts → schemas.test.ts. Delete types.ts.

The import chain is now one level shallower: caller → ./schemas →
specific schema file (was: caller → ./types → ./schemas → schema file).
External consumers of @archon/workflows are unaffected.

Fixes #842

* docs: address review findings from PR #844

- Update schemas/workflow.ts file header to reflect that it now also
  contains non-Zod hand-written result types (LoadCommandResult,
  WorkflowExecutionResult, WorkflowLoadError, WorkflowLoadResult)
- Remove stale types.ts entry from CLAUDE.md directory structure
  listing (file was deleted in this PR)

* fix(docs): update stale references after types.ts deletion

- CLAUDE.md: wrong-example comment now notes the subpath no longer exists
  (not just "don't do this in web")
- LoadCommandResult JSDoc: clarify non-empty content is enforced at load
  time in executor-shared.ts, not by the type itself
2026-03-27 00:07:46 +02:00