Compare commits

...

172 commits
v0.3.1 ... dev

Author SHA1 Message Date
Alex Siri
7ea321419f
fix: initialize options.hooks before merging YAML node hooks (#1177)
When a workflow node defines hooks (PreToolUse/PostToolUse) in YAML but
no hooks exist yet on the options object, applyNodeConfig crashes with
"undefined is not an object" because it tries to assign properties on
the undefined options.hooks.

Initialize options.hooks to {} before the merge loop.

Reproduces with: archon workflow run archon-architect (which uses
per-node hooks extensively).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-21 14:52:56 +03:00
Rasmus Widing
ba4b9b47e6
docs(worktree): fix stale rename example + document copyFiles properly (#1328)
Three related fixes around the `worktree.copyFiles` primitive:

1. Remove the `.env.example -> .env` rename example from
   reference/configuration.md and getting-started/overview.md. The
   `->` parser was removed in #739 (2026-03-19) because it caused
   the stale-credentials production bug in #228 — but the docs kept
   advertising it. A user writing `.env.example -> .env` today gets
   `parseCopyFileEntry` returning `{source: '.env.example -> .env',
   destination: '.env.example -> .env'}`, stat() fails with ENOENT,
   and the copy silently no-ops at debug level.

2. Replace the single-line "Default behavior: .archon/ is always
   copied" note with a proper "Worktree file copying" subsection
   that explains:
   - Why this exists (git worktree add = tracked files only; gitignored
     workflow inputs need this hook)
   - The `.archon/` default (no config needed for the common case)
   - Common entries: .env, .vscode/, .claude/, plans/, reports/,
     data fixtures
   - Semantics: source=destination, ENOENT silently skipped, per-entry
     error isolation, path-traversal rejected
   - Interaction with `worktree.path` (both layouts get the same
     treatment)

3. Update the overview example to drop the `.env.example + .env` pair
   (which implied rename semantics) in favor of `.env + plans/`, and
   call out that `.archon/` is auto-copied so users don't list it.
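The literal-entry semantics from point 1 can be sketched as follows; the real `parseCopyFileEntry` shape is assumed from the commit's description:

```typescript
interface CopyFileEntry { source: string; destination: string }

// Hypothetical sketch of the post-#739 semantics: an entry is a single
// path used as both source and destination; no `->` rename parsing.
function parseCopyFileEntry(entry: string): CopyFileEntry {
  // The whole string is taken literally, so a legacy ".env.example -> .env"
  // entry becomes a nonexistent path and the copy later no-ops on ENOENT.
  return { source: entry, destination: entry };
}
```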

No code changes. `bun run format:check` and `bun run lint` green.
2026-04-21 12:15:37 +03:00
Lior Franko
08de8ee5c6
fix(web,server): show real platform connection status in Settings (#1061)
The Settings page's Platform Connections section hardcoded every platform
except Web to 'Not configured', so users couldn't tell whether their Slack/
Telegram/Discord/GitHub/Gitea/GitLab adapters had actually started.

- Server: /api/health now returns an activePlatforms array populated live
  as each adapter's start() resolves. Passed into registerApiRoutes so the
  reference stays mutable — Telegram starts after the HTTP listener is
  already accepting requests, so a snapshot would miss it.
- Web: SettingsPage.PlatformConnectionsSection now reads activePlatforms
  from /api/health and looks each platform up in a Set. Also adds Gitea
  and GitLab to the list (they already ship as adapters).
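The shared-reference point can be illustrated with a minimal sketch (route shape hypothetical): because the handler closes over the array itself, adapters that finish `start()` after the listener is up still appear.

```typescript
// Live array shared by reference; a snapshot copy would miss late starters.
const activePlatforms: string[] = [];

function registerApiRoutes(platforms: string[]) {
  return {
    // Hypothetical /api/health handler: reads the array at request time.
    health: () => ({ status: "ok", activePlatforms: platforms }),
  };
}

const api = registerApiRoutes(activePlatforms);
activePlatforms.push("telegram"); // adapter start() resolves after routes exist
```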

Closes #1031

Co-authored-by: Lior Franko <liorfr@dreamgroup.com>
2026-04-21 11:47:32 +03:00
Rasmus Widing
5ed38dc765
feat(isolation,workflows): worktree location + per-workflow isolation policy (#1310)
* feat(isolation): per-project worktree.path + collapse to two layouts

Adds an opt-in `worktree.path` to .archon/config.yaml so a repo can co-locate
worktrees with its own checkout (`<repoRoot>/<path>/<branch>`) instead of the
default `~/.archon/workspaces/<owner>/<repo>/worktrees/<branch>`. Requested in
joelsb's #1117.

Primitive changes (clean up the graveyard rather than add parallel code paths):

- Collapse worktree layouts from three to two. The old "legacy global" layout
  (`~/.archon/worktrees/<owner>/<repo>/<branch>`) is gone — every repo resolves
  to the workspace-scoped layout (`~/.archon/workspaces/<owner>/<repo>/worktrees/<branch>`),
  whether it was archon-cloned or locally registered. `extractOwnerRepo()` on
  the repo path is the stable identity fallback. Ends the divergence where
  workspace-cloned and local repos had visibly different worktree trees.

- `getWorktreeBase()` in @archon/git now returns `{ base, layout }` and accepts
  an optional `{ repoLocal }` override. The layout value replaces the old
  `isProjectScopedWorktreeBase()` classification at the call sites
  (`isProjectScopedWorktreeBase` stays exported as deprecated back-compat).

- `WorktreeCreateConfig.path` carries the validated override from repo config.
  `resolveRepoLocalOverride()` fails loudly on absolute paths, `..` escapes,
  and resolve-escape edge cases (Fail Fast — no silent default fallback when
  the config is syntactically wrong).

- `WorktreeProvider.create()` now loads repo config exactly once and threads it
  through `getWorktreePath()` + `createWorktree()`. Replaces the prior
  swallow-then-retry pattern flagged on #1117. `generateEnvId()` is gone —
  envId is assigned directly from the resolved path (the invariant was already
  documented on `destroy(envId)`).

Tests (packages/git + packages/isolation):
- Update the pre-existing `getWorktreeBase` / `isProjectScopedWorktreeBase`
  suite for the new two-layout return shape and precedence.
- Add 8 tests for `worktree.path`: default fallthrough, empty/whitespace
  ignored, override wins for workspace-scoped repos, rejects absolute, rejects
  `../` escapes (three variants), accepts nested relative paths.

Docs: add `worktree.path` to the repo config reference with explicit precedence
and the `.gitignore` responsibility note.

Co-authored-by: Joel Bastos <joelsb2001@gmail.com>

* feat(workflows): per-workflow worktree.enabled policy

Introduces a declarative top-level `worktree:` block on a workflow so
authors can pin isolation behavior regardless of invocation surface. Solves
the case where read-only workflows (e.g. `repo-triage`) should always run in
the live checkout, without every CLI/web/scheduled-trigger caller having to
remember to set the right flag.

Schema (packages/workflows/src/schemas/workflow.ts + loader.ts):

- New optional `worktree.enabled: boolean` on `workflowBaseSchema`. Loader
  parses with the same warn-and-ignore discipline used for `interactive`
  and `modelReasoningEffort` — invalid shapes log and drop rather than
  killing workflow discovery.

Policy reconciliation (packages/cli/src/commands/workflow.ts):

- Three hard-error cases when YAML policy contradicts invocation flags:
  • `enabled: false` + `--branch`       (worktree required by flag, forbidden by policy)
  • `enabled: false` + `--from`         (start-point only meaningful with worktree)
  • `enabled: true`  + `--no-worktree`  (policy requires worktree, flag forbids it)
- `enabled: false` + `--no-worktree` is redundant, accepted silently.
- `--resume` ignores the pinned policy (it reuses the existing run's worktree
  even when policy would disable — avoids disturbing a paused run).
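The reconciliation matrix above, sketched as a single function (names and error text hypothetical):

```typescript
interface Flags { branch?: string; from?: string; noWorktree?: boolean; resume?: boolean }

// Sketch of the three hard-error cases plus the resume and redundant-flag
// carve-outs described in the commit message.
function reconcileWorktreePolicy(enabled: boolean | undefined, flags: Flags): void {
  if (flags.resume || enabled === undefined) return; // resume reuses the run's worktree
  if (enabled === false && flags.branch)
    throw new Error("--branch requires a worktree, but workflow policy disables it");
  if (enabled === false && flags.from)
    throw new Error("--from requires a worktree, but workflow policy disables it");
  if (enabled === true && flags.noWorktree)
    throw new Error("workflow policy requires a worktree, but --no-worktree forbids it");
  // enabled: false + --no-worktree is redundant: accepted silently
}
```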

Orchestrator wiring (packages/core/src/orchestrator/orchestrator-agent.ts):

- `dispatchOrchestratorWorkflow` short-circuits `validateAndResolveIsolation`
  when `workflow.worktree?.enabled === false` and runs directly in
  `codebase.default_cwd`. Web chat/slack/telegram callers have no flag
  equivalent to `--no-worktree`, so the YAML field is their only control.
- Logged as `workflow.worktree_disabled_by_policy` for operator visibility.

First consumer (.archon/workflows/repo-triage.yaml):

- `worktree: { enabled: false }` — triage reads issues/PRs and writes gh
  labels; no code mutations, no reason to spin up a worktree per run.

Tests:

- Loader: parses `worktree.enabled: true|false`, omits block when absent.
- CLI: four new integration tests for the reconciliation matrix (skip when
  policy false, three hard-error cases, redundant `--no-worktree` accepted,
  `--no-worktree` + `enabled: true` rejected).

Docs: authoring-workflows.md gets the new top-level field in the schema
example with a comment explaining the precedence and the `enabled: true|false`
semantics.

* fix(isolation): use path.sep for repo-containment check on Windows

resolveRepoLocalOverride was hardcoding '/' as the separator in the
startsWith check, so on Windows (where `resolve()` returns backslash
paths like `D:\Users\dev\Projects\myapp`) every otherwise-valid
relative `worktree.path` was rejected with "resolves outside the repo
root". Fixed by importing `path.sep` and using it in the sentinel.

Fixes the 3 Windows CI failures in `worktree.path repo-local override`.
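A sketch of the corrected containment check, assuming the sentinel described above:

```typescript
import * as path from "node:path";

// Sketch: a resolved override must stay under repoRoot. Using path.sep
// (not a hardcoded '/') keeps the sentinel correct on Windows, where
// resolve() returns backslash paths.
function isInsideRepo(repoRoot: string, relOverride: string): boolean {
  const root = path.resolve(repoRoot);
  const target = path.resolve(root, relOverride);
  return target === root || target.startsWith(root + path.sep);
}
```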

---------

Co-authored-by: Joel Bastos <joelsb2001@gmail.com>
2026-04-20 21:54:10 +03:00
Rasmus Widing
7be4d0a35e
feat(paths,workflows): unify ~/.archon/{workflows,commands,scripts} + drop globalSearchPath (closes #1136) (#1315)
* feat(paths,workflows): unify ~/.archon/{workflows,commands,scripts} + drop globalSearchPath

Collapses the awkward `~/.archon/.archon/workflows/` convention to a direct
`~/.archon/workflows/` child (matching `workspaces/`, `archon.db`, etc.), adds
home-scoped commands and scripts with the same loading story, and kills the
opt-in `globalSearchPath` parameter so every call site gets home-scope for free.

Closes #1136 (supersedes @jonasvanderhaegen's tactical fix — the bug was the
primitive itself: an easy-to-forget parameter that five of six call sites on
dev dropped).

Primitive changes:

- Home paths are direct children of `~/.archon/`. New helpers in `@archon/paths`:
  `getHomeWorkflowsPath()`, `getHomeCommandsPath()`, `getHomeScriptsPath()`,
  and `getLegacyHomeWorkflowsPath()` (detection-only for migration).
- `discoverWorkflowsWithConfig(cwd, loadConfig)` reads home-scope internally.
  The old `{ globalSearchPath }` option is removed. Chat command handler, Web
  UI workflow picker, orchestrator resolve path — all inherit home-scope for
  free without maintainer patches at every new site.
- `discoverScriptsForCwd(cwd)` merges home + repo scripts (repo wins on name
  collision). dag-executor and validator use it; the hardcoded
  `resolve(cwd, '.archon', 'scripts')` single-scope path is gone.
- Command resolution is now walked-by-basename in each scope. `loadCommand`
  and `resolveCommand` walk 1 subfolder deep and match by `.md` basename, so
  `.archon/commands/triage/review.md` resolves as `review` — closes the
  latent bug where subfolder commands were listed but unresolvable.
- All three (`workflows/`, `commands/`, `scripts/`) enforce a 1-level
  subfolder cap (matches the existing `defaults/` convention). Deeper
  nesting is silently skipped.
- `WorkflowSource` gains `'global'` alongside `'bundled'` and `'project'`.
  Web UI node palette shows a dedicated "Global (~/.archon/commands/)"
  section; badges updated.
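The merge semantics (repo wins on name collision) can be sketched as follows; `Script` and `mergeScripts` are illustrative names, not the package's real ones:

```typescript
interface Script { name: string; path: string }

// Sketch: home-scope scripts load first, then repo-scope entries
// overwrite on name collision, so the repo always wins.
function mergeScripts(home: Script[], repo: Script[]): Script[] {
  const byName = new Map<string, Script>();
  for (const s of home) byName.set(s.name, s);
  for (const s of repo) byName.set(s.name, s); // repo wins on collision
  return [...byName.values()];
}
```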

Migration (clean cut — no fallback read):

- First use after upgrade: if `~/.archon/.archon/workflows/` exists, Archon
  logs a one-time WARN per process with the exact `mv` command:
  `mv ~/.archon/.archon/workflows ~/.archon/workflows && rmdir ~/.archon/.archon`
  The legacy path is NOT read — users migrate manually. Rollback caveat
  noted in CHANGELOG.

Tests:

- `@archon/paths/archon-paths.test.ts`: new helper tests (default HOME,
  ARCHON_HOME override, Docker), plus regression guards for the double-`.archon/`
  path.
- `@archon/workflows/loader.test.ts`: home-scoped workflows, precedence,
  subfolder 1-depth cap, legacy-path deprecation warning fires exactly once
  per process.
- `@archon/workflows/validator.test.ts`: home-scoped commands + subfolder
  resolution.
- `@archon/workflows/script-discovery.test.ts`: depth cap + merge semantics
  (repo wins, home-missing tolerance).
- Existing CLI + orchestrator tests updated to drop `globalSearchPath`
  assertions.

E2E smoke (verified locally, before cleanup):

- `.archon/workflows/e2e-home-scope.yaml` + scratch repo at /tmp
- Home-scoped workflow discovered from an unrelated git repo
- Home-scoped script (`~/.archon/scripts/*.ts`) executes inside a script node
- 1-level subfolder workflow (`~/.archon/workflows/triage/*.yaml`) listed
- Legacy path warning fires with actionable `mv` command; workflows there
  are NOT loaded

Docs: `CLAUDE.md`, `docs-web/guides/global-workflows.md` (full rewrite for
three-type scope + subfolder convention + migration),
`docs-web/reference/configuration.md` (directory tree),
`docs-web/reference/cli.md`, `docs-web/guides/authoring-workflows.md`.

Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>

* test(script-discovery): normalize path separators in mocks for Windows

The 4 new tests in `scanScriptDir depth cap` and `discoverScriptsForCwd —
merge repo + home with repo winning` compared incoming mock paths with
hardcoded forward-slash strings (`if (path === '/scripts/triage')`). On
Windows, `path.join('/scripts', 'triage')` produces `\scripts\triage`, so
those branches never matched, readdir returned `[]`, and the tests failed.

Added a `norm()` helper at module scope and wrapped the incoming `path`
argument in every `mockImplementation` before comparing. Stored paths go
through `normalizeSep()` in production code, so the existing equality
assertions on `script.path` remain OS-independent.
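The helper is presumably a one-liner along these lines:

```typescript
// Sketch of the test-side normalization: convert incoming mock paths to
// forward slashes before comparing against hardcoded literals, so
// path.join's backslash output on Windows still matches.
const norm = (p: string): string => p.replace(/\\/g, "/");
```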

Fixes Windows CI job `test (windows-latest)` on PR #1315.

* address review feedback: home-scope error handling, depth cap, and tests

Critical fixes:
- api.ts: add `maxDepth: 1` to all 3 findMarkdownFilesRecursive calls in
  GET /api/commands (bundled/home/project). Without this the UI palette
  surfaced commands from deep subfolders that the executor (capped at 1)
  could not resolve — silent "command not found" at runtime.
- validator.ts: wrap home-scope findMarkdownFilesRecursive and
  resolveCommandInDir calls in try/catch so EACCES/EPERM on
  ~/.archon/commands/ doesn't crash the validator with a raw filesystem
  error. ENOENT still returns [] via the underlying helper.

Error handling fixes:
- workflow-discovery.ts: maybeWarnLegacyHomePath now sets the
  "warned-once" flag eagerly before `await access()`, so concurrent
  discovery calls (server startup with parallel codebase resolution)
  can't double-warn. Non-ENOENT probe errors (EACCES/EPERM) now log at
  WARN instead of DEBUG so permission issues on the legacy dir are
  visible in default operation.
- dag-executor.ts: wrap discoverScriptsForCwd in its own try/catch so
  an EACCES on ~/.archon/scripts/ routes through safeSendMessage /
  logNodeError with a dedicated "failed to discover scripts" message
  instead of being mis-attributed by the outer catch's
  "permission denied (check cwd permissions)" branch.
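The eager-flag pattern can be sketched as follows (helper signature hypothetical):

```typescript
// Sketch of the race fix: set the once-flag BEFORE the first await, so two
// concurrent discovery calls can't both pass the guard and double-warn.
let warnedLegacyPath = false;

async function maybeWarnLegacyHomePath(
  exists: (p: string) => Promise<boolean>,
  legacyDir: string,
  warn: (msg: string) => void,
): Promise<void> {
  if (warnedLegacyPath) return;
  warnedLegacyPath = true; // eagerly, before the async probe below
  if (await exists(legacyDir)) warn(`legacy workflows dir found: ${legacyDir}`);
}
```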

Tests:
- load-command-prompt.test.ts (new): 6 tests covering the executor's
  command resolution hot path — home-scope resolves when repo misses,
  repo shadows home, 1-level subfolder resolvable by basename, 2-level
  rejected, not-found, empty-file. Runs in its own bun test batch.
- archon-paths.test.ts: add getHomeScriptsPath describe block to match
  the existing getHomeCommandsPath / getHomeWorkflowsPath coverage.

Comment clarity:
- workflow-discovery.ts: MAX_DISCOVERY_DEPTH comment now leads with the
  actual value (1) before describing what 0 would mean.
- script-discovery.ts: copy the "routing ambiguity" rationale from
  MAX_DISCOVERY_DEPTH to MAX_SCRIPT_DISCOVERY_DEPTH.

Cleanup:
- Remove .archon/workflows/e2e-home-scope.yaml — one-off smoke test that
  would ship permanently in every project's workflow list. Equivalent
  coverage exists in loader.test.ts.

Addresses all blocking and important feedback from the multi-agent
review on PR #1315.

---------

Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>
2026-04-20 21:45:32 +03:00
Rasmus Widing
cc78071ff6
fix(isolation): raise worktree git-operation timeout to 5m (#1306)
All 15 worktree git-subprocess timeouts in WorktreeProvider were hardcoded
at 30000ms. Repos with heavy post-checkout hooks (lint, dependency install,
submodule init) routinely exceed that budget and fail worktree creation.

Consolidate them onto a single GIT_OPERATION_TIMEOUT_MS constant at 5 min.
Generous enough to cover reported cases while still catching genuine hangs
(credential prompts in non-TTY, stalled fetches).

Chosen over the config-key approach in #1029 to avoid adding permanent
.archon/config.yaml surface for a problem a raised default solves cleanly.
If 5 min turns out to also be too tight for real-world use, we'll revisit.

Closes #1119
Supersedes #1029

Co-authored-by: Shay Elmualem <12733941+norbinsh@users.noreply.github.com>
2026-04-20 21:45:24 +03:00
ACJLabsDev
235a8ce202
Add Star History Chart to README.md (#1229)
2026-04-20 19:43:52 +03:00
Kagura
39a05b762f
fix(db): throw on corrupt commands JSON instead of silent empty fallback (#1033)
* fix(db): throw on corrupt commands JSON instead of silent empty fallback (#967)

getCodebaseCommands() silently returned {} when the commands column
contained corrupt JSON. Callers had no way to distinguish 'no commands'
from 'unreadable data', violating fail-fast principles.

Now throws a descriptive error with the codebase ID and a recovery hint.
The error is still logged for observability before throwing.

Adds two test cases: corrupt JSON throws, valid JSON string parses.
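A sketch of the fail-fast read (function name and error text assumed from the description, not the real implementation):

```typescript
// Sketch: distinguish "no commands" from "unreadable data" by throwing a
// descriptive error instead of silently returning {} on corrupt JSON.
function parseCommandsColumn(codebaseId: string, raw: string | null): Record<string, string> {
  if (raw === null || raw === "") return {}; // genuinely no commands
  try {
    return JSON.parse(raw) as Record<string, string>;
  } catch (err) {
    throw new Error(
      `corrupt commands JSON for codebase ${codebaseId}: ${(err as Error).message}. ` +
        `Fix or clear the commands column to recover.`,
    );
  }
}
```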

* fix: include parse error in log for better diagnostics
2026-04-20 16:19:50 +03:00
Cole Medin
c5e11ea8f5
docs(claude-md): surface Pi as peer provider alongside Claude and Codex
CLAUDE.md is the primary entry point for agents working in this repo, but it
only mentioned Pi once — buried in a DAG-node capability parenthetical. Add
Pi to the directory tree, Package Split blurb, and AI Agent Providers list
so Pi is discoverable without relying on the docs site or git log.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-20 07:59:41 -05:00
Cole Medin
cb44b96f7b
feat(providers/pi): interactive flag binds UIContext for extensions (#1299)
* feat(providers/pi): interactive flag binds UIContext for extensions

Adds `interactive: true` opt-in to Pi provider (in `.archon/config.yaml`
under `assistants.pi`) that binds a minimal `ExtensionUIContext` stub to
each session. Without this, Pi's `ExtensionRunner.hasUI()` reports false,
causing extensions like `@plannotator/pi-extension` to silently auto-approve
every plan instead of opening their browser review UI.

Semantics: clamped to `enableExtensions: true` — no extensions loaded
means nothing would consume `hasUI`, so `interactive` alone is silently
dropped. Stub forwards `notify()` to Archon's event stream; interactive
dialogs (select/confirm/input/editor/custom) resolve to undefined/false;
TUI-only setters (widgets/headers/footers/themes) no-op. Theme access
throws with a clear diagnostic — Pi's theme singleton is coupled to its
own `Symbol.for()` registry which Archon doesn't own.

Trust boundary: only binds when the operator has explicitly enabled
both flags. Extensions gated on `ctx.hasUI` (plannotator and similar)
get a functional UI context; extensions that reach for TUI features
still fail loudly rather than rendering garbage.
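The stub's contract can be sketched as follows (method set abridged and hypothetical, based on the semantics described above):

```typescript
// Sketch of a minimal ExtensionUIContext stub: notify() forwards into the
// event stream, interactive dialogs resolve to "no answer", and TUI-only
// setters are no-ops.
function makeUiContextStub(emit: (msg: string) => void) {
  return {
    notify: (msg: string) => emit(msg),      // forwarded to Archon's stream
    confirm: async () => false,              // dialogs can't block a headless run
    select: async () => undefined,
    input: async () => undefined,
    setWidget: (_w: unknown) => {},          // TUI-only surface: no-op
  };
}
```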

Includes smoke-test workflow documenting the integration surface.
End-to-end plannotator UI rendering requires plan-mode activation
(Pi `--plan` CLI flag or `/plannotator` TUI slash command) which is
out of reach for programmatic Archon sessions — manual test only.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(providers/pi): end-to-end interactive extension UI

Three fixes that together get plannotator's browser review UI to actually
render from an Archon workflow and reach the reviewer's browser.

1. Call resourceLoader.reload() when enableExtensions is true.
   createAgentSession's internal reload is gated on `!resourceLoader`, so
   caller-supplied loaders must reload themselves. Without this,
   getExtensions() returns the empty default, no ExtensionRunner is built,
   and session.extensionRunner.setFlagValue() silently no-ops.

2. Set PLANNOTATOR_REMOTE=1 in interactive mode.
   plannotator-browser.ts only calls ctx.ui.notify(url) when openBrowser()
   returns { isRemote: true }; otherwise it spawns xdg-open/start on the
   Archon server host — invisible to the user and untestable from bash
   asserts. From the workflow runner's POV every Archon execution IS
   remote; flipping the heuristic routes the URL through notify(), which
   the ExtensionUIContext stub forwards into the event stream. Respect
   explicit operator overrides.

3. notify() emits as assistant chunks, not system chunks.
   The DAG executor's system-chunk filter only forwards warnings/MCP
   prefixes, and only assistant chunks accumulate into $nodeId.output.
   Emitting as assistant makes the URL available both in the user's
   stream and in downstream bash/script nodes via output substitution.

Plus: extensionFlags config pass-through (equivalent to `pi --plan` on the
CLI) applied via ExtensionRunner.setFlagValue() BEFORE bindExtensions
fires session_start, so extensions reading flags in their startup handler
actually see them. Also bind extensions with an empty binding when
enableExtensions is on but interactive is off, so session_start still
fires for flag-driven but UI-less extensions.

Smoke test (.archon/workflows/e2e-plannotator-smoke.yaml) uses
openai-codex/gpt-5.4-mini (ChatGPT Plus OAuth compatible) and bumps
idle_timeout to 600000ms so plannotator's server survives while a human
approves in the browser.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(providers/pi): keep Archon extension-agnostic

Remove the plannotator-specific PLANNOTATOR_REMOTE=1 env var write from
the Pi provider. Archon's provider layer shouldn't know about any
specific extension's internals. Document the env var in the plannotator
smoke test instead — operators who use plannotator set it via their shell
or per-codebase env config.

Workflow smoke test updated with:
- Instructions for setting PLANNOTATOR_REMOTE=1 externally
- Simpler assertion (URL emission only) — validated in a real
  reject-revise-approve run: reviewer annotated, clicked Send Feedback,
  Pi received the feedback as a tool result, revised the plan (added
  aria-label and WCAG contrast per the annotation), resubmitted, and
  reviewer approved. Plannotator's tool result signals approval but
  doesn't return the plan text, so the bash assertion now only checks
  that the review URL reached the stream (not that plan content flowed
  into $nodeId.output — it can't).
- Known-limitation note documenting the tool-result shape so downstream
  workflow authors know to Write the plan separately if they need it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(providers/pi): keep e2e-plannotator-smoke workflow local-only

The smoke test is plannotator-specific (calls plannotator_submit_plan,
expects PLAN.md on disk, requires PLANNOTATOR_REMOTE=1) and is better
kept out of the PR while the extension-agnostic infra lands.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style(providers/pi): trim verbose inline comments

Collapse multi-paragraph SDK explanations to 1-2 line "why" notes across
provider.ts, types.ts, ui-context-stub.ts, and event-bridge.ts. No
behavior change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(providers/pi): wire assistants.pi.env + theme-proxy identity

Two end-to-end fixes discovered while exercising the combined
plannotator + @pi-agents/loop smoke flow:

- PiProviderDefaults gains an optional `env` map; parsePiConfig picks
  it up and the provider applies it to process.env at session start
  (shell env wins, no override). Needed so extensions like plannotator
  can read PLANNOTATOR_REMOTE=1 from config.yaml without requiring a
  shell export before `archon workflow run`.

- ui-context-stub theme proxy returns identity decorators instead of
  throwing on unknown methods. Styled strings flow into no-op
  setStatus/setWidget sinks anyway, so the throw was blocking
  plannotator_submit_plan after HTTP approval with no benefit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(providers/pi): flush notify() chunks immediately in batch mode

Batch-mode adapters (CLI) accumulate assistant chunks and only flush on
node completion. That broke plannotator's review-URL flow: Pi's notify()
emitted the URL as an assistant chunk, but the user needed the URL to
POST /api/approve — which is what unblocks the node in the first place.

Adds an optional `flush` flag on assistant MessageChunks. notify() sets
it, and the DAG executor drains pending batched content before surfacing
the flushed chunk so ordering is preserved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: mention Pi alongside Claude and Codex in README + top-level docs

The AI assistants docs page already covers Pi in depth, but the README
architecture diagram + docs table, overview "Further Reading" section,
and local-deployment .env comment still listed only Claude/Codex.

Left feature-specific mentions alone where Pi genuinely lacks support
(e.g. structured output — Claude + Codex only).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: note Pi structured output (best-effort) in matrix + workflow docs

Pi gained structured output support via prompt augmentation + JSON
extraction (see packages/providers/src/community/pi/capabilities.ts).
Unlike Claude/Codex, which use SDK-enforced JSON mode, Pi appends the
schema to the prompt and parses JSON out of the result text (bare or
fenced). Updates four stale references that still said Claude/Codex-only:

- ai-assistants.md capabilities matrix
- authoring-workflows.md (YAML example + field table)
- workflow-dag.md skill reference
- CLAUDE.md DAG-format node description

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(providers/pi): default extensions + interactive to on

Extensions (community packages like @plannotator/pi-extension and
user-authored ones) are a core reason users pick Pi. Defaulting
enableExtensions and interactive to false previously silenced installed
extensions with no signal, leading to "did my extension even load?"
confusion.

Opt out in .archon/config.yaml when you want the prior behavior:

  assistants:
    pi:
      enableExtensions: false   # skip extension discovery entirely
      # interactive: false       # load extensions, but no UI bridge

Docs gain a new "Extensions (on by default)" section in
getting-started/ai-assistants.md that documents the three config
surfaces (extensionFlags, env, workflow-level interactive) and uses
plannotator as a concrete walk-through example.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-20 07:37:40 -05:00
Cocoon-Break
45682bd2c8
fix(providers/claude): use || instead of ?? in hasExplicitTokens to handle empty-string env vars (#1028)
Closes #1027
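The distinction, for reference (the function body is an assumed sketch, not the real implementation):

```typescript
// `??` only falls through on null/undefined, so an empty-string env var
// (e.g. TOKEN="") would count as an explicit token. `||` also treats ""
// as absent, which is the desired behavior for env vars.
function hasExplicitTokens(env: Record<string, string | undefined>): boolean {
  return Boolean(env.API_KEY || env.OAUTH_TOKEN); // var names hypothetical
}
```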
2026-04-20 14:15:27 +03:00
Rasmus Widing
52eebf995a
chore(gitignore): ignore .claude/scheduled_tasks.lock
Machine-local runtime state from the Claude Code scheduler (pid +
sessionId + acquisition timestamp). Should not be shared across machines.
2026-04-20 13:39:44 +03:00
Rasmus Widing
28908f0c75
feat(paths/cli/setup): unify env load + write on three-path model (#1302, #1303) (#1304)
* feat(paths/cli/setup): unify env load + write on three-path model (#1302, #1303)

Key env handling on directory ownership rather than filename. `.archon/` (at
`~/` or `<cwd>/`) is archon-owned; anything else is the user's.

- `<repo>/.env` — stripped at boot (guard kept), never loaded, never written
- `<repo>/.archon/.env` — loaded at repo scope (wins over home), writable via
  `archon setup --scope project`
- `~/.archon/.env` — loaded at home scope, writable via `--scope home` (default)

Read side (#1302):
- New `@archon/paths/env-loader` with `loadArchonEnv(cwd)` shared by CLI and
  server entry points. Loads both archon-owned files with `override: true`;
  repo scope wins.
- Replaced `[dotenv@17.3.1] injecting env (0) from .env` (always lied about
  stripped keys) with `[archon] stripped N keys from <cwd> (...)` and
  `[archon] loaded N keys from <path>` lines, emitted only when N > 0.
  `quiet: true` passed to dotenv to silence its own output.
- `stripCwdEnv` unchanged in semantics — still the only source that deletes
  keys from `process.env`; now logs what it did.

Write side (#1303):
- `archon setup` never writes to `<repo>/.env`. Writing there was incoherent
  because `stripCwdEnv` deletes those keys on every run.
- New `--scope home|project` (default home) targets exactly one archon-owned
  file. New `--force` overrides the merge; backup still written.
- Merge-only by default: existing non-empty values win, user-added custom keys
  survive, `<path>.archon-backup-<ISO-ts>` written before every rewrite. Fixes
  silent PostgreSQL→SQLite downgrade and silent token loss in Add mode.
- One-time migration note emitted when `<cwd>/.env` exists at setup start.

Tests: new `env-loader.test.ts` (6), extended `strip-cwd-env.test.ts` (+4 for
the log line), extended `setup.test.ts` (+10 for scope/merge/backup/force/
repo-untouched), extended `cli.test.ts` (+5 for flag parsing).

Docs: configuration.md, cli.md, security.md, cli-internals.md, setup skill —
all updated to the three-path model.

* fix(cli/setup): address PR review — scope/path/secret-handling edge cases

- cli: resolve --scope project to git repo root so running setup from a
  subdir writes to <repo-root>/.archon/.env (what loadArchonEnv reads at
  boot), not <subdir>/.archon/.env. Fail fast with a useful message when
  --scope project is used outside a git repo.
- setup: resolveScopedEnvPath() now delegates to @archon/paths helpers
  (getArchonEnvPath / getRepoArchonEnvPath) so Docker's /.archon home,
  ARCHON_HOME overrides, and the "undefined" literal guard all behave
  identically between the loader and the writer.
- setup: wrap the writeScopedEnv call in try/catch so an fs exception
  (permission denied, read-only FS, backup copy failure) stops the clack
  spinner cleanly and emits an actionable error instead of a raw stack
  trace after the user has completed the entire wizard.
- setup: checkExistingConfig(envPath?) — scope-aware existing-config read.
  Add/Update/Fresh now reflects the actual write target, not an
  unconditional ~/.archon/.env.
- setup: serializeEnv escapes \r (was only \n) so values with bare CR or
  CRLF round-trip through dotenv.parse without corruption. Regression
  test added.
- setup: merge path treats whitespace-only existing values ('   ') as
  empty, so a copy-paste stray space doesn't silently defeat the wizard
  update for that key forever. Regression test added.
- setup: 0o600 mode on the written env file AND on backup copies —
  writeFileSync+copyFileSync default to 0o666 & ~umask, which can leave
  secrets group/world-readable on a permissive umask.
- docs/cli.md + setup skill: appendix sections that still described the
  pre-#1303 two-file symlink model now reflect the three-path model.
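Two of the setup fixes above (CR/LF escaping and whitespace-only merge values) can be sketched in TypeScript. This is an illustrative sketch only, with hypothetical function names; the real logic lives in the setup command and round-trips through dotenv.parse.

```typescript
// Hypothetical sketch of two behaviors described above; not the actual
// setup.ts implementation.

// Escape carriage returns AND newlines so a value containing bare CR or
// CRLF survives a later dotenv-style parse without corruption.
function serializeEnv(entries: Record<string, string>): string {
  return Object.entries(entries)
    .map(([key, value]) => {
      const escaped = value
        .replace(/\\/g, "\\\\")
        .replace(/\r/g, "\\r")
        .replace(/\n/g, "\\n");
      return `${key}="${escaped}"`;
    })
    .join("\n");
}

// Treat whitespace-only existing values ('   ') as empty so a stray
// copy-pasted space does not silently block wizard updates for that key.
function mergeEnv(
  existing: Record<string, string>,
  updates: Record<string, string>,
): Record<string, string> {
  const merged = { ...existing };
  for (const [key, value] of Object.entries(updates)) {
    const current = merged[key];
    if (current === undefined || current.trim() === "") merged[key] = value;
  }
  return merged;
}
```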

* fix(paths/env-loader): Windows-safe assertion for home-scope load line

The test asserted the log line contained `from ~/`, which is opportunistic
tilde-shortening that only happens when the tmpdir lives under `homedir()`.
On Windows CI the tmpdir is on `D:\\` while homedir is `C:\\Users\\...`, so
the path renders absolute and the `~/` never appears.

Match on the count and the archon-home tmpdir segment instead — robust on
both Unix tilde-short paths and Windows absolute paths.
2026-04-20 12:49:14 +03:00
Rasmus Widing
8ae4a56193
feat(workflows): add repo-triage — periodic maintenance via inline Haiku sub-agents (#1293)
* feat(workflows): add repo-triage — 6-node periodic maintenance workflow

Adds .archon/workflows/repo-triage.yaml: a self-contained periodic
maintenance workflow that uses inline sub-agents (Claude SDK agents:
field introduced in #1276) for map-reduce across open issues and PRs.

Six DAG nodes, three-layer topology:
- Layer 1 (parallel): triage-issues, link-prs, closed-pr-dedup-check,
  stale-nudge
- Layer 2: closed-dedup-check (reads triage-issues state)
- Layer 3: digest (synthesises all prior nodes + writes markdown)

Capabilities per node:
- triage-issues: delegates labeling to on-disk triage-agent; inline
  brief-gen Haiku for duplicate detection; 3-day auto-close clock
  for unanswered duplicate warnings
- link-prs: conservative PR ↔ issue cross-refs via inline pr-issue-
  matcher Haiku, Sonnet re-verifies fully-addresses claims before
  suggesting Closes #X; auto-nudges on low-quality PR template fill
  with first-run grandfather guard (snapshot-only, no nudge spam)
- closed-dedup-check: cross-matches open issues against recently-
  closed ones via inline closed-brief-gen Haiku; same 3-day clock
- closed-pr-dedup-check: flags open PRs duplicating recently-closed
  PRs via inline pr-brief-gen Haiku; comment-only, never closes PRs
- stale-nudge: 60-day inactivity pings (configurable); no auto-close
- digest: synthesises per-node outputs + reads state files to emit
  $ARTIFACTS_DIR/digest.md with clickable GitHub comment links

Env-gated rollout knobs:
- DRY_RUN=1 (read-only; prints [DRY] lines, no gh/state mutations)
- SKIP_PR_LINK=1, SKIP_CLOSED_DEDUP=1, SKIP_CLOSED_PR_DEDUP=1,
  SKIP_STALE_NUDGE=1
- STALE_DAYS=N (stale-nudge window; default 60)

Cross-run state under .archon/state/ (gitignored):
- triage-state.json        briefs + pendingDedupComments
- closed-dedup-state.json  closedBriefs + closedMatchComments
- closed-pr-dedup-state.json openBriefs + closedBriefs + matches
- pr-state.json            linkedPrs + commentIds + templateAdherence
- stale-nudge-state.json   nudged (with updatedAtAtNudge for re-nudge)

Every bot comment:
- @-tags the target human (reporter for issues, author for PRs)
- Tracks comment ID in state for traceability
- Is idempotent — re-runs skip existing comments

Intended use: invoke periodically (`archon workflow run repo-triage
--no-worktree`) once a scheduler lands; live state persists across
runs so previously-flagged items reconcile correctly.

.gitignore: adds .archon/state/ for cross-run memory files.

* feat(workflows/repo-triage): post digest to Slack when SLACK_WEBHOOK is set

Extends the digest node with an optional Slack-post step after the
canonical digest.md artifact is written. Uses Slack incoming webhook
(no bot token required beyond the incoming-webhook scope).

Behavior:
- SLACK_WEBHOOK unset → skipped silently with a one-line note
- DRY_RUN=1 → prints full payload, does not curl
- Otherwise → POSTs a compact (<3500 char) mrkdwn-formatted summary
  containing headline numbers, this-run comment index (clickable
  GitHub URLs), pending items, and a path reference to digest.md
- curl failure or non-ok Slack response is logged but does not fail
  the node — digest.md on disk remains authoritative
- Intermediate Slack text written to $ARTIFACTS_DIR/digest-slack.txt
  for traceability; payload JSON assembled via jq and written to
  $ARTIFACTS_DIR/slack-payload.json before curl posts it

Slack mrkdwn conversion rules baked into the prompt (no tables, link
shape <url|text>, single-asterisk bold) so Sonnet emits a variant
that renders cleanly in Slack rather than being sent raw.
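The posting rules above (skip when unset, dry-run gate, compact payload, failure never fails the node) can be sketched in TypeScript. The actual digest node assembles the payload with jq and posts it with curl from bash, so every name below is hypothetical.

```typescript
// Illustrative sketch of the Slack-post rules; not the workflow's bash step.
function buildSlackPayload(mrkdwn: string): string {
  // Keep the Slack summary compact (<3500 chars); digest.md on disk
  // remains the authoritative artifact.
  const text =
    mrkdwn.length > 3500
      ? mrkdwn.slice(0, 3470) + "\n…truncated; see digest.md"
      : mrkdwn;
  return JSON.stringify({ text });
}

async function postDigest(
  webhook: string | undefined,
  mrkdwn: string,
  dryRun: boolean,
): Promise<void> {
  if (!webhook) {
    console.log("SLACK_WEBHOOK unset; skipping Slack post");
    return;
  }
  const payload = buildSlackPayload(mrkdwn);
  if (dryRun) {
    console.log(`[DRY] would POST ${payload.length} bytes to Slack`);
    return;
  }
  try {
    const res = await fetch(webhook, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: payload,
    });
    // Non-ok responses are logged but never fail the digest node.
    if (!res.ok) console.warn(`Slack returned HTTP ${res.status}`);
  } catch (err) {
    console.warn(`Slack post failed: ${String(err)}`);
  }
}
```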

The webhook URL is read from the operator's environment (Archon
auto-loads ~/.archon/.env on CLI startup — put SLACK_WEBHOOK=... there).

* fix(workflows/repo-triage): address PR #1293 review feedback

Critical (3):
- `gh issue close --reason "not planned"` (space, not underscore) — the
  CLI expects lowercase with a space; `not_planned` fails at runtime.
  Fixed in both auto-close paths (triage-issues step 8, closed-dedup-
  check step 7).
- link-prs step 7 state save was sparse `{ sha, processedAt, related,
  fullyAddresses }`, overwriting `commentIds` / `templateNudgedAt` /
  `templateAdherence`. Changed to explicit merge that spreads existing
  entry first so per-run captured fields survive.
- Corrupt-JSON state files previously treated as first-run default
  (silent `pendingDedupComments` reset → 3-day clock restarts forever).
  All five state-load sites now abort loudly on JSON.parse throw;
  ENOENT/empty continue to default-shape.
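The state-load contract described in the last item can be sketched as follows (function names are hypothetical): a missing or empty file is a legitimate first run and gets the default shape; corrupt JSON aborts loudly instead of silently resetting state and restarting the 3-day clock.

```typescript
import { readFileSync } from "node:fs";

// Sketch of the fixed state-load contract; not the workflow's actual code.
function parseStateRaw<T>(raw: string | null, defaultShape: T, path: string): T {
  // null (ENOENT) or empty content means first run: default shape is fine.
  if (raw === null || raw.trim() === "") return defaultShape;
  try {
    return JSON.parse(raw) as T;
  } catch {
    // Corrupt JSON must abort loudly, never fall back to defaults.
    throw new Error(
      `Corrupt state file ${path}: refusing to reset; fix or delete it manually`,
    );
  }
}

function loadState<T>(path: string, defaultShape: T): T {
  let raw: string | null;
  try {
    raw = readFileSync(path, "utf8");
  } catch (err) {
    if ((err as { code?: string }).code !== "ENOENT") throw err;
    raw = null;
  }
  return parseStateRaw(raw, defaultShape, path);
}
```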

Important (7):
- Sub-agents (`brief-gen`, `closed-brief-gen`, `pr-brief-gen`,
  `pr-issue-matcher`) emit `ERROR: <reason>` on gh failures rather than
  partial/fabricated JSON. Orchestrator detects the sentinel, logs the
  failed ID + first 200 chars of raw response, tracks in a failed-list,
  and aborts the cluster/match pass if ≥50% of items failed (avoids
  acting on bad data).
- `pr-brief-gen` now sets `diffTruncated: true` when the 30k-char diff
  cap hits; link-prs verify pass downgrades any `fully-addresses` claim
  to `related` when either side's brief was truncated.
- 3-day auto-close validates `postedAt` parses as ISO-8601 before the
  elapsed-time comparison; corrupt timestamps are logged and skipped,
  never acted on.
- `gh issue close` failure path no longer drops state — sets
  `closeAttemptFailed: true` on the entry for next-run retry. Only
  drops on exit 0.
- `closed-pr-dedup-check` idempotency check (`gh pr view --json comments`)
  now aborts the post on fetch failure rather than falling through —
  prevents double-posts on gh hiccups.
- `triage-agent` label pass has preflight `test -f` check for
  `.claude/agents/triage-agent.md`; skips the pass with a clear log if
  the file is missing rather than firing Task calls that fail obscurely.
- `brief-gen` template-adherence wording flipped from "Ignore … as
  'filled'" (ambiguous, read as affirmative) to explicit "A section
  counts as MISSING when …", matching the `pr-issue-matcher` phrasing.
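The ISO-8601 guard on the auto-close clock can be sketched like this (helper name hypothetical): only act on the 3-day elapsed-time comparison when `postedAt` parses as a real timestamp; corrupt timestamps are skipped, never acted on.

```typescript
// Sketch of the timestamp guard described above; not the workflow's code.
const THREE_DAYS_MS = 3 * 24 * 60 * 60 * 1000;

function dedupWarningExpired(postedAt: string, now: Date = new Date()): boolean {
  const posted = new Date(postedAt);
  if (Number.isNaN(posted.getTime())) {
    // Corrupt timestamp: log and skip, never auto-close on bad data.
    console.warn(`Skipping entry with corrupt postedAt: ${JSON.stringify(postedAt)}`);
    return false;
  }
  return now.getTime() - posted.getTime() >= THREE_DAYS_MS;
}
```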

Minor:
- `stale-nudge` idempotency check uses substring "has been quiet for"
  instead of a prefix check that never matched (posted body starts
  with @<author>).
- `closed-dedup-check` distinguishes "upstream crashed" (missing/corrupt
  triage-state.json, or `lastRunAt == null`) from "legitimately quiet
  day" (state present, briefs empty) — different log lines.
- Slack curl adds `-w "\nHTTP_STATUS:%{http_code}"` + `2>&1` so TLS /
  4xx / 5xx errors are visible in captured output.
- `stateReason` values from `gh issue view --json stateReason` are
  UPPERCASE (`COMPLETED`, `NOT_PLANNED`); documented and instruct
  sub-agent to normalize to lowercase for consistency.

Docs:
- CLAUDE.md repo-level `.archon/` tree now lists `state/`.
- archon-directories.md tree adds `state/` + `scripts/` (both were
  missing) with purpose descriptions.

Deferred (worth doing as a follow-up, not blocking):
- DRY/SKIP preamble duplication (~30-50 lines across 5 nodes).
- Explicit `BASELINE_IS_EMPTY` capture in link-prs (current derived
  check works but is a load-bearing model instruction).
- Digest `WARNING` prefix block when upstream nodes are missing
  outputs — today's "(output unavailable)" sub-line is functional.
- Pre-existing README workflow-count (17 → 20) and table gaps — not
  caused by this PR.
2026-04-20 11:34:38 +03:00
Fly Lee
eb730c0b82
fix(docs): prevent theme reset to dark after user switches to auto/light (#1079)
Starlight removes the `starlight-theme` localStorage key when the user
selects "auto" mode. The old init script checked that key, so every
navigation or refresh re-forced dark theme. Use a separate
`archon-theme-init` sentinel that persists across theme changes.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 10:01:27 +03:00
Anjishnu Sengupta
c495175d94
Fix formatting in README.md (#1059) 2026-04-20 09:59:26 +03:00
Cole Medin
ec5e5a5cf9
feat(providers/pi): opt-in extension discovery via config flag (#1298)
Adds `assistants.pi.enableExtensions` (default false) to `.archon/config.yaml`.
When true, Pi's `noExtensions` guard is lifted so the session loads tools and
lifecycle hooks from `~/.pi/agent/extensions/`, packages installed via
`pi install npm:<pkg>`, and the workflow's cwd `.pi/` directory — opening up
the community extension ecosystem at https://shittycodingagent.ai/packages.

Default stays suppressed to preserve the "Archon is source of truth" trust
boundary: enabling this loads arbitrary JS under the Archon server's OS
permissions, including whatever extension code the target repo happens to
ship. Operators opt in explicitly, per-host.

Skills, prompt templates, themes, and context files remain suppressed even
when extensions are enabled — only the extensions gate opens.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-19 14:35:52 -05:00
Cole Medin
fb73a500d7
feat(providers/pi): best-effort structured output via prompt engineering (#1297)
Pi's SDK has no native JSON-schema mode (unlike Claude's outputFormat /
Codex's outputSchema). Previously Pi declared structuredOutput: false
and any workflow using output_format silently degraded — the node ran,
the transcript was treated as free text, and downstream $nodeId.output.field
refs resolved to empty strings. 8 bundled/repo workflows across 10 nodes
were affected (archon-create-issue, archon-fix-github-issue,
archon-smart-pr-review, archon-workflow-builder, archon-validate-pr, etc.).

This PR closes the gap via prompt engineering + post-parse:

1. When requestOptions.outputFormat is present, the provider appends a
   "respond with ONLY a JSON object matching this schema" instruction plus
   JSON.stringify(schema) to the prompt before calling session.prompt().

2. bridgeSession accepts an optional jsonSchema param. When set, it buffers
   every assistant text_delta and — on the terminal result chunk — parses
   the buffer via tryParseStructuredOutput (trims whitespace, strips
   ```json / ``` fences, JSON.parse). On success, attaches
   structuredOutput to the result chunk (matching Claude's shape). On
   failure, emits a warn event and leaves structuredOutput undefined so
   the executor's existing dag.structured_output_missing path handles it.

3. Flipped PI_CAPABILITIES.structuredOutput to true. Unlike Claude/Codex
   this is best-effort, not SDK-enforced — reliable on GPT-5, Claude,
   Gemini 2.x, recent Qwen Coder, DeepSeek V3, less reliable on smaller
   or older models that ignore JSON-only instructions.
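The post-parse step in point 2 can be sketched directly from its description: trim whitespace, strip ```json / ``` fences, JSON.parse, and return undefined rather than throw on failure. A minimal sketch, mirroring the documented behavior rather than the provider's actual source:

```typescript
// Sketch of the described tryParseStructuredOutput behavior.
function tryParseStructuredOutput(text: string): unknown | undefined {
  let candidate = text.trim();
  // Strip a surrounding ```json ... ``` or bare ``` ... ``` fence if present.
  const fenced = candidate.match(/^```(?:json)?\s*\n?([\s\S]*?)\n?```$/);
  if (fenced) candidate = fenced[1].trim();
  try {
    return JSON.parse(candidate);
  } catch {
    // Leaves structuredOutput undefined so the executor's existing
    // dag.structured_output_missing path handles it.
    return undefined;
  }
}
```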

Tests added (14 total):
- tryParseStructuredOutput: clean JSON, fenced, bare fences, arrays,
  whitespace, empty, prose-wrapped (fails), malformed, inner backticks
- augmentPromptForJsonSchema via provider integration: schema appended,
  prompt unchanged when absent
- End-to-end: clean JSON → structuredOutput parsed; fenced JSON parses;
  prose-wrapped → no structuredOutput + no crash; no outputFormat →
  never sets structuredOutput even if assistant happens to emit JSON

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-19 10:16:02 -05:00
Cole Medin
83c119af78
fix(providers/pi): wire env injection + harden silent-failure paths (#1296)
Four defensive fixes to the Pi community provider to match the
Claude/Codex contract and eliminate silent error swallowing.

1. envInjection now actually wired (capability was declared but unused)
   Pi's SDK has no top-level `env` option on createAgentSession, so
   per-project env vars were being dropped. Routes requestOptions.env
   through a BashSpawnHook that merges caller env over the inherited
   baseline (caller wins, matching Claude/Codex semantics). When env is
   present with no allow/deny, resolvePiTools now explicitly returns Pi's
   4 default tools so the pre-constructed default bashTool is replaced
   with an env-aware one.

2. AsyncQueue no longer leaks on consumer abort. Added close() that
   drains pending waiters with { done: true } so iterate() exits instead
   of hanging forever when the producer's finally fires before the next
   push. bridgeSession calls queue.close() in its finally block.

3. buildResultChunk no longer reports silent success when agent_end fires
   with no assistant message. Now returns { isError: true, errorSubtype:
   'missing_assistant_message' } and logs a warn event so broken Pi
   sessions don't masquerade as clean completions.

4. session-resolver no longer swallows arbitrary errors from
   SessionManager.list(). Narrowed the catch to ENOENT/ENOTDIR (the only
   "session dir doesn't exist yet" signals); permission errors, parse
   failures, and other unexpected errors now propagate.
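The AsyncQueue fix in point 2 can be sketched as follows. This is a simplified illustration of the close-drains-waiters pattern, not the bridge's actual queue:

```typescript
// Simplified sketch of an AsyncQueue whose close() drains pending waiters
// so a consumer blocked on next() exits instead of hanging forever.
type Waiter<T> = (result: IteratorResult<T>) => void;

class AsyncQueue<T> {
  private items: T[] = [];
  private waiters: Waiter<T>[] = [];
  private closed = false;

  push(item: T): void {
    const waiter = this.waiters.shift();
    if (waiter) waiter({ value: item, done: false });
    else this.items.push(item);
  }

  close(): void {
    this.closed = true;
    // Resolve every pending waiter with { done: true } so iteration ends.
    for (const waiter of this.waiters.splice(0)) {
      waiter({ value: undefined, done: true });
    }
  }

  async next(): Promise<IteratorResult<T>> {
    if (this.items.length > 0) return { value: this.items.shift()!, done: false };
    if (this.closed) return { value: undefined, done: true };
    return new Promise<IteratorResult<T>>((resolve) => this.waiters.push(resolve));
  }
}
```

In the bridge described above, the producer's finally block would call `queue.close()`, which is what prevents the consumer hang when the producer finishes before the next push.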

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-19 09:20:32 -05:00
Rasmus Widing
60eeb00e42
feat(workflows): inline sub-agent definitions on DAG nodes (#1276)
* feat(workflows): inline sub-agent definitions on DAG nodes

Add `agents:` node field letting workflow YAML define Claude Agent SDK
sub-agents inline, keyed by kebab-case ID. The main agent can spawn
them via the Task tool — useful for map-reduce patterns where a cheap
model briefs items and a stronger model reduces.

Authors no longer need standalone `.claude/agents/*.md` files for
workflow-scoped helpers; the definitions live with the workflow.

Claude only. Codex and community providers without the capability
emit a capability warning and ignore the field. Merges with the
internal `dag-node-skills` wrapper when `skills:` is also set —
user-defined agents win on ID collision.

* fix(workflows): address PR #1276 review feedback

Critical:
- Re-export agentDefinitionSchema + AgentDefinition from schemas/index.ts
  (matches the "schemas/index.ts re-exports all" convention).

Important:
- Surface user-override of internal 'dag-node-skills' wrapper: warn-level
  provider log + platform message to the user when agents: redefines the
  reserved ID alongside skills:. User-wins behavior preserved (by design)
  but silent capability removal is now observable.
- Add validator test coverage for the agents-capability warning (codex
  node with agents: → warning; claude node → no warning; no-agents
  field → no warning).
- Strengthen NodeConfig.agents duplicate-type comment explaining the
  intentional circular-dep avoidance and pointing at the Zod schema as
  authoritative source. Actual extraction is follow-up work.

Simplifications:
- Drop redundant typeof check in validator (schema already enforces).
- Drop unreachable Object.keys(...).length > 0 check in dag-executor.
- Drop rot-prone "(out of v1 scope)" parenthetical.
- Drop WHAT-only comment on AGENT_ID_REGEX.
- Tighten AGENT_ID_REGEX to reject trailing/double hyphens
  (/^[a-z0-9]+(-[a-z0-9]+)*$/).
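The tightened regex above can be demonstrated in isolation; the grouped form rejects leading, trailing, and doubled hyphens that a naive character-class check would accept:

```typescript
// The tightened kebab-case agent-ID check from the review pass above.
const AGENT_ID_REGEX = /^[a-z0-9]+(-[a-z0-9]+)*$/;

const isValidAgentId = (id: string): boolean => AGENT_ID_REGEX.test(id);
```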

Tests:
- parseWorkflow strips agents on script: and loop: nodes (parallel to
  the existing bash: coverage).
- provider emits warn log on dag-node-skills collision; no warn on
  non-colliding inline agents.

Docs:
- Renumber authoring-workflows Summary section (12b → 13; bump 13-19).
- Add Pi capability-table row for inline agents (Claude-only).
- Add when-to-use guidance (agents: vs .claude/agents/*.md) in the
  new "Inline sub-agents" section.
- Cross-link skills.md Related → inline-sub-agents.
- CHANGELOG [Unreleased] Added entry for #1276.
2026-04-19 09:16:01 +03:00
Cole Medin
4c6ddd994f
fix(workflows): fail loudly on SDK isError results (#1208) (#1291)
Previously, `dag-executor` only failed nodes/iterations when the SDK
returned an `error_max_budget_usd` result. Every other `isError: true`
subtype — including `error_during_execution` — was silently `break`ed
out of the stream with whatever partial output had accumulated, letting
failed runs masquerade as successful ones with empty output.

This is the most likely explanation for the "5-second crash" symptom in
#1208: iterations finish instantly with empty text, the loop keeps
going, and only the `claude.result_is_error` log tips the user off.

Changes:
- Capture the SDK's `errors: string[]` detail on result messages
  (previously discarded) and surface it through `MessageChunk.errors`.
- Log `errors`, `stopReason` alongside `errorSubtype` in
  `claude.result_is_error` so users can see what actually failed.
- Throw from both the general node path and the loop iteration path
  on any `isError: true` result, including the subtype and SDK errors
  detail in the thrown message.

Note: this does not implement auto-retry. See PR comments on #1121 and
the analysis on #1208 — a retry-with-fresh-session approach for loop
iterations is not obviously correct until we see what
`error_during_execution` actually carries in the reporter's env.
This change is the observability + fail-loud step that has to come
first so that signal is no longer silent.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 15:02:35 -05:00
DIY Smart Code
d89bc767d2
fix(setup): align PORT default on 3090 across .env.example, wizard, and JSDoc (#1152) (#1271)
The server's getPort() fallback changed from 3000 to 3090 in the Hono
migration (#318), but .env.example, the setup wizard's generated .env,
and the JSDoc describing the fallback were not updated — leaving three
different sources of truth for "the default PORT."

When the wizard writes PORT=3000 to ~/.archon/.env (which the Hono
server loads with override: true, while Vite only reads repo-local
.env), the two processes can land on different ports silently. That
mismatch is the real mechanism behind the failure described in #1152.

- .env.example: comment out PORT, document 3090 as the default
- packages/cli/src/commands/setup.ts: wizard no longer writes PORT=3000
  into the generated .env; fix the "Additional Options" note
- packages/cli/src/commands/setup.test.ts: assert no bare PORT= line and
  the commented default is present
- packages/core/src/utils/port-allocation.ts: fix stale JSDoc "default
  3000" -> "default 3090"
- deploy/.env.example: keep Docker default at 3000 (compose/Caddy target
  that) but annotate it so users don't copy it for local dev

Single source of truth for the local-dev default is now basePort in
port-allocation.ts.
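The single-source-of-truth fallback can be sketched like this; a hypothetical shape, assuming the real getPort() in port-allocation.ts reads PORT and falls back to basePort:

```typescript
// Sketch of the 3090 fallback described above; basePort is the single
// source of truth for the local-dev default.
const basePort = 3090;

function getPort(env: Record<string, string | undefined> = process.env): number {
  const parsed = Number.parseInt(env.PORT ?? "", 10);
  return Number.isNaN(parsed) ? basePort : parsed;
}
```

Because the wizard no longer writes a bare PORT= line, both the Hono server and Vite see the same absent value and land on 3090 together.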
2026-04-17 14:15:37 +02:00
Rasmus Widing
c864d8e427
refactor(providers/pi): drop rot-prone file:line refs from code comments (#1275)
Applies the CLAUDE.md comment rule ("don't embed paths/callers that rot
as the codebase evolves") flagged by the PR #1271 review to the Pi
provider's inline comments.

Three spots in the merged Pi code embed `packages/.../provider.ts:N-M`
line ranges pointing at the Claude and Codex providers. These ranges
will drift the moment those files change — the Claude auth-merge
pattern's line numbers are already off-by-a-few in some local branches.

Keep the conceptual cross-reference ("mirrors Claude's process-env +
request-env merge pattern", "matches the Codex provider's fallback
pattern for the same condition") — that's the load-bearing part of the
comment — drop the fragile line numbers and file paths.

Same treatment for the upstream Pi auth-storage.ts:424-485 reference,
which points at a specific line range in a moving dependency.

No behavior change; comment-only refactor.
2026-04-17 14:08:43 +02:00
Rasmus Widing
4e56991b72
feat(providers): add Pi community provider (@mariozechner/pi-coding-agent) (#1270)
* feat(providers): add Pi community provider (@mariozechner/pi-coding-agent)

Introduces Pi as the first community provider under the Phase 2 registry,
registered with builtIn: false. Wraps Pi's full coding-agent harness the
same way ClaudeProvider wraps @anthropic-ai/claude-agent-sdk and
CodexProvider wraps @openai/codex-sdk.

- PiProvider implements IAgentProvider; fresh AgentSession per sendQuery call
- AsyncQueue bridges Pi's callback-based session.subscribe() to Archon's
  AsyncGenerator<MessageChunk> contract
- Server-safe: AuthStorage.inMemory + SessionManager.inMemory +
  SettingsManager.inMemory + DefaultResourceLoader with all no* flags —
  no filesystem access, no cross-request state
- API key seeded per-call from options.env → process.env fallback
- Model refs: '<pi-provider-id>/<model-id>' (e.g. google/gemini-2.5-pro,
  openrouter/qwen/qwen3-coder) with syntactic compatibility check
- registerPiProvider() wired at CLI, server, and config-loader entrypoints,
  kept separate from registerBuiltinProviders() since builtIn: false is
  load-bearing for the community-provider validation story
- All 12 capability flags declared false in v1 — dag-executor warnings fire
  honestly for any unmapped nodeConfig field
- 58 new tests covering event mapping, async-queue semantics, model-ref
  parsing, defensive config parsing, registry integration
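The model-ref syntax above splits on the first slash only, since the model id may itself contain slashes (e.g. openrouter/qwen/qwen3-coder). A sketch of that syntactic check, with hypothetical names, using the v1 provider list from this commit:

```typescript
// Sketch of '<pi-provider-id>/<model-id>' parsing; names hypothetical.
const PI_PROVIDERS = new Set([
  "anthropic", "openai", "google", "groq", "mistral",
  "cerebras", "xai", "openrouter", "huggingface",
]);

function parsePiModelRef(ref: string): { provider: string; model: string } {
  const slash = ref.indexOf("/");
  if (slash <= 0 || slash === ref.length - 1) {
    throw new Error(`Invalid Pi model ref '${ref}': expected '<pi-provider-id>/<model-id>'`);
  }
  const provider = ref.slice(0, slash);
  if (!PI_PROVIDERS.has(provider)) {
    throw new Error(`Unknown Pi provider '${provider}' in model ref '${ref}'`);
  }
  // Everything after the first '/' is the model id, slashes included.
  return { provider, model: ref.slice(slash + 1) };
}
```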

Supported Pi providers (v1): anthropic, openai, google, groq, mistral,
cerebras, xai, openrouter, huggingface. Extend PI_PROVIDER_ENV_VARS as
needed.

Out of scope (v1): session resume, MCP, hooks, skills mapping, thinking
level mapping, structured output, OAuth flows, model catalog validation.
These remain false on PI_CAPABILITIES until intentionally wired.

* feat(providers/pi): read ~/.pi/agent/auth.json for OAuth + api_key passthrough

Replaces the v1 env-var-only auth flow with AuthStorage.create(), which
reads ~/.pi/agent/auth.json. This transparently picks up credentials the
user has populated via `pi` → `/login` (OAuth subscriptions: Claude
Pro/Max, ChatGPT Plus, GitHub Copilot, Gemini CLI, Antigravity) or by
editing the file directly.

Env-var behavior preserved: when ANTHROPIC_API_KEY / GEMINI_API_KEY /
etc. is set (in process.env or per-request options.env), the adapter
calls setRuntimeApiKey which is priority #1 in Pi's resolution chain.
Auth.json entries are priority #2-#3. Pi's internal env-var fallback
remains priority #4 as a safety net.

Archon does not implement OAuth flows itself — it only rides on creds
the user created via the Pi CLI. OAuth refresh still happens inside Pi
(auth-storage.ts:369-413) under a file lock; concurrent refreshes
between the Pi CLI and Archon are race-safe by Pi's own design.

- Fail-fast error now mentions both the env-var path and `pi /login`
- 2 new tests: OAuth cred from auth.json; env var wins over auth.json
- 12 existing tests still pass (env-var-only path unchanged)

CI compatibility: no auth.json in CI, no change — env-var (secrets)
flows through Pi's getEnvApiKey fallback identically to v1.

* test(e2e): add Pi provider smoke test workflow

Mirrors e2e-claude-smoke.yaml: single prompt node + bash assert.
Targets `anthropic/claude-haiku-4-5` via `provider: pi`; works in CI
(ANTHROPIC_API_KEY secret) and locally (user's `pi /login` OAuth).

Verified locally with an Anthropic OAuth subscription — full run takes
~4s from session_started to assert PASS, exercising the async-queue
bridge and agent_end → result-chunk assembly under real Pi event timing.

Not yet wired into .github/workflows/e2e-smoke.yml — separate PR once
this lands, to keep the Pi provider PR minimal.

* feat(providers/pi): v2 — thinkingLevel, tool restrictions, systemPrompt

Extends the Pi adapter with three node-level translations, flipping the
corresponding capability flags from false → true so the dag-executor no
longer emits warnings for these fields on Pi nodes.

1. effort / thinking → Pi thinkingLevel (options-translator.ts)
   - Archon EffortLevel enum: low|medium|high|max (from
     packages/workflows/src/schemas/dag-node.ts). `max` maps to Pi's
     `xhigh` since Archon's enum lacks it.
   - Pi-native strings (minimal, xhigh, off) also accepted for
     programmatic callers bypassing the schema.
   - `off` on either field → no thinkingLevel (Pi's implicit off).
   - Claude-shape object `thinking: {type:'enabled', budget_tokens:N}`
     yields a system warning and is not applied.

2. allowed_tools / denied_tools → filtered Pi built-in tools
   - Supports all 7 Pi tools: read, bash, edit, write, grep, find, ls.
   - Case-insensitive normalization.
   - Empty `allowed_tools: []` means no tools (LLM-only), matching
     e2e-claude-smoke's idiom.
   - Unknown names (Claude-specific like `WebFetch`) collected and
     surfaced as a system warning; ignored tools don't fail the run.

3. systemPrompt (AgentRequestOptions + nodeConfig.systemPrompt)
   - Threaded through `DefaultResourceLoader({systemPrompt})`; Pi's
     default prompt is replaced entirely. Request-level wins over
     node-level.
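The effort-to-thinkingLevel mapping in point 1 can be sketched as a small translation function (the name is hypothetical; the real logic lives in options-translator.ts): Archon's `max` maps to Pi's `xhigh`, Pi-native strings pass through, and `off` yields no thinkingLevel at all.

```typescript
// Sketch of the effort/thinking translation described above.
type PiThinkingLevel = "minimal" | "low" | "medium" | "high" | "xhigh";

const PI_NATIVE: readonly string[] = ["minimal", "low", "medium", "high", "xhigh"];

function toPiThinkingLevel(effort: string | undefined): PiThinkingLevel | undefined {
  // 'off' (or nothing) means Pi's implicit off: no thinkingLevel set.
  if (effort === undefined || effort === "off") return undefined;
  // Archon's EffortLevel enum lacks xhigh, so max maps onto it.
  if (effort === "max") return "xhigh";
  // Pi-native strings pass through for programmatic callers.
  if (PI_NATIVE.includes(effort)) return effort as PiThinkingLevel;
  // Unrecognized shapes (e.g. Claude budget objects) are not applied;
  // the real translator also emits a system warning here.
  return undefined;
}
```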

Capability flag changes:
- thinkingControl: false → true
- effortControl:   false → true
- toolRestrictions: false → true

Package delta:
- +1 direct dep: @sinclair/typebox (Pi types reference it; adding as
  direct dep resolves the TS portable-type error).
- +1 test file: options-translator.test.ts (19 tests, 100% coverage).
- provider.test.ts extended with 11 new tests covering all three paths.
- registry.test.ts updated: capability assertion reflects new flags.

Live-verified: `bun run cli workflow run e2e-pi-smoke --no-worktree`
succeeds in 1.2s with thinkingLevel=low, toolCount=0. Smoke YAML updated
to use `effort: low` (schema-valid) + `allowed_tools: []` (LLM-only).

* test(e2e): add comprehensive Pi smoke covering every CI-compatible node type

Exercises every node type Archon supports under `provider: pi`, except
`approval:` (pauses for human input, incompatible with CI):
  1. prompt   — inline AI prompt
  2. command  — named command file (uses e2e-echo-command.md)
  3. loop     — bounded iterative AI prompt (max_iterations: 2)
  4. bash     — shell script with JSON output
  5. script   — bun runtime (echo-args.js)
  6. script   — uv / Python runtime (echo-py.py)

Plus DAG features on top of Pi:
  - depends_on + $nodeId.output substitution
  - when: conditional with JSON dot-access
  - trigger_rule: all_success merge
  - final assert node validates every upstream output is non-empty

Complements the minimal e2e-pi-smoke.yaml — that stays as the fast-path
smoke for connectivity checks; this one is the broader surface coverage.

Verified locally end-to-end against Anthropic OAuth (pi /login): PASS,
all 9 non-final nodes produce output, assert succeeds.

* feat(providers/pi): resolve Archon `skills:` names to Pi skill paths

Flips capabilities.skills: false → true by translating Archon's name-based
`skills:` nodeConfig (e.g. `skills: [agent-browser]`) to absolute directory
paths Pi's DefaultResourceLoader can consume via additionalSkillPaths.

Search order for each skill name (first match wins):
  1. <cwd>/.agents/skills/<name>/      — project-local, agentskills.io
  2. <cwd>/.claude/skills/<name>/      — project-local, Claude convention
  3. ~/.agents/skills/<name>/          — user-global, agentskills.io
  4. ~/.claude/skills/<name>/          — user-global, Claude convention

A directory resolves only if it contains a SKILL.md. Unresolved names are
collected and surfaced as a system-chunk warning (e.g. "Pi could not
resolve skill names: foo, bar. Searched .agents/skills and .claude/skills
(project + user-global)."), matching the semantic of "requested but not
found" without aborting the run.

Pi's buildSystemPrompt auto-appends the agentskills.io XML block for each
loaded skill, so the model sees them — no separate prompt injection needed
(Pi differs from Claude here; Claude wraps in an AgentDefinition with a
preloaded prompt, Pi uses XML block in system prompt).

Ancestor directory traversal above cwd is deliberately skipped in this
pass — matches the Pi provider's cwd-bound scope and avoids ambiguity
about which repo's skills win when Archon runs from a subdirectory.

Bun's os.homedir() bypasses the HOME env var; the resolver uses
`process.env.HOME ?? homedir()` so tests can stage a synthetic home dir.

Tests:
- 11 new tests in options-translator.test.ts cover project/user, .agents/
  vs .claude/, project-wins-over-user, SKILL.md presence check, dedup,
  missing-name collection.
- 2 new integration tests in provider.test.ts cover the missing-skill
  warning path and the "no skills configured → no additionalSkillPaths"
  path.
- registry.test.ts updated to assert skills: true in capabilities.

Live-verified locally: `.claude/skills/archon-dev/SKILL.md` resolves,
pi.session_started log shows `skillCount: 1, missingSkillCount: 0`,
smoke workflow passes in 1.2s.

* feat(providers/pi): session resume via Pi session store

Flips capabilities.sessionResume: false → true. Pi now persists sessions
under ~/.pi/agent/sessions/<encoded-cwd>/<uuid>.jsonl by default — same
pattern Claude and Codex use for their respective stores, same blast
radius as those providers.

Flow:
  - No resumeSessionId → SessionManager.create(cwd) (fresh, persisted)
  - resumeSessionId + match in SessionManager.list(cwd) → open(path)
  - resumeSessionId + no match → fresh session + system warning
    ("⚠️ Could not resume Pi session. Starting fresh conversation.")
    Matches Codex's resume_thread_failed fallback at
    packages/providers/src/codex/provider.ts:553-558.
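The three-way flow above can be sketched as a resolver against a simplified store interface (the interface here is an assumption standing in for Pi's real session-store API):

```typescript
// Sketch of the resume resolution flow; types simplified, names hypothetical.
interface SessionStore {
  list(cwd: string): { sessionId: string; path: string }[];
  open(path: string): unknown;
  create(cwd: string): unknown;
}

function resolveSession(
  store: SessionStore,
  cwd: string,
  resumeSessionId?: string,
): { session: unknown; resumeFailed: boolean } {
  // Empty-string id is treated as "no resume requested".
  if (!resumeSessionId) return { session: store.create(cwd), resumeFailed: false };
  let entries: { sessionId: string; path: string }[];
  try {
    entries = store.list(cwd);
  } catch {
    // list() throwing degrades gracefully to a fresh session in this pass.
    return { session: store.create(cwd), resumeFailed: true };
  }
  const match = entries.find((e) => e.sessionId === resumeSessionId);
  if (match) return { session: store.open(match.path), resumeFailed: false };
  // No match: fresh session; the caller emits the "Could not resume" warning.
  return { session: store.create(cwd), resumeFailed: true };
}
```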

The sessionId flows back to Archon via the terminal `result` chunk —
bridgeSession annotates it with session.sessionId unconditionally so
Archon's orchestrator can persist it and pass it as resumeSessionId on
the next turn. Same mechanism used for Claude/Codex.

Cross-cwd resume (e.g. worktree switch) is deliberately not supported in
this pass: list(cwd) scans only the current cwd's session dir. A workflow
that changes cwd mid-run lands on a fresh session, which matches Pi's
mental model.

Bridge sessionId annotation uses session.sessionId, which Pi always
populates (UUID) — so no special-case for inMemory sessions is needed.

Factored the resolver into session-resolver.ts (5 unit tests):
  - no id → create
  - id + match → open
  - id + no match → create with resumeFailed: true
  - list() throws → resumeFailed: true (graceful)
  - empty-string id → treated as "no resume requested"

Integration tests in provider.test.ts add 3 cases:
  - resume-not-found yields warning + calls create
  - resume-match calls open with the file path, no warning
  - result chunk always carries sessionId

Verified live end-to-end against Anthropic OAuth:
  - first call → sessionId 019d...; model replies "noted"
  - second call with that sessionId → "resumed: true" in logs; model
    correctly recalls prior turn ("Crimson.")
  - bogus sessionId → "⚠️ Could not resume..." warning + fresh UUID

* refactor(providers,core): generalize community-provider registration

Addresses the community-pattern regression flagged in the PR #1270 review:
a second community provider should require editing only its own directory,
not seven files across providers/ + core/ + cli/ + server/.

Three changes:

1. Drop typed `pi` slot from AssistantDefaultsConfig + AssistantDefaults.
   Community providers live behind the generic `[string]` index that
   `ProviderDefaultsMap` was explicitly designed to provide. The typed
   claude/codex slots stay — they give IDE autocomplete for built-in
   config access without `as` casts, which was the whole reason the
   intersection exists. Community providers parse their own config via
   Record<string, unknown> anyway, so the typed slot added no real
   parser safety.

2. Loop-based getDefaults + mergeAssistantDefaults. No more hardcoded
   `pi: {}` spreads. getDefaults() seeds from `getRegisteredProviders()`;
   mergeAssistantDefaults clones every slot present in `base`. Adding a
   new provider requires zero edits to this function.

3. New `registerCommunityProviders()` aggregator in registry.ts.
   Entrypoints (CLI, server, config-loader) call ONE function after
   `registerBuiltinProviders()` rather than one call per community
   provider. Adding a new community provider is now a single-line edit
   to registerCommunityProviders().
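The aggregator shape, sketched — registry internals and the `registerPiProvider` body here are illustrative, not the real implementations:

```typescript
// Illustrative registry: id → provider factory. Re-registering the same id
// overwrites in place, so repeated calls are idempotent.
const registry = new Map<string, () => void>();

function registerProvider(id: string, factory: () => void): void {
  registry.set(id, factory);
}

function registerPiProvider(): void {
  registerProvider("pi", () => {
    /* construct the Pi provider */
  });
}

// Entrypoints (CLI, server, config-loader) call this ONCE, right after
// registerBuiltinProviders(). Adding a community provider = one line here.
function registerCommunityProviders(): void {
  registerPiProvider();
  // registerNextCommunityProvider(); ← future providers slot in here
}
```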

This makes Pi (and future community providers) actually behave like
Phase 2 (#1195) advertised: drop the implementation under
packages/providers/src/community/<id>/, export a `register<Id>Provider`,
add one line to the aggregator.

Tests:
- New `registerCommunityProviders` suite (2 tests: registers pi,
  idempotent).
- config-loader.test updated: assert built-in slots explicitly rather
  than exhaustive map shape.

No functional change for Pi end-users. Purely structural.

* fix(providers/pi,core): correctness + hygiene fixes from PR #1270 review

Addresses six of the review's important findings, all within the same
PR branch:

1. envInjection: false → true
   The provider reads requestOptions.env on every call (for API-key
   passthrough). Declaring the capability false caused a spurious
   dag-executor warning for every Pi user who configured codebase env
   vars — which is the MAIN auth path. Flipping to true removes the
   false positive.

2. toSafeAssistantDefaults: denylist → allowlist
   The old shape deleted `additionalDirectories`, `settingSources`,
   `codexBinaryPath` before sending defaults to the web UI. Any future
   sensitive provider field (OAuth token, absolute path, internal
   metadata) would silently leak via the `[key: string]: unknown` index
   signature. New SAFE_ASSISTANT_FIELDS map lists exactly what to
   expose per provider; unknown providers get an empty allowlist so
   the web UI sees "provider exists" but no config details.
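The denylist→allowlist flip can be sketched like this; the field names in the map are placeholders, not the real `SAFE_ASSISTANT_FIELDS` contents:

```typescript
// Illustrative allowlist map: exactly what each provider may expose to the
// web UI. Field names here are examples only.
const SAFE_ASSISTANT_FIELDS: Record<string, readonly string[]> = {
  claude: ["model", "permissionMode"],
  codex: ["model", "sandboxMode"],
};

function toSafeAssistantDefaults(
  provider: string,
  defaults: Record<string, unknown>,
): Record<string, unknown> {
  // Unknown providers get an empty allowlist: the UI learns the provider
  // exists but sees no config details — new sensitive fields cannot leak.
  const allowed = SAFE_ASSISTANT_FIELDS[provider] ?? [];
  const safe: Record<string, unknown> = {};
  for (const key of allowed) {
    if (key in defaults) safe[key] = defaults[key];
  }
  return safe;
}
```

The design point is that safety is now opt-in per field: a future `oauthToken` on any provider stays hidden by default instead of relying on someone remembering to extend a delete list.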

3. AsyncQueue single-consumer invariant
   The type was documented single-consumer but unenforced. A second
   `for await` would silently race with the first over buffer +
   waiters. Added a synchronous guard in Symbol.asyncIterator that
   throws on second call — copy-paste mistakes now fail fast with a
   clear message instead of dropping items.
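A minimal sketch of the guard (the real AsyncQueue also has a waiter list and async push/pull coordination, elided here):

```typescript
// Single-consumer async queue, reduced to the invariant being enforced.
class AsyncQueue<T> {
  private consumed = false;
  private items: T[] = [];

  push(item: T): void {
    this.items.push(item);
  }

  // Plain method (not an async generator) so the guard throws SYNCHRONOUSLY
  // at `for await` setup time, not lazily on the first next() call.
  [Symbol.asyncIterator](): AsyncIterator<T> {
    if (this.consumed) {
      throw new Error("AsyncQueue supports a single consumer");
    }
    this.consumed = true;
    const items = this.items;
    return {
      async next(): Promise<IteratorResult<T>> {
        if (items.length > 0) return { value: items.shift()!, done: false };
        return { value: undefined, done: true };
      },
    };
  }
}
```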

4. session.dispose() / session.abort() silent catches
   Both catch blocks now log at debug via a module-scoped logger so
   SDK regressions surface without polluting normal output.

5. Type scripted events as AgentSessionEvent in provider.test.ts
   Was `Record<string, unknown>` — Pi field renames would silently
   keep tests passing. Now typed against Pi's actual event union.

6. Leaked /tmp/pi-research/... path in provider.ts comment
   Local-machine path that crept in during research. Replaced with
   the upstream GitHub URL (matches convention at provider.ts:110).

Plus review-flagged simplifications:
  - Extract lookupPiModel wrapper — isolates the `as unknown as` cast
    behind one searchable name.
  - Hoist QueueItem → BridgeQueueItem at module scope (exported for
    test visibility; not used externally yet but enables unit testing
    the mapping in isolation if needed later).
  - getRegisteredProviderNames: remove side-effecting registration
    calls. `loadConfig()` already bootstraps the registry before any
    caller can observe this helper — the hidden coupling was
    misleading.

Plus missing-coverage tests from the review (pr-test-analyzer):
  - session.prompt() rejection → error surfaces to consumer
  - pre-aborted signal → session.abort() called
  - mid-stream abort → session.abort() called
  - modelFallbackMessage → system chunk yielded
  - AsyncQueue second-consumer → throws synchronously

No behavioral changes for end users beyond the envInjection warning
fix.

* docs: Pi provider + community-provider contributor guide

Addresses the PR #1270 review's docs-impact findings: the original Pi
PR had no user-facing or contributor-facing documentation, and
architecture.md still referenced the pre-Phase-2 factory.ts pattern
(factory.ts was deleted in #1195).

1. packages/docs-web/src/content/docs/reference/architecture.md
   - Replace stale factory.ts references with the registry pattern.
   - Update inline IAgentProvider block: add getCapabilities, add
     options parameter.
   - Rewrite MessageChunk block as the actual discriminated union
     (was a placeholder with optional fields that didn't match the
     current type).
   - "Adding a New AI Agent Provider" checklist now distinguishes
     built-in (register in registerBuiltinProviders) from community
     (separate guide). Links to the new contributor guide.

2. packages/docs-web/src/content/docs/contributing/adding-a-community-provider.md (new)
   - Step-by-step guide using Pi as the reference implementation.
   - Covers: directory layout, capability discipline (start false,
     flip one at a time), provider class skeleton, registration via
     aggregator, test isolation (Bun mock.module pollution), what
     NOT to do (no edits to AssistantDefaultsConfig, no direct
     registerProvider from entrypoints, no overclaiming capabilities).

3. packages/docs-web/src/content/docs/getting-started/ai-assistants.md
   - New "Pi (Community Provider)" section: install, OAuth +
     API-key table per Pi backend, model ref format, workflow
     examples, capability matrix showing what Pi supports (session
     resume, tool restrictions, effort/thinking, skills, system
     prompt, envInjection) and what it doesn't (MCP, hooks,
     structured output, cost control, fallback model, sandbox).

4. .env.example
   - New Pi section with commented env vars for each supported
     backend (ANTHROPIC_API_KEY through HUGGINGFACE_API_KEY), each
     paired with its Pi provider id. OAuth flow (pi /login → auth.json)
     is explicitly called out — Archon reads that file too.

5. CHANGELOG.md
   - Unreleased entry for Pi, registerCommunityProviders aggregator,
     and the new contributor guide.
2026-04-17 13:52:03 +02:00
DIY Smart Code
922edbbac0
Merge pull request #1272 from coleam00/fix/issue-1260-docker-bind-mount-dirs
fix(docker): create /.archon subdirs in entrypoint for bind mounts (#1260)
2026-04-17 12:50:43 +02:00
Leex
a7337d6977 fix(docker): create /.archon subdirs in entrypoint for bind mounts (#1260)
Named volumes inherit /.archon/workspaces and /.archon/worktrees from the
image layer on first run, but bind mounts do not. Without these directories,
the Claude subprocess is spawned with a non-existent cwd and fails silently,
causing the 60s first-event timeout.

Adding mkdir -p in the entrypoint is idempotent for named volumes and fixes
bind-mount setups (e.g. ARCHON_DATA pointing to a host path on macOS/Linux).
2026-04-17 12:40:13 +02:00
Rasmus Widing
301a139e5a
fix(core/test): split connection.test.ts from DB-test batch to avoid mock pollution (#1269)
messages.test.ts uses mock.module('./connection', ...) at module-load time.
Per CLAUDE.md:131 (Bun issue oven-sh/bun#7823), mock.module() is process-
global and irreversible. When Bun pre-loads all test files in a batch, the
mock shadows the real connection module before connection.test.ts runs,
causing getDatabaseType() to always return the mocked value regardless of
DATABASE_URL.

Move connection.test.ts into its own `bun test` invocation immediately
after postgres.test.ts (which runs alone) and before the big DB/utils/
config/state batch that contains messages.test.ts. This follows the same
isolation pattern already used for command-handler, clone, postgres, and
path-validation tests.
2026-04-17 09:33:52 +02:00
Cole Medin
bed36ca4ad
fix(workflows): add word boundary to context variable substitution regex (#1256)
* fix(workflows): add word boundary to context variable substitution regex (#1112)

Variable substitution for $CONTEXT, $EXTERNAL_CONTEXT, and $ISSUE_CONTEXT
was matching as a prefix of longer identifiers like $CONTEXT_FILE, silently
corrupting bash node scripts. Added negative lookahead (?![A-Za-z0-9_]) to
CONTEXT_VAR_PATTERN_STR so only exact variable names are substituted.
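The fix can be sketched as below — variable names match the commit, but the substitution helper itself is illustrative:

```typescript
const CONTEXT_VARS = ["CONTEXT", "EXTERNAL_CONTEXT", "ISSUE_CONTEXT"];

// The negative lookahead (?![A-Za-z0-9_]) stops $CONTEXT from matching as a
// prefix of longer identifiers like $CONTEXT_FILE.
const pattern = new RegExp(
  `\\$(${CONTEXT_VARS.join("|")})(?![A-Za-z0-9_])`,
  "g",
);

function substitute(script: string, values: Record<string, string>): string {
  return script.replace(pattern, (whole, name: string) => values[name] ?? whole);
}
```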

Changes:
- Add negative lookahead to CONTEXT_VAR_PATTERN_STR regex in executor-shared.ts
- Add regression test for prefix-match boundary case

Fixes #1112

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test(workflows): add missing boundary cases for context variable substitution

Add three new test cases that complete coverage of the word-boundary fix
from #1112: $ISSUE_CONTEXT with suffix variants, $ISSUE_CONTEXT with multiple
suffixes, and contextSubstituted=false for suffix-only prompts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 18:32:06 -05:00
Cole Medin
df828594d7 fix(test): normalize on-disk content to LF in bundled-defaults test
Companion to 75427c7c. The bundle-completeness test compared
BUNDLED_* strings (now LF-normalized by the generator) against raw
readFileSync output, which is CRLF on Windows checkouts. Apply the
same normalization to the on-disk side so the defense-in-depth check
stays meaningful on every platform.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-16 17:59:41 -05:00
Cole Medin
75427c7cdd fix(ci): normalize line endings in bundled-defaults generator
On Windows, `git checkout` converts source files to CRLF via the
`* text=auto` policy. The generator inlined raw file content as JSON
strings, so the Windows regeneration produced `\r\n` escapes while the
committed artifact (written on Linux) used `\n`. `bun run check:bundled`
then flagged the file as stale and failed the Windows CI job.

Fix by normalizing CRLF → LF both when reading source defaults and when
comparing against the existing generated file. No-op on Linux.
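The normalization is one-line; sketched here with an illustrative staleness check (the generator's real comparison logic is more involved):

```typescript
// Normalize CRLF → LF so a Windows checkout (`* text=auto` converts source
// files to CRLF) compares equal to the Linux-written committed artifact.
const normalizeEol = (s: string): string => s.replace(/\r\n/g, "\n");

function isStale(generated: string, existing: string): boolean {
  // Apply the same normalization to BOTH sides, per the commit.
  return normalizeEol(generated) !== normalizeEol(existing);
}
```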

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-16 17:55:24 -05:00
DIY Smart Code
b7b445bd31
Merge pull request #1110 from LocNguyenSGU/fix/issue-1108-settings-add-project-url-support
fix: accept GitHub URLs in settings add project
2026-04-16 23:55:51 +02:00
Leex
9dd57b2f3c fix(web): unify Add Project URL/path classification across UI entry points
Settings → Projects Add Project only submitted { path }, so GitHub URLs
entered there failed even though the API and the Sidebar Add Project
already accepted them. Closes #1108.

Changes:
- Add packages/web/src/lib/codebase-input.ts: shared getCodebaseInput()
  helper returning a discriminated { path } | { url } union (re-exported
  from api.ts for convenience).
- Use the helper from all three Add Project entry points: Sidebar,
  Settings, and ChatPage. Removes three divergent inline heuristics.
- SettingsPage: rename addPath → addValue (state now holds either URL
  or local path) and update placeholder text.
- Tests: cover https://, git@ shorthand, ssh://, git://, whitespace,
  unix/relative/home/Windows/UNC paths.
- Docs: document the unified Add Project entry point in adapters/web.md.

Heuristic flips from "assume URL unless explicitly local" to "assume
local unless explicitly remote" — only inputs starting with https?://,
ssh://, git@, or git:// are sent as { url }; everything else is sent
as { path }. The server already resolves tilde/relative paths.
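The heuristic can be sketched as follows; the prefixes come straight from the commit, while the helper body is a simplified illustration of `getCodebaseInput`:

```typescript
type CodebaseInput = { url: string } | { path: string };

function getCodebaseInput(raw: string): CodebaseInput {
  const value = raw.trim();
  // "Assume local unless explicitly remote": only these prefixes are sent
  // as { url }. Unix, relative, ~, Windows, and UNC paths all fall through
  // to { path }; the server resolves tilde/relative paths.
  if (/^(https?:\/\/|ssh:\/\/|git:\/\/|git@)/.test(value)) {
    return { url: value };
  }
  return { path: value };
}
```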

Co-authored-by: Nguyen Huu Loc <lockbkbang@gmail.com>
2026-04-16 23:43:19 +02:00
Rasmus Widing
86e4c8d605
fix(bundled-defaults): auto-generate import list, emit inline strings (#1263)
* fix(bundled-defaults): auto-generate import list, emit inline strings

Root-cause fix for bundle drift (15 commands + 7 workflows previously
missing from binary distributions) and a prerequisite for packaging
@archon/workflows as a Node-loadable SDK.

The hand-maintained `bundled-defaults.ts` import list is replaced by
`scripts/generate-bundled-defaults.ts`, which walks
`.archon/{commands,workflows}/defaults/` and emits a generated source
file with inline string literals. `bundled-defaults.ts` becomes a thin
facade that re-exports the generated records and keeps the
`isBinaryBuild()` helper.

Inline strings (via JSON.stringify) replace Bun's
`import X from '...' with { type: 'text' }` attributes. The binary build
still embeds the data at compile time, but the module now loads under
Node too — removing SDK blocker #2.

- Generator: `scripts/generate-bundled-defaults.ts` (+ `--check` mode for CI)
- `package.json`: `generate:bundled`, `check:bundled`; wired into `validate`
- `build-binaries.sh`: regenerates defaults before compile
- Test: `bundle completeness` now derives expected set from on-disk files
- All 56 defaults (36 commands + 20 workflows) now in the bundle

* fix(bundled-defaults): address PR review feedback

Review: https://github.com/coleam00/Archon/pull/1263#issuecomment-4262719090

Generator:
- Guard against .yaml/.yml name collisions (previously silent overwrite)
- Add early access() check with actionable error when run from wrong cwd
- Type top-level catch as unknown; print only message for Error instances
- Drop redundant /* eslint-disable */ emission (global ignore covers it)
- Fix misleading CI-mechanism claim in header comment
- Collapse dead `if (!ext) continue` guard into a single typed pass

Scripts get real type-checking + linting:
- New scripts/tsconfig.json extending root config
- type-check now includes scripts/ via `tsc --noEmit -p scripts/tsconfig.json`
- Drop `scripts/**` from eslint ignores; add to projectService file scope

Tests:
- Inline listNames helper (Rule of Three)
- Drop redundant toBeDefined/typeof assertions; the Record<string, string>
  type plus length > 50 already cover them
- Add content-fidelity round-trip assertion (defense against generator
  content bugs, not just key-set drift)

Facade comment: drop dead reference to .claude/rules/dx-quirks.md.

CI: wire `bun run check:bundled` into .github/workflows/test.yml so the
header's CI-verification claim is truthful.

Docs: CLAUDE.md step count four→five; add contributor bullet about
`bun run generate:bundled` in the Defaults section and CONTRIBUTING.md.

* chore(e2e): bump Codex model to gpt-5.2

gpt-5.1-codex-mini is deprecated and unavailable on ChatGPT-account Codex
auth. Plain gpt-5.2 works. Verified end-to-end:

- e2e-codex-smoke: structured output returns {category:'math'}
- e2e-mixed-providers: claude+codex both return expected tokens
2026-04-16 21:27:51 +02:00
Cole Medin
d535c832e3
feat(telemetry): anonymous PostHog workflow-invocation tracking (#1262)
* feat(telemetry): add anonymous PostHog workflow-invocation tracking

Emits one `workflow_invoked` event per run with workflow name/description,
platform, and Archon version. Uses a stable random UUID persisted to
`$ARCHON_HOME/telemetry-id` for distinct-install counting, with
`$process_person_profile: false` to stay in PostHog's anonymous tier.

Opt out with `ARCHON_TELEMETRY_DISABLED=1` or `DO_NOT_TRACK=1`. Self-host
via `POSTHOG_API_KEY` / `POSTHOG_HOST`.

Closes #1261

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(telemetry): stop leaking test events to production PostHog

The `telemetry-id preservation` test exercised the real capture path with
the embedded production key, so every `bun run validate` published a
tombstone `workflow_name: "w"` event. Redirect POSTHOG_HOST to loopback
so the flush fails silently; bump test timeout to accommodate the
retry-then-give-up window.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(telemetry): silence posthog-node stderr leak on network failure

The PostHog SDK's internal logFlushError() writes 'Error while flushing
PostHog' directly to stderr via console.error on any network or HTTP
error, bypassing logger config. For a fire-and-forget telemetry path
this leaked stack traces to users' terminals whenever PostHog was
unreachable (offline, firewalled, DNS broken, rate-limited).

Pass a silentFetch wrapper to the PostHog client that masks failures as
fake 200 responses. The SDK never sees an error, so it never logs.
Original failure is still recorded at debug level for diagnostics.

Side benefit: shutdown is now fast on network failure (no retry loop),
so offline CLI commands no longer hang ~10s on exit.
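The wrapper idea, sketched — names and the exact `Response` construction are illustrative, and the real version records the failure via the module logger rather than a callback:

```typescript
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>;

// Mask network failures as fake 200s so the PostHog SDK never sees an
// error — it neither logs to stderr nor enters its retry loop.
function makeSilentFetch(
  realFetch: FetchLike,
  onDebug: (err: unknown) => void,
): FetchLike {
  return async (url, init) => {
    try {
      return await realFetch(url, init);
    } catch (err) {
      onDebug(err); // keep the original failure visible for diagnostics
      return new Response("{}", { status: 200 });
    }
  };
}
```

Because the SDK believes every flush succeeded, shutdown skips the retry window — which is where the "no ~10s hang on exit" side benefit comes from.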

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(telemetry): make id-preservation test deterministic

Replace the fire-and-forget capture + setTimeout + POSTHOG_HOST-loopback
dance with a direct synchronous call to getOrCreateTelemetryId(). Export
the function with an @internal marker so tests can exercise the id path
without spinning up the PostHog client. No network, no timer, no flake.

Addresses CodeRabbit feedback on #1262.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-16 13:45:55 -05:00
Cole Medin
f1c5dcb231
Merge pull request #1255 from coleam00/feat/e2e-smoke-tests
feat(ci): add E2E smoke test workflows for Claude and Codex
2026-04-16 11:48:15 -05:00
Cole Medin
47be699e00 chore(ci): remove test branch trigger before merge
Removes feat/e2e-smoke-tests from E2E workflow triggers. CI failure
detection verified: red X on run 24522356737 (deliberate bash exit 1),
green on run 24522484762 (reverted), and credit-exhaustion failure also
correctly produced exit 1.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:46:23 -05:00
Cole Medin
2682430543 test(ci): temporarily re-add branch trigger to verify green CI
Will remove feat/e2e-smoke-tests trigger in the final cleanup commit
before merging to dev.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:43:54 -05:00
Cole Medin
7d38716f1f fix(ci): revert deliberate failure, remove test branch trigger
Reverts the injected exit 1 in bash-echo (CI red X confirmed in run
24522356737). Removes feat/e2e-smoke-tests from branch triggers — ready
to merge to dev.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:43:33 -05:00
Cole Medin
367de7a625 test(ci): inject deliberate failure to verify CI red X
Injects exit 1 into e2e-deterministic bash-echo node to prove the engine
fix (failWorkflowRun on anyFailed) propagates to a non-zero CLI exit code
and a red X in GitHub Actions. Will be reverted in the next commit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:40:55 -05:00
Cole Medin
18681701b3 fix(ci): remove command node from Claude smoke test
Command nodes consistently produce zero output and hit the 30s idle
timeout in CI, even with allowed_tools: []. This appears to be a bug
in how command: nodes interact with the Claude CLI subprocess — the
process never emits output. This adds 30s of wasted time to every run.

The simple prompt node already verifies Claude connectivity. Command
file discovery/loading is a deterministic operation that doesn't need
an AI call to validate in a smoke test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:05:48 -05:00
Cole Medin
1c600f2b62 fix(ci): add allowed_tools: [] to command node to prevent 30s hang
The command-test node was missing allowed_tools: [], causing the Claude
CLI to load full tool access. Without tools restricted, the subprocess
hangs after responding. The simple prompt node with allowed_tools: []
completes in 4s — this should match.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:03:04 -05:00
Cole Medin
bf9091159c refactor(ci): strip E2E smoke tests to bare minimum for speed
Claude CLI is extremely slow with structured output (~4 min) and tool use
(~2 min) in CI, making the previous multi-workflow approach take 10+ min.

Radical simplification:
- Remove e2e-all-nodes (redundant with deterministic + claude-smoke)
- Remove e2e-skills-mcp (advanced features too slow for per-commit smoke)
- Remove structured output and tool use from Claude smoke test (too slow)
- Strip Claude smoke to: 1 prompt + 1 command + 1 bash verify node
- Keep mixed providers (simplified: 1 Claude + 1 Codex + bash verify)
- All timeouts reduced to 30s, all job timeouts to 5 min
- Remove MCP test fixtures and e2e-test-skill (no longer needed)

Expected: Claude job ~15s of AI time, Codex ~5s, mixed ~10s

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 10:50:11 -05:00
Cole Medin
4c259e7a0a fix(ci): increase Claude E2E job timeout from 10 to 20 minutes
Claude CLI is slow with structured output and tool use in CI (~4 min for
structured output, ~2 min for tool use). With 3 sequential workflow runs
(claude-smoke, all-nodes, skills-mcp), 10 minutes is insufficient.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 10:46:48 -05:00
Cole Medin
d666b3c7ca fix(ci): resolve 5 E2E smoke test failures from first CI run
- Rename echo-args.py → echo-py.py to avoid duplicate script name conflict
  with echo-args.js (script discovery uses base name, not extension)
- Add CODEX_API_KEY env var to codex and mixed CI jobs (Codex CLI requires
  this, not OPENAI_API_KEY, for headless auth)
- Sequentialize all Claude AI nodes via depends_on chains to prevent
  concurrent CLI subprocess idle timeouts in CI
- Increase idle_timeout from 60s to 120s on all AI nodes for CI headroom
- Override MCP test node to model: sonnet (Haiku doesn't support MCP tool search)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 10:34:57 -05:00
Cole Medin
7d9090678e feat(ci): add E2E smoke test workflows for Claude and Codex providers
Adds real workflow execution to CI, verifying the full engine works
end-to-end with both providers. Organized into 4 tiers: deterministic
(0 API calls), Claude, Codex, and mixed-provider tests.

New workflows:
- e2e-deterministic: bash, script (bun/uv), conditions, trigger rules
- e2e-skills-mcp: skills injection, MCP server, effort, systemPrompt
- Enhanced existing e2e-claude-smoke, e2e-codex-smoke, e2e-mixed-providers
- Fixed e2e-all-nodes (was broken due to script node syntax)

Supporting files:
- e2e-echo-command.md (test command file)
- echo-args.py (Python script for uv runtime test)
- e2e-test-skill/SKILL.md (minimal skill for injection test)
- e2e-filesystem.json (MCP config for filesystem server test)

GitHub Actions: .github/workflows/e2e-smoke.yml
- Runs on push to main/dev only (no PR trigger to avoid API cost abuse)
- Uses haiku (Claude) and gpt-5.1-codex-mini (Codex) for cost efficiency

Closes #1254

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 10:12:06 -05:00
Cole Medin
7721259bdc
fix(core): surface auth errors instead of silently dropping them (#1089)
* fix: surface auth errors instead of silently dropping them (#1076)

When Claude OAuth refresh token is expired, the SDK yields a result chunk
with is_error=true and no session_id. Both handleStreamMode and
handleBatchMode guarded the result branch with `&& msg.sessionId`,
silently dropping the error. Users saw no response at all.

Changes:
- Remove sessionId guard from result branches in orchestrator-agent.ts
- Add isError early-exit that sends error message to user
- Add 4 OAuth patterns to AUTH_PATTERNS in claude.ts and codex.ts
- Add OAuth refresh-token handler to error-formatter.ts
- Add tests for new error-formatter branches

Fixes #1076

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add structured logging to isError path and remove overly broad auth pattern

- Add getLog().warn({ conversationId, errorSubtype }, 'ai_result_error') in both
  handleStreamMode and handleBatchMode isError branches so auth failures are
  visible server-side instead of silently swallowed
- Remove 'access token' from AUTH_PATTERNS in claude.ts and codex.ts; the real
  OAuth refresh error is already covered by 'refresh token' and 'could not be
  refreshed', eliminating false-positive auth classification risk

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: route isError results through classifyAndFormatError with provider-specific messages

The isError path in stream/batch mode used a hardcoded generic message,
bypassing the classifyAndFormatError infrastructure. Now constructs a
synthetic Error from errorSubtype and routes through the formatter.

Error formatter updated with provider-specific auth detection:
- Claude: OAuth token refresh, sign-in expired → guidance to run /login
- Codex: 401 retry exhaustion → guidance to run codex login
- General: tightened patterns (removed broad 'auth error' substring match)

Also persists session ID before early-returning on isError.
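The routing shape, heavily simplified — the pattern list and user-facing strings below are illustrative placeholders, not the real AUTH_PATTERNS or formatter output:

```typescript
// Illustrative auth patterns ('refresh token' / 'could not be refreshed'
// are named in the commit; everything else here is a placeholder).
const AUTH_PATTERNS = ["refresh token", "could not be refreshed"];

function classifyAndFormatError(err: Error, provider: "claude" | "codex"): string {
  const msg = err.message.toLowerCase();
  if (AUTH_PATTERNS.some((p) => msg.includes(p))) {
    return provider === "claude"
      ? "Authentication expired. Run /login to re-authenticate."
      : "Authentication expired. Run `codex login` to re-authenticate.";
  }
  return `AI provider error: ${err.message}`;
}

// The isError branch builds a synthetic Error from the result chunk's
// errorSubtype and reuses the shared formatter instead of a hardcoded
// generic message.
function handleResult(
  isError: boolean,
  errorSubtype: string | undefined,
  provider: "claude" | "codex",
): string | null {
  if (!isError) return null;
  return classifyAndFormatError(new Error(errorSubtype ?? "unknown error"), provider);
}
```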

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:36:40 -05:00
Cole Medin
818854474f
fix(workflows): stop warning about model/provider on loop nodes (#1090)
* fix(workflows): stop warning about model/provider on loop nodes (#1082)

The loader incorrectly classified loop nodes as "non-AI nodes" and warned
that model/provider fields were ignored, even though the DAG executor has
supported these fields on loop nodes since commit 594d5daa.

Changes:
- Add LOOP_NODE_AI_FIELDS constant excluding model/provider from the warn list
- Update loader to use LOOP_NODE_AI_FIELDS for loop node field checking
- Fix BASH_NODE_AI_FIELDS comment that incorrectly referenced loop nodes
- Add tests for loop node model/provider acceptance and unsupported field warnings

Fixes #1082

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(workflows): update stale comment and add LOOP_NODE_AI_FIELDS unit tests

- Update section comment from "bash/loop nodes" to "non-AI nodes" since loop
  nodes do support model/provider (the fix in this PR)
- Export LOOP_NODE_AI_FIELDS from schemas/index.ts alongside BASH/SCRIPT variants
- Add dedicated describe block in schemas.test.ts verifying that model and
  provider are excluded and all other BASH_NODE_AI_FIELDS are still present

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* simplify: merge nodeType and aiFields into a single if/else chain in parseDagNode

Eliminates the separate isNonAiNode predicate and nested ternary for aiFields
selection by combining both into one explicit if/else block — each branch sets
nodeType and aiFields together, removing the need to re-check node type twice.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:19:18 -05:00
Cole Medin
64bdd30ef4
Merge pull request #1066 from coleam00/archon/task-fix-issue-1042
fix: replace Telegraf with grammY to fix Bun TypeError crash
2026-04-16 09:15:56 -05:00
Cole Medin
a5e5d5ceeb fix: address review findings for grammY Telegram adapter
- Fix misleading 'unde***' log when ctx.from is undefined; use 'unknown'
  to match the Slack/Discord adapter pattern
- Log post-startup bot runtime errors before reject() (no-op after
  onStart fires but errors are now visible in logs)
- Add debug log when message is dropped due to no handler registered
- Add stop() unit test to guard against grammY API rename regressions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 09:15:47 -05:00
Cole Medin
da1f8b7d97 fix: replace Telegraf with grammY to fix Bun TypeError crash (#1042)
Telegraf v4's internal `redactToken()` assigns to readonly `error.message`
properties, which crashes under Bun's strict ESM mode. Telegraf is EOL.

Changes:
- Replace `telegraf` dependency with `grammy` ^1.36.0
- Migrate adapter from Telegraf API to grammY API (Bot, bot.api, bot.start)
- Use grammY's `onStart` callback pattern for async polling launch
- Preserve 409 retry logic and all existing behavior
- Update test mocks from telegraf types to grammy types

Fixes #1042

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:15:46 -05:00
Cole Medin
2732288f07
Merge pull request #1065 from coleam00/archon/task-fix-issue-1055
feat(core): inject workflow run context into orchestrator prompt
2026-04-16 07:55:00 -05:00
Cole Medin
b100cd4b48
Merge pull request #1064 from coleam00/archon/task-fix-issue-1054
fix(web): interleave tool calls with text during SSE streaming
2026-04-16 07:44:48 -05:00
Cole Medin
5acf5640c8
Merge pull request #1063 from coleam00/archon/task-fix-issue-1035
fix: archon setup --spawn fails on Windows with spaces in repo path
2026-04-16 07:36:58 -05:00
Cole Medin
68ecb75f0f
Merge pull request #1052 from coleam00/archon/task-fix-github-issue-1775831868291
fix(cli): send workflow dispatch/result messages for Web UI cards
2026-04-16 07:32:52 -05:00
Cole Medin
51b8652d43 fix: complete defensive chaining and add missing test coverage for PR #1052
- Fix half-applied optional chaining in WorkflowProgressCard refetchInterval
  (query.state.data?.run.status → ?.run?.status) preventing TypeError in polling
- Add dispatch-failure test verifying executeWorkflow still runs when
  dispatch sendMessage fails
- Add paused-workflow test proving paused guard fires before summary check
- Strengthen dispatch metadata assertion to verify workerConversationId format

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 07:32:37 -05:00
jinglesthula
3dedc22537
Fix incorrect substep numbering in setup.md (#1013)
Substeps for Step 4 were: 4a, 4b, 5c, 5d

Co-authored-by: Jon Anderson <jonathan.anderson@byu.edu>
2026-04-15 12:15:35 +03:00
Rasmus Widing
882fc58f7c
fix: stop server startup from auto-failing in-flight workflow runs (#1216) (#1231)
* fix: stop server startup from auto-failing in-flight workflow runs (#1216)

`failOrphanedRuns()` at server startup unconditionally flipped every
`running` workflow row to `failed`, including runs actively executing in
another process (CLI / adapters). The dag-executor's between-layer
status check then bailed out of the run, exit code 1 — even though every
node had completed successfully. Same class of bug the CLI already
learned (see comment at packages/cli/src/cli.ts:256-258).

Per the new CLAUDE.md principle "No Autonomous Lifecycle Mutation Across
Process Boundaries", we don't replace the call with a timer-based
heuristic. Instead we remove it and surface running workflows to the
user with one-click actions.

Backend
- `packages/server/src/index.ts` — remove the `failOrphanedRuns()` call
  at startup. Replace with explanatory comment referencing the CLI
  precedent and the CLAUDE.md principle. The function in
  `packages/core/src/db/workflows.ts:911` is preserved for use by the
  explicit `archon workflow cleanup` command.

UI
- `packages/web/src/components/layout/TopNav.tsx` — replace the binary
  pulse dot on the Dashboard nav with a numeric count badge sourced
  from `/api/dashboard/runs` `counts.running`. Hidden when count is 0.
  Same 10s polling interval as before. No animation — a steady factual
  count is honest; a pulse would imply system judgment.

- `packages/web/src/components/dashboard/ConfirmRunActionDialog.tsx`
  (new) — shadcn AlertDialog wrapper for destructive workflow-run
  actions, mirroring the codebase-delete pattern in
  `sidebar/ProjectSelector.tsx`. Caller passes the existing button as
  `trigger` slot; dialog handles open/close via Radix.

- `packages/web/src/components/dashboard/WorkflowRunCard.tsx` — replace
  4 `window.confirm()` callsites (Reject, Abandon, Cancel, Delete) with
  ConfirmRunActionDialog. Each gets a context-appropriate description.

- `packages/web/src/components/dashboard/WorkflowHistoryTable.tsx` —
  replace 1 `window.confirm()` (Delete) with the same dialog.

CHANGELOG entries under [Unreleased]: Fixed for #1216, two Changed
entries for the nav badge and dialog upgrade.

No new tests: the web package has no React component testing
infrastructure (existing `bun test` covers `src/lib/` and `src/stores/`
only). Type-check + lint + manual UI verification + the backend
reproducer are the verification levels.

Closes #1216.

* review: address PR #1231 nits — stale doc + 3 code polish

PR review surfaced one real correctness issue in docs and three small
code polish items. None block merge; addressing for cleanliness.

- packages/docs-web/src/content/docs/guides/authoring-workflows.md:486
  removed the "auto-marked as failed on next startup" paragraph that
  described the now-deleted behavior. Replaced with a "Crashed servers /
  orphaned runs" note pointing users at `archon workflow cleanup` and
  the dashboard Cancel/Abandon buttons; explains the auto-resume
  mechanism still works once the row reaches a terminal status.

- ConfirmRunActionDialog: narrow `onConfirm` from
  `() => void | Promise<void>` to `() => void`. All five callsites are
  synchronous wrappers around React Query mutations whose error
  handling lives at the page level (`runAction` in DashboardPage). The
  union widened the API for no current caller. Documented in the JSDoc
  what to do if an awaiting caller appears later.

- TopNav: dropped the redundant `String(runningCount)` cast in the
  aria-label — template literal coerces. Also rewrote the comment above
  the `listDashboardRuns` query: the previous version implied `limit=1`
  constrained `counts.running`; in fact `counts` is a server-side
  aggregate independent of `limit`, and `limit=1` only minimises the
  `runs` array we discard.

* review: correct remediation docs — cleanup ≠ abandon

CodeRabbit caught a factual error I introduced in the doc update:
`archon workflow cleanup` calls `deleteOldWorkflowRuns(days)` which
DELETEs old terminal rows (`completed`/`failed`/`cancelled` older than
N days) for disk hygiene. It does NOT transition stuck `running` rows.

The correct remediation for a stuck `running` row is either the
dashboard's per-row Cancel/Abandon button (already documented) or
`archon workflow abandon <run-id>` from the CLI (existing subcommand,
see packages/cli/src/cli.ts:366-374).

Fixed three locations:
- packages/docs-web/.../guides/authoring-workflows.md — replaced the
  vague "clean up explicitly" with concrete Web UI / CLI instructions
  and an explicit "Not to be confused with `archon workflow cleanup`"
  callout to close off the ambiguity CodeRabbit flagged.
- packages/server/src/index.ts — comment updated to point at the
  correct remediation (`archon workflow abandon`) and clarify that
  `archon workflow cleanup` is unrelated disk-hygiene.
- CHANGELOG.md — same correction in the [Unreleased] Fixed entry.
2026-04-15 12:05:41 +03:00
Rasmus Widing
5c8c39e5c9
fix(test): update stale mocks in cleanup-service 'continues processing' test (#1230) (#1232)
After PR #1034 changed worktree existence checks from execFileAsync to
fs/promises.access, the mockExecFileAsync rejections had no effect.
removeEnvironment needs getById + getCodebase mocks to proceed past
the early-return guard, otherwise envs route to report.skipped instead
of report.removed.

Replace the two stale mockExecFileAsync rejection calls with proper
mockGetById and mockGetCodebase return values for both test environments.

Fixes #1230
2026-04-15 11:53:02 +03:00
Shane McCarron
f61d576a4d
feat(isolation): auto-init submodules in worktrees (#1189)
Worktrees created via `git worktree add` do not initialize submodules — monorepo workflows that need submodule content find empty directories. Auto-detect `.gitmodules` and run `git submodule update --init --recursive` after worktree creation; classify failures through the isolation error pipeline.

Behavior:
- `.gitmodules` absent → skip silently (zero-cost probe, no effect on non-submodule repos)
- `.gitmodules` present → run submodule init by default (opt out via `worktree.initSubmodules: false`)
- submodule init or `.gitmodules` read failure → throw with classified error including opt-out guidance
- Only `ENOENT` on `.gitmodules` is treated as "no submodules"; other access errors (EACCES, EIO) surface as failures to prevent silent empty-dir worktrees
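The probe-then-init flow described above can be sketched as follows. This is a hypothetical reconstruction from the commit text, not the actual `worktree.ts` code; the function name and signature are assumptions.

```typescript
import { access } from 'node:fs/promises';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import * as path from 'node:path';

const execFileAsync = promisify(execFile);

// Probe .gitmodules, skip silently on ENOENT, init submodules otherwise.
async function maybeInitSubmodules(
  worktreePath: string,
  initSubmodules = true,
): Promise<boolean> {
  if (!initSubmodules) return false; // explicit opt-out via config
  try {
    await access(path.join(worktreePath, '.gitmodules')); // zero-cost probe
  } catch (err) {
    const code = (err as NodeJS.ErrnoException).code;
    if (code === 'ENOENT') return false; // no submodules: skip silently
    throw err; // EACCES/EIO etc. must surface, not produce empty dirs
  }
  await execFileAsync('git', ['submodule', 'update', '--init', '--recursive'], {
    cwd: worktreePath,
  });
  return true;
}
```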

Changes:
- `packages/isolation/src/providers/worktree.ts` — `initSubmodules()` method + call site in `createWorktree()`
- `packages/isolation/src/errors.ts` — collapsed `errorPatterns` + `knownPatterns` into single `ERROR_PATTERNS` source of truth with `known: boolean` per entry; added submodule pattern with opt-out guidance
- `packages/isolation/src/types.ts` + `packages/core/src/config/config-types.ts` — new `initSubmodules?: boolean` config option
- `packages/docs-web/src/content/docs/reference/configuration.md` — documented the new option and submodule behavior
- Tests: default-on, explicit opt-in, explicit opt-out, skip-when-absent, fail-fast on EACCES, fail-fast on git failure, fail-fast on timeout

Credit to @halindrome for the original implementation and root-cause mapping across #1183, #1187, #1188, #1192.

Follow-up: #1192 (codebase identity rearchitect) would retire the cross-clone guard code in `resolver.ts` and `worktree.ts` that #1198, #1206 added. Separate PR.

Closes #1187
2026-04-15 09:48:18 +03:00
Rasmus Widing
c4ab0a2333 docs(claude.md): codify "no autonomous lifecycle mutation across process boundaries"
Generalize the lesson from #1216 (and the CLI precedent at
packages/cli/src/cli.ts:256-258) into a project-wide engineering
principle. When a process cannot reliably distinguish "actively
running elsewhere" from "orphaned by a crash" — typically because the
work was started by a different process or input source (CLI, adapter,
webhook, web UI, cron) — it must not autonomously mutate that work
based on a timer or staleness guess. Surface and ask instead.

Phrased to be specific about what is still allowed: heuristics for
recoverable operations (retry backoff, subprocess timeouts, hygiene
cleanup of terminal-status data) are not banned. The rule targets
destructive mutation of non-terminal state owned by an unknowable
other party.
2026-04-15 09:14:15 +03:00
Kagura
73d9240eb3
fix(isolation): complete reports false success when worktree remains on disk (fixes #964) (#1034)
* fix(isolation): complete reports false success when worktree remains on disk (fixes #964)

Three changes to prevent ghost worktrees:

1. isolationCompleteCommand now checks result.worktreeRemoved — if the
   worktree was not actually removed (partial failure), it reports
   'Partial' with warnings and counts as failed, not completed.
   Previously only skippedReason was checked; a destroy that returned
   successfully but with worktreeRemoved=false would still print
   'Completed'.

2. WorktreeProvider.destroy() now runs 'git worktree prune' after
   removal to clean up stale worktree references that git may keep
   even after the directory is removed.

3. WorktreeProvider.destroy() adds post-removal verification: after
   git worktree remove, it checks 'git worktree list --porcelain' to
   confirm the worktree is actually unregistered. If still registered,
   worktreeRemoved is set back to false with a descriptive warning.
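The verification in item 3 reduces to a line match on `git worktree list --porcelain` output, which emits one `worktree <path>` line per registered entry. A minimal sketch (the helper name is an assumption; the command strings come from the commit text):

```typescript
// After `git worktree remove` and `git worktree prune`, run
// `git worktree list --porcelain` and check whether the path is still listed.
function worktreeStillRegistered(porcelain: string, worktreePath: string): boolean {
  return porcelain
    .split('\n')
    .some((line) => line === `worktree ${worktreePath}`);
}

const sample = [
  'worktree /repo',
  'HEAD abc123',
  '',
  'worktree /repo/.archon/worktrees/feat-x',
  'HEAD def456',
].join('\n');

worktreeStillRegistered(sample, '/repo/.archon/worktrees/feat-x'); // → true
worktreeStillRegistered(sample, '/repo/.archon/worktrees/removed'); // → false
```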

* fix: address CodeRabbit review — ghost worktree prune, partial cleanup callers, accurate messages

* test: add regression test for Partial branch in isolation complete

Exercises the !result.worktreeRemoved path (without skippedReason)
that was flagged as uncovered by CodeRabbit review.
2026-04-14 17:58:45 +03:00
Matt Chapman
28b258286f
Extra backticks for markdown block to fix formatting of nested code blocks (#1218)
2026-04-14 17:58:31 +03:00
Rasmus Widing
81859d6842
fix(providers): replace Claude SDK embed with explicit binary-path resolver (#1217)
* feat(providers): replace Claude SDK embed with explicit binary-path resolver

Drop `@anthropic-ai/claude-agent-sdk/embed` and resolve Claude Code via
CLAUDE_BIN_PATH env → assistants.claude.claudeBinaryPath config → throw
with install instructions. The embed's silent failure modes on macOS
(#1210) and Windows (#1087) become actionable errors with a documented
recovery path.

Dev mode (bun run) remains auto-resolved via node_modules. The setup
wizard auto-detects Claude Code by probing the native installer path
(~/.local/bin/claude), npm global cli.js, and PATH, then writes
CLAUDE_BIN_PATH to ~/.archon/.env. Dockerfile pre-sets CLAUDE_BIN_PATH
so extenders using the compiled binary keep working. Release workflow
gets negative and positive resolver smoke tests.

Docs, CHANGELOG, README, .env.example, CLAUDE.md, test-release and
archon skills all updated to reflect the curl-first install story.

Retires #1210, #1087, #1091 (never merged, now obsolete).
Implements #1176.

* fix(providers): only pass --no-env-file when spawning Claude via Bun/Node

`--no-env-file` is a Bun flag that prevents Bun from auto-loading
`.env` from the subprocess cwd. It is only meaningful when the Claude
Code executable is a `cli.js` file — in which case the SDK spawns it
via `bun`/`node` and the flag reaches the runtime.

When `CLAUDE_BIN_PATH` points at a native compiled Claude binary (e.g.
`~/.local/bin/claude` from the curl installer, which is Anthropic's
recommended default), the SDK executes the binary directly. Passing
`--no-env-file` then goes straight to the native binary, which
rejects it with `error: unknown option '--no-env-file'` and the
subprocess exits code 1.

Emit `executableArgs` only when the target is a `.js` file (dev mode
or explicit cli.js path). Caught by end-to-end smoke testing against
the curl-installed native Claude binary.
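The decision above was later extracted into the pure helper the PR calls `shouldPassNoEnvFile`; this body is a sketch inferred from the commit text:

```typescript
function shouldPassNoEnvFile(cliPath: string | undefined): boolean {
  if (!cliPath) return false;
  // Only a cli.js target is spawned via bun/node, where --no-env-file is a
  // runtime flag. A native binary would reject it with "unknown option".
  return cliPath.toLowerCase().endsWith('.js');
}

shouldPassNoEnvFile('/home/user/.local/bin/claude'); // → false (native binary)
shouldPassNoEnvFile('/repo/node_modules/@anthropic-ai/claude-code/cli.js'); // → true
shouldPassNoEnvFile(undefined); // → false (no explicit path configured)
```

Lowercasing covers Windows paths like `C:\...\CLI.JS`, which the review's suffix-matching tests exercise.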

* docs: record env-leak validation result in provider comment

Verified end-to-end with sentinel `.env` and `.env.local` files in a
workflow CWD that the native Claude binary (curl installer) does not
auto-load `.env` files. With Archon's full spawn pathway and parent
env stripped, the subprocess saw both sentinels as UNSET. The
first-layer protection in `@archon/paths` (#1067) handles the
inheritance leak; `--no-env-file` only matters for the Bun-spawned
cli.js path, where it is still emitted.

* chore(providers): cleanup pass — exports, docs, troubleshooting

Final-sweep cleanup tied to the binary-resolver PR:

- Mirror Codex's package surface for the new Claude resolver: add
  `./claude/binary-resolver` subpath export and re-export
  `resolveClaudeBinaryPath` + `claudeFileExists` from the package
  index. Renames the previously single `fileExists` re-export to
  `codexFileExists` for symmetry; nothing outside the providers
  package was importing it.
- Add a "Claude Code not found" entry to the troubleshooting reference
  doc with platform-specific install snippets and pointers to the
  AI Assistants binary-path section.
- Reframe the example claudeBinaryPath in reference/configuration.md
  away from cli.js-only language; it accepts either the native binary
  or cli.js.

* test+refactor(providers, cli): address PR review feedback

Two test gaps and one doc nit from the PR review (#1217):

- Extract the `--no-env-file` decision into a pure exported helper
  `shouldPassNoEnvFile(cliPath)` so the native-binary branch is unit
  testable without mocking `BUNDLED_IS_BINARY` or running the full
  sendQuery pathway. Six new tests cover undefined, cli.js, native
  binary (Linux + Windows), Homebrew symlink, and suffix-only matching.
  Also adds a `claude.subprocess_env_file_flag` debug log so the
  security-adjacent decision is auditable.

- Extract the three install-location probes in setup.ts into exported
  wrappers (`probeFileExists`, `probeNpmRoot`, `probeWhichClaude`) and
  export `detectClaudeExecutablePath` itself, so the probe order can be
  spied on. Six new tests cover each tier winning, fall-through
  ordering, npm-tier skip when not installed, and the
  which-resolved-but-stale-path edge case.

- CLAUDE.md `claudeBinaryPath` placeholder updated to reflect that the
  field accepts either the native binary or cli.js (the example value
  was previously `/absolute/path/to/cli.js`, slightly misleading now
  that the curl-installer native binary is the default).

Skipped from the review by deliberate scope decision:

- `resolveClaudeBinaryPath` async-with-no-await: matches Codex's
  resolver signature exactly. Changing only Claude breaks symmetry;
  if pursued, do both providers in a separate cleanup PR.
- `isAbsolute()` validation in parseClaudeConfig: Codex doesn't do it
  either. Resolver throws on non-existence already.
- Atomic `.env` writes in setup wizard: pre-existing pattern this PR
  touched only adjacently. File as separate issue if needed.
- classifyError branch in dag-executor for setup errors: scope creep.
- `.env.example` "missing #" claim: false positive (verified all
  CLAUDE_BIN_PATH lines have proper comment prefixes).

* fix(test): use path.join in Windows-compatible probe-order test

The "tier 2 wins (npm cli.js)" test hardcoded forward-slash path
comparisons, but `path.join` produces backslashes on Windows. Caused
the Windows CI leg of the test suite to fail while macOS and Linux
passed. Use `path.join` for both the mock return value and the
expectation so the separator matches whatever the platform produces.
2026-04-14 17:56:37 +03:00
Rasmus Widing
33d31c44f1
fix: lock workflow runs by working_path (#1036, #1188 part 2) (#1212)
* fix: lock workflow runs by working_path (#1036, #1188 part 2)

Both bugs reduce to the same primitive: there's no enforced lock on
working_path, so two dispatches that resolve to the same filesystem
location can race. The DB row is the lock token; pending/running/paused
are "lock held"; terminal statuses release.

Changes:

- getActiveWorkflowRunByPath includes `pending` (with 5-min stale-orphan
  age window), accepts excludeId + selfStartedAt, and orders by
  (started_at ASC, id ASC) for a deterministic older-wins tiebreaker.
  Eliminates the both-abort race where two near-simultaneous dispatches
  with similar timestamps could mutually abort each other.

- Move the executor's guard call site to AFTER workflowRun is finalized
  (preCreated, resumed, or freshly created). This guarantees we always
  have self-ID + started_at to pass to the lock query.

- On guard fire after row creation: mark self as 'cancelled' so we don't
  leave a zombie pending row that would then become its own lock holder.

- New error message includes workflow name, duration, short run id, and
  three concrete next-action commands (status / cancel / different
  branch). Replaces the vague "Workflow already running".

- Resume orphan fix: when executor activates a resumable run, mark the
  orchestrator's pre-created row as 'cancelled'. Without this, every
  resume leaks a pending row that would block the user's own
  back-to-back resume until the 5-min stale window.

- New formatDuration helper for the error message (8 unit tests).
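The lock query the bullets describe has roughly this shape (SQLite dialect; table, column, and placeholder names are assumptions, and this tiebreaker comparison predates the dialect fix made later in this PR):

```typescript
// The DB row is the lock token: running/paused hold it, fresh pending holds
// it within the 5-minute stale-orphan window, terminal statuses release it.
const ACTIVE_RUN_BY_PATH = `
  SELECT * FROM workflow_runs
  WHERE working_path = $path
    AND id != $selfId                          -- excludeId: never block on self
    AND (
      status IN ('running', 'paused')
      OR (status = 'pending'
          AND started_at >= $fiveMinutesAgo)   -- stale-orphan age window
    )
  ORDER BY started_at ASC, id ASC              -- deterministic older-wins
  LIMIT 1
`;
```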

Tests:

- 5 new tests in db/workflows.test.ts: pending in active set, age window,
  excludeId exclusion, tiebreaker SQL shape, ordering.
- 5 new tests in executor.test.ts: self-id passed to query, self-cancel
  on guard fire, new message format, resume orphan cancellation,
  resume proceeds even if orphan cancel fails.
- Updated 2 executor-preamble tests for new structural behavior
  (row-then-guard, new message format).
- 8 new tests for formatDuration.

Deferred (kept scope tight):
- Worktree-layer advisory lockfile (residual #1188.2 microsecond race
  where both dispatches reach provider.create — bounded by git's own
  atomicity for `worktree add`).
- Startup cleanup of pre-existing stale pending rows (5-min age window
  makes them harmless).
- DB partial UNIQUE constraint migration (code-only is sufficient).

Fixes #1036
Fixes #1188 (part 2)

* fix: SQLite Date binding + UTC timestamp parse for path lock guard

Two issues found during E2E smoke testing:

1. bun:sqlite rejects Date objects as bindings ("Binding expected
   string, TypedArray, boolean, number, bigint or null"). Serialize
   selfStartedAt to ISO string before passing — PostgreSQL accepts
   ISO strings for TIMESTAMPTZ comparison too.

2. SQLite returns datetimes as plain strings without timezone suffix
   ("YYYY-MM-DD HH:MM:SS"), and JS new Date() parses such strings as
   local time. The blocking message was showing "running 3h" for
   workflows started seconds ago in a UTC+3 timezone.

   Added parseDbTimestamp helper that:
   - Returns Date.getTime() unchanged for Date inputs (PG path)
   - Treats SQLite-style strings as UTC by appending Z

   Used at both call sites: the lock query (selfStartedAt) and the
   blocking message duration.
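The described behavior can be sketched as below; the real helper lives next to `formatDuration`, and the edge-case handling here is a reconstruction from the commit text:

```typescript
function parseDbTimestamp(value: Date | string): number {
  if (value instanceof Date) return value.getTime(); // PG driver path: unchanged
  // SQLite-style "YYYY-MM-DD HH:MM:SS" has no zone; new Date() would read it
  // as LOCAL time. Normalize to ISO and append Z so it is treated as UTC.
  if (/^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$/.test(value)) {
    return new Date(value.replace(' ', 'T') + 'Z').getTime();
  }
  return new Date(value).getTime(); // already has Z or an explicit offset
}

parseDbTimestamp('2026-04-14 12:00:00') === Date.UTC(2026, 3, 14, 12, 0, 0); // → true
```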

Tests:
- 4 new tests in duration.test.ts for parseDbTimestamp covering
  Date input, SQLite UTC interpretation, explicit Z, and explicit
  +/-HH:MM offsets.
- Updated workflows.test.ts assertion for ISO serialization.

E2E smoke verified end-to-end:
- Sanity (single dispatch) succeeds.
- Two concurrent --no-worktree dispatches: one wins, one blocked
  with actionable message showing correct "Xs" duration.
- Resume + back-to-back resume both succeed (orphan correctly
  cancelled when resume activates).

* fix: address review — resume timestamp, lock-leak paths, status copy

CodeRabbit review on #1212 surfaced three real correctness gaps:

CRITICAL — resumeWorkflowRun preserved historical started_at, letting
a resumed row sort ahead of a currently-active holder in the lock
query's older-wins tiebreaker. Two active workflows could end up on
the same working_path. Fix: refresh started_at to NOW() in
resumeWorkflowRun. Original creation time is recoverable from
workflow_events history if needed for analytics.

MAJOR — lock-leak failure paths:
- If resumeWorkflowRun() throws, the orchestrator's pre-created row
  was left as 'pending' until the 5-min stale window. Fix: cancel
  preCreatedRun in the resume catch.
- If getActiveWorkflowRunByPath() throws, workflowRun (possibly
  already promoted to 'running' via resume) was left active with no
  auto-cleanup. Fix: cancel workflowRun in the guard catch.

MINOR — the blocking message always said "running" but the lock
query returns running, paused, AND fresh-pending rows. Telling a
user to "wait for it to finish" on a paused run (waiting on user
approval) would block them indefinitely. Fix: status-aware copy:
- paused: "paused waiting for user input" + approve/reject actions
- pending: "starting" verb
- running: keep current

Tests:
- New: resume refreshes started_at (asserts SQL contains
  `started_at = NOW()`)
- New: cancels preCreatedRun when resumeWorkflowRun throws
- New: cancels workflowRun when guard query throws
- New: paused message uses approve/reject actions, NOT "wait"
- New: pending message uses "starting" verb
- New: running message uses default copy
- Updated: existing tests for new error string ("already active"
  reflects status-aware semantics, not just "running")

Note: the user-facing error string changed from "already running on
this path" to "already active on this path (status)". Internal use
only — surfaced via getResult().error, not directly to users.

* fix: SQLite tiebreaker dialect bug + paired self struct + UX polish

CodeRabbit second review found one critical issue and several polish
items not addressed in 008013da.

CRITICAL — SQLite tiebreaker silently broken under default deployment.
SQLite stores started_at as TEXT "YYYY-MM-DD HH:MM:SS" (space sep).
Our ISO param is "YYYY-MM-DDTHH:MM:SS.mmmZ" (T sep). SQLite compares
text lexically: char 11 is space (0x20) in column vs T (0x54) in param,
so EVERY column value lex-sorts before EVERY ISO param. Result:
`started_at < $param` is always TRUE regardless of actual time. In
true concurrent dispatches, both sides see each other as "older" and
both abort — defeating the older-wins guarantee under SQLite, which
is the default deployment.

Fix: dialect-aware comparison in getActiveWorkflowRunByPath:
  - PostgreSQL: `started_at < $3::timestamptz` (TIMESTAMPTZ + cast)
  - SQLite: `datetime(started_at) < datetime($3)` (forces chronological
    via SQLite's date/time functions)

Documented with reproducer tests in adapters/sqlite.test.ts: lexical
returns wrong answer for "2026-04-14 12:00:00" < "2026-04-14T10:00:00Z";
datetime() returns correct answer.
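The lexical bug reproduces in plain string comparison, and the fix selects a comparison shape per dialect. A sketch (the dialect flag and placeholder are assumptions; the two comparison shapes come from the commit text):

```typescript
// Char 11 is ' ' (0x20) in the column value but 'T' (0x54) in the ISO param,
// so every column value lex-sorts before every ISO param:
'2026-04-14 12:00:00' < '2026-04-14T10:00:00.000Z'; // → true (noon "older" than 10:00 — wrong)

function olderThanSelfClause(dialect: 'postgres' | 'sqlite'): string {
  return dialect === 'postgres'
    ? 'started_at < $3::timestamptz'            // native TIMESTAMPTZ compare
    : 'datetime(started_at) < datetime($3)';    // force chronological compare
}
```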

Type design — collapse paired params into struct.
`excludeId` and `selfStartedAt` had to travel together (tiebreaker
references both) but were two independent optionals — future callers
could pass one without the other and silently degrade. Replaced with
a single `self?: { id: string; startedAt: Date }` to make the
paired-or-nothing invariant structural.

formatDuration(0) consistency.
Old: `if (ms <= 0) return '0s'` — special-cased 0ms despite the
"sub-second rounds up to 1s" comment. Fixed to `ms < 0` so 0ms
returns '1s' (a run that just started in the same DB second should
display as active, not literal zero).
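A minimal sketch consistent with those semantics (negative clamps to `'0s'`, 0ms and sub-second round up to `'1s'`); the unit breakdown beyond that is assumed:

```typescript
function formatDuration(ms: number): string {
  if (ms < 0) return '0s';
  const totalSec = Math.max(1, Math.ceil(ms / 1000)); // 0ms → 1s, per the fix
  const h = Math.floor(totalSec / 3600);
  const m = Math.floor((totalSec % 3600) / 60);
  const s = totalSec % 60;
  const parts: string[] = [];
  if (h) parts.push(`${h}h`);
  if (m) parts.push(`${m}m`);
  if (s || parts.length === 0) parts.push(`${s}s`);
  return parts.join(' ');
}

formatDuration(0);      // → '1s' (just-started run displays as active)
formatDuration(-5);     // → '0s'
formatDuration(61_000); // → '1m 1s'
```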

Comment fix: "We acquired the lock via createWorkflowRun" was
misleading — createWorkflowRun creates a row; the lock is determined
later by the query.

Log context: added cwd to workflow.guard_self_cancel_failed and
pendingRunId to db_active_workflow_check_failed so operators can
correlate leaked rows.

Doc fixes:
- /workflow abandon doc said "marks as failed" — actually 'cancelled'
- database.md "Prevents concurrent workflow execution" → accurate
  description of path-based lock with stale-pending tolerance

Test additions:
- 3 SQLite-direct tests in adapters/sqlite.test.ts proving the
  lexical-vs-chronological bug and the datetime() fix
- Guard self-cancel update throw still surfaces failure to user

Signature change rippled through:
- IWorkflowStore.getActiveWorkflowRunByPath now takes (path, self?)
- All internal callers updated
2026-04-14 15:19:38 +03:00
Rasmus Widing
5a4541b391
fix: route canonical path failures through blocked classification (#1211)
Follow-up to #1206 review: the early getCanonicalRepoPath() wrap in
resolve() threw directly, escaping the classification flow that
createNewEnvironment uses. Permission errors, malformed worktree
pointers, ENOENT, etc. surfaced as unclassified crashes instead of
becoming an actionable `blocked` result.

Mirror createNewEnvironment's contract:
- isKnownIsolationError → return { status: 'blocked', reason:
  'creation_failed', userMessage: classifyIsolationError(err) + suffix }
- unknown errors → throw (programming bugs stay visible as crashes,
  not silent isolation failures)

Adds two tests in resolver.test.ts:
- EACCES classifies to "Permission denied" blocked message
- Unknown error propagates as throw

Addresses CodeRabbit review comment on #1206.
2026-04-14 15:19:13 +03:00
Rasmus Widing
fd3f043125
fix: extend worktree ownership guard to resolver adoption paths (#1206)
* fix: extend worktree ownership guard to resolver adoption paths (#1183, #1188)

PR #1198 guarded WorktreeProvider.findExisting(), but IsolationResolver
has three earlier adoption paths that bypass the provider layer:

- findReusable (DB lookup by workflow identity)
- findLinkedIssueEnv (cross-reference via linked issues)
- tryBranchAdoption (PR branch discovery)

Two clones of the same remote share codebase_id (identity is derived
from owner/repo). Without these guards, clone B silently adopts clone
A's worktree via any of the three paths.

Changes:
- Extract verifyWorktreeOwnership from WorktreeProvider (private) to
  @archon/git/src/worktree.ts as an exported function, sitting next to
  getCanonicalRepoPath which parses the same .git file format
- Call the shared function from all three resolver paths; throw on
  cross-clone mismatch (DB rows are preserved — they legitimately
  belong to the other clone)
- Compute canonicalRepoPath once at the top of resolve()
- Six new tests in resolver.test.ts covering each guarded path's
  cross-checkout and same-clone behaviors

Fixes #1183
Fixes #1188 (part 1 — cross-checkout; part 2 parallel collision deferred
to follow-up alongside #1036)

* fix: address PR review — polish, observability, secondary gap, docs

Addresses the multi-agent review on #1206:

Code fixes:
- worktree.adoption_refused_cross_checkout log event renamed to match
  CLAUDE.md {domain}.{action}_{state} convention
- verifyWorktreeOwnership now preserves err.code and err via { cause }
  when wrapping fs errors, so classifyIsolationError is robust to Node
  message format changes
- Structured fields (codebaseId, canonicalRepoPath) added to all
  cross-clone rejection logs for incident debugging
- Wrap getCanonicalRepoPath at top of resolve() with classified error
  instead of letting it propagate as an unclassified crash
- Extract assertWorktreeOwnership helper on IsolationResolver —
  centralizes warn-then-rethrow contract, removes duplication
- Dedupe toWorktreePath(existing.working_path) calls in resolver paths
- Add code comment on findLinkedIssueEnv explaining why throw-on-first
  is intentional (user decision — surfaces anomaly instead of masking)

Secondary gap closed:
- WorktreeProvider.findExisting PR-branch adoption path
  (findWorktreeByBranch) now also verifies ownership — same class of
  bug as the main path, just via a different lookup

Tests:
- 8 new unit tests for verifyWorktreeOwnership in @archon/git
  (matching pointer, different clone, EISDIR/ENOENT errno preservation,
  submodule pointer, corrupted .git, trailing-slash normalization,
  cause chain)
- tryBranchAdoption cross-clone test now asserts store.create was
  never called (symmetry with paths 1+2 asserting updateStatus)
- New test for cross-clone rejection in the PR-branch-adoption
  secondary path in worktree.test.ts

Docs:
- CHANGELOG.md Unreleased entry for the cross-clone fix series
- troubleshooting.md "Worktree Belongs to a Different Clone" section
  documenting all four new error patterns with resolution steps and
  pointer to #1192 for the architectural fix

* fix(git): use raw .git pointer in cross-clone error message

verifyWorktreeOwnership previously called path.resolve() on the gitdir
path before embedding it in the error message. On Windows, resolve()
prepends a drive letter to a POSIX-style path (e.g., /other/clone →
C:\other\clone), which:

1. Misled users by showing a path that doesn't match what's actually
   in their .git file
2. Broke a Windows-only test asserting the error contains the literal
   /other/clone path

Compare on resolved paths (correct — normalizes trailing slashes and
relative components for the equality check) but display the raw match
in the error message (recognizable to the user).
2026-04-14 12:10:19 +03:00
Rasmus Widing
af9ed84157
fix: prevent worktree isolation bypass via prompt and git-level adoption (#1198)
* fix: prevent worktree isolation bypass via prompt and git-level adoption (#1193, #1188)

Three fixes for workflows operating on wrong branches:

- archon-implement prompt: replace ambiguous branch table with decision
  tree that trusts the worktree isolation system, uses $BASE_BRANCH
  explicitly, and instructs AI to never switch branches
- WorktreeProvider.findExisting: verify worktree's parent repo matches
  the request before adopting, preventing cross-clone adoption
- WorktreeProvider.createNewBranch: reset stale orphan branches to the
  intended start-point instead of silently inheriting old commits

Fixes #1193
Relates to #1188

* fix: address PR review — strict worktree verification, align sibling prompts

Address CodeRabbit + self-review findings on #1198:

Code fixes:
- findExisting now throws on cross-checkout or unverifiable state instead of
  returning null, avoiding a confusing cascade through createNewBranch
- verifyWorktreeOwnership handles .git errors precisely: ENOENT/EACCES/EIO
  throw a fail-fast error; EISDIR (full checkout at path) throws a clear
  "not a worktree" error; unmatched gitdir (submodule, malformed) throws
- Path comparison uses resolve() to normalize trailing slashes
- Added classifyIsolationError patterns so new errors produce actionable
  user messages

Test fixes:
- mockClear readFile/rm in afterEach
- New tests: cross-checkout throws, EISDIR throws, EACCES throws,
  submodule pointer throws, trailing-slash normalization, branch -f
  reset failure propagates without retry
- Updated existing tests that relied on permissive adoption to provide
  valid matching gitdir

Prompt fixes (sweep of all default commands):
- archon-implement.md: clarify "never switch branches" applies to worktree
  context; non-worktree branch creation still allowed
- archon-fix-issue.md + archon-implement-issue.md: aligned decision tree
  with archon-implement pattern; use $BASE_BRANCH instead of MAIN/MASTER
- archon-plan-setup.md: converted table to ordered decision tree with
  IN WORKTREE? first; removed ambiguous "already on correct feature
  branch" row
2026-04-14 09:44:12 +03:00
Rasmus Widing
d6e24f5075
feat: Phase 2 — community-friendly provider registry system (#1195)
* feat: replace hardcoded provider factory with typed registry system

Replace the built-in-only factory switch with a typed ProviderRegistration
registry where entries carry metadata (displayName, capabilities,
isModelCompatible) alongside the factory function. This enables community
providers to register without modifying core code.

- Add ProviderRegistration and ProviderInfo types to contract layer
- Create registry.ts with register/get/list/clear API, delete factory.ts
- Bootstrap registerBuiltinProviders() at server and CLI entrypoints
- Widen provider unions from 'claude' | 'codex' to string across schemas,
  config types, deps, executors, and API validation
- Replace hardcoded model-validation with registry-driven isModelCompatible
  and inferProviderFromModel (built-in only inference)
- Add GET /api/providers endpoint returning registry metadata
- Dynamic provider dropdowns in Web UI (BuilderToolbar, NodeInspector,
  WorkflowBuilder, SettingsPage) via useProviders hook
- Dynamic provider selection in CLI setup command
- Registry test suite covering full lifecycle
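The registry surface the bullets describe (register/get/list/clear, with metadata carried alongside the factory) can be sketched as below. Field and function names beyond those named in the commit are assumptions:

```typescript
interface ProviderRegistration {
  id: string;
  displayName: string;
  builtIn: boolean;
  isModelCompatible?: (model: string) => boolean;
  factory: () => unknown; // returns the provider instance in the real code
}

const providers = new Map<string, ProviderRegistration>();

function registerProvider(reg: ProviderRegistration): void {
  providers.set(reg.id, reg);
}
function getProvider(id: string): ProviderRegistration {
  const reg = providers.get(id);
  if (!reg) throw new Error(`Unknown provider: ${id}`); // fail-fast, per review
  return reg;
}
function listProviders(): ProviderRegistration[] {
  return [...providers.values()];
}
function clearProviders(): void {
  providers.clear();
}

// A community provider registers without touching core:
registerProvider({
  id: 'my-llm',
  displayName: 'My LLM',
  builtIn: false,
  isModelCompatible: (m) => m.startsWith('my-llm/'),
  factory: () => ({}),
});
```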

* feat: generalize assistant config and tighten registry validation

- Add ProviderDefaults/ProviderDefaultsMap generic types to contract layer
- Add index signatures to ClaudeProviderDefaults/CodexProviderDefaults
- Introduce AssistantDefaults/AssistantDefaultsConfig intersection types
  that combine ProviderDefaultsMap with typed built-in entries
- Replace hardcoded claude/codex config merging with generic
  mergeAssistantDefaults() that iterates all provider entries
- Replace hardcoded toSafeConfig projection with generic
  toSafeAssistantDefaults() that strips server-internal fields
- Validate provider strings at all config-entry surfaces: env override,
  global config, repo config all throw on unknown providers
- Validate provider on PATCH /api/config/assistants (400 on unknown)
- Move validator.ts from hardcoded Codex checks to capability-driven
  warnings using registry getProviderCapabilities()
- Remove resolveProvider() default to 'claude' — returns undefined when
  no provider is set, skipping capability warnings for unresolved nodes
- Widen config API schemas to generic Record<string, ProviderDefaults>
- Rewrite SettingsPage to iterate providers dynamically with built-in
  specific UI for Claude/Codex and generic JSON view for community
- Extract bootstrap to provider-bootstrap modules in CLI and server
- Remove all as Record<...> casts from dag-executor, executor,
  orchestrator — clean indexing via ProviderDefaultsMap intersection

* fix: remove remaining hardcoded provider assumptions and regenerate types

- Replace hardcoded 'claude' defaults in CLI setup with registry lookup
  (getRegisteredProviders().find(p => p.builtIn)?.id)
- Replace hardcoded 'claude' default in clone.ts folder detection with
  registry-driven fallback
- Update config YAML comment from "claude or codex" to "registered provider"
- Make bootstrap test assertions use toContain instead of exact toEqual
  so they don't break when community providers are registered
- Widen validator.test.ts helper from 'claude' | 'codex' to string
- Remove unnecessary type casts in NodeInspector, WorkflowBuilder,
  SettingsPage now that generated types use string
- Regenerate api.generated.d.ts from updated OpenAPI spec — all provider
  fields are now string instead of 'claude' | 'codex' union

* fix: address PR review findings — consistency, tests, docs

Critical fixes:
- isModelCompatible now throws on unknown providers (fail-fast parity
  with getProviderCapabilities) instead of silently returning true
- Schema provider fields use z.string().trim().min(1) to reject
  whitespace-only values
- validator.ts resolveProvider accepts defaultProvider param so
  capability warnings fire for config-inherited providers
- PATCH /api/config/assistants validates assistants keys against
  registry (rejects unknown provider IDs in the map)

YAGNI cleanup:
- Delete provider-bootstrap.ts wrappers in CLI and server — call
  registerBuiltinProviders() directly
- Remove no-op .map(provider => provider) in SettingsPage

Test coverage:
- Add GET /api/providers endpoint tests (shape, projection, capabilities)
- Add config-loader throw-path tests for unknown providers in env var,
  global config, and repo config
- Add isModelCompatible throw test for unknown providers

Docs:
- CLAUDE.md: factory.ts → registry.ts in directory tree, add
  GET /api/providers to API endpoints section
- .env.example: update DEFAULT_AI_ASSISTANT comment
- docs-web configuration reference: update provider constraint docs

UI:
- Settings default-assistant dropdown uses allProviderEntries fallback
  (no longer silently empty on API failure)
- clearRegistry marked @internal in JSDoc

* fix: use registry defaults in getDefaults/registerProject, document type design

- getDefaults() initializes assistant defaults from registered providers
  instead of hardcoding { claude: {}, codex: {} }
- getDefaults() uses first registered built-in as default assistant
  instead of hardcoding 'claude'
- handleRegisterProject uses config.assistant instead of hardcoded 'claude'
  for new codebase ai_assistant_type
- Document AssistantDefaults/AssistantDefaultsConfig intersection types:
  built-in keys are typed for parseClaudeConfig/parseCodexConfig type
  safety; community providers use the generic [string] index
- Document WorkflowConfig.assistants intersection type with same rationale

* docs: update stale provider references to reflect registry system

- architecture.md: DB schema comment now says 'registered provider'
- first-workflow.md: provider field accepts any registered provider
- quick-reference.md: provider type changed from enum to string
- authoring-workflows.md: provider type changed from enum to string
- title-generator.ts: @param doc updated from 'claude or codex' to
  generic provider identifier

* docs: fix remaining stale provider references in quick-reference and authoring guide

- quick-reference.md: per-node provider type changed from enum to string
- quick-reference.md: model mismatch guidance updated for registry pattern
- authoring-workflows.md: provider comment says 'any registered provider'
2026-04-13 21:27:11 +03:00
Rasmus Widing
b5c5f81c8a
refactor: extract provider metadata seam for Phase 2 registry readiness (#1185)
* refactor: extract provider metadata seam for Phase 2 registry readiness

- Add static capability constants (capabilities.ts) for Claude and Codex
- Export getProviderCapabilities() from @archon/providers for capability
  queries without provider instantiation
- Add inferProviderFromModel() to model-validation.ts, replacing three
  copy-pasted inline inference blocks in executor.ts and dag-executor.ts
- Replace throwaway provider instantiation in dag-executor with static
  capability lookup (getProviderCapabilities)
- Add orchestrator warning when env vars are configured but provider
  doesn't support envInjection

* refactor: address LOW findings from code review

- Remove CLAUDE_CAPABILITIES/CODEX_CAPABILITIES from public index (YAGNI —
  callers should use getProviderCapabilities(), not raw constants)
- Remove dead _deps parameter from resolveNodeProviderAndModel and its
  two call-sites (no longer needed after static capability lookup refactor)
- Update factory.ts module JSDoc to mention both exported functions
- Add edge-case tests for getProviderCapabilities: empty string and
  case-sensitive throws (parity with existing getAgentProvider tests)
- Add test for inferProviderFromModel with empty string (returns default,
  documenting the falsy-string shortcut)
2026-04-13 16:10:48 +03:00
Rasmus Widing
bf20063e5a
feat: propagate managed execution env to all workflow surfaces (#1161)
* Implement managed execution env propagation

* Address managed env review feedback
2026-04-13 15:21:57 +03:00
Rasmus Widing
a8ac3f057b
security: prevent target repo .env from leaking into subprocesses (#1135)
Remove the entire env-leak scanning/consent infrastructure: scanner,
allow_env_keys DB column usage, allow_target_repo_keys config, PATCH
consent route, --allow-env-keys CLI flag, and UI consent toggle.

The env-leak gate was the wrong primitive. Target repo .env protection
is already structural:
- stripCwdEnv() at boot removes Bun-auto-loaded CWD .env keys
- Archon loads its own env sources afterward (~/.archon/.env)
- process.env is clean before any subprocess spawns
- Managed env injection (config.yaml env: + DB vars) is unchanged

No scanning, no consent, no blocking. Any repo can be registered and
used. Subprocesses receive the already-clean process.env.
2026-04-13 13:46:24 +03:00
Rasmus Widing
c9c6ab47cb test: add comprehensive e2e smoke test workflows
- e2e-all-nodes: exercises bash, prompt, script (bun), structured output,
  model override (haiku), effort control, and $nodeId.output refs
- e2e-mixed-providers: tests Claude + Codex in the same workflow with
  cross-provider output references
- echo-args.js: simple script node test helper
2026-04-13 11:26:05 +03:00
Rasmus Widing
37aeadb8c8
refactor: decompose provider sendQuery() into explicit helper boundaries (#1162)
* refactor: decompose provider sendQuery() into explicit helper boundaries (#1139)

sendQuery() in both Claude and Codex providers was a monolith mixing SDK option
building, nodeConfig translation, stream normalization, and error classification.
This makes it hard to safely extend for Phase 2 provider extensibility.

Decompose both providers into focused internal helpers:

Claude:
- buildBaseClaudeOptions: SDK option construction
- buildToolCaptureHooks: PostToolUse/PostToolUseFailure hook setup
- applyNodeConfig: workflow nodeConfig → SDK translation + structured warnings
- streamClaudeMessages: raw SDK event → MessageChunk normalization
- classifyAndEnrichError: error classification with retry decisions

Codex:
- buildTurnOptions: per-turn option construction (output schema, abort)
- streamCodexEvents: raw SDK event → MessageChunk normalization
- classifyAndEnrichCodexError: error classification with retry decisions

Also introduces ProviderWarning { code, message } replacing raw string warnings
for machine-readable provider translation warnings.

Adds 43 focused unit tests covering the extracted helpers directly.

Fixes #1139

* fix: export ToolResultEntry type used in public buildBaseClaudeOptions API

* fix: unexport internal helpers to prevent API surface leakage, fix retry state bug

Review findings:
1. Internal helpers were exported and reachable through package.json subpath
   exports (./claude/provider, ./codex/provider), widening the public API.
   All new helpers are now file-local — the only public exports remain
   ClaudeProvider, CodexProvider, loadMcpConfig, buildSDKHooksFromYAML,
   withFirstMessageTimeout, getProcessUid.

2. Codex streamState (lastTodoListSignature) was shared across retry
   attempts, causing todo-list dedup to suppress output on retry.
   Now creates fresh state per attempt.

Removed direct helper test imports — existing sendQuery e2e tests
(51 Claude + 42 Codex) cover all behavior paths.

* fix: address review findings — abort handling, retry bugs, error swallowing

Fixes from CodeRabbit + multi-agent review:

1. classifyAndEnrichError preserves first-event timeout diagnostic instead
   of collapsing it into generic "Query aborted" (the timeout aborts the
   controller, but the original error carries the #1067 breadcrumb)

2. nodeConfigWarnings emitted once before retry loop, not per attempt

3. buildSubprocessEnv() called once before retry loop (was re-logging
   auth mode and rebuilding { ...process.env } per attempt)

4. Abort signal listener registered once with forwarding to current
   controller (was accumulating per-retry listeners)

5. PostToolUse hook wrapped in try/catch (JSON.stringify can throw on
   circular refs — was asymmetric with PostToolUseFailure which had it)

6. Codex streamCodexEvents throws on abort instead of silent break
   (callers were getting truncated stream with no result/error)

7. Both providers store enrichedError (not raw error) for retry
   exhaustion — preserves stderr context in final throw

8. Log is_error result events at error level in Claude stream normalizer

* test: add black-box behavioral tests for sendQuery decomposition fixes

Restore test coverage for the specific fixes from the decomposition review,
exercised through sendQuery (black-box) since helpers are file-local:

Claude (6 tests):
- Timeout error preserved (not collapsed into "Query aborted")
- nodeConfig warnings emitted once even when retries occur
- Abort signal cancels across retries via single forwarding listener
- Enriched error (with stderr) thrown at retry exhaustion
- PostToolUse hook handles circular reference without crashing
- is_error result events logged at error level

Codex (3 tests):
- Abort signal throws instead of silently truncating stream
- Enriched error thrown at retry exhaustion
- Todo-list dedup state resets between retry attempts
2026-04-13 11:24:36 +03:00
Rasmus Widing
6a6740af38
fix: make env-integration test cross-platform (Windows CI) (#1160)
* fix: make env-integration test cross-platform (Windows CI)

Check for Windows env var equivalents (Path instead of PATH,
USERPROFILE instead of HOME) in scenario 3 assertions.

Closes #1128

* fix: Windows PATH/HOME casing in provider subprocess env test

Same cross-platform fix for ClaudeProvider test — spread objects
lose Windows case-insensitive behavior (Path vs PATH, USERPROFILE
vs HOME).
2026-04-13 09:44:58 +03:00
Rasmus Widing
c1ed76524b
refactor: extract providers from @archon/core into @archon/providers (#1137)
* refactor: extract providers from @archon/core into @archon/providers

Move Claude and Codex provider implementations, factory, and SDK
dependencies into a new @archon/providers package. This establishes a
clean boundary: providers own SDK translation, core owns business logic.

Key changes:
- New @archon/providers package with zero-dep contract layer (types.ts)
- @archon/workflows imports from @archon/providers/types — no mirror types
- dag-executor delegates option building to providers via nodeConfig
- IAgentProvider gains getCapabilities() for provider-agnostic warnings
- @archon/core no longer depends on SDK packages directly
- UnknownProviderError standardizes error shape across all surfaces

Zero user-facing changes — same providers, same config, same behavior.

* refactor: remove config type duplication and backward-compat re-exports

Address review findings:
- Move ClaudeProviderDefaults and CodexProviderDefaults to the
  @archon/providers/types contract layer as the single source of truth.
  @archon/core/config/config-types.ts now imports from there.
- Remove provider re-exports from @archon/core (index.ts and types/).
  Consumers should import from @archon/providers directly.
- Update @archon/server to depend on @archon/providers for MessageChunk.

* refactor: move structured output validation into providers

Each provider now normalizes its own structured output semantics:
- Claude already yields structuredOutput from the SDK's native field
- Codex now parses inline agent_message text as JSON when outputFormat
  is set, populating structuredOutput on the result chunk

This eliminates the last provider === 'codex' branch from dag-executor,
making it fully provider-agnostic. The dag-executor checks structuredOutput
uniformly regardless of provider.

Also removes the ClaudeCodexProviderDefaults deprecated alias — all
consumers now use ClaudeProviderDefaults directly.

* fix: address PR review — restore warnings, fix loop options, cleanup

Critical fixes:
- Restore MCP missing env vars user-facing warning (was silently dropped)
- Restore Haiku + MCP tool search warning
- Fix buildLoopNodeOptions to pass workflow-level nodeConfig (effort,
  thinking, betas, sandbox were silently lost for loop nodes)
- Add TODO(#1135) comments documenting env-leak gate gap

Cleanup:
- Remove backward-compat type aliases from deps.ts (keep WorkflowTokenUsage)
- Remove 26 unnecessary eslint-disable comments from test files
- Trim internal helpers from providers barrel (withFirstMessageTimeout,
  getProcessUid, loadMcpConfig, buildSDKHooksFromYAML)
- Add @archon/providers dep to CLI package.json
- Fix 8 stale documentation paths pointing to deleted core/src/providers/
- Add E2E smoke test workflows for both Claude and Codex providers

* fix: forward provider system warnings to users in dag-executor

The dag-executor only forwarded system chunks starting with
"MCP server connection failed:" — all other provider warnings
(missing env vars, Haiku+MCP, structured output issues) were
logged but never reached the user.

Now forwards all system chunks starting with ⚠️ (the prefix
providers use for user-actionable warnings).

* fix: add providers package to Dockerfile and fix CI module resolution

- Add packages/providers/ to all three Dockerfile stages (deps,
  production package.json copy, production source copy)
- Replace wildcard export map (./*) with explicit subpath entries
  to fix module resolution in CI (bun workspace linking)

* chore: update bun.lock for providers package exports
2026-04-13 09:21:36 +03:00
Rasmus Widing
eb75ab60e5
Merge pull request #1130 from coleam00/rules-cleanup
docs: consolidate Claude guidance into CLAUDE.md
2026-04-12 20:31:49 +03:00
Rasmus Widing
39c6f05bad docs: consolidate Claude guidance into CLAUDE.md 2026-04-12 20:21:16 +03:00
Rasmus Widing
a4242e6b49
Merge pull request #1116 from coleam00/rename-iassistantclient-to-iagentprovider
refactor: rename IAssistantClient to IAgentProvider
2026-04-12 20:02:49 +03:00
Rasmus Widing
a7b3b94388 refactor: simplify provider rename follow-through
- ProviderDefaults → CodexProviderDefaults (symmetric with ClaudeProviderDefaults)
- Fix stale "AI client" comments in orchestrator-agent.ts and orchestrator.test.ts
- Remove dead createMockAgentProvider in test/mocks/streaming.ts (zero importers, wrong method names)
- Fix irregular whitespace in .claude/rules/workflows.md
2026-04-12 13:51:45 +03:00
Rasmus Widing
b9a70a5d17 refactor: complete provider rename in config types, logger domains, and docs
- AssistantDefaults → ProviderDefaults, ClaudeAssistantDefaults → ClaudeProviderDefaults
- Logger domains: client.claude → provider.claude, client.codex → provider.codex
- Fix stale JSDoc, error messages, and references in architecture docs, CHANGELOG, testing rules
2026-04-12 13:47:05 +03:00
Rasmus Widing
91c184af57 refactor: rename IAssistantClient to IAgentProvider
Rename the core AI provider interface and all related types, classes,
factory functions, and directory from clients/ to providers/.

Rename map:
- IAssistantClient → IAgentProvider
- ClaudeClient → ClaudeProvider
- CodexClient → CodexProvider
- getAssistantClient → getAgentProvider
- AssistantRequestOptions → AgentRequestOptions
- IWorkflowAssistantClient → IWorkflowAgentProvider
- AssistantClientFactory → AgentProviderFactory
- WorkflowAssistantOptions → WorkflowAgentOptions
- packages/core/src/clients/ → packages/core/src/providers/

NOT renamed (user-facing/DB-stored): assistant config key,
DEFAULT_AI_ASSISTANT env var, ai_assistant_type DB column.

No behavioral changes — purely naming.
2026-04-12 13:11:21 +03:00
github-actions[bot]
c2089117fa chore: update Homebrew formula for v0.3.6 2026-04-12 09:19:27 +00:00
Rasmus Widing
59cda08efa
Merge pull request #1114 from coleam00/dev
Release 0.3.6
2026-04-12 12:17:34 +03:00
Rasmus Widing
883d1369f4 Release 0.3.6 2026-04-12 12:16:49 +03:00
Cole Medin
6da994815c
fix: strip CWD .env leak, remove subprocess allowlist, add first-event timeout (#1067, #1030, #1098, #1070)
* fix: strip CWD .env leak, enable platform adapters in serve, add first-event timeout (#1067)

Three bugs fixed: (1) Bun auto-loads CWD .env files before user code, leaking
non-overlapping keys into the Archon process — new stripCwdEnv() boot import
removes them before any module reads env. (2) archon serve hardcoded
skipPlatformAdapters:true, preventing Slack/Telegram/Discord from starting.
(3) Claude SDK query had no first-event timeout, causing silent 30-min hangs
when the subprocess wedges — new withFirstMessageTimeout wrapper races the
first event against a configurable deadline (default 60s).

Changes:
- Add @archon/paths/strip-cwd-env and strip-cwd-env-boot modules
- Import boot module as first import in CLI entry point
- Remove skipPlatformAdapters: true from serve.ts
- Add withFirstMessageTimeout + diagnostics to ClaudeClient
- Add CLAUDECODE=1 nested-session warning to CLI
- Add 9 unit tests (6 strip-cwd-env + 3 timeout)

Fixes #1067

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address review findings for PR #1092

Fixed:
- Clear setTimeout timer in withFirstMessageTimeout finally block (HIGH-1)
- Add strip-cwd-env-boot to server/src/index.ts for direct dev:server path (MEDIUM-1)
- Warn to stderr on non-ENOENT errors in stripCwdEnv (MEDIUM-2)
- Update stale configuration.md docs for new env-loading mechanism (HIGH-2)
- Add ARCHON_CLAUDE_FIRST_EVENT_TIMEOUT_MS and ARCHON_SUPPRESS_NESTED_CLAUDE_WARNING env vars to docs
- Add nested Claude Code hang troubleshooting entry
- Fix boot module JSDoc: "CLI and server" → "CLI" only
- Fix stripCwdEnv JSDoc: remove stale "override: true" reference
- Update .claude/rules/cli.md startup behavior section
- Update CLAUDE.md @archon/paths description with new exports

Tests added:
- Assert controller.signal.aborted on timeout
- Handle generator that completes immediately without yielding
- Strip distinct keys from different .env files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* simplify: replace string sentinel with typed error class in withFirstMessageTimeout

Replace the '__timeout__' string sentinel used to identify timeout rejections
with a dedicated FirstEventTimeoutError class. instanceof checks are more
explicit and robust than string comparison on error messages.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review findings — dotenv version, docs, server warning, marker strip, tests

1. Align dotenv to ^17 (was ^16, rest of monorepo uses ^17.2.3)
2. Remove incorrect SUBPROCESS_ENV_ALLOWLIST claim from docs — the SDK
   bypasses the env option and uses process.env directly (#1097)
3. Add CLAUDECODE=1 warning to server entry point (was only in CLI)
4. Add diagnostic payload content test for withFirstMessageTimeout
5. Integrate #1097's finding: strip CLAUDECODE + CLAUDE_CODE_* session
   markers (except auth vars) + NODE_OPTIONS + VSCODE_INSPECTOR_OPTIONS
   from process.env at entry point. Pattern-matched on CLAUDE_CODE_*
   prefix rather than hardcoding 6 names, so future Claude Code markers
   are handled automatically. Auth vars (CLAUDE_CODE_OAUTH_TOKEN,
   CLAUDE_CODE_USE_BEDROCK, CLAUDE_CODE_USE_VERTEX) are preserved.

   Root cause per #1097: the Claude Agent SDK leaks process.env into the
   spawned child regardless of the explicit env option, so the only way
   to prevent the nested-session deadlock is to delete the markers from
   process.env at the entry point.

Validation: bun run validate passes, 125 paths tests (6 new marker
tests), 60 claude tests (1 new diagnostic test), DATABASE_URL leak
verified stripped (target repo .env DATABASE_URL does not affect Archon
DB selection).

* refactor: remove SUBPROCESS_ENV_ALLOWLIST — trust user config, strip only CWD

The allowlist was wrong for a single-developer tool:
- It blocked keys the user intentionally set in ~/.archon/.env
  (ANTHROPIC_API_KEY, AWS_*, CLAUDE_CONFIG_DIR, MiniMax vars, etc.)
- It was bypassed by the SDK anyway (process.env leaks to subprocess
  regardless of the env option — see #1097)
- It attracted a constant stream of PRs adding keys (#1060, #1093, #1099)

New model: CWD .env keys are the only untrusted source. stripCwdEnv()
at entry point handles that. Everything in ~/.archon/.env + shell env
passes through to the subprocess. No filtering, no second-guessing.

Changes:
- Delete env-allowlist.ts and env-allowlist.test.ts
- Simplify buildSubprocessEnv() to return { ...process.env } with
  auth-mode logging (no token stripping — user controls their config)
- Replace 4 allowlist-based tests with 1 pass-through test
- Remove env-allowlist.test.ts from core test batch
- Update security.md and cli.md docs to reflect the new model

The CLAUDECODE + CLAUDE_CODE_* marker strip and NODE_OPTIONS strip
remain in stripCwdEnv() at entry point — those are process-level
safety (not per-subprocess filtering) and are needed regardless.

* fix: restore override:true for archon env, add integration tests

The integration tests caught a real issue: without override:true, the
~/.archon/.env load doesn't win over shell-inherited env vars. If the
user's shell profile exports PORT=9999 and ~/.archon/.env has PORT=3000,
the user expects Archon to use 3000.

stripCwdEnv() handles CWD .env files (untrusted). override:true handles
shell-inherited vars (trusted but less specific than ~/.archon/.env).
Different concerns, both needed.

Also adds 6 integration tests covering the full entry-point flow:
1. Global auth user with ANTHROPIC_API_KEY in CWD .env — stripped
2. OAuth token in archon env + random key in CWD — CWD stripped, archon kept
3. General leak test — nothing from CWD reaches subprocess
4. Same key in both CWD and archon — archon value wins
5. CLAUDECODE markers stripped even when not from CWD .env
6. CLAUDE_CODE_OAUTH_TOKEN survives marker strip

* test: add DATABASE_URL leak scenarios to env integration tests

* fix: move CLAUDECODE warning into stripCwdEnv, remove dead useGlobalAuth logic

Review findings addressed:

1. CLAUDECODE warning was dead code — the boot import deleted CLAUDECODE
   from process.env before the warning check in cli.ts/server/index.ts
   could fire. Moved the warning into stripCwdEnv() itself, emitted
   BEFORE the deletion. Removed duplicate warning code from both entry
   points.

2. useGlobalAuth token stripping removed (intentional, not regression) —
   the old code stripped CLAUDE_CODE_OAUTH_TOKEN and CLAUDE_API_KEY when
   useGlobalAuth=true. Per design discussion: the user controls
   ~/.archon/.env and all keys they set are intentional. If they want
   global auth, they just don't set tokens. Simplified buildSubprocessEnv
   to log auth mode for diagnostics only, no filtering.

3. Docs "no override needed" corrected — cli.md and configuration.md
   now reflect the actual code (override: true).

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
2026-04-12 12:11:16 +03:00
Cole Medin
b620c04e27 fix(web): add defensive optional chaining for workflow run data access
Prevents "Cannot read properties of undefined (reading 'status')" crash
when navigating between chat and workflow execution views during race
conditions where run data may be transiently undefined.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 20:09:09 -05:00
Cole Medin
bf8bc8e4ae fix: address review findings for workflow context injection
- CRITICAL: fix metadata filter in getRecentWorkflowResultMessages to check
  for workflowResult key presence instead of category (which is never persisted
  to DB); feature was completely non-functional on every call
- HIGH: guard JSON.parse(msg.metadata) with typeof check to handle PostgreSQL
  JSONB columns returned as objects (not strings) by node-postgres
- MEDIUM: add structured warn log inside inner metadata parse catch block
- LOW: use SELECT id, content, metadata instead of SELECT * in new DB query
- LOW: update comments in messages.ts and prompt-builder.ts for accuracy
- Tests: add formatWorkflowContextSection unit tests (pure function coverage)
- Tests: add getRecentWorkflowResultMessages tests (dialect switch + contract)
- Tests: add getDatabaseType mock to messages.test.ts connection mock
- Tests: add ../db/messages mock and formatWorkflowContextSection to
  prompt-builder mock in orchestrator-agent.test.ts
- Tests: add handleMessage workflow context injection behavioral tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 17:59:19 -05:00
Cole Medin
4292c3a24b simplify: replace nested ternary with if/else for headerTitle in WorkflowResultCard
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 17:49:55 -05:00
Cole Medin
e4555a769b simplify: reduce complexity in changed files
- Parallelize checksums + tarball fetch in serve.ts (removes waterfall latency)
- Remove redundant existsSync before readFileSync in update-check.ts (catch already handles ENOENT)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 17:47:53 -05:00
Cole Medin
dbe559efd1 fix(web): address review findings — logging and test extraction
- Add console.error logging to silent .catch on SSE reconnect re-fetch
  (ChatInterface.tsx:~544) so production failures are visible in logs
- Extract onText setMessages reducer to chat-message-reducer.ts as a
  pure function (applyOnText) with 14 unit tests covering all 6
  segmentation rules including the new tool-call boundary (issue #1054)
- Refactor ChatInterface.onText to delegate to applyOnText

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 17:45:08 -05:00
Cole Medin
3e3ddf25d5 feat: inject workflow run context into orchestrator prompt (#1055)
After a workflow completes, the AI had no awareness of results when
answering follow-up questions. This adds a "Recent Workflow Results"
section to the orchestrator prompt by querying persisted workflow_result
messages from the conversation.

Changes:
- Add getRecentWorkflowResultMessages() to db/messages.ts
- Add WorkflowResultContext type and formatWorkflowContextSection() to prompt-builder.ts
- Extend buildFullPrompt() with optional workflowContext parameter
- Fetch and inject workflow context in handleMessage() before prompt building

Fixes #1055

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:34:17 -05:00
Cole Medin
4ee5232da3 fix(web): interleave tool calls with text during SSE streaming (#1054)
During SSE streaming, tool calls always appeared below all text because
onText appended to the existing message even when it already had tool
calls. The server-side persistence already segments at this boundary.
Mirror that rule in the client's onText handler: when the last streaming
message has tool calls, seal it and start a new message for incoming text.

Fixes #1054

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:31:38 -05:00
Cole Medin
16b47d3dde fix: archon setup --spawn fails on Windows when repo path contains spaces (#1035)
The cmd.exe fallback in spawnWindowsTerminal() used shell: true, which caused
Bun/Node to flatten args into a single string without proper quoting. Paths
with spaces were split at whitespace, breaking the /D argument to start.

Changes:
- Remove shell: true from cmd.exe fallback spawn options
- Remove shell?: boolean from trySpawn options type (no callers need it)

Fixes #1035

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:29:25 -05:00
Cole Medin
536584db8f
Merge pull request #1026 from coleam00/archon/task-fix-issue-1014
feat(web): loop node iteration visibility in workflow execution view
2026-04-10 16:14:12 -05:00
Cole Medin
60ddda3a12 revert: remove incorrect remainingMessage suppression in stream mode
The suppression broke the "sends remaining message before dispatching
workflow" test — when the AI response contains both text and a command
in a single chunk, the text was never streamed, so suppressing
remainingMessage loses it entirely. The actual duplicate was in the
WorkflowLogs execution view, not the routing AI path, and is already
fixed by the onText message splitting and text content dedup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 16:13:45 -05:00
Cole Medin
1eddf3e6aa fix(web): split workflow status messages in WorkflowLogs onText handler
WorkflowLogs' onText handler was blindly concatenating all SSE text into
a single streaming message, unlike ChatInterface which splits on workflow
status text (🚀/). This caused the "Starting workflow" text to merge
with subsequent text into one giant message, breaking text dedup against
DB messages (which are stored as separate segments). The SSE message
content never matched any single DB message exactly, so both appeared.

Add the same workflow status boundary detection from ChatInterface:
close the current streaming message and start a new one when a workflow
status message arrives or when regular text follows a status message.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 15:51:40 -05:00
Cole Medin
4e56c86dff fix: eliminate duplicate text and tool calls in workflow execution view
Three fixes for message duplication during live workflow execution:

1. dag-executor: Add missing `tool_call_formatted` category to loop iteration
   tool messages. Without this, the web adapter sent tool text as both a regular
   SSE text event AND a structured tool_call event, causing each tool to appear
   twice (raw text + rendered card). Regular DAG nodes already had this metadata.

2. WorkflowLogs: Add text content dedup in SSE/DB merge. During live execution,
   the same text (e.g. "Starting workflow...") can appear in both DB (REST fetch)
   and SSE (event buffer replay). The merge now collects DB text into a Set
   and skips matching SSE text messages.

3. orchestrator-agent: Suppress remainingMessage re-send in stream mode. The
   routing AI streams text chunks before /invoke-workflow is detected, then
   retracts them. Without suppression, remainingMessage re-sends the same text.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 15:48:40 -05:00
Cole Medin
5685b41d18 fix(cli): add cli. domain prefix to log event names
Apply review finding: rename flat log event names to use the
cli.{action}_{state} convention matching the rest of the file.

- workflow_dispatch_surface_failed → cli.workflow_dispatch_surface_failed
- workflow_output_surface_failed → cli.workflow_result_surface_failed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 10:27:12 -05:00
Cole Medin
b8e367f35d simplify: reduce complexity in changed files
Deduplicate JSON branch in workflowStatusCommand by computing the output
array once with a single console.log call, removing the duplicated
verbose/non-verbose conditional branches.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 10:16:27 -05:00
Cole Medin
e8334b313a Merge branch 'archon/task-fix-issue-1015' into dev
Resolve merge conflict in MessageList.tsx by combining:
- PR #1025: status/duration/nodes/artifacts enrichment for WorkflowResultCard
- PR #1023: ArtifactViewerModal clickable file paths in result card content

Both features now work together — the result card shows status-aware
headers, node counts, duration, and artifact summaries while also
supporting clickable artifact file paths in the markdown content.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 10:16:06 -05:00
Cole Medin
25757b8f56 simplify: remove redundant String() wrapping in template literals
Template literals automatically coerce numbers to strings; wrapping with
String() is redundant. Removed from formatAge, formatDuration, and all
console.log calls in workflow.ts. Also compacted a two-line object
spread in workflowStatusCommand to a single line.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 10:13:55 -05:00
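The coercion rule the commit relies on, in miniature:

```typescript
// Template literals coerce interpolated values to strings
// automatically, so String() wrapping is redundant.
const seconds = 42;
const verbose = `took ${String(seconds)}s`; // redundant wrapping
const concise = `took ${seconds}s`;         // identical result
```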
Cole Medin
7cae3a10d4 fix(cli): guard dispatch sendMessage, improve comments and add tests for PR #1052
- Wrap dispatch sendMessage in try/catch (matches result card pattern) to
  prevent UI notification failures from blocking workflow execution
- Update dispatch comment to accurately describe structural similarity to
  orchestrator while noting synchronous CLI semantics and that
  workerConversationId === conversationId in the CLI path
- Add note to result card comment about paused-path exclusion
- Add 4 integration tests for workflowRunCommand: dispatch ordering and
  metadata shape, result card with summary, no result card without summary,
  and non-throwing DB failure on result persist

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 10:08:01 -05:00
Cole Medin
5683007ef8 fix(web,core): address review findings for WorkflowResultCard
- Fix unsafe `e.data` cast: use runtime type narrowing instead of
  `as Record<string, string | undefined>` for event artifact extraction
- Fix invalid ArtifactType fallback: change 'file' to 'file_created'
  (a valid member of the ArtifactType union)
- Handle useQuery error state: destructure `isError` and render a
  degraded card (no node count, duration, or artifacts) when the API
  fetch fails, preventing a misleading "Workflow complete" display
- Emit workflowResult metadata on failure/cancel: the orchestrator now
  attaches workflowResult to failed workflow messages so the chat
  renders a result card with status icon and "View full logs" link
- Add 'workflowRun' to invalidateWorkflowQueries() so singular run
  cache entries are invalidated alongside the plural list caches

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 10:02:04 -05:00
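The `e.data` fix above swaps a blanket cast for runtime narrowing; a minimal sketch with a hypothetical `path` field (the real artifact events carry more data):

```typescript
// Runtime narrowing for an unknown event payload instead of a blanket
// `as Record<string, string | undefined>` cast.
function extractArtifactPath(data: unknown): string | undefined {
  if (typeof data !== "object" || data === null) return undefined;
  const record = data as Record<string, unknown>; // safe after the object check
  return typeof record.path === "string" ? record.path : undefined;
}
```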
Cole Medin
69193512e7 fix(cli): send workflow dispatch/result messages for Web UI cards (#1017)
CLI-launched workflows were visible in the Web UI chat but displayed as
plain text only — no WorkflowProgressCard or WorkflowResultCard. The CLI
adapter already handled both metadata fields; the sendMessage calls were
simply missing from workflowRunCommand.

Changes:
- Send workflowDispatch message before executeWorkflow (mirrors orchestrator.ts)
- Send workflowResult message after successful completion with summary
- Wrap result message in try/catch with warn log (same pattern as orchestrator)

Fixes #1017

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:46:47 -05:00
Rasmus Widing
60489388c9
Merge pull request #1049 from coleam00/fix/binary-version-check
fix(server): use BUNDLED_VERSION for app version in binary mode
2026-04-10 17:07:28 +03:00
Cole Medin
ffe803ecc7
feat(web): make artifact file paths clickable in chat messages (#1023)
* feat: make artifact file paths clickable in chat messages (#1016)

Artifact paths in AI responses (e.g. `artifacts/runs/{uuid}/report.md`) were
rendered as plain inline code. Now they render as clickable buttons (.md opens
ArtifactViewerModal) or links (other types open in new tab).

Changes:
- Add artifact path detection via regex in MessageBubble code renderer
- Convert MARKDOWN_COMPONENTS to factory function for modal state access
- Add ArtifactViewerModal integration in chat view
- Add 'workflow_artifact' to WORKFLOW_EVENT_TYPES in store.ts

Fixes #1016

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(web): improve MessageBubble type clarity and code comments

- Replace opaque `ComponentPropsWithoutRef<never>` return type on
  `makeMarkdownComponents` with the idiomatic `Components` type from
  react-markdown, which is the actual contract the prop expects
- Add explanatory comment on `ARTIFACT_PATH_RE` describing group
  semantics and cross-platform path handling intent
- Add comment on `useMemo` empty dep array explaining that
  `setArtifactViewer` is a stable React state setter

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* simplify: merge duplicate lucide-react imports in MessageBubble

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(web): harden extractArtifactInfo UUID regex and path traversal check

Make ARTIFACT_PATH_RE case-insensitive for hex digits ([a-fA-F0-9-]+) so
uppercase UUID characters are matched correctly. Also reject filenames
containing '..' path segments to prevent path traversal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
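A sketch of the hardened matcher described above; the regex and guard here are modeled on the commit message, not copied from the real extractArtifactInfo:

```typescript
// Case-insensitive hex so uppercase UUID characters match, per the
// commit above. Group 1 is the run id, group 2 the file path.
const ARTIFACT_PATH_RE = /artifacts\/runs\/([a-fA-F0-9-]+)\/(.+)/;

function extractArtifactInfo(
  text: string,
): { runId: string; file: string } | null {
  const m = ARTIFACT_PATH_RE.exec(text);
  if (!m) return null;
  const [, runId, file] = m;
  if (file.split("/").includes("..")) return null; // path traversal guard
  return { runId, file };
}
```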

* fix(web): add artifact path detection to WorkflowResultCard markdown

The artifact-aware code component was only in MessageBubble.tsx but
WorkflowResultCard in MessageList.tsx used its own static markdown
components without artifact detection. Artifact paths in workflow
results were rendering as plain code spans instead of clickable links.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(server,web): artifact route uses codebase name for owner/repo + bright link styling

The artifact route derived owner/repo from the working_path, which fails
for worktree-based runs (worktrees use local filesystem username, not
GitHub username). Now looks up the codebase record by codebase_id to get
the correct owner/repo from the codebase name.

Also fixes artifact link styling in WorkflowResultCard — uses
text-accent-bright instead of text-[inherit], which was inheriting the
dim text-text-secondary color from the parent container.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve merge conflict markers in api.ts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style(web): brighter artifact link color in workflow result cards

Bumped artifact link color from accent-bright (0.72) to a brighter
oklch(0.78 0.18 250) with font-medium for better readability against
the dark text-text-secondary card background.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style(web): use inline styles for artifact links to guarantee visibility

Tailwind classes were being overridden by parent text-text-secondary.
Inline styles ensure the bright blue color and pointer cursor always
apply regardless of CSS cascade.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(web): use <a> tags for all artifact links with !text-accent-bright

Replaced <button> with <a> for .md artifacts (pointer cursor by default),
unified styling using !text-accent-bright to override parent
text-text-secondary, hover brightens to oklch(0.85). All artifact links
now consistently bright and clickable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(web,server): restore missing import, fix artifact link colors and cursor

- Restore getArchonWorkspacesPath import removed by prior commit (still
  used in 3 other places in api.ts, breaking the build)
- Use text-accent-bright instead of dark text-accent for artifact links
  in MessageBubble (was nearly invisible on dark background)
- Add cursor-pointer to .md artifact buttons in MessageBubble
- Replace arbitrary oklch hover value with hover:!text-primary in
  MessageList artifact links

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:00:49 -05:00
Rasmus Widing
21cceb31dd fix: reduce update check cache TTL from 24h to 1h 2026-04-10 16:44:56 +03:00
Rasmus Widing
428094ebf2 fix(server): use BUNDLED_VERSION for app version in binary mode 2026-04-10 16:43:31 +03:00
Rasmus Widing
9da44d32c0 Merge remote-tracking branch 'origin/dev' into dev 2026-04-10 16:33:07 +03:00
Rasmus Widing
47796dfe55 chore(homebrew): update formula to v0.3.5 2026-04-10 16:33:01 +03:00
github-actions[bot]
70f6052bb8 chore: update Homebrew formula for v0.3.5 2026-04-10 13:32:47 +00:00
Rasmus Widing
b09975038f
Merge pull request #1048 from coleam00/dev
Release 0.3.5
2026-04-10 16:30:52 +03:00
Rasmus Widing
09a40ec73a Release 0.3.5 2026-04-10 16:29:43 +03:00
Rasmus Widing
4b8e4c9792
Merge pull request #1047 from coleam00/fix/serve-static-paths
fix(server): strip leading slash in serveStatic for absolute webDistPath
2026-04-10 16:28:24 +03:00
Rasmus Widing
45c9b9f86a fix: remove no-op rewriteRequestPath, add webDistPath check, use process.once
- Remove stripLeadingSlash — Bun's path.join handles leading slashes correctly
- Add existsSync check on webDistPath at startup with log.warn
- Switch process.on to process.once for signal handlers in serve.ts
- Keep favicon.png explicit route (useful addition)
2026-04-10 16:25:57 +03:00
Rasmus Widing
bc989c0ebe fix(server): strip leading slash in serveStatic for absolute webDistPath
Hono's serveStatic passes c.req.path (e.g. '/assets/foo.js') to path.join
with the root. When root is absolute, path.join treats the leading slash
as an absolute segment and discards root entirely. Add rewriteRequestPath
to strip the leading slash so the full path resolves correctly.
2026-04-10 16:17:14 +03:00
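As the follow-up commit 45c9b9f86a (above in this log) concluded, the diagnosis here was off: Node-style `path.join` keeps an absolute root when a later segment starts with a slash; it is `path.resolve` that discards everything before an absolute segment. A quick check with POSIX paths:

```typescript
import path from "node:path";

// path.join concatenates and normalizes; a leading slash on a later
// segment does NOT reset the result to the filesystem root.
const joined = path.posix.join("/srv/web-dist", "/assets/foo.js");

// path.resolve is the call with "absolute segment wins" semantics.
const resolved = path.posix.resolve("/srv/web-dist", "/assets/foo.js");
```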
Rasmus Widing
9f72704819 fix(cli): keep serve process alive by blocking on SIGINT/SIGTERM 2026-04-10 16:12:10 +03:00
Rasmus Widing
c0cfdd1ceb Merge remote-tracking branch 'origin/dev' into dev 2026-04-10 16:06:31 +03:00
Rasmus Widing
905336c39e chore(homebrew): update formula to v0.3.4 2026-04-10 16:06:10 +03:00
github-actions[bot]
9aa67f782d chore: update Homebrew formula for v0.3.4 2026-04-10 13:06:02 +00:00
Rasmus Widing
46cf2b1e29
Merge pull request #1046 from coleam00/dev
Release 0.3.4
2026-04-10 16:04:04 +03:00
Rasmus Widing
e2a6654b84 Release 0.3.4 2026-04-10 16:03:36 +03:00
Rasmus Widing
35197103b4
Merge pull request #1045 from coleam00/fix/binary-env-loading
fix: remove CWD env stripping, load ~/.archon/.env with override for binary support
2026-04-10 16:02:07 +03:00
Rasmus Widing
bbdcc65722 fix: address review — error handling, typing, stale docs 2026-04-10 16:01:41 +03:00
Rasmus Widing
4c018208a5 fix: remove CWD env stripping, load ~/.archon/.env with override for binary support
The CWD .env stripping was redundant — SUBPROCESS_ENV_ALLOWLIST already
blocks target repo credentials from reaching AI subprocesses, and the
env-leak gate scans target repos before spawning.

The stripping also broke compiled binaries: it nuked all env vars, then
tried to reload from a baked CI path that doesn't exist on user machines.

Changes:
- Remove CWD .env stripping from both server and CLI
- Server loads ~/.archon/.env with override: true (all keys, not just DATABASE_URL)
- Skip import.meta.dir .env loading in binary mode (path is frozen at build time)
- Add CLAUDE_USE_GLOBAL_AUTH defaulting to server (same as CLI)
- Fix envFile reference in no_ai_credentials error to show useful path
2026-04-10 15:53:23 +03:00
Cole Medin
0d5ec665f8
feat(docs): add logo, dark theme, feature cards to docs site (#1022)
* feat(docs): add logo, dark theme, feature cards to docs site (#1018)

The docs site lacked brand identity — no logo, default light theme,
and plain bullet lists for features. Add the existing shield logo to
the header and hero, default to dark theme via localStorage script,
replace the "What is Archon?" bullets with Starlight CardGrid
components, and extend custom CSS with dark theme colors, tighter
hero spacing, and sidebar active-link gradient.

Changes:
- astro.config.mjs: add logo config + dark theme head script
- index.md → index.mdx: add hero image, import CardGrid/Card components
- custom.css: dark theme vars, hero padding, sidebar highlight

Fixes #1018

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(docs): address review findings for PR #1022

- Add missing "Portable" feature card to CardGrid (restores multi-platform
  support callout dropped in original conversion from bullet list)
- Fix icon="random" on "Isolated" card — use "laptop" instead (shuffle icon
  does not communicate worktree isolation)
- Reassign icons: Portable gets "puzzle", Composable gets "setting"
- Add comment explaining !important in .hero is needed to override Starlight
  inline styles
- Add comment noting .sidebar-content is an internal Starlight class (not
  public API) that should be re-tested after Starlight upgrades

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:46:20 -05:00
Rasmus Widing
eff8b0dc60 fix: sync all workspace versions from root and automate in release skill 2026-04-10 15:31:43 +03:00
Rasmus Widing
027ed32b2a fix(cli): read version from monorepo root package.json, not CLI package 2026-04-10 15:29:37 +03:00
Rasmus Widing
6c1931251c Merge remote-tracking branch 'origin/dev' into dev 2026-04-10 15:26:25 +03:00
Rasmus Widing
65bc42839c Merge remote-tracking branch 'origin/main' into dev 2026-04-10 15:26:14 +03:00
Rasmus Widing
029fb600b6 chore(homebrew): update formula to v0.3.3 2026-04-10 15:26:08 +03:00
github-actions[bot]
c76a422f15 chore: update Homebrew formula for v0.3.3 2026-04-10 12:25:58 +00:00
Rasmus Widing
dab1f7ba2d Merge remote-tracking branch 'origin/main' into dev 2026-04-10 15:24:15 +03:00
Rasmus Widing
42991e06bc fix(web): add all remark-gfm transitive deps for Bun hoisting 2026-04-10 15:24:01 +03:00
Rasmus Widing
47818159c8 Merge remote-tracking branch 'origin/main' into dev 2026-04-10 15:19:53 +03:00
Rasmus Widing
d4f23b8d70 fix(web): add missing remark-gfm transitive deps for production build 2026-04-10 15:19:34 +03:00
Rasmus Widing
d69ed11252 fix(ci): remove 'run' from bun --filter command in release workflow 2026-04-10 15:19:34 +03:00
Rasmus Widing
058c60e18e fix(web): add missing remark-gfm transitive deps for production build 2026-04-10 15:17:29 +03:00
Rasmus Widing
b36868dc66 fix(ci): remove 'run' from bun --filter command in release workflow 2026-04-10 15:13:04 +03:00
Rasmus Widing
c7d1281978
Merge pull request #1043 from coleam00/dev
Release 0.3.3
2026-04-10 15:04:10 +03:00
Rasmus Widing
7318bf3a30 Release 0.3.3 2026-04-10 15:03:28 +03:00
Rasmus Widing
572b23e806
feat: auto-resolve Codex native binary in compiled builds (#995) (#1012)
* Investigate issue #978: one-command web UI install via archon serve

* fix: fail fast when Codex is used from compiled binary (#995)

The @openai/codex-sdk uses createRequire(import.meta.url) to resolve its
native platform binary, which breaks in bun --compile builds where
import.meta.url is frozen to the build host's path. Instead of a cryptic
createRequire crash, throw an actionable error directing users to install
from source or switch to the Claude provider.

Fixes #995

* fix: move binary guard to top of sendQuery, add beforeEach mock clear

Addresses code review: guard now fires before env-leak scanner to avoid
confusing error ordering, and MockCodex.mockClear() prevents cumulative
call counts across tests.

* feat: auto-resolve Codex native binary in compiled builds (#995)

Replace the fail-fast guard with full Codex binary resolution using the
SDK's codexPathOverride constructor option. In compiled binary mode, the
resolver checks (in order): CODEX_BIN_PATH env var, config
assistants.codex.codexBinaryPath, ~/.archon/vendor/codex/ cache, then
auto-downloads from npm registry.

In dev mode (bun link), returns undefined so the SDK uses its normal
node_modules-based resolution.

Adds codexBinaryPath config option so users can point to their own
Codex CLI install location.

Fixes #995

* fix: check env/config paths before platform detection for Windows CI

Move CODEX_BIN_PATH and config codexBinaryPath checks ahead of platform
detection so user-supplied paths work on any platform. Add win32-x64 and
win32-arm64 to the platform map for auto-download support.

* fix: normalize path separators in vendor test for Windows CI

* fix: remove auto-download, simplify resolver, fix review findings

- Remove ~100 lines of auto-download/checksum/extraction code from
  codex-binary-resolver.ts. Binary mode now throws with clear install
  instructions instead of silently downloading ~112 MB from npm.
- Fix init-promise leak: clear codexInitPromise on rejection so next
  call can retry after user installs Codex.
- Simplify Codex constructor call (remove conditional spread).
- Replace PLATFORM_BINARY_SUBPATH map with getVendorBinaryName() function
  that encodes the simple rule: Windows gets .exe, everything else gets
  codex. Rejects unsupported architectures explicitly.
- Restore specific log event name for env-leak gate config failure.
- Move codex-binary-resolver-dev.test.ts to its own bun test batch
  (mock.module isolation).
- Add tests: rejected-promise recovery, undefined-resolver result,
  binary-not-found-anywhere.
- Document CODEX_BIN_PATH in .env.example, codexBinaryPath in CLAUDE.md
  config example, vendor/codex/ in directory tree.
2026-04-10 14:59:03 +03:00
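The resolution order described above can be sketched as a chain; the option names here are hypothetical stand-ins for the real codex-binary-resolver internals:

```typescript
// Hedged sketch of the priority order: CODEX_BIN_PATH env var, then
// config codexBinaryPath, then (binary mode only) the vendor cache.
// Dev mode returns undefined so the SDK resolves via node_modules;
// binary mode with nothing found throws actionable instructions.
function resolveCodexBinary(opts: {
  isBinaryBuild: boolean;
  env: Record<string, string | undefined>;
  configPath?: string;
  vendorPath?: string;
}): string | undefined {
  if (opts.env.CODEX_BIN_PATH) return opts.env.CODEX_BIN_PATH;
  if (opts.configPath) return opts.configPath;
  if (!opts.isBinaryBuild) return undefined; // dev mode: SDK resolves itself
  if (opts.vendorPath) return opts.vendorPath;
  throw new Error("Codex binary not found; install Codex or set CODEX_BIN_PATH");
}
```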
Rasmus Widing
c7412fb562
fix: add explicit export entries for community forge adapters (#1041)
bun build --compile cannot resolve deep subpath imports through
wildcard export maps. Adding explicit entries for gitea and gitlab
adapter paths fixes binary builds.
2026-04-10 14:13:50 +03:00
Rasmus Widing
6f1b72e131
feat: add automatic update check notification for binary users (#1039)
* feat: add automatic update check notification for binary users

Cache-based update check triggered by CLI commands and Web UI page load.
Fetches latest release from GitHub API with 24h cache staleness and 3s
timeout. CLI prints one-liner to stderr (suppressed by --quiet, skipped
for source builds). Web UI shows pulsing badge in TopNav linking to
release page. Also fixes release skill asset count (6 -> 7).

* fix: address review findings for update check notification

- Add BUNDLED_IS_BINARY guard to /api/update-check server route to
  prevent unintended GitHub API calls from source/dev builds
- Replace hand-crafted UpdateCheckResult interface with generated
  OpenAPI type (components['schemas']['UpdateCheckResponse'])
- Add staleness + checkedAt validation to getCachedUpdateCheck,
  matching readCache behavior
- Add debug-level logging to all bare catch blocks in update-check.ts
  for --verbose diagnostics
- Add releaseUrl guard in TopNav to prevent empty href links
- Fix SKILL.md: correct CI scope claim (Step 10 only, not 10-11) and
  clarify merge commit sync note
- Add tests: non-200 HTTP response, stale cache for getCachedUpdateCheck,
  missing checkedAt, and cache content verification
- Document /api/update-check endpoint and update-check.json cache file
  in CLAUDE.md and docs-web
- Regenerate api.generated.d.ts with UpdateCheckResponse schema

* refactor: simplify update check code

- Deduplicate getCachedUpdateCheck by delegating to readCache
- Extract shared noUpdate fallback object in server route
- Move guard clause outside try block in printUpdateNotice
- Fix cachePath variable scoping in readCache catch block
2026-04-10 14:10:33 +03:00
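The staleness + checkedAt validation added to getCachedUpdateCheck can be sketched like this (hypothetical cache shape; the real update-check.ts fields may differ, and the TTL was later reduced from 24h to 1h in 21cceb31dd):

```typescript
// Hypothetical cache shape for the update-check.json file.
interface UpdateCache {
  checkedAt?: string; // ISO timestamp of the last GitHub API fetch
  latestVersion?: string;
}

// A cached result is usable only if checkedAt exists, parses, and is
// within the TTL; otherwise a fresh GitHub API fetch is needed.
function isCacheFresh(cache: UpdateCache, ttlMs: number, now: number): boolean {
  if (!cache.checkedAt) return false;
  const checked = Date.parse(cache.checkedAt);
  if (Number.isNaN(checked)) return false;
  return now - checked < ttlMs;
}
```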
Rasmus Widing
36bd9cff8f
feat: add archon serve command for one-command web UI install (#1011)
* Investigate issue #978: one-command web UI install via archon serve

* feat: add `archon serve` command for one-command web UI install (#978)

Extract `startServer(opts)` from server's monolithic `main()` into an
exported function with `ServerOptions` (webDistPath, port,
skipPlatformAdapters). Add `import.meta.main` guard so the file still
works as a standalone script for `bun dev`.

Create `archon serve` CLI command that lazily downloads a pre-built
web UI tarball from GitHub releases on first run, verifies SHA-256
checksum, extracts atomically, then starts the full server. Cached
per version in `~/.archon/web-dist/<version>/`.

Update release CI to build the web UI, package it as
`archon-web.tar.gz`, and include in release checksums.

* fix: address review findings for archon serve command

- Validate --port range (1-65535) and reject NaN before any other checks
- Capture tar stderr for actionable extraction error messages
- Add structured logging (download_started/download_failed/server_start_failed)
- Post-extraction sanity check for index.html
- Wrap renameSync with error context and tmpDir cleanup
- Wrap fetch() calls to preserve URL context on network errors
- Validate parseChecksum returns 64 hex chars
- Set skipPlatformAdapters: true for standalone web UI mode
- Improve ServerOptions/ServeOptions JSDoc
- Move consoleErrorSpy cleanup to afterEach in tests
- Add tests for port validation and malformed hash rejection
- Update CLAUDE.md: CLI section, directory tree, package descriptions
- Update README.md: mention archon serve for binary installs
- Update docs-web: CLI reference, archon-directories

* refactor: simplify serve command implementation

- Use BUNDLED_IS_BINARY directly instead of version === 'dev' sentinel
- Extract toError() helper for repeated error normalization
- Use dirname() instead of manual substring/lastIndexOf
- Extract cleanupAndThrow() for repeated rmSync + throw pattern
- Add missing assertion on port 0 test for consistency
2026-04-10 13:33:47 +03:00
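The checksum step above (SHA-256 verify plus the "64 hex chars" validation) can be sketched as follows; the real command streams the tarball and parses the hash out of a release checksums file:

```typescript
import { createHash } from "node:crypto";

// Reject malformed hashes up front, then compare SHA-256 digests.
function verifyChecksum(data: Uint8Array, expectedHex: string): boolean {
  if (!/^[a-f0-9]{64}$/i.test(expectedHex)) {
    throw new Error("malformed checksum: expected 64 hex chars");
  }
  const actual = createHash("sha256").update(data).digest("hex");
  return actual === expectedHex.toLowerCase();
}
```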
Cole Medin
968dfadcf5 style: fix prettier formatting in workflow-store.test.ts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 18:08:08 -05:00
Cole Medin
625666821e fix(web): handle workflow_step SSE events and remove nested interactive element
- Add missing `workflow_step` case in useDashboardSSE switch so loop iteration
  events are routed to the store instead of silently dropped
- Restructure DagNodeItem: replace outer <button> with <div role="row"> and
  change the expand toggle from <span role="button"> to a proper <button>,
  eliminating the interactive-element-in-interactive-element violation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:05:17 -05:00
Cole Medin
a248438cbe simplify: reduce complexity in changed files 2026-04-09 17:44:33 -05:00
Cole Medin
c639063f32 fix: address PR #1026 review findings — iteration state, nested button, guard, tests
- Preserve accumulated iteration state in handleDagNode by spreading existing node before overwriting with event fields (HIGH: iteration badge/list vanished on node completion)
- Replace nested <button> with <span role="button"> + onKeyDown in DagNodeProgress to fix HTML spec violation and broken stopPropagation (HIGH: accessibility + mis-navigation)
- Fix falsy guard `if (!iteration)` → `if (iteration === undefined)` in WorkflowExecution REST enrichment (MEDIUM: would silently drop iteration 0 in future)
- Fix dead file reference in workflow-bridge comments: `useWorkflowStatus.ts` → `workflow-store.ts handleLoopIteration` (MEDIUM: misleading comment)
- Add 7 unit tests for handleLoopIteration in workflow-store.test.ts covering all branches: no-nodeId no-op, ghost-node no-op, first append, upsert, total:0 preservation, accumulation, and iteration state survival across dag_node completion (MEDIUM: zero coverage for core PR logic)
- Clarify two LOW comment gaps in WorkflowExecution.tsx and workflow-store.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 17:41:20 -05:00
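The falsy-guard fix in the third bullet, in isolation (hypothetical helper names; the real check lives in the WorkflowExecution REST enrichment):

```typescript
// Iteration numbers start at 0, and `!iteration` treats 0 the same
// as undefined -- silently dropping the first iteration.
function keepIterationBuggy(iteration?: number): boolean {
  if (!iteration) return false; // drops iteration 0 along with "absent"
  return true;
}

function keepIterationFixed(iteration?: number): boolean {
  if (iteration === undefined) return false; // only drops "absent"
  return true;
}
```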
Cole Medin
00d06a9469 simplify: reduce complexity in changed files
Simplify verbose onClick handlers in WorkflowResultCard — remove unnecessary
block body and explicit void return type for the navigate call; collapse the
setExpanded handler to a single-line block per ESLint no-confusing-void-expression.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 17:32:25 -05:00
Cole Medin
163ad9d1b1 fix(web): align totalCount semantics and improve WorkflowResultCard comments and docs
- Fix MEDIUM: totalCount in dagNodes live-state path now counts only terminal nodes
  (completed/failed/skipped), matching the semantics of the events fallback path.
  Previously included pending/running nodes, producing a misleading denominator
  during page-refresh mid-execution.
- Fix LOW: Duration badge background changed from bg-surface-elevated to bg-surface
  for visual contrast against the bg-surface-elevated parent header.
- Fix LOW: Expanded staleTime: Infinity comment to explain the immutability invariant.
- Fix LOW: Expanded node count comment to describe that the events-path totalCount
  is an approximation (nodes that reached terminal state, not workflow's full node count).
- Fix LOW: Added Workflow Result Card subsection to web adapter docs describing the
  post-completion card's status icon, header, node count, duration, and artifacts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 17:29:53 -05:00
Cole Medin
8978ad6a1d feat(web): display loop iteration progress in workflow execution view (#1014)
Loop nodes in DAG workflows appeared as flat nodes with no iteration visibility.
The backend already emitted loop_iteration_* events but the frontend dropped them.

Changes:
- Add nodeId to workflow_step SSE events in workflow-bridge
- Add LoopIterationEvent/LoopIterationInfo types and extend DagNodeState
- Handle workflow_step events in useSSE and route to store
- Add handleLoopIteration action to workflow store
- Extract loop_iteration_* events for historical REST view
- Add expandable iteration sub-list in DagNodeProgress sidebar
- Add iteration count badge on graph nodes (ExecutionDagNode)
- Add loop type color/label to graph node type maps

Fixes #1014

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 17:19:38 -05:00
Cole Medin
0751f16ce8 feat(web): enrich workflow result card with status, duration, nodes, and artifacts (#1015)
The WorkflowResultCard showed a hardcoded green checkmark regardless of
actual outcome, with no duration, node count, or artifact links.

Changes:
- Fetch terminal run data via TanStack Query (staleTime: Infinity)
- Merge Zustand live state with API fallback for offline/CLI workflows
- Render StatusIcon for completed/failed/cancelled status awareness
- Display node count and duration pill in the header
- Show ArtifactSummary (PRs, commits, branches, files) above text content
- Derive node counts and artifacts from events when live state unavailable

Fixes #1015

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 17:16:21 -05:00
Cole Medin
95679fa625 fix(cli): workflow reject ignores positional reason argument
The reject command only read `--reason` flag but ignored positional
arguments. Running `bun run cli workflow reject <id> "feedback"` passed
undefined to rejectWorkflow(), defaulting to "Rejected". Now mirrors
the approve command pattern: reads both flag and positional args.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 14:49:35 -05:00
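The flag-then-positional fallback described above, as a sketch (hypothetical parsed-args shape; the real CLI parser differs):

```typescript
interface RejectArgs {
  flags: { reason?: string };
  positional: string[]; // e.g. ["<run-id>", "feedback text"]
}

// Mirror the approve-command pattern: prefer the --reason flag, fall
// back to the positional argument after the run id, then to the
// "Rejected" default.
function resolveReason(args: RejectArgs): string {
  return args.flags.reason ?? args.positional[1] ?? "Rejected";
}
```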
Cole Medin
f121bbfc27 fix: normalize script path separators for Windows compatibility
On Windows, path.join() produces backslashes which caused 3 test
failures in script-discovery on the Windows CI runner.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 13:28:08 -05:00
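A minimal version of the normalization this commit applies; splitting on the platform separator and rejoining with forward slashes makes the comparison separator-agnostic:

```typescript
import path from "node:path";

// Normalize platform-specific separators (backslashes on Windows) to
// forward slashes before comparing against expected script paths.
function toPosix(p: string): string {
  return p.split(path.sep).join("/");
}
```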
Rivo Link
53cabd44fd
fix: PowerShell Add-ToUserPath corrupts PATH when single entry exists (#1000) 2026-04-09 19:43:52 +03:00
Rasmus Widing
2994d56861
fix: Promise.any race condition in validator script check (#1007) (#1010)
* Investigate issues #1001, #1002, #1003: interactive-prd workflow bugs

* fix: replace Promise.any with Promise.all in validator script check (#1007)

Promise.any resolves with the first fulfilled promise regardless of value.
Since fileExists always fulfills (never rejects), Promise.any could return
false even when a later extension check would return true — a race condition
that flaked on Windows CI.

Fixes #1007
2026-04-09 19:42:48 +03:00
Cocoon-Break
0d1ff41ad3
fix(utils): correct misleading port-range comment in calculatePortOffset (#1009)
The inline comment stated ports 3100-3999 (implying basePort=3000) but the
actual basePort is 3090, making the true range 3190-4089, which already
matched the JSDoc. Update the inline comment to agree with both the JSDoc
and the arithmetic.

Closes #1008
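The arithmetic in question, assuming (as the stated range implies) an offset of 100-999 added to the base port:

```typescript
// basePort 3090 with an offset in [100, 999] gives 3190-4089,
// matching the JSDoc; the old inline comment's 3100-3999 assumed
// basePort 3000. (Offset bounds inferred from the stated range.)
const BASE_PORT = 3090;
const MIN_OFFSET = 100;
const MAX_OFFSET = 999;
const minPort = BASE_PORT + MIN_OFFSET;
const maxPort = BASE_PORT + MAX_OFFSET;
```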
2026-04-09 19:40:41 +03:00
Rasmus Widing
5cfc7da99e
fix: interactive-prd workflow bugs (#1001, #1002, #1003) (#1005)
* Investigate issues #1001, #1002, #1003: interactive-prd workflow bugs

* fix: interactive-prd workflow — capture responses, fix output path, reuse conversations (#1001, #1002, #1003)

The archon-interactive-prd workflow was completely non-functional due to three bugs:
approval gates didn't capture user responses (missing capture_response: true),
the generate node wrote to .claude/ which the Claude SDK blocks, and CLI
approve/reject created new conversations instead of reusing the original.

- Add capture_response: true to all three approval gates in the YAML
- Change output path from .claude/PRPs/prds/ to $ARTIFACTS_DIR/prds/
- Add conversationId to ApprovalOperationResult and RejectionOperationResult
- Look up and pass through original conversation ID in CLI approve/reject commands
- Add getConversationById mock to CLI workflow tests

Fixes #1001, #1002, #1003

* fix: guard conversation lookups, add tests, document approval node type

- Wrap getConversationById calls in try-catch in CLI approve/reject
  commands to prevent crashes when DB errors occur after approval is
  already recorded (falls back to new conversation ID)
- Log when conversation lookup fails or returns null for observability
- Add happy-path tests verifying platform conversation ID is passed
  through in both approve and reject commands
- Add JSDoc comments clarifying conversationId is a DB UUID (not
  platform ID) in operation result interfaces and WorkflowRunOptions
- Document approval: node type in CLAUDE.md DAG node types list
2026-04-09 19:36:28 +03:00
Rasmus Widing
50f96f870e
feat: script node type for DAG workflows (bun/uv runtimes) (#999)
* feat: add ScriptNode schema and type guards (US-001)

Implements US-001 from the script-nodes PRD.

Changes:
- Add scriptNodeSchema with script, runtime (bun|uv), deps, and timeout fields
- Add ScriptNode type with never fields for mutual exclusivity
- Add isScriptNode type guard
- Add SCRIPT_NODE_AI_FIELDS constant (same as BASH_NODE_AI_FIELDS)
- Update dagNodeSchema superRefine and transform to handle script: nodes
- Update DagNode union type to include ScriptNode
- Add script node dispatch stub in dag-executor.ts (fails fast until US-003)
- Export all new types and values from schemas/index.ts
- Add comprehensive schema tests for ScriptNode parsing and validation

* feat: script discovery from .archon/scripts/

Implements US-002 from PRD.

Changes:
- Add ScriptDefinition type and discoverScripts() in script-discovery.ts
- Auto-detect runtime from file extension (.ts/.js->bun, .py->uv)
- Handle duplicate script name conflicts across extensions
- Add bundled defaults infrastructure (empty) for scripts
- Add tests for discovery, naming, and runtime detection

* feat: script execution engine (inline + named)

Implements US-003 from PRD.

Changes:
- Add executeScriptNode() in dag-executor.ts following executeBashNode pattern
- Support inline bun (-e) and uv (run python -c) execution
- Support named scripts via bun run / uv run
- Wire ScriptNode dispatch replacing 'not yet implemented' stub
- Capture stdout as node output, stderr as warning
- Handle timeout and non-zero exit
- Pass env vars for variable substitution
- Add tests for inline/named/timeout/failure cases

* feat: runtime availability validation at load time

Implements US-004 from PRD.

Changes:
- Add checkRuntimeAvailable() utility for bun/uv binary detection
- Extend validator.ts with script file and runtime validation
- Integrate script validation into parseWorkflow flow in loader.ts
- Add tests for runtime availability detection

* feat: dependency installation for script nodes

Implements US-005 from PRD.

Changes:
- Support deps field for uv nodes: uvx --with dep1... for inline
- Support uv run --with dep1... for named uv scripts
- Bun deps are auto-installed at runtime via bun's native mechanism
- Empty/omitted deps field produces no extra flags
- Add tests for dep injection into both runtimes

* test: integration tests and validation for script nodes

Implements US-006 from PRD.

Changes:
- Fill test coverage gaps for script node feature
- Add script + command mutual exclusivity schema test
- Add env var substitution tests ($WORKFLOW_ID, $ARTIFACTS_DIR in scripts)
- Add stderr handling test (stderr sent to user as platform message)
- Add missing named script file validation tests to validator.test.ts
- Full bun run validate passes

* fix: address review findings in script nodes

- Extract isInlineScript to executor-shared.ts (was duplicated in
  dag-executor.ts and validator.ts)
- Remove dead warnMissingScriptRuntimes from loader.ts (validator
  already covers runtime checks)
- Remove path traversal fallback in executeScriptNode — error when
  named script not found instead of executing arbitrary file paths
- Memoize checkRuntimeAvailable to avoid repeated subprocess spawns
- Add min(1) to scriptNodeSchema.script field for consistency
- Replace dynamic import with static import in validator.ts

* fix(workflows): address review findings for script node implementation

Critical fixes:
- Wrap discoverScripts() in try-catch inside executeScriptNode to prevent
  unhandled rejections when script discovery fails (e.g. duplicate names)
- Add isScriptNode to isNonAiNode check in loader.ts so AI-specific fields
  on script nodes emit warnings (activates SCRIPT_NODE_AI_FIELDS)

Important fixes:
- Surface script stderr in user-facing error messages on non-zero exit
- Replace uvx with uv run --with for inline uv scripts with deps
- Add z.string().min(1) validation on deps array items
- Remove unused ScriptDefinition.content field and readFile I/O
- Add logging in discoverAvailableScripts catch block
- Warn when deps is specified with bun runtime (silently ignored)

Simplifications:
- Merge BASH_DEFAULT_TIMEOUT and SCRIPT_DEFAULT_TIMEOUT into single
  SUBPROCESS_DEFAULT_TIMEOUT constant
- Use scriptDef.runtime instead of re-deriving from extname()
- Extract shared formatValidationResult helper, deduplicate section comments

Tests:
- Add isInlineScript unit tests to executor-shared.test.ts
- Add named-script-not-found executor test to dag-executor.test.ts
- Update deps tests to expect uv instead of uvx

Docs:
- Add script: node type to CLAUDE.md node types and directory structure
- Add script: to .claude/rules/workflows.md DAG Node Types section
2026-04-09 14:48:02 +03:00
Cole Medin
ef03cd2aff fix(adapters): mock global fetch in GitLab adapter tests to prevent CI timeout
On CI (Ubuntu), fetch to gitlab.example.com resolves to a real IP and the
TCP connection hangs, causing the "detects mention at end of string" test
to time out after 5s. Locally on Windows, DNS fails fast so it passes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 13:44:12 -05:00
Cole Medin
1e513dfc00 Merge branch 'dev' of https://github.com/coleam00/Archon into dev 2026-04-08 13:42:29 -05:00
Rasmus Widing
6af6d1117d chore(homebrew): update formula to v0.3.2 2026-04-08 15:51:39 +03:00
github-actions[bot]
1a8a1dbdff chore: update Homebrew formula for v0.3.2 2026-04-08 12:51:22 +00:00
Rasmus Widing
8b3fbb7c23
Merge pull request #994 from coleam00/dev
Release 0.3.2
2026-04-08 15:49:50 +03:00
Rasmus Widing
2bdf87b711 Release 0.3.2 2026-04-08 15:48:22 +03:00
Rasmus Widing
83bd44d30a
fix(env-leak-gate): skip pre-spawn scan for unregistered cwd paths (#991) (#992)
The pre-spawn env-leak guard in ClaudeClient.sendQuery() and
CodexClient.sendQuery() used `if (!codebase?.allow_env_keys)`, which
evaluates truthy when `codebase` is null. Any sendQuery call with an
unregistered cwd (title generation, codebase-less orchestrator runs,
DAG executor paths) ran the sensitive-key scanner and threw
EnvLeakError, blocking every conversation creation on deployed servers
with a .env in scope.

The pre-spawn check is defense-in-depth for registered codebases
without explicit consent. Registration (registerRepoAtPath) is the
canonical gate; unregistered cwd paths are out of scope.

Changes:
- claude.ts: tighten predicate to `codebase && !codebase.allow_env_keys`
- codex.ts: same fix
- claude.test.ts: add regression test for unregistered cwd; update
  existing tests that relied on the null-codebase path to use a
  registered codebase with allow_env_keys: false
- codex.test.ts: same test updates

Fixes #991
2026-04-08 15:43:25 +03:00
Rasmus Widing
e63a0ee3ac
fix(claude): embed SDK cli.js via /embed entry point for compiled binaries (#990)
The Claude Agent SDK falls back to resolving its cli.js via
import.meta.url when pathToClaudeCodeExecutable is not set. In
bun build --compile binaries, import.meta.url of a bundled module
is frozen at build time to the build host's absolute node_modules
path — so every binary shipped from CI carried a
/Users/runner/work/Archon/Archon/node_modules/.bun/... path that
only existed on the GitHub Actions runner, and every workflow run
failed with 'Module not found' after three retries.

The SDK ships a dedicated /embed entry point for exactly this case:
it uses 'import cli.js with { type: "file" }' so bun embeds cli.js
into the compiled binary's $bunfs virtual filesystem, then extracts
it to a real temp path at runtime so the subprocess can exec it.

Verified by building a local binary (scripts/build-binaries.sh) and
running 'workflow run assist' — the binary now spawns Claude
successfully where v0.3.1 fails before the first token.

This bug has been latent since bun build --compile was first wired
up; it surfaced in v0.3.1 because that was the first release where
the homebrew formula SHAs were correct and a user could actually
install the binary.
2026-04-08 15:33:28 +03:00
Rasmus Widing
ad470c2848 chore(homebrew): update formula to v0.3.1 2026-04-08 15:15:42 +03:00
github-actions[bot]
85bfe4cb45 chore: update Homebrew formula for v0.3.1 2026-04-08 12:15:39 +00:00
Cole Medin
688aeeabbe fix(web): clarify server fallback chain comment for absent codebaseCwd
Update inline comment to accurately describe the two-step server fallback:
when cwd is absent the server first tries the first registered codebase
before falling back to bundled defaults. The previous comment skipped the
intermediate step, which could confuse developers debugging unexpected
workflow resolution for "No project" runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 16:34:33 -05:00
Cole Medin
a09d198a8e fix(web): allow workflow graph view to load without a codebase (#958)
The useQuery for workflow definitions required both workflowName and
codebaseCwd to be truthy. For CLI-triggered runs or "No project" web
runs where codebase_id is null, codebaseCwd stays undefined and the
query never fires — showing "Loading graph..." forever. The server
already handles missing cwd by falling back to bundled defaults, so
the client gate only needs workflowName.

Fixes #958

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:20:45 -05:00
301 changed files with 29303 additions and 7535 deletions


@@ -131,28 +131,30 @@ git status
 ### 3.2 Decision Tree
-```
+```text
 ┌─ IN WORKTREE?
-│ └─ YES → Use it (assume it's for this work)
-│ Log: "Using worktree at {path}"
+│ └─ YES → Use current branch AS-IS. Do NOT switch branches. Do NOT create
+│ new branches. The isolation system has already set up the correct
+│ branch; any deviation operates on the wrong code.
+│ Log: "Using worktree at {path} on branch {branch}"
-├─ ON MAIN/MASTER?
+├─ ON $BASE_BRANCH? (main, master, or configured base branch)
 │ └─ Q: Working directory clean?
 │ ├─ YES → Create branch: fix/issue-{number}-{slug}
 │ │ git checkout -b fix/issue-{number}-{slug}
-│ └─ NO → Warn user:
-│ "Working directory has uncommitted changes.
-│ Please commit or stash before proceeding."
-│ STOP
+│ │ (only applies outside a worktree — e.g., manual CLI usage)
+│ └─ NO → STOP: "Uncommitted changes on $BASE_BRANCH.
+│ Please commit or stash before proceeding."
-├─ ON FEATURE/FIX BRANCH?
-│ └─ Use it (assume it's for this work)
+├─ ON OTHER BRANCH?
+│ └─ Use it AS-IS (assume it was set up for this work).
+│ Do NOT switch to another branch (e.g., one shown by `git branch` but
+│ not currently checked out).
 │ If branch name doesn't contain issue number:
 │ Warn: "Branch '{name}' may not be for issue #{number}"
 └─ DIRTY STATE?
-└─ Warn and suggest: git stash or git commit
-STOP
+└─ STOP: "Uncommitted changes. Please commit or stash first."
 ```
### 3.3 Ensure Up-to-Date


@@ -132,28 +132,30 @@ git status
 ### 3.2 Decision Tree
-```
+```text
 ┌─ IN WORKTREE?
-│ └─ YES → Use it (assume it's for this work)
-│ Log: "Using worktree at {path}"
+│ └─ YES → Use current branch AS-IS. Do NOT switch branches. Do NOT create
+│ new branches. The isolation system has already set up the correct
+│ branch; any deviation operates on the wrong code.
+│ Log: "Using worktree at {path} on branch {branch}"
-├─ ON MAIN/MASTER?
+├─ ON $BASE_BRANCH? (main, master, or configured base branch)
 │ └─ Q: Working directory clean?
 │ ├─ YES → Create branch: fix/issue-{number}-{slug}
 │ │ git checkout -b fix/issue-{number}-{slug}
-│ └─ NO → Warn user:
-│ "Working directory has uncommitted changes.
-│ Please commit or stash before proceeding."
-│ STOP
+│ │ (only applies outside a worktree — e.g., manual CLI usage)
+│ └─ NO → STOP: "Uncommitted changes on $BASE_BRANCH.
+│ Please commit or stash before proceeding."
-├─ ON FEATURE/FIX BRANCH?
-│ └─ Use it (assume it's for this work)
+├─ ON OTHER BRANCH?
+│ └─ Use it AS-IS (assume it was set up for this work).
+│ Do NOT switch to another branch (e.g., one shown by `git branch` but
+│ not currently checked out).
 │ If branch name doesn't contain issue number:
 │ Warn: "Branch '{name}' may not be for issue #{number}"
 └─ DIRTY STATE?
-└─ Warn and suggest: git stash or git commit
-STOP
+└─ STOP: "Uncommitted changes. Please commit or stash first."
 ```
### 3.3 Ensure Up-to-Date


@@ -93,19 +93,40 @@ Provide a valid plan path or GitHub issue containing the plan.
 ### 2.1 Check Current State
 ```bash
 # What branch are we on?
 git branch --show-current
-git status --porcelain
+# Are we in a worktree?
+git rev-parse --show-toplevel
+git worktree list
+# Is working directory clean?
+git status --porcelain
 ```
 ### 2.2 Branch Decision
-| Current State | Action |
-| ----------------- | ---------------------------------------------------- |
-| In worktree | Use it (log: "Using worktree") |
-| On base branch, clean | Create branch: `git checkout -b feature/{plan-slug}` |
-| On base branch, dirty | STOP: "Stash or commit changes first" |
-| On feature branch | Use it (log: "Using existing branch") |
+```text
+┌─ IN WORKTREE?
+│ └─ YES → Use current branch AS-IS. Do NOT switch branches. Do NOT create
+│ new branches. The isolation system has already set up the correct
+│ branch; any deviation operates on the wrong code.
+│ Log: "Using worktree at {path} on branch {branch}"
+├─ ON $BASE_BRANCH? (main, master, or configured base branch)
+│ └─ Q: Working directory clean?
+│ ├─ YES → Create branch: git checkout -b feature/{plan-slug}
+│ │ (only applies outside a worktree — e.g., manual CLI usage)
+│ └─ NO → STOP: "Stash or commit changes first"
+├─ ON OTHER BRANCH?
+│ └─ Use it AS-IS. Do NOT switch to another branch (e.g., one shown by
+│ `git branch` but not currently checked out).
+│ Log: "Using existing branch {name}"
+└─ DIRTY STATE?
+└─ STOP: "Stash or commit changes first"
+```
### 2.3 Sync with Remote
@@ -116,7 +137,7 @@ git pull --rebase origin $BASE_BRANCH 2>/dev/null || true
 **PHASE_2_CHECKPOINT:**
-- [ ] On correct branch (not base branch with uncommitted work)
+- [ ] On correct branch (not $BASE_BRANCH with uncommitted work)
 - [ ] Working directory ready
 - [ ] Up to date with remote


@@ -112,13 +112,26 @@ gh repo view --json nameWithOwner -q .nameWithOwner
 ### 2.3 Branch Decision
-| Current State | Action |
-|---------------|--------|
-| Already on correct feature branch | Use it, log "Using existing branch: {name}" |
-| On base branch, clean working directory | Create and checkout: `git checkout -b {branch-name}` |
-| On base branch, dirty working directory | STOP with error: "Uncommitted changes on base branch. Stash or commit first." |
-| On different feature branch | STOP with error: "On branch {X}, expected {Y}. Switch branches or adjust plan." |
-| In a worktree | Use the worktree's branch, log "Using worktree branch: {name}" |
+Evaluate in order (first matching case wins):
+```text
+┌─ IN WORKTREE?
+│ └─ YES → Use current branch AS-IS. Do NOT switch branches. Do NOT create
+│ new branches. The isolation system has already set up the correct
+│ branch; any deviation operates on the wrong code.
+│ Log: "Using worktree branch: {name}"
+├─ ON $BASE_BRANCH? (main, master, or configured base branch)
+│ └─ Q: Working directory clean?
+│ ├─ YES → Create and checkout: `git checkout -b {branch-name}`
+│ │ (only applies outside a worktree — e.g., manual CLI usage)
+│ └─ NO → STOP: "Uncommitted changes on $BASE_BRANCH. Stash or commit first."
+└─ ON OTHER BRANCH?
+└─ Q: Does it match the expected branch for this plan?
+├─ YES → Use it, log "Using existing branch: {name}"
+└─ NO → STOP: "On branch {X}, expected {Y}. Switch branches or adjust plan."
+```
### 2.4 Sync with Remote


@@ -0,0 +1,13 @@
---
description: E2E test command — echoes back the user message
argument-hint: <any text>
---
# E2E Echo Command
You are a simple echo agent for testing. Your ONLY job is to repeat back the user's message.
User message: $ARGUMENTS
Respond with EXACTLY this format and nothing else:
command-echo: <the user message above>


@@ -0,0 +1,3 @@
// Simple script node test — echoes input as JSON
const input = process.argv[2] ?? 'no-input';
console.log(JSON.stringify({ echoed: input, timestamp: new Date().toISOString() }));


@@ -0,0 +1,7 @@
"""Simple script node test — echoes input as JSON (uv/Python runtime)."""
import json
import sys
from datetime import datetime, timezone
input_val = sys.argv[1] if len(sys.argv) > 1 else "no-input"
print(json.dumps({"echoed": input_val, "timestamp": datetime.now(timezone.utc).isoformat()}))


@@ -52,6 +52,7 @@ nodes:
 - id: foundation-gate
   approval:
     message: "Answer the foundation questions above. Your answers will guide the research phase."
+    capture_response: true
   depends_on: [initiate]
 # ═══════════════════════════════════════════════════════════════
@@ -106,6 +107,7 @@ nodes:
 - id: deepdive-gate
   approval:
     message: "Answer the deep dive questions above (vision, primary user, JTBD, constraints). Add any adjustments from the research."
+    capture_response: true
   depends_on: [research]
 # ═══════════════════════════════════════════════════════════════
@@ -172,6 +174,7 @@ nodes:
 - id: scope-gate
   approval:
     message: "Answer the scope questions above (MVP, must-haves, hypothesis, exclusions). This is the final input before PRD generation."
+    capture_response: true
   depends_on: [technical]
 # ═══════════════════════════════════════════════════════════════
@@ -188,11 +191,11 @@ nodes:
 **Deep dive answers**: $deepdive-gate.output
 **Scope answers**: $scope-gate.output
-Generate a complete PRD file at `.claude/PRPs/prds/{kebab-case-name}.prd.md`.
+Generate a complete PRD file at `$ARTIFACTS_DIR/prds/{kebab-case-name}.prd.md`.
 First create the directory:
 ```bash
-mkdir -p .claude/PRPs/prds
+mkdir -p $ARTIFACTS_DIR/prds
 ```
 **First principles rule**: Before writing the Technical Approach section, READ the
@@ -243,9 +246,9 @@ nodes:
 Read the PRD file that was just generated. The generate node output the file path:
 $generate.output
-Find the PRD file — check `.claude/PRPs/prds/` for the most recently created `.prd.md` file:
+Find the PRD file — check `$ARTIFACTS_DIR/prds/` for the most recently created `.prd.md` file:
 ```bash
-ls -t .claude/PRPs/prds/*.prd.md | head -1
+ls -t $ARTIFACTS_DIR/prds/*.prd.md | head -1
 ```
 Read the entire PRD, then verify EVERY technical claim against the actual codebase:


@@ -0,0 +1,26 @@
# E2E smoke test — Claude provider
# Verifies: Claude connectivity (sendQuery), $nodeId.output refs
# Design: Only uses allowed_tools: [] (no tool use) and no output_format (no structured output)
# because the Claude CLI subprocess is slow with those features in CI.
name: e2e-claude-smoke
description: "Smoke test for Claude provider. Verifies prompt response."
provider: claude
model: haiku
nodes:
# 1. Simple prompt — verifies Claude API connectivity via sendQuery
- id: simple
prompt: "What is 2+2? Answer with just the number, nothing else."
allowed_tools: []
idle_timeout: 30000
# 2. Assert non-empty output — fails CI if Claude returned nothing
- id: assert
bash: |
output="$simple.output"
if [ -z "$output" ]; then
echo "FAIL: simple node returned empty output"
exit 1
fi
echo "PASS: simple=$output"
depends_on: [simple]


@@ -0,0 +1,40 @@
# E2E smoke test — Codex provider
# Verifies: provider selection, sendQuery, structured output
name: e2e-codex-smoke
description: "E2E smoke test for Codex provider. Runs a simple prompt + structured output node."
provider: codex
model: gpt-5.2
nodes:
- id: simple
prompt: "What is 2+2? Answer with just the number, nothing else."
idle_timeout: 30000
- id: structured
prompt: "Classify this input as 'math' or 'text': '2+2=4'. Return JSON only."
output_format:
type: object
properties:
category:
type: string
enum: ["math", "text"]
required: ["category"]
additionalProperties: false
idle_timeout: 30000
depends_on: [simple]
# Assert both nodes returned output
- id: assert
bash: |
simple_out="$simple.output"
structured_out="$structured.output"
if [ -z "$simple_out" ]; then
echo "FAIL: simple node returned empty output"
exit 1
fi
if [ -z "$structured_out" ]; then
echo "FAIL: structured node returned empty output"
exit 1
fi
echo "PASS: simple=$simple_out structured=$structured_out"
depends_on: [simple, structured]


@@ -0,0 +1,66 @@
# E2E smoke test — deterministic nodes (no AI, no API calls)
# Verifies: bash nodes, script nodes (bun + uv), $nodeId.output substitution,
# when conditions, trigger_rule join semantics
name: e2e-deterministic
description: "Pure DAG engine test. Exercises bash, script (bun/uv), conditions, and trigger rules with zero API calls."
nodes:
# Layer 0 — parallel deterministic nodes
- id: bash-echo
bash: "echo '{\"status\":\"ok\",\"value\":42}'"
- id: script-bun
script: echo-args
runtime: bun
timeout: 30000
- id: script-python
script: echo-py
runtime: uv
timeout: 30000
# Layer 1 — test $nodeId.output substitution from bash
- id: bash-read-output
bash: "echo 'upstream-status: $bash-echo.output'"
depends_on: [bash-echo]
# Layer 1 — conditional branches (only one should run)
- id: branch-true
bash: "echo 'branch-true-ran'"
depends_on: [bash-echo]
when: "$bash-echo.output.status == 'ok'"
- id: branch-false
bash: "echo 'branch-false-ran'"
depends_on: [bash-echo]
when: "$bash-echo.output.status == 'fail'"
# Layer 2 — trigger_rule merge (one_success: branch-false will be skipped)
- id: merge-node
bash: "echo 'merge-ok: true=$branch-true.output false=$branch-false.output'"
depends_on: [branch-true, branch-false]
trigger_rule: one_success
# Layer 3 — final verification: assert all outputs are non-empty
- id: verify-all
bash: |
fail=0
for name in bash-echo script-bun script-python bash-read-output branch-true merge-node; do
echo "$name output received"
done
bash_echo="$bash-echo.output"
script_bun="$script-bun.output"
script_python="$script-python.output"
bash_read="$bash-read-output.output"
branch_t="$branch-true.output"
merge="$merge-node.output"
if [ -z "$bash_echo" ]; then echo "FAIL: bash-echo empty"; fail=1; fi
if [ -z "$script_bun" ]; then echo "FAIL: script-bun empty"; fail=1; fi
if [ -z "$script_python" ]; then echo "FAIL: script-python empty"; fail=1; fi
if [ -z "$bash_read" ]; then echo "FAIL: bash-read-output empty"; fail=1; fi
if [ -z "$branch_t" ]; then echo "FAIL: branch-true empty"; fail=1; fi
if [ -z "$merge" ]; then echo "FAIL: merge-node empty"; fail=1; fi
if [ "$fail" -eq 1 ]; then exit 1; fi
echo "PASS: all deterministic nodes produced output"
depends_on: [bash-read-output, script-bun, script-python, merge-node]
trigger_rule: all_success


@@ -0,0 +1,38 @@
# E2E smoke test — mixed providers (Claude + Codex in same workflow)
# Verifies: per-node provider override, cross-provider $nodeId.output refs
name: e2e-mixed-providers
description: "Tests Claude and Codex providers in the same workflow with cross-provider output refs."
# Default provider is claude
provider: claude
model: haiku
nodes:
# 1. Claude node — default provider
- id: claude-node
prompt: "Say 'claude-ok' and nothing else."
allowed_tools: []
idle_timeout: 30000
# 2. Codex node — provider override (runs parallel with claude-node, different providers)
- id: codex-node
prompt: "Say 'codex-ok' and nothing else."
provider: codex
model: gpt-5.2
idle_timeout: 30000
# 3. Assert both providers returned output
- id: assert
bash: |
claude_out="$claude-node.output"
codex_out="$codex-node.output"
if [ -z "$claude_out" ]; then
echo "FAIL: claude-node returned empty output"
exit 1
fi
if [ -z "$codex_out" ]; then
echo "FAIL: codex-node returned empty output"
exit 1
fi
echo "PASS: claude=$claude_out codex=$codex_out"
depends_on: [claude-node, codex-node]


@@ -0,0 +1,105 @@
# E2E smoke test — Pi provider, every node type
# Covers: prompt, command, loop (AI node types) + bash, script bun/uv
# (deterministic node types) + depends_on / when / trigger_rule / $nodeId.output
# (DAG features).
# Skipped: `approval:` — pauses for human input, incompatible with CI.
# Auth: ANTHROPIC_API_KEY (CI) or your local `pi /login` OAuth.
# Expected runtime: ~10s on haiku (3 AI round-trips + deterministic nodes).
name: e2e-pi-all-nodes-smoke
description: 'Pi provider smoke across every CI-compatible node type.'
provider: pi
model: anthropic/claude-haiku-4-5
nodes:
# ─── AI node types ──────────────────────────────────────────────────────
# 1. prompt: inline prompt (simplest AI node)
- id: prompt-node
prompt: "Reply with exactly the single word 'ok' and nothing else."
allowed_tools: []
effort: low
idle_timeout: 30000
# 2. command: named command file (.archon/commands/e2e-echo-command.md)
# The command echoes back $ARGUMENTS (the workflow invocation message).
- id: command-node
command: e2e-echo-command
allowed_tools: []
idle_timeout: 30000
# 3. loop: iterative AI prompt until completion signal
# Bounded by max_iterations: 2 so a misbehaving model can't hang CI.
- id: loop-node
loop:
prompt: "Reply with exactly 'DONE' and nothing else."
until: 'DONE'
max_iterations: 2
allowed_tools: []
effort: low
idle_timeout: 60000
# ─── Deterministic node types (no AI) ───────────────────────────────────
# 4. bash: shell script with JSON output (enables $nodeId.output.status
# dot-access downstream)
- id: bash-json-node
bash: 'echo ''{"status":"ok"}'''
# 5. script: bun (TypeScript/JavaScript runtime)
- id: script-bun-node
script: echo-args
runtime: bun
timeout: 30000
# 6. script: uv (Python runtime)
- id: script-python-node
script: echo-py
runtime: uv
timeout: 30000
# ─── DAG features ───────────────────────────────────────────────────────
# 7. depends_on + $nodeId.output substitution
- id: downstream
bash: "echo 'downstream got: $prompt-node.output'"
depends_on: [prompt-node]
# 8. when: conditional (JSON dot-access on upstream output)
- id: gated
bash: "echo 'gated-ok'"
depends_on: [bash-json-node]
when: "$bash-json-node.output.status == 'ok'"
# 9. trigger_rule: merge multiple deps (all_success semantics)
- id: merge
bash: "echo 'merge-ok'"
depends_on: [downstream, gated, script-bun-node, script-python-node]
trigger_rule: all_success
# ─── Final assertion ────────────────────────────────────────────────────
# 10. Verify every upstream node produced non-empty output.
- id: assert
bash: |
fail=0
check() {
local name="$1"
local value="$2"
if [ -z "$value" ]; then
echo "FAIL: $name produced empty output"
fail=1
fi
}
check prompt-node "$prompt-node.output"
check command-node "$command-node.output"
check loop-node "$loop-node.output"
check bash-json-node "$bash-json-node.output"
check script-bun-node "$script-bun-node.output"
check script-python-node "$script-python-node.output"
check downstream "$downstream.output"
check gated "$gated.output"
check merge "$merge.output"
if [ "$fail" -eq 1 ]; then exit 1; fi
echo "PASS: all 9 node types produced output"
depends_on: [merge, loop-node, command-node]
trigger_rule: all_success


@@ -0,0 +1,35 @@
# E2E smoke test — Pi community provider
# Verifies: Pi connectivity, $nodeId.output refs, async-queue bridge,
# and v2 wiring (thinkingLevel, allowed_tools).
# Design: mirrors e2e-claude-smoke.yaml structure. The `allowed_tools: []`
# idiom disables Pi's built-in read/bash/edit/write so the smoke stays
# fast (no tool-use round-trips). `thinking: minimal` exercises the
# thinkingLevel translation path.
# Auth: picks up either ANTHROPIC_API_KEY env var (CI) or your local
# `pi /login` OAuth credentials from ~/.pi/agent/auth.json.
name: e2e-pi-smoke
description: 'Smoke test for Pi community provider. Verifies prompt response via sendQuery.'
provider: pi
model: anthropic/claude-haiku-4-5
nodes:
# 1. Simple prompt — verifies Pi harness starts, AsyncQueue bridge yields
# assistant chunks, and agent_end produces a result chunk. v2 wiring:
# allowed_tools: [] disables all Pi tools (LLM-only); effort: low is
# translated to Pi's thinkingLevel by options-translator.ts.
- id: simple
prompt: 'What is 2+2? Answer with just the number, nothing else.'
allowed_tools: []
effort: low
idle_timeout: 30000
# 2. Assert non-empty output — fails CI if Pi returned nothing
- id: assert
bash: |
output="$simple.output"
if [ -z "$output" ]; then
echo "FAIL: simple node returned empty output"
exit 1
fi
echo "PASS: simple=$output"
depends_on: [simple]


@@ -0,0 +1,34 @@
# E2E smoke test — workflow-level worktree.enabled: false
# Verifies: when a workflow pins worktree.enabled: false, runs happen in the
# live repo checkout (no worktree created, cwd == repo root). Zero AI calls.
name: e2e-worktree-disabled
description: "Pinned-isolation-off smoke. Asserts cwd is the repo root rather than a worktree path, regardless of how the workflow is invoked."
worktree:
enabled: false
nodes:
# Print cwd so the operator can eyeball it, and capture for the assertion node.
- id: print-cwd
bash: "pwd"
# Assertion: cwd must NOT contain '/.archon/workspaces/' — if it does, the
# policy was ignored and a worktree was created anyway. We also assert the
# cwd ends with a git repo (has a .git directory or file visible).
- id: assert-live-checkout
bash: |
cwd="$(pwd)"
echo "assert-live-checkout cwd=$cwd"
case "$cwd" in
*/.archon/workspaces/*/worktrees/*)
echo "FAIL: workflow ran inside a worktree ($cwd) despite worktree.enabled: false"
exit 1
;;
esac
if [ ! -e "$cwd/.git" ]; then
echo "FAIL: cwd $cwd is not a git checkout root (.git missing)"
exit 1
fi
echo "PASS: ran in live checkout (no worktree created by policy)"
depends_on: [print-cwd]
trigger_rule: all_success

File diff suppressed because it is too large.


@@ -0,0 +1,417 @@
# Investigation: Interactive PRD Workflow — Three Related Bugs
**Issues**: #1001, #1002, #1003
**URLs**:
- https://github.com/coleam00/Archon/issues/1001
- https://github.com/coleam00/Archon/issues/1002
- https://github.com/coleam00/Archon/issues/1003
**Type**: BUG (all three)
**Investigated**: 2026-04-09
### Assessment
| Metric | Value | Reasoning |
|------------|--------|-----------|
| Severity | HIGH | The interactive-prd workflow is completely non-functional: user answers are lost (#1001), output file can't be written (#1002), and CLI creates fragmented conversations (#1003). No workaround exists. |
| Complexity | MEDIUM | 3 files affected, but changes are isolated — YAML config fix, one return type addition, one conversation ID lookup. No architectural changes. |
| Confidence | HIGH | All three root causes are confirmed with code evidence. The fixes are straightforward and well-understood. |
---
## Problem Statement
The `archon-interactive-prd` workflow has three bugs that together make it completely non-functional:
1. Approval gates don't capture user responses, so all downstream nodes receive empty strings
2. The generate node writes to `.claude/` which the Claude SDK blocks
3. CLI `workflow approve` creates new conversations instead of reusing the original, fragmenting the Web UI
---
## Analysis
### Bug 1: Missing `capture_response: true` (#1001)
**Evidence Chain:**
WHY: Downstream nodes receive empty `$foundation-gate.output`, `$deepdive-gate.output`, `$scope-gate.output`
↓ BECAUSE: `approveWorkflow()` stores empty string as node output
Evidence: `packages/core/src/operations/workflow-operations.ts:176`
```typescript
const nodeOutput = approval.captureResponse === true ? approvalComment : '';
```
↓ BECAUSE: `captureResponse` is `undefined` (falsy) in the pause metadata
Evidence: `packages/workflows/src/dag-executor.ts:2337`
```typescript
captureResponse: node.approval.capture_response, // undefined when not set in YAML
```
↓ ROOT CAUSE: All three approval gates in the YAML lack `capture_response: true`
Evidence: `.archon/workflows/defaults/archon-interactive-prd.yaml:52-55, 106-109, 172-175`
**Git History:**
- **Introduced**: `9de8cf2c` (Mar 30) — workflow created before `capture_response` existed
- **Feature added**: #936 (Apr 2) — added `capture_response` support
- **Never updated**: The workflow was not retrofitted after #936
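The falsy chain above can be condensed into a minimal sketch (the interface and function here are simplified stand-ins, not the real `workflow-operations.ts` signatures):

```typescript
// Minimal sketch of the capture decision (names are illustrative stand-ins).
interface PauseMetadata {
  captureResponse?: boolean; // stays undefined when capture_response is absent in YAML
}

function resolveNodeOutput(approval: PauseMetadata, approvalComment: string): string {
  // Strict equality: undefined and false both collapse to '' — the user's
  // answer is dropped unless the gate opted in explicitly.
  return approval.captureResponse === true ? approvalComment : '';
}

console.log(resolveNodeOutput({}, 'My answer'));                        // ''
console.log(resolveNodeOutput({ captureResponse: true }, 'My answer')); // 'My answer'
```

This is why the fix is purely a YAML change: the executor and operations code behave correctly once the gates set `capture_response: true`.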
### Bug 2: Generate node writes to blocked `.claude/` path (#1002)
**Evidence Chain:**
WHY: `generate` node fails or produces truncated output
↓ BECAUSE: Claude SDK blocks writes to `.claude/` directory (hardcoded safety boundary)
↓ ROOT CAUSE: The prompt instructs AI to write to `.claude/PRPs/prds/`
Evidence: `.archon/workflows/defaults/archon-interactive-prd.yaml:191-196`
```yaml
Generate a complete PRD file at `.claude/PRPs/prds/{kebab-case-name}.prd.md`.
First create the directory:
```bash
mkdir -p .claude/PRPs/prds
```
```
The `validate` node also reads from the same blocked path at line 246-249.
`$ARTIFACTS_DIR` is pre-created by the executor at `packages/workflows/src/executor.ts:468` and is always writable (resolves to `~/.archon/workspaces/{owner}/{repo}/artifacts/runs/{runId}/`).
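A quick illustration of why the path swap fixes the bug (a temp directory stands in for the real artifacts root here; the filename is hypothetical):

```shell
# Hypothetical demo: writes under $ARTIFACTS_DIR succeed because the directory
# is pre-created and sits outside the SDK-blocked .claude/ tree.
ARTIFACTS_DIR="$(mktemp -d)"
mkdir -p "$ARTIFACTS_DIR/prds"
printf '# Example PRD\n' > "$ARTIFACTS_DIR/prds/example.prd.md"
# Same lookup idiom the validate node uses: newest .prd.md wins.
ls -t "$ARTIFACTS_DIR"/prds/*.prd.md | head -1
```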
### Bug 3: CLI approve creates new conversations (#1003)
**Evidence Chain:**
WHY: Each approval creates a separate conversation in the Web UI
↓ BECAUSE: `workflowRunCommand` always generates a new conversation ID
Evidence: `packages/cli/src/commands/workflow.ts:272`
```typescript
const conversationId = generateConversationId(); // always new cli-{ts}-{rand}
```
↓ BECAUSE: `workflowApproveCommand` calls `workflowRunCommand` without passing the original conversation ID
Evidence: `packages/cli/src/commands/workflow.ts:865`
```typescript
await workflowRunCommand(result.workingPath, result.workflowName, result.userMessage ?? '', {
resume: true,
codebaseId: result.codebaseId ?? undefined,
});
```
↓ ROOT CAUSE: `ApprovalOperationResult` doesn't include `conversationId`, so the approve command can't pass it through
Evidence: `packages/core/src/operations/workflow-operations.ts:32-38`
```typescript
export interface ApprovalOperationResult {
  workflowName: string;
  workingPath: string | null;
  userMessage: string | null;
  codebaseId: string | null;
  type: 'interactive_loop' | 'approval_gate';
}
```
The workflow run DOES store `conversation_id` (DB UUID) — `packages/core/src/db/workflows.ts:52`. And `getConversationById(id)` at `packages/core/src/db/conversations.ts:19` can retrieve the `platform_conversation_id` from the DB UUID.
### Affected Files
| File | Lines | Action | Description |
|------|-------|--------|-------------|
| `.archon/workflows/defaults/archon-interactive-prd.yaml` | 52-55, 106-109, 172-175, 191-196, 246-249 | UPDATE | Add `capture_response: true` to gates; change `.claude/PRPs/prds/` to `$ARTIFACTS_DIR/prds/` |
| `packages/core/src/operations/workflow-operations.ts` | 32-38, 202-208 | UPDATE | Add `conversationId` to `ApprovalOperationResult` and return it from `approveWorkflow` |
| `packages/cli/src/commands/workflow.ts` | 60-69, 272, 849-880 | UPDATE | Add `conversationId` to `WorkflowRunOptions`, accept it, pass it through from approve |
### Integration Points
- `packages/core/src/operations/workflow-operations.ts:202` returns result consumed by CLI approve command
- `packages/cli/src/commands/workflow.ts:865` calls `workflowRunCommand` with the result
- `packages/cli/src/commands/workflow.ts:272` generates conversation ID (needs to accept override)
- `packages/core/src/db/conversations.ts:19` `getConversationById()` — needed to look up platform ID from DB UUID
- `packages/workflows/src/dag-executor.ts:2337` reads `capture_response` from node YAML
- `packages/core/src/operations/workflow-operations.ts:176` uses `captureResponse` to decide output
- `RejectionOperationResult` at line 40-49 should also get `conversationId` for consistency (same pattern)
---
## Implementation Plan
### Step 1: Add `capture_response: true` to all three approval gates
**File**: `.archon/workflows/defaults/archon-interactive-prd.yaml`
**Action**: UPDATE
**Current code (line 52-55):**
```yaml
- id: foundation-gate
  approval:
    message: "Answer the foundation questions above. Your answers will guide the research phase."
  depends_on: [initiate]
```
**Required change:**
```yaml
- id: foundation-gate
  approval:
    message: "Answer the foundation questions above. Your answers will guide the research phase."
    capture_response: true
  depends_on: [initiate]
```
**Same change for `deepdive-gate` (line 106-109) and `scope-gate` (line 172-175).**
---
### Step 2: Change output path from `.claude/PRPs/prds/` to `$ARTIFACTS_DIR/prds/`
**File**: `.archon/workflows/defaults/archon-interactive-prd.yaml`
**Action**: UPDATE
**In `generate` node (line 191):**
Change:
````
Generate a complete PRD file at `.claude/PRPs/prds/{kebab-case-name}.prd.md`.
First create the directory:
```bash
mkdir -p .claude/PRPs/prds
```
````
To:
````
Generate a complete PRD file at `$ARTIFACTS_DIR/prds/{kebab-case-name}.prd.md`.
First create the directory:
```bash
mkdir -p $ARTIFACTS_DIR/prds
```
````
**In `validate` node (line 246-249):**
Change:
````
Find the PRD file — check `.claude/PRPs/prds/` for the most recently created `.prd.md` file:
```bash
ls -t .claude/PRPs/prds/*.prd.md | head -1
```
````
To:
````
Find the PRD file — check `$ARTIFACTS_DIR/prds/` for the most recently created `.prd.md` file:
```bash
ls -t $ARTIFACTS_DIR/prds/*.prd.md | head -1
```
````
---
### Step 3: Add `conversationId` to `ApprovalOperationResult` and `RejectionOperationResult`
**File**: `packages/core/src/operations/workflow-operations.ts`
**Action**: UPDATE
**Current (line 32-38):**
```typescript
export interface ApprovalOperationResult {
  workflowName: string;
  workingPath: string | null;
  userMessage: string | null;
  codebaseId: string | null;
  type: 'interactive_loop' | 'approval_gate';
}
```
**Required change:**
```typescript
export interface ApprovalOperationResult {
  workflowName: string;
  workingPath: string | null;
  userMessage: string | null;
  codebaseId: string | null;
  conversationId: string;
  type: 'interactive_loop' | 'approval_gate';
}
```
**Same for `RejectionOperationResult` (line 40-49).**
---
### Step 4: Return `conversationId` from `approveWorkflow`
**File**: `packages/core/src/operations/workflow-operations.ts`
**Action**: UPDATE
**Current return (line 166-172 for interactive_loop path):**
```typescript
return {
  workflowName: run.workflow_name,
  workingPath: run.working_path,
  userMessage: run.user_message,
  codebaseId: run.codebase_id,
  type: 'interactive_loop',
};
```
**Required change:**
```typescript
return {
  workflowName: run.workflow_name,
  workingPath: run.working_path,
  userMessage: run.user_message,
  codebaseId: run.codebase_id,
  conversationId: run.conversation_id,
  type: 'interactive_loop',
};
```
**Same for the approval_gate return at line 202-208.**
Also update `rejectWorkflow` return values (both paths) to include `conversationId: run.conversation_id`.
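As a minimal sketch of what the reject path ends up looking like, the interfaces below are restated locally so the shape is concrete; the real code lives in `workflow-operations.ts`, and `buildRejectionResult` is an illustrative helper name, not existing code:

```typescript
// Sketch only: mirrors the approve-path change on the reject path.
// Field names are taken from the interfaces in this plan; the helper itself is hypothetical.
interface RejectionOperationResult {
  workflowName: string;
  workingPath: string | null;
  userMessage: string | null;
  codebaseId: string | null;
  conversationId: string;
  type: 'interactive_loop' | 'approval_gate';
}

// Assumed shape of the workflow_runs row this operation reads.
interface WorkflowRunRow {
  workflow_name: string;
  working_path: string | null;
  user_message: string | null;
  codebase_id: string | null;
  conversation_id: string;
}

function buildRejectionResult(
  run: WorkflowRunRow,
  type: RejectionOperationResult['type'],
): RejectionOperationResult {
  return {
    workflowName: run.workflow_name,
    workingPath: run.working_path,
    userMessage: run.user_message,
    codebaseId: run.codebase_id,
    conversationId: run.conversation_id, // the new pass-through field
    type,
  };
}
```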
---
### Step 5: Look up original platform conversation ID and pass it through in CLI
**File**: `packages/cli/src/commands/workflow.ts`
**Action**: UPDATE
**5a. Add `conversationId` to `WorkflowRunOptions` (line 55-69):**
```typescript
interface WorkflowRunOptions {
  branchName?: string;
  fromBranch?: string;
  noWorktree?: boolean;
  resume?: boolean;
  codebaseId?: string;
  allowEnvKeys?: boolean;
  quiet?: boolean;
  verbose?: boolean;
  conversationId?: string; // Reuse existing conversation (e.g., from approve)
}
```
**5b. Use provided `conversationId` instead of generating new one (line 272):**
Change:
```typescript
const conversationId = generateConversationId();
```
To:
```typescript
const conversationId = options.conversationId ?? generateConversationId();
```
**5c. Look up platform conversation ID in `workflowApproveCommand` and pass it (line 849-880):**
After `const result = await approveWorkflow(runId, comment);` (line 850), add a lookup:
```typescript
// Look up the original platform conversation ID to keep all messages in one thread
const originalConversation = await conversationDb.getConversationById(result.conversationId);
const platformConversationId = originalConversation?.platform_conversation_id;
```
Then pass it to `workflowRunCommand`:
```typescript
await workflowRunCommand(result.workingPath, result.workflowName, result.userMessage ?? '', {
  resume: true,
  codebaseId: result.codebaseId ?? undefined,
  conversationId: platformConversationId ?? undefined,
});
```
**5d. Same pattern for `workflowRejectCommand` — when the rejection triggers a retry (not cancellation), pass the original conversation ID through.**
---
### Step 6: Update tests
**Tests to verify:**
1. **YAML validation**: Run `bun run cli validate workflows archon-interactive-prd` to confirm the YAML is valid after changes
2. **Type check**: `bun run type-check` to verify the interface changes compile
3. **Existing tests**: Run `bun run test` to ensure no regressions
**Test cases to consider adding:**
- Unit test for `approveWorkflow` verifying `conversationId` is included in return value
- Integration test that `workflowApproveCommand` passes conversation ID through (may require mocking)
---
## Patterns to Follow
**From codebase — approval gate with capture_response:**
The `capture_response` field is already used elsewhere and the schema supports it:
```typescript
// SOURCE: packages/workflows/src/schemas/dag-node.ts:250
capture_response: z.boolean().optional(),
```
**From codebase — conversation lookup by DB UUID:**
```typescript
// SOURCE: packages/core/src/db/conversations.ts:19-24
export async function getConversationById(id: string): Promise<Conversation | null> {
  const result = await pool.query<Conversation>(
    'SELECT * FROM remote_agent_conversations WHERE id = $1',
    [id]
  );
  return result.rows[0] ?? null;
}
```
**From codebase — $ARTIFACTS_DIR usage in other workflows:**
`$ARTIFACTS_DIR` is already used in other default workflows and is pre-created by the executor.
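For illustration only, a hypothetical node prompt using the variable (node id, prompt text, and dependency are made up, not from the repo):

```yaml
# Hypothetical node sketch: $ARTIFACTS_DIR is substituted by the
# executor's variable substitution before the prompt reaches the model.
- id: generate
  prompt: |
    Write the PRD to $ARTIFACTS_DIR/prds/output.prd.md
    (run `mkdir -p $ARTIFACTS_DIR/prds` first).
  depends_on: [scope-gate]
```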
---
## Edge Cases & Risks
| Risk/Edge Case | Mitigation |
|----------------|------------|
| Original conversation deleted between run start and approve | `getConversationById` returns null → fall back to generating new ID (graceful degradation) |
| Existing paused runs from before this fix | They work fine — `capture_response` defaults to `undefined`/false, same as current behavior. New runs will capture properly. |
| `$ARTIFACTS_DIR` substitution in YAML | Already supported by the executor's variable substitution. Verified in `executor.ts:178-210`. |
| Reject command also creates new conversation | Fixed in Step 5d — same pattern applied to `workflowRejectCommand` |
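The first mitigation in the table can be captured in a tiny helper; the names here are illustrative stand-ins, not the actual CLI code:

```typescript
// Sketch of the graceful-degradation rule: reuse the stored platform
// conversation ID when the row still exists, otherwise mint a new one.
interface ConversationRow {
  platform_conversation_id: string;
}

function resolveConversationId(
  original: ConversationRow | null,
  generateConversationId: () => string,
): string {
  return original?.platform_conversation_id ?? generateConversationId();
}
```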
---
## Validation
### Automated Checks
```bash
bun run type-check
bun run test
bun run lint
bun run cli validate workflows archon-interactive-prd
```
### Manual Verification
1. Run `bun run cli workflow run archon-interactive-prd "Build a todo app"` — verify it pauses at foundation-gate
2. Run `bun run cli workflow approve <id> "My answers..."` — verify the approve reuses the same conversation
3. Check that the `research` node receives the user's answers in `$foundation-gate.output`
4. Complete all gates and verify the PRD is written to `$ARTIFACTS_DIR/prds/` (not `.claude/PRPs/prds/`)
---
## Scope Boundaries
**IN SCOPE:**
- Adding `capture_response: true` to three approval gates in `archon-interactive-prd.yaml`
- Changing output path from `.claude/PRPs/prds/` to `$ARTIFACTS_DIR/prds/` in generate and validate nodes
- Adding `conversationId` to `ApprovalOperationResult` and `RejectionOperationResult`
- Returning `conversation_id` from approve/reject operations
- Looking up and passing through original conversation ID in CLI approve/reject commands
**OUT OF SCOPE (do not touch):**
- Other workflows or default commands
- The approval gate mechanism itself (it works correctly when `capture_response` is set)
- The executor's variable substitution logic
- Database schema changes (no migration needed)
- Chat platform approve commands (Slack/Telegram already handle conversation continuity differently)
---
## Metadata
- **Investigated by**: Claude
- **Timestamp**: 2026-04-09
- **Artifact**: `.claude/PRPs/issues/issue-1001-1002-1003.md`


@ -0,0 +1,538 @@
# Investigation: One-command web UI install via `archon serve`
**Issue**: #978 (https://github.com/coleam00/Archon/issues/978)
**Type**: ENHANCEMENT
**Investigated**: 2026-04-09T12:00:00Z
### Assessment
| Metric | Value | Reasoning |
| ---------- | ------ | --------------------------------------------------------------------------------------------------------------------------------- |
| Priority | MEDIUM | High user value (removes clone+build friction), but existing Docker path and clone path work; not blocking other work |
| Complexity | HIGH | 8+ files across CLI, server, CI, build scripts; server refactor is the hardest part — `main()` is 600 lines with no library API |
| Confidence | HIGH | Clear codebase analysis, all integration points mapped, no unknowns in the download/extract path; server refactor scope is bounded |
---
## Problem Statement
The compiled Archon CLI binary includes only `packages/cli/src/cli.ts` — no server, no web UI, no `archon serve` command. Users who want the web UI must clone the entire monorepo, install Bun, run `bun install` (2274 packages), and `bun dev`. There is no one-command path to get a working web UI from the binary install.
---
## Analysis
### Change Rationale
The web UI is the most discoverable part of the product, but it's behind the highest friction install path. The proposed approach — lazy-fetching a pre-built web UI tarball from GitHub releases on first `archon serve` — keeps the CLI binary small for CLI-only users while giving web UI users a one-command experience: `brew install coleam00/archon/archon && archon serve`.
### Key Design Decision: Server as Library vs Embedded Mini-Server
The current server (`packages/server/src/index.ts`) is a 721-line script with a monolithic `main()` function (line 129-718). It has no `startServer()` export and cannot be imported as a library. Two approaches:
**Option A: Full server refactor** — Extract `main()` into an exported `startServer(opts)` function, make `@archon/server` a dependency of `@archon/cli`, compile the full server into the binary. Binary grows from ~50MB to ~65MB. All platform adapters (Slack, Telegram, GitHub, Discord) would be compiled in.
**Option B: Minimal embedded server** — Create a lightweight Hono server in `packages/cli/src/commands/serve.ts` that only registers API routes + static serving. No platform adapters. Binary stays closer to current size. Uses `registerApiRoutes()` (already exported from `packages/server/src/routes/api.ts:837`) as the core building block.
**Recommendation: Option A (full refactor)** because:
- Option B would duplicate server initialization logic and diverge over time
- Platform adapters are only instantiated when env vars are present (all conditional, see `index.ts:296-459`) — zero cost if not configured
- The binary size increase (~15MB) is acceptable
- Users get the full server experience, not a subset
### Affected Files
| File | Lines | Action | Description |
|------|-------|--------|-------------|
| `packages/cli/src/commands/serve.ts` | NEW | CREATE | `archon serve` command: download web-dist, start server |
| `packages/cli/src/cli.ts` | 57-82, 231, 266+ | UPDATE | Add `'serve'` to `noGitCommands`, add `case 'serve'` |
| `packages/cli/package.json` | deps | UPDATE | Add `@archon/server` and `@archon/adapters` as dependencies |
| `packages/server/src/index.ts` | 129-718 | UPDATE | Extract `main()` into exported `startServer(opts)` |
| `packages/server/src/index.ts` | 579-593 | UPDATE | Accept `webDistPath` parameter instead of computing from `import.meta.dir` |
| `.github/workflows/release.yml` | 140-173 | UPDATE | Add web UI build + tarball upload step |
| `scripts/build-binaries.sh` | — | NONE | No change needed — `bun build --compile` follows imports automatically |
| `packages/paths/src/archon-paths.ts` | — | UPDATE | Add `getWebDistDir(version)` helper |
| Tests | NEW | CREATE | Cover download, checksum, extraction, server startup from CLI |
### Integration Points
- `packages/cli/src/cli.ts:57-82` imports all commands after dotenv setup
- `packages/server/src/routes/api.ts:837` exports `registerApiRoutes(app, webAdapter, lockManager)` — the only reusable server building block
- `packages/paths/src/bundled-build.ts` provides `BUNDLED_VERSION` for constructing release URLs
- `packages/paths/src/archon-paths.ts:56-74` provides `getArchonHome()` for cache location
- `packages/server/src/index.ts:581-593` resolves `webDistPath` from `import.meta.dir` — needs parameterization
- `.github/workflows/release.yml:163-173` publishes release assets via `softprops/action-gh-release@v2`
### Git History
- **Server last touched**: `4b2bcb0e` (env-leak-gate polish) — active development area
- **CLI last touched**: `dddff870` (embed git commit hash in version) — recent changes
- **Build scripts**: `9adc54af` (wire release workflow to build-binaries.sh) — recently stabilized
---
## Implementation Plan
### Step 1: Extract `startServer(opts)` from server's `main()`
**File**: `packages/server/src/index.ts`
**Lines**: 129-718
**Action**: UPDATE
**Current code (simplified):**
```typescript
async function main(): Promise<void> {
  // 600 lines of initialization, adapter creation, route registration, Bun.serve()
}

main().catch(error => { ... process.exit(1); });
```
**Required change:**
```typescript
export interface ServerOptions {
  /** Override the web dist path (for CLI binary with downloaded web-dist) */
  webDistPath?: string;
  /** Override the port */
  port?: number;
  /** Skip platform adapter initialization (CLI serve mode) */
  skipPlatformAdapters?: boolean;
}

export async function startServer(opts: ServerOptions = {}): Promise<void> {
  // Move entire main() body here
  // Replace webDistPath computation (lines 584-588) with:
  //   opts.webDistPath ?? pathModule.join(pathModule.dirname(pathModule.dirname(import.meta.dir)), 'web', 'dist')
  // Replace port with: opts.port ?? getPort()
  // Wrap platform adapter blocks with: if (!opts.skipPlatformAdapters) { ... }
}

// Keep backward compat: script entry point still works
if (import.meta.main) {
  startServer().catch(error => {
    getLog().fatal({ error: error instanceof Error ? error.message : String(error) }, 'startup_failed');
    process.exit(1);
  });
}
```
**Why**: Makes the server importable as a library. `import.meta.main` guard ensures the file still works as a standalone script for `bun dev`.
---
### Step 2: Add `getWebDistDir()` path helper
**File**: `packages/paths/src/archon-paths.ts`
**Action**: UPDATE
**Add function:**
```typescript
/**
 * Returns the path to the cached web UI distribution for a given version.
 * Example: ~/.archon/web-dist/v0.3.2/
 */
export function getWebDistDir(version: string): string {
  return join(getArchonHome(), 'web-dist', version);
}
```
**Why**: Centralizes the cache location logic, consistent with existing `getArchonHome()` patterns.
---
### Step 3: Create `archon serve` command
**File**: `packages/cli/src/commands/serve.ts`
**Action**: CREATE
```typescript
import { existsSync } from 'fs';
import { createLogger, getWebDistDir } from '@archon/paths';
import { BUNDLED_IS_BINARY, BUNDLED_VERSION } from '@archon/paths/bundled-build';

const log = createLogger('cli.serve');
const GITHUB_REPO = 'coleam00/Archon';

interface ServeOptions {
  port?: number;
  downloadOnly?: boolean;
}

export async function serveCommand(opts: ServeOptions): Promise<number> {
  const version = BUNDLED_IS_BINARY ? BUNDLED_VERSION : 'dev';
  if (version === 'dev') {
    console.error('Error: `archon serve` is for compiled binaries only.');
    console.error('For development, use: bun run dev');
    return 1;
  }
  const webDistDir = getWebDistDir(version);
  if (!existsSync(webDistDir)) {
    await downloadWebDist(version, webDistDir);
  }
  if (opts.downloadOnly) {
    log.info({ webDistDir }, 'web_dist.download_completed');
    console.log(`Web UI downloaded to: ${webDistDir}`);
    return 0;
  }
  // Import server and start
  const { startServer } = await import('@archon/server');
  await startServer({
    webDistPath: webDistDir,
    port: opts.port,
    skipPlatformAdapters: false, // Start all configured adapters
  });
  // Server runs until SIGINT/SIGTERM — never returns
  return 0;
}

async function downloadWebDist(version: string, targetDir: string): Promise<void> {
  const tarballUrl = `https://github.com/${GITHUB_REPO}/releases/download/v${version}/archon-web.tar.gz`;
  const checksumsUrl = `https://github.com/${GITHUB_REPO}/releases/download/v${version}/checksums.txt`;
  console.log(`Web UI not found locally — downloading from release v${version}...`);

  // Download checksums
  const checksumsRes = await fetch(checksumsUrl);
  if (!checksumsRes.ok) {
    throw new Error(`Failed to download checksums: ${checksumsRes.status} ${checksumsRes.statusText}`);
  }
  const checksumsText = await checksumsRes.text();
  const expectedHash = parseChecksum(checksumsText, 'archon-web.tar.gz');

  // Download tarball
  console.log(`Downloading ${tarballUrl}...`);
  const tarballRes = await fetch(tarballUrl);
  if (!tarballRes.ok) {
    throw new Error(`Failed to download web UI: ${tarballRes.status} ${tarballRes.statusText}`);
  }
  const tarballBuffer = await tarballRes.arrayBuffer();

  // Verify checksum
  const hasher = new Bun.CryptoHasher('sha256');
  hasher.update(new Uint8Array(tarballBuffer));
  const actualHash = hasher.digest('hex');
  if (actualHash !== expectedHash) {
    throw new Error(`Checksum mismatch: expected ${expectedHash}, got ${actualHash}`);
  }
  console.log('Checksum verified.');

  // Extract to temp dir, then atomic rename
  const tmpDir = `${targetDir}.tmp`;
  const { mkdirSync, renameSync, rmSync } = await import('fs');
  // Clean up any previous failed attempt
  rmSync(tmpDir, { recursive: true, force: true });
  mkdirSync(tmpDir, { recursive: true });

  // Extract tarball using tar (available on macOS/Linux)
  const proc = Bun.spawn(['tar', 'xzf', '-', '-C', tmpDir, '--strip-components=1'], {
    stdin: new Uint8Array(tarballBuffer),
  });
  const exitCode = await proc.exited;
  if (exitCode !== 0) {
    rmSync(tmpDir, { recursive: true, force: true });
    throw new Error(`tar extraction failed with exit code ${exitCode}`);
  }

  // Atomic move
  renameSync(tmpDir, targetDir);
  console.log(`Extracted to ${targetDir}`);
}

function parseChecksum(checksums: string, filename: string): string {
  for (const line of checksums.split('\n')) {
    const parts = line.trim().split(/\s+/);
    if (parts.length >= 2 && parts[1] === filename) {
      return parts[0];
    }
  }
  throw new Error(`Checksum not found for ${filename} in checksums.txt`);
}
```
**Why**: Self-contained command following existing CLI patterns. Atomic extraction prevents half-broken state. Checksum verification prevents supply chain attacks.
---
### Step 4: Wire `serve` into CLI command dispatch
**File**: `packages/cli/src/cli.ts`
**Lines**: 57-82, 231, 266+
**Action**: UPDATE
**Change 1** — Add import (after line 82):
```typescript
import { serveCommand } from './commands/serve.js';
```
**Change 2** — Add to `noGitCommands` (line 231):
```typescript
const noGitCommands = ['version', 'help', 'setup', 'chat', 'continue', 'serve'];
```
**Change 3** — Add case in switch (after the existing `case 'continue'` block):
```typescript
case 'serve': {
  const servePort = values.port ? Number(values.port) : undefined;
  const downloadOnly = Boolean(values['download-only']);
  return await serveCommand({ port: servePort, downloadOnly });
}
```
**Change 4** — Add `--port` and `--download-only` to `parseArgs` options:
```typescript
port: { type: 'string' },
'download-only': { type: 'boolean', default: false },
```
**Change 5** — Update `printUsage()` to include `serve`:
```
serve                  Start the web UI server (downloads web UI on first run)
  --port <port>        Override server port (default: 3090)
  --download-only      Download web UI without starting the server
```
**Why**: Follows exact patterns of existing commands. `serve` doesn't need a git repo.
---
### Step 5: Add `@archon/server` dependency to CLI package
**File**: `packages/cli/package.json`
**Action**: UPDATE
Add to `dependencies`:
```json
"@archon/server": "workspace:*",
"@archon/adapters": "workspace:*"
```
**Why**: The CLI needs to import `startServer` from `@archon/server`. `@archon/adapters` is a transitive dependency of `@archon/server` and should be explicit.
---
### Step 6: Update release CI to build and publish web UI tarball
**File**: `.github/workflows/release.yml`
**Action**: UPDATE
**Add new job** (or add steps to existing `release` job, after artifact download):
```yaml
- name: Setup Bun
  uses: oven-sh/setup-bun@v2

- name: Install dependencies
  run: bun install --frozen-lockfile

- name: Build web UI
  run: bun --filter @archon/web run build

- name: Package web dist
  run: |
    tar czf dist/archon-web.tar.gz -C packages/web/dist .

- name: Generate checksums
  run: |
    cd dist
    sha256sum archon-* > checksums.txt   # archon-* also matches archon-web.tar.gz
    cat checksums.txt
```
**Update** the `files:` block in the release step:
```yaml
files: |
  dist/archon-*
  dist/archon-web.tar.gz
  dist/checksums.txt
```
**Why**: Publishes a single platform-independent web UI tarball alongside the existing per-platform binaries. Checksums cover all artifacts.
---
### Step 7: Add/Update Tests
**File**: `packages/cli/src/commands/serve.test.ts`
**Action**: CREATE
**Test cases to add:**
```typescript
describe('serveCommand', () => {
  it('should reject in dev mode (non-binary)', () => {
    // Mock BUNDLED_IS_BINARY = false
    // Expect exit code 1 with "compiled binaries only" message
  });
  it('should download web-dist when not cached', () => {
    // Mock fetch to return tarball + checksums
    // Verify extraction to correct path
  });
  it('should skip download when already cached', () => {
    // Pre-create the web-dist dir
    // Verify no fetch calls
  });
  it('should fail on checksum mismatch', () => {
    // Mock fetch with wrong checksum
    // Expect error, no leftover .tmp dir
  });
  it('should handle network failure gracefully', () => {
    // Mock fetch to throw
    // Expect actionable error message
  });
  it('should support --download-only', () => {
    // Mock fetch, run with downloadOnly: true
    // Verify no startServer call
  });
});

describe('parseChecksum', () => {
  it('should extract hash for matching filename', () => {
    // Known checksums.txt format
  });
  it('should throw for missing filename', () => {
    // checksums.txt without the expected entry
  });
});
```
---
## Patterns to Follow
**From codebase — mirror these exactly:**
```typescript
// SOURCE: packages/cli/src/commands/version.ts:79-88
// Pattern for binary detection
if (BUNDLED_IS_BINARY) {
  version = BUNDLED_VERSION;
  gitCommit = BUNDLED_GIT_COMMIT;
} else {
  const devInfo = await getDevVersion();
  version = devInfo.version;
  gitCommit = await getDevGitCommit();
}
```
```typescript
// SOURCE: packages/paths/src/archon-paths.ts:56-74
// Pattern for path resolution with ARCHON_HOME override
export function getArchonHome(): string {
  if (isDocker()) {
    return '/.archon';
  }
  const envHome = process.env.ARCHON_HOME;
  if (envHome) { /* ... */ return expandTilde(envHome); }
  return join(homedir(), '.archon');
}
```
```typescript
// SOURCE: packages/server/src/index.ts:579-593
// Pattern for static file serving (to be parameterized)
if (process.env.NODE_ENV === 'production' || !process.env.WEB_UI_DEV) {
  const { serveStatic } = await import('hono/bun');
  app.use('/assets/*', serveStatic({ root: webDistPath }));
  app.get('*', serveStatic({ root: webDistPath, path: 'index.html' }));
}
```
---
## Edge Cases & Risks
| Risk/Edge Case | Mitigation |
|---------------|------------|
| Server refactor breaks `bun dev` | `import.meta.main` guard keeps script-mode working; test both paths |
| Binary size bloat from including server | Monitor: current ~50MB, expected ~65MB. Acceptable for the value. |
| Tarball extraction fails (permissions, disk space) | Atomic extraction (`.tmp` → rename); clean up on failure; clear error message |
| GitHub release rate limiting | `fetch` will return 403 — surface the error with retry suggestion |
| Air-gapped environments | `--download-only` allows pre-caching; future `--web-dist <path>` for offline |
| Version mismatch (binary v0.3.2 but no release exists yet) | Fail with "release not found" — only happens if someone builds from source with wrong version |
| `tar` not available on system | Available on all macOS/Linux; for Windows, use Bun's built-in tar or `decompress` |
| Concurrent `archon serve` calls during first download | Atomic rename prevents corruption; second process sees complete dir or retries |
| `@archon/server` import increases CLI startup time | Use dynamic `await import()` in serve command only — other commands unaffected |
---
## Validation
### Automated Checks
```bash
bun run type-check
bun run test
bun run lint
bun run validate # Full pre-PR validation
```
### Manual Verification
1. Run `bun run dev` — verify server still starts normally (script mode preserved)
2. Build binary: `VERSION=test scripts/build-binaries.sh` — verify it compiles
3. Run binary with `archon serve` — verify download + extraction + server start
4. Run binary with `archon serve --download-only` — verify download without server
5. Run binary with `archon serve` a second time — verify cached (no download)
6. Run `archon workflow list` — verify no startup time regression from server dep
7. Verify `archon serve --port 4000` — verify port override works
---
## Scope Boundaries
**IN SCOPE:**
- Server library refactor (extract `startServer()`)
- `archon serve` CLI command with download + checksum + extract
- `--port` and `--download-only` flags
- Release CI changes to build and publish `archon-web.tar.gz`
- Path helper for web-dist cache location
- Tests for download/extract/checksum logic
**OUT OF SCOPE (do not touch):**
- `bun dev` workflow — stays as-is for contributors
- Docker image — orthogonal, not affected
- CDN mirroring — GitHub releases sufficient for now
- `archon serve --web-version=latest` — defer to future issue
- `archon serve --offline --web-dist=./path` — defer (can add later)
- Homebrew formula changes — just update docs, no formula change needed
- Auto-update of cached web-dist — version-keyed dirs handle this naturally
- Deprecating clone-and-bun-dev — keep for contributors
- Platform adapter lazy loading optimization — all adapters already conditional on env vars
---
## Implementation Order
The steps have a strict dependency chain:
1. **Step 2** (path helper) — no deps, can go first
2. **Step 1** (server refactor) — the hardest part, do early
3. **Step 5** (CLI package.json dep) — needed before Step 3
4. **Step 3** (serve command) — depends on Steps 1, 2, 5
5. **Step 4** (CLI wiring) — depends on Step 3
6. **Step 7** (tests) — depends on Steps 3, 4
7. **Step 6** (CI changes) — independent, can be done in parallel with 3-7
---
## Metadata
- **Investigated by**: Claude
- **Timestamp**: 2026-04-09T12:00:00Z
- **Artifact**: `.claude/PRPs/issues/issue-978.md`


@ -23,7 +23,7 @@ Restate the feature request in your own words. Identify:
3. **Scope boundaries** — What is explicitly in scope vs. out of scope?
4. **Package impact** — Which of the 8 packages are affected? (`paths`, `git`, `isolation`,
`workflows`, `core`, `adapters`, `server`, `web`)
-5. **Interface changes** — Does this touch `IPlatformAdapter`, `IAssistantClient`,
+5. **Interface changes** — Does this touch `IPlatformAdapter`, `IAgentProvider`,
`IDatabase`, or `IWorkflowStore`? New interfaces needed?
---
@ -85,7 +85,7 @@ Before writing tasks, reason through:
**Interface design:**
- Prefer extending existing narrow interfaces over creating fat ones.
- New interface methods only if they have a concrete current caller.
-- Avoid adding methods to `IPlatformAdapter` or `IAssistantClient` unless essential.
+- Avoid adding methods to `IPlatformAdapter` or `IAgentProvider` unless essential.
**Test isolation strategy:**
- `mock.module()` is process-global and permanent in Bun — plan test file placement carefully.


@ -39,11 +39,11 @@ Read `packages/core/src/state/session-transitions.ts` in full — `TransitionTri
### 5. Understand AI Client Patterns
-List clients:
-!`ls packages/core/src/clients/`
+List providers:
+!`ls packages/core/src/providers/`
-Read `packages/core/src/clients/factory.ts` for provider selection logic.
-Read `packages/core/src/clients/claude.ts` first 50 lines — `IAssistantClient` implementation
+Read `packages/core/src/providers/factory.ts` for provider selection logic.
+Read `packages/core/src/providers/claude.ts` first 50 lines — `IAgentProvider` implementation
with streaming event loop pattern.
### 6. Understand Database Layer
@ -52,7 +52,7 @@ List DB modules:
!`ls packages/core/src/db/`
Read `packages/core/src/types/index.ts` (or the main types file) first 60 lines for key
-interfaces: `IPlatformAdapter`, `IAssistantClient`, `Conversation`, `Session`.
+interfaces: `IPlatformAdapter`, `IAgentProvider`, `Conversation`, `Session`.
### 7. Understand the Server
@ -81,9 +81,9 @@ Summarize (under 250 words):
- `TransitionTrigger` values and their behaviors
- Only `plan-to-execute` immediately creates a new session; others deactivate first
-### AI Clients
-- `ClaudeClient` (claude-agent-sdk) and `CodexClient` (codex-sdk)
-- `IAssistantClient` streaming pattern: `for await (const event of events)`
+### AI Providers
+- `ClaudeProvider` (claude-agent-sdk) and `CodexProvider` (codex-sdk)
+- `IAgentProvider` streaming pattern: `for await (const event of events)`
### Key Database Tables
- conversations, sessions, codebases, isolation_environments, workflow_runs, workflow_events, messages


@ -51,7 +51,7 @@ bridges these to SSE via `WorkflowEventBridge`.
### 7. Understand Dependency Injection
Read `packages/workflows/src/deps.ts` for the `WorkflowDeps` type: `IWorkflowPlatform`,
-`IWorkflowAssistantClient`, `IWorkflowStore` injected at runtime. No direct DB or AI imports
+`IWorkflowAgentProvider`, `IWorkflowStore` injected at runtime. No direct DB or AI imports
inside this package.
### 8. See What Workflows Are Available

View file

@@ -64,8 +64,8 @@ Provide a concise summary (under 300 words) covering:
### Architecture
- Package dependency order and each package's responsibility
-- Key interfaces: `IPlatformAdapter`, `IAssistantClient`, `IDatabase`, `IWorkflowStore`
-- Message flow: platform adapter → orchestrator-agent → command handler OR AI client
+- Key interfaces: `IPlatformAdapter`, `IAgentProvider`, `IDatabase`, `IWorkflowStore`
+- Message flow: platform adapter → orchestrator-agent → command handler OR AI provider
- Workflow execution: `discoverWorkflows` → router → `executeWorkflow` (steps / loop / DAG)
### Current State

View file

@@ -21,7 +21,7 @@ Runs `tsc --noEmit` across all 8 packages via `bun --filter '*' type-check`.
**What to look for:**
- Missing return types (explicit return types required on all functions)
-- Incorrect interface implementations (`IPlatformAdapter`, `IAssistantClient`, etc.)
+- Incorrect interface implementations (`IPlatformAdapter`, `IAgentProvider`, etc.)
- Import type errors (use `import type` for type-only imports)
- Package boundary violations (e.g., `@archon/workflows` importing from `@archon/core`)

View file

@@ -33,7 +33,7 @@ Slack event
→ Otherwise → buildOrchestratorPrompt() (prompt-builder.ts:116)
→ Prompt includes: registered projects, discovered workflows, /invoke-workflow format
→ sessionDb.getActiveSession() → transitionSession('first-message') if none (orchestrator-agent.ts:462)
-→ getAssistantClient(conversation.ai_assistant_type) (orchestrator-agent.ts:470)
+→ getAgentProvider(conversation.ai_assistant_type) (orchestrator-agent.ts:470)
→ cwd = getArchonWorkspacesPath() (orchestrator-agent.ts:458)
→ handleBatchMode() or handleStreamMode() based on getStreamingMode()
@@ -313,7 +313,7 @@ Narrows `IPlatformAdapter` to `WebAdapter` for web-specific methods: `setConvers
| Message entry | `adapters/src/chat/slack/adapter.ts`, `server/src/index.ts` |
| Orchestration | `core/src/orchestrator/orchestrator-agent.ts`, `core/src/orchestrator/orchestrator.ts` |
| Locking | `core/src/utils/conversation-lock.ts` |
-| AI clients | `core/src/clients/claude.ts`, `core/src/clients/factory.ts` |
+| AI providers | `core/src/providers/claude.ts`, `core/src/providers/factory.ts` |
| Commands | `core/src/handlers/command-handler.ts` |
| Sessions | `core/src/db/sessions.ts`, `core/src/state/session-transitions.ts` |
| Workflows | `workflows/src/executor.ts`, `workflows/src/dag-executor.ts`, `workflows/src/loader.ts` |

View file

@@ -1,44 +0,0 @@
---
paths:
- "packages/adapters/**/*.ts"
---
# Adapters Conventions
## Key Patterns
- **Auth is inside adapters** — every adapter checks authorization before calling `onMessage()`. Silent rejection (no error response), log with masked user ID: `userId.slice(0, 4) + '***'`.
- **Whitelist parsing in constructor** — parse env var (`SLACK_ALLOWED_USER_IDS`, `TELEGRAM_ALLOWED_USER_IDS`, `GITHUB_ALLOWED_USERS`) using a co-located `parseAllowedUserIds()` / `parseAllowedUsers()` function. Empty list = open access.
- **Lazy logger pattern** — ALL adapter files use a module-level `cachedLog` + `getLog()` getter so test mocks intercept `createLogger` before the logger is instantiated. Never initialize logger at module scope.
- **Two handler patterns** (both valid):
- **Chat adapters** (Slack, Telegram, Discord): `onMessage(handler)` — adapter owns the event loop (polling/WebSocket), fires registered callback. Lock manager lives in the server's callback closure. Errors handled by caller via `createMessageErrorHandler`.
- **Forge adapters** (GitHub, Gitea): `handleWebhook(payload, signature)` — server HTTP route calls directly, returns 200 immediately. Full pipeline inside adapter (signature verification, repo cloning, command loading, context building). Lock manager injected in constructor. Errors caught internally and posted to issue/PR.
- **Message splitting** — use shared `splitIntoParagraphChunks(message, maxLength)` from `../../utils/message-splitting`. Two-pass: paragraph breaks first, then line breaks. Limits: Slack 12000, Telegram 4096, GitHub 65000.
- **`ensureThread()` is often a no-op** — Slack returns the same ID (already encoded as `channel:ts`), Telegram has no threads, GitHub issues are inherently threaded.
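The two-pass split can be sketched as follows — a minimal illustration of the idea, not the actual `splitIntoParagraphChunks` source (helper names and edge-case handling are simplified):

```typescript
// Sketch: two-pass chunking — paragraph breaks first, then line breaks.
// An oversized single line passes through unsplit in this simplified version.
function splitBySeparator(text: string, sep: string, maxLength: number): string[] {
  const chunks: string[] = [];
  let current = '';
  for (const part of text.split(sep)) {
    const candidate = current === '' ? part : current + sep + part;
    if (candidate.length <= maxLength) {
      current = candidate;
    } else {
      if (current !== '') chunks.push(current);
      current = part;
    }
  }
  if (current !== '') chunks.push(current);
  return chunks;
}

function splitIntoParagraphChunksSketch(message: string, maxLength: number): string[] {
  // Pass 1: greedy packing on paragraph breaks; Pass 2: re-split oversized chunks on line breaks.
  return splitBySeparator(message, '\n\n', maxLength).flatMap((chunk) =>
    chunk.length <= maxLength ? [chunk] : splitBySeparator(chunk, '\n', maxLength)
  );
}
```

The platform-specific limits (Slack 12000, Telegram 4096, GitHub 65000) would be passed as `maxLength`.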
## Conversation ID Formats
| Platform | Format | Example |
|----------|--------|---------|
| Slack | `channel:thread_ts` | `C123ABC:1234567890.123456` |
| Telegram | numeric chat ID as string | `"1234567890"` |
| GitHub | `owner/repo#number` | `"acme/api#42"` |
| Web | user-provided string | `"my-chat"` |
| Discord | channel ID string | `"987654321098765432"` |
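Decoding the Slack format is the only non-trivial case — `thread_ts` itself contains a `.`, so only the first `:` separates channel from thread. A hypothetical sketch (helper names are illustrative, not the adapter's actual API):

```typescript
// Hypothetical helpers for the Slack `channel:thread_ts` conversation ID format.
interface SlackConversationRef {
  channel: string;
  threadTs?: string;
}

function encodeSlackConversationId(channel: string, threadTs?: string): string {
  return threadTs ? `${channel}:${threadTs}` : channel;
}

function decodeSlackConversationId(id: string): SlackConversationRef {
  // Split on the first ':' only — thread_ts contains a '.' but never a ':'.
  const idx = id.indexOf(':');
  if (idx === -1) return { channel: id };
  return { channel: id.slice(0, idx), threadTs: id.slice(idx + 1) };
}
```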
## Architecture
- All chat adapters implement `IPlatformAdapter` from `@archon/core`
- GitHub adapter is webhook-based (no polling); Slack/Telegram/Discord use polling
- GitHub adapter holds its own `ConversationLockManager` (injected in constructor)
- Slack conversation ID encodes both channel and thread: `sendMessage()` splits on `:` to extract `thread_ts`
- GitHub adapter adds `<!-- archon-bot-response -->` marker to prevent self-triggering loops
- GitHub only responds to `issue_comment.created` events — NOT `issues.opened` / `pull_request.opened` (descriptions contain documentation, not commands; see #96)
## Anti-patterns
- Never put auth logic outside the adapter (no auth middleware in server routes)
- Never throw from `onMessage` handlers; errors surface to the caller
- Never call `sendMessage()` with a raw token or credential string in the message
- Never use the generic `exec` — always use `execFileAsync` for subprocess calls
- Never add a new adapter method to `IPlatformAdapter` unless ALL adapters need it; use optional methods (`sendStructuredEvent?`) for platform-specific capabilities

View file

@@ -1,89 +0,0 @@
---
paths:
- "packages/cli/**/*.ts"
---
# CLI Conventions
## Commands
```bash
# Workflow commands (require git repo)
bun run cli workflow list [--json]
bun run cli workflow run <name> [message] [--branch <branch>] [--from-branch <base>] [--no-worktree] [--resume]
bun run cli workflow status [runId]
# Isolation commands
bun run cli isolation list
bun run cli isolation cleanup [days] # default: 7 days
bun run cli isolation cleanup --merged # removes merged branches + remote refs
bun run cli complete <branch-name> [--force] # full lifecycle: worktree + local/remote branches
# Interactive
bun run cli chat [--cwd <path>]
# Setup
bun run cli setup
bun run cli version
```
## Startup Behavior
1. Deletes `process.env.DATABASE_URL` (prevent target repo's DB from leaking in)
2. Loads `~/.archon/.env` with `override: true`
3. Smart Claude auth default: if no `CLAUDE_API_KEY` or `CLAUDE_CODE_OAUTH_TOKEN`, sets `CLAUDE_USE_GLOBAL_AUTH=true`
4. Imports all commands AFTER dotenv setup
## WorkflowRunOptions Interface
```typescript
interface WorkflowRunOptions {
branchName?: string; // Explicit branch name for the worktree
fromBranch?: string; // Override base branch (start-point for worktree)
noWorktree?: boolean; // Opt out of isolation, run in live checkout
resume?: boolean; // Reuse worktree from last failed run
}
```
**Default behavior**: Creates worktree with auto-generated branch name (`archon/task-{workflow}-{timestamp}`).
**Mutually exclusive** (enforced in both `cli.ts` pre-flight and `workflowRunCommand`):
- `--branch` + `--no-worktree`
- `--from` + `--no-worktree`
- `--resume` + `--branch`
**Flag behavior**:
- `--branch feature-auth` → creates/reuses worktree for that branch
- (no flags) → creates worktree with auto-generated `archon/task-*` branch (isolation by default)
- `--no-worktree` → runs directly in the live checkout (opt-out of isolation)
- `--from dev` → overrides the start-point for the new worktree (works with or without `--branch`)
- `--resume` → resumes the last run for this conversation (mutually exclusive with `--branch`)
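The exclusion rules amount to a small pre-flight check. A sketch (the real validation lives in `cli.ts` and `workflowRunCommand`; this helper is illustrative):

```typescript
// Hypothetical pre-flight check mirroring the documented flag exclusions.
interface WorkflowRunOptions {
  branchName?: string; // --branch
  fromBranch?: string; // --from
  noWorktree?: boolean; // --no-worktree
  resume?: boolean; // --resume
}

function validateWorkflowRunOptions(opts: WorkflowRunOptions): string[] {
  const errors: string[] = [];
  if (opts.branchName && opts.noWorktree) errors.push('--branch cannot be combined with --no-worktree');
  if (opts.fromBranch && opts.noWorktree) errors.push('--from cannot be combined with --no-worktree');
  if (opts.resume && opts.branchName) errors.push('--resume cannot be combined with --branch');
  return errors;
}
```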
## Git Repo Requirement
Workflow and isolation commands resolve CWD to the git repo root. Run from within a git repository (subdirectories work). The CLI calls `git rev-parse --show-toplevel` to find the root.
## Conversation ID Format
CLI generates: `cli-{timestamp}-{random6}` (e.g., `cli-1703123456789-a7f3bc`)
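A sketch of a generator for this format (illustrative, not the CLI's actual implementation):

```typescript
// Hypothetical generator for the documented `cli-{timestamp}-{random6}` format.
function generateCliConversationId(now: number = Date.now()): string {
  // 6 base-36 characters; padEnd covers the rare short Math.random() expansion.
  const random6 = Math.random().toString(36).slice(2, 8).padEnd(6, '0');
  return `cli-${now}-${random6}`;
}
```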
## Port Allocation
Worktree-aware: same hash-based algorithm as the server (3190-4089 range). Running `bun dev` in a worktree auto-allocates a unique port. Same worktree always gets same port.
## CLIAdapter
The `CLIAdapter` implements `IPlatformAdapter`. It streams output to stdout. `getStreamingMode()` defaults to `'batch'` (configurable via constructor options). No auth needed — CLI is local only.
## Architecture
- `@archon/cli` depends on `@archon/core`, `@archon/workflows`, `@archon/git`, `@archon/isolation`, `@archon/paths`
- Uses `createWorkflowDeps()` from `@archon/core/workflows/store-adapter` to build workflow deps
- Database shared with server (same `~/.archon/archon.db` or `DATABASE_URL`)
- Conversation lifecycle: create → run workflow → persist messages (same DB as web UI)
## Anti-patterns
- Never run CLI commands without being inside a git repository (workflow/isolation commands will fail)
- Never set `DATABASE_URL` in `~/.archon/.env` to point at a target app's database
- Never use `--force` on `complete` unless branch is truly safe to delete (skips uncommitted check)
- Never add interactive prompts inside CLI commands — use flags for all options (non-interactive tool)

View file

@@ -1,90 +0,0 @@
---
paths:
- "packages/core/src/db/**/*.ts"
- "migrations/**/*.sql"
---
# Database Conventions
## 7 Tables (all prefixed `remote_agent_`)
| Table | Purpose |
|-------|---------|
| `remote_agent_conversations` | Platform conversations, soft-delete (`deleted_at`), title, `hidden` flag |
| `remote_agent_sessions` | AI SDK sessions with `parent_session_id` audit chain, `transition_reason` |
| `remote_agent_codebases` | Repository metadata, `commands` JSONB |
| `remote_agent_isolation_environments` | Git worktree tracking, `workflow_type`, `workflow_id` |
| `remote_agent_workflow_runs` | Execution state, `working_path`, `last_activity_at` |
| `remote_agent_workflow_events` | Step-level event log per run |
| `remote_agent_messages` | Conversation history, tool call metadata as JSONB |
## IDatabase Interface
Auto-detects at startup: PostgreSQL if `DATABASE_URL` set, SQLite (`~/.archon/archon.db`) otherwise.
```typescript
import { pool, getDialect } from './connection'; // pool = IDatabase instance
// $1, $2 placeholders work for both PostgreSQL and SQLite
const result = await pool.query<Conversation>(
'SELECT * FROM remote_agent_conversations WHERE id = $1',
[id]
);
const row = result.rows[0]; // rows is readonly T[]
```
Use `getDialect()` for dialect-specific expressions: `dialect.generateUuid()`, `dialect.now()`, `dialect.jsonMerge(col, paramIdx)`, `dialect.jsonArrayContains(col, path, paramIdx)`, `dialect.nowMinusDays(paramIdx)`.
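For instance, a stale-environment query can be assembled from shared placeholders plus a dialect helper. A sketch — the dialect object below is a stand-in for `getDialect()`, and its SQL output is an assumption, not the real implementation:

```typescript
// Sketch: building a dialect-aware cleanup query. Method name follows the
// list above; the PostgreSQL expression here is illustrative only.
interface Dialect {
  nowMinusDays(paramIdx: number): string;
}

const postgresDialectSketch: Dialect = {
  nowMinusDays: (i) => `NOW() - ($${i} || ' days')::interval`,
};

function buildStaleEnvQuery(dialect: Dialect): string {
  // $1 = max age in days; the $n placeholder style is shared by both backends.
  return `SELECT id FROM remote_agent_isolation_environments WHERE created_at < ${dialect.nowMinusDays(1)}`;
}
```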
## Import Pattern — Namespaced Exports
```typescript
// Use namespace imports for DB modules (consistent project-wide pattern)
import * as conversationDb from '@archon/core/db/conversations';
import * as sessionDb from '@archon/core/db/sessions';
import * as codebaseDb from '@archon/core/db/codebases';
import * as workflowDb from '@archon/core/db/workflows';
import * as messageDb from '@archon/core/db/messages';
```
## INSERT Error Handling
```typescript
try {
const result = await pool.query('INSERT INTO remote_agent_conversations ...', params);
return result.rows[0];
} catch (error) {
log.error({ err: error, params }, 'db_insert_failed');
throw new Error('Failed to create conversation');
}
```
## UPDATE with rowCount Verification
`updateConversation()` and similar throw `ConversationNotFoundError` / `SessionNotFoundError` when `rowCount === 0`. Callers must handle:
```typescript
try {
await db.updateConversation(conversationId, { codebase_id: codebaseId });
} catch (error) {
if (error instanceof ConversationNotFoundError) {
// Handle missing conversation specifically
}
throw error; // Re-throw unexpected errors
}
```
## Session Audit Trail
Sessions are immutable. Every new session links back: `parent_session_id` → previous session, `transition_reason: TransitionTrigger`. Query the chain to understand history. `active = true` means the current session.
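Walking the chain can be sketched in memory like this (the `Session` shape is illustrative, using the column names above):

```typescript
// Sketch: reconstructing a session's audit history by following
// parent_session_id links from the active session back to the root.
interface Session {
  id: string;
  parent_session_id: string | null;
  transition_reason: string | null;
  active: boolean;
}

function sessionHistory(sessions: Session[], activeId: string): Session[] {
  const byId = new Map(sessions.map((s) => [s.id, s]));
  const chain: Session[] = [];
  let current = byId.get(activeId);
  while (current) {
    chain.push(current);
    current = current.parent_session_id ? byId.get(current.parent_session_id) : undefined;
  }
  return chain.reverse(); // oldest first
}
```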
## Soft Delete
Conversations use soft-delete: `deleted_at IS NULL` filter should be included in all user-facing queries. `hidden = true` conversations are worker conversations (background workflows) — excluded from UI listings.
## Anti-patterns
- Never `SELECT *` in production queries on large tables — select specific columns
- Never write raw SQL strings in application code outside `packages/core/src/db/` modules
- Never bypass the `IDatabase` interface to call database drivers directly from other packages
- Never assume `rows[0]` exists without null-checking — queries can return empty arrays
- Never use `RETURNING *` in UPDATE when only checking success — check `rowCount` instead

View file

@@ -1,22 +0,0 @@
# DX Quirks
## Bun Log Elision
When running `bun dev` from repo root, `--filter` truncates logs to `[N lines elided]`.
To see full logs: `cd packages/server && bun --watch src/index.ts` or `bun --cwd packages/server run dev`.
## mock.module() Pollution
`mock.module()` is process-global and irreversible — `mock.restore()` does NOT undo it.
Never add `afterAll(() => mock.restore())` for `mock.module()` cleanup.
Use `spyOn()` for internal modules (spy.mockRestore() DOES work).
When adding tests with `mock.module()`, ensure package.json runs it in a separate `bun test` invocation.
## Worktree Port Allocation
Worktrees auto-allocate ports (3190-4089 range, hash-based on path). Same worktree always gets same port.
Main repo defaults to 3090. Override: `PORT=4000 bun dev`.
## bun run test vs bun test
NEVER run `bun test` from repo root — it discovers all test files across packages in one process, causing ~135 mock pollution failures. Always use `bun run test` (which uses `bun --filter '*' test` for per-package isolation).

View file

@@ -1,40 +0,0 @@
# Isolation Architecture Patterns
## Core Design
- ALL isolation logic is centralized in the orchestrator — adapters are thin
- Every @mention auto-creates a worktree (simplicity > efficiency; worktrees are cheap)
- Data model is work-centric (`isolation_environments` table), enabling cross-platform sharing
- Cleanup is a separate service using git-first checks
## Directory Structure
```
~/.archon/workspaces/owner/repo/
├── source/ # Clone or symlink to local path
├── worktrees/ # Git worktrees for this project
├── artifacts/ # Workflow artifacts (NEVER in git)
│ ├── runs/{id}/ # Per-run artifacts ($ARTIFACTS_DIR)
│ └── uploads/{convId}/ # Web UI file uploads (ephemeral)
└── logs/ # Workflow execution logs
```
## Resolution Flow
1. Adapter provides `IsolationHints` (conversationId, workflowId, branch preference)
2. Orchestrator's `validateAndResolveIsolation()` resolves hints → environment
3. WorktreeProvider creates worktree if needed, syncs with origin first
4. Environment tracked in `isolation_environments` table
## Key Packages
- `@archon/isolation` (`packages/isolation/src/`) — types, providers, resolver, error classifiers
- `@archon/git` (`packages/git/src/`) — branch, worktree, repo operations
- `@archon/paths` (`packages/paths/src/`) — path resolution utilities
## Safety Rules
- NEVER run `git clean -fd` — permanently deletes untracked files
- Use `classifyIsolationError()` to map git errors to user-friendly messages
- Trust git's natural guardrails (refuse to remove worktree with uncommitted changes)
- Use `execFileAsync` (not `exec`) when calling git directly

View file

@@ -1,77 +0,0 @@
---
paths:
- "packages/isolation/**/*.ts"
- "packages/git/**/*.ts"
---
# Isolation & Git Conventions
## Branded Types (packages/git/src/types.ts)
Always use the branded constructors — they reject empty strings at runtime and prevent passing the wrong string type:
```typescript
import { toRepoPath, toBranchName, toWorktreePath } from '@archon/git';
import type { RepoPath, BranchName, WorktreePath } from '@archon/git';
const repo = toRepoPath('/home/user/owner/repo'); // RepoPath
const branch = toBranchName('feature-auth'); // BranchName
const wt = toWorktreePath('/home/.archon/worktrees/x'); // WorktreePath
```
Git operations return `GitResult<T>` discriminated union: `{ ok: true; value: T }` or `{ ok: false; error: GitError }`. Always check `.ok` before accessing `.value`.
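The branding trick and the `GitResult` check can be illustrated generically (this is the pattern, not the `@archon/git` source):

```typescript
// Branded string types: a phantom property makes plain strings unassignable
// where the branded type is expected, while staying a string at runtime.
type Brand<T, Name extends string> = T & { readonly __brand: Name };
type BranchName = Brand<string, 'BranchName'>;

function toBranchName(value: string): BranchName {
  if (value.trim() === '') throw new Error('BranchName cannot be empty');
  return value as BranchName;
}

// GitResult-style discriminated union: check .ok before touching .value.
type GitResult<T> = { ok: true; value: T } | { ok: false; error: Error };

function safeBranch(value: string): GitResult<BranchName> {
  try {
    return { ok: true, value: toBranchName(value) };
  } catch (error) {
    return { ok: false, error: error as Error };
  }
}
```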
## IsolationResolver — 7-Step Resolution Order
1. **Existing env** — use `existingEnvId` if worktree still exists on disk
2. **No codebase** — skip isolation entirely, return `status: 'none'`
3. **Workflow reuse** — find active env with same `(codebaseId, workflowType, workflowId)`
4. **Linked issue sharing** — PR can reuse the worktree from a linked issue
5. **PR branch adoption** — find existing worktree by branch name (`findWorktreeByBranch`)
6. **Limit check + auto-cleanup** — if at `maxWorktrees` (default 25), try `makeRoom()` first
7. **Create new** — call `provider.create(isolationRequest)` then `store.create()`
If `store.create()` fails after `provider.create()` succeeds, the orphaned worktree is cleaned up best-effort before re-throwing.
## Error Handling Pattern
```typescript
import { classifyIsolationError, isKnownIsolationError } from '@archon/isolation';
try {
await provider.create(request);
} catch (error) {
const err = error instanceof Error ? error : new Error(String(error));
if (!isKnownIsolationError(err)) {
throw err; // Unknown = programming bug, propagate as crash
}
const userMessage = classifyIsolationError(err); // Maps to friendly message
// ...send userMessage to platform, return blocked resolution
}
```
Known error patterns: `permission denied`, `eacces`, `timeout`, `no space left`, `enospc`, `not a git repository`, `branch not found`.
`IsolationBlockedError` signals ALL message handling should stop — the user has already been notified.
## Git Safety Rules
- **NEVER run `git clean -fd`** — permanently deletes untracked files. Use `git checkout .` instead.
- **Always use `execFileAsync`** (from `@archon/git/exec`), never `exec` or `execSync`
- `hasUncommittedChanges()` returns `true` on unexpected errors (conservative — prevents data loss)
- Worktree paths follow project-scoped layout: `~/.archon/workspaces/{owner}/{repo}/worktrees/{branch}`
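The conservative `hasUncommittedChanges()` behavior amounts to a fail-closed check. A hypothetical sketch with the git call injected (the real version shells out via `execFileAsync`):

```typescript
// Sketch: fail-closed dirty check — on any unexpected error, report "dirty"
// so cleanup code never deletes a worktree it cannot verify.
function hasUncommittedChangesSketch(runGitStatus: () => string): boolean {
  try {
    // e.g. output of `git status --porcelain`: empty means clean
    return runGitStatus().trim().length > 0;
  } catch {
    return true; // conservative: unknown state counts as uncommitted changes
  }
}
```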
## Architecture
- `@archon/git` — zero `@archon/*` dependencies; only branded types and `execFileAsync` wrapper
- `@archon/isolation` — depends only on `@archon/git` + `@archon/paths`
- `IIsolationStore` interface injected into `IsolationResolver` — never call DB directly from git package
- `IIsolationProvider` interface — `WorktreeProvider` is the only implementation
- Stale env cleanup is best-effort: `markDestroyedBestEffort()` logs errors but never throws
## Anti-patterns
- Never call `git` via `exec()` or shell string — always `execFileAsync('git', [...args])`
- Never treat `IsolationBlockedError` as recoverable — it means user was notified, stop processing
- Never use a plain `string` where `RepoPath` / `BranchName` / `WorktreePath` is expected
- Never skip the `isKnownIsolationError()` check — unknown errors must propagate as crashes

View file

@@ -1,121 +0,0 @@
---
paths:
- "packages/core/src/orchestrator/**/*.ts"
- "packages/core/src/handlers/**/*.ts"
- "packages/core/src/state/**/*.ts"
---
# Orchestrator Conventions
## Message Flow — Routing Agent Architecture
```
Platform message
→ ConversationLockManager.acquireLock()
→ handleMessage() (orchestrator-agent.ts:383)
→ inheritThreadContext() — copy parent's codebase/cwd if child thread
→ Deterministic gate: 10 commands (help, status, reset, workflow, register-project, update-project, remove-project, commands, init, worktree)
→ Everything else → AI routing call:
→ listCodebases() + discoverAllWorkflows()
→ buildFullPrompt() → buildOrchestratorPrompt() or buildProjectScopedPrompt()
→ AI responds with natural language ± /invoke-workflow or /register-project
→ parseOrchestratorCommands() extracts structured commands from AI response
→ If /invoke-workflow found → dispatchOrchestratorWorkflow()
→ If /register-project found → handleRegisterProject()
→ Otherwise → send AI text to user
```
Lock manager returns `{ status: 'started' | 'queued-conversation' | 'queued-capacity' }`. Always use the return value to decide whether to emit a "queued" notice — never call `isActive()` separately (TOCTOU race).
## Deterministic Commands (command-handler.ts)
Only **10 commands** are handled deterministically:
| Command | Behavior |
|---------|----------|
| `/help` | Show available commands |
| `/status` | Show conversation/session state |
| `/reset` | Deactivate current session |
| `/workflow` | Subcommands: `list`, `run`, `status`, `cancel`, `reload` |
| `/register-project` | Handled inline — creates codebase DB record |
| `/update-project` | Handled inline — updates codebase path |
| `/remove-project` | Handled inline — deletes codebase DB record |
| `/commands` | List registered codebase commands |
| `/init` | Scaffold `.archon/` in current repo |
| `/worktree` | Worktree subcommands |
**All other slash commands fall through to the AI router** rather than returning an "Unknown command" error.
## Routing AI — Prompt Building (prompt-builder.ts)
The choice between prompts depends on whether the conversation has an attached project:
- **No project** → `buildOrchestratorPrompt()` (prompt-builder.ts:116) — lists all projects equally, asks user to clarify if ambiguous
- **Has project** → `buildProjectScopedPrompt()` (prompt-builder.ts:153) — active project shown first, ambiguous requests default to it
Both prompts include: registered projects, discovered workflows, and the `/invoke-workflow` + `/register-project` format specification.
### `/invoke-workflow` Protocol
The AI emits: `/invoke-workflow <name> --project <project> --prompt "user's intent"`
`parseOrchestratorCommands()` (orchestrator-agent.ts:90) parses this with:
- Workflow name validated against discovered workflows via `findWorkflow()`
- Project name validated via `findCodebaseByName()` — case-insensitive, supports partial path segment match (e.g., `"repo"` matches `"owner/repo"`)
- `--project` must appear before `--prompt`
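A sketch of parsing that line format (hypothetical regex; the real `parseOrchestratorCommands()` additionally validates names via `findWorkflow()` and `findCodebaseByName()`):

```typescript
// Hypothetical parser for the documented /invoke-workflow line format:
// /invoke-workflow <name> --project <project> --prompt "user's intent"
interface InvokeWorkflowCommand {
  workflow: string;
  project: string;
  prompt: string;
}

function parseInvokeWorkflow(line: string): InvokeWorkflowCommand | null {
  // The fixed ordering enforces the "--project before --prompt" rule.
  const match = line.match(
    /^\/invoke-workflow\s+(\S+)\s+--project\s+(\S+)\s+--prompt\s+"([^"]*)"/
  );
  if (!match) return null;
  return { workflow: match[1], project: match[2], prompt: match[3] };
}
```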
### `filterToolIndicators()` (orchestrator-agent.ts:163)
Batch mode only. Strips paragraphs starting with emoji tool indicators (🔧💭📝✏️🗑️📂🔍) from accumulated AI response before sending to user.
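The idea can be sketched as a paragraph filter (illustrative; the real function lives in orchestrator-agent.ts):

```typescript
// Sketch of the batch-mode filter: drop paragraphs that begin with an
// emoji tool indicator before the accumulated response reaches the user.
const TOOL_INDICATORS = ['🔧', '💭', '📝', '✏️', '🗑️', '📂', '🔍'];

function filterToolIndicatorsSketch(response: string): string {
  return response
    .split('\n\n')
    .filter((para) => !TOOL_INDICATORS.some((e) => para.trimStart().startsWith(e)))
    .join('\n\n');
}
```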
## Session Transitions
Sessions are **immutable** — never mutated, only deactivated and replaced. The audit trail is via `parent_session_id` + `transition_reason`.
**Only `plan-to-execute` immediately creates a new session.** All other triggers only deactivate; the new session is created on the next AI message.
```typescript
import { getTriggerForCommand, shouldCreateNewSession } from '../state/session-transitions';
const trigger = getTriggerForCommand('reset'); // 'reset-requested'
if (shouldCreateNewSession(trigger)) {
// plan-to-execute only
}
```
`TransitionTrigger` values: `'first-message'`, `'plan-to-execute'`, `'isolation-changed'`, `'reset-requested'`, `'worktree-removed'`, `'conversation-closed'`.
## Isolation Resolution
`validateAndResolveIsolation()` (orchestrator.ts:108) delegates to `IsolationResolver` and handles:
- Sending contextual messages to the platform (e.g., "Reusing worktree from issue #42")
- Updating the DB (`conversation.isolation_env_id`, `conversation.cwd`)
- Retrying once when a stale reference is found (`stale_cleaned`)
- Throwing `IsolationBlockedError` after platform notification when blocked
When isolation is blocked, **stop all further processing** — `IsolationBlockedError` means the user was already notified.
## Background Workflow Dispatch (Web only)
`dispatchBackgroundWorkflow()` (orchestrator.ts:256) creates a hidden worker conversation (`web-worker-{timestamp}-{random}`), sets up event bridging from worker SSE → parent SSE, pre-creates the workflow run row (prevents 404 on immediate UI navigation), and fires-and-forgets `executeWorkflow()`. On completion, surfaces `result.summary` to the parent conversation.
## Lazy Logger Pattern
All files in this area use the deferred logger pattern — NEVER initialize at module scope:
```typescript
let cachedLog: ReturnType<typeof createLogger> | undefined;
function getLog(): ReturnType<typeof createLogger> {
if (!cachedLog) cachedLog = createLogger('orchestrator');
return cachedLog;
}
```
## Anti-patterns
- Never call `isActive()` and then `acquireLock()` — race condition, use the lock return value
- Never access `conversation.isolation_env_id` directly without going through the resolver
- Never skip `IsolationBlockedError` — it must propagate to stop all further message handling
- Never add platform-specific logic to the orchestrator; it uses `IPlatformAdapter` interface only
- Never transition sessions by mutating them; always deactivate and create a new linked session
- Never assume a slash command is deterministic — only the 10 listed above bypass the AI router

View file

@@ -1,109 +0,0 @@
---
paths:
- "packages/server/**/*.ts"
---
# Server API Conventions
## Hono Framework
```typescript
import { Hono } from 'hono';
import { streamSSE } from 'hono/streaming';
import { cors } from 'hono/cors';
// CORS: allow-all for single-developer tool (override with WEB_UI_ORIGIN)
app.use('/api/*', cors({ origin: process.env.WEB_UI_ORIGIN || '*' }));
// Error response helper pattern
function apiError(c: Context, status: 400 | 404 | 500, message: string): Response {
return c.json({ error: message }, status);
}
```
## SSE Streaming
Always check `stream.closed` before writing. Use `stream.onAbort()` for cleanup. Hono's `streamSSE` callback receives an SSE writer:
```typescript
app.get('/api/stream/:id', (c) => {
  const conversationId = c.req.param('id');
  return streamSSE(c, async (stream) => {
    stream.onAbort(() => {
      transport.removeStream(conversationId, stream); // pass the same stream reference
    });
    // Write events:
    if (!stream.closed) {
      await stream.writeSSE({ data: JSON.stringify(event) });
    }
  });
});
```
`SSETransport` in `src/adapters/web/transport.ts` manages the stream registry. `removeStream()` accepts an `expectedStream` reference to prevent race conditions (StrictMode double-mount).
## Webhook Signature Verification
```typescript
// ALWAYS use c.req.text() for raw webhook body — JSON.parse separately
const payload = await c.req.text();
const signature = c.req.header('X-Hub-Signature-256') ?? '';
// timingSafeEqual prevents timing attacks
const hmac = createHmac('sha256', webhookSecret);
const digest = 'sha256=' + hmac.update(payload).digest('hex');
const isValid = timingSafeEqual(Buffer.from(digest), Buffer.from(signature));
```
Return 200 immediately for webhook events; process async. Never log the full signature.
## Auto Port Allocation (Worktrees)
`getPort()` from `@archon/core` returns:
- Main repo: `PORT` env var or `3090`
- Worktrees: hash-based port in range 3190-4089 (deterministic per worktree path)
Same worktree always gets same port. Override with `PORT=4000` env var.
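The deterministic mapping can be sketched as hash-the-path-into-the-range (the hash function below is an assumption for illustration, not the real algorithm):

```typescript
// Sketch: deterministic worktree port allocation — hash the worktree path
// into the 3190-4089 range (900 ports). Same path always yields same port.
function allocatePortSketch(worktreePath: string): number {
  let hash = 0;
  for (let i = 0; i < worktreePath.length; i++) {
    // Simple 32-bit rolling hash; illustrative only.
    hash = (hash * 31 + worktreePath.charCodeAt(i)) >>> 0;
  }
  return 3190 + (hash % 900);
}
```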
## Static SPA Fallback
```typescript
// Serve web dist; fall back to index.html for client-side routing
app.use('/*', serveStatic({ root: path.join(import.meta.dir, '../../web/dist') }));
app.get('*', (c) => c.html(/* index.html */));
```
Use `import.meta.dir` (absolute) NOT relative paths — `bun --filter @archon/server start` changes CWD to `packages/server/`.
## Graceful Shutdown
```typescript
process.on('SIGTERM', () => {
stopCleanupScheduler();
void pool.close();
process.exit(0);
});
```
## Key API Routes
| Method | Path | Purpose |
|--------|------|---------|
| GET | `/api/conversations` | List conversations |
| POST | `/api/conversations` | Create conversation |
| POST | `/api/conversations/:id/message` | Send message |
| GET | `/api/stream/:id` | SSE stream |
| GET | `/api/workflows` | List workflows |
| POST | `/api/workflows/validate` | Validate YAML (in-memory) |
| GET | `/api/workflows/:name` | Get single workflow |
| PUT | `/api/workflows/:name` | Save workflow |
| DELETE | `/api/workflows/:name` | Delete workflow |
| GET | `/api/commands` | List commands |
| POST | `/webhooks/github` | GitHub webhook |
## Anti-patterns
- Never use `c.req.json()` for webhooks — signature must be verified against raw body
- Never expose API keys in JSON error responses
- Never serve static files with relative paths (use `import.meta.dir`)
- Never skip the `stream.closed` check before writing SSE
- Never call platform adapters directly from route handlers — use `handleMessage()` + lock manager

View file

@@ -1,105 +0,0 @@
---
paths:
- "**/*.test.ts"
- "**/*.spec.ts"
---
# Testing Conventions
## CRITICAL: mock.module() Pollution Rules
`mock.module()` permanently replaces modules in the **process-wide module cache**. `mock.restore()` does NOT undo it ([oven-sh/bun#7823](https://github.com/oven-sh/bun/issues/7823)).
**Rules:**
1. **Never add `afterAll(() => mock.restore())` for `mock.module()` calls** — it has no effect
2. **Never have two test files `mock.module()` the same path with different implementations in the same `bun test` invocation**
3. **Use `spyOn()` for internal modules**`spy.mockRestore()` DOES work for spies
```typescript
// CORRECT: spy (restorable)
import * as git from '@archon/git';
const spy = spyOn(git, 'checkout');
spy.mockImplementation(async () => ({ ok: true, value: undefined }));
// afterEach:
spy.mockRestore();
// CORRECT: mock.module() for external deps (not restorable — isolate in separate test file)
mock.module('@slack/bolt', () => ({ App: mock(() => mockApp), LogLevel: { INFO: 'info' } }));
```
## Test Batching Per Package
Each package splits tests into separate `bun test` invocations to prevent pollution:
| Package | Batches |
|---------|---------|
| `@archon/core` | 7 batches (clients, handlers, db+utils, path-validation, cleanup-service, title-generator, workflows, orchestrator) |
| `@archon/workflows` | 5 batches |
| `@archon/adapters` | 3 batches (chat+community+forge-auth, github-adapter, github-context) |
| `@archon/isolation` | 3 batches |
**Never run `bun test` from the repo root** — causes ~135 mock pollution failures. Always use:
```bash
bun run test # Correct: per-package isolation via bun --filter '*' test
bun run test --watch # Watch mode (single package)
```
## Mock Pattern for Lazy Loggers
All adapter/db/orchestrator files use lazy logger pattern. Mock before import:
```typescript
// MUST come before import of the module under test
const mockLogger = {
fatal: mock(() => undefined), error: mock(() => undefined),
warn: mock(() => undefined), info: mock(() => undefined),
debug: mock(() => undefined), trace: mock(() => undefined),
};
mock.module('@archon/paths', () => ({ createLogger: mock(() => mockLogger) }));
import { SlackAdapter } from './adapter'; // Import AFTER mock
```
## Database Test Mocking
```typescript
import { createQueryResult, mockPostgresDialect } from '../test/mocks/database';
const mockQuery = mock(() => Promise.resolve(createQueryResult([])));
mock.module('./connection', () => ({
pool: { query: mockQuery },
getDialect: () => mockPostgresDialect,
}));
// In tests:
mockQuery.mockResolvedValueOnce(createQueryResult([existingRow]));
mockQuery.mockClear(); // in beforeEach
```
## Test Structure
```typescript
import { describe, test, expect, mock, beforeEach, afterEach } from 'bun:test';
describe('ComponentName', () => {
beforeEach(() => {
mockFn.mockClear(); // Reset call counts
});
test('does thing when condition', async () => {
mockQuery.mockResolvedValueOnce(createQueryResult([fixture]));
const result = await functionUnderTest(input);
expect(result).toEqual(expected);
expect(mockQuery).toHaveBeenCalledTimes(1);
});
});
```
## Anti-patterns
- Never `import` a module before all `mock.module()` calls for its dependencies
- Never use `afterAll(() => mock.restore())` for `mock.module()` — it silently does nothing
- Never test with real database or filesystem in unit tests — always mock
- Never run `bun test` from the repo root
- Never add a new test file with conflicting `mock.module()` to an existing batch — create a new batch in the package's `package.json` test script


@ -1,90 +0,0 @@
---
paths:
- "packages/web/**/*.tsx"
- "packages/web/**/*.ts"
- "packages/web/**/*.css"
---
# Web Frontend Conventions
## Tech Stack
- React 19 + Vite 6 + TypeScript
- Tailwind CSS v4 (CSS-first config)
- shadcn/ui components
- TanStack Query v5 for REST data
- React Router v7 (`react-router`, NOT `react-router-dom`)
- Manual `EventSource` for SSE streaming (no library)
- **Dark theme only** — no light mode toggle
## Tailwind v4 Critical Differences
```css
/* CORRECT: CSS-first import */
@import 'tailwindcss';
@import 'tw-animate-css'; /* NOT tailwindcss-animate */
/* CORRECT: theme variables in @theme inline block */
@theme inline {
  --color-surface: var(--surface);
  --color-accent-bright: var(--accent-bright);
}
/* WRONG: never use @tailwind base/components/utilities */
```
Plugin in `vite.config.ts`: `import tailwindcss from '@tailwindcss/vite'` — uses Vite plugin, **not PostCSS**. `components.json` has blank `tailwind.config` for v4.
## Color Palette (oklch)
All custom colors are OKLCH. Key tokens (defined in `:root` in `index.css`):
- `--surface` (0.18): main surface
- `--surface-elevated` (0.22): cards, popovers
- `--background` (0.14): page background
- `--primary` / `--ring`: blue accent at oklch(0.65 0.18 250)
- `--text-primary` (0.93), `--text-secondary` (0.65), `--text-tertiary` (0.45)
- `--success` (green 155), `--warning` (yellow 75), `--error` (red 25)
Use CSS variables via Tailwind utilities: `bg-surface`, `text-text-primary`, `border-border`, `text-accent-bright`, etc.
## SSE Streaming Pattern
`useSSE()` in `src/hooks/useSSE.ts` is the single SSE consumer. It:
- Opens `EventSource` to `/api/stream/{conversationId}`
- Batches text events (50ms flush timer) to reduce re-renders
- Flushes immediately before `tool_call`, `tool_result`, `workflow_dispatch` events
- Marks disconnected only on `CLOSED` state (not `CONNECTING` — avoids flicker)
- `handlersRef` pattern ensures stable EventSource with fresh handlers
Event types: `text`, `tool_call`, `tool_result`, `error`, `conversation_lock`, `session_info`, `workflow_step`, `workflow_status`, `parallel_agent`, `workflow_artifact`, `dag_node`, `workflow_dispatch`, `workflow_output_preview`, `warning`, `retract`, `heartbeat`.
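The 50ms text-batching behavior can be sketched in isolation. This is a minimal illustration of the batching idea only, with hypothetical names (`TextBatcher`, `pushText`, `flushNow`); the real implementation lives inside the React hook in `src/hooks/useSSE.ts`:

```typescript
type Flush = (batch: string[]) => void;

// Sketch: text chunks accumulate in a buffer; one 50ms timer flushes the
// whole batch, so many small SSE text events cause a single re-render.
class TextBatcher {
  private buffer: string[] = [];
  private timer: ReturnType<typeof setTimeout> | null = null;

  constructor(private flush: Flush, private delayMs = 50) {}

  pushText(chunk: string): void {
    this.buffer.push(chunk);
    if (this.timer === null) {
      this.timer = setTimeout(() => this.flushNow(), this.delayMs);
    }
  }

  // Called before structural events (tool_call, tool_result, ...) so pending
  // text is rendered ahead of them and ordering is preserved in the UI.
  flushNow(): void {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.buffer.length > 0) {
      this.flush(this.buffer);
      this.buffer = [];
    }
  }
}
```

The immediate flush before structural events is what keeps tool output from appearing above the text that preceded it.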
## Routing
```tsx
// CORRECT
import { BrowserRouter, Routes, Route } from 'react-router';
// WRONG
import { BrowserRouter } from 'react-router-dom';
```
Routes: `/` (Dashboard), `/chat`, `/chat/*`, `/workflows`, `/workflows/builder`, `/workflows/runs/:runId`, `/settings`.
## API Client Pattern
```typescript
// src/lib/api.ts exports SSE_BASE_URL and REST functions
import { SSE_BASE_URL } from '@/lib/api';
// In dev: Vite proxies /api/* to localhost:{VITE_API_PORT}
// API port injected at build time: import.meta.env.VITE_API_PORT
```
TanStack Query `staleTime: 10_000`, `refetchOnWindowFocus: true`.
## Anti-patterns
- Never add a light mode — dark-only is intentional
- Never use `react-router-dom` — use `react-router` (v7)
- Never configure Tailwind in `tailwind.config.js/ts` — v4 is CSS-first
- Never use `tailwindcss-animate` — use `tw-animate-css`
- Never open a second `EventSource` per conversation — `useSSE()` handles it
- Never pass inline style objects for theme colors — use Tailwind classes with CSS variables


@ -1,100 +0,0 @@
---
paths:
- "packages/workflows/**/*.ts"
- ".archon/workflows/**/*.yaml"
- ".archon/commands/**/*.md"
---
# Workflows Conventions
## DAG Workflow Format
All workflows use the DAG (Directed Acyclic Graph) format with `nodes:`. Loop nodes are supported as a node type within DAGs.
```yaml
nodes:
  - id: classify
    prompt: "Is this a bug or feature? Answer JSON: {type: 'BUG'|'FEATURE'}"
    output_format: {type: object, properties: {type: {type: string}}}
  - id: implement
    command: execute
    depends_on: [classify]
    when: "$classify.output.type == 'FEATURE'"
  - id: run_lint
    bash: "bun run lint"
    depends_on: [implement]
  - id: iterate
    loop:
      until: "COMPLETE"
      max_iterations: 10
      prompt: "Iterate until the tests pass. Signal COMPLETE when done."
    depends_on: [run_lint]
```
## Variable Substitution
| Variable | Resolved to |
|----------|-------------|
| `$1`, `$2`, `$3` | Positional arguments from user message |
| `$ARGUMENTS` | All user arguments as single string |
| `$ARTIFACTS_DIR` | Pre-created external artifacts directory |
| `$WORKFLOW_ID` | Current workflow run ID |
| `$BASE_BRANCH` | Base branch from config or auto-detected |
| `$DOCS_DIR` | Documentation directory path (default: `docs/`) |
| `$nodeId.output` | Captured stdout/AI output from completed DAG node |
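A hypothetical two-node sketch combining several of these (the node ids and the report path are illustrative, not part of any bundled workflow):

```yaml
nodes:
  - id: gather
    bash: "git log --oneline $BASE_BRANCH..HEAD"
  - id: summarize
    depends_on: [gather]
    prompt: |
      Summarize these commits for $1 and write the report
      to $ARTIFACTS_DIR/summary.md:
      $gather.output
```

Here `$gather.output` is the captured stdout of the bash node, and `$1` is the first positional argument from the user message.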
## WorkflowDeps — Dependency Injection
`@archon/workflows` has ZERO `@archon/core` dependency. Everything is injected:
```typescript
interface WorkflowDeps {
  store: IWorkflowStore; // DB abstraction
  getAssistantClient: AssistantClientFactory; // Returns claude or codex client
  loadConfig: (cwd: string) => Promise<WorkflowConfig>;
}
// Core creates the adapter:
import { createWorkflowDeps } from '@archon/core/workflows/store-adapter';
const deps = createWorkflowDeps();
await executeWorkflow(deps, platform, conversationId, cwd, workflow, ...);
```
## DAG Node Types
- `command:` — named file from `.archon/commands/`, AI-executed
- `prompt:` — inline prompt string, AI-executed
- `bash:` — shell script, no AI; stdout captured as `$nodeId.output`; default timeout 120000ms
DAG node options: `depends_on`, `when` (condition expression), `trigger_rule` (`all_success` | `one_success` | `none_failed_min_one_success` | `all_done`), `output_format` (JSON Schema, Claude only), `allowed_tools` / `denied_tools` (Claude only), `idle_timeout` (ms), `context: 'fresh'`, per-node `provider` and `model`.
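For illustration, here is a hypothetical node exercising several of these options at once (the node ids and the `<model-id>` value are placeholders):

```yaml
- id: report
  prompt: "Summarize the run results as JSON"
  depends_on: [lint, tests]
  trigger_rule: all_done        # run even if a dependency failed
  output_format:                # JSON Schema, Claude only
    type: object
    properties:
      summary: {type: string}
  allowed_tools: []             # explicit "no tools" (not undefined)
  idle_timeout: 300000          # ms
  context: 'fresh'
  provider: claude
  model: <model-id>
```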
## Event Emitter for Observability
```typescript
import { getWorkflowEventEmitter } from '@archon/workflows';
const emitter = getWorkflowEventEmitter();
emitter.registerRun(runId, conversationId);
// Subscribe (returns unsubscribe fn)
const unsubscribe = emitter.subscribeForConversation(conversationId, (event) => {
  // event.type: 'step_started' | 'step_completed' | 'node_started' | ...
});
```
Listener errors never propagate to the executor — fire-and-forget with internal catch.
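The fire-and-forget dispatch amounts to wrapping each listener call in its own try/catch. This is an illustrative sketch with hypothetical names (`SafeEmitter`), not the actual emitter source:

```typescript
type Listener = (event: { type: string }) => void;

// Sketch: a throwing subscriber is caught per-call, so it can never
// propagate into (and crash) the workflow executor.
class SafeEmitter {
  private listeners = new Set<Listener>();

  subscribe(fn: Listener): () => void {
    this.listeners.add(fn);
    return () => { this.listeners.delete(fn); }; // returned unsubscribe fn
  }

  emit(event: { type: string }): void {
    for (const fn of this.listeners) {
      try {
        fn(event);
      } catch {
        // fire-and-forget: log internally instead of rethrowing
      }
    }
  }
}
```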
## Architecture
- Model validation at load time — invalid provider/model combinations fail `parseWorkflow()` with clear error
- Resilient discovery — one broken YAML doesn't abort `discoverWorkflows()`; errors returned in `WorkflowLoadResult.errors`
- Bundled defaults embedded in binary builds; loaded from filesystem in source builds
- Repo workflows override bundled defaults by name
- Router fallback: if no `/invoke-workflow` produced → falls back to `archon-assist`; raw AI response only when `archon-assist` unavailable
## Anti-patterns
- Never import `@archon/core` from `@archon/workflows` (circular dependency)
- Never add `clearContext: true` to every step — context continuity is valuable; use sparingly
- Never put `output_format` on Codex nodes — it logs a warning and is ignored
- Never set `allowed_tools: undefined` expecting "no tools" — use `allowed_tools: []` for that


@ -119,9 +119,11 @@ If Bun was just installed in Prerequisites (macOS/Linux), use `~/.bun/bin/bun` i
3. Verify: `archon version`
4. Check Claude is installed: `which claude`, then `claude /login` if needed
> **Note — Claude Code binary path.** Archon does not bundle Claude Code. In compiled Archon binaries (quick install, Homebrew), the Claude Code SDK needs `CLAUDE_BIN_PATH` set to the absolute path of its `cli.js`. The `archon setup` wizard in Step 4 auto-detects this via `npm root -g` and writes it to `~/.archon/.env` — no manual action needed in the typical case. Source installs (`bun run`) don't need this; the SDK finds `cli.js` via `node_modules` automatically.
## Step 4: Configure Credentials
The CLI loads infrastructure config (database, tokens) from `~/.archon/.env` only. This prevents conflicts with project `.env` files that may contain different database URLs.
Archon loads infrastructure config (database, tokens) from two archon-owned files — `~/.archon/.env` (user scope) and `<cwd>/.archon/.env` (repo scope, overrides user). The project's own `<cwd>/.env` is stripped at boot so it cannot leak into Archon; `archon setup` never writes to it.
Credential configuration runs in a separate terminal so your API keys stay private — the AI assistant won't see them.
@ -144,7 +146,7 @@ Tell the user:
> 2. AI assistant configuration (Claude and/or Codex)
> 3. Platform tokens for any integrations you selected
>
> It saves configuration to both `~/.archon/.env` and the repo `.env`."
> By default it saves to `~/.archon/.env` (user scope). Re-run with `archon setup --scope project` to write `<repo>/.archon/.env` instead (project overrides user for this repo). Existing values are preserved — a timestamped backup is written before every rewrite."
**If the terminal opened automatically**, add:
> "Complete the wizard in the new terminal window that just opened."
@ -158,7 +160,7 @@ Both paths are normal — the manual path is not an error.
Wait for the user to confirm they've completed the setup wizard before proceeding.
### 5c: Verify Configuration
### 4c: Verify Configuration
After the user confirms setup is complete:
@ -170,7 +172,7 @@ Should show:
- `Database: sqlite` (default, zero setup) or `Database: postgresql` (if DATABASE_URL was configured)
- No errors about missing configuration
### 5d: Run Database Migrations (PostgreSQL only)
### 4d: Run Database Migrations (PostgreSQL only)
**SQLite users: skip this step.** SQLite is auto-initialized on first run with zero setup.
@ -299,16 +301,21 @@ For advanced users — these are not needed for basic setup:
### Environment Files (`.env`)
Infrastructure config (database URL, platform tokens) is stored in `.env` files:
Archon's env model is scoped by directory ownership: `.archon/` is archon-owned, anything else belongs to you.
| Location | Used by | Purpose |
|----------|---------|---------|
| `~/.archon/.env` | **CLI** | Global infrastructure config — database, AI tokens |
| `<archon-repo>/.env` | **Server** | Platform tokens for Telegram/Slack/GitHub/Discord |
| Path | Stripped at boot? | Archon loads? | `archon setup` writes? |
|------|-------------------|---------------|------------------------|
| `<cwd>/.env` | **yes** (safety guard) | never | never |
| `<cwd>/.archon/.env` | no | yes (project scope, overrides user scope) | yes iff `--scope project` |
| `~/.archon/.env` | no | yes (user scope) | yes iff `--scope home` (default) |
**Best practice**: Use `~/.archon/.env` as the single source of truth. Symlink or copy to `<archon-repo>/.env` if running the server.
**Which should I use?**
**Note**: The CLI does NOT load `.env` from the current working directory. This prevents conflicts when running Archon from projects that have their own database configurations.
- `~/.archon/.env` — defaults that apply everywhere (your personal `SLACK_WEBHOOK`, `DATABASE_URL`, bot tokens).
- `<cwd>/.archon/.env` — per-project overrides (different webhook per repo, different DB per environment).
- `<cwd>/.env` — your app's env file; archon strips these keys at boot so nothing leaks between your app and archon.
`archon setup` writes to exactly one archon-owned file chosen by `--scope` (default `home`), merges into existing content so user-added keys survive, and writes a timestamped backup before every rewrite. Use `--force` to opt into wholesale overwrite (backup still written).
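The precedence above reduces to a simple override merge, sketched here with hypothetical names (the real loader is `loadArchonEnv` in `@archon/paths/env-loader`):

```typescript
type Env = Record<string, string>;

// User scope loads first; repo scope loads second with override, so the
// later spread wins on key collisions while user-only keys survive.
function mergeScopes(userScope: Env, repoScope: Env): Env {
  return { ...userScope, ...repoScope };
}
```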
### Config Files (YAML)


@ -170,7 +170,7 @@ Command/prompt nodes only:
required: [issue_type]
```
Enables `$classify.output.issue_type` field access. Works with Claude and Codex.
Enables `$classify.output.issue_type` field access. SDK-enforced on Claude and Codex; best-effort on Pi (schema is appended to the prompt and JSON is parsed out of the result text).
## Per-Node Provider and Model


@ -97,13 +97,16 @@ Read the commit messages and the actual diffs (`git diff main..dev`) to understa
- `pyproject.toml`: update `version = "x.y.z"`
- `Cargo.toml`: update `version = "x.y.z"`
2. **Lockfile refresh** (stack-dependent):
2. **Workspace version sync** (monorepo only):
- If `scripts/sync-versions.sh` exists, run `bash scripts/sync-versions.sh` to sync all `packages/*/package.json` versions to match the root version.
3. **Lockfile refresh** (stack-dependent):
- `package.json` + `bun.lock`: run `bun install`
- `package.json` + `package-lock.json`: run `npm install --package-lock-only`
- `pyproject.toml` + `uv.lock`: run `uv lock --quiet`
- `Cargo.toml`: run `cargo update --workspace`
3. **`CHANGELOG.md`** — prepend new version section:
4. **`CHANGELOG.md`** — prepend new version section:
```markdown
## [x.y.z] - YYYY-MM-DD
@ -141,8 +144,8 @@ Ask: "Does this look good? I'll commit and create the PR."
Only after user approval:
```bash
# Stage version file, lockfile, and changelog
git add <version-file> <lockfile> CHANGELOG.md
# Stage version file, workspace packages, lockfile, and changelog
git add <version-file> packages/*/package.json <lockfile> CHANGELOG.md
git commit -m "Release x.y.z"
# Push dev
@ -186,12 +189,22 @@ git pull origin main
git push origin dev
```
> **Important**: This sync ensures dev has the merge commit from main. Without it,
> dev and main diverge. The CI `update-homebrew` job only pushes the formula
> commit to dev — it does not bring the PR merge commit onto dev. This manual
> `git pull origin main` is what ensures dev has the merge commit.
The GitHub Release is distinct from the git tag — without it, the release won't appear on the repository's Releases page. Always create it.
If the user merges the PR themselves and comes back, still offer to tag, release, and sync.
### Step 10: Wait for Release Workflow and Update Homebrew Formula
> **Note**: The `update-homebrew` CI job in `.github/workflows/release.yml` runs automatically
> after the release job and handles the formula update + push to dev (part of Step 10).
> Step 11 (tap sync to `coleam00/homebrew-archon`) is always manual. Check the Actions tab
> before running Step 10 manually.
After the tag is pushed, `.github/workflows/release.yml` builds platform binaries and uploads them to the GitHub release. This takes 5-10 minutes. The Homebrew formula SHA256 values cannot be known until these binaries exist.
**Wait for all assets to appear on the release:**
@ -200,16 +213,16 @@ After the tag is pushed, `.github/workflows/release.yml` builds platform binarie
echo "Waiting for release workflow to finish uploading binaries..."
for i in {1..30}; do
ASSET_COUNT=$(gh release view "vx.y.z" --repo coleam00/Archon --json assets --jq '.assets | length')
# Expect 6 assets: 5 binaries (darwin-arm64, darwin-x64, linux-arm64, linux-x64, windows-x64.exe) + checksums.txt
if [ "$ASSET_COUNT" -ge 6 ]; then
# Expect 7 assets: 5 binaries (darwin-arm64, darwin-x64, linux-arm64, linux-x64, windows-x64.exe) + archon-web.tar.gz + checksums.txt
if [ "$ASSET_COUNT" -ge 7 ]; then
echo "All $ASSET_COUNT assets uploaded"
break
fi
echo " Assets so far: $ASSET_COUNT/6 — waiting 30s (attempt $i/30)..."
echo " Assets so far: $ASSET_COUNT/7 — waiting 30s (attempt $i/30)..."
sleep 30
done
if [ "$ASSET_COUNT" -lt 6 ]; then
if [ "$ASSET_COUNT" -lt 7 ]; then
echo "ERROR: Release workflow did not finish uploading assets after 15 minutes"
echo "Check https://github.com/coleam00/Archon/actions for the release workflow run"
exit 1


@ -222,7 +222,23 @@ git commit -q --allow-empty -m init
### Test 3 — SDK path works (assist workflow)
In the same `$TESTREPO`:
**Prerequisite.** Compiled binaries require Claude Code installed on the host and a configured binary path. Before running this test, ensure one of:
```bash
# Option A — env var (easy for ad-hoc testing)
# After the native installer (Anthropic's default):
export CLAUDE_BIN_PATH="$HOME/.local/bin/claude"
# Or after npm global install:
export CLAUDE_BIN_PATH="$(npm root -g)/@anthropic-ai/claude-code/cli.js"
# Option B — config file (persistent)
# Add to ~/.archon/config.yaml:
# assistants:
# claude:
# claudeBinaryPath: /absolute/path/to/claude
```
Then in the same `$TESTREPO`:
```bash
"$BINARY" workflow run assist "say hello and nothing else" 2>&1 | tee /tmp/archon-test-assist.log
@ -232,15 +248,34 @@ In the same `$TESTREPO`:
- Exit code 0
- The Claude subprocess spawns successfully (no `spawn EACCES`, `ENOENT`, or `process exited with code 1` in the early output)
- No `Claude Code CLI not found` error (that means the resolver rejected the configured path — verify the cli.js actually exists)
- A response is produced (any response — even just "hello" — proves the SDK round-trip works)
**Common failures:**
- `Claude Code not found` → `CLAUDE_BIN_PATH` / `claudeBinaryPath` is unset or points at a non-existent file. Fix the path and re-run.
- `Module not found "/Users/runner/..."` → regression of #1210: the resolver was bypassed and the SDK's `import.meta.url` fallback leaked a build-host path. Investigate `packages/providers/src/claude/provider.ts` and the resolver.
- `Credit balance is too low` → auth is pointing at an exhausted API key (check `CLAUDE_USE_GLOBAL_AUTH` and `~/.archon/.env`)
- `unable to determine transport target for "pino-pretty"` → #960 regression, binary crashes on TTY
- `package.json not found (bad installation?)` → #961 regression, `isBinaryBuild` detection broken
- Process exits before producing output → generic spawn failure, capture stderr
### Test 3b — Resolver error path (run without `CLAUDE_BIN_PATH`)
Quickly verify the resolver fails loudly when nothing is configured:
```bash
(unset CLAUDE_BIN_PATH; "$BINARY" workflow run assist "hello" 2>&1 | tee /tmp/archon-test-no-path.log)
```
**Pass criteria (when no `~/.archon/config.yaml` configures `claudeBinaryPath`):**
- Error message contains `Claude Code not found`
- Error message mentions both `CLAUDE_BIN_PATH` and `claudeBinaryPath` as remediation options
- No `Module not found` stack traces referencing the CI filesystem
If you *do* have `claudeBinaryPath` set globally, skip this test or temporarily rename `~/.archon/config.yaml`.
### Test 4 — Env-leak gate refuses a leaky .env (optional, for releases including #1036/#1038/#983)
Create a second throwaway repo with a fake sensitive key:


@ -14,6 +14,20 @@ CLAUDE_USE_GLOBAL_AUTH=true
# CLAUDE_CODE_OAUTH_TOKEN=...
# CLAUDE_API_KEY=...
# Claude Code executable path (REQUIRED for compiled Archon binaries)
# Archon does not bundle Claude Code — install it separately and point us at it.
# Dev mode (`bun run`) auto-resolves via node_modules.
# Alternatively, set `assistants.claude.claudeBinaryPath` in ~/.archon/config.yaml.
#
# Install (Anthropic's recommended native installer):
# macOS/Linux: curl -fsSL https://claude.ai/install.sh | bash
# Windows: irm https://claude.ai/install.ps1 | iex
#
# Then:
# CLAUDE_BIN_PATH=$HOME/.local/bin/claude (native installer)
# CLAUDE_BIN_PATH=$(npm root -g)/@anthropic-ai/claude-code/cli.js (npm alternative)
# CLAUDE_BIN_PATH=
# Codex Authentication (get from ~/.codex/auth.json after running 'codex login')
# Required if using Codex as AI assistant
# On Linux/Mac: cat ~/.codex/auth.json
@ -22,9 +36,29 @@ CODEX_ID_TOKEN=
CODEX_ACCESS_TOKEN=
CODEX_REFRESH_TOKEN=
CODEX_ACCOUNT_ID=
# CODEX_BIN_PATH= # Optional: path to Codex native binary (binary builds only)
# Default AI Assistant (claude | codex)
# Used for new conversations when no codebase specified
# Pi (community provider — @mariozechner/pi-coding-agent)
# One adapter, ~20 LLM backends. Archon's Pi adapter picks up credentials
# you've already configured via the Pi CLI (`pi /login` writes to
# ~/.pi/agent/auth.json), plus these env vars for backends you haven't
# logged into via OAuth. Env vars override auth.json per-request.
#
# Use by setting `provider: pi` and `model: <pi-provider-id>/<model-id>` in
# workflow YAML or `.archon/config.yaml` (e.g. model: google/gemini-2.5-pro).
#
# ANTHROPIC_API_KEY= # Pi provider id: anthropic
# OPENAI_API_KEY= # Pi provider id: openai
# GEMINI_API_KEY= # Pi provider id: google
# GROQ_API_KEY= # Pi provider id: groq
# MISTRAL_API_KEY= # Pi provider id: mistral
# CEREBRAS_API_KEY= # Pi provider id: cerebras
# XAI_API_KEY= # Pi provider id: xai
# OPENROUTER_API_KEY= # Pi provider id: openrouter
# HUGGINGFACE_API_KEY= # Pi provider id: huggingface
# Default AI Assistant (must match a registered provider, e.g. claude, codex, pi)
# Used for new conversations when no codebase specified — errors on unknown values
DEFAULT_AI_ASSISTANT=claude
# Title Generation Model (optional)
@ -118,7 +152,7 @@ GITEA_ALLOWED_USERS=
# GITEA_BOT_MENTION=archon
# Server
PORT=3000
# PORT=3090 # Default: 3090. Uncomment to override — must match between server and Vite proxy.
# HOST=0.0.0.0 # Bind address (default: 0.0.0.0). Set to 127.0.0.1 to restrict to localhost only.
# Cloud Deployment (for --profile cloud with Caddy reverse proxy)
@ -172,3 +206,17 @@ MAX_CONCURRENT_CONVERSATIONS=10 # Maximum concurrent AI conversations (default:
# Session Retention
# SESSION_RETENTION_DAYS=30 # Delete inactive sessions older than N days (default: 30)
# Anonymous Telemetry (optional)
# Archon sends anonymous workflow-invocation events to PostHog so maintainers
# can see which workflows get real usage. No PII — workflow name/description +
# platform + Archon version + a random install UUID. No identities, no prompts,
# no paths, no code. See README "Telemetry" for the full list.
#
# Opt out (any one disables telemetry):
# ARCHON_TELEMETRY_DISABLED=1
# DO_NOT_TRACK=1 (de facto standard)
#
# Point at a self-hosted PostHog or a different project:
# POSTHOG_API_KEY=phc_yourKeyHere
# POSTHOG_HOST=https://eu.i.posthog.com (default: https://us.i.posthog.com)

.github/workflows/e2e-smoke.yml (new file)

@ -0,0 +1,123 @@
name: E2E Smoke Tests
on:
  push:
    branches: [main, dev]
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
jobs:
  # ─── Tier 1: Deterministic (no API keys needed) ────────────────────────
  e2e-deterministic:
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - uses: actions/checkout@v4
      - name: Setup Bun
        uses: oven-sh/setup-bun@v2
        with:
          bun-version: 1.3.11
      - name: Setup uv (for Python script nodes)
        uses: astral-sh/setup-uv@v4
      - name: Install dependencies
        run: bun install --frozen-lockfile
      - name: Run deterministic workflow
        run: bun run cli workflow run e2e-deterministic --no-worktree "smoke test"
  # ─── Tier 2a: Claude provider ──────────────────────────────────────────
  e2e-claude:
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - uses: actions/checkout@v4
      - name: Setup Bun
        uses: oven-sh/setup-bun@v2
        with:
          bun-version: 1.3.11
      - name: Install Claude Code CLI
        run: |
          curl -fsSL https://claude.ai/install.sh | bash
          echo "$HOME/.local/bin" >> $GITHUB_PATH
      - name: Install dependencies
        run: bun install --frozen-lockfile
      - name: Run Claude smoke test
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          CLAUDE_BIN_PATH: ~/.local/bin/claude
        run: bun run cli workflow run e2e-claude-smoke --no-worktree "smoke test"
  # ─── Tier 2b: Codex provider ───────────────────────────────────────────
  e2e-codex:
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - uses: actions/checkout@v4
      - name: Setup Bun
        uses: oven-sh/setup-bun@v2
        with:
          bun-version: 1.3.11
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 22
      - name: Install Codex CLI
        run: npm install -g @openai/codex
      - name: Install dependencies
        run: bun install --frozen-lockfile
      - name: Run Codex smoke test
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          CODEX_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: bun run cli workflow run e2e-codex-smoke --no-worktree "smoke test"
  # ─── Tier 3: Mixed providers ───────────────────────────────────────────
  e2e-mixed:
    runs-on: ubuntu-latest
    timeout-minutes: 5
    needs: [e2e-claude, e2e-codex]
    steps:
      - uses: actions/checkout@v4
      - name: Setup Bun
        uses: oven-sh/setup-bun@v2
        with:
          bun-version: 1.3.11
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 22
      - name: Install Claude Code CLI
        run: |
          curl -fsSL https://claude.ai/install.sh | bash
          echo "$HOME/.local/bin" >> $GITHUB_PATH
      - name: Install Codex CLI
        run: npm install -g @openai/codex
      - name: Install dependencies
        run: bun install --frozen-lockfile
      - name: Run mixed providers test
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          CODEX_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          CLAUDE_BIN_PATH: ~/.local/bin/claude
        run: bun run cli workflow run e2e-mixed-providers --no-worktree "smoke test"


@ -124,6 +124,83 @@ jobs:
exit 1
fi
      - name: Smoke-test Claude binary-path resolver (negative case)
        if: matrix.target == 'bun-linux-x64' && runner.os == 'Linux'
        run: |
          # With no CLAUDE_BIN_PATH and no config, running a Claude workflow must
          # fail with a clear, user-facing error — NOT with "Module not found
          # /Users/runner/..." which would indicate the resolver was bypassed.
          BIN="$PWD/dist/${{ matrix.binary }}"
          TMP_REPO=$(mktemp -d)
          cd "$TMP_REPO"
          git init -q
          git -c user.email=ci@example.com -c user.name=ci commit --allow-empty -q -m init
          # Run without CLAUDE_BIN_PATH set. Expect a clean resolver error.
          # Capture both stdout and stderr; we only care that the resolver message is present.
          set +e
          OUTPUT=$(env -u CLAUDE_BIN_PATH "$BIN" workflow run archon-assist "hello" 2>&1)
          EXIT_CODE=$?
          set -e
          echo "$OUTPUT"
          if echo "$OUTPUT" | grep -qE 'Module not found.*Users/runner'; then
            echo "::error::Resolver was bypassed — SDK hit the import.meta.url fallback (regression of #1210)"
            exit 1
          fi
          if ! echo "$OUTPUT" | grep -q "Claude Code not found"; then
            echo "::error::Expected 'Claude Code not found' error when CLAUDE_BIN_PATH is unset"
            exit 1
          fi
          if ! echo "$OUTPUT" | grep -q "CLAUDE_BIN_PATH"; then
            echo "::error::Error message does not reference CLAUDE_BIN_PATH remediation"
            exit 1
          fi
          echo "::notice::Resolver error path works (exit code: $EXIT_CODE)"
      - name: Smoke-test Claude subprocess spawn (positive case)
        if: matrix.target == 'bun-linux-x64' && runner.os == 'Linux'
        run: |
          # Install Claude Code via the native installer (Anthropic's recommended
          # default) and run a workflow with CLAUDE_BIN_PATH set. The subprocess
          # must spawn cleanly. We do NOT require the query to succeed (no auth
          # in CI — an auth error is fine and expected); we only fail if the SDK
          # can't find the executable, which would indicate a resolver regression.
          curl -fsSL https://claude.ai/install.sh | bash
          CLI_PATH="$HOME/.local/bin/claude"
          if [ ! -x "$CLI_PATH" ]; then
            echo "::error::Claude Code binary not found after curl install at $CLI_PATH"
            ls -la "$HOME/.local/bin/" || true
            exit 1
          fi
          echo "Using CLAUDE_BIN_PATH=$CLI_PATH"
          BIN="$PWD/dist/${{ matrix.binary }}"
          TMP_REPO=$(mktemp -d)
          cd "$TMP_REPO"
          git init -q
          git -c user.email=ci@example.com -c user.name=ci commit --allow-empty -q -m init
          set +e
          OUTPUT=$(CLAUDE_BIN_PATH="$CLI_PATH" "$BIN" workflow run archon-assist "hello" 2>&1)
          EXIT_CODE=$?
          set -e
          echo "$OUTPUT"
          if echo "$OUTPUT" | grep -qE 'Module not found.*(cli\.js|Users/runner)'; then
            echo "::error::Subprocess could not find the executable (resolver regression)"
            exit 1
          fi
          if echo "$OUTPUT" | grep -q "Claude Code not found"; then
            echo "::error::Resolver failed even though CLAUDE_BIN_PATH was set to an existing file"
            exit 1
          fi
          # Any of these outcomes are acceptable — they prove the subprocess spawned:
          # - auth error ("credit balance", "unauthorized", "authentication")
          # - rate-limit / API error
          # - successful query (if auth was injected via some other mechanism)
          echo "::notice::Claude subprocess spawn path is healthy (exit code: $EXIT_CODE)"
- name: Upload binary artifact
uses: actions/upload-artifact@v4
with:
@ -145,10 +222,25 @@ jobs:
path: dist
merge-multiple: true
- name: Setup Bun
uses: oven-sh/setup-bun@v2
with:
bun-version: 1.3.11
- name: Install dependencies
run: bun install --frozen-lockfile
- name: Build web UI
run: bun --filter @archon/web build
- name: Package web dist
run: |
tar czf dist/archon-web.tar.gz -C packages/web/dist .
- name: Generate checksums
run: |
cd dist
sha256sum archon-* > checksums.txt
sha256sum archon-* archon-web.tar.gz > checksums.txt
cat checksums.txt
- name: Get version
@ -170,6 +262,7 @@ jobs:
generate_release_notes: true
files: |
dist/archon-*
dist/archon-web.tar.gz
dist/checksums.txt
body: |
## Installation


@ -27,6 +27,9 @@ jobs:
- name: Install dependencies
run: bun install --frozen-lockfile
- name: Check bundled defaults
run: bun run check:bundled
- name: Type check
run: bun run type-check

.gitignore

@ -45,6 +45,9 @@ e2e-screenshots/
.archon/logs/
.archon/artifacts/
# Cross-run workflow state (e.g. issue-triage memory)
.archon/state/
# Agent artifacts (generated, local only)
.agents/
.agents/rca-reports/
@ -54,6 +57,7 @@ e2e-screenshots/
.claude/archon/
.claude/mockups/
.claude/settings.local.json
.claude/scheduled_tasks.lock
e2e-testing-findings-session2.md
# Local workspace


@ -22,6 +22,9 @@ workspace/
# Lock files (auto-generated)
package-lock.json
# Auto-generated source (regenerated by scripts/generate-bundled-defaults.ts)
**/*.generated.ts
# Agent commands and documentation (user-managed)
.agents/
.claude/


@ -7,6 +7,140 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
### Added
- **Home-scoped commands at `~/.archon/commands/`** — personal command helpers now reusable across every repo. Resolution precedence: `<repoRoot>/.archon/commands/` > `~/.archon/commands/` > bundled defaults. Surfaced in the Web UI workflow-builder node palette under a dedicated "Global (~/.archon/commands/)" section.
- **Home-scoped scripts at `~/.archon/scripts/`** — personal Bun/uv scripts now reusable across every repo. Script nodes (`script: my-helper`) resolve via `<repoRoot>/.archon/scripts/` first, then `~/.archon/scripts/`. Repo-scoped scripts with the same name override home-scoped ones silently; within a single scope, duplicate basenames across extensions still throw (unchanged from prior behavior).
- **1-level subfolder support for workflows, commands, and scripts.** Files can live one folder deep under their respective `.archon/` root (e.g. `.archon/workflows/triage/foo.yaml`) and resolve by name or filename regardless of subfolder. Matches the existing `defaults/` convention. Deeper nesting is ignored silently — see docs for the full convention.
- **`'global'` variant on `WorkflowSource`** — workflows at `~/.archon/workflows/` and commands at `~/.archon/commands/` now render with a distinct source label (no longer coerced to `'project'`). Web UI badges updated.
- **`getHomeWorkflowsPath()`, `getHomeCommandsPath()`, `getHomeScriptsPath()`, `getLegacyHomeWorkflowsPath()`** helpers in `@archon/paths`, exported for both internal discovery and external callers that want to target the home scope directly.
- **`discoverScriptsForCwd(cwd)`** in `@archon/workflows/script-discovery` — merges home-scoped + repo-scoped scripts with repo winning on name collisions. Used by the DAG executor and validator; callers no longer need to know about the two-scope shape.
- **Workflow-level worktree policy (`worktree.enabled` in workflow YAML).** A workflow can now pin whether its runs use isolation regardless of how they were invoked: `worktree.enabled: false` always runs in the live checkout (CLI `--branch` / `--from` hard-error; web/chat/orchestrator short-circuits `validateAndResolveIsolation`), `worktree.enabled: true` requires isolation (CLI `--no-worktree` hard-errors). Omit the block to let the caller decide (current default). First consumer: `.archon/workflows/repo-triage.yaml` pinned to `enabled: false` since it's read-only.
- **Per-project worktree path (`worktree.path` in `.archon/config.yaml`).** Opt-in repo-relative directory (e.g. `.worktrees`) where Archon places worktrees for that repo, instead of the default `~/.archon/workspaces/<owner>/<repo>/worktrees/`. Co-locates worktrees with the project so they appear in the IDE file tree. Validated as a safe relative path (no absolute, no `..`); malformed values fail loudly at worktree creation. Users opting in are responsible for `.gitignore`ing the directory themselves — no automatic file mutation. Credits @joelsb for surfacing the need in #1117.
- **Three-path env model with operator-visible log lines.** The CLI and server now load env vars from `~/.archon/.env` (user scope) and `<cwd>/.archon/.env` (repo scope, overrides user) at boot, both with `override: true`. A new `[archon] loaded N keys from <path>` line is emitted per source (only when N > 0). `[archon] stripped N keys from <cwd> (...)` now also prints when stripCwdEnv removes target-repo env keys, replacing the misleading `[dotenv@17.3.1] injecting env (0) from .env` preamble that always reported 0. The `quiet: true` flag suppresses dotenv's own output. (#1302)
- **`archon setup --scope home|project` and `--force` flags.** Default is `--scope home` (writes `~/.archon/.env`). `--scope project` targets `<cwd>/.archon/.env` instead. `--force` overwrites the target wholesale rather than merging; a timestamped backup is still written. (#1303)
- **Merge-only setup writes with timestamped backups.** `archon setup` now reads the existing target file, preserves non-empty values, carries user-added custom keys forward, and writes a `<target>.archon-backup-<ISO-ts>` before every rewrite. Fixes silent PostgreSQL→SQLite downgrade and silent token loss on re-run. (#1303)
- **`getArchonEnvPath()` and `getRepoArchonEnvPath(cwd)`** helpers in `@archon/paths`, plus a new `@archon/paths/env-loader` subpath exporting `loadArchonEnv(cwd)` shared by the CLI and server entry points.
- **Inline sub-agent definitions on DAG nodes (`agents:`).** Define Claude Agent SDK `AgentDefinition`s directly in workflow YAML, keyed by kebab-case agent ID. The main agent can spawn them in parallel via the `Task` tool — useful for map-reduce patterns where a cheap model (e.g. Haiku) briefs items and a stronger model reduces. Removes the need to author `.claude/agents/*.md` files for workflow-scoped helpers. Claude only; Codex and community providers that don't support inline agents emit a capability warning and ignore the field. Merges with the internal `dag-node-skills` wrapper set by `skills:` on the same node — user-defined agents win on ID collision (a warning is logged). (#1276)
- **Pi community provider (`@mariozechner/pi-coding-agent`).** First community provider under the Phase 2 registry (`builtIn: false`). One adapter exposes ~20 LLM backends (Anthropic, OpenAI, Google, Groq, Mistral, Cerebras, xAI, OpenRouter, Hugging Face, and more) via a `<pi-provider-id>/<model-id>` model format. Reads credentials from `~/.pi/agent/auth.json` (populated by running `pi /login` for OAuth subscriptions like Claude Pro/Max, ChatGPT Plus, GitHub Copilot) AND from env vars (env vars take priority per-request). Per-node workflow options supported: `effort`/`thinking` → Pi `thinkingLevel`; `allowed_tools`/`denied_tools` → filter Pi's 7 built-in coding tools; `skills` → resolved against `.agents/skills`, `.claude/skills` (project + user-global); `systemPrompt`; codebase env vars; session resume via `sessionId` round-trip. Unsupported fields (MCP, hooks, structured output, cost limits, fallback model, sandbox) trigger an explicit dag-executor warning rather than silently dropping. Use in workflow YAML: `provider: pi` + `model: anthropic/claude-haiku-4-5`. (#1270)
- **`registerCommunityProviders()` aggregator** in `@archon/providers`. Process entrypoints (CLI, server, config-loader) now call one function to register every bundled community provider. Adding a new community provider is a single-line edit to this aggregator rather than touching each entrypoint — makes the Phase 2 "community providers are a localized addition" promise real.
- **`contributing/adding-a-community-provider.md` guide** — contributor-facing walkthrough of the Phase 2 registry pattern using Pi as the reference implementation.
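Sketched in YAML, the two worktree additions above might look like this — only the `worktree.*` fields come from the entries; the file contents around them are illustrative:

```yaml
# .archon/workflows/repo-triage.yaml — pin a read-only workflow to the live checkout
worktree:
  enabled: false       # CLI --branch / --from hard-error; web/chat skip isolation
---
# .archon/config.yaml — opt this repo into co-located worktrees
worktree:
  path: .worktrees     # validated relative path; .gitignore it yourself
```

Omitting the `worktree:` block in the workflow file preserves the current default: the caller decides whether the run is isolated.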
### Fixed
- **`archon setup` no longer writes to `<repo>/.env`.** Prior versions unconditionally wrote the generated config to both `~/.archon/.env` and `<repo>/.env`, destroying user-added secrets and silently downgrading PostgreSQL configs to SQLite when re-run in "Add" mode. The write side now targets exactly one archon-owned file (home or project scope via `--scope`), merges into existing content by default, and writes a timestamped backup. `<repo>/.env` is never touched — it belongs to the user's target project. (#1303)
- **CLI and server no longer silently lose repo-local env vars.** Previously, env vars in `<repo>/.env` were parsed, deleted from `process.env` by `stripCwdEnv()`, and the only output operators saw was `[dotenv@17.3.1] injecting env (0) from .env` — which read as "file was empty." Workflows that needed `SLACK_WEBHOOK` or similar had no way to recover without knowing to use `~/.archon/.env`. The new `<cwd>/.archon/.env` path + archon-owned log lines make the load state observable and recoverable. (#1302)
- **Server startup no longer marks actively-running workflows as failed.** The `failOrphanedRuns()` call has been removed from `packages/server/src/index.ts` to match the CLI precedent (`packages/cli/src/cli.ts:256-258`). Per the new CLAUDE.md principle "No Autonomous Lifecycle Mutation Across Process Boundaries", a stuck `running` row is now transitioned explicitly by the user: via the per-row Cancel/Abandon buttons on the dashboard workflow card, or `archon workflow abandon <run-id>` from the CLI. (`archon workflow cleanup` is a separate command that deletes OLD terminal runs for disk hygiene — it does not handle stuck `running` rows.) Closes #1216.
### Changed
- **Home-scoped workflow location moved to `~/.archon/workflows/`** (was `~/.archon/.archon/workflows/` — a double-nested path left over from reusing the repo-relative discovery helper for home scope). The new path sits next to `~/.archon/workspaces/`, `archon.db`, and `config.yaml`, matching the rest of the `~/.archon/` convention. If Archon detects workflows at the old location, it emits a one-time WARN per process with the exact migration command: `mv ~/.archon/.archon/workflows ~/.archon/workflows && rmdir ~/.archon/.archon`. The old path is no longer read — users must migrate manually (clean cut, no deprecation window). Rollback caveat: if you downgrade after migrating, move the directory back to the old location.
- **Workflow discovery no longer takes a `globalSearchPath` option.** `discoverWorkflows()` and `discoverWorkflowsWithConfig()` now consult `~/.archon/workflows/` automatically — every caller gets home-scoped discovery for free. Previously-missed call sites in the chat command handler (`command-handler.ts`), the Web UI workflow picker (`api.ts GET /api/workflows`), and the orchestrator's single-codebase resolve path now see home-scoped workflows without needing a maintainer patch at every new call site. Closes #1136; supersedes that PR (credits @jonasvanderhaegen for surfacing the bug class).
- **Dashboard nav tab** now shows a numeric count of running workflows instead of a binary pulse dot. Reads from the existing `/api/dashboard/runs` `counts.running` field; same 10s polling interval.
- **Workflow run destructive actions** (Abandon, Cancel, Delete, Reject) now use a proper confirmation dialog matching the codebase-delete UX, replacing the browser's native `window.confirm()` popups. Each dialog includes context-appropriate copy describing what the action does to the run record.
- **Claude Code binary resolution** (breaking for compiled binary users): Archon no longer embeds the Claude Code SDK into compiled binaries. In compiled builds, you must install Claude Code separately (`curl -fsSL https://claude.ai/install.sh | bash` on macOS/Linux, `irm https://claude.ai/install.ps1 | iex` on Windows, or `npm install -g @anthropic-ai/claude-code`) and point Archon at the executable via `CLAUDE_BIN_PATH` env var or `assistants.claude.claudeBinaryPath` in `.archon/config.yaml`. The Claude Agent SDK accepts either the native compiled binary (from the curl/PowerShell installer at `~/.local/bin/claude`) or a JS `cli.js` (from the npm install). Dev mode (`bun run`) is unaffected — the SDK resolves via `node_modules` as before. The Docker image ships Claude Code pre-installed with `CLAUDE_BIN_PATH` pre-set, so `docker run` still works out of the box. Resolves silent "Module not found /Users/runner/..." failures on macOS (#1210) and Windows (#1087).
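For compiled-binary users, the config-file side of this override might look like the following sketch (the path is illustrative; the `CLAUDE_BIN_PATH` env var takes precedence when set):

```yaml
# .archon/config.yaml
assistants:
  claude:
    claudeBinaryPath: /home/you/.local/bin/claude   # native binary or npm cli.js path
```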
### Added
- **`CLAUDE_BIN_PATH` environment variable** — highest-precedence override for the Claude Code SDK `cli.js` path (#1176)
- **`assistants.claude.claudeBinaryPath` config option** — durable config-file alternative to the env var (#1176)
- **Release-workflow Claude subprocess smoke test** — the release CI now installs Claude Code on the Linux runner and exercises the resolver + subprocess spawn, catching binary-resolution regressions before they ship
### Removed
- **`globalSearchPath` option** from `discoverWorkflows()` and `discoverWorkflowsWithConfig()`. Callers that previously passed `{ globalSearchPath: getArchonHome() }` should drop the argument; home-scoped discovery is now automatic.
- **`@anthropic-ai/claude-agent-sdk/embed` import** — the Bun `with { type: 'file' }` asset-embedding path and its `$bunfs` extraction logic. The embed was a bundler-dependent optimization that failed silently when Bun couldn't produce a usable virtual FS path (#1210, #1087); it is replaced by explicit binary-path resolution.
### Fixed
- **Cross-clone worktree isolation**: prevent workflows in one local clone from silently adopting worktrees or DB state owned by another local clone of the same remote. Two clones sharing a remote previously resolved to the same `codebase_id`, causing the isolation resolver's DB-driven paths (`findReusable`, `findLinkedIssueEnv`, `tryBranchAdoption`) to return the other clone's environment. All adoption paths now verify the worktree's `.git` pointer matches the requesting clone and throw a classified error on mismatch. `archon-implement` prompt was also tightened to stop AI agents from adopting unrelated branches they see via `git branch`. Thanks to @halindrome for the three-issue root-cause mapping. (#1193, #1188, #1183, #1198, #1206)
## [0.3.6] - 2026-04-12
Web UI workflow experience improvements, CWD environment leak protection, and bug fixes.
### Added
- Workflow result card now shows status, duration, node count, and artifact links in chat (#1015)
- Loop iteration progress display in the workflow execution view (#1014)
- Artifact file paths in chat messages are now clickable (#1023)
### Changed
- CWD `.env` variables are now stripped from AI subprocess environments at the `@archon/paths` layer, replacing the old `SUBPROCESS_ENV_ALLOWLIST` approach. Prevents accidental credential leaks from target repo `.env` files (#1067, #1030, #1098, #1070)
- Update check cache TTL reduced from 24 hours to 1 hour
### Fixed
- Duplicate text and tool calls appearing in workflow execution view
- `workflow_step` SSE events not handled correctly, causing missing progress updates
- Nested interactive elements in workflow UI causing React warnings
- Workflow status messages not splitting correctly in WorkflowLogs
- Incorrect `remainingMessage` suppression in stream mode causing lost output
- Binary builds now use `BUNDLED_VERSION` for the app version instead of reading `package.json`
## [0.3.5] - 2026-04-10
Fixes for `archon serve` process lifecycle and static file serving.
### Fixed
- **`archon serve` process exits immediately**: the CLI called `process.exit(0)` after `startServer()` returned, killing the server. Now blocks on SIGINT/SIGTERM so the server stays running (#1047)
- **Web dist path existence check**: server logs a warning at startup if the web dist directory is missing, instead of silently serving 404s
- **Favicon route**: added explicit `/favicon.png` route for the web UI
## [0.3.4] - 2026-04-10
Binary env loading fix and release infrastructure improvements.
### Added
- **Docs site redesign**: logo, dark theme, feature cards, and enhanced CSS (#1022)
### Changed
- **Server env loading for binary support**: removed redundant CWD `.env` stripping — `SUBPROCESS_ENV_ALLOWLIST` and the env-leak gate already prevent target repo credentials from reaching AI subprocesses. Server now loads `~/.archon/.env` with `override: true` for all keys (not just `DATABASE_URL`), skips the `import.meta.dir` `.env` path in binary mode, and defaults `CLAUDE_USE_GLOBAL_AUTH=true` when no explicit credentials are set (#1045)
- **Workspace version sync**: all `packages/*/package.json` versions now sync from the root `package.json` during releases via `scripts/sync-versions.sh`
### Fixed
- **`archon serve` crash in compiled binaries**: the CWD env stripping + baked `import.meta.dir` path caused all credentials to be lost, triggering `no_ai_credentials` exit on every startup
- **CLI `version` command reading stale version**: dev mode now reads from the monorepo root `package.json` instead of the CLI package's own version field
- **Release CI web build**: fixed `bun --filter` syntax and added missing `remark-gfm` transitive dependencies for Bun hoisting
## [0.3.3] - 2026-04-10
Binary distribution improvements, new workflow node type, and a batch of bug fixes.
### Added
- **`archon serve` command**: one-command way for compiled binary users to start the web UI server. Downloads a pre-built web UI tarball from GitHub releases on first run, verifies SHA-256 checksum, caches locally, then starts the full server (#1011)
- **Automatic update check**: binary users see a notification when a newer version is available on GitHub. Non-blocking, cached for 24 hours (#1039)
- **Script node type for DAG workflows**: `script:` nodes run inline TypeScript/Python or named scripts from `.archon/scripts/` via `bun` or `uv` runtimes. Supports `deps:` for dependency installation and `timeout:` in milliseconds (#999)
- **Codex native binary auto-resolution**: compiled builds now locate the Codex CLI binary automatically instead of requiring a manual `CODEX_CLI_PATH` override (#995, #1012)
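A hedged sketch of the `script:` node type using the fields named above — the node id and inline code are hypothetical, and the exact schema may differ:

```yaml
nodes:
  - id: hello-script
    runtime: bun            # or: uv (for Python scripts)
    timeout: 60000          # milliseconds
    script: |
      console.log('hello from a script node')
```

Per the entry, stdout would be captured as `$hello-script.output` for downstream nodes.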
### Fixed
- **Workflow reject ignores positional reason**: `archon workflow reject <id> <reason>` now correctly passes the reason argument to the rejection handler
- **Windows script path separators**: normalize backslashes to forward slashes in script node paths for cross-platform compatibility
- **PowerShell `Add-ToUserPath` corruption**: installer no longer corrupts `PATH` when only a single entry exists (#1000)
- **Validator `Promise.any` race condition**: script runtime checks no longer fail intermittently due to a `Promise.any` edge case (#1007, #1010)
- **Interactive-prd workflow bugs**: fixes to loop gate handling, variable substitution, and node ordering (#1001, #1002, #1003, #1005)
- **Community forge adapter exports**: added explicit export entries for Gitea and GitLab adapters so they resolve correctly in compiled builds (#1041)
- **Workflow graph view without codebase**: the web UI workflow graph now loads correctly even when no codebase is selected (#958)
## [0.3.2] - 2026-04-08
Critical hotfix: compiled binaries could not spawn Claude. Also fixes an env-leak gate false-positive for unregistered working directories.
### Fixed
- **Claude SDK spawn in compiled binaries**: the Claude Agent SDK was resolving its `cli.js` via `import.meta.url` of the bundled module, which `bun build --compile` freezes at build time to the build host's absolute `node_modules` path. Every binary shipped from CI carried a `/Users/runner/work/Archon/...` path that existed only on the GitHub Actions runner, and every `workflow run` hit `Module not found` after three retries. Now imports `@anthropic-ai/claude-agent-sdk/embed` so `cli.js` is embedded into the binary's `$bunfs` and extracted to a real temp path at runtime (#990).
- **Env-leak gate false-positive for unregistered cwd**: pre-spawn scan now skips cwd paths that aren't registered as codebases instead of blocking the workflow (#991, #992).
## [0.3.1] - 2026-04-08
Patch release: SQLite migration fix for existing databases and release build pipeline fix.
- Idle timeout not detecting stuck tool calls during execution (#649)
- `commitAllChanges` failing on empty commits (#745)
- Explicit base branch config now required for worktree creation (#686)
- Subprocess-level retry added to CodexProvider (#641)
- Validate `cwd` query param against registered codebases (#630)
- Server-internal paths redacted from `/api/config` response (#632)
- SQLite conversations index missing `WHERE deleted_at IS NULL` (#629)
- **`--json` flag for `workflow list`** — machine-readable workflow output (#594)
- **`archon-validate-pr` workflow** with per-node idle timeout support (#635)
- **Typed SessionMetadata** with Zod validation for safer metadata handling (#600)
- **`persistSession: false`** in ClaudeProvider to avoid disk pollution from session transcripts (#626)
- **DAG workflow for GitHub issue resolution** with structured node pipeline
### Changed

CLAUDE.md

These are implementation constraints, not slogans. Apply them by default.
**SRP + ISP — Single Responsibility + Interface Segregation**
- Keep each module and package focused on one concern
- Extend behavior by implementing existing narrow interfaces (`IPlatformAdapter`, `IAgentProvider`, `IDatabase`, `IWorkflowStore`) whenever possible
- Avoid fat interfaces and "god modules" that mix policy, transport, and storage
- Do not add unrelated methods to an existing interface — define a new one
@ -77,6 +77,12 @@ These are implementation constraints, not slogans. Apply them by default.
- Never silently broaden permissions or capabilities
- Document fallback behavior with a comment when a fallback is intentional and safe; otherwise throw
**No Autonomous Lifecycle Mutation Across Process Boundaries**
- When a process cannot reliably distinguish "actively running elsewhere" from "orphaned by a crash" — typically because the work was started by a different process or input source (CLI, adapter, webhook, web UI, cron) — it must not autonomously mark that work as failed/cancelled/abandoned based on a timer or staleness guess.
- Surface the ambiguous state to the user and provide a one-click action.
- Heuristics for *recoverable* operations (retry backoff, subprocess timeouts, hygiene cleanup of terminal-status data) remain appropriate; the rule is about destructive mutation of *non-terminal* state owned by an unknowable other party.
- Reference: #1216 and the CLI orphan-cleanup precedent at `packages/cli/src/cli.ts:256-258`.
**Determinism + Reproducibility**
- Prefer reproducible commands and locked dependency behavior in CI-sensitive paths
- Keep tests deterministic — no flaky timing or network dependence without guardrails
bun test --watch # Watch mode (single package)
bun test packages/core/src/handlers/command-handler.test.ts # Single file
```
**Test isolation (mock.module pollution):** Bun's `mock.module()` permanently replaces modules in the process-wide cache — `mock.restore()` does NOT undo it ([oven-sh/bun#7823](https://github.com/oven-sh/bun/issues/7823)). To prevent cross-file pollution, packages that have conflicting `mock.module()` calls split their tests into separate `bun test` invocations: `@archon/core` (7 batches), `@archon/workflows` (5), `@archon/adapters` (3), `@archon/isolation` (3). See each package's `package.json` for the exact splits.
**Do NOT run `bun test` from the repo root** — it discovers all test files across all packages and runs them in one process, causing ~135 mock pollution failures. Always use `bun run test` (which uses `bun --filter '*' test` for per-package isolation).
bun run format:check
bun run validate
```
This runs `check:bundled`, type-check, lint, format check, and tests. All five must pass for CI to succeed.
### ESLint Guidelines
bun run cli workflow run implement --branch feature-auth "Add auth"
# Opt out of isolation (run in live checkout)
bun run cli workflow run quick-fix --no-worktree "Fix typo"
# Show running workflows
bun run cli workflow status
bun run cli validate commands my-command # Single command
bun run cli complete <branch-name>
bun run cli complete <branch-name> --force # Skip uncommitted-changes check
# Start the web UI server (compiled binary only, downloads web UI on first run)
bun run cli serve
bun run cli serve --port 4000
bun run cli serve --download-only # Download without starting
# Show version
bun run cli version
```
packages/
│ ├── adapters/ # CLI adapter (stdout output)
│ ├── commands/ # CLI command implementations
│ └── cli.ts # CLI entry point
├── providers/ # @archon/providers - AI agent providers (SDK deps live here)
│ └── src/
│ ├── types.ts # Contract layer (IAgentProvider, SendQueryOptions, MessageChunk — ZERO SDK deps)
│ ├── registry.ts # Typed provider registry (ProviderRegistration records)
│ ├── errors.ts # UnknownProviderError
│ ├── claude/ # ClaudeProvider + parseClaudeConfig + MCP/hooks/skills translation
│ ├── codex/ # CodexProvider + parseCodexConfig + binary-resolver
│ ├── community/pi/ # PiProvider (builtIn: false) — @mariozechner/pi-coding-agent, ~20 LLM backends
│ └── index.ts # Package exports
├── core/ # @archon/core - Shared business logic
│ └── src/
│ ├── config/ # YAML config loading
│ ├── db/ # Database connection, queries
│ ├── handlers/ # Command handler (slash commands)
│ ├── executor.ts # Workflow execution orchestrator (executeWorkflow)
│ ├── dag-executor.ts # DAG-specific execution logic
│ ├── store.ts # IWorkflowStore interface (database abstraction)
│ ├── deps.ts # WorkflowDeps injection types (IWorkflowPlatform, imports from @archon/providers/types)
│ ├── event-emitter.ts # Workflow observability events
│ ├── logger.ts # JSONL file logger
│ ├── validator.ts # Resource validation (command files, MCP configs, skill dirs)
5. **`workflow_runs`** - Workflow execution tracking and state
6. **`workflow_events`** - Step-level workflow event log (step transitions, artifacts, errors)
7. **`messages`** - Conversation message history with tool call metadata (JSONB)
8. **`codebase_env_vars`** - Per-project env vars injected into project-scoped execution surfaces (Claude, Codex, bash/script nodes, and direct chat when codebase-scoped), managed via Web UI or `env:` in config
**Key Patterns:**
- Conversation ID format: Platform-specific (`thread_ts`, `chat_id`, `user/repo#123`)
### Architecture Layers
**Package Split:**
- **@archon/paths**: Path resolution utilities, Pino logger factory, web dist cache path (`getWebDistDir`), CWD env stripper (`stripCwdEnv`, `strip-cwd-env-boot`) (no @archon/* deps; `pino` and `dotenv` are allowed external deps)
- **@archon/git**: Git operations - worktrees, branches, repos, exec wrappers (depends only on @archon/paths)
- **@archon/providers**: AI agent providers (Claude, Codex, Pi community) — owns SDK deps, `IAgentProvider` interface, `sendQuery()` contract, and provider-specific option translation. `@archon/providers/types` is the contract subpath (zero SDK deps, zero runtime side effects) that `@archon/workflows` imports from. Providers receive raw `nodeConfig` + `assistantConfig` and translate to SDK-specific options internally. Core providers live under `claude/` and `codex/`; community providers live under `community/` (currently `community/pi/`, registered with `builtIn: false`).
- **@archon/isolation**: Worktree isolation types, providers, resolver, error classifiers (depends only on @archon/git + @archon/paths)
- **@archon/workflows**: Workflow engine - loader, router, executor, DAG, logger, bundled defaults (depends only on @archon/git + @archon/paths + @archon/providers/types + @hono/zod-openapi + zod; DB/AI/config injected via `WorkflowDeps`)
- **@archon/cli**: Command-line interface for running workflows and starting the web UI server (depends on @archon/server + @archon/adapters for the serve command)
- **@archon/core**: Business logic, database, orchestration (depends on @archon/providers for AI; provides `createWorkflowStore()` adapter bridging core DB → `IWorkflowStore`)
- **@archon/adapters**: Platform adapters for Slack, Telegram, GitHub, Discord (depends on @archon/core)
- **@archon/server**: OpenAPIHono HTTP server (Zod + OpenAPI spec generation via `@hono/zod-openapi`), Web adapter (SSE), API routes, Web UI static serving (depends on @archon/adapters)
- **@archon/web**: React frontend (Vite + Tailwind v4 + shadcn/ui + Zustand), SSE streaming to server. `WorkflowRunStatus`, `WorkflowDefinition`, and `DagNode` are all derived from `src/lib/api.generated.d.ts` (generated from the OpenAPI spec via `bun generate:types`; never import from `@archon/workflows`)
**2. Command Handler** (`packages/core/src/handlers/`)
- Process slash commands (deterministic, no AI)
- Commands: `/command-set`, `/load-commands`, `/clone`, `/getcwd`, `/setcwd`, `/repos`, `/repo`, `/repo-remove`, `/worktree`, `/workflow`, `/status`, `/commands`, `/help`, `/reset`, `/reset-context`, `/init`
- The orchestrator treats only these top-level commands as deterministic: `/help`, `/status`, `/reset`, `/workflow`, `/register-project`, `/update-project`, `/remove-project`, `/commands`, `/init`, `/worktree`
- `/workflow` handles subcommands like `list`, `run`, `status`, `cancel`, `resume`, `abandon`, `approve`, `reject`
- Update database, perform operations, return responses
**3. Orchestrator** (`packages/core/src/orchestrator/`)
- Session management: Create new or resume existing
- Stream AI responses to platform
**4. AI Agent Providers** (`packages/providers/src/`)
- Implement `IAgentProvider` interface
- **ClaudeProvider**: `@anthropic-ai/claude-agent-sdk`
- **CodexProvider**: `@openai/codex-sdk`
- **PiProvider** (community, `builtIn: false`): `@mariozechner/pi-coding-agent` — one harness for ~20 LLM backends via `<provider>/<model>` refs (e.g. `anthropic/claude-haiku-4-5`, `openrouter/qwen/qwen3-coder`); supports extensions, skills, tool restrictions, thinking level, best-effort structured output. See `packages/docs-web/src/content/docs/getting-started/ai-assistants.md` for setup, capability matrix, and extension config.
- Streaming: `for await (const event of events) { await platform.send(event) }`
### Configuration
assistants:
  claude:
    settingSources: # Controls which CLAUDE.md files Claude SDK loads
      - project # Default: only project-level CLAUDE.md
      - user # Optional: also load ~/.claude/CLAUDE.md
    claudeBinaryPath: /absolute/path/to/claude # Optional: Claude Code executable.
                                               # Native binary (curl installer at
                                               # ~/.local/bin/claude) or npm cli.js.
                                               # Required in compiled binaries if
                                               # CLAUDE_BIN_PATH env var is not set.
  codex:
    model: gpt-5.3-codex
    modelReasoningEffort: medium # 'minimal' | 'low' | 'medium' | 'high' | 'xhigh'
    webSearchMode: live # 'disabled' | 'cached' | 'live'
    additionalDirectories:
      - /absolute/path/to/other/repo
    codexBinaryPath: /usr/local/bin/codex # Optional: custom Codex CLI binary path
# docs:
#   path: docs # Optional: default is docs/
```
~/.archon/
├── workspaces/owner/repo/ # Project-centric layout
│ ├── source/ # Cloned repo or symlink → local path
│ ├── worktrees/ # Git worktrees for this project
│ ├── artifacts/ # Workflow artifacts (NEVER in git)
│ │ ├── runs/{id}/ # Per-run artifacts ($ARTIFACTS_DIR)
│ │ └── uploads/{convId}/ # Web UI file uploads (ephemeral)
│ └── logs/ # Workflow execution logs
├── vendor/codex/ # Codex native binary (binary builds, user-placed)
├── web-dist/<version>/ # Cached web UI dist (archon serve, binary only)
├── update-check.json # Update check cache (binary builds, 24h TTL)
├── archon.db # SQLite database (when DATABASE_URL not set)
└── config.yaml # Global configuration (non-secrets)
```
.archon/
├── commands/ # Custom commands
├── workflows/ # Workflow definitions (YAML files)
├── scripts/ # Named scripts for script: nodes (.ts/.js for bun, .py for uv)
├── state/ # Cross-run workflow state (gitignored — never in git)
└── config.yaml # Repo-specific configuration
```
**Quick reference:**
- **Platform Adapters**: Implement `IPlatformAdapter`, handle auth, polling/webhooks
- **AI Providers**: Implement `IAgentProvider`, session management, streaming
- **Slash Commands**: Add to command-handler.ts, update database, no AI
- **Database Operations**: Use `IDatabase` interface (supports PostgreSQL and SQLite via adapters)
1. **Codebase Commands** (per-repo):
- Stored in `.archon/commands/` (plain text/markdown)
- Auto-detected via `/clone` or `/load-commands <folder>`
- Loaded by `/clone` or `/load-commands`, invoked by AI via orchestrator routing
- Discovered from the repository `.archon/commands/` directory
- Surfaced via `GET /api/commands` for the workflow builder and invoked by workflow `command:` nodes
2. **Workflows** (YAML-based):
- Stored in `.archon/workflows/` (searched recursively)
- Multi-step AI execution chains, discovered at runtime
- **`nodes:` (DAG format)**: Nodes with explicit `depends_on` edges; independent nodes in the same topological layer run concurrently. Node types: `command:` (named command file), `prompt:` (inline prompt), `bash:` (shell script, stdout captured as `$nodeId.output`, no AI, receives managed per-project env vars in its subprocess environment when configured), `loop:` (iterative AI prompt until completion signal), `approval:` (human gate; pauses until user approves or rejects; `capture_response: true` stores the user's comment as `$<node-id>.output` for downstream nodes, default false), `script:` (inline TypeScript/Python or named script from `.archon/scripts/`, runs via `bun` or `uv`, stdout captured as `$nodeId.output`, no AI, receives managed per-project env vars in its subprocess environment when configured, supports `deps:` for dependency installation and `timeout:` in ms, requires `runtime: bun` or `runtime: uv`). Supports `when:` conditions, `trigger_rule` join semantics, `$nodeId.output` substitution, `output_format` for structured JSON output (Claude and Codex via SDK enforcement; Pi best-effort via prompt augmentation + JSON extraction), `allowed_tools`/`denied_tools` for per-node tool restrictions (Claude only), `hooks` for per-node SDK hook callbacks (Claude only), `mcp` for per-node MCP server config files (Claude only, env vars expanded at execution time), `skills` for per-node skill preloading via AgentDefinition wrapping (Claude only), `agents` for inline sub-agent definitions invokable via the Task tool (Claude only), and `effort`/`thinking`/`maxBudgetUsd`/`systemPrompt`/`fallbackModel`/`betas`/`sandbox` for Claude SDK advanced options (Claude only, also settable at workflow level)
- Provider inherited from `.archon/config.yaml` unless explicitly set; per-node `provider` and `model` overrides supported
- Model and options can be set per workflow or inherited from config defaults
- `interactive: true` at the workflow level forces foreground execution on web (required for approval-gate workflows in the web UI)
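A minimal sketch of what a `nodes:` DAG workflow might look like (the field layout is inferred from the node-type list above and may differ from the real schema; node and command names are hypothetical):

```yaml
# Hypothetical sketch only: fields inferred from the node-type list above
name: lint-and-fix
nodes:
  lint:
    bash: bun run lint          # stdout captured as $lint.output, no AI
  summarize:
    prompt: "Summarize these lint findings: $lint.output"
    depends_on: [lint]
  gate:
    approval: true              # human gate; pauses until approved or rejected
    capture_response: true      # user's comment stored as $gate.output
    depends_on: [summarize]
  fix:
    command: fix-lint           # named command file from .archon/commands/
    depends_on: [gate]
```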
@ -684,14 +713,21 @@ async function createSession(conversationId: string, codebaseId: string) {
**Defaults:**
- Bundled in `.archon/commands/defaults/` and `.archon/workflows/defaults/`
- Binary builds: Embedded at compile time (no filesystem access needed)
- Binary builds: Embedded at compile time (no filesystem access needed) via `packages/workflows/src/defaults/bundled-defaults.generated.ts`
- Source builds: Loaded from filesystem at runtime
- Merged with repo-specific commands/workflows (repo overrides defaults by name)
- Opt-out: Set `defaults.loadDefaultCommands: false` or `defaults.loadDefaultWorkflows: false` in `.archon/config.yaml`
- **After adding, removing, or editing a default file, run `bun run generate:bundled`** to refresh the embedded bundle. `bun run validate` (and CI) run `check:bundled` and will fail loudly if the generated file is stale.
**Global workflows** (user-level, applies to every project):
- Path: `~/.archon/.archon/workflows/` (or `$ARCHON_HOME/.archon/workflows/`)
- Load priority: bundled < global < repo-specific (repo overrides global by filename)
**Home-scoped ("global") workflows, commands, and scripts** (user-level, applies to every project):
- Workflows: `~/.archon/workflows/` (or `$ARCHON_HOME/workflows/`)
- Commands: `~/.archon/commands/` (or `$ARCHON_HOME/commands/`)
- Scripts: `~/.archon/scripts/` (or `$ARCHON_HOME/scripts/`)
- Source label: `source: 'global'` on workflows and commands (scripts don't have a source label)
- Load priority: bundled < global < project (repo overrides global by filename or script name)
- Subfolders: supported 1 level deep (e.g. `~/.archon/workflows/triage/foo.yaml`). Deeper nesting is ignored silently.
- Discovery is automatic — `discoverWorkflowsWithConfig(cwd, loadConfig)` and `discoverScriptsForCwd(cwd)` both read home-scoped paths unconditionally; no caller option needed
- **Migration from pre-0.x `~/.archon/.archon/workflows/`**: if Archon detects files at the old location it emits a one-time WARN with the exact `mv` command and does NOT load from there. Move with: `mv ~/.archon/.archon/workflows ~/.archon/workflows && rmdir ~/.archon/.archon`
- See the docs site at `packages/docs-web/` for details
### Error Handling
@ -749,9 +785,11 @@ Pattern: Use `classifyIsolationError()` (from `@archon/isolation`) to map git er
**Codebases:**
- `GET /api/codebases` / `GET /api/codebases/:id` - List / fetch codebases
- `POST /api/codebases` - Register a codebase (clone or local path); body accepts `allowEnvKeys` for the env-leak gate
- `PATCH /api/codebases/:id` - Flip the `allow_env_keys` consent bit; body: `{ allowEnvKeys: boolean }`. Audit-logged at `warn` level on every grant/revoke (`env_leak_consent_granted` / `env_leak_consent_revoked`) with `codebaseId`, `path`, `files`, `keys`, `scanStatus`, `actor`
- `POST /api/codebases` - Register a codebase (clone or local path)
- `DELETE /api/codebases/:id` - Delete a codebase and clean up resources
- `GET /api/codebases/:id/env` - List env var keys for a codebase (never returns values)
- `PUT /api/codebases/:id/env` / `DELETE /api/codebases/:id/env/:key` - Upsert / delete a single codebase env var
- `GET /api/codebases/:id/environments` - List tracked isolation environments for a codebase
**Artifact Files:**
- `GET /api/artifacts/:runId/*` - Serve a workflow artifact file by run ID and relative path; returns `text/markdown` for `.md` files, `text/plain` otherwise; 400 on path traversal (`..`), 404 if run or file not found
@ -759,6 +797,13 @@ Pattern: Use `classifyIsolationError()` (from `@archon/isolation`) to map git er
**Command Listing:**
- `GET /api/commands` - List available command names (bundled + project-defined); optional `?cwd=`; returns `{ commands: [{ name, source: 'bundled' | 'project' }] }`
**Providers:**
- `GET /api/providers` - List registered AI providers; returns `{ providers: [{ id, displayName, capabilities, builtIn }] }`
**System:**
- `GET /api/health` - Health check with adapter/system status
- `GET /api/update-check` - Check for available updates; returns `{ updateAvailable, currentVersion, latestVersion, releaseUrl }`; skips GitHub API call for non-binary builds
**OpenAPI Spec:**
- `GET /api/openapi.json` - Generated OpenAPI 3.0 spec for all Zod-validated routes


@ -17,15 +17,20 @@ Thank you for your interest in contributing to Archon!
Before submitting a PR, ensure:
```bash
bun run type-check # TypeScript types
bun run lint # ESLint
bun run format # Prettier
bun run test # All tests (per-package isolation)
bun run check:bundled # Bundled defaults are up to date (see note below)
bun run type-check # TypeScript types
bun run lint # ESLint
bun run format # Prettier
bun run test # All tests (per-package isolation)
# Or run the full validation suite:
bun run validate
```
**Bundled defaults**: If you added, removed, or edited a file under
`.archon/commands/defaults/` or `.archon/workflows/defaults/`, run
`bun run generate:bundled` to refresh the embedded bundle before committing.
**Important:** Use `bun run test` (not `bun test` from the repo root) to avoid mock pollution across packages.
### Commit Messages


@ -24,6 +24,7 @@ COPY packages/docs-web/package.json ./packages/docs-web/
COPY packages/git/package.json ./packages/git/
COPY packages/isolation/package.json ./packages/isolation/
COPY packages/paths/package.json ./packages/paths/
COPY packages/providers/package.json ./packages/providers/
COPY packages/server/package.json ./packages/server/
COPY packages/web/package.json ./packages/web/
COPY packages/workflows/package.json ./packages/workflows/
@ -107,6 +108,14 @@ RUN apt-get update && apt-get install -y --no-install-recommends nodejs npm \
# Point agent-browser to system Chromium (avoids ~400MB Chrome for Testing download)
ENV AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
# Pre-configure the Claude Code SDK cli.js path for any consumer that runs
# a compiled Archon binary inside (or extending) this image. In source mode
# (the default `bun run start` ENTRYPOINT), BUNDLED_IS_BINARY is false and
# this variable is ignored — the SDK resolves cli.js via node_modules. Kept
# here so extenders don't need to rediscover the path.
# Path matches the hoisted layout produced by `bun install --linker=hoisted`.
ENV CLAUDE_BIN_PATH=/app/node_modules/@anthropic-ai/claude-agent-sdk/cli.js
# Create non-root user for running Claude Code
# Claude Code refuses to run with --dangerously-skip-permissions as root for security
RUN useradd -m -u 1001 -s /bin/bash appuser \
@ -130,6 +139,7 @@ COPY packages/docs-web/package.json ./packages/docs-web/
COPY packages/git/package.json ./packages/git/
COPY packages/isolation/package.json ./packages/isolation/
COPY packages/paths/package.json ./packages/paths/
COPY packages/providers/package.json ./packages/providers/
COPY packages/server/package.json ./packages/server/
COPY packages/web/package.json ./packages/web/
COPY packages/workflows/package.json ./packages/workflows/
@ -144,6 +154,7 @@ COPY packages/core/ ./packages/core/
COPY packages/git/ ./packages/git/
COPY packages/isolation/ ./packages/isolation/
COPY packages/paths/ ./packages/paths/
COPY packages/providers/ ./packages/providers/
COPY packages/server/ ./packages/server/
COPY packages/workflows/ ./packages/workflows/


@ -171,6 +171,22 @@ irm https://archon.diy/install.ps1 | iex
brew install coleam00/archon/archon
```
> **Compiled binaries need a `CLAUDE_BIN_PATH`.** The quick-install binaries
> don't bundle Claude Code. Install it separately, then point Archon at it:
>
> ```bash
> # macOS / Linux / WSL
> curl -fsSL https://claude.ai/install.sh | bash
> export CLAUDE_BIN_PATH="$HOME/.local/bin/claude"
>
> # Windows (PowerShell)
> irm https://claude.ai/install.ps1 | iex
> $env:CLAUDE_BIN_PATH = "$env:USERPROFILE\.local\bin\claude.exe"
> ```
>
> Or set `assistants.claude.claudeBinaryPath` in `~/.archon/config.yaml`.
> The Docker image ships Claude Code pre-installed. See [AI Assistants → Binary path configuration](https://archon.diy/docs/getting-started/ai-assistants/#binary-path-configuration-compiled-binaries-only) for details.
### Start Using Archon
Once you've completed either setup path, go to your project and start working:
@ -194,7 +210,7 @@ The coding agent handles workflow selection, branch naming, and worktree isolati
## Web UI
Archon includes a web dashboard for chatting with your coding agent, running workflows, and monitoring activity. To start it, ask your coding agent to run the frontend from the Archon repo, or run `bun run dev` from the repo root yourself.
Archon includes a web dashboard for chatting with your coding agent, running workflows, and monitoring activity. Binary installs: run `archon serve` to download and start the web UI in one step. From source: ask your coding agent to run the frontend from the Archon repo, or run `bun run dev` from the repo root yourself.
Register a project by clicking **+** next to "Project" in the chat sidebar - enter a GitHub URL or local path. Then start a conversation, invoke workflows, and watch progress in real time.
@ -254,7 +270,7 @@ The Web UI and CLI work out of the box. Optionally connect a chat platform for r
```
┌─────────────────────────────────────────────────────────┐
│ Platform Adapters (Web UI, CLI, Telegram, Slack, │
│ Discord, GitHub)
│ Discord, GitHub) │
└──────────────────────────┬──────────────────────────────┘
@ -268,7 +284,7 @@ The Web UI and CLI work out of the box. Optionally connect a chat platform for r
▼ ▼ ▼ ▼
┌───────────┐ ┌────────────┐ ┌──────────────────────────┐
│ Command │ │ Workflow │ │ AI Assistant Clients │
│ Handler │ │ Executor │ │ (Claude / Codex)
│ Handler │ │ Executor │ │ (Claude / Codex / Pi)
│ (Slash) │ │ (YAML) │ │ │
└───────────┘ └────────────┘ └──────────────────────────┘
│ │ │
@ -294,17 +310,38 @@ Full documentation is available at **[archon.diy](https://archon.diy)**.
| [Authoring Workflows](https://archon.diy/guides/authoring-workflows/) | Create custom YAML workflows |
| [Authoring Commands](https://archon.diy/guides/authoring-commands/) | Create reusable AI commands |
| [Configuration](https://archon.diy/reference/configuration/) | All config options, env vars, YAML settings |
| [AI Assistants](https://archon.diy/getting-started/ai-assistants/) | Claude and Codex setup details |
| [AI Assistants](https://archon.diy/getting-started/ai-assistants/) | Claude, Codex, and Pi setup details |
| [Deployment](https://archon.diy/deployment/) | Docker, VPS, production setup |
| [Architecture](https://archon.diy/reference/architecture/) | System design and internals |
| [Troubleshooting](https://archon.diy/reference/troubleshooting/) | Common issues and fixes |
## Telemetry
Archon sends a single anonymous event — `workflow_invoked` — each time a workflow starts, so maintainers can see which workflows get real usage and prioritize accordingly. **No PII, ever.**
**What's collected:** the workflow name, the workflow description (both authored by you in YAML), the platform that triggered it (`cli`, `web`, `slack`, etc.), the Archon version, and a random install UUID stored at `~/.archon/telemetry-id`. Nothing else.
**What's *not* collected:** your code, prompts, messages, git remotes, file paths, usernames, tokens, AI output, workflow node details — none of it.
**Opt out:** set any of these in your environment:
```bash
ARCHON_TELEMETRY_DISABLED=1
DO_NOT_TRACK=1 # de facto standard honored by Astro, Bun, Prisma, Nuxt, etc.
```
Self-host PostHog or use a different project by setting `POSTHOG_API_KEY` and `POSTHOG_HOST`.
## Contributing
Contributions welcome! See the open [issues](https://github.com/coleam00/Archon/issues) for things to work on.
Please read [CONTRIBUTING.md](CONTRIBUTING.md) before submitting a pull request.
## Star History
[![Star History Chart](https://api.star-history.com/chart?repos=coleam00/Archon&type=date&legend=top-left)](https://www.star-history.com/?repos=coleam00%2FArchon&type=date&legend=top-left)
## License
[MIT](LICENSE)

bun.lock

File diff suppressed because it is too large


@ -46,7 +46,7 @@ TELEGRAM_BOT_TOKEN=123456789:ABC...
# ============================================
# Optional
# ============================================
PORT=3000
PORT=3000 # Docker deployment default (the included compose/Caddy configs target :3000). For local dev (no Docker), omit PORT — server and Vite proxy both default to 3090.
# TELEGRAM_STREAMING_MODE=stream
# DISCORD_STREAMING_MODE=batch


@ -1,6 +1,11 @@
#!/bin/bash
set -e
# Ensure required subdirectories exist.
# Named volumes inherit these from the image layer on first run; bind mounts do not,
# which causes the Claude subprocess to fail silently when spawned with a missing cwd.
mkdir -p /.archon/workspaces /.archon/worktrees
# Determine if we need to use gosu for privilege dropping
if [ "$(id -u)" = "0" ]; then
# Running as root: fix volume permissions, then drop to appuser


@ -17,9 +17,11 @@ export default tseslint.config(
'worktrees/**',
'.claude/worktrees/**',
'.claude/skills/**',
'**/*.generated.ts', // Auto-generated source files (content inlined via JSON.stringify)
'**/*.js',
'*.mjs',
'**/*.test.ts',
'**/src/test/**', // Test helper files (mock factories, fixtures)
'*.d.ts', // Root-level declaration files (not in tsconfig project scope)
'**/*.generated.d.ts', // Auto-generated declaration files (e.g. openapi-typescript output)
'packages/web/vite.config.ts', // Vite config doesn't need type-checked linting
@ -40,7 +42,7 @@ export default tseslint.config(
// Project-specific settings
{
files: ['packages/*/src/**/*.{ts,tsx}'],
files: ['packages/*/src/**/*.{ts,tsx}', 'scripts/**/*.ts'],
languageOptions: {
parserOptions: {
projectService: true,


@ -7,28 +7,28 @@
class Archon < Formula
desc "Remote agentic coding platform - control AI assistants from anywhere"
homepage "https://github.com/coleam00/Archon"
version "0.3.0"
version "0.3.6"
license "MIT"
on_macos do
on_arm do
url "https://github.com/coleam00/Archon/releases/download/v#{version}/archon-darwin-arm64"
sha256 "2ff39add5306d839b28e05e58a98442a55d7b1a27d3045999ca62e9ccc7557b9"
sha256 "96b6dac50b046eece9eddbb988a0c39b4f9a0e2faac66e49b977ba6360069e86"
end
on_intel do
url "https://github.com/coleam00/Archon/releases/download/v#{version}/archon-darwin-x64"
sha256 "7d5719a00e95d05303e0fd2586f6d69c41102bde1e23b11aa7e662905c235100"
sha256 "09f1dbe12417b4300b7b07b531eb7391a286305f8d4eafc11e7f61f5d26eb8eb"
end
end
on_linux do
on_arm do
url "https://github.com/coleam00/Archon/releases/download/v#{version}/archon-linux-arm64"
sha256 "8bf7c0a335455b10f7362902d78b2b9a90778d4d2e979153ab5b114d4edb996c"
sha256 "80b06a6ff699ec57cd4a3e49cfe7b899a3e8212688d70285f5a887bf10086731"
end
on_intel do
url "https://github.com/coleam00/Archon/releases/download/v#{version}/archon-linux-x64"
sha256 "f1f730ebea4d77e6fa533a8fbdd194fb356ddc441160fae5f63f6225c27ff8fc"
sha256 "09f5dac6db8037ed6f3e5b7e9c5eb8e37f19822a4ed2bf4cd7e654780f9d00de"
end
end


@ -1,6 +1,6 @@
{
"name": "archon",
"version": "0.3.1",
"version": "0.3.6",
"private": true,
"workspaces": [
"packages/*"
@ -14,9 +14,11 @@
"build": "bun --filter '*' build",
"build:binaries": "bash scripts/build-binaries.sh",
"build:checksums": "bash scripts/checksums.sh",
"generate:bundled": "bun run scripts/generate-bundled-defaults.ts",
"check:bundled": "bun run scripts/generate-bundled-defaults.ts --check",
"test": "bun --filter '*' --parallel test",
"test:watch": "bun --filter @archon/server test:watch",
"type-check": "bun --filter '*' type-check",
"type-check": "bun --filter '*' type-check && bun x tsc --noEmit -p scripts/tsconfig.json",
"lint": "bun x eslint . --cache",
"lint:fix": "bun x eslint . --cache --fix",
"format": "bun x prettier --write .",
@ -25,7 +27,7 @@
"build:web": "bun --filter @archon/web build",
"dev:docs": "bun --filter @archon/docs-web dev",
"build:docs": "bun --filter @archon/docs-web build",
"validate": "bun run type-check && bun run lint --max-warnings 0 && bun run format:check && bun run test",
"validate": "bun run check:bundled && bun run type-check && bun run lint --max-warnings 0 && bun run format:check && bun run test",
"prepare": "husky",
"setup-auth": "bun --filter @archon/server setup-auth"
},


@ -1,12 +1,14 @@
{
"name": "@archon/adapters",
"version": "0.1.0",
"version": "0.3.6",
"type": "module",
"main": "./src/index.ts",
"types": "./src/index.ts",
"exports": {
".": "./src/index.ts",
"./*": "./src/*"
"./*": "./src/*",
"./community/forge/gitea": "./src/community/forge/gitea/index.ts",
"./community/forge/gitlab": "./src/community/forge/gitlab/index.ts"
},
"scripts": {
"test": "bun test src/chat/ src/community/chat/ src/community/forge/gitlab/auth.test.ts src/forge/github/auth.test.ts src/utils/ && bun test src/forge/github/adapter.test.ts && bun test src/forge/github/context.test.ts && bun test src/community/forge/gitea/adapter.test.ts && bun test src/community/forge/gitlab/adapter.test.ts",
@ -20,7 +22,7 @@
"@octokit/rest": "^22.0.0",
"@slack/bolt": "^4.6.0",
"discord.js": "^14.16.0",
"telegraf": "^4.16.0",
"grammy": "^1.36.0",
"telegramify-markdown": "^1.3.0"
},
"peerDependencies": {


@ -52,7 +52,7 @@ describe('TelegramAdapter', () => {
const adapter = new TelegramAdapter('fake-token-for-testing');
const bot = adapter.getBot();
expect(bot).toBeDefined();
expect(bot.telegram).toBeDefined();
expect(bot.api).toBeDefined();
});
});
@ -64,9 +64,8 @@ describe('TelegramAdapter', () => {
adapter = new TelegramAdapter('fake-token-for-testing');
mockSendMessage = mock(() => Promise.resolve());
// Override bot's sendMessage
(
adapter.getBot().telegram as unknown as { sendMessage: Mock<() => Promise<void>> }
).sendMessage = mockSendMessage;
(adapter.getBot().api as unknown as { sendMessage: Mock<() => Promise<void>> }).sendMessage =
mockSendMessage;
});
test('should send with MarkdownV2 parse_mode', async () => {
@ -172,7 +171,7 @@ describe('TelegramAdapter', () => {
const adapter = new TelegramAdapter('fake-token-for-testing');
const ctx = {
chat: { id: 12345 },
} as unknown as import('telegraf').Context;
} as unknown as import('grammy').Context;
expect(adapter.getConversationId(ctx)).toBe('12345');
});
@ -181,7 +180,7 @@ describe('TelegramAdapter', () => {
const adapter = new TelegramAdapter('fake-token-for-testing');
const ctx = {
chat: { id: -987654321 },
} as unknown as import('telegraf').Context;
} as unknown as import('grammy').Context;
expect(adapter.getConversationId(ctx)).toBe('-987654321');
});
@ -190,7 +189,7 @@ describe('TelegramAdapter', () => {
const adapter = new TelegramAdapter('fake-token-for-testing');
const ctx = {
chat: { id: -1001234567890 },
} as unknown as import('telegraf').Context;
} as unknown as import('grammy').Context;
expect(adapter.getConversationId(ctx)).toBe('-1001234567890');
});
@ -199,7 +198,7 @@ describe('TelegramAdapter', () => {
const adapter = new TelegramAdapter('fake-token-for-testing');
const ctx = {
chat: undefined,
} as unknown as import('telegraf').Context;
} as unknown as import('grammy').Context;
expect(() => adapter.getConversationId(ctx)).toThrow('No chat in context');
});
@ -208,7 +207,7 @@ describe('TelegramAdapter', () => {
const adapter = new TelegramAdapter('fake-token-for-testing');
const ctx = {
chat: null,
} as unknown as import('telegraf').Context;
} as unknown as import('grammy').Context;
expect(() => adapter.getConversationId(ctx)).toThrow('No chat in context');
});
@ -235,6 +234,16 @@ describe('TelegramAdapter', () => {
});
});
describe('stop()', () => {
test('should call bot.stop()', () => {
const adapter = new TelegramAdapter('fake-token-for-testing');
const mockStop = mock(() => undefined);
(adapter.getBot() as unknown as { stop: typeof mockStop }).stop = mockStop;
adapter.stop();
expect(mockStop).toHaveBeenCalledTimes(1);
});
});
describe('start()', () => {
beforeEach(() => {
mockLogger.warn.mockClear();
@ -243,14 +252,20 @@ describe('TelegramAdapter', () => {
test('should retry on 409 and succeed on second attempt', async () => {
const adapter = new TelegramAdapter('fake-token-for-testing');
const mockLaunch = mock<() => Promise<void>>()
// grammY's start() resolves when bot stops, not when started — onStart fires on startup
const mockStart = mock<
(opts?: { drop_pending_updates?: boolean; onStart?: () => void }) => Promise<void>
>()
.mockRejectedValueOnce(new Error('409: Conflict: terminated by other getUpdates request'))
.mockResolvedValueOnce(undefined);
(adapter.getBot() as unknown as { launch: typeof mockLaunch }).launch = mockLaunch;
.mockImplementationOnce(opts => {
opts?.onStart?.();
return new Promise(() => {});
});
(adapter.getBot() as unknown as { start: typeof mockStart }).start = mockStart;
await adapter.start({ retryDelayMs: 0 });
expect(mockLaunch).toHaveBeenCalledTimes(2);
expect(mockStart).toHaveBeenCalledTimes(2);
expect(mockLogger.warn).toHaveBeenCalledWith(
expect.objectContaining({ attempt: 1, maxAttempts: 3 }),
'telegram.start_conflict_retrying'
@ -260,41 +275,48 @@ describe('TelegramAdapter', () => {
test('should throw immediately on non-409 error', async () => {
const adapter = new TelegramAdapter('fake-token-for-testing');
const mockLaunch = mock<() => Promise<void>>().mockRejectedValueOnce(
new Error('401: Unauthorized')
);
(adapter.getBot() as unknown as { launch: typeof mockLaunch }).launch = mockLaunch;
const mockStart = mock<
(opts?: { drop_pending_updates?: boolean; onStart?: () => void }) => Promise<void>
>().mockRejectedValueOnce(new Error('401: Unauthorized'));
(adapter.getBot() as unknown as { start: typeof mockStart }).start = mockStart;
await expect(adapter.start({ retryDelayMs: 0 })).rejects.toThrow('401: Unauthorized');
expect(mockLaunch).toHaveBeenCalledTimes(1);
expect(mockStart).toHaveBeenCalledTimes(1);
});
test('should retry twice on 409 and succeed on third attempt', async () => {
const adapter = new TelegramAdapter('fake-token-for-testing');
const conflictError = new Error('409: Conflict: terminated by other getUpdates request');
const mockLaunch = mock<() => Promise<void>>()
const mockStart = mock<
(opts?: { drop_pending_updates?: boolean; onStart?: () => void }) => Promise<void>
>()
.mockRejectedValueOnce(conflictError)
.mockRejectedValueOnce(conflictError)
.mockResolvedValueOnce(undefined);
(adapter.getBot() as unknown as { launch: typeof mockLaunch }).launch = mockLaunch;
.mockImplementationOnce(opts => {
opts?.onStart?.();
return new Promise(() => {});
});
(adapter.getBot() as unknown as { start: typeof mockStart }).start = mockStart;
await adapter.start({ retryDelayMs: 0 });
expect(mockLaunch).toHaveBeenCalledTimes(3);
expect(mockStart).toHaveBeenCalledTimes(3);
expect(mockLogger.warn).toHaveBeenCalledTimes(2);
});
test('should throw after exhausting all 409 retry attempts', async () => {
const adapter = new TelegramAdapter('fake-token-for-testing');
const conflictError = new Error('409: Conflict: terminated by other getUpdates request');
const mockLaunch = mock<() => Promise<void>>()
const mockStart = mock<
(opts?: { drop_pending_updates?: boolean; onStart?: () => void }) => Promise<void>
>()
.mockRejectedValueOnce(conflictError)
.mockRejectedValueOnce(conflictError)
.mockRejectedValueOnce(conflictError);
(adapter.getBot() as unknown as { launch: typeof mockLaunch }).launch = mockLaunch;
(adapter.getBot() as unknown as { start: typeof mockStart }).start = mockStart;
await expect(adapter.start({ retryDelayMs: 0 })).rejects.toThrow('409');
expect(mockLaunch).toHaveBeenCalledTimes(3);
expect(mockStart).toHaveBeenCalledTimes(3);
});
});
});
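The retry behavior these tests pin down can be sketched as a standalone helper (a simplified model of the adapter's loop, not Archon's actual implementation):

```typescript
// Simplified model of the 409-retry loop the tests above exercise.
// Retries only on Telegram "409 Conflict" getUpdates errors; anything else rethrows.
async function startWithRetry(
  start: () => Promise<void>,
  maxAttempts = 3,
  retryDelayMs = 0
): Promise<void> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await start();
      return; // started successfully
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      const isConflict = message.includes('409');
      // Non-409 errors and the final attempt both propagate to the caller
      if (!isConflict || attempt === maxAttempts) throw err;
      await new Promise(resolve => setTimeout(resolve, retryDelayMs));
    }
  }
}
```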


@ -1,8 +1,8 @@
/**
* Telegram platform adapter using Telegraf SDK
* Telegram platform adapter using grammY SDK
* Handles message sending with 4096 character limit splitting
*/
import { Telegraf, Context } from 'telegraf';
import { Bot, Context } from 'grammy';
import type { IPlatformAdapter, MessageMetadata } from '@archon/core';
import { createLogger } from '@archon/paths';
import { parseAllowedUserIds, isUserAuthorized } from './auth';
@ -20,17 +20,14 @@ function getLog(): ReturnType<typeof createLogger> {
const MAX_LENGTH = 4096;
export class TelegramAdapter implements IPlatformAdapter {
private bot: Telegraf;
private bot: Bot;
private streamingMode: 'stream' | 'batch';
private allowedUserIds: number[];
private messageHandler: ((ctx: TelegramMessageContext) => Promise<void>) | null = null;
constructor(token: string, mode: 'stream' | 'batch' = 'stream') {
// Disable handler timeout to support long-running AI operations
// Default is 90 seconds which is too short for complex coding tasks
this.bot = new Telegraf(token, {
handlerTimeout: Infinity,
});
// grammY does not impose a handler timeout by default (unlike Telegraf's 90s limit)
this.bot = new Bot(token);
this.streamingMode = mode;
// Parse Telegram user whitelist (optional - empty = open access)
@ -87,20 +84,20 @@ export class TelegramAdapter implements IPlatformAdapter {
let subChunk = '';
for (const line of lines) {
if (subChunk.length + line.length + 1 > MAX_LENGTH - 100) {
if (subChunk) await this.bot.telegram.sendMessage(id, subChunk);
if (subChunk) await this.bot.api.sendMessage(id, subChunk);
subChunk = line;
} else {
subChunk += (subChunk ? '\n' : '') + line;
}
}
if (subChunk) await this.bot.telegram.sendMessage(id, subChunk);
if (subChunk) await this.bot.api.sendMessage(id, subChunk);
return;
}
// Try MarkdownV2 formatting
const formatted = convertToTelegramMarkdown(chunk);
try {
await this.bot.telegram.sendMessage(id, formatted, { parse_mode: 'MarkdownV2' });
await this.bot.api.sendMessage(id, formatted, { parse_mode: 'MarkdownV2' });
getLog().debug({ chunkLength: chunk.length }, 'telegram.markdownv2_chunk_sent');
} catch (error) {
// Fallback to stripped plain text for this chunk
@ -113,14 +110,14 @@ export class TelegramAdapter implements IPlatformAdapter {
},
'telegram.markdownv2_failed'
);
await this.bot.telegram.sendMessage(id, stripMarkdown(chunk));
await this.bot.api.sendMessage(id, stripMarkdown(chunk));
}
}
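The line-based sub-chunking in the splitting branch above can be modeled as a pure function (a sketch; the adapter itself sends each piece immediately rather than collecting them):

```typescript
// Sketch of Telegram-style message splitting: break text on newlines into
// pieces that stay under the limit, with headroom as in the adapter's
// MAX_LENGTH - 100. Lines longer than the limit still become their own piece.
function splitByLines(text: string, maxLength = 4096): string[] {
  const limit = maxLength - 100; // headroom for formatting overhead
  const pieces: string[] = [];
  let current = '';
  for (const line of text.split('\n')) {
    if (current.length + line.length + 1 > limit) {
      if (current) pieces.push(current);
      current = line;
    } else {
      current += (current ? '\n' : '') + line;
    }
  }
  if (current) pieces.push(current);
  return pieces;
}
```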
/**
* Get the Telegraf bot instance
* Get the grammY bot instance
*/
getBot(): Telegraf {
getBot(): Bot {
return this.bot;
}
@ -171,17 +168,15 @@ export class TelegramAdapter implements IPlatformAdapter {
*/
async start(options?: { retryDelayMs?: number }): Promise<void> {
// Register message handler before launch
this.bot.on('message', ctx => {
if (!('text' in ctx.message)) return;
this.bot.on('message:text', ctx => {
const message = ctx.message.text;
if (!message) return;
// Authorization check - verify sender is in whitelist
const userId = ctx.from.id;
const userId = ctx.from?.id;
if (!isUserAuthorized(userId, this.allowedUserIds)) {
// Log unauthorized attempt (mask user ID for privacy)
const maskedId = `${String(userId).slice(0, 4)}***`;
const maskedId = userId !== undefined ? `${String(userId).slice(0, 4)}***` : 'unknown';
getLog().info({ maskedUserId: maskedId }, 'telegram.unauthorized_message');
return; // Silent rejection
}
@ -190,6 +185,11 @@ export class TelegramAdapter implements IPlatformAdapter {
const conversationId = this.getConversationId(ctx);
// Fire-and-forget - errors handled by caller
void this.messageHandler({ conversationId, message, userId });
} else {
// Intentional: message dropped silently if handler not registered yet.
// In production the server always calls onMessage() before start(); this
// path only surfaces during development or misconfiguration.
getLog().debug({ chatId: ctx.chat?.id }, 'telegram.message_dropped_no_handler');
}
});
@ -200,9 +200,26 @@ export class TelegramAdapter implements IPlatformAdapter {
const RETRY_DELAY_MS = options?.retryDelayMs ?? 60_000;
for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
try {
// dropPendingUpdates: true — discard queued messages from while the bot was offline
// drop_pending_updates: true — discard queued messages from while the bot was offline
// to avoid reprocessing stale commands after a container restart.
await this.bot.launch({ dropPendingUpdates: true });
// grammY's start() resolves only when the bot stops; use onStart callback to detect
// successful launch and return immediately while the bot continues running in background.
await new Promise<void>((resolve, reject) => {
this.bot
.start({
drop_pending_updates: true,
onStart: () => {
resolve();
},
})
.catch((err: unknown) => {
const error = err instanceof Error ? err : new Error(String(err));
// Log post-startup crashes — after onStart fires the reject() below is a no-op
// (Promise already settled), but the error should still be observable in logs.
getLog().error({ err: error }, 'telegram.bot_runtime_error');
reject(error);
});
});
getLog().info('telegram.bot_started');
return;
} catch (err) {


@ -80,6 +80,10 @@ mock.module('@archon/isolation', () => ({
IsolationHints: {},
}));
// Mock global fetch to prevent real HTTP calls (gitlab.example.com hangs on CI Linux)
const mockFetch = mock(() => Promise.resolve(new Response(JSON.stringify({}), { status: 200 })));
globalThis.fetch = mockFetch as typeof globalThis.fetch;
// Now import the adapter (after all mocks)
const { GitLabAdapter } = await import('./adapter');
const { ConversationLockManager } = await import('@archon/core');
@ -158,6 +162,7 @@ describe('GitLabAdapter', () => {
beforeEach(() => {
mockHandleMessage.mockClear();
mockOnConversationClosed.mockClear();
mockFetch.mockClear();
// Reset env
delete process.env.GITLAB_ALLOWED_USERS;
});


@ -1,6 +1,6 @@
{
"name": "@archon/cli",
"version": "0.2.13",
"version": "0.3.6",
"type": "module",
"main": "./src/cli.ts",
"bin": {
@ -8,14 +8,17 @@
},
"scripts": {
"cli": "bun src/cli.ts",
"test": "bun test src/commands/version.test.ts src/commands/setup.test.ts && bun test src/commands/workflow.test.ts && bun test src/commands/isolation.test.ts && bun test src/commands/chat.test.ts",
"test": "bun test src/commands/version.test.ts src/commands/setup.test.ts && bun test src/commands/workflow.test.ts && bun test src/commands/isolation.test.ts && bun test src/commands/chat.test.ts && bun test src/commands/serve.test.ts",
"type-check": "bun x tsc --noEmit"
},
"dependencies": {
"@archon/adapters": "workspace:*",
"@archon/core": "workspace:*",
"@archon/git": "workspace:*",
"@archon/isolation": "workspace:*",
"@archon/paths": "workspace:*",
"@archon/providers": "workspace:*",
"@archon/server": "workspace:*",
"@archon/workflows": "workspace:*",
"@clack/prompts": "^1.0.0",
"dotenv": "^17.2.3"


@ -26,6 +26,8 @@ describe('CLI argument parsing', () => {
spawn: { type: 'boolean' },
quiet: { type: 'boolean', short: 'q' },
verbose: { type: 'boolean', short: 'v' },
scope: { type: 'string' },
force: { type: 'boolean' },
},
allowPositionals: true,
strict: false,
@ -165,6 +167,35 @@ describe('CLI argument parsing', () => {
expect(result.positionals).toContain('/path'); // /path becomes positional
});
});
describe('setup --scope and --force flags (#1303)', () => {
it('parses --scope home', () => {
const result = parseCliArgs(['setup', '--scope', 'home']);
expect(result.values.scope).toBe('home');
});
it('parses --scope project', () => {
const result = parseCliArgs(['setup', '--scope', 'project']);
expect(result.values.scope).toBe('project');
});
it('defaults --scope to undefined when not provided', () => {
const result = parseCliArgs(['setup']);
expect(result.values.scope).toBeUndefined();
});
it('parses --force as boolean', () => {
const result = parseCliArgs(['setup', '--force']);
expect(result.values.force).toBe(true);
});
it('captures an invalid --scope value verbatim for caller validation', () => {
// parseArgs itself does not validate the enum; cli.ts validates and
// exits on unknown scope values. The test documents the contract.
const result = parseCliArgs(['setup', '--scope', 'nonsense']);
expect(result.values.scope).toBe('nonsense');
});
});
});
describe('Conversation ID generation', () => {


@ -7,41 +7,21 @@
* archon workflow run <name> [msg] Run a workflow
* archon version Show version info
*/
// Must be the very first import — strips Bun-auto-loaded CWD .env keys before
// any module reads process.env at init time (e.g. @archon/paths/logger reads LOG_LEVEL).
import '@archon/paths/strip-cwd-env-boot';
// Then load archon-owned env from ~/.archon/.env (user scope) and
// <cwd>/.archon/.env (repo scope, wins over user). Both with override: true.
// See packages/paths/src/env-loader.ts and the three-path model (#1302 / #1303).
import { loadArchonEnv } from '@archon/paths/env-loader';
loadArchonEnv(process.cwd());
import { parseArgs } from 'util';
import { config } from 'dotenv';
import { resolve } from 'path';
import { existsSync } from 'fs';
// Strip all vars that Bun may have auto-loaded from CWD's .env.
// Bun auto-loads .env relative to CWD before any user code runs. The CLI
// runs from target repos whose .env contains keys for that app (ANTHROPIC_API_KEY,
// DATABASE_URL, OPENAI_API_KEY, etc.) — none of which should affect Archon.
// Strategy: parse the CWD .env without applying it, then delete those keys.
const cwdEnvPath = resolve(process.cwd(), '.env');
if (existsSync(cwdEnvPath)) {
const cwdEnvResult = config({ path: cwdEnvPath, processEnv: {} });
// If parse fails, cwdEnvResult.parsed is undefined — safe to skip:
// Bun uses the same RFC-style parser, so a file dotenv cannot parse
// was also unparseable by Bun and contributed no keys to process.env.
if (cwdEnvResult.parsed) {
for (const key of Object.keys(cwdEnvResult.parsed)) {
Reflect.deleteProperty(process.env, key);
}
}
}
// Load .env from global Archon config only (override: true so ~/.archon/.env
// always wins over any remaining Bun-auto-loaded vars)
const globalEnvPath = resolve(process.env.HOME ?? '~', '.archon', '.env');
if (existsSync(globalEnvPath)) {
const result = config({ path: globalEnvPath, override: true });
if (result.error) {
// Logger may not be available yet (early startup), so use console for user-facing error
console.error(`Error loading .env from ${globalEnvPath}: ${result.error.message}`);
console.error('Hint: Check for syntax errors in your .env file.');
process.exit(1);
}
}
// CLAUDECODE=1 warning is emitted inside stripCwdEnv() (boot import above)
// BEFORE the marker is deleted from process.env. No duplicate warning here.
// Smart defaults for Claude auth
// If no explicit tokens, default to global auth from `claude /login`
@ -53,6 +33,11 @@ if (!process.env.CLAUDE_API_KEY && !process.env.CLAUDE_CODE_OAUTH_TOKEN) {
// DATABASE_URL is no longer required - SQLite will be used as default
// Bootstrap provider registry before any provider lookups
import { registerBuiltinProviders, registerCommunityProviders } from '@archon/providers';
registerBuiltinProviders();
registerCommunityProviders();
// Import commands after dotenv is loaded
import { versionCommand } from './commands/version';
import {
@ -78,8 +63,16 @@ import { continueCommand } from './commands/continue';
import { chatCommand } from './commands/chat';
import { setupCommand } from './commands/setup';
import { validateWorkflowsCommand, validateCommandsCommand } from './commands/validate';
import { serveCommand } from './commands/serve';
import { closeDatabase } from '@archon/core';
import { setLogLevel, createLogger } from '@archon/paths';
import {
setLogLevel,
createLogger,
checkForUpdate,
BUNDLED_IS_BINARY,
BUNDLED_VERSION,
shutdownTelemetry,
} from '@archon/paths';
import * as git from '@archon/git';
/** Lazy-initialized logger (deferred so test mocks can intercept createLogger) */
@ -110,6 +103,7 @@ Commands:
isolation cleanup --merged Remove environments with branches merged into main
continue <branch> [msg] Continue work on an existing worktree with prior context
complete <branch> [...] Complete branch lifecycle (remove worktree + branches)
serve Start the web UI server (downloads web UI on first run)
validate workflows [name] Validate workflow definitions and their references
validate commands [name] Validate command files
version Show version info
@ -127,9 +121,8 @@ Options:
--json Output machine-readable JSON (for workflow list)
--workflow <name> Workflow to run for 'continue' (default: archon-assist)
--no-context Skip context injection for 'continue'
--allow-env-keys Grant env-key consent during auto-registration
(bypasses the env-leak gate for this codebase;
logs an audit entry)
--port <port> Override server port for 'serve' (default: 3090)
--download-only Download web UI without starting the server
Examples:
archon chat "What does the orchestrator do?"
@ -155,6 +148,20 @@ async function closeDb(): Promise<void> {
}
}
async function printUpdateNotice(quiet: boolean | undefined): Promise<void> {
if (quiet || !BUNDLED_IS_BINARY) return;
try {
const result = await checkForUpdate(BUNDLED_VERSION);
if (result?.updateAvailable) {
process.stderr.write(
`Update available: v${result.currentVersion} → v${result.latestVersion} ${result.releaseUrl}\n`
);
}
} catch (err) {
getLog().debug({ err }, 'update_check.notice_failed');
}
}
/**
* Main CLI entry point
* Returns exit code (0 = success, non-zero = failure)
@ -193,7 +200,10 @@ async function main(): Promise<number> {
reason: { type: 'string' },
workflow: { type: 'string' },
'no-context': { type: 'boolean' },
'allow-env-keys': { type: 'boolean' },
port: { type: 'string' },
'download-only': { type: 'boolean' },
scope: { type: 'string' },
force: { type: 'boolean' },
},
allowPositionals: true,
strict: false, // Allow unknown flags to pass through
@ -215,8 +225,6 @@ async function main(): Promise<number> {
const resumeFlag = values.resume as boolean | undefined;
const spawnFlag = values.spawn as boolean | undefined;
const jsonFlag = values.json as boolean | undefined;
const allowEnvKeysFlag = values['allow-env-keys'] as boolean | undefined;
// Handle help flag
if (values.help) {
printUsage();
@ -228,7 +236,7 @@ async function main(): Promise<number> {
const subcommand = positionals[1];
// Commands that don't require git repo validation
const noGitCommands = ['version', 'help', 'setup', 'chat', 'continue'];
const noGitCommands = ['version', 'help', 'setup', 'chat', 'continue', 'serve'];
const requiresGitRepo = !noGitCommands.includes(command ?? '');
try {
@ -282,9 +290,30 @@ async function main(): Promise<number> {
break;
}
case 'setup':
await setupCommand({ spawn: spawnFlag, repoPath: cwd });
case 'setup': {
const rawScope = values.scope as string | undefined;
if (rawScope !== undefined && rawScope !== 'home' && rawScope !== 'project') {
console.error(`Error: Invalid --scope: "${rawScope}". Must be "home" or "project".`);
return 1;
}
const scope: 'home' | 'project' = rawScope ?? 'home';
const forceFlag = (values.force as boolean | undefined) ?? false;
// For --scope project, resolve to the git repo root so running from a
// subdirectory writes to <repo-root>/.archon/.env (what loadArchonEnv
// reads at boot) — not <subdir>/.archon/.env.
let repoPath = cwd;
if (scope === 'project') {
const repoRoot = await git.findRepoRoot(cwd);
if (!repoRoot) {
console.error('Error: --scope project requires running from inside a git repository.');
console.error('Run from the repo root, pass --cwd <repo>, or use --scope home.');
return 1;
}
repoPath = repoRoot;
}
await setupCommand({ spawn: spawnFlag, repoPath, scope, force: forceFlag });
break;
}
case 'workflow':
switch (subcommand) {
@ -328,7 +357,6 @@ async function main(): Promise<number> {
fromBranch,
noWorktree,
resume: resumeFlag,
allowEnvKeys: allowEnvKeysFlag,
quiet: values.quiet as boolean | undefined,
verbose: values.verbose as boolean | undefined,
};
@ -376,10 +404,11 @@ async function main(): Promise<number> {
case 'reject': {
const rejectRunId = positionals[2];
if (!rejectRunId) {
console.error('Usage: archon workflow reject <run-id> [--reason "..."]');
console.error('Usage: archon workflow reject <run-id> [reason]');
return 1;
}
const rejectReason = values.reason as string | undefined;
const rejectReason =
(values.reason as string | undefined) || positionals.slice(3).join(' ') || undefined;
await workflowRejectCommand(rejectRunId, rejectReason);
break;
}
@ -534,6 +563,12 @@ async function main(): Promise<number> {
break;
}
case 'serve': {
const servePort = values.port !== undefined ? Number(values.port) : undefined;
const downloadOnly = Boolean(values['download-only']);
return await serveCommand({ port: servePort, downloadOnly });
}
default:
if (command === undefined) {
console.error('Missing command');
@ -543,6 +578,7 @@ async function main(): Promise<number> {
printUsage();
return 1;
}
await printUpdateNotice(values.quiet as boolean | undefined);
return 0;
} catch (error) {
const err = error as Error;
@ -552,6 +588,9 @@ async function main(): Promise<number> {
}
return 1;
} finally {
// Flush queued telemetry events before the CLI process exits.
// Short-lived CLI commands lose buffered events if shutdown() is skipped.
await shutdownTelemetry();
// Always close database connection
await closeDb();
}
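The `--scope` handling above relies on `parseArgs` capturing the raw string verbatim while the CLI narrows it to the enum. That contract can be distilled into a small helper (sketch with illustrative names; `null` stands in for the usage-error path):

```typescript
type Scope = 'home' | 'project';

// parseArgs does not validate enum values, so the caller narrows the raw
// string: undefined falls back to the default, anything outside the enum
// is rejected (null → caller prints usage and exits 1).
function resolveScope(raw: string | undefined): Scope | null {
  if (raw === undefined) return 'home'; // default scope
  return raw === 'home' || raw === 'project' ? raw : null;
}
```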


@ -36,7 +36,9 @@ mock.module('@archon/core/db/workflows', () => ({
getActiveWorkflowRunByPath: mockGetActiveWorkflowRunByPath,
}));
const mockRemoveEnvironment = mock(() => Promise.resolve());
const mockRemoveEnvironment = mock(() =>
Promise.resolve({ worktreeRemoved: true, branchDeleted: true, warnings: [] })
);
const mockCleanupMergedWorktrees = mock(() => Promise.resolve({ removed: [], skipped: [] }));
mock.module('@archon/core/services/cleanup-service', () => ({
@ -136,7 +138,11 @@ describe('isolationCompleteCommand', () => {
it('completes a branch when env is found and all checks pass', async () => {
mockFindActiveByBranchName.mockResolvedValueOnce(mockEnv);
mockRemoveEnvironment.mockResolvedValueOnce(undefined);
mockRemoveEnvironment.mockResolvedValueOnce({
worktreeRemoved: true,
branchDeleted: true,
warnings: [],
});
await isolationCompleteCommand(['feature-branch'], { force: false, deleteRemote: true });
@ -309,7 +315,11 @@ describe('isolationCompleteCommand', () => {
it('skips PR check with warning when gh CLI is not available', async () => {
mockFindActiveByBranchName.mockResolvedValueOnce(mockEnv);
mockRemoveEnvironment.mockResolvedValueOnce(undefined);
mockRemoveEnvironment.mockResolvedValueOnce({
worktreeRemoved: true,
branchDeleted: true,
warnings: [],
});
mockExecFileAsync.mockImplementation((cmd: string) => {
if (cmd === 'gh') {
const err = Object.assign(new Error('spawn gh ENOENT'), { code: 'ENOENT' });
@ -335,7 +345,11 @@ describe('isolationCompleteCommand', () => {
id: 'run-abc',
workflow_name: 'implement',
});
mockRemoveEnvironment.mockResolvedValueOnce(undefined);
mockRemoveEnvironment.mockResolvedValueOnce({
worktreeRemoved: true,
branchDeleted: true,
warnings: [],
});
await isolationCompleteCommand(['dirty-branch'], { force: true, deleteRemote: true });
@ -368,7 +382,7 @@ describe('isolationCompleteCommand', () => {
.mockResolvedValueOnce(null) // not found: branch-2
.mockResolvedValueOnce(mockEnv); // found: branch-3 (will fail)
mockRemoveEnvironment
.mockResolvedValueOnce(undefined) // branch-1 succeeds
.mockResolvedValueOnce({ worktreeRemoved: true, branchDeleted: true, warnings: [] }) // branch-1 succeeds
.mockRejectedValueOnce(new Error('some error')); // branch-3 fails
await isolationCompleteCommand(['branch-1', 'branch-2', 'branch-3'], {
@ -378,6 +392,59 @@ describe('isolationCompleteCommand', () => {
expect(consoleLogSpy).toHaveBeenCalledWith('\nComplete: 1 completed, 1 failed, 1 not found');
});
it('counts as failed when removeEnvironment returns skippedReason (ghost worktree)', async () => {
mockFindActiveByBranchName.mockResolvedValueOnce(mockEnv);
mockRemoveEnvironment.mockResolvedValueOnce({
worktreeRemoved: false,
branchDeleted: false,
skippedReason: 'has uncommitted changes',
warnings: [],
});
await isolationCompleteCommand(['ghost-branch'], { force: true, deleteRemote: true });
expect(consoleErrorSpy).toHaveBeenCalledWith(
' Blocked: ghost-branch — has uncommitted changes'
);
expect(consoleErrorSpy).toHaveBeenCalledWith(' Use --force to override.');
expect(consoleLogSpy).toHaveBeenCalledWith('\nComplete: 0 completed, 1 failed, 0 not found');
});
it('counts as failed when removeEnvironment returns partial (worktree not removed, branch deleted)', async () => {
mockFindActiveByBranchName.mockResolvedValueOnce(mockEnv);
mockRemoveEnvironment.mockResolvedValueOnce({
worktreeRemoved: false,
branchDeleted: true,
warnings: ['Some warning'],
skippedReason: undefined,
});
await isolationCompleteCommand(['partial-branch'], { force: true, deleteRemote: true });
expect(consoleErrorSpy).toHaveBeenCalledWith(
' Partial: partial-branch — worktree was not removed from disk (branch deleted, DB updated)'
);
expect(consoleErrorSpy).toHaveBeenCalledWith(' ⚠ Some warning');
expect(consoleLogSpy).toHaveBeenCalledWith('\nComplete: 0 completed, 1 failed, 0 not found');
});
it('surfaces warnings from removeEnvironment result', async () => {
mockFindActiveByBranchName.mockResolvedValueOnce(mockEnv);
mockRemoveEnvironment.mockResolvedValueOnce({
worktreeRemoved: true,
branchDeleted: false,
warnings: ["Cannot delete branch 'feature-branch': checked out elsewhere"],
});
await isolationCompleteCommand(['feature-branch'], { force: true, deleteRemote: true });
expect(consoleWarnSpy).toHaveBeenCalledWith(
" Warning: Cannot delete branch 'feature-branch': checked out elsewhere"
);
// Should still count as completed since worktree was removed
expect(consoleLogSpy).toHaveBeenCalledWith(' Completed: feature-branch');
expect(consoleLogSpy).toHaveBeenCalledWith('\nComplete: 1 completed, 0 failed, 0 not found');
});
});
describe('isolationCleanupMergedCommand', () => {


@ -13,7 +13,10 @@ import {
getDefaultBranch,
} from '@archon/git';
import { getIsolationProvider } from '@archon/isolation';
import { removeEnvironment } from '@archon/core/services/cleanup-service';
import {
removeEnvironment,
type RemoveEnvironmentResult,
} from '@archon/core/services/cleanup-service';
import {
listEnvironments,
cleanupMergedEnvironments,
@ -298,12 +301,37 @@ export async function isolationCompleteCommand(
}
try {
await removeEnvironment(env.id, {
const result: RemoveEnvironmentResult = await removeEnvironment(env.id, {
force: options.force,
deleteRemoteBranch: options.deleteRemote ?? true,
});
console.log(` Completed: ${branch}`);
completed++;
// Surface warnings from partial cleanup
for (const warning of result.warnings) {
console.warn(` Warning: ${warning}`);
}
if (result.skippedReason) {
console.error(` Blocked: ${branch} — ${result.skippedReason}`);
if (result.skippedReason === 'has uncommitted changes') {
console.error(' Use --force to override.');
}
failed++;
} else if (!result.worktreeRemoved) {
const parts: string[] = [];
if (result.branchDeleted) parts.push('branch deleted');
parts.push('DB updated');
console.error(
` Partial: ${branch} — worktree was not removed from disk (${parts.join(', ')})`
);
for (const warning of result.warnings) {
console.error(`    ⚠ ${warning}`);
}
failed++;
} else {
console.log(` Completed: ${branch}`);
completed++;
}
} catch (error) {
const err = error as Error;
getLog().warn({ err, branch, envId: env.id }, 'isolation.complete_failed');

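The three-way outcome handling in `isolationCompleteCommand` can be condensed into a pure classifier — a sketch using the `RemoveEnvironmentResult` shape pinned down by the tests above (the `classify` helper itself is illustrative, not part of the cleanup service):

```typescript
interface RemoveEnvironmentResult {
  worktreeRemoved: boolean;
  branchDeleted: boolean;
  skippedReason?: string;
  warnings: string[];
}

type Outcome = 'blocked' | 'partial' | 'completed';

// Mirrors the branch ordering in the command: a skippedReason wins, then a
// worktree left on disk counts as partial, and only full removal completes.
function classify(result: RemoveEnvironmentResult): Outcome {
  if (result.skippedReason) return 'blocked';
  if (!result.worktreeRemoved) return 'partial';
  return 'completed';
}
```

Both `blocked` and `partial` increment the failed counter; warnings are surfaced in every case.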

@ -0,0 +1,118 @@
import { describe, it, expect, mock, beforeEach, afterEach, spyOn } from 'bun:test';
// Mock @archon/paths BEFORE importing the module under test.
// This sets BUNDLED_IS_BINARY = false (dev mode) so serveCommand rejects.
const mockLogger = {
fatal: mock(() => undefined),
error: mock(() => undefined),
warn: mock(() => undefined),
info: mock(() => undefined),
debug: mock(() => undefined),
trace: mock(() => undefined),
};
mock.module('@archon/paths', () => ({
createLogger: mock(() => mockLogger),
getWebDistDir: mock((version: string) => `/tmp/test-archon/web-dist/${version}`),
BUNDLED_IS_BINARY: false,
BUNDLED_VERSION: 'dev',
}));
import { serveCommand, parseChecksum } from './serve';
describe('parseChecksum', () => {
const validHash = 'a'.repeat(64);
it('should extract hash for matching filename', () => {
const checksums = [
`${'b'.repeat(64)} archon-linux-x64`,
`${validHash} archon-web.tar.gz`,
`${'c'.repeat(64)} archon-darwin-arm64`,
].join('\n');
expect(parseChecksum(checksums, 'archon-web.tar.gz')).toBe(validHash);
});
it('should handle single-space separator', () => {
const checksums = `${validHash} archon-web.tar.gz\n`;
expect(parseChecksum(checksums, 'archon-web.tar.gz')).toBe(validHash);
});
it('should throw for missing filename', () => {
const checksums = `${validHash} archon-linux-x64\n`;
expect(() => parseChecksum(checksums, 'archon-web.tar.gz')).toThrow(
'Checksum not found for archon-web.tar.gz'
);
});
it('should throw for empty checksums text', () => {
expect(() => parseChecksum('', 'archon-web.tar.gz')).toThrow('Checksum not found');
});
it('should skip blank lines', () => {
const checksums = `\n${validHash} archon-web.tar.gz\n\n`;
expect(parseChecksum(checksums, 'archon-web.tar.gz')).toBe(validHash);
});
it('should throw for malformed hash (not 64 hex chars)', () => {
const checksums = 'short_hash archon-web.tar.gz\n';
expect(() => parseChecksum(checksums, 'archon-web.tar.gz')).toThrow(
'Malformed checksum entry for archon-web.tar.gz'
);
});
it('should throw for uppercase hex hash', () => {
const checksums = `${'A'.repeat(64)} archon-web.tar.gz\n`;
expect(() => parseChecksum(checksums, 'archon-web.tar.gz')).toThrow(
'Malformed checksum entry for archon-web.tar.gz'
);
});
});
describe('serveCommand', () => {
let consoleErrorSpy: ReturnType<typeof spyOn>;
beforeEach(() => {
consoleErrorSpy = spyOn(console, 'error').mockImplementation(() => {});
});
afterEach(() => {
consoleErrorSpy.mockRestore();
});
it('should reject in dev mode (non-binary)', async () => {
const exitCode = await serveCommand({});
expect(exitCode).toBe(1);
expect(consoleErrorSpy).toHaveBeenCalledWith(
'Error: `archon serve` is for compiled binaries only.'
);
});
it('should reject with downloadOnly in dev mode', async () => {
const exitCode = await serveCommand({ downloadOnly: true });
expect(exitCode).toBe(1);
});
it('should reject invalid port (NaN)', async () => {
const exitCode = await serveCommand({ port: NaN });
expect(exitCode).toBe(1);
expect(consoleErrorSpy).toHaveBeenCalledWith(
expect.stringContaining('--port must be an integer between 1 and 65535')
);
});
it('should reject port out of range', async () => {
const exitCode = await serveCommand({ port: 99999 });
expect(exitCode).toBe(1);
expect(consoleErrorSpy).toHaveBeenCalledWith(
expect.stringContaining('--port must be an integer between 1 and 65535')
);
});
it('should reject port 0', async () => {
const exitCode = await serveCommand({ port: 0 });
expect(exitCode).toBe(1);
expect(consoleErrorSpy).toHaveBeenCalledWith(
expect.stringContaining('--port must be an integer between 1 and 65535')
);
});
});


@ -0,0 +1,186 @@
import { dirname } from 'path';
import { existsSync, mkdirSync, renameSync, rmSync } from 'fs';
import { createLogger, getWebDistDir, BUNDLED_IS_BINARY, BUNDLED_VERSION } from '@archon/paths';
const log = createLogger('cli.serve');
const GITHUB_REPO = 'coleam00/Archon';
function toError(err: unknown): Error {
return err instanceof Error ? err : new Error(String(err));
}
export interface ServeOptions {
/** TCP port to bind. Ignored when downloadOnly is true. Range: 1-65535. */
port?: number;
/** Download the web UI and exit without starting the server. */
downloadOnly?: boolean;
}
export async function serveCommand(opts: ServeOptions): Promise<number> {
if (
opts.port !== undefined &&
(!Number.isInteger(opts.port) || opts.port < 1 || opts.port > 65535)
) {
console.error(`Error: --port must be an integer between 1 and 65535, got: ${opts.port}`);
return 1;
}
if (!BUNDLED_IS_BINARY) {
console.error('Error: `archon serve` is for compiled binaries only.');
console.error('For development, use: bun run dev');
return 1;
}
const version = BUNDLED_VERSION;
const webDistDir = getWebDistDir(version);
if (!existsSync(webDistDir)) {
try {
await downloadWebDist(version, webDistDir);
} catch (err) {
const error = toError(err);
log.error({ err: error, version, webDistDir }, 'web_dist.download_failed');
console.error(`Error: Failed to download web UI: ${error.message}`);
return 1;
}
} else {
log.info({ webDistDir }, 'web_dist.cache_hit');
}
if (opts.downloadOnly) {
log.info({ webDistDir }, 'web_dist.download_completed');
console.log(`Web UI downloaded to: ${webDistDir}`);
return 0;
}
// Import server and start (dynamic import keeps CLI startup fast for other commands)
try {
const { startServer } = await import('@archon/server');
await startServer({
webDistPath: webDistDir,
port: opts.port,
});
} catch (err) {
const error = toError(err);
log.error({ err: error, version, webDistDir, port: opts.port }, 'server.start_failed');
console.error(`Error: Server failed to start: ${error.message}`);
return 1;
}
// Block forever — Bun.serve() keeps the event loop alive, but the CLI's
// process.exit(exitCode) would kill it. Wait on a promise that only resolves
// on SIGINT/SIGTERM so the server stays running.
await new Promise<void>(resolve => {
process.once('SIGINT', resolve);
process.once('SIGTERM', resolve);
});
return 0;
}
async function downloadWebDist(version: string, targetDir: string): Promise<void> {
const tarballUrl = `https://github.com/${GITHUB_REPO}/releases/download/v${version}/archon-web.tar.gz`;
const checksumsUrl = `https://github.com/${GITHUB_REPO}/releases/download/v${version}/checksums.txt`;
log.info({ version, targetDir }, 'web_dist.download_started');
console.log(`Web UI not found locally — downloading from release v${version}...`);
// Download checksums and tarball in parallel
console.log(`Downloading ${tarballUrl}...`);
const [checksumsRes, tarballRes] = await Promise.all([
fetch(checksumsUrl).catch((err: unknown) => {
throw new Error(
`Network error fetching checksums from ${checksumsUrl}: ${(err as Error).message}`
);
}),
fetch(tarballUrl).catch((err: unknown) => {
throw new Error(
`Network error fetching tarball from ${tarballUrl}: ${(err as Error).message}`
);
}),
]);
if (!checksumsRes.ok) {
throw new Error(
`Failed to download checksums: ${checksumsRes.status} ${checksumsRes.statusText}`
);
}
if (!tarballRes.ok) {
throw new Error(`Failed to download web UI: ${tarballRes.status} ${tarballRes.statusText}`);
}
const [checksumsText, tarballBuffer] = await Promise.all([
checksumsRes.text(),
tarballRes.arrayBuffer(),
]);
const expectedHash = parseChecksum(checksumsText, 'archon-web.tar.gz');
// Verify checksum
const hasher = new Bun.CryptoHasher('sha256');
hasher.update(new Uint8Array(tarballBuffer));
const actualHash = hasher.digest('hex');
if (actualHash !== expectedHash) {
throw new Error(`Checksum mismatch: expected ${expectedHash}, got ${actualHash}`);
}
console.log('Checksum verified.');
// Extract to temp dir, then atomic rename
const tmpDir = `${targetDir}.tmp`;
// Clean up any previous failed attempt
rmSync(tmpDir, { recursive: true, force: true });
mkdirSync(tmpDir, { recursive: true });
// Extract tarball using tar (available on macOS/Linux)
const proc = Bun.spawn(['tar', 'xzf', '-', '-C', tmpDir, '--strip-components=1'], {
stdin: new Uint8Array(tarballBuffer),
stderr: 'pipe',
});
const exitCode = await proc.exited;
if (exitCode !== 0) {
const stderrText = await new Response(proc.stderr).text();
cleanupAndThrow(tmpDir, `tar extraction failed (exit ${exitCode}): ${stderrText.trim()}`);
}
// Verify extraction produced expected layout
if (!existsSync(`${tmpDir}/index.html`)) {
cleanupAndThrow(
tmpDir,
'Extraction produced unexpected layout — index.html not found in extracted dir'
);
}
// Atomic move into place
mkdirSync(dirname(targetDir), { recursive: true });
try {
renameSync(tmpDir, targetDir);
} catch (err) {
cleanupAndThrow(
tmpDir,
`Failed to move extracted web UI from ${tmpDir} to ${targetDir}: ${(err as Error).message}`
);
}
console.log(`Extracted to ${targetDir}`);
}
function cleanupAndThrow(tmpDir: string, message: string): never {
rmSync(tmpDir, { recursive: true, force: true });
throw new Error(message);
}
/**
* Parse a SHA-256 checksum from a checksums.txt file (sha256sum format).
* Format: `<hash>  <filename>` (two spaces) or `<hash> <filename>` (single space)
*/
export function parseChecksum(checksums: string, filename: string): string {
for (const line of checksums.split('\n')) {
const parts = line.trim().split(/\s+/);
if (parts.length >= 2 && parts[1] === filename) {
const hash = parts[0];
if (!/^[0-9a-f]{64}$/.test(hash)) {
throw new Error(`Malformed checksum entry for ${filename}: "${line.trim()}"`);
}
return hash;
}
}
throw new Error(`Checksum not found for ${filename} in checksums.txt`);
}
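A runtime-agnostic usage sketch of the checksum contract above, pairing the sha256sum-format lookup with an actual digest comparison. It uses `node:crypto` rather than `Bun.CryptoHasher` so it runs anywhere; `lookupChecksum`/`verify` are illustrative re-statements, not the exported `parseChecksum`:

```typescript
import { createHash } from 'node:crypto';

// Minimal re-statement of the sha256sum-format lookup (same contract as
// parseChecksum in the diff above): `<hash> <filename>` per line.
function lookupChecksum(checksums: string, filename: string): string {
  for (const line of checksums.split('\n')) {
    const parts = line.trim().split(/\s+/);
    if (parts.length >= 2 && parts[1] === filename) return parts[0];
  }
  throw new Error(`Checksum not found for ${filename}`);
}

// Compare the expected hash against the actual digest of the payload.
function verify(buf: Uint8Array, checksums: string, filename: string): boolean {
  const expected = lookupChecksum(checksums, filename);
  const actual = createHash('sha256').update(buf).digest('hex');
  return actual === expected;
}

// Example fixture: a checksums.txt entry generated from a known payload.
const payload = new TextEncoder().encode('hello');
const hash = createHash('sha256').update(payload).digest('hex');
const checksumsTxt = `${hash}  archon-web.tar.gz\n`;
```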


@ -11,7 +11,13 @@ import {
generateWebhookSecret,
spawnTerminalWithSetup,
copyArchonSkill,
detectClaudeExecutablePath,
writeScopedEnv,
serializeEnv,
resolveScopedEnvPath,
} from './setup';
import * as setupModule from './setup';
import { parse as parseDotenv } from 'dotenv';
// Test directory for file operations
const TEST_DIR = join(tmpdir(), 'archon-setup-test-' + Date.now());
@ -148,7 +154,9 @@ CODEX_ACCOUNT_ID=account1
expect(content).toContain('# Using SQLite (default)');
expect(content).toContain('CLAUDE_USE_GLOBAL_AUTH=true');
expect(content).toContain('DEFAULT_AI_ASSISTANT=claude');
expect(content).toContain('PORT=3000');
// PORT is intentionally commented out — server and Vite both default to 3090 when unset (#1152).
expect(content).toContain('# PORT=3090');
expect(content).not.toMatch(/^PORT=/m);
expect(content).not.toContain('DATABASE_URL=');
});
@ -176,6 +184,41 @@ CODEX_ACCOUNT_ID=account1
expect(content).toContain('CLAUDE_API_KEY=sk-test-key');
});
it('emits CLAUDE_BIN_PATH when claudeBinaryPath is configured', () => {
const content = generateEnvContent({
database: { type: 'sqlite' },
ai: {
claude: true,
claudeAuthType: 'global',
claudeBinaryPath: '/usr/local/lib/node_modules/@anthropic-ai/claude-code/cli.js',
codex: false,
defaultAssistant: 'claude',
},
platforms: { github: false, telegram: false, slack: false, discord: false },
botDisplayName: 'Archon',
});
expect(content).toContain(
'CLAUDE_BIN_PATH=/usr/local/lib/node_modules/@anthropic-ai/claude-code/cli.js'
);
});
it('omits CLAUDE_BIN_PATH when not configured', () => {
const content = generateEnvContent({
database: { type: 'sqlite' },
ai: {
claude: true,
claudeAuthType: 'global',
codex: false,
defaultAssistant: 'claude',
},
platforms: { github: false, telegram: false, slack: false, discord: false },
botDisplayName: 'Archon',
});
expect(content).not.toContain('CLAUDE_BIN_PATH=');
});
it('should include platform configurations', () => {
const content = generateEnvContent({
database: { type: 'sqlite' },
@ -418,3 +461,278 @@ CODEX_ACCOUNT_ID=account1
});
});
});
describe('detectClaudeExecutablePath probe order', () => {
// Use spies on the exported probe wrappers so each tier can be controlled
// independently without touching the real filesystem or shell.
let fileExistsSpy: ReturnType<typeof spyOn>;
let npmRootSpy: ReturnType<typeof spyOn>;
let whichSpy: ReturnType<typeof spyOn>;
beforeEach(() => {
fileExistsSpy = spyOn(setupModule, 'probeFileExists').mockReturnValue(false);
npmRootSpy = spyOn(setupModule, 'probeNpmRoot').mockReturnValue(null);
whichSpy = spyOn(setupModule, 'probeWhichClaude').mockReturnValue(null);
});
afterEach(() => {
fileExistsSpy.mockRestore();
npmRootSpy.mockRestore();
whichSpy.mockRestore();
});
it('returns the native installer path when present (tier 1 wins)', () => {
// Native path exists; subsequent probes must not be called.
fileExistsSpy.mockImplementation(
(p: string) => p.includes('.local/bin/claude') || p.includes('.local\\bin\\claude')
);
const result = detectClaudeExecutablePath();
expect(result).toBeTruthy();
expect(result).toMatch(/\.local[\\/]bin[\\/]claude/);
// Tier 2 / 3 must not have been consulted.
expect(npmRootSpy).not.toHaveBeenCalled();
expect(whichSpy).not.toHaveBeenCalled();
});
it('falls through to npm cli.js when native is missing (tier 2 wins)', () => {
// Use path.join so the expected result matches whatever separator the
// production code produces on the current platform (backslash on Windows,
// forward slash elsewhere).
const npmRoot = join('fake', 'npm', 'root');
const expectedCliJs = join(npmRoot, '@anthropic-ai', 'claude-code', 'cli.js');
npmRootSpy.mockReturnValue(npmRoot);
fileExistsSpy.mockImplementation((p: string) => p === expectedCliJs);
const result = detectClaudeExecutablePath();
expect(result).toBe(expectedCliJs);
// Tier 3 must not have been consulted.
expect(whichSpy).not.toHaveBeenCalled();
});
it('falls through to which/where when native and npm probes both miss (tier 3 wins)', () => {
npmRootSpy.mockReturnValue('/fake/npm/root');
// Native miss, npm cli.js miss, but `which claude` returns a path that exists.
whichSpy.mockReturnValue('/opt/homebrew/bin/claude');
fileExistsSpy.mockImplementation((p: string) => p === '/opt/homebrew/bin/claude');
const result = detectClaudeExecutablePath();
expect(result).toBe('/opt/homebrew/bin/claude');
});
it('returns null when every probe misses', () => {
// All defaults already return false/null; nothing to override.
expect(detectClaudeExecutablePath()).toBeNull();
});
it('does not return a which-resolved path that fails the existsSync check', () => {
// `which` returns a path string but the file is not actually present
// (stale PATH entry, dangling symlink, etc.) — must not be returned.
npmRootSpy.mockReturnValue('/fake/npm/root');
whichSpy.mockReturnValue('/stale/path/claude');
fileExistsSpy.mockReturnValue(false);
expect(detectClaudeExecutablePath()).toBeNull();
});
it('skips npm tier when probeNpmRoot returns null (e.g. npm not installed)', () => {
// npm probe fails; tier 3 must still run.
whichSpy.mockReturnValue('/usr/local/bin/claude');
fileExistsSpy.mockImplementation((p: string) => p === '/usr/local/bin/claude');
const result = detectClaudeExecutablePath();
expect(result).toBe('/usr/local/bin/claude');
expect(npmRootSpy).toHaveBeenCalled();
});
});
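The three-tier fallback these tests pin down (native install path → npm global `cli.js` → `which`/`where`, each gated by an existence check so stale PATH entries are rejected) generalizes to a small probe chain. A sketch with an injected `exists` predicate so it stays testable without a filesystem — `firstExisting` is an illustrative name, not the setup module's API:

```typescript
// Ordered probe chain: the first probe returning a path that actually
// exists wins; later tiers are never consulted. A probe may return null
// (e.g. npm not installed) to skip its tier entirely.
function firstExisting(
  probes: Array<() => string | null>,
  exists: (p: string) => boolean
): string | null {
  for (const probe of probes) {
    const candidate = probe();
    if (candidate !== null && exists(candidate)) return candidate;
  }
  return null;
}
```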
/**
* Tests for the three-path env write model (#1303).
*
* Invariants:
* - <repo>/.env is NEVER written.
* - Default write targets ~/.archon/.env (home scope) with merge preserving
* existing non-empty values.
* - --scope project writes to <repo>/.archon/.env.
* - --force overwrites the target wholesale, still writes a backup.
* - Merge preserves user-added keys not in the proposed content.
*/
describe('writeScopedEnv (#1303)', () => {
const ROOT = join(tmpdir(), 'archon-write-scoped-env-test-' + Date.now());
const HOME_DIR = join(ROOT, 'archon-home');
const REPO_DIR = join(ROOT, 'repo');
let originalArchonHome: string | undefined;
beforeEach(() => {
mkdirSync(HOME_DIR, { recursive: true });
mkdirSync(REPO_DIR, { recursive: true });
originalArchonHome = process.env.ARCHON_HOME;
process.env.ARCHON_HOME = HOME_DIR;
});
afterEach(() => {
if (originalArchonHome === undefined) delete process.env.ARCHON_HOME;
else process.env.ARCHON_HOME = originalArchonHome;
rmSync(ROOT, { recursive: true, force: true });
});
it('fresh home scope writes content with no backup', () => {
const result = writeScopedEnv('DATABASE_URL=sqlite:local\nPORT=3090\n', {
scope: 'home',
repoPath: REPO_DIR,
force: false,
});
expect(result.targetPath).toBe(join(HOME_DIR, '.env'));
expect(result.backupPath).toBeNull();
expect(result.preservedKeys).toEqual([]);
expect(readFileSync(result.targetPath, 'utf-8')).toContain('DATABASE_URL=sqlite:local');
});
it('merge preserves user-added custom keys across re-runs', () => {
// First write
writeScopedEnv('DATABASE_URL=sqlite:local\n', {
scope: 'home',
repoPath: REPO_DIR,
force: false,
});
// User adds a custom var
const envPath = join(HOME_DIR, '.env');
writeFileSync(envPath, readFileSync(envPath, 'utf-8') + 'MY_CUSTOM_SECRET=preserve-me\n');
// Second setup run (proposes a different-shape config)
const result = writeScopedEnv('DATABASE_URL=sqlite:local\nPORT=3090\n', {
scope: 'home',
repoPath: REPO_DIR,
force: false,
});
const merged = parseDotenv(readFileSync(result.targetPath, 'utf-8'));
expect(merged.MY_CUSTOM_SECRET).toBe('preserve-me');
expect(merged.PORT).toBe('3090');
expect(result.backupPath).not.toBeNull();
});
it('merge preserves existing PostgreSQL DATABASE_URL when proposed is SQLite', () => {
const envPath = join(HOME_DIR, '.env');
writeFileSync(envPath, 'DATABASE_URL=postgresql://localhost:5432/mydb\n');
const result = writeScopedEnv(
'# Using SQLite (default) - no DATABASE_URL needed\nDATABASE_URL=\n',
{ scope: 'home', repoPath: REPO_DIR, force: false }
);
const merged = parseDotenv(readFileSync(result.targetPath, 'utf-8'));
expect(merged.DATABASE_URL).toBe('postgresql://localhost:5432/mydb');
expect(result.preservedKeys).toContain('DATABASE_URL');
});
it('merge preserves existing bot tokens', () => {
const envPath = join(HOME_DIR, '.env');
writeFileSync(
envPath,
'SLACK_BOT_TOKEN=xoxb-existing\nCLAUDE_CODE_OAUTH_TOKEN=sk-ant-existing\n'
);
// Proposed content has these keys with different/empty values
writeScopedEnv('SLACK_BOT_TOKEN=xoxb-new-placeholder\nCLAUDE_CODE_OAUTH_TOKEN=\n', {
scope: 'home',
repoPath: REPO_DIR,
force: false,
});
const merged = parseDotenv(readFileSync(join(HOME_DIR, '.env'), 'utf-8'));
expect(merged.SLACK_BOT_TOKEN).toBe('xoxb-existing');
expect(merged.CLAUDE_CODE_OAUTH_TOKEN).toBe('sk-ant-existing');
});
it('--force overwrites wholesale but writes a timestamped backup', () => {
const envPath = join(HOME_DIR, '.env');
writeFileSync(envPath, 'OLD_KEY=old\nDATABASE_URL=postgresql://legacy\n');
const result = writeScopedEnv('DATABASE_URL=sqlite:local\nNEW_KEY=new\n', {
scope: 'home',
repoPath: REPO_DIR,
force: true,
});
expect(result.forced).toBe(true);
expect(result.backupPath).not.toBeNull();
expect(result.backupPath).toMatch(/\.archon-backup-\d{4}-\d{2}-\d{2}T/);
// Backup has the old content
expect(readFileSync(result.backupPath as string, 'utf-8')).toContain('OLD_KEY=old');
// Target has the new content only — OLD_KEY is gone
const newContent = readFileSync(result.targetPath, 'utf-8');
expect(newContent).toContain('DATABASE_URL=sqlite:local');
expect(newContent).toContain('NEW_KEY=new');
expect(newContent).not.toContain('OLD_KEY');
});
it('--force on a non-existent target writes cleanly with no backup', () => {
const result = writeScopedEnv('PORT=3090\n', {
scope: 'home',
repoPath: REPO_DIR,
force: true,
});
expect(result.backupPath).toBeNull();
expect(result.forced).toBe(false); // no existing file means force was effectively a no-op
});
it('--scope project writes to <repo>/.archon/.env, creating the directory', () => {
expect(existsSync(join(REPO_DIR, '.archon'))).toBe(false);
const result = writeScopedEnv('FOO=bar\n', {
scope: 'project',
repoPath: REPO_DIR,
force: false,
});
expect(result.targetPath).toBe(join(REPO_DIR, '.archon', '.env'));
expect(existsSync(result.targetPath)).toBe(true);
expect(existsSync(join(HOME_DIR, '.env'))).toBe(false);
});
it('<repo>/.env is never touched by writeScopedEnv in any scope/mode', () => {
const repoEnvPath = join(REPO_DIR, '.env');
const sentinel = 'USER_SECRET=do-not-touch\n';
writeFileSync(repoEnvPath, sentinel);
// Home scope, merge
writeScopedEnv('FOO=bar\n', { scope: 'home', repoPath: REPO_DIR, force: false });
// Home scope, force
writeScopedEnv('FOO=baz\n', { scope: 'home', repoPath: REPO_DIR, force: true });
// Project scope, merge
writeScopedEnv('FOO=qux\n', { scope: 'project', repoPath: REPO_DIR, force: false });
// Project scope, force
writeScopedEnv('FOO=xyz\n', { scope: 'project', repoPath: REPO_DIR, force: true });
expect(readFileSync(repoEnvPath, 'utf-8')).toBe(sentinel);
});
it('resolveScopedEnvPath returns the archon-owned path for each scope', () => {
expect(resolveScopedEnvPath('home', REPO_DIR)).toBe(join(HOME_DIR, '.env'));
expect(resolveScopedEnvPath('project', REPO_DIR)).toBe(join(REPO_DIR, '.archon', '.env'));
});
it('serializeEnv round-trips through dotenv.parse', () => {
const entries = {
SIMPLE: 'value',
WITH_SPACE: 'hello world',
WITH_HASH: 'value#not-a-comment',
EMPTY: '',
};
const serialized = serializeEnv(entries);
const parsed = parseDotenv(serialized);
expect(parsed.SIMPLE).toBe('value');
expect(parsed.WITH_SPACE).toBe('hello world');
expect(parsed.WITH_HASH).toBe('value#not-a-comment');
expect(parsed.EMPTY).toBe('');
});
it('serializeEnv escapes \\r so bare CRs survive round-trip', () => {
const entries = { WITH_CR: 'line1\rline2', WITH_CRLF: 'a\r\nb' };
const serialized = serializeEnv(entries);
const parsed = parseDotenv(serialized);
expect(parsed.WITH_CR).toBe('line1\rline2');
expect(parsed.WITH_CRLF).toBe('a\r\nb');
});
it('merge treats whitespace-only existing values as empty (replaces them)', () => {
const envPath = join(HOME_DIR, '.env');
writeFileSync(envPath, 'API_KEY= \nNORMAL=keep-me\n');
const result = writeScopedEnv('API_KEY=real-token\nNORMAL=from-wizard\n', {
scope: 'home',
repoPath: REPO_DIR,
force: false,
});
const merged = parseDotenv(readFileSync(result.targetPath, 'utf-8'));
// Whitespace-only API_KEY was replaced by the proposed value.
expect(merged.API_KEY).toBe('real-token');
// Non-empty NORMAL was preserved and reported.
expect(merged.NORMAL).toBe('keep-me');
expect(result.preservedKeys).toContain('NORMAL');
expect(result.preservedKeys).not.toContain('API_KEY');
});
});
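The merge rule these tests pin down can be condensed into a standalone sketch (hypothetical `mergeEnv` helper; the real logic lives inside `writeScopedEnv`):

```typescript
// Minimal sketch of the merge rule (not the real implementation): existing
// non-empty values win, proposed-only keys are added, and user-added keys
// survive untouched. Whitespace-only values count as empty so the wizard
// can still update them.
function mergeEnv(
  existing: Record<string, string>,
  proposed: Record<string, string>
): { merged: Record<string, string>; preservedKeys: string[] } {
  const merged: Record<string, string> = { ...existing };
  const preservedKeys: string[] = [];
  for (const [key, value] of Object.entries(proposed)) {
    const prior = existing[key];
    if (prior === undefined || prior.trim() === '') {
      merged[key] = value; // missing or effectively-empty: take proposed
    } else {
      preservedKeys.push(key); // non-empty existing value wins
    }
  }
  return { merged, preservedKeys };
}

const { merged, preservedKeys } = mergeEnv(
  {
    DATABASE_URL: 'postgresql://localhost:5432/mydb',
    MY_CUSTOM_SECRET: 'preserve-me',
    API_KEY: ' ',
  },
  { DATABASE_URL: '', API_KEY: 'real-token', PORT: '3090' }
);
// merged keeps the PostgreSQL URL and the custom key, fills API_KEY and PORT.
```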


@@ -6,7 +6,17 @@
* - AI assistants (Claude and/or Codex)
* - Platform connections (GitHub, Telegram, Slack, Discord)
*
* Writes configuration to both ~/.archon/.env and <repo>/.env
* Writes configuration to one archon-owned env file, chosen by --scope:
* - 'home' (default) ~/.archon/.env
* - 'project' <repo>/.archon/.env
*
 * Never writes to <repo>/.env; that file is stripped at boot by stripCwdEnv()
* (see #1302 / #1303 three-path model). Writing there would be incoherent
* (values would be silently deleted on the next run).
*
* Writes are merge-only by default: existing non-empty values are preserved,
* user-added custom keys survive, and a timestamped backup is written before
* every rewrite. `--force` skips the merge (proposed wins) but still backs up.
*/
import {
intro,
@@ -22,12 +32,18 @@ import {
cancel,
log,
} from '@clack/prompts';
import { existsSync, readFileSync, writeFileSync, mkdirSync } from 'fs';
import { existsSync, readFileSync, writeFileSync, mkdirSync, copyFileSync, chmodSync } from 'fs';
import { parse as parseDotenv } from 'dotenv';
import { join, dirname } from 'path';
import { BUNDLED_SKILL_FILES } from '../bundled-skill';
import { homedir } from 'os';
import { randomBytes } from 'crypto';
import { spawn, execSync, type ChildProcess } from 'child_process';
import { getRegisteredProviders } from '@archon/providers';
import {
getArchonEnvPath as pathsGetArchonEnvPath,
getRepoArchonEnvPath as pathsGetRepoArchonEnvPath,
} from '@archon/paths';
// =============================================================================
// Types
@@ -43,9 +59,12 @@ interface SetupConfig {
claudeAuthType?: 'global' | 'apiKey' | 'oauthToken';
claudeApiKey?: string;
claudeOauthToken?: string;
/** Absolute path to Claude Code SDK's cli.js. Written as CLAUDE_BIN_PATH
* in ~/.archon/.env. Required in compiled Archon binaries; harmless in dev. */
claudeBinaryPath?: string;
codex: boolean;
codexTokens?: CodexTokens;
defaultAssistant: 'claude' | 'codex';
defaultAssistant: string;
};
platforms: {
github: boolean;
@@ -105,6 +124,10 @@ interface ExistingConfig {
interface SetupOptions {
spawn?: boolean;
repoPath: string;
/** Which archon-owned file to target. Default: 'home'. */
scope?: 'home' | 'project';
/** Skip merge and overwrite the target wholesale (backup still written). Default: false. */
force?: boolean;
}
interface SpawnResult {
@@ -159,6 +182,85 @@ function isCommandAvailable(command: string): boolean {
}
}
/**
* Probe wrappers exported so tests can spy on each tier independently.
* Direct imports of `existsSync` and `execSync` cannot be intercepted by
* `spyOn` (esm rebinding limitation), so we route the probes through these
* thin wrappers and let the test mock them in isolation.
*/
export function probeFileExists(path: string): boolean {
return existsSync(path);
}
export function probeNpmRoot(): string | null {
try {
const out = execSync('npm root -g', {
encoding: 'utf-8',
stdio: ['ignore', 'pipe', 'ignore'],
}).trim();
return out || null;
} catch {
return null;
}
}
export function probeWhichClaude(): string | null {
try {
const checkCmd = process.platform === 'win32' ? 'where' : 'which';
const resolved = execSync(`${checkCmd} claude`, {
encoding: 'utf-8',
stdio: ['ignore', 'pipe', 'ignore'],
}).trim();
// On Windows, `where` can return multiple lines — take the first.
const first = resolved.split(/\r?\n/)[0]?.trim();
return first ?? null;
} catch {
return null;
}
}
/**
* Try to locate the Claude Code executable on disk.
*
* Compiled Archon binaries need an explicit path because the Claude Agent
* SDK's `import.meta.url` resolution is frozen to the build host's filesystem.
* The SDK's `pathToClaudeCodeExecutable` accepts either:
 * - A native compiled binary (from the curl/PowerShell/winget installers; the current default)
 * - A JS `cli.js` (from `npm install -g @anthropic-ai/claude-code`; the older path)
*
* We probe the well-known install locations in order:
* 1. Native installer (`~/.local/bin/claude` on macOS/Linux, `%USERPROFILE%\.local\bin\claude.exe` on Windows)
* 2. npm global `cli.js`
* 3. `which claude` / `where claude` fallback if the user installed via Homebrew, winget, or a custom layout
*
* Returns null on total failure so the caller can prompt the user.
* Detection is best-effort; the caller should let users override.
*
* Exported so the probe order can be tested directly by spying on the
* tier wrappers above (`probeFileExists`, `probeNpmRoot`, `probeWhichClaude`).
*/
export function detectClaudeExecutablePath(): string | null {
// 1. Native installer default location (primary Anthropic-recommended path)
const nativePath =
process.platform === 'win32'
? join(homedir(), '.local', 'bin', 'claude.exe')
: join(homedir(), '.local', 'bin', 'claude');
if (probeFileExists(nativePath)) return nativePath;
// 2. npm global cli.js
const npmRoot = probeNpmRoot();
if (npmRoot) {
const npmCliJs = join(npmRoot, '@anthropic-ai', 'claude-code', 'cli.js');
if (probeFileExists(npmCliJs)) return npmCliJs;
}
// 3. Fallback: resolve via `which` / `where` (Homebrew, winget, custom layouts)
const fromPath = probeWhichClaude();
if (fromPath && probeFileExists(fromPath)) return fromPath;
return null;
}
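The detection order above is a plain first-hit probe chain; a generic sketch (hypothetical `firstExisting` helper, not part of the codebase) shows why a tier-3 hit still has to pass the existence check:

```typescript
// Generic first-hit probe chain: run each tier in order, return the first
// candidate that also passes the existence check, null when every tier misses.
type Probe = () => string | null;

function firstExisting(probes: Probe[], exists: (p: string) => boolean): string | null {
  for (const probe of probes) {
    const candidate = probe();
    if (candidate !== null && exists(candidate)) return candidate;
  }
  return null;
}

// Tier 1 candidate is not on disk, tier 2 misses, tier 3 resolves a real path.
const hit = firstExisting(
  [() => '/home/user/.local/bin/claude', () => null, () => '/opt/homebrew/bin/claude'],
  p => p === '/opt/homebrew/bin/claude'
);

// A stale PATH entry: `which` returns a path but the file is gone, so the
// chain must fall through to null rather than return the dangling path.
const stale = firstExisting([() => '/stale/path/claude'], () => false);
```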
/**
* Get Node.js version if installed, or null if not
*/
@@ -209,7 +311,7 @@ After installation, run: claude /login`,
Install using one of these methods:
Recommended for macOS (no Node.js required):
brew install --cask codex
brew install codex
Or via npm (requires Node.js 18+):
npm install -g @openai/codex
@@ -226,16 +328,19 @@ After installation, run 'codex' to authenticate.`,
};
/**
* Check for existing configuration at ~/.archon/.env
* Check for existing configuration at the selected scope's archon-owned env
 * file. Defaults to home scope for backward compatibility; callers writing to
* project scope must pass a path so the Add/Update/Fresh decision reflects the
* actual target.
*/
export function checkExistingConfig(): ExistingConfig | null {
const envPath = join(getArchonHome(), '.env');
export function checkExistingConfig(envPath?: string): ExistingConfig | null {
const path = envPath ?? join(getArchonHome(), '.env');
if (!existsSync(envPath)) {
if (!existsSync(path)) {
return null;
}
const content = readFileSync(envPath, 'utf-8');
const content = readFileSync(path, 'utf-8');
return {
hasDatabase: hasEnvValue(content, 'DATABASE_URL'),
@@ -352,6 +457,62 @@ function tryReadCodexAuth(): CodexTokens | null {
/**
* Collect Claude authentication method
*/
/**
* Resolve the Claude Code executable path for CLAUDE_BIN_PATH.
* Auto-detects common install locations and falls back to prompting the user.
* Returns undefined if the user declines to configure (setup continues; the
* compiled binary will error with clear instructions on first Claude query).
*/
async function collectClaudeBinaryPath(): Promise<string | undefined> {
const detected = detectClaudeExecutablePath();
if (detected) {
const useDetected = await confirm({
message: `Found Claude Code at ${detected}. Write this to CLAUDE_BIN_PATH?`,
initialValue: true,
});
if (isCancel(useDetected)) {
cancel('Setup cancelled.');
process.exit(0);
}
if (useDetected) return detected;
}
const nativeExample =
process.platform === 'win32' ? '%USERPROFILE%\\.local\\bin\\claude.exe' : '~/.local/bin/claude';
note(
'Compiled Archon binaries need CLAUDE_BIN_PATH set to the Claude Code executable.\n' +
'In dev (`bun run`) this is ignored — the SDK resolves it via node_modules.\n\n' +
'Recommended (Anthropic default — native installer):\n' +
` macOS/Linux: ${nativeExample}\n` +
' Windows: %USERPROFILE%\\.local\\bin\\claude.exe\n\n' +
'Alternative (npm global install):\n' +
' $(npm root -g)/@anthropic-ai/claude-code/cli.js',
'Claude binary path'
);
const customPath = await text({
message: 'Absolute path to the Claude Code executable (leave blank to skip):',
placeholder: nativeExample,
});
if (isCancel(customPath)) {
cancel('Setup cancelled.');
process.exit(0);
}
const trimmed = (customPath ?? '').trim();
if (!trimmed) return undefined;
if (!existsSync(trimmed)) {
log.warning(
`Path does not exist: ${trimmed}. Saving anyway — the compiled binary will error on first use until this is correct.`
);
}
return trimmed;
}
async function collectClaudeAuth(): Promise<{
authType: 'global' | 'apiKey' | 'oauthToken';
apiKey?: string;
@@ -534,7 +695,8 @@ async function collectCodexAuth(): Promise<CodexTokens | null> {
*/
async function collectAIConfig(): Promise<SetupConfig['ai']> {
const assistants = await multiselect({
message: 'Which AI assistant(s) will you use? (↑↓ navigate, space select, enter confirm)',
message:
'Which built-in AI assistant(s) will you use? (↑↓ navigate, space select, enter confirm)',
options: [
{ value: 'claude', label: 'Claude (Recommended)', hint: 'Anthropic Claude Code SDK' },
{ value: 'codex', label: 'Codex', hint: 'OpenAI Codex SDK' },
@@ -653,13 +815,14 @@ After upgrading, run 'archon setup' again.`,
return {
claude: false,
codex: false,
defaultAssistant: 'claude',
defaultAssistant: getRegisteredProviders().find(p => p.builtIn)?.id ?? 'claude',
};
}
let claudeAuthType: 'global' | 'apiKey' | 'oauthToken' | undefined;
let claudeApiKey: string | undefined;
let claudeOauthToken: string | undefined;
let claudeBinaryPath: string | undefined;
let codexTokens: CodexTokens | undefined;
// Collect Claude auth if selected
@@ -668,6 +831,7 @@ After upgrading, run 'archon setup' again.`,
claudeAuthType = claudeAuth.authType;
claudeApiKey = claudeAuth.apiKey;
claudeOauthToken = claudeAuth.oauthToken;
claudeBinaryPath = await collectClaudeBinaryPath();
}
// Collect Codex auth if selected
@@ -676,16 +840,21 @@ After upgrading, run 'archon setup' again.`,
codexTokens = tokens ?? undefined;
}
// Determine default assistant
let defaultAssistant: 'claude' | 'codex' = 'claude';
// Determine default assistant — use the registry, but keep setup/auth flows built-in only.
// Default to first registered built-in provider rather than hardcoding 'claude'.
let defaultAssistant = getRegisteredProviders().find(p => p.builtIn)?.id ?? 'claude';
if (hasClaude && hasCodex) {
const providerChoices = getRegisteredProviders()
.filter(p => p.builtIn)
.map(p => ({
value: p.id,
label: p.id === 'claude' ? `${p.displayName} (Recommended)` : p.displayName,
}));
const defaultChoice = await select({
message: 'Which should be the default AI assistant?',
options: [
{ value: 'claude', label: 'Claude (Recommended)' },
{ value: 'codex', label: 'Codex' },
],
options: providerChoices,
});
if (isCancel(defaultChoice)) {
@@ -703,6 +872,7 @@ After upgrading, run 'archon setup' again.`,
claudeAuthType,
claudeApiKey,
claudeOauthToken,
...(claudeBinaryPath !== undefined ? { claudeBinaryPath } : {}),
codex: hasCodex,
codexTokens,
defaultAssistant,
@@ -1063,6 +1233,9 @@ export function generateEnvContent(config: SetupConfig): string {
lines.push('CLAUDE_USE_GLOBAL_AUTH=false');
lines.push(`CLAUDE_CODE_OAUTH_TOKEN=${config.ai.claudeOauthToken}`);
}
if (config.ai.claudeBinaryPath) {
lines.push(`CLAUDE_BIN_PATH=${config.ai.claudeBinaryPath}`);
}
} else {
lines.push('# Claude not configured');
}
@@ -1139,8 +1312,12 @@ export function generateEnvContent(config: SetupConfig): string {
}
// Server
// PORT is intentionally omitted: both the Hono server (packages/core/src/utils/port-allocation.ts)
// and the Vite dev proxy (packages/web/vite.config.ts) default to 3090 when unset, which keeps
// them in sync. Writing a fixed PORT here risked a mismatch if ~/.archon/.env leaks a PORT that
// the Vite proxy (which only reads repo-local .env) never sees — see #1152.
lines.push('# Server');
lines.push('PORT=3000');
lines.push('# PORT=3090 # Default: 3090. Uncomment to override.');
lines.push('');
// Concurrency
@@ -1151,28 +1328,120 @@ export function generateEnvContent(config: SetupConfig): string {
}
/**
* Write .env files to both global and repo locations
* Resolve the target path for the selected scope. Delegates to `@archon/paths`
* so Docker (`/.archon`), the `ARCHON_HOME` override, and the "undefined"
* literal guard behave identically to the loader. Never resolves to
 * `<repoPath>/.env`; that path belongs to the user.
*/
function writeEnvFiles(
content: string,
repoPath: string
): { globalPath: string; repoEnvPath: string } {
const archonHome = getArchonHome();
const globalPath = join(archonHome, '.env');
const repoEnvPath = join(repoPath, '.env');
export function resolveScopedEnvPath(scope: 'home' | 'project', repoPath: string): string {
if (scope === 'project') return pathsGetRepoArchonEnvPath(repoPath);
return pathsGetArchonEnvPath();
}
// Create ~/.archon/ if needed
if (!existsSync(archonHome)) {
mkdirSync(archonHome, { recursive: true });
/**
* Serialize a key/value map back to `KEY=value` lines. Values with whitespace,
* `#`, `"`, `'`, `\n`, or `\r` are double-quoted with `\\`, `"`, `\n`, `\r`
* escaped so round-tripping through dotenv.parse is stable.
*/
export function serializeEnv(entries: Record<string, string>): string {
const lines: string[] = [];
for (const [key, rawValue] of Object.entries(entries)) {
const value = rawValue;
const needsQuoting = /[\s#"'\n\r]/.test(value) || value === '';
if (needsQuoting) {
const escaped = value
.replace(/\\/g, '\\\\')
.replace(/"/g, '\\"')
.replace(/\n/g, '\\n')
.replace(/\r/g, '\\r');
lines.push(`${key}="${escaped}"`);
} else {
lines.push(`${key}=${value}`);
}
}
return lines.join('\n') + (lines.length > 0 ? '\n' : '');
}
/**
* Produce a filesystem-safe ISO timestamp (no `:` or `.` characters).
*/
function backupTimestamp(): string {
return new Date().toISOString().replace(/[:.]/g, '-');
}
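A condensed sketch of the quoting rule and the timestamp format (hypothetical `quoteEnvValue`/`backupStamp` names; the real code is `serializeEnv` and `backupTimestamp` above):

```typescript
// Sketch of the quoting rule: values containing whitespace, '#', quotes, or
// CR/LF are double-quoted with backslash escapes so dotenv-style parsing
// reads them back unchanged.
function quoteEnvValue(value: string): string {
  const needsQuoting = /[\s#"'\n\r]/.test(value) || value === '';
  if (!needsQuoting) return value;
  const escaped = value
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n')
    .replace(/\r/g, '\\r');
  return `"${escaped}"`;
}

// Filesystem-safe ISO timestamp: ':' and '.' are invalid in Windows filenames.
function backupStamp(now: Date): string {
  return now.toISOString().replace(/[:.]/g, '-');
}

const stamped = backupStamp(new Date('2026-04-21T14:52:56.000Z'));
// stamped === '2026-04-21T14-52-56-000Z'
```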
interface WriteScopedEnvResult {
targetPath: string;
backupPath: string | null;
/** Keys present in the existing file that were preserved against the proposed set. */
preservedKeys: string[];
/** True when `--force` overrode the merge. */
forced: boolean;
}
/**
* Write env content to exactly one archon-owned file, selected by scope.
* Merge-only by default (existing non-empty values win, user-added keys
* survive). Backs up the existing file (if any) before every rewrite, even
* when `--force` is set.
*/
export function writeScopedEnv(
content: string,
options: { scope: 'home' | 'project'; repoPath: string; force: boolean }
): WriteScopedEnvResult {
const targetPath = resolveScopedEnvPath(options.scope, options.repoPath);
const parentDir = dirname(targetPath);
if (!existsSync(parentDir)) {
mkdirSync(parentDir, { recursive: true });
}
// Write to global location
writeFileSync(globalPath, content);
const exists = existsSync(targetPath);
let backupPath: string | null = null;
if (exists) {
backupPath = `${targetPath}.archon-backup-${backupTimestamp()}`;
copyFileSync(targetPath, backupPath);
// Backups carry tokens/secrets — match the 0o600 we set on the live file.
chmodSync(backupPath, 0o600);
}
// Write to repo location
writeFileSync(repoEnvPath, content);
const preservedKeys: string[] = [];
let finalContent: string;
return { globalPath, repoEnvPath };
if (options.force || !exists) {
finalContent = content;
if (options.force && backupPath) {
process.stderr.write(
`[archon] --force: overwriting ${targetPath} (backup at ${backupPath})\n`
);
}
} else {
// Merge: existing non-empty values win; proposed-only keys are added;
// existing-only keys (user customizations) are preserved verbatim.
const existingRaw = readFileSync(targetPath, 'utf-8');
const existing = parseDotenv(existingRaw);
const proposed = parseDotenv(content);
const merged: Record<string, string> = { ...existing };
for (const [key, value] of Object.entries(proposed)) {
const prior = existing[key];
// Treat whitespace-only existing values as empty — otherwise a
// copy-paste stray ` ` would silently defeat the wizard's update for
// that key forever.
const priorIsEmpty = prior === undefined || prior.trim() === '';
if (!(key in existing) || priorIsEmpty) {
merged[key] = value;
} else {
preservedKeys.push(key);
}
}
finalContent = serializeEnv(merged);
}
// 0o600 — env files hold secrets. Prevents group/world-readable writes on a
// permissive umask. writeFileSync's default mode is 0o666 & ~umask.
writeFileSync(targetPath, finalContent, { mode: 0o600 });
// writeFileSync preserves mode for existing files; chmod guarantees 0o600
// even when overwriting a file that pre-existed with looser permissions.
chmodSync(targetPath, 0o600);
return { targetPath, backupPath, preservedKeys, forced: options.force && exists };
}
/**
@@ -1203,7 +1472,7 @@ export function copyArchonSkill(targetPath: string): void {
function trySpawn(
command: string,
args: string[],
options: { detached: boolean; stdio: 'ignore'; shell?: boolean }
options: { detached: boolean; stdio: 'ignore' }
): boolean {
try {
const child: ChildProcess = spawn(command, args, options);
@@ -1238,7 +1507,6 @@ function spawnWindowsTerminal(repoPath: string): SpawnResult {
trySpawn('cmd.exe', ['/c', 'start', '""', '/D', repoPath, 'cmd', '/k', 'archon setup'], {
detached: true,
stdio: 'ignore',
shell: true,
})
) {
return { success: true };
@@ -1366,8 +1634,28 @@ export async function setupCommand(options: SetupOptions): Promise<void> {
// Interactive setup flow
intro('Archon Setup Wizard');
// Check for existing configuration
const existing = checkExistingConfig();
// Resolve scope + target path up-front so everything downstream (existing-
// config check, merge, write) agrees on which file we're touching.
const scope: 'home' | 'project' = options.scope ?? 'home';
const force = options.force ?? false;
const targetEnvPath = resolveScopedEnvPath(scope, options.repoPath);
// If a pre-existing <repo>/.env is present, tell the operator once that
// archon does NOT manage it — avoids confusion for users upgrading from
// versions that used to write there.
const legacyRepoEnv = join(options.repoPath, '.env');
if (existsSync(legacyRepoEnv)) {
log.info(
`Note: ${legacyRepoEnv} exists but is not managed by archon.\n` +
' Values there are stripped from the archon process at runtime (safety guard).\n' +
' Put archon env vars in ~/.archon/.env (home scope) or ' +
`${join(options.repoPath, '.archon', '.env')} (project scope).`
);
}
// Check for existing configuration at the selected scope (not unconditionally
// ~/.archon/.env) so the Add/Update/Fresh decision reflects the actual target.
const existing = checkExistingConfig(targetEnvPath);
type SetupMode = 'fresh' | 'add' | 'update';
let mode: SetupMode = 'fresh';
@@ -1420,7 +1708,7 @@ export async function setupCommand(options: SetupOptions): Promise<void> {
ai: {
claude: existing?.hasClaude ?? false,
codex: existing?.hasCodex ?? false,
defaultAssistant: 'claude',
defaultAssistant: getRegisteredProviders().find(p => p.builtIn)?.id ?? 'claude',
},
platforms: {
github: existing?.platforms.github ?? false,
@@ -1489,13 +1777,41 @@ export async function setupCommand(options: SetupOptions): Promise<void> {
config.botDisplayName = await collectBotDisplayName();
}
// Generate and write configuration
s.start('Writing configuration files...');
// Generate and write configuration. Wrap in try/catch so any fs exception
// (permission denied, read-only FS, backup copy failure, etc.) stops the
// spinner cleanly and surfaces an actionable error instead of a raw stack
// trace after the user has filled out the entire wizard.
s.start('Writing configuration...');
const envContent = generateEnvContent(config);
const { globalPath, repoEnvPath } = writeEnvFiles(envContent, options.repoPath);
let writeResult: ReturnType<typeof writeScopedEnv>;
try {
writeResult = writeScopedEnv(envContent, {
scope,
repoPath: options.repoPath,
force,
});
} catch (error) {
s.stop('Failed to write configuration');
const err = error as NodeJS.ErrnoException;
const code = err.code ? ` (${err.code})` : '';
cancel(`Could not write ${targetEnvPath}${code}: ${err.message}`);
process.exit(1);
}
s.stop('Configuration files written');
s.stop('Configuration written');
// Tell the operator exactly what happened — especially that <repo>/.env was
// NOT touched, because prior versions wrote there and this is the biggest
// behavior change for returning users.
if (writeResult.preservedKeys.length > 0) {
log.info(
`Preserved ${writeResult.preservedKeys.length} existing value(s) (use --force to overwrite): ${writeResult.preservedKeys.join(', ')}`
);
}
if (writeResult.backupPath) {
log.info(`Backup written to ${writeResult.backupPath}`);
}
// Offer to install the Archon skill
const shouldCopySkill = await confirm({
@@ -1596,9 +1912,8 @@ export async function setupCommand(options: SetupOptions): Promise<void> {
`Default: ${config.ai.defaultAssistant}`,
`Platforms: ${configuredPlatforms.length > 0 ? configuredPlatforms.join(', ') : 'None'}`,
'',
'Files written:',
` ${globalPath}`,
` ${repoEnvPath}`,
`File written (${scope} scope):`,
` ${writeResult.targetPath}`,
];
if (config.platforms.github && config.github) {
@@ -1619,7 +1934,7 @@ export async function setupCommand(options: SetupOptions): Promise<void> {
// Additional options note
note(
'Other settings you can customize in ~/.archon/.env:\n' +
' - PORT (default: 3000)\n' +
' - PORT (default: 3090)\n' +
' - MAX_CONCURRENT_CONVERSATIONS (default: 10)\n' +
' - *_STREAMING_MODE (stream | batch per platform)\n\n' +
'These defaults work well for most users.',


@@ -8,7 +8,9 @@ import { discoverWorkflowsWithConfig } from '@archon/workflows/workflow-discover
import {
validateWorkflowResources,
validateCommand,
validateScript,
discoverAvailableCommands,
discoverAvailableScripts,
findSimilar,
makeWorkflowResult,
} from '@archon/workflows/validator';
@@ -16,6 +18,7 @@ import type {
ValidationIssue,
WorkflowValidationResult,
ValidationConfig,
ScriptValidationResult,
} from '@archon/workflows/validator';
import { loadConfig, loadRepoConfig } from '@archon/core';
@@ -52,22 +55,22 @@ function formatIssue(issue: ValidationIssue, indent = ' '): string {
return line;
}
function formatWorkflowResult(result: WorkflowValidationResult): string {
const errors = result.issues.filter(i => i.level === 'error');
const warnings = result.issues.filter(i => i.level === 'warning');
function formatValidationResult(displayName: string, issues: ValidationIssue[]): string {
const hasErrors = issues.some(i => i.level === 'error');
const hasWarnings = issues.some(i => i.level === 'warning');
const statusLabel = hasErrors ? 'ERRORS' : hasWarnings ? 'WARNINGS' : 'ok';
const statusLabel = errors.length > 0 ? 'ERRORS' : warnings.length > 0 ? 'WARNINGS' : 'ok';
const namePad = result.workflowName.padEnd(40, ' ');
let output = ` ${namePad} ${statusLabel}`;
for (const issue of result.issues) {
let output = ` ${displayName.padEnd(40, ' ')} ${statusLabel}`;
for (const issue of issues) {
output += '\n' + formatIssue(issue);
}
return output;
}
function formatWorkflowResult(result: WorkflowValidationResult): string {
return formatValidationResult(result.workflowName, result.issues);
}
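The status precedence the shared formatter applies can be sketched in isolation (hypothetical `statusLabel` helper mirroring the logic above):

```typescript
type Level = 'error' | 'warning' | 'info';
interface Issue {
  level: Level;
  message: string;
}

// Precedence: any error wins over any warning; an issue-free result is 'ok'.
function statusLabel(issues: Issue[]): string {
  if (issues.some(i => i.level === 'error')) return 'ERRORS';
  if (issues.some(i => i.level === 'warning')) return 'WARNINGS';
  return 'ok';
}

const clean = statusLabel([]);
const warned = statusLabel([{ level: 'warning', message: 'unused input' }]);
const failed = statusLabel([
  { level: 'warning', message: 'unused input' },
  { level: 'error', message: 'missing command' },
]);
```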
// =============================================================================
// Workflow validation command
// =============================================================================
@@ -82,6 +85,8 @@ export async function validateWorkflowsCommand(
json?: boolean
): Promise<number> {
const config = await buildValidationConfig(cwd);
const mergedConfig = await loadConfig(cwd);
const defaultProvider = mergedConfig.assistant;
const { workflows: workflowEntries, errors: loadErrors } = await discoverWorkflowsWithConfig(
cwd,
loadConfig
@@ -102,7 +107,7 @@ export async function validateWorkflowsCommand(
// Validate successfully parsed workflows (Level 3)
for (const { workflow } of workflowEntries) {
const issues = await validateWorkflowResources(workflow, cwd, config);
const issues = await validateWorkflowResources(workflow, cwd, config, defaultProvider);
results.push(makeWorkflowResult(workflow.name, issues));
}
@@ -175,11 +180,16 @@ export async function validateWorkflowsCommand(
}
// =============================================================================
// Command validation command
// Command and script validation command
// =============================================================================
function formatScriptResult(result: ScriptValidationResult): string {
return formatValidationResult(`[script] ${result.scriptName}`, result.issues);
}
/**
* Validate all commands or a specific command.
* Also validates scripts from .archon/scripts/ alongside commands.
* Returns exit code: 0 = all valid, 1 = errors found.
*/
export async function validateCommandsCommand(
@@ -208,41 +218,50 @@ export async function validateCommandsCommand(
// Validate all commands
const allCommands = await discoverAvailableCommands(cwd, config);
const commandResults = await Promise.all(
allCommands.map(cmd => validateCommand(cmd, cwd, config))
);
if (allCommands.length === 0) {
if (jsonOutput) {
console.log(JSON.stringify({ results: [], summary: { total: 0, valid: 0, errors: 0 } }));
} else {
console.log('\nNo commands found.');
}
return 0;
}
// Validate all scripts
const allScripts = await discoverAvailableScripts(cwd);
const scriptResults = await Promise.all(allScripts.map(s => validateScript(s.name, cwd)));
const results = await Promise.all(allCommands.map(cmd => validateCommand(cmd, cwd, config)));
const totalErrors = results.filter(r => !r.valid).length;
const totalCommandErrors = commandResults.filter(r => !r.valid).length;
const totalScriptErrors = scriptResults.filter(r => !r.valid).length;
const totalErrors = totalCommandErrors + totalScriptErrors;
if (jsonOutput) {
console.log(
JSON.stringify({
results,
results: commandResults,
scripts: scriptResults,
summary: {
total: results.length,
valid: results.length - totalErrors,
total: commandResults.length + scriptResults.length,
valid: commandResults.length + scriptResults.length - totalErrors,
errors: totalErrors,
},
})
);
} else {
console.log(`\nValidating commands in ${cwd}\n`);
for (const result of results) {
if (commandResults.length === 0 && scriptResults.length === 0) {
console.log('\nNo commands or scripts found.');
return 0;
}
console.log(`\nValidating commands and scripts in ${cwd}\n`);
for (const result of commandResults) {
const statusLabel = result.valid ? 'ok' : 'ERRORS';
console.log(` ${result.commandName.padEnd(40, ' ')} ${statusLabel}`);
for (const issue of result.issues) {
console.log(formatIssue(issue));
}
}
console.log(`\nResults: ${results.length - totalErrors} valid, ${totalErrors} with errors`);
for (const result of scriptResults) {
console.log(formatScriptResult(result));
}
console.log(
`\nResults: ${commandResults.length + scriptResults.length - totalErrors} valid, ${totalErrors} with errors`
);
}
return totalErrors > 0 ? 1 : 0;
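The summary arithmetic in this hunk folds command and script results into one count. A minimal sketch of that math, using a hypothetical `Result` shape rather than the project's real types:

```typescript
// Stand-in for the project's validation result types (hypothetical shape).
interface Result {
  valid: boolean;
}

// Combine command and script results into one summary, as the hunk above does:
// errors are counted per list, then totals are merged.
function summarize(commandResults: Result[], scriptResults: Result[]) {
  const totalErrors =
    commandResults.filter(r => !r.valid).length +
    scriptResults.filter(r => !r.valid).length;
  const total = commandResults.length + scriptResults.length;
  return { total, valid: total - totalErrors, errors: totalErrors };
}

const summary = summarize([{ valid: true }, { valid: false }], [{ valid: true }]);
// summary -> { total: 3, valid: 2, errors: 1 }
```

The exit code then follows directly: nonzero `errors` maps to exit code 1, otherwise 0.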


@@ -30,7 +30,8 @@ interface PackageJson {
* Get version for development mode (reads package.json)
*/
async function getDevVersion(): Promise<{ name: string; version: string }> {
const pkgPath = join(SCRIPT_DIR, '../../package.json');
// Read root package.json (monorepo version), not the CLI package's own
const pkgPath = join(SCRIPT_DIR, '../../../../package.json');
let content: string;
try {


@@ -107,6 +107,7 @@ mock.module('@archon/core/db/conversations', () => ({
getOrCreateConversation: mock(() =>
Promise.resolve({ id: 'conv-123', platform_type: 'cli', platform_conversation_id: 'cli-123' })
),
getConversationById: mock(() => Promise.resolve(null)),
updateConversation: mock(() => Promise.resolve()),
}));
@@ -309,7 +310,7 @@ describe('workflowListCommand', () => {
expect(consoleSpy).toHaveBeenCalledWith(expect.stringContaining('Found 1 workflow(s)'));
});
it('passes globalSearchPath to discoverWorkflowsWithConfig', async () => {
it('calls discoverWorkflowsWithConfig with (cwd, loadConfig) — home scope is internal', async () => {
const { discoverWorkflowsWithConfig } = await import('@archon/workflows/workflow-discovery');
(discoverWorkflowsWithConfig as ReturnType<typeof mock>).mockResolvedValueOnce({
workflows: [],
@@ -318,11 +319,9 @@ describe('workflowListCommand', () => {
await workflowListCommand('/test/path');
expect(discoverWorkflowsWithConfig).toHaveBeenCalledWith(
'/test/path',
expect.any(Function),
expect.objectContaining({ globalSearchPath: '/home/test/.archon' })
);
// After the globalSearchPath refactor, discovery reads ~/.archon/workflows/
// on every call with no option — every caller inherits home-scope for free.
expect(discoverWorkflowsWithConfig).toHaveBeenCalledWith('/test/path', expect.any(Function));
});
it('should throw error when discoverWorkflows fails', async () => {
@@ -866,6 +865,146 @@ describe('workflowRunCommand', () => {
expect(createCallsAfter).toBe(createCallsBefore);
});
// -------------------------------------------------------------------------
// Workflow-level `worktree.enabled` policy
// -------------------------------------------------------------------------
it('skips isolation when workflow YAML pins worktree.enabled: false', async () => {
const { discoverWorkflowsWithConfig } = await import('@archon/workflows/workflow-discovery');
const { executeWorkflow } = await import('@archon/workflows/executor');
const conversationDb = await import('@archon/core/db/conversations');
const codebaseDb = await import('@archon/core/db/codebases');
const isolation = await import('@archon/isolation');
const getIsolationProviderMock = isolation.getIsolationProvider as ReturnType<typeof mock>;
const providerBefore = getIsolationProviderMock.mock.results.at(-1)?.value as
| { create: ReturnType<typeof mock> }
| undefined;
const createCallsBefore = providerBefore?.create.mock.calls.length ?? 0;
(discoverWorkflowsWithConfig as ReturnType<typeof mock>).mockResolvedValueOnce({
workflows: [
makeTestWorkflowWithSource({
name: 'triage',
description: 'Read-only triage',
worktree: { enabled: false },
}),
],
errors: [],
});
(conversationDb.getOrCreateConversation as ReturnType<typeof mock>).mockResolvedValueOnce({
id: 'conv-123',
});
(codebaseDb.findCodebaseByDefaultCwd as ReturnType<typeof mock>).mockResolvedValueOnce({
id: 'cb-123',
default_cwd: '/test/path',
});
(conversationDb.updateConversation as ReturnType<typeof mock>).mockResolvedValueOnce(undefined);
(executeWorkflow as ReturnType<typeof mock>).mockResolvedValueOnce({
success: true,
workflowRunId: 'run-123',
});
// No flags — policy alone should disable isolation
await workflowRunCommand('/test/path', 'triage', 'go', {});
const providerAfter = getIsolationProviderMock.mock.results.at(-1)?.value as
| { create: ReturnType<typeof mock> }
| undefined;
const createCallsAfter = providerAfter?.create.mock.calls.length ?? 0;
expect(createCallsAfter).toBe(createCallsBefore);
});
it('throws when workflow pins worktree.enabled: false but caller passes --branch', async () => {
const { discoverWorkflowsWithConfig } = await import('@archon/workflows/workflow-discovery');
(discoverWorkflowsWithConfig as ReturnType<typeof mock>).mockResolvedValueOnce({
workflows: [
makeTestWorkflowWithSource({
name: 'triage',
description: 'Read-only triage',
worktree: { enabled: false },
}),
],
errors: [],
});
await expect(
workflowRunCommand('/test/path', 'triage', 'go', { branchName: 'feat-x' })
).rejects.toThrow(/worktree\.enabled: false/);
});
it('throws when workflow pins worktree.enabled: false but caller passes --from', async () => {
const { discoverWorkflowsWithConfig } = await import('@archon/workflows/workflow-discovery');
(discoverWorkflowsWithConfig as ReturnType<typeof mock>).mockResolvedValueOnce({
workflows: [
makeTestWorkflowWithSource({
name: 'triage',
description: 'Read-only triage',
worktree: { enabled: false },
}),
],
errors: [],
});
await expect(
workflowRunCommand('/test/path', 'triage', 'go', { fromBranch: 'dev' })
).rejects.toThrow(/worktree\.enabled: false/);
});
it('accepts worktree.enabled: false + --no-worktree as redundant (no error)', async () => {
const { discoverWorkflowsWithConfig } = await import('@archon/workflows/workflow-discovery');
const { executeWorkflow } = await import('@archon/workflows/executor');
const conversationDb = await import('@archon/core/db/conversations');
const codebaseDb = await import('@archon/core/db/codebases');
(discoverWorkflowsWithConfig as ReturnType<typeof mock>).mockResolvedValueOnce({
workflows: [
makeTestWorkflowWithSource({
name: 'triage',
description: 'Read-only triage',
worktree: { enabled: false },
}),
],
errors: [],
});
(conversationDb.getOrCreateConversation as ReturnType<typeof mock>).mockResolvedValueOnce({
id: 'conv-123',
});
(codebaseDb.findCodebaseByDefaultCwd as ReturnType<typeof mock>).mockResolvedValueOnce({
id: 'cb-123',
default_cwd: '/test/path',
});
(conversationDb.updateConversation as ReturnType<typeof mock>).mockResolvedValueOnce(undefined);
(executeWorkflow as ReturnType<typeof mock>).mockResolvedValueOnce({
success: true,
workflowRunId: 'run-123',
});
// Should not throw — redundant, not contradictory
await workflowRunCommand('/test/path', 'triage', 'go', { noWorktree: true });
});
it('throws when workflow pins worktree.enabled: true but caller passes --no-worktree', async () => {
const { discoverWorkflowsWithConfig } = await import('@archon/workflows/workflow-discovery');
(discoverWorkflowsWithConfig as ReturnType<typeof mock>).mockResolvedValueOnce({
workflows: [
makeTestWorkflowWithSource({
name: 'build',
description: 'Requires a worktree',
worktree: { enabled: true },
}),
],
errors: [],
});
await expect(
workflowRunCommand('/test/path', 'build', 'go', { noWorktree: true })
).rejects.toThrow(/worktree\.enabled: true/);
});
it('throws when isolation cannot be created due to missing codebase', async () => {
const { discoverWorkflowsWithConfig } = await import('@archon/workflows/workflow-discovery');
const conversationDb = await import('@archon/core/db/conversations');
@@ -974,6 +1113,249 @@ describe('workflowRunCommand', () => {
consoleWarnSpy.mockRestore();
}
});
it('sends dispatch message before executeWorkflow with correct metadata', async () => {
const { discoverWorkflowsWithConfig } = await import('@archon/workflows/workflow-discovery');
const { executeWorkflow } = await import('@archon/workflows/executor');
const conversationDb = await import('@archon/core/db/conversations');
const codebaseDb = await import('@archon/core/db/codebases');
const messagesDb = await import('@archon/core/db/messages');
(discoverWorkflowsWithConfig as ReturnType<typeof mock>).mockResolvedValueOnce({
workflows: [makeTestWorkflowWithSource({ name: 'assist', description: 'Help' })],
errors: [],
});
(conversationDb.getOrCreateConversation as ReturnType<typeof mock>).mockResolvedValueOnce({
id: 'conv-123',
});
(codebaseDb.findCodebaseByDefaultCwd as ReturnType<typeof mock>).mockResolvedValueOnce(null);
(conversationDb.updateConversation as ReturnType<typeof mock>).mockResolvedValueOnce(undefined);
// Track call order for assistant messages only (user message is added first via addMessage directly)
const callOrder: string[] = [];
(messagesDb.addMessage as ReturnType<typeof mock>).mockImplementation(
async (_dbId: unknown, role: unknown, content: unknown) => {
if (role === 'assistant') {
callOrder.push(`addMessage:${String(content)}`);
}
}
);
(executeWorkflow as ReturnType<typeof mock>).mockImplementation(async () => {
callOrder.push('executeWorkflow');
return { success: true, workflowRunId: 'run-1' };
});
await workflowRunCommand('/test/path', 'assist', 'hello', { noWorktree: true });
// Dispatch assistant message fires before executeWorkflow
expect(callOrder[0]).toContain('Dispatching workflow');
expect(callOrder[1]).toBe('executeWorkflow');
// Correct metadata shape
expect(messagesDb.addMessage).toHaveBeenCalledWith(
expect.any(String),
'assistant',
'Dispatching workflow: **assist**',
expect.objectContaining({
category: 'workflow_dispatch_status',
workflowDispatch: expect.objectContaining({
workflowName: 'assist',
workerConversationId: expect.stringMatching(/^cli-/),
}),
})
);
});
it('sends result card when executeWorkflow returns a summary', async () => {
const { discoverWorkflowsWithConfig } = await import('@archon/workflows/workflow-discovery');
const { executeWorkflow } = await import('@archon/workflows/executor');
const conversationDb = await import('@archon/core/db/conversations');
const codebaseDb = await import('@archon/core/db/codebases');
const messagesDb = await import('@archon/core/db/messages');
(discoverWorkflowsWithConfig as ReturnType<typeof mock>).mockResolvedValueOnce({
workflows: [makeTestWorkflowWithSource({ name: 'assist', description: 'Help' })],
errors: [],
});
(conversationDb.getOrCreateConversation as ReturnType<typeof mock>).mockResolvedValueOnce({
id: 'conv-123',
});
(codebaseDb.findCodebaseByDefaultCwd as ReturnType<typeof mock>).mockResolvedValueOnce(null);
(conversationDb.updateConversation as ReturnType<typeof mock>).mockResolvedValueOnce(undefined);
(executeWorkflow as ReturnType<typeof mock>).mockResolvedValueOnce({
success: true,
workflowRunId: 'run-42',
summary: 'All steps completed. Branch pushed.',
});
(messagesDb.addMessage as ReturnType<typeof mock>).mockClear();
await workflowRunCommand('/test/path', 'assist', 'hello', { noWorktree: true });
expect(messagesDb.addMessage).toHaveBeenCalledWith(
expect.any(String),
'assistant',
'All steps completed. Branch pushed.',
expect.objectContaining({
category: 'workflow_result',
workflowResult: { workflowName: 'assist', runId: 'run-42' },
})
);
});
it('does not send result card when executeWorkflow has no summary', async () => {
const { discoverWorkflowsWithConfig } = await import('@archon/workflows/workflow-discovery');
const { executeWorkflow } = await import('@archon/workflows/executor');
const conversationDb = await import('@archon/core/db/conversations');
const codebaseDb = await import('@archon/core/db/codebases');
const messagesDb = await import('@archon/core/db/messages');
(discoverWorkflowsWithConfig as ReturnType<typeof mock>).mockResolvedValueOnce({
workflows: [makeTestWorkflowWithSource({ name: 'assist', description: 'Help' })],
errors: [],
});
(conversationDb.getOrCreateConversation as ReturnType<typeof mock>).mockResolvedValueOnce({
id: 'conv-123',
});
(codebaseDb.findCodebaseByDefaultCwd as ReturnType<typeof mock>).mockResolvedValueOnce(null);
(conversationDb.updateConversation as ReturnType<typeof mock>).mockResolvedValueOnce(undefined);
(executeWorkflow as ReturnType<typeof mock>).mockResolvedValueOnce({
success: true,
workflowRunId: 'run-1',
// no summary field
});
(messagesDb.addMessage as ReturnType<typeof mock>).mockClear();
await workflowRunCommand('/test/path', 'assist', 'hello', { noWorktree: true });
// Only dispatch addMessage call, no result card
const resultCalls = (messagesDb.addMessage as ReturnType<typeof mock>).mock.calls.filter(
(args: unknown[]) => {
const meta = args[3] as Record<string, unknown> | undefined;
return meta?.category === 'workflow_result';
}
);
expect(resultCalls).toHaveLength(0);
});
it('does not throw and logs warn when result message DB persist fails', async () => {
const { discoverWorkflowsWithConfig } = await import('@archon/workflows/workflow-discovery');
const { executeWorkflow } = await import('@archon/workflows/executor');
const conversationDb = await import('@archon/core/db/conversations');
const codebaseDb = await import('@archon/core/db/codebases');
const messagesDb = await import('@archon/core/db/messages');
(discoverWorkflowsWithConfig as ReturnType<typeof mock>).mockResolvedValueOnce({
workflows: [makeTestWorkflowWithSource({ name: 'assist', description: 'Help' })],
errors: [],
});
(conversationDb.getOrCreateConversation as ReturnType<typeof mock>).mockResolvedValueOnce({
id: 'conv-123',
});
(codebaseDb.findCodebaseByDefaultCwd as ReturnType<typeof mock>).mockResolvedValueOnce(null);
(conversationDb.updateConversation as ReturnType<typeof mock>).mockResolvedValueOnce(undefined);
(executeWorkflow as ReturnType<typeof mock>).mockResolvedValueOnce({
success: true,
workflowRunId: 'run-1',
summary: 'Done.',
});
// addMessage is called three times: user message persist, dispatch, result
// CLIAdapter internally catches DB errors — it logs 'cli_message_persist_failed' and does not throw.
// Verify workflowRunCommand does not throw even when the result DB write fails.
(messagesDb.addMessage as ReturnType<typeof mock>)
.mockResolvedValueOnce(undefined) // user message persist succeeds
.mockResolvedValueOnce(undefined) // dispatch succeeds
.mockRejectedValueOnce(new Error('DB gone')); // result fails (caught inside CLIAdapter)
// Should not throw — the CLIAdapter swallows the DB error and logs a warn
await expect(
workflowRunCommand('/test/path', 'assist', 'hello', { noWorktree: true })
).resolves.toBeUndefined();
// CLIAdapter logs 'cli_message_persist_failed' when addMessage throws internally
expect(mockLogger.warn).toHaveBeenCalledWith(
expect.objectContaining({ err: expect.any(Error) }),
'cli_message_persist_failed'
);
});
it('does not throw and continues to executeWorkflow when dispatch sendMessage fails', async () => {
const { discoverWorkflowsWithConfig } = await import('@archon/workflows/workflow-discovery');
const { executeWorkflow } = await import('@archon/workflows/executor');
const conversationDb = await import('@archon/core/db/conversations');
const codebaseDb = await import('@archon/core/db/codebases');
const messagesDb = await import('@archon/core/db/messages');
(discoverWorkflowsWithConfig as ReturnType<typeof mock>).mockResolvedValueOnce({
workflows: [makeTestWorkflowWithSource({ name: 'assist', description: 'Help' })],
errors: [],
});
(conversationDb.getOrCreateConversation as ReturnType<typeof mock>).mockResolvedValueOnce({
id: 'conv-123',
});
(codebaseDb.findCodebaseByDefaultCwd as ReturnType<typeof mock>).mockResolvedValueOnce(null);
(conversationDb.updateConversation as ReturnType<typeof mock>).mockResolvedValueOnce(undefined);
(executeWorkflow as ReturnType<typeof mock>).mockClear();
(executeWorkflow as ReturnType<typeof mock>).mockResolvedValueOnce({
success: true,
workflowRunId: 'run-1',
});
// First addMessage (user message persist) succeeds, second (dispatch) fails
(messagesDb.addMessage as ReturnType<typeof mock>)
.mockResolvedValueOnce(undefined) // user message persist succeeds
.mockRejectedValueOnce(new Error('DB gone')); // dispatch fails (caught inside CLIAdapter)
// Should not throw — dispatch failure must not block workflow execution
await expect(
workflowRunCommand('/test/path', 'assist', 'hello', { noWorktree: true })
).resolves.toBeUndefined();
// executeWorkflow was still called despite dispatch failure
expect(executeWorkflow).toHaveBeenCalledTimes(1);
});
it('does not send result card when workflow is paused even with summary', async () => {
const { discoverWorkflowsWithConfig } = await import('@archon/workflows/workflow-discovery');
const { executeWorkflow } = await import('@archon/workflows/executor');
const conversationDb = await import('@archon/core/db/conversations');
const codebaseDb = await import('@archon/core/db/codebases');
const messagesDb = await import('@archon/core/db/messages');
(discoverWorkflowsWithConfig as ReturnType<typeof mock>).mockResolvedValueOnce({
workflows: [makeTestWorkflowWithSource({ name: 'assist', description: 'Help' })],
errors: [],
});
(conversationDb.getOrCreateConversation as ReturnType<typeof mock>).mockResolvedValueOnce({
id: 'conv-123',
});
(codebaseDb.findCodebaseByDefaultCwd as ReturnType<typeof mock>).mockResolvedValueOnce(null);
(conversationDb.updateConversation as ReturnType<typeof mock>).mockResolvedValueOnce(undefined);
(executeWorkflow as ReturnType<typeof mock>).mockResolvedValueOnce({
success: true,
workflowRunId: 'run-paused',
paused: true,
summary: 'Steps completed so far.',
});
(messagesDb.addMessage as ReturnType<typeof mock>).mockClear();
const consoleSpy = spyOn(console, 'log').mockImplementation(() => {});
try {
await workflowRunCommand('/test/path', 'assist', 'hello', { noWorktree: true });
// Paused guard fires before summary check — no result card despite having a summary
const resultCalls = (messagesDb.addMessage as ReturnType<typeof mock>).mock.calls.filter(
(args: unknown[]) => {
const meta = args[3] as Record<string, unknown> | undefined;
return meta?.category === 'workflow_result';
}
);
expect(resultCalls).toHaveLength(0);
// Confirm paused message was printed
expect(consoleSpy).toHaveBeenCalledWith('\nWorkflow paused — waiting for approval.');
} finally {
consoleSpy.mockRestore();
}
});
});
describe('workflowStatusCommand', () => {
@@ -1381,6 +1763,58 @@ describe('workflowApproveCommand', () => {
expect(codebaseDb.getCodebase).toHaveBeenCalledWith('cb-existing');
});
it('should pass original platform conversation ID through to workflowRunCommand', async () => {
const workflowDb = await import('@archon/core/db/workflows');
const codebaseDb = await import('@archon/core/db/codebases');
const conversationsDb = await import('@archon/core/db/conversations');
const workflowDiscovery = await import('@archon/workflows/workflow-discovery');
const core = await import('@archon/core');
(workflowDb.getWorkflowRun as ReturnType<typeof mock>).mockResolvedValueOnce({
id: 'run-approve-conv',
workflow_name: 'implement',
status: 'paused',
user_message: 'add auth',
working_path: '/tmp/test-worktree',
codebase_id: 'cb-existing',
conversation_id: 'db-uuid-original',
metadata: { approval: { nodeId: 'review-node', message: 'Approve?' } },
});
// Return a conversation with the original platform ID
(conversationsDb.getConversationById as ReturnType<typeof mock>).mockResolvedValueOnce({
id: 'db-uuid-original',
platform_type: 'cli',
platform_conversation_id: 'cli-original-123',
});
(
workflowDiscovery.discoverWorkflowsWithConfig as ReturnType<typeof mock>
).mockResolvedValueOnce({
workflows: [makeTestWorkflowWithSource({ name: 'implement' })],
errors: [],
});
(codebaseDb.getCodebase as ReturnType<typeof mock>).mockResolvedValueOnce({
id: 'cb-existing',
name: 'owner/repo',
default_cwd: '/path/to/main-checkout',
});
// Clear call history before our test so we can assert precisely
(conversationsDb.getOrCreateConversation as ReturnType<typeof mock>).mockClear();
try {
await workflowApproveCommand('run-approve-conv');
} catch {
// downstream failure is acceptable — we only need to reach getOrCreateConversation
}
// Verify the original platform conversation ID was passed through
expect(conversationsDb.getConversationById).toHaveBeenCalledWith('db-uuid-original');
expect(conversationsDb.getOrCreateConversation).toHaveBeenCalledWith('cli', 'cli-original-123');
});
});
describe('workflowAbandonCommand', () => {
@@ -1566,6 +2000,61 @@ describe('workflowRejectCommand', () => {
expect(consoleSpy).toHaveBeenCalledWith(expect.stringContaining('Rejected workflow'));
});
it('should pass original platform conversation ID through on reject-resume', async () => {
const workflowDb = await import('@archon/core/db/workflows');
const conversationsDb = await import('@archon/core/db/conversations');
const workflowDiscovery = await import('@archon/workflows/workflow-discovery');
const runData = {
id: 'run-reject-conv',
workflow_name: 'my-wf',
status: 'paused',
user_message: 'build it',
working_path: '/repo',
codebase_id: null,
conversation_id: 'db-uuid-reject',
metadata: {
approval: {
type: 'approval',
nodeId: 'gate',
message: 'Approve?',
onRejectPrompt: 'Fix: $REJECTION_REASON',
onRejectMaxAttempts: 3,
},
rejection_count: 0,
},
};
// rejectWorkflow reads the run twice internally (getRunOrThrow + updateWorkflowRun check)
(workflowDb.getWorkflowRun as ReturnType<typeof mock>).mockResolvedValueOnce(runData);
// Return a conversation with the original platform ID
(conversationsDb.getConversationById as ReturnType<typeof mock>).mockResolvedValueOnce({
id: 'db-uuid-reject',
platform_type: 'cli',
platform_conversation_id: 'cli-reject-456',
});
(
workflowDiscovery.discoverWorkflowsWithConfig as ReturnType<typeof mock>
).mockResolvedValueOnce({
workflows: [makeTestWorkflowWithSource({ name: 'my-wf' })],
errors: [],
});
// Clear call history before our test so we can assert precisely
(conversationsDb.getOrCreateConversation as ReturnType<typeof mock>).mockClear();
try {
await workflowRejectCommand('run-reject-conv', 'needs work');
} catch {
// downstream workflowRunCommand failure is acceptable — we only need to reach getOrCreateConversation
}
// Verify the original platform conversation ID was passed through
expect(conversationsDb.getConversationById).toHaveBeenCalledWith('db-uuid-reject');
expect(conversationsDb.getOrCreateConversation).toHaveBeenCalledWith('cli', 'cli-reject-456');
});
it('cancels when max attempts reached', async () => {
const workflowDb = await import('@archon/core/db/workflows');
const core = await import('@archon/core');


@@ -10,7 +10,7 @@ import {
} from '@archon/core';
import { WORKFLOW_EVENT_TYPES, type WorkflowEventType } from '@archon/workflows/store';
import { configureIsolation, getIsolationProvider } from '@archon/isolation';
import { createLogger, getArchonHome } from '@archon/paths';
import { createLogger } from '@archon/paths';
import { createWorkflowDeps } from '@archon/core/workflows/store-adapter';
import { discoverWorkflowsWithConfig } from '@archon/workflows/workflow-discovery';
import { resolveWorkflowName } from '@archon/workflows/router';
@@ -62,10 +62,10 @@ export interface WorkflowRunOptions {
noWorktree?: boolean;
resume?: boolean;
codebaseId?: string; // Passed by resume/approve to skip path-based lookup
/** When true, skip the env-leak-gate during auto-registration. */
allowEnvKeys?: boolean;
quiet?: boolean;
verbose?: boolean;
/** Platform conversation ID (e.g. `cli-{ts}-{rand}`), NOT a DB UUID. */
conversationId?: string;
}
/**
@@ -119,9 +119,9 @@ function renderWorkflowEvent(event: WorkflowEmitterEvent, verbose: boolean): voi
*/
async function loadWorkflows(cwd: string): Promise<WorkflowLoadResult> {
try {
return await discoverWorkflowsWithConfig(cwd, loadConfig, {
globalSearchPath: getArchonHome(),
});
// Home-scoped workflows at ~/.archon/workflows/ are discovered automatically —
// no option needed since the discovery helper reads them unconditionally.
return await discoverWorkflowsWithConfig(cwd, loadConfig);
} catch (error) {
const err = error as Error;
throw new Error(
@@ -178,7 +178,7 @@ export async function workflowListCommand(cwd: string, json?: boolean): Promise<
}
if (workflowEntries.length > 0) {
console.log(`\nFound ${String(workflowEntries.length)} workflow(s):\n`);
console.log(`\nFound ${workflowEntries.length} workflow(s):\n`);
for (const { workflow } of workflowEntries) {
console.log(` ${workflow.name}`);
@@ -191,7 +191,7 @@ export async function workflowListCommand(cwd: string, json?: boolean): Promise<
}
if (errors.length > 0) {
console.log(`\n${String(errors.length)} workflow(s) failed to load:\n`);
console.log(`\n${errors.length} workflow(s) failed to load:\n`);
for (const e of errors) {
console.log(` ${e.filename}: ${e.error}`);
}
@@ -261,6 +261,37 @@ export async function workflowRunCommand(
);
}
// Reconcile workflow-level worktree policy with invocation flags.
// The workflow YAML's `worktree.enabled` pins isolation regardless of caller —
// a mismatch between policy and flags is a user error we surface loudly
// rather than silently applying one side and ignoring the other.
const pinnedEnabled = workflow.worktree?.enabled;
if (pinnedEnabled === false) {
if (options.branchName !== undefined) {
throw new Error(
`Workflow '${workflow.name}' sets worktree.enabled: false (runs in live checkout).\n` +
' --branch requires an isolated worktree.\n' +
" Drop --branch or change the workflow's worktree.enabled."
);
}
if (options.fromBranch !== undefined) {
throw new Error(
`Workflow '${workflow.name}' sets worktree.enabled: false (runs in live checkout).\n` +
' --from/--from-branch only applies when a worktree is created.\n' +
" Drop --from or change the workflow's worktree.enabled."
);
}
// --no-worktree is redundant but not contradictory — silently accept.
} else if (pinnedEnabled === true) {
if (options.noWorktree) {
throw new Error(
`Workflow '${workflow.name}' sets worktree.enabled: true (requires a worktree).\n` +
' --no-worktree conflicts with the workflow policy.\n' +
" Drop --no-worktree or change the workflow's worktree.enabled."
);
}
}
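The policy/flag matrix enforced above can be distilled into a pure function. This is a hypothetical restatement for illustration, not the real CLI code:

```typescript
// Invocation flags relevant to worktree policy (subset, hypothetical shape).
type Flags = { branchName?: string; fromBranch?: string; noWorktree?: boolean };

// pinned === false forbids worktree-only flags; pinned === true forbids
// --no-worktree; pinned undefined (no YAML policy) accepts everything.
function reconcile(pinned: boolean | undefined, flags: Flags): 'ok' | 'error' {
  if (
    pinned === false &&
    (flags.branchName !== undefined || flags.fromBranch !== undefined)
  ) {
    return 'error'; // worktree-only flags contradict a pinned "no worktree"
  }
  if (pinned === true && flags.noWorktree) {
    return 'error'; // --no-worktree contradicts a pinned "worktree required"
  }
  return 'ok'; // includes the redundant-but-allowed pinned:false + --no-worktree
}
```

Note the asymmetry: `worktree.enabled: false` plus `--no-worktree` is redundant agreement, so it passes, while any genuine contradiction fails loudly.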
console.log(`Running workflow: ${workflowName}`);
console.log(`Working directory: ${cwd}`);
console.log('');
@@ -269,7 +300,7 @@
const adapter = new CLIAdapter();
// Generate conversation ID
const conversationId = generateConversationId();
const conversationId = options.conversationId ?? generateConversationId();
// Get or create conversation in database
let conversation;
@@ -323,7 +354,7 @@ export async function workflowRunCommand(
const repoRoot = await git.findRepoRoot(cwd);
if (repoRoot) {
try {
const result = await registerRepository(repoRoot, options.allowEnvKeys, 'register-cli');
const result = await registerRepository(repoRoot);
codebase = await codebaseDb.getCodebase(result.codebaseId);
if (!result.alreadyExisted) {
getLog().info({ name: result.name }, 'cli.codebase_auto_registered');
@@ -403,8 +434,14 @@ export async function workflowRunCommand(
console.log('');
}
// Default to worktree isolation unless --no-worktree or --resume
const wantsIsolation = !options.resume && !options.noWorktree;
// Default to worktree isolation unless --no-worktree or --resume.
// Workflow YAML `worktree.enabled` pins the decision — mismatches with CLI
// flags are rejected above, so by this point the policy (if set) and flags
// agree. `--resume` reuses an existing worktree and takes precedence over
// the pinned policy to avoid disturbing a paused run.
const flagWantsIsolation = !options.resume && !options.noWorktree;
const wantsIsolation =
(!options.resume && pinnedEnabled !== undefined) ? pinnedEnabled : flagWantsIsolation;
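The precedence order encoded in that expression — resume first, then pinned policy, then flags — reads more plainly as a small helper. A hypothetical sketch, equivalent to the ternary above:

```typescript
// Decide whether to create an isolated worktree (hypothetical helper).
// Order of precedence: --resume > pinned YAML policy > CLI flag default.
function decideIsolation(
  resume: boolean,
  noWorktree: boolean,
  pinned?: boolean
): boolean {
  if (resume) return false; // reuse the paused run's existing worktree
  if (pinned !== undefined) return pinned; // YAML policy pins the decision
  return !noWorktree; // default: isolate unless --no-worktree
}
```

Contradictory flag/policy combinations never reach this point, since they are rejected earlier with an error.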
if (wantsIsolation && codebase) {
// Auto-generate branch identifier from workflow name + timestamp when --branch not provided
@@ -589,6 +626,24 @@ export async function workflowRunCommand(
renderWorkflowEvent(event, verbose ?? false);
});
// Notify Web UI that a workflow is dispatching.
// Mirrors the orchestrator dispatch message structure (category/segment/workflowDispatch),
// but omits the rocket emoji and "(background)" qualifier since the CLI runs synchronously.
// In the CLI path there is no separate worker conversation — the CLI itself
// is both the dispatcher and the executor, so workerConversationId === conversationId.
try {
await adapter.sendMessage(conversationId, `Dispatching workflow: **${workflow.name}**`, {
category: 'workflow_dispatch_status',
segment: 'new',
workflowDispatch: { workerConversationId: conversationId, workflowName: workflow.name },
});
} catch (dispatchError) {
getLog().warn(
{ err: dispatchError as Error, conversationId },
'cli.workflow_dispatch_surface_failed'
);
}
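The invariant the comment above describes — in the CLI path the dispatcher and executor are the same conversation — can be illustrated with the metadata shape alone. Field names here mirror the call above; the interface and sample ID are assumptions for illustration:

```typescript
// Assumed shape of the dispatch-message metadata (field names from the call above).
interface DispatchMeta {
  category: 'workflow_dispatch_status';
  segment: 'new';
  workflowDispatch: { workerConversationId: string; workflowName: string };
}

// Example platform conversation ID in the documented `cli-{ts}-{rand}` format.
const conversationId = 'cli-1700000000-abc';

const dispatch: DispatchMeta = {
  category: 'workflow_dispatch_status',
  segment: 'new',
  // The CLI is both dispatcher and executor, so the two IDs coincide:
  workflowDispatch: { workerConversationId: conversationId, workflowName: 'assist' },
};
```

The orchestrator path differs only in presentation (emoji, "(background)" qualifier) and in having a genuinely separate worker conversation.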
// Execute workflow with workingCwd (may be worktree path)
let result: Awaited<ReturnType<typeof executeWorkflow>>;
try {
@@ -610,6 +665,22 @@ export async function workflowRunCommand(
if (result.success && 'paused' in result && result.paused) {
console.log('\nWorkflow paused — waiting for approval.');
} else if (result.success) {
// Surface workflow result to Web UI as a result card (mirrors orchestrator.ts result message).
// Paused workflows are handled in the branch above and intentionally do not get a result card.
if ('summary' in result && result.summary) {
try {
await adapter.sendMessage(conversationId, result.summary, {
category: 'workflow_result',
segment: 'new',
workflowResult: { workflowName: workflow.name, runId: result.workflowRunId },
});
} catch (surfaceError) {
getLog().warn(
{ err: surfaceError as Error, conversationId },
'cli.workflow_result_surface_failed'
);
}
}
console.log('\nWorkflow completed successfully.');
} else {
throw new Error(`Workflow failed: ${result.error}`);
@@ -628,25 +699,25 @@ function formatAge(startedAt: Date | string): string {
if (Number.isNaN(date.getTime())) return 'unknown';
const ms = Date.now() - date.getTime();
const secs = Math.floor(ms / 1000);
if (secs < 60) return `${String(secs)}s`;
if (secs < 60) return `${secs}s`;
const mins = Math.floor(secs / 60);
if (mins < 60) return `${String(mins)}m`;
if (mins < 60) return `${mins}m`;
const hours = Math.floor(mins / 60);
if (hours < 24) return `${String(hours)}h ${String(mins % 60)}m`;
if (hours < 24) return `${hours}h ${mins % 60}m`;
const days = Math.floor(hours / 24);
return `${String(days)}d ${String(hours % 24)}h`;
return `${days}d ${hours % 24}h`;
}
/**
* Format a duration in milliseconds as a compact string.
*/
function formatDuration(ms: number): string {
if (ms < 1000) return `${String(ms)}ms`;
if (ms < 1000) return `${ms}ms`;
const secs = Math.round(ms / 100) / 10;
if (secs < 60) return `${String(secs)}s`;
if (secs < 60) return `${secs}s`;
const mins = Math.floor(secs / 60);
const remSecs = Math.round(secs % 60);
return `${String(mins)}m${String(remSecs)}s`;
return `${mins}m${remSecs}s`;
}
interface NodeSummary {
@@ -730,20 +801,16 @@ export async function workflowStatusCommand(json?: boolean, verbose?: boolean):
}
if (json) {
let runsOutput: unknown[] = runs;
if (verbose) {
const eventsPerRun = await Promise.all(
runs.map(run =>
workflowEventsDb.listWorkflowEvents(run.id).catch(() => [] as WorkflowEventRow[])
)
);
runsOutput = runs.map((run, i) => ({ ...run, events: eventsPerRun[i] }));
}
console.log(JSON.stringify({ runs: runsOutput }, null, 2));
return;
}
@@ -752,7 +819,7 @@ export async function workflowStatusCommand(json?: boolean, verbose?: boolean):
return;
}
console.log(`\nActive workflows (${runs.length}):\n`);
for (const run of runs) {
const age = formatAge(run.started_at);
console.log(` ID: ${run.id}`);
@@ -861,10 +928,30 @@ export async function workflowApproveCommand(runId: string, comment?: string): P
console.log('');
console.log('Resuming workflow...');
// Look up the original platform conversation ID to keep all messages in one thread
let platformConversationId: string | undefined;
try {
const originalConversation = await conversationDb.getConversationById(result.conversationId);
platformConversationId = originalConversation?.platform_conversation_id ?? undefined;
if (!originalConversation) {
getLog().info(
{ runId, conversationId: result.conversationId },
'cli.workflow_approve_conversation_not_found'
);
}
} catch (error) {
const err = error as Error;
getLog().warn(
{ err, runId, conversationId: result.conversationId },
'cli.workflow_approve_conversation_lookup_failed'
);
}
try {
await workflowRunCommand(result.workingPath, result.workflowName, result.userMessage ?? '', {
resume: true,
codebaseId: result.codebaseId ?? undefined,
conversationId: platformConversationId,
});
} catch (error) {
const err = error as Error;
@@ -900,10 +987,31 @@ export async function workflowRejectCommand(runId: string, reason?: string): Pro
}
console.log(`Rejected workflow: ${result.workflowName}`);
console.log('Resuming with on_reject prompt...');
// Look up the original platform conversation ID to keep all messages in one thread
let platformConversationId: string | undefined;
try {
const originalConversation = await conversationDb.getConversationById(result.conversationId);
platformConversationId = originalConversation?.platform_conversation_id ?? undefined;
if (!originalConversation) {
getLog().info(
{ runId, conversationId: result.conversationId },
'cli.workflow_reject_conversation_not_found'
);
}
} catch (error) {
const err = error as Error;
getLog().warn(
{ err, runId, conversationId: result.conversationId },
'cli.workflow_reject_conversation_lookup_failed'
);
}
try {
await workflowRunCommand(result.workingPath, result.workflowName, result.userMessage ?? '', {
resume: true,
codebaseId: result.codebaseId ?? undefined,
conversationId: platformConversationId,
});
} catch (error) {
const err = error as Error;
@@ -925,9 +1033,9 @@ export async function workflowCleanupCommand(days: number): Promise<void> {
try {
const { count } = await workflowDb.deleteOldWorkflowRuns(days);
if (count === 0) {
console.log(`No workflow runs older than ${days} days to clean up.`);
} else {
console.log(`Deleted ${count} workflow run(s) older than ${days} days.`);
}
} catch (error) {
const err = error as Error;


@@ -3,14 +3,31 @@
"compilerOptions": {
"noEmit": true,
"paths": {
"@archon/adapters": ["../adapters/src"],
"@archon/adapters/*": ["../adapters/src/*"],
"@archon/core": ["../core/src"],
"@archon/core/*": ["../core/src/*"],
"@archon/server": ["../server/src"],
"@archon/server/*": ["../server/src/*"],
"@archon/workflows": ["../workflows/src"],
"@archon/workflows/*": ["../workflows/src/*"],
"@archon/paths": ["../paths/src"],
"@archon/git": ["../git/src"]
}
},
"include": [
"src/**/*",
"../core/src/**/*.ts",
"../server/src/**/*.ts",
"../adapters/src/**/*.ts",
"../workflows/src/defaults/text-imports.d.ts"
],
"exclude": [
"node_modules",
"dist",
"**/*.test.ts",
"../core/src/**/*.test.ts",
"../server/src/**/*.test.ts",
"../adapters/src/**/*.test.ts"
]
}


@@ -1,6 +1,6 @@
{
"name": "@archon/core",
"version": "0.3.6",
"type": "module",
"main": "./src/index.ts",
"types": "./src/index.ts",
@@ -9,7 +9,6 @@
"./types": "./src/types/index.ts",
"./db": "./src/db/index.ts",
"./db/*": "./src/db/*.ts",
"./operations": "./src/operations/index.ts",
"./operations/*": "./src/operations/*.ts",
"./workflows": "./src/workflows/index.ts",
@@ -23,17 +22,16 @@
"./state/*": "./src/state/*.ts"
},
"scripts": {
"test": "bun test src/handlers/command-handler.test.ts && bun test src/handlers/clone.test.ts && bun test src/db/adapters/postgres.test.ts && bun test src/db/connection.test.ts && bun test src/db/adapters/sqlite.test.ts src/db/codebases.test.ts src/db/conversations.test.ts src/db/env-vars.test.ts src/db/isolation-environments.test.ts src/db/messages.test.ts src/db/sessions.test.ts src/db/workflow-events.test.ts src/db/workflows.test.ts src/utils/defaults-copy.test.ts src/utils/worktree-sync.test.ts src/utils/conversation-lock.test.ts src/utils/credential-sanitizer.test.ts src/utils/port-allocation.test.ts src/utils/error.test.ts src/utils/error-formatter.test.ts src/utils/github-graphql.test.ts src/config/ src/state/ && bun test src/utils/path-validation.test.ts && bun test src/services/cleanup-service.test.ts && bun test src/services/title-generator.test.ts && bun test src/workflows/ && bun test src/operations/workflow-operations.test.ts && bun test src/operations/isolation-operations.test.ts && bun test src/orchestrator/orchestrator.test.ts && bun test src/orchestrator/orchestrator-agent.test.ts && bun test src/orchestrator/orchestrator-isolation.test.ts",
"type-check": "bun x tsc --noEmit",
"build": "echo 'No build needed - Bun runs TypeScript directly'"
},
"dependencies": {
"@anthropic-ai/claude-agent-sdk": "^0.2.89",
"@archon/git": "workspace:*",
"@archon/isolation": "workspace:*",
"@archon/paths": "workspace:*",
"@archon/providers": "workspace:*",
"@archon/workflows": "workspace:*",
"@openai/codex-sdk": "^0.116.0",
"pg": "^8.11.0",
"zod": "^3"
},


@@ -1,635 +0,0 @@
/**
* Claude Agent SDK wrapper
* Provides async generator interface for streaming Claude responses
*
* Type Safety Pattern:
* - Uses `Options` type from SDK for query configuration
* - SDK message types (SDKMessage, SDKAssistantMessage, etc.) have strict
* type checking that requires explicit type handling for content blocks
* - Content blocks are typed via inline assertions for clarity
*
* Authentication:
* - CLAUDE_USE_GLOBAL_AUTH=true: Use global auth from `claude /login`, filter env tokens
* - CLAUDE_USE_GLOBAL_AUTH=false: Use explicit tokens from env vars
* - Not set: Auto-detect - use tokens if present in env, otherwise global auth
*/
import {
query,
type Options,
type HookCallback,
type HookCallbackMatcher,
} from '@anthropic-ai/claude-agent-sdk';
import {
type AssistantRequestOptions,
type IAssistantClient,
type MessageChunk,
type TokenUsage,
} from '../types';
import { createLogger } from '@archon/paths';
import { buildCleanSubprocessEnv } from '../utils/env-allowlist';
import { scanPathForSensitiveKeys, EnvLeakError } from '../utils/env-leak-scanner';
import * as codebaseDb from '../db/codebases';
import { loadConfig } from '../config/config-loader';
/** Lazy-initialized logger (deferred so test mocks can intercept createLogger) */
let cachedLog: ReturnType<typeof createLogger> | undefined;
function getLog(): ReturnType<typeof createLogger> {
if (!cachedLog) cachedLog = createLogger('client.claude');
return cachedLog;
}
/**
* Content block type for assistant messages
* Represents text or tool_use blocks from Claude API responses
*/
interface ContentBlock {
type: 'text' | 'tool_use';
text?: string;
name?: string;
input?: Record<string, unknown>;
/** Stable Anthropic `tool_use_id` — used to pair `tool_call`/`tool_result` events. */
id?: string;
}
function normalizeClaudeUsage(usage?: {
input_tokens?: number;
output_tokens?: number;
total_tokens?: number;
}): TokenUsage | undefined {
if (!usage) return undefined;
const input = usage.input_tokens;
const output = usage.output_tokens;
if (typeof input !== 'number' || typeof output !== 'number') return undefined;
const total = usage.total_tokens;
return {
input,
output,
...(typeof total === 'number' ? { total } : {}),
};
}
/**
* Build environment for Claude subprocess
*
* Auth behavior:
* - CLAUDE_USE_GLOBAL_AUTH=true: Filter tokens, use global auth from `claude /login`
* - CLAUDE_USE_GLOBAL_AUTH=false: Pass tokens through explicitly
* - Not set: Auto-detect; use explicit tokens if present, otherwise fall back to global auth
*/
function buildSubprocessEnv(): NodeJS.ProcessEnv {
const globalAuthSetting = process.env.CLAUDE_USE_GLOBAL_AUTH?.toLowerCase();
// Check for empty token values (common misconfiguration)
const tokenVars = ['CLAUDE_CODE_OAUTH_TOKEN', 'CLAUDE_API_KEY'] as const;
const emptyTokens = tokenVars.filter(v => process.env[v] === '');
if (emptyTokens.length > 0) {
getLog().warn({ emptyTokens }, 'empty_token_values');
}
// Warn if user has the legacy variable but not the new ones
if (
process.env.ANTHROPIC_API_KEY &&
!process.env.CLAUDE_CODE_OAUTH_TOKEN &&
!process.env.CLAUDE_API_KEY
) {
getLog().warn(
{ hint: 'Use CLAUDE_API_KEY or CLAUDE_CODE_OAUTH_TOKEN instead' },
'deprecated_anthropic_api_key_ignored'
);
}
const hasExplicitTokens = Boolean(
process.env.CLAUDE_CODE_OAUTH_TOKEN ?? process.env.CLAUDE_API_KEY
);
// Determine whether to use global auth
let useGlobalAuth: boolean;
if (globalAuthSetting === 'true') {
useGlobalAuth = true;
getLog().info({ authMode: 'global' }, 'using_global_auth');
} else if (globalAuthSetting === 'false') {
useGlobalAuth = false;
getLog().info({ authMode: 'explicit' }, 'using_explicit_tokens');
} else if (globalAuthSetting !== undefined) {
// Unrecognized value - warn and fall back to auto-detect
getLog().warn({ value: globalAuthSetting }, 'unrecognized_global_auth_setting');
useGlobalAuth = !hasExplicitTokens;
} else {
// Not set - auto-detect: use tokens if present, otherwise global auth
useGlobalAuth = !hasExplicitTokens;
if (hasExplicitTokens) {
getLog().info({ authMode: 'explicit', autoDetected: true }, 'using_explicit_tokens');
} else {
getLog().info({ authMode: 'global', autoDetected: true }, 'using_global_auth');
}
}
let baseEnv: NodeJS.ProcessEnv;
if (useGlobalAuth) {
// Start from allowlist-filtered env, then strip auth tokens
const clean = buildCleanSubprocessEnv();
const { CLAUDE_CODE_OAUTH_TOKEN, CLAUDE_API_KEY, ...envWithoutAuth } = clean;
// Log if we're filtering out tokens (helps debug auth issues)
const filtered = [
CLAUDE_CODE_OAUTH_TOKEN && 'CLAUDE_CODE_OAUTH_TOKEN',
CLAUDE_API_KEY && 'CLAUDE_API_KEY',
].filter(Boolean);
if (filtered.length > 0) {
getLog().info({ filteredVars: filtered }, 'global_auth_filtered_tokens');
}
baseEnv = envWithoutAuth;
} else {
// Start from allowlist-filtered env (includes auth tokens)
baseEnv = buildCleanSubprocessEnv();
}
// Clean env vars that interfere with Claude Code subprocess
const cleanedVars: string[] = [];
// Strip nested-session guard marker (claude-code v2.1.41+).
// When the server is started from inside a Claude Code terminal, CLAUDECODE=1
// is inherited and causes the subprocess to refuse to launch.
// See: https://github.com/anthropics/claude-code/issues/25434
if (baseEnv.CLAUDECODE) {
delete baseEnv.CLAUDECODE;
cleanedVars.push('CLAUDECODE');
}
// Strip debugger env vars
// See: https://github.com/anthropics/claude-code/issues/4619
if (baseEnv.NODE_OPTIONS) {
delete baseEnv.NODE_OPTIONS;
cleanedVars.push('NODE_OPTIONS');
}
if (baseEnv.VSCODE_INSPECTOR_OPTIONS) {
delete baseEnv.VSCODE_INSPECTOR_OPTIONS;
cleanedVars.push('VSCODE_INSPECTOR_OPTIONS');
}
if (cleanedVars.length > 0) {
getLog().info({ cleanedVars }, 'subprocess_env_cleaned');
}
return baseEnv;
}
/** Max retries for transient subprocess failures (3 = 4 total attempts).
* SDK subprocess crashes (exit code 1) are often intermittent; AJV schema validation
* regressions, stale HTTP/2 connections, and other transient SDK issues typically
* succeed on retry 3 or 4. See: anthropics/claude-code#22973, claude-code-action#853 */
const MAX_SUBPROCESS_RETRIES = 3;
/** Delay between retries in milliseconds */
const RETRY_BASE_DELAY_MS = 2000;
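These two constants imply the exponential backoff schedule used by the retry loop later in this file (delay = base * 2^attempt). A hypothetical illustration, with the constants re-declared locally:

```typescript
// Backoff schedule implied by MAX_SUBPROCESS_RETRIES = 3 and a 2000ms base delay:
// attempt 0 waits 2s, attempt 1 waits 4s, attempt 2 waits 8s; attempt 3 is the
// final try and fails without a further wait.
const MAX_RETRIES = 3;
const BASE_DELAY_MS = 2000;

const delays = Array.from({ length: MAX_RETRIES }, (_, attempt) => BASE_DELAY_MS * 2 ** attempt);
console.log(delays); // [2000, 4000, 8000]
```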
/** Patterns indicating rate limiting in stderr/error messages */
const RATE_LIMIT_PATTERNS = ['rate limit', 'too many requests', '429', 'overloaded'];
/** Patterns indicating auth issues in stderr/error messages */
const AUTH_PATTERNS = [
'credit balance',
'unauthorized',
'authentication',
'invalid token',
'401',
'403',
];
/** Patterns indicating the subprocess crashed (transient, worth retrying) */
const SUBPROCESS_CRASH_PATTERNS = [
'exited with code',
'killed',
'signal',
// "Operation aborted" can appear when the SDK's PostToolUse hook tries to write()
// back to a subprocess pipe that was closed by an abort signal. This is a race
// condition in SDK cleanup — safe to classify as a crash and retry.
'operation aborted',
];
function classifySubprocessError(
errorMessage: string,
stderrOutput: string
): 'rate_limit' | 'auth' | 'crash' | 'unknown' {
const combined = `${errorMessage} ${stderrOutput}`.toLowerCase();
if (RATE_LIMIT_PATTERNS.some(p => combined.includes(p))) return 'rate_limit';
if (AUTH_PATTERNS.some(p => combined.includes(p))) return 'auth';
if (SUBPROCESS_CRASH_PATTERNS.some(p => combined.includes(p))) return 'crash';
return 'unknown';
}
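The classification order matters: rate-limit patterns are checked first, then auth, then crash, so a message matching several buckets resolves to the earliest one. A self-contained sketch of the same matching logic (re-declared under the hypothetical name `classifySketch`):

```typescript
// Pattern lists mirror RATE_LIMIT_PATTERNS, AUTH_PATTERNS, SUBPROCESS_CRASH_PATTERNS above.
const RATE = ['rate limit', 'too many requests', '429', 'overloaded'];
const AUTH = ['credit balance', 'unauthorized', 'authentication', 'invalid token', '401', '403'];
const CRASH = ['exited with code', 'killed', 'signal', 'operation aborted'];

function classifySketch(errorMessage: string, stderr: string): string {
  // Both the error message and captured stderr feed the classification.
  const combined = `${errorMessage} ${stderr}`.toLowerCase();
  if (RATE.some(p => combined.includes(p))) return 'rate_limit';
  if (AUTH.some(p => combined.includes(p))) return 'auth';
  if (CRASH.some(p => combined.includes(p))) return 'crash';
  return 'unknown';
}

console.log(classifySketch('Process exited with code 1', '')); // crash
console.log(classifySketch('HTTP 429', 'Too Many Requests'));  // rate_limit
console.log(classifySketch('', 'Error: invalid token'));       // auth
```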
/**
* Returns the current process UID, or undefined on platforms that don't support it (e.g. Windows).
* Exported for testing so that spyOn(claudeModule, 'getProcessUid') works cross-platform.
*/
export function getProcessUid(): number | undefined {
return typeof process.getuid === 'function' ? process.getuid() : undefined;
}
/**
* Claude AI assistant client
* Implements generic IAssistantClient interface
*/
export class ClaudeClient implements IAssistantClient {
private readonly retryBaseDelayMs: number;
constructor(options?: { retryBaseDelayMs?: number }) {
// Claude Code SDK silently rejects bypassPermissions when running as root (UID 0).
// Check once at construction time so the error surfaces early, not on first query.
// IS_SANDBOX=1 bypasses this check — the SDK itself honours this env var in sandboxed
// environments (Docker, VPS, CI) where running as root is expected.
if (getProcessUid() === 0 && process.env.IS_SANDBOX !== '1') {
throw new Error(
'Claude Code SDK does not support bypassPermissions when running as root (UID 0). ' +
'Run as a non-root user, set IS_SANDBOX=1, or use the Dockerfile which creates a non-root appuser.'
);
}
this.retryBaseDelayMs = options?.retryBaseDelayMs ?? RETRY_BASE_DELAY_MS;
}
/**
* Send a query to Claude and stream responses.
* Includes retry logic for transient failures (up to 3 retries with exponential backoff).
* Enriches errors with stderr context and classification.
*/
async *sendQuery(
prompt: string,
cwd: string,
resumeSessionId?: string,
requestOptions?: AssistantRequestOptions
): AsyncGenerator<MessageChunk> {
// Pre-spawn: check for env key leak if codebase is not explicitly consented.
// Use prefix lookup so worktree paths (e.g. .../worktrees/feature-branch) still
// match the registered source cwd (e.g. .../source).
const codebase =
(await codebaseDb.findCodebaseByDefaultCwd(cwd)) ??
(await codebaseDb.findCodebaseByPathPrefix(cwd));
if (!codebase?.allow_env_keys) {
// Fail-closed: a config load failure (corrupt YAML, permission denied)
// must NOT silently bypass the gate. Catch, log, and treat as
// `allowTargetRepoKeys = false` so the scanner still runs.
let allowTargetRepoKeys = false;
try {
const merged = await loadConfig(cwd);
allowTargetRepoKeys = merged.allowTargetRepoKeys;
} catch (configErr) {
getLog().warn({ err: configErr, cwd }, 'env_leak_gate.config_load_failed_gate_enforced');
}
if (!allowTargetRepoKeys) {
const report = scanPathForSensitiveKeys(cwd);
if (report.findings.length > 0) {
throw new EnvLeakError(report, 'spawn-existing');
}
}
}
// Note: If subprocess crashes mid-stream after yielding chunks, those chunks
// are already consumed by the caller. Retry starts a fresh subprocess, so the
// caller may receive partial output from the failed attempt followed by full
// output from the retry. This is a known limitation of async generator retries.
let lastError: Error | undefined;
for (let attempt = 0; attempt <= MAX_SUBPROCESS_RETRIES; attempt++) {
// Check if already aborted before starting attempt
if (requestOptions?.abortSignal?.aborted) {
throw new Error('Query aborted');
}
const stderrLines: string[] = [];
const toolResultQueue: { toolName: string; toolOutput: string; toolCallId?: string }[] = [];
// Create per-attempt abort controller and wire to caller's signal
const controller = new AbortController();
if (requestOptions?.abortSignal) {
requestOptions.abortSignal.addEventListener(
'abort',
() => {
controller.abort();
},
{ once: true }
);
}
const options: Options = {
cwd,
env: requestOptions?.env
? { ...buildSubprocessEnv(), ...requestOptions.env }
: buildSubprocessEnv(),
model: requestOptions?.model,
abortController: controller,
...(requestOptions?.tools !== undefined ? { tools: requestOptions.tools } : {}),
...(requestOptions?.disallowedTools !== undefined
? { disallowedTools: requestOptions.disallowedTools }
: {}),
// Pass outputFormat for json_schema structured output (Claude Agent SDK v0.2.45+)
...(requestOptions?.outputFormat !== undefined
? { outputFormat: requestOptions.outputFormat }
: {}),
// Note: hooks are merged below (line with `hooks: { ... }`) — not spread here
// Pass MCP servers for per-node MCP support (Claude Agent SDK v0.2.74+)
...(requestOptions?.mcpServers !== undefined
? { mcpServers: requestOptions.mcpServers }
: {}),
// Pass allowedTools for MCP tool wildcards (e.g., 'mcp__github__*')
...(requestOptions?.allowedTools !== undefined
? { allowedTools: requestOptions.allowedTools }
: {}),
// Pass agents/agent for per-node skill scoping via AgentDefinition wrapping
...(requestOptions?.agents !== undefined ? { agents: requestOptions.agents } : {}),
...(requestOptions?.agent !== undefined ? { agent: requestOptions.agent } : {}),
// Skip writing session transcripts to ~/.claude/projects/ — Archon manages its own
// session persistence. persistSession: false reduces disk I/O and keeps the session
// directory clean. Claude Agent SDK v0.2.74+.
...(requestOptions?.persistSession !== undefined
? { persistSession: requestOptions.persistSession }
: {}),
// When forkSession is true, the SDK copies the prior session's history into a new
// session file, leaving the original untouched — safe to use on retries.
...(requestOptions?.forkSession !== undefined
? { forkSession: requestOptions.forkSession }
: {}),
// Forward Claude-only SDK options (effort, thinking, maxBudgetUsd, fallbackModel, betas, sandbox)
...(requestOptions?.effort !== undefined ? { effort: requestOptions.effort } : {}),
...(requestOptions?.thinking !== undefined ? { thinking: requestOptions.thinking } : {}),
...(requestOptions?.maxBudgetUsd !== undefined
? { maxBudgetUsd: requestOptions.maxBudgetUsd }
: {}),
...(requestOptions?.fallbackModel !== undefined
? { fallbackModel: requestOptions.fallbackModel }
: {}),
// betas: string[] from user config; SDK expects SdkBeta[] (string literal union).
// User-provided values are validated upstream — cast is safe.
...(requestOptions?.betas !== undefined
? { betas: requestOptions.betas as Options['betas'] }
: {}),
...(requestOptions?.sandbox !== undefined ? { sandbox: requestOptions.sandbox } : {}),
permissionMode: 'bypassPermissions',
allowDangerouslySkipPermissions: true,
systemPrompt: requestOptions?.systemPrompt ?? { type: 'preset', preset: 'claude_code' },
settingSources: requestOptions?.settingSources ?? ['project'],
// Merge user-provided hooks with our PostToolUse capture hook
hooks: {
...(requestOptions?.hooks ?? {}),
PostToolUse: [
...((requestOptions?.hooks?.PostToolUse ?? []) as HookCallbackMatcher[]),
{
hooks: [
(async (input: Record<string, unknown>): Promise<{ continue: true }> => {
const toolName = (input as { tool_name?: string }).tool_name ?? 'unknown';
const toolUseId = (input as { tool_use_id?: string }).tool_use_id;
const toolResponse = (input as { tool_response?: unknown }).tool_response;
const output =
typeof toolResponse === 'string'
? toolResponse
: JSON.stringify(toolResponse ?? '');
// Truncate large outputs (e.g., file reads) to prevent DB bloat
const maxLen = 10_000;
toolResultQueue.push({
toolName,
toolOutput: output.length > maxLen ? output.slice(0, maxLen) + '...' : output,
...(toolUseId !== undefined ? { toolCallId: toolUseId } : {}),
});
return { continue: true };
}) as HookCallback,
],
},
],
// Without this, errored / interrupted / permission-denied tools never produce
// a paired tool_result chunk and the corresponding UI card spins forever.
// SDK type: PostToolUseFailureHookInput { tool_name, tool_use_id, error, is_interrupt? }
PostToolUseFailure: [
...((requestOptions?.hooks?.PostToolUseFailure ?? []) as HookCallbackMatcher[]),
{
hooks: [
(async (input: Record<string, unknown>): Promise<{ continue: true }> => {
// Always return { continue: true } even on internal errors so a
// malformed SDK payload can never crash the hook dispatch silently.
try {
const toolName = (input as { tool_name?: string }).tool_name ?? 'unknown';
const toolUseId = (input as { tool_use_id?: string }).tool_use_id;
const rawError = (input as { error?: string }).error;
if (rawError === undefined) {
getLog().debug({ input }, 'claude.post_tool_use_failure_no_error_field');
}
const errorText = rawError ?? 'tool failed';
const isInterrupt = (input as { is_interrupt?: boolean }).is_interrupt === true;
const prefix = isInterrupt ? '⚠️ Interrupted' : '❌ Error';
toolResultQueue.push({
toolName,
toolOutput: `${prefix}: ${errorText}`,
...(toolUseId !== undefined ? { toolCallId: toolUseId } : {}),
});
} catch (e) {
getLog().error({ err: e, input }, 'claude.post_tool_use_failure_hook_error');
}
return { continue: true };
}) as HookCallback,
],
},
],
},
stderr: (data: string) => {
const output = data.trim();
if (!output) return;
// Always capture stderr for diagnostics — previous filtering discarded
// useful SDK startup output, leaving stderrContext empty on crashes.
stderrLines.push(output);
const isError =
output.toLowerCase().includes('error') ||
output.toLowerCase().includes('fatal') ||
output.toLowerCase().includes('failed') ||
output.toLowerCase().includes('exception') ||
output.includes('at ') ||
output.includes('Error:');
const isInfoMessage =
output.includes('Spawning Claude Code') ||
output.includes('--output-format') ||
output.includes('--permission-mode');
if (isError && !isInfoMessage) {
getLog().error({ stderr: output }, 'subprocess_error');
}
},
};
if (resumeSessionId) {
options.resume = resumeSessionId;
getLog().debug(
{ sessionId: resumeSessionId, forkSession: requestOptions?.forkSession },
'resuming_session'
);
} else {
getLog().debug({ cwd, attempt }, 'starting_new_session');
}
try {
for await (const msg of query({ prompt, options })) {
// Drain tool results captured by PostToolUse hook before processing the next message
while (toolResultQueue.length > 0) {
const tr = toolResultQueue.shift();
if (tr) {
yield {
type: 'tool_result',
toolName: tr.toolName,
toolOutput: tr.toolOutput,
...(tr.toolCallId !== undefined ? { toolCallId: tr.toolCallId } : {}),
};
}
}
if (msg.type === 'assistant') {
const message = msg as { message: { content: ContentBlock[] } };
const content = message.message.content;
for (const block of content) {
if (block.type === 'text' && block.text) {
yield { type: 'assistant', content: block.text };
} else if (block.type === 'tool_use' && block.name) {
yield {
type: 'tool',
toolName: block.name,
toolInput: block.input ?? {},
...(block.id !== undefined ? { toolCallId: block.id } : {}),
};
}
}
} else if (msg.type === 'system') {
// Check MCP server connection status from system/init
const sysMsg = msg as {
subtype?: string;
mcp_servers?: { name: string; status: string }[];
};
if (sysMsg.subtype === 'init' && sysMsg.mcp_servers) {
const failed = sysMsg.mcp_servers.filter(s => s.status !== 'connected');
if (failed.length > 0) {
const names = failed.map(s => `${s.name} (${s.status})`).join(', ');
yield { type: 'system', content: `MCP server connection failed: ${names}` };
}
} else {
getLog().debug({ subtype: sysMsg.subtype }, 'claude.system_message_unhandled');
}
} else if (msg.type === 'rate_limit_event') {
const rateLimitMsg = msg as { rate_limit_info?: Record<string, unknown> };
getLog().warn(
{ rateLimitInfo: rateLimitMsg.rate_limit_info },
'claude.rate_limit_event'
);
yield { type: 'rate_limit', rateLimitInfo: rateLimitMsg.rate_limit_info ?? {} };
} else if (msg.type === 'result') {
const resultMsg = msg as {
session_id?: string;
is_error?: boolean;
subtype?: string;
usage?: { input_tokens?: number; output_tokens?: number; total_tokens?: number };
structured_output?: unknown;
total_cost_usd?: number;
stop_reason?: string | null;
num_turns?: number;
model_usage?: Record<
string,
{
input_tokens: number;
output_tokens: number;
cache_read_input_tokens?: number;
cache_creation_input_tokens?: number;
}
>;
};
const tokens = normalizeClaudeUsage(resultMsg.usage);
yield {
type: 'result',
sessionId: resultMsg.session_id,
...(tokens ? { tokens } : {}),
...(resultMsg.structured_output !== undefined
? { structuredOutput: resultMsg.structured_output }
: {}),
...(resultMsg.is_error ? { isError: true, errorSubtype: resultMsg.subtype } : {}),
...(resultMsg.total_cost_usd !== undefined ? { cost: resultMsg.total_cost_usd } : {}),
...(resultMsg.stop_reason != null ? { stopReason: resultMsg.stop_reason } : {}),
...(resultMsg.num_turns !== undefined ? { numTurns: resultMsg.num_turns } : {}),
...(resultMsg.model_usage
? { modelUsage: resultMsg.model_usage as Record<string, unknown> }
: {}),
};
}
}
// Drain any remaining tool results from the hook queue.
// Must mirror the in-loop drain — PostToolUseFailure results commonly land
// here (they fire just before the SDK's terminal `result` message), so
// dropping toolCallId here would defeat the stable-pairing fix.
while (toolResultQueue.length > 0) {
const tr = toolResultQueue.shift();
if (tr) {
yield {
type: 'tool_result',
toolName: tr.toolName,
toolOutput: tr.toolOutput,
...(tr.toolCallId !== undefined ? { toolCallId: tr.toolCallId } : {}),
};
}
}
return; // Success - exit retry loop
} catch (error) {
const err = error as Error;
// Don't retry aborted queries
if (controller.signal.aborted) {
throw new Error('Query aborted');
}
const stderrContext = stderrLines.join('\n');
const errorClass = classifySubprocessError(err.message, stderrContext);
getLog().error(
{ err, stderrContext, errorClass, attempt, maxRetries: MAX_SUBPROCESS_RETRIES },
'query_error'
);
// Don't retry auth errors - they won't resolve
if (errorClass === 'auth') {
const enrichedError = new Error(
`Claude Code auth error: ${err.message}${stderrContext ? ` (${stderrContext})` : ''}`
);
enrichedError.cause = error;
throw enrichedError;
}
// Retry transient failures (rate limit, crash)
if (
attempt < MAX_SUBPROCESS_RETRIES &&
(errorClass === 'rate_limit' || errorClass === 'crash')
) {
const delayMs = this.retryBaseDelayMs * Math.pow(2, attempt);
getLog().info({ attempt, delayMs, errorClass }, 'retrying_subprocess');
await new Promise(resolve => setTimeout(resolve, delayMs));
lastError = err;
continue;
}
// Final failure - enrich and throw
const enrichedMessage = stderrContext
? `Claude Code ${errorClass}: ${err.message} (stderr: ${stderrContext})`
: `Claude Code ${errorClass}: ${err.message}`;
const enrichedError = new Error(enrichedMessage);
enrichedError.cause = error;
throw enrichedError;
}
}
// Should not reach here, but handle defensively
throw lastError ?? new Error('Claude Code query failed after retries');
}
/**
* Get the assistant type identifier
*/
getType(): string {
return 'claude';
}
}


@@ -1,554 +0,0 @@
/**
* Codex SDK wrapper
* Provides async generator interface for streaming Codex responses
*
* With Bun runtime, we can directly import ESM packages without the
* dynamic import workaround that was needed for CommonJS/Node.js.
*/
import {
Codex,
type ThreadOptions,
type TurnOptions,
type TurnCompletedEvent,
} from '@openai/codex-sdk';
import {
type AssistantRequestOptions,
type IAssistantClient,
type MessageChunk,
type TokenUsage,
} from '../types';
import { createLogger } from '@archon/paths';
import { scanPathForSensitiveKeys, EnvLeakError } from '../utils/env-leak-scanner';
import * as codebaseDb from '../db/codebases';
import { loadConfig } from '../config/config-loader';
/** Lazy-initialized logger (deferred so test mocks can intercept createLogger) */
let cachedLog: ReturnType<typeof createLogger> | undefined;
function getLog(): ReturnType<typeof createLogger> {
if (!cachedLog) cachedLog = createLogger('client.codex');
return cachedLog;
}
// Singleton Codex instance
let codexInstance: Codex | null = null;
/**
* Get or create Codex SDK instance
* Synchronous now that we have direct ESM import
*/
function getCodex(): Codex {
if (!codexInstance) {
codexInstance = new Codex();
}
return codexInstance;
}
/**
* Build thread options for Codex SDK
* Extracted to avoid duplication across thread creation paths
*/
function buildThreadOptions(cwd: string, options?: AssistantRequestOptions): ThreadOptions {
return {
workingDirectory: cwd,
skipGitRepoCheck: true,
sandboxMode: 'danger-full-access', // Full filesystem access (needed for git worktree operations)
networkAccessEnabled: true, // Allow network calls (GitHub CLI, HTTP requests)
approvalPolicy: 'never', // Auto-approve all operations without user confirmation
model: options?.model,
modelReasoningEffort: options?.modelReasoningEffort,
webSearchMode: options?.webSearchMode,
additionalDirectories: options?.additionalDirectories,
};
}
const CODEX_MODEL_FALLBACKS: Record<string, string> = {
'gpt-5.3-codex': 'gpt-5.2-codex',
};
function isModelAccessError(errorMessage: string): boolean {
const m = errorMessage.toLowerCase();
const hasModel = m.includes('model');
const hasAvailabilitySignal =
m.includes('not available') || m.includes('not found') || m.includes('access denied');
return hasModel && hasAvailabilitySignal;
}
function buildModelAccessMessage(model?: string): string {
const normalizedModel = model?.trim();
const selectedModel = normalizedModel || 'the configured model';
const suggested = normalizedModel ? CODEX_MODEL_FALLBACKS[normalizedModel] : undefined;
const fixLine = suggested
? `To fix: update your model in ~/.archon/config.yaml:\n assistants:\n codex:\n model: ${suggested}`
: 'To fix: update your model in ~/.archon/config.yaml to one your account can access.';
const workflowLine = suggested
? `Or set it per-workflow with \`model: ${suggested}\` in workflow YAML.`
: 'Or set it per-workflow with a valid `model:` in workflow YAML.';
return `❌ Model "${selectedModel}" is not available for your account.\n\n${fixLine}\n\n${workflowLine}`;
}
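As a quick illustration, the lookup that drives the "To fix" suggestion can be sketched standalone (the table mirrors `CODEX_MODEL_FALLBACKS` above; the helper name `suggestFallback` is hypothetical):

```typescript
// Hypothetical standalone sketch of the fallback lookup used by
// buildModelAccessMessage; the table mirrors CODEX_MODEL_FALLBACKS above.
const FALLBACKS: Record<string, string> = {
  'gpt-5.3-codex': 'gpt-5.2-codex',
};

function suggestFallback(model?: string): string | undefined {
  // Whitespace-only or missing model strings normalize to undefined,
  // matching the `normalizedModel || 'the configured model'` handling above.
  const normalized = model?.trim();
  return normalized ? FALLBACKS[normalized] : undefined;
}
```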
/** Max retries for transient failures (3 = 4 total attempts).
* Mirrors ClaudeClient retry logic; Codex process crashes are similarly intermittent. */
const MAX_SUBPROCESS_RETRIES = 3;
/** Delay between retries in milliseconds */
const RETRY_BASE_DELAY_MS = 2000;
/** Patterns indicating rate limiting in error messages */
const RATE_LIMIT_PATTERNS = ['rate limit', 'too many requests', '429', 'overloaded'];
/** Patterns indicating auth issues in error messages */
const AUTH_PATTERNS = [
'credit balance',
'unauthorized',
'authentication',
'invalid token',
'401',
'403',
];
/** Patterns indicating a transient process crash (worth retrying) */
const SUBPROCESS_CRASH_PATTERNS = ['exited with code', 'killed', 'signal', 'codex exec'];
function classifyCodexError(
errorMessage: string
): 'rate_limit' | 'auth' | 'crash' | 'model_access' | 'unknown' {
if (isModelAccessError(errorMessage)) return 'model_access';
const m = errorMessage.toLowerCase();
if (RATE_LIMIT_PATTERNS.some(p => m.includes(p))) return 'rate_limit';
if (AUTH_PATTERNS.some(p => m.includes(p))) return 'auth';
if (SUBPROCESS_CRASH_PATTERNS.some(p => m.includes(p))) return 'crash';
return 'unknown';
}
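A self-contained sketch of the classification order (pattern lists copied from the constants above; the real function also routes model-access errors first via `isModelAccessError`):

```typescript
// Standalone sketch of classifyCodexError's pattern matching. Order matters:
// rate-limit patterns are checked before auth, so '429' never reaches AUTH.
type ErrorClass = 'rate_limit' | 'auth' | 'crash' | 'unknown';

const RATE_LIMIT = ['rate limit', 'too many requests', '429', 'overloaded'];
const AUTH = ['credit balance', 'unauthorized', 'authentication', 'invalid token', '401', '403'];
const CRASH = ['exited with code', 'killed', 'signal', 'codex exec'];

function classify(message: string): ErrorClass {
  const m = message.toLowerCase();
  if (RATE_LIMIT.some(p => m.includes(p))) return 'rate_limit';
  if (AUTH.some(p => m.includes(p))) return 'auth';
  if (CRASH.some(p => m.includes(p))) return 'crash';
  return 'unknown';
}
```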
function extractUsageFromCodexEvent(event: TurnCompletedEvent): TokenUsage {
if (!event.usage) {
getLog().warn({ eventType: event.type }, 'codex.usage_null_on_turn_completed');
return { input: 0, output: 0 };
}
return {
input: event.usage.input_tokens,
output: event.usage.output_tokens,
};
}
/**
* Codex AI assistant client
* Implements generic IAssistantClient interface
*/
export class CodexClient implements IAssistantClient {
private readonly retryBaseDelayMs: number;
constructor(options?: { retryBaseDelayMs?: number }) {
this.retryBaseDelayMs = options?.retryBaseDelayMs ?? RETRY_BASE_DELAY_MS;
}
/**
* Send a query to Codex and stream responses
* @param prompt - User message or prompt
* @param cwd - Working directory for Codex
* @param resumeSessionId - Optional thread ID to resume
*/
async *sendQuery(
prompt: string,
cwd: string,
resumeSessionId?: string,
options?: AssistantRequestOptions
): AsyncGenerator<MessageChunk> {
// Pre-spawn: check for env key leak if codebase is not explicitly consented.
// Use prefix lookup so worktree paths (e.g. .../worktrees/feature-branch) still
// match the registered source cwd (e.g. .../source).
const codebase =
(await codebaseDb.findCodebaseByDefaultCwd(cwd)) ??
(await codebaseDb.findCodebaseByPathPrefix(cwd));
if (!codebase?.allow_env_keys) {
// Fail-closed: a config load failure must NOT silently bypass the gate.
let allowTargetRepoKeys = false;
try {
const merged = await loadConfig(cwd);
allowTargetRepoKeys = merged.allowTargetRepoKeys;
} catch (configErr) {
getLog().warn({ err: configErr, cwd }, 'env_leak_gate.config_load_failed_gate_enforced');
}
if (!allowTargetRepoKeys) {
const report = scanPathForSensitiveKeys(cwd);
if (report.findings.length > 0) {
throw new EnvLeakError(report, 'spawn-existing');
}
}
}
const codex = getCodex();
const threadOptions = buildThreadOptions(cwd, options);
// Check if already aborted before starting
if (options?.abortSignal?.aborted) {
throw new Error('Query aborted');
}
// Track if we fell back from a failed resume (to notify user)
let sessionResumeFailed = false;
// Get or create thread (synchronous operations!)
let thread;
if (resumeSessionId) {
getLog().debug({ sessionId: resumeSessionId }, 'resuming_thread');
try {
// NOTE: resumeThread is synchronous, not async
// IMPORTANT: Must pass options when resuming!
thread = codex.resumeThread(resumeSessionId, threadOptions);
} catch (error) {
getLog().error({ err: error, sessionId: resumeSessionId }, 'resume_thread_failed');
// Fall back to creating new thread
try {
thread = codex.startThread(threadOptions);
} catch (startError) {
const err = startError as Error;
if (isModelAccessError(err.message)) {
throw new Error(buildModelAccessMessage(options?.model));
}
throw new Error(`Codex query failed: ${err.message}`);
}
sessionResumeFailed = true;
}
} else {
getLog().debug({ cwd }, 'starting_new_thread');
// NOTE: startThread is synchronous, not async
try {
thread = codex.startThread(threadOptions);
} catch (error) {
const err = error as Error;
if (isModelAccessError(err.message)) {
throw new Error(buildModelAccessMessage(options?.model));
}
throw new Error(`Codex query failed: ${err.message}`);
}
}
// Notify user if session resume failed (don't silently lose context)
if (sessionResumeFailed) {
yield {
type: 'system',
content: '⚠️ Could not resume previous session. Starting fresh conversation.',
};
}
let lastTodoListSignature: string | undefined;
let lastError: Error | undefined;
for (let attempt = 0; attempt <= MAX_SUBPROCESS_RETRIES; attempt++) {
// Check abort signal before each attempt
if (options?.abortSignal?.aborted) {
throw new Error('Query aborted');
}
// On retries, create a fresh thread (crashed thread is invalid)
if (attempt > 0) {
getLog().debug({ cwd, attempt }, 'starting_new_thread');
try {
thread = codex.startThread(threadOptions);
} catch (startError) {
const err = startError as Error;
if (isModelAccessError(err.message)) {
throw new Error(buildModelAccessMessage(options?.model));
}
throw new Error(`Codex query failed: ${err.message}`);
}
}
try {
// Build per-turn options (structured output schema, abort signal)
const turnOptions: TurnOptions = {};
if (options?.outputFormat) {
turnOptions.outputSchema = options.outputFormat.schema;
}
if (options?.abortSignal) {
turnOptions.signal = options.abortSignal;
}
// Run streamed query (this IS async)
const result = await thread.runStreamed(prompt, turnOptions);
// Process streaming events
for await (const event of result.events) {
// Check abort signal between events
if (options?.abortSignal?.aborted) {
getLog().info('query_aborted_between_events');
break;
}
// Log progress for item.started (visibility fix for Codex appearing to hang)
if (event.type === 'item.started') {
const item = event.item;
getLog().debug(
{ eventType: event.type, itemType: item.type, itemId: item.id },
'item_started'
);
}
// Handle error events
if (event.type === 'error') {
getLog().error({ message: event.message }, 'stream_error');
// Don't send MCP timeout errors (they're optional)
if (!event.message.includes('MCP client')) {
yield { type: 'system', content: `⚠️ ${event.message}` };
}
continue;
}
// Handle turn failed events
if (event.type === 'turn.failed') {
const errorObj = event.error as { message?: string } | undefined;
const errorMessage = errorObj?.message ?? 'Unknown error';
getLog().error({ errorMessage }, 'turn_failed');
yield {
type: 'system',
content: `❌ Turn failed: ${errorMessage}`,
};
break;
}
// Handle item.completed events - map to MessageChunk types
if (event.type === 'item.completed') {
const item = event.item;
// Log progress with context for debugging
const logContext: Record<string, unknown> = {
eventType: event.type,
itemType: item.type,
itemId: item.id,
};
if (item.type === 'command_execution' && item.command) {
logContext.command = item.command;
}
getLog().debug(logContext, 'item_completed');
switch (item.type) {
case 'agent_message':
// Agent text response
if (item.text) {
yield { type: 'assistant', content: item.text };
}
break;
case 'command_execution':
// Tool/command execution. The Codex SDK only emits item.completed
// once the command has fully run, so we emit the start + result
// back-to-back to close the UI's tool card immediately. Without
// the paired tool_result, the card spins forever until lock release.
if (item.command) {
yield { type: 'tool', toolName: item.command };
const exitSuffix =
item.exit_code != null && item.exit_code !== 0
? `\n[exit code: ${item.exit_code}]`
: '';
yield {
type: 'tool_result',
toolName: item.command,
toolOutput: (item.aggregated_output ?? '') + exitSuffix,
};
} else {
getLog().warn({ itemId: item.id }, 'command_execution_missing_command');
}
break;
case 'reasoning':
// Agent reasoning/thinking
if (item.text) {
yield { type: 'thinking', content: item.text };
}
break;
case 'web_search':
if (item.query) {
const searchToolName = `🔍 Searching: ${item.query}`;
yield { type: 'tool', toolName: searchToolName };
// Web search items only fire on completion, so close the card immediately.
yield { type: 'tool_result', toolName: searchToolName, toolOutput: '' };
} else {
getLog().debug({ itemId: item.id }, 'web_search_missing_query');
}
break;
case 'todo_list':
if (Array.isArray(item.items) && item.items.length > 0) {
const normalizedItems = item.items.map(t => ({
text: typeof t.text === 'string' ? t.text : '(unnamed task)',
completed: t.completed ?? false,
}));
const signature = JSON.stringify(normalizedItems);
if (signature !== lastTodoListSignature) {
lastTodoListSignature = signature;
const taskList = normalizedItems
.map(t => `${t.completed ? '✅' : '⬜'} ${t.text}`)
.join('\n');
yield { type: 'system', content: `📋 Tasks:\n${taskList}` };
}
} else {
getLog().debug({ itemId: item.id }, 'todo_list_empty_or_invalid');
}
break;
case 'file_change': {
const statusIcon = item.status === 'failed' ? '❌' : '✅';
const rawError = 'error' in item ? (item as { error?: unknown }).error : undefined;
const fileErrorMessage =
typeof rawError === 'string'
? rawError
: typeof rawError === 'object' && rawError !== null && 'message' in rawError
? String((rawError as { message: unknown }).message)
: undefined;
if (Array.isArray(item.changes) && item.changes.length > 0) {
const changeList = item.changes
.map(c => {
const icon = c.kind === 'add' ? '' : c.kind === 'delete' ? '' : '📝';
return `${icon} ${c.path ?? '(unknown file)'}`;
})
.join('\n');
const errorSuffix =
item.status === 'failed' && fileErrorMessage ? `\n${fileErrorMessage}` : '';
yield {
type: 'system',
content: `${statusIcon} File changes:\n${changeList}${errorSuffix}`,
};
} else if (item.status === 'failed') {
getLog().warn(
{ itemId: item.id, status: item.status },
'file_change_failed_no_changes'
);
const failMsg = fileErrorMessage
? `❌ File change failed: ${fileErrorMessage}`
: '❌ File change failed';
yield { type: 'system', content: failMsg };
} else {
getLog().debug(
{ itemId: item.id, status: item.status },
'file_change_no_changes'
);
}
break;
}
case 'mcp_tool_call': {
const toolInfo =
item.server && item.tool
? `${item.server}/${item.tool}`
: (item.tool ?? item.server ?? 'MCP tool');
const mcpToolName = `🔌 MCP: ${toolInfo}`;
// Always emit start+result so the UI card closes. item.completed
// fires once the call is final (completed or failed).
yield { type: 'tool', toolName: mcpToolName };
if (item.status === 'failed') {
getLog().warn(
{ server: item.server, tool: item.tool, error: item.error, itemId: item.id },
'mcp_tool_call_failed'
);
const errMsg = item.error?.message
? `❌ Error: ${item.error.message}`
: '❌ Error: MCP tool failed';
yield { type: 'tool_result', toolName: mcpToolName, toolOutput: errMsg };
} else {
// status === 'completed' (or 'in_progress', which shouldn't reach
// item.completed but is closed defensively).
let toolOutput = '';
if (item.result?.content) {
if (Array.isArray(item.result.content)) {
toolOutput = JSON.stringify(item.result.content);
} else {
getLog().warn(
{
itemId: item.id,
server: item.server,
tool: item.tool,
resultType: typeof item.result.content,
},
'mcp_tool_call_unexpected_result_shape'
);
}
}
yield { type: 'tool_result', toolName: mcpToolName, toolOutput };
}
break;
}
// Other item types are ignored (like file edits, etc.)
}
}
// Handle turn.completed event
if (event.type === 'turn.completed') {
getLog().debug('turn_completed');
// Yield result with thread ID for persistence
const usage = extractUsageFromCodexEvent(event);
yield {
type: 'result',
sessionId: thread.id ?? undefined,
tokens: usage,
};
// CRITICAL: Break out of event loop - turn is complete!
// Without this, the loop waits for stream to end (causes 90s timeout)
break;
}
}
return; // Success - exit retry loop
} catch (error) {
const err = error as Error;
// Don't retry aborted queries
if (options?.abortSignal?.aborted) {
throw new Error('Query aborted');
}
const errorClass = classifyCodexError(err.message);
getLog().error(
{ err, errorClass, attempt, maxRetries: MAX_SUBPROCESS_RETRIES },
'query_error'
);
// Model access errors are never retryable
if (errorClass === 'model_access') {
throw new Error(buildModelAccessMessage(options?.model));
}
// Auth errors won't resolve on retry
if (errorClass === 'auth') {
const enrichedError = new Error(`Codex auth error: ${err.message}`);
enrichedError.cause = error;
throw enrichedError;
}
// Retry transient failures (rate limit, crash)
if (
attempt < MAX_SUBPROCESS_RETRIES &&
(errorClass === 'rate_limit' || errorClass === 'crash')
) {
const delayMs = this.retryBaseDelayMs * Math.pow(2, attempt);
getLog().info({ attempt, delayMs, errorClass }, 'retrying_query');
await new Promise(resolve => setTimeout(resolve, delayMs));
lastError = err;
continue;
}
// Final failure - enrich and throw
const enrichedError = new Error(`Codex ${errorClass}: ${err.message}`);
enrichedError.cause = error;
throw enrichedError;
}
}
// Should not reach here, but handle defensively
throw lastError ?? new Error('Codex query failed after retries');
}
/**
* Get the assistant type identifier
*/
getType(): string {
return 'codex';
}
}
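Consumers iterate the generator and render each yielded chunk. A hedged sketch of such a renderer, using a local stand-in for the MessageChunk union (the real type lives in `../types`):

```typescript
// Local stand-in for the MessageChunk shapes yielded by sendQuery(); field
// names follow the yields above, but this is illustrative, not the real type.
type Chunk =
  | { type: 'assistant' | 'thinking' | 'system'; content: string }
  | { type: 'tool'; toolName: string }
  | { type: 'tool_result'; toolName: string; toolOutput?: string }
  | { type: 'result'; sessionId?: string };

function renderChunk(chunk: Chunk): string {
  switch (chunk.type) {
    case 'assistant':
    case 'system':
      return chunk.content;
    case 'thinking':
      return `(thinking) ${chunk.content}`;
    case 'tool':
      return `running: ${chunk.toolName}`;
    case 'tool_result':
      return `${chunk.toolName}: ${chunk.toolOutput || '(no output)'}`;
    case 'result':
      return `[turn complete, session ${chunk.sessionId ?? 'unknown'}]`;
  }
}
```

In the real client this would sit inside a `for await (const chunk of client.sendQuery(...))` loop.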


@@ -1,48 +0,0 @@
import { describe, test, expect } from 'bun:test';
import { getAssistantClient } from './factory';
describe('factory', () => {
describe('getAssistantClient', () => {
test('returns ClaudeClient for claude type', () => {
const client = getAssistantClient('claude');
expect(client).toBeDefined();
expect(client.getType()).toBe('claude');
expect(typeof client.sendQuery).toBe('function');
});
test('returns CodexClient for codex type', () => {
const client = getAssistantClient('codex');
expect(client).toBeDefined();
expect(client.getType()).toBe('codex');
expect(typeof client.sendQuery).toBe('function');
});
test('throws error for unknown type', () => {
expect(() => getAssistantClient('unknown')).toThrow(
"Unknown assistant type: unknown. Supported types: 'claude', 'codex'"
);
});
test('throws error for empty string', () => {
expect(() => getAssistantClient('')).toThrow(
"Unknown assistant type: . Supported types: 'claude', 'codex'"
);
});
test('is case sensitive - Claude throws', () => {
expect(() => getAssistantClient('Claude')).toThrow(
"Unknown assistant type: Claude. Supported types: 'claude', 'codex'"
);
});
test('each call returns new instance', () => {
const client1 = getAssistantClient('claude');
const client2 = getAssistantClient('claude');
// Each call should return a new instance
expect(client1).not.toBe(client2);
});
});
});


@@ -1,37 +0,0 @@
/**
* AI Assistant Client Factory
*
* Dynamically instantiates the appropriate AI assistant client based on type string.
* Supports Claude and Codex assistants.
*/
import type { IAssistantClient } from '../types';
import { ClaudeClient } from './claude';
import { CodexClient } from './codex';
import { createLogger } from '@archon/paths';
/** Lazy-initialized logger (deferred so test mocks can intercept createLogger) */
let cachedLog: ReturnType<typeof createLogger> | undefined;
function getLog(): ReturnType<typeof createLogger> {
if (!cachedLog) cachedLog = createLogger('client.factory');
return cachedLog;
}
/**
* Get the appropriate AI assistant client based on type
*
* @param type - Assistant type identifier ('claude' or 'codex')
* @returns Instantiated assistant client
* @throws Error if assistant type is unknown
*/
export function getAssistantClient(type: string): IAssistantClient {
switch (type) {
case 'claude':
getLog().debug({ provider: 'claude' }, 'client_selected');
return new ClaudeClient();
case 'codex':
getLog().debug({ provider: 'codex' }, 'client_selected');
return new CodexClient();
default:
throw new Error(`Unknown assistant type: ${type}. Supported types: 'claude', 'codex'`);
}
}


@@ -1,16 +0,0 @@
/**
* AI Assistant Clients
*
* Prefer importing from '@archon/core' for most use cases:
* import { ClaudeClient, getAssistantClient } from '@archon/core';
*
* Use this submodule path when you only need client-specific code:
* import { ClaudeClient } from '@archon/core/clients';
*/
export { ClaudeClient } from './claude';
export { CodexClient } from './codex';
export { getAssistantClient } from './factory';
// Re-export types for consumers importing from this submodule directly
export type { IAssistantClient, MessageChunk } from '../types';


@@ -224,7 +224,11 @@ concurrency:
const config = await loadConfig();
expect(config.assistant).toBe('claude');
expect(config.assistants).toEqual({ claude: {}, codex: {} });
// Built-ins always present; community providers (like `pi`) are
// seeded dynamically from the registry — check the built-ins
// explicitly rather than asserting an exhaustive shape.
expect(config.assistants.claude).toEqual({});
expect(config.assistants.codex).toEqual({});
expect(config.streaming.telegram).toBe('stream');
expect(config.concurrency.maxConversations).toBe(10);
});
@@ -245,6 +249,31 @@ streaming:
expect(config.streaming.telegram).toBe('batch');
});
test('throws on unknown DEFAULT_AI_ASSISTANT env var', async () => {
mockReadConfigFile.mockResolvedValue('');
process.env.DEFAULT_AI_ASSISTANT = 'nonexistent-provider';
await expect(loadConfig()).rejects.toThrow(/not a registered provider/);
});
test('throws on unknown defaultAssistant in global config', async () => {
mockReadConfigFile.mockResolvedValue('defaultAssistant: nonexistent-provider');
await expect(loadConfig()).rejects.toThrow(/not a registered provider/);
});
test('throws on unknown assistant in repo config', async () => {
mockReadConfigFile.mockImplementation(async (path: string) => {
const normalized = path.replace(/\\/g, '/');
if (normalized.includes('/tmp/test-repo/.archon/config.yaml')) {
return 'assistant: nonexistent-provider';
}
return '';
});
await expect(loadConfig('/tmp/test-repo')).rejects.toThrow(/not a registered provider/);
});
test('repo config overrides global config', async () => {
// Helper to check path in cross-platform way (handles both / and \ separators)
const pathMatches = (path: string, pattern: string): boolean => {


@@ -28,8 +28,99 @@ export async function writeConfigFile(
): Promise<void> {
await writeFile(path, content, { encoding: 'utf-8', ...options });
}
import type { GlobalConfig, RepoConfig, MergedConfig, SafeConfig } from './config-types';
import type {
GlobalConfig,
RepoConfig,
MergedConfig,
SafeConfig,
AssistantDefaults,
AssistantDefaultsConfig,
} from './config-types';
import { createLogger } from '@archon/paths';
import {
isRegisteredProvider,
getRegisteredProviders,
registerBuiltinProviders,
registerCommunityProviders,
} from '@archon/providers';
/**
* Pure read of registered provider IDs. Registration is guaranteed by
* `loadConfig()`'s bootstrap call before any consumer can observe the
* registry, so this helper must NOT trigger side-effecting registration
* itself; doing so previously hid the ordering coupling and surprised readers.
*/
function getRegisteredProviderNames(): string[] {
return getRegisteredProviders().map(p => p.id);
}
function mergeAssistantDefaults(
base: AssistantDefaults,
overrides?: AssistantDefaultsConfig
): AssistantDefaults {
// Deep-copy every provider slot present in base. No per-provider listing —
// adding a new community provider must not require editing this function.
const merged: AssistantDefaults = { ...base };
for (const [providerId, providerDefaults] of Object.entries(base)) {
if (providerDefaults && typeof providerDefaults === 'object') {
merged[providerId] = { ...providerDefaults };
}
}
if (!overrides) return merged;
for (const [providerId, providerDefaults] of Object.entries(overrides)) {
if (!providerDefaults || typeof providerDefaults !== 'object') continue;
merged[providerId] = {
...(merged[providerId] ?? {}),
...providerDefaults,
};
}
return merged;
}
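The merge is shallow per provider: base slots are copied one level deep, then override fields win field-by-field, and providers that exist only in the overrides are added. A minimal standalone sketch of those semantics:

```typescript
// Minimal re-statement of mergeAssistantDefaults' semantics with plain types.
type Defaults = Record<string, Record<string, unknown> | undefined>;

function mergeDefaults(base: Defaults, overrides?: Defaults): Defaults {
  // Copy each provider slot so later writes never mutate the base object.
  const merged: Defaults = {};
  for (const [id, slot] of Object.entries(base)) {
    merged[id] = slot && typeof slot === 'object' ? { ...slot } : slot;
  }
  if (!overrides) return merged;
  for (const [id, slot] of Object.entries(overrides)) {
    if (!slot || typeof slot !== 'object') continue;
    merged[id] = { ...(merged[id] ?? {}), ...slot };
  }
  return merged;
}
```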
/**
* Per-provider allowlist of fields safe to expose to web clients.
*
* **Allowlist (not denylist) by design.** Any field not listed here is
* dropped on its way out. New sensitive fields on a provider default
* config (binary paths, credentials, absolute filesystem paths, etc.)
* are hidden by default; you have to opt in to expose them.
*
* Unknown provider IDs (community providers not listed below) fall back
* to the generic empty allowlist: the web UI sees the provider exists,
* but none of its defaults. Providers whose defaults are safe to surface
* register their fields here.
*/
const SAFE_ASSISTANT_FIELDS: Record<string, readonly string[]> = {
claude: ['model'],
codex: ['model', 'modelReasoningEffort', 'webSearchMode'],
// community providers — list each field we're confident is safe to
// show in the web UI. Unknown providers fall through with no fields.
pi: ['model'],
};
function toSafeAssistantDefaults(assistants: AssistantDefaults): SafeConfig['assistants'] {
const safeAssistants: SafeConfig['assistants'] = {};
for (const [providerId, providerDefaults] of Object.entries(assistants)) {
if (!providerDefaults || typeof providerDefaults !== 'object') continue;
const allowed = SAFE_ASSISTANT_FIELDS[providerId] ?? [];
const safeDefaults: Record<string, unknown> = {};
for (const field of allowed) {
const value = (providerDefaults as Record<string, unknown>)[field];
if (value !== undefined) {
safeDefaults[field] = value;
}
}
safeAssistants[providerId] = safeDefaults;
}
return safeAssistants;
}
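For example, a provider config containing both allowlisted and unlisted fields gets filtered down, and an unknown provider survives only as an empty object. A self-contained sketch (the `SAFE_FIELDS` table mirrors the allowlist above):

```typescript
// Sketch of the allowlist filter: fields not listed per provider are dropped;
// unknown provider IDs fall back to an empty allowlist (visible, no defaults).
const SAFE_FIELDS: Record<string, readonly string[]> = {
  claude: ['model'],
  codex: ['model', 'modelReasoningEffort', 'webSearchMode'],
};

function filterSafe(
  assistants: Record<string, Record<string, unknown>>
): Record<string, Record<string, unknown>> {
  const safe: Record<string, Record<string, unknown>> = {};
  for (const [id, defaults] of Object.entries(assistants)) {
    const allowed = SAFE_FIELDS[id] ?? [];
    const out: Record<string, unknown> = {};
    for (const field of allowed) {
      if (defaults[field] !== undefined) out[field] = defaults[field];
    }
    safe[id] = out;
  }
  return safe;
}
```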
/** Lazy-initialized logger (deferred so test mocks can intercept createLogger) */
let cachedLog: ReturnType<typeof createLogger> | undefined;
@@ -38,24 +129,6 @@ function getLog(): ReturnType<typeof createLogger> {
return cachedLog;
}
/**
* Tracks which env-leak-gate-disabled sources have already warned in this
* process. `loadConfig()` is called once per pre-spawn check (per workflow
* step), so without this guard the warn would flood logs and break alert
* rate-limiting downstream.
*/
const envLeakGateDisabledWarnedSources = new Set<'global_config' | 'repo_config'>();
function warnEnvLeakGateDisabledOnce(source: 'global_config' | 'repo_config'): void {
if (envLeakGateDisabledWarnedSources.has(source)) return;
envLeakGateDisabledWarnedSources.add(source);
getLog().warn({ source }, 'env_leak_gate_disabled');
}
// Test-only: reset the warn-once state so unit tests can re-trigger the log.
export function resetEnvLeakGateWarnedSourcesForTests(): void {
envLeakGateDisabledWarnedSources.clear();
}
/**
* Parse YAML using Bun's native YAML parser
*/
@@ -75,7 +148,7 @@ const DEFAULT_CONFIG_CONTENT = `# Archon Global Configuration
# Bot display name (shown in messages)
# botName: Archon
# Default AI assistant (claude or codex)
# Default AI assistant (must match a registered provider, e.g. claude, codex)
# defaultAssistant: claude
# Assistant defaults
@@ -188,13 +261,24 @@ export async function loadRepoConfig(repoPath: string): Promise<RepoConfig> {
* Get default configuration
*/
function getDefaults(): MergedConfig {
// Seed one empty entry per registered provider — built-in OR community.
// No per-provider listing here: adding a new provider must not require
// editing this function. `registerBuiltinProviders()` + any community
// registrations run at process bootstrap (see `packages/providers/src/
// registry.ts#registerCommunityProviders`), so by the time this runs the
// registry is populated.
const providers = getRegisteredProviders();
const registeredAssistants: AssistantDefaults = { claude: {}, codex: {} };
for (const provider of providers) {
if (!(provider.id in registeredAssistants)) {
registeredAssistants[provider.id] = {};
}
}
return {
botName: 'Archon',
assistant: 'claude',
assistants: {
claude: {},
codex: {},
},
assistant: providers.find(p => p.builtIn)?.id ?? 'claude',
assistants: registeredAssistants,
streaming: {
telegram: 'stream',
discord: 'batch',
@@ -216,7 +300,6 @@ function getDefaults(): MergedConfig {
loadDefaultCommands: true,
loadDefaultWorkflows: true,
},
allowTargetRepoKeys: false,
};
}
@@ -230,10 +313,17 @@ function applyEnvOverrides(config: MergedConfig): MergedConfig {
config.botName = envBotName;
}
// Assistant override
// Assistant override — validate against registry, error on unknown provider
const envAssistant = process.env.DEFAULT_AI_ASSISTANT;
if (envAssistant === 'claude' || envAssistant === 'codex') {
config.assistant = envAssistant;
if (envAssistant && envAssistant.length > 0) {
if (isRegisteredProvider(envAssistant)) {
config.assistant = envAssistant;
} else {
throw new Error(
`DEFAULT_AI_ASSISTANT='${envAssistant}' is not a registered provider. ` +
`Available providers: ${getRegisteredProviderNames().join(', ')}`
);
}
}
// Streaming overrides
@@ -274,10 +364,7 @@ function applyEnvOverrides(config: MergedConfig): MergedConfig {
function mergeGlobalConfig(defaults: MergedConfig, global: GlobalConfig): MergedConfig {
const result: MergedConfig = {
...defaults,
assistants: {
claude: { ...defaults.assistants.claude },
codex: { ...defaults.assistants.codex },
},
assistants: mergeAssistantDefaults(defaults.assistants),
};
// Bot name preference
@@ -285,23 +372,19 @@ function mergeGlobalConfig(defaults: MergedConfig, global: GlobalConfig): Merged
result.botName = global.botName;
}
// Assistant preference
// Assistant preference — validate against registry
if (global.defaultAssistant) {
result.assistant = global.defaultAssistant;
if (isRegisteredProvider(global.defaultAssistant)) {
result.assistant = global.defaultAssistant;
} else {
throw new Error(
`defaultAssistant: '${global.defaultAssistant}' in global config (~/.archon/config.yaml) ` +
`is not a registered provider. Available: ${getRegisteredProviderNames().join(', ')}`
);
}
}
if (global.assistants?.claude?.model) {
result.assistants.claude.model = global.assistants.claude.model;
}
if (global.assistants?.claude?.settingSources) {
result.assistants.claude.settingSources = global.assistants.claude.settingSources;
}
if (global.assistants?.codex) {
result.assistants.codex = {
...result.assistants.codex,
...global.assistants.codex,
};
}
result.assistants = mergeAssistantDefaults(result.assistants, global.assistants);
// Streaming preferences
if (global.streaming) {
@@ -321,12 +404,6 @@ function mergeGlobalConfig(defaults: MergedConfig, global: GlobalConfig): Merged
result.concurrency.maxConversations = global.concurrency.maxConversations;
}
// Env-leak gate bypass (global)
if (global.allow_target_repo_keys === true) {
result.allowTargetRepoKeys = true;
warnEnvLeakGateDisabledOnce('global_config');
}
return result;
}
@@ -336,29 +413,22 @@ function mergeGlobalConfig(defaults: MergedConfig, global: GlobalConfig): Merged
function mergeRepoConfig(merged: MergedConfig, repo: RepoConfig): MergedConfig {
const result: MergedConfig = {
...merged,
assistants: {
claude: { ...merged.assistants.claude },
codex: { ...merged.assistants.codex },
},
assistants: mergeAssistantDefaults(merged.assistants),
};
// Assistant override (repo-level takes precedence)
// Assistant override (repo-level takes precedence) — validate against registry
if (repo.assistant) {
result.assistant = repo.assistant;
if (isRegisteredProvider(repo.assistant)) {
result.assistant = repo.assistant;
} else {
throw new Error(
`assistant: '${repo.assistant}' in repo config (.archon/config.yaml) ` +
`is not a registered provider. Available: ${getRegisteredProviderNames().join(', ')}`
);
}
}
if (repo.assistants?.claude?.model) {
result.assistants.claude.model = repo.assistants.claude.model;
}
if (repo.assistants?.claude?.settingSources) {
result.assistants.claude.settingSources = repo.assistants.claude.settingSources;
}
if (repo.assistants?.codex) {
result.assistants.codex = {
...result.assistants.codex,
...repo.assistants.codex,
};
}
result.assistants = mergeAssistantDefaults(result.assistants, repo.assistants);
// Commands config
if (repo.commands) {
@@ -400,14 +470,6 @@ function mergeRepoConfig(merged: MergedConfig, repo: RepoConfig): MergedConfig {
result.envVars = { ...result.envVars, ...repo.env };
}
// Repo-level env-leak gate override (wins over global)
if (repo.allow_target_repo_keys !== undefined) {
result.allowTargetRepoKeys = repo.allow_target_repo_keys;
if (repo.allow_target_repo_keys) {
warnEnvLeakGateDisabledOnce('repo_config');
}
}
return result;
}
@@ -418,6 +480,9 @@ function mergeRepoConfig(merged: MergedConfig, repo: RepoConfig): MergedConfig {
* @returns Merged configuration with all overrides applied
*/
export async function loadConfig(repoPath?: string): Promise<MergedConfig> {
registerBuiltinProviders();
registerCommunityProviders();
// 1. Start with defaults
let config = getDefaults();
@@ -476,10 +541,10 @@ export async function updateGlobalConfig(updates: Partial<GlobalConfig>): Promis
if (updates.defaultAssistant !== undefined) merged.defaultAssistant = updates.defaultAssistant;
if (updates.assistants) {
merged.assistants = {
claude: { ...current.assistants?.claude, ...updates.assistants.claude },
codex: { ...current.assistants?.codex, ...updates.assistants.codex },
};
merged.assistants = mergeAssistantDefaults(
mergeAssistantDefaults(getDefaults().assistants, current.assistants),
updates.assistants
);
}
if (updates.streaming) {
@@ -520,16 +585,7 @@ export function toSafeConfig(config: MergedConfig): SafeConfig {
return {
botName: config.botName,
assistant: config.assistant,
assistants: {
claude: {
model: config.assistants.claude.model,
},
codex: {
model: config.assistants.codex.model,
modelReasoningEffort: config.assistants.codex.modelReasoningEffort,
webSearchMode: config.assistants.codex.webSearchMode,
},
},
assistants: toSafeAssistantDefaults(config.assistants),
streaming: {
telegram: config.streaming.telegram,
discord: config.streaming.discord,


@@ -10,22 +10,54 @@
* Global configuration (non-secret user preferences)
* Located at ~/.archon/config.yaml
*/
import type { ModelReasoningEffort, WebSearchMode } from '../types';
export interface AssistantDefaults {
model?: string;
modelReasoningEffort?: ModelReasoningEffort;
webSearchMode?: WebSearchMode;
additionalDirectories?: string[];
}
// Provider config defaults — canonical definitions live in @archon/providers/types.
// Imported and re-exported here so existing consumers don't break.
import type {
ClaudeProviderDefaults,
CodexProviderDefaults,
PiProviderDefaults,
ProviderDefaultsMap,
} from '@archon/providers/types';
export interface ClaudeAssistantDefaults {
model?: string;
/** Claude Code settingSources controls which CLAUDE.md files are loaded.
* @default ['project']
* @see https://github.com/anthropics/claude-agent-sdk */
settingSources?: ('project' | 'user')[];
}
export type {
ClaudeProviderDefaults,
CodexProviderDefaults,
PiProviderDefaults,
ProviderDefaultsMap,
};
/**
* Intersection type: generic `ProviderDefaultsMap` (any string key) with
* typed built-in entries.
*
* The built-in entries exist ONLY to give call sites like
* `config.assistants.claude.model` IDE autocomplete without `as` casts.
* They do NOT provide parser safety (each provider's `parseXxxConfig`
* already takes `Record<string, unknown>` and defends itself).
*
* Community providers should NOT be added here; they live behind the
* generic `[string]` index. Adding a new community provider must not
* require a core-package type change; that's the whole point of Phase 2.
*/
export type AssistantDefaultsConfig = ProviderDefaultsMap & {
claude?: ClaudeProviderDefaults;
codex?: CodexProviderDefaults;
};
/**
* Required variant built-ins are always present after `loadConfig`.
*
* `getDefaults()` seeds every registered provider (built-in + community)
* with `{}`, so community providers appear in the map too, just typed as
* `ProviderDefaults` via the generic index rather than a specific shape.
* `registerBuiltinProviders()` is called before `loadConfig()` at every
* process entrypoint, so claude/codex are guaranteed present.
*/
export type AssistantDefaults = ProviderDefaultsMap & {
claude: ClaudeProviderDefaults;
codex: CodexProviderDefaults;
};
export interface GlobalConfig {
/**
@ -38,15 +70,12 @@ export interface GlobalConfig {
* Default AI assistant when no codebase-specific preference
* @default 'claude'
*/
defaultAssistant?: 'claude' | 'codex';
defaultAssistant?: string;
/**
* Assistant-specific defaults (model, reasoning effort, etc.)
*/
assistants?: {
claude?: ClaudeAssistantDefaults;
codex?: AssistantDefaults;
};
assistants?: AssistantDefaultsConfig;
/**
* Platform streaming preferences (can be overridden per conversation)
@ -84,20 +113,6 @@ export interface GlobalConfig {
*/
maxConversations?: number;
};
/**
* Bypass the env-leak gate globally. When true, Archon will not refuse to
* register or spawn subprocesses for codebases whose auto-loaded .env files
* contain sensitive keys (ANTHROPIC_API_KEY, OPENAI_API_KEY, etc).
*
* WARNING: Weakens the env-leak gate. Keys in the target repo's .env will
* be auto-loaded by Bun subprocesses (Claude/Codex) and bypass Archon's
* env allowlist. Use only on trusted machines.
*
* YAML key: `allow_target_repo_keys`
* @default false
*/
allow_target_repo_keys?: boolean;
}
/**
@ -109,15 +124,12 @@ export interface RepoConfig {
* AI assistant preference for this repository
* Overrides global default
*/
assistant?: 'claude' | 'codex';
assistant?: string;
/**
* Assistant-specific defaults for this repository
*/
assistants?: {
claude?: ClaudeAssistantDefaults;
codex?: AssistantDefaults;
};
assistants?: AssistantDefaultsConfig;
/**
* Commands configuration
@ -152,6 +164,41 @@ export interface RepoConfig {
* @example [".env", ".archon", "data/fixtures/"]
*/
copyFiles?: string[];
/**
* Initialize git submodules in new worktrees.
* Runs `git submodule update --init --recursive` after worktree creation
* when the repo contains a `.gitmodules` file. Repos without submodules
* pay zero cost (the check short-circuits).
*
* Set to `false` to skip submodule init (e.g., when submodules are not
* needed by any workflow or when fetch cost is prohibitive).
* @default true
*/
initSubmodules?: boolean;
/**
* Per-project worktree directory (relative to repo root). When set,
* worktrees are created at `<repoRoot>/<path>/<branch>` instead of under
* `~/.archon/worktrees/` or the workspaces layout.
*
* Opting in co-locates worktrees with the repo so they appear in the IDE
* file tree. The user is responsible for adding the directory to their
* `.gitignore` (no automatic file mutation).
*
* Path resolution precedence (highest to lowest):
* 1. this `worktree.path` (repo-local)
* 2. global `paths.worktrees` (absolute override in `~/.archon/config.yaml`)
* 3. auto-detected project-scoped (`~/.archon/workspaces/owner/repo/...`)
* 4. default global (`~/.archon/worktrees/`)
*
* Must be a safe relative path: no leading `/`, no `..` segments. Absolute
* or escaping values fail loudly at worktree creation (Fail Fast; no silent
* fallback).
*
* @example '.worktrees'
*/
path?: string;
};
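The precedence list above can be sketched as a small resolver. This is an illustrative shape only — `resolveWorktreeBase` and its option names are hypothetical, not Archon's actual implementation — but it shows the documented order and the fail-fast validation of `worktree.path`:

```typescript
import { join, isAbsolute } from 'node:path';

// Fail fast on unsafe repo-local paths: no absolute paths, no `..` escapes.
function assertSafeRelative(p: string): void {
  if (isAbsolute(p) || p.split(/[\\/]/).includes('..')) {
    throw new Error(`worktree.path must be a safe relative path, got: ${p}`);
  }
}

// Hypothetical resolver mirroring the documented precedence (1 → 4).
function resolveWorktreeBase(opts: {
  repoRoot: string;
  repoLocalPath?: string; // 1. repo-level `worktree.path`
  globalOverride?: string; // 2. global `paths.worktrees`
  projectScoped?: string; // 3. auto-detected workspaces layout
  defaultGlobal: string; // 4. default `~/.archon/worktrees/`
}): string {
  if (opts.repoLocalPath !== undefined) {
    assertSafeRelative(opts.repoLocalPath);
    return join(opts.repoRoot, opts.repoLocalPath);
  }
  return opts.globalOverride ?? opts.projectScoped ?? opts.defaultGlobal;
}

console.log(
  resolveWorktreeBase({
    repoRoot: '/repo',
    repoLocalPath: '.worktrees',
    defaultGlobal: '/home/u/.archon/worktrees',
  })
); // → /repo/.worktrees
```

Note the loud failure on `../escape` or `/abs` values rather than a silent fallback to the global location.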
/**
@ -172,12 +219,6 @@ export interface RepoConfig {
*/
env?: Record<string, string>;
/**
* Per-repo override for the env-leak gate bypass. Repo value wins over global.
* YAML key: `allow_target_repo_keys`
*/
allow_target_repo_keys?: boolean;
/**
* Default commands/workflows configuration
*/
@ -212,11 +253,8 @@ export interface RepoConfig {
*/
export interface MergedConfig {
botName: string;
assistant: 'claude' | 'codex';
assistants: {
claude: ClaudeAssistantDefaults;
codex: AssistantDefaults;
};
assistant: string;
assistants: AssistantDefaults;
streaming: {
telegram: 'stream' | 'batch';
discord: 'stream' | 'batch';
@ -260,14 +298,6 @@ export interface MergedConfig {
* Undefined when no env vars are configured.
*/
envVars?: Record<string, string>;
/**
* Effective value of the env-leak gate bypass. When true, the env scanner
* is skipped during registration and pre-spawn. Repo-level override wins
* over global (explicit `false` at repo level re-enables the gate).
* @default false
*/
allowTargetRepoKeys: boolean;
}
/**
@ -276,11 +306,8 @@ export interface MergedConfig {
*/
export interface SafeConfig {
botName: string;
assistant: 'claude' | 'codex';
assistants: {
claude: Pick<ClaudeAssistantDefaults, 'model'>;
codex: Pick<AssistantDefaults, 'model' | 'modelReasoningEffort' | 'webSearchMode'>;
};
assistant: string;
assistants: ProviderDefaultsMap;
streaming: {
telegram: 'stream' | 'batch';
discord: 'stream' | 'batch';

@ -135,4 +135,46 @@ describe('SqliteAdapter', () => {
).rejects.toThrow('does not support RETURNING clause on UPDATE/DELETE');
});
});
describe('datetime() chronological vs lexical comparison', () => {
// Documents the SQLite-specific bug fixed in getActiveWorkflowRunByPath.
// `started_at` is TEXT in "YYYY-MM-DD HH:MM:SS" format. Comparing it
// directly to an ISO param "YYYY-MM-DDTHH:MM:SS.mmmZ" with `<` is
// LEXICAL: char 11 is space (0x20) in the column vs T (0x54) in the
// param, so every column value lex-sorts before every ISO param,
// making the comparison ALWAYS true regardless of actual time.
//
// Wrapping both sides in datetime() forces chronological comparison.
test('lexical comparison gives wrong answer for SQLite stored format vs ISO param', async () => {
db = createTestDb();
// Column-format value (afternoon) is chronologically AFTER the ISO
// param (morning), but lex compares char-11 (space < T) → wrong.
const result = await db.query<{ broken: number }>(
`SELECT ('2026-04-14 12:00:00' < $1) AS broken`,
['2026-04-14T10:00:00.000Z']
);
// Expected by chronology: FALSE. Lex says: TRUE.
expect(result.rows[0].broken).toBe(1);
});
test('datetime() wrap on both sides gives chronological comparison', async () => {
db = createTestDb();
const result = await db.query<{ correct: number }>(
`SELECT (datetime('2026-04-14 12:00:00') < datetime($1)) AS correct`,
['2026-04-14T10:00:00.000Z']
);
// 12:00 < 10:00 is FALSE — datetime() comparison agrees with reality.
expect(result.rows[0].correct).toBe(0);
});
test('datetime() handles equality across formats', async () => {
db = createTestDb();
const result = await db.query<{ equal: number }>(
`SELECT (datetime('2026-04-14 10:00:00') = datetime($1)) AS equal`,
['2026-04-14T10:00:00.000Z']
);
expect(result.rows[0].equal).toBe(1);
});
});
});

@ -215,22 +215,6 @@ export class SqliteAdapter implements IDatabase {
} catch (e: unknown) {
getLog().warn({ err: e as Error }, 'db.sqlite_migration_session_columns_failed');
}
// Codebases columns (added in #983 — env-leak gate consent bit)
try {
const cbCols = this.db.prepare("PRAGMA table_info('remote_agent_codebases')").all() as {
name: string;
}[];
const cbColNames = new Set(cbCols.map(c => c.name));
if (!cbColNames.has('allow_env_keys')) {
this.db.run(
'ALTER TABLE remote_agent_codebases ADD COLUMN allow_env_keys INTEGER DEFAULT 0'
);
}
} catch (e: unknown) {
getLog().warn({ err: e as Error }, 'db.sqlite_migration_codebases_columns_failed');
}
}
/**
@ -252,7 +236,6 @@ export class SqliteAdapter implements IDatabase {
default_cwd TEXT NOT NULL,
default_branch TEXT DEFAULT 'main',
ai_assistant_type TEXT DEFAULT 'claude',
allow_env_keys INTEGER DEFAULT 0,
commands TEXT DEFAULT '{}',
created_at TEXT DEFAULT (datetime('now')),
updated_at TEXT DEFAULT (datetime('now'))

@ -22,7 +22,6 @@ import {
findCodebaseByDefaultCwd,
findCodebaseByName,
updateCodebase,
updateCodebaseAllowEnvKeys,
deleteCodebase,
} from './codebases';
@ -37,7 +36,6 @@ describe('codebases', () => {
repository_url: 'https://github.com/user/repo',
default_cwd: '/workspace/test-project',
ai_assistant_type: 'claude',
allow_env_keys: false,
commands: { plan: { path: '.claude/commands/plan.md', description: 'Plan feature' } },
created_at: new Date(),
updated_at: new Date(),
@ -56,8 +54,8 @@ describe('codebases', () => {
expect(result).toEqual(mockCodebase);
expect(mockQuery).toHaveBeenCalledWith(
'INSERT INTO remote_agent_codebases (name, repository_url, default_cwd, ai_assistant_type, allow_env_keys) VALUES ($1, $2, $3, $4, $5) RETURNING *',
['test-project', 'https://github.com/user/repo', '/workspace/test-project', 'claude', false]
'INSERT INTO remote_agent_codebases (name, repository_url, default_cwd, ai_assistant_type) VALUES ($1, $2, $3, $4) RETURNING *',
['test-project', 'https://github.com/user/repo', '/workspace/test-project', 'claude']
);
});
@ -75,8 +73,8 @@ describe('codebases', () => {
expect(result).toEqual(codebaseWithoutOptional);
expect(mockQuery).toHaveBeenCalledWith(
'INSERT INTO remote_agent_codebases (name, repository_url, default_cwd, ai_assistant_type, allow_env_keys) VALUES ($1, $2, $3, $4, $5) RETURNING *',
['test-project', null, '/workspace/test-project', 'claude', false]
'INSERT INTO remote_agent_codebases (name, repository_url, default_cwd, ai_assistant_type) VALUES ($1, $2, $3, $4) RETURNING *',
['test-project', null, '/workspace/test-project', 'claude']
);
});
@ -191,6 +189,22 @@ describe('codebases', () => {
// Original frozen object should be unchanged
expect(frozenCommands).not.toHaveProperty('new-command');
});
test('throws on corrupt JSON string (SQLite TEXT column)', async () => {
mockQuery.mockResolvedValueOnce(createQueryResult([{ commands: '{not valid json' }]));
await expect(getCodebaseCommands('codebase-123')).rejects.toThrow(
/Corrupt commands JSON for codebase codebase-123/
);
});
test('parses valid JSON string from SQLite TEXT column', async () => {
const commands = { plan: { path: 'plan.md', description: 'Plan' } };
mockQuery.mockResolvedValueOnce(createQueryResult([{ commands: JSON.stringify(commands) }]));
const result = await getCodebaseCommands('codebase-123');
expect(result).toEqual(commands);
});
});
describe('registerCommand', () => {
@ -299,7 +313,6 @@ describe('codebases', () => {
name: 'test-repo',
default_cwd: '/workspace/test-repo',
ai_assistant_type: 'claude',
allow_env_keys: false,
repository_url: null,
commands: {},
created_at: new Date(),
@ -399,26 +412,6 @@ describe('codebases', () => {
});
});
describe('updateCodebaseAllowEnvKeys', () => {
test('flips the consent bit', async () => {
mockQuery.mockResolvedValueOnce(createQueryResult([], 1));
await updateCodebaseAllowEnvKeys('codebase-123', true);
expect(mockQuery).toHaveBeenCalledWith(
'UPDATE remote_agent_codebases SET allow_env_keys = $1, updated_at = NOW() WHERE id = $2',
[true, 'codebase-123']
);
});
test('throws when codebase not found', async () => {
mockQuery.mockResolvedValueOnce(createQueryResult([], 0));
await expect(updateCodebaseAllowEnvKeys('missing', false)).rejects.toThrow(
'Codebase missing not found'
);
});
});
describe('deleteCodebase', () => {
test('should unlink sessions, conversations, and delete codebase', async () => {
// First call: unlink sessions

@ -17,13 +17,11 @@ export async function createCodebase(data: {
repository_url?: string;
default_cwd: string;
ai_assistant_type?: string;
allow_env_keys?: boolean;
}): Promise<Codebase> {
const assistantType = data.ai_assistant_type ?? 'claude';
const allowEnvKeys = data.allow_env_keys ?? false;
const result = await pool.query<Codebase>(
'INSERT INTO remote_agent_codebases (name, repository_url, default_cwd, ai_assistant_type, allow_env_keys) VALUES ($1, $2, $3, $4, $5) RETURNING *',
[data.name, data.repository_url ?? null, data.default_cwd, assistantType, allowEnvKeys]
'INSERT INTO remote_agent_codebases (name, repository_url, default_cwd, ai_assistant_type) VALUES ($1, $2, $3, $4) RETURNING *',
[data.name, data.repository_url ?? null, data.default_cwd, assistantType]
);
if (!result.rows[0]) {
throw new Error('Failed to create codebase: INSERT succeeded but no row returned');
@ -61,9 +59,12 @@ export async function getCodebaseCommands(
if (typeof raw === 'string') {
try {
parsed = JSON.parse(raw);
} catch {
getLog().error({ codebaseId: id, raw }, 'db.codebase_commands_json_parse_failed');
return {};
} catch (err) {
getLog().error({ codebaseId: id, raw, err }, 'db.codebase_commands_json_parse_failed');
throw new Error(
`Corrupt commands JSON for codebase ${id}: unable to parse stored data. ` +
`Run UPDATE remote_agent_codebases SET commands = '{}' WHERE id = '${id}' to reset.`
);
}
} else {
parsed = raw ?? {};
@ -158,21 +159,6 @@ export async function updateCodebase(
}
}
/**
* Flip the `allow_env_keys` consent bit for an existing codebase.
* Throws when the codebase does not exist.
*/
export async function updateCodebaseAllowEnvKeys(id: string, allowEnvKeys: boolean): Promise<void> {
const dialect = getDialect();
const result = await pool.query(
`UPDATE remote_agent_codebases SET allow_env_keys = $1, updated_at = ${dialect.now()} WHERE id = $2`,
[allowEnvKeys, id]
);
if ((result.rowCount ?? 0) === 0) {
throw new Error(`Codebase ${id} not found`);
}
}
export async function listCodebases(): Promise<readonly Codebase[]> {
const result = await pool.query<Codebase>(
'SELECT * FROM remote_agent_codebases ORDER BY name ASC'

@ -3,6 +3,7 @@ import { createQueryResult, mockPostgresDialect } from '../test/mocks/database';
import type { MessageRow } from './messages';
const mockQuery = mock(() => Promise.resolve(createQueryResult([])));
const mockGetDatabaseType = mock(() => 'postgresql' as const);
// Mock the connection module before importing the module under test
mock.module('./connection', () => ({
@ -10,9 +11,22 @@ mock.module('./connection', () => ({
query: mockQuery,
},
getDialect: () => mockPostgresDialect,
getDatabaseType: mockGetDatabaseType,
}));
import { addMessage, listMessages } from './messages';
// Mock @archon/paths to avoid lazy logger initialization issues in tests
mock.module('@archon/paths', () => ({
createLogger: mock(() => ({
fatal: mock(() => undefined),
error: mock(() => undefined),
warn: mock(() => undefined),
info: mock(() => undefined),
debug: mock(() => undefined),
trace: mock(() => undefined),
})),
}));
import { addMessage, listMessages, getRecentWorkflowResultMessages } from './messages';
describe('messages', () => {
beforeEach(() => {
@ -121,4 +135,76 @@ describe('messages', () => {
expect(mockQuery).toHaveBeenCalledWith(expect.any(String), ['conv-456', 50]);
});
});
describe('getRecentWorkflowResultMessages', () => {
beforeEach(() => {
mockGetDatabaseType.mockClear();
});
test('uses PostgreSQL JSON extraction syntax when dbType is postgresql', async () => {
mockGetDatabaseType.mockReturnValueOnce('postgresql');
mockQuery.mockResolvedValueOnce(createQueryResult([]));
await getRecentWorkflowResultMessages('conv-1');
const sql = mockQuery.mock.calls[0]?.[0] as string;
expect(sql).toContain("metadata->>'workflowResult'");
expect(sql).not.toContain('json_extract');
});
test('uses SQLite JSON extraction syntax when dbType is sqlite', async () => {
mockGetDatabaseType.mockReturnValueOnce('sqlite');
mockQuery.mockResolvedValueOnce(createQueryResult([]));
await getRecentWorkflowResultMessages('conv-1');
const sql = mockQuery.mock.calls[0]?.[0] as string;
expect(sql).toContain("json_extract(metadata, '$.workflowResult')");
expect(sql).not.toContain("->>'" + 'workflowResult');
});
test('passes correct parameters: conversationId and limit', async () => {
mockGetDatabaseType.mockReturnValueOnce('postgresql');
mockQuery.mockResolvedValueOnce(createQueryResult([]));
await getRecentWorkflowResultMessages('conv-42', 5);
expect(mockQuery).toHaveBeenCalledWith(expect.any(String), ['conv-42', 5]);
});
test('default limit is 3', async () => {
mockGetDatabaseType.mockReturnValueOnce('postgresql');
mockQuery.mockResolvedValueOnce(createQueryResult([]));
await getRecentWorkflowResultMessages('conv-1');
expect(mockQuery).toHaveBeenCalledWith(expect.any(String), ['conv-1', 3]);
});
test('returns empty array on query error (non-throwing contract)', async () => {
mockGetDatabaseType.mockReturnValueOnce('postgresql');
mockQuery.mockRejectedValueOnce(new Error('connection refused'));
const result = await getRecentWorkflowResultMessages('conv-1');
expect(result).toEqual([]);
});
test('returns rows from successful query', async () => {
const row: MessageRow = {
id: 'msg-1',
conversation_id: 'conv-1',
role: 'assistant',
content: 'Workflow summary here.',
metadata: '{"workflowResult":{"workflowName":"plan","runId":"run-1"}}',
created_at: '2026-01-01T00:00:00Z',
};
mockGetDatabaseType.mockReturnValueOnce('postgresql');
mockQuery.mockResolvedValueOnce(createQueryResult([row]));
const result = await getRecentWorkflowResultMessages('conv-1');
expect(result).toEqual([row]);
});
});
});

@ -1,7 +1,7 @@
/**
* Database operations for conversation messages (Web UI history)
* Database operations for conversation messages (Web UI history and orchestrator prompt enrichment)
*/
import { pool, getDialect } from './connection';
import { pool, getDialect, getDatabaseType } from './connection';
import { createLogger } from '@archon/paths';
/** Lazy-initialized logger (deferred so test mocks can intercept createLogger) */
@ -16,7 +16,7 @@ export interface MessageRow {
conversation_id: string;
role: 'user' | 'assistant';
content: string;
metadata: string; // JSON string - parsed by frontend
metadata: string; // JSON string - parsed by frontend and server-side (orchestrator prompt enrichment)
created_at: string;
}
@ -64,3 +64,34 @@ export async function listMessages(
);
return result.rows;
}
/**
* Get recent messages with workflowResult metadata for a conversation.
* Used to inject workflow context into the orchestrator prompt.
* Non-throwing: returns an empty array on error.
*/
export async function getRecentWorkflowResultMessages(
conversationId: string,
limit = 3
): Promise<readonly MessageRow[]> {
const dbType = getDatabaseType();
const metadataFilter =
dbType === 'postgresql'
? "(metadata->>'workflowResult') IS NOT NULL"
: "json_extract(metadata, '$.workflowResult') IS NOT NULL";
try {
const result = await pool.query<Pick<MessageRow, 'id' | 'content' | 'metadata'>>(
`SELECT id, content, metadata FROM remote_agent_messages
WHERE conversation_id = $1
AND ${metadataFilter}
ORDER BY created_at DESC
LIMIT $2`,
[conversationId, limit]
);
return result.rows as MessageRow[];
} catch (error) {
const err = error as Error;
getLog().warn({ err, conversationId }, 'db.workflow_result_messages_query_failed');
return [];
}
}
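Since `MessageRow.metadata` is a JSON string, a caller on the orchestrator side has to parse it before enrichment. A hedged consumer sketch — `extractWorkflowSummaries` and the metadata shape are illustrative, not part of the actual orchestrator code:

```typescript
// Illustrative metadata shape matching the test fixture in this diff.
interface WorkflowResultMeta {
  workflowResult?: { workflowName: string; runId: string };
}

// Turn fetched rows into prompt-ready summary lines, tolerating corrupt
// metadata (enrichment is best-effort, mirroring the query's non-throwing
// contract).
function extractWorkflowSummaries(
  rows: readonly { content: string; metadata: string }[]
): string[] {
  const summaries: string[] = [];
  for (const row of rows) {
    try {
      const meta = JSON.parse(row.metadata) as WorkflowResultMeta;
      if (meta.workflowResult) {
        summaries.push(`[${meta.workflowResult.workflowName}] ${row.content}`);
      }
    } catch {
      // Skip rows with unparseable metadata rather than failing enrichment.
    }
  }
  return summaries;
}

const summaries = extractWorkflowSummaries([
  {
    content: 'Workflow summary here.',
    metadata: '{"workflowResult":{"workflowName":"plan","runId":"run-1"}}',
  },
  { content: 'broken row', metadata: '{not json' },
]);
console.log(summaries); // → ['[plan] Workflow summary here.']
```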

@ -559,6 +559,60 @@ describe('workflows database', () => {
expect(params).toEqual(['/repo/path']);
});
test('includes pending rows within the stale-pending age window', async () => {
mockQuery.mockResolvedValueOnce(createQueryResult([]));
await getActiveWorkflowRunByPath('/repo/path');
const [query] = mockQuery.mock.calls[0] as [string, unknown[]];
// Fresh `pending` counts as active so the lock is held immediately
// after pre-create — without this, two near-simultaneous dispatches
// both pass the guard.
expect(query).toContain("status = 'pending'");
// Age window cutoff prevents orphaned pending rows (from crashed
// dispatches) from permanently blocking a path.
expect(query).toMatch(/started_at >.*INTERVAL.*milliseconds/);
});
test('excludes self and applies older-wins tiebreaker when self is provided', async () => {
mockQuery.mockResolvedValueOnce(createQueryResult([]));
const startedAt = new Date('2026-04-14T10:00:00Z');
await getActiveWorkflowRunByPath('/repo/path', { id: 'self-id', startedAt });
const [query, params] = mockQuery.mock.calls[0] as [string, unknown[]];
expect(query).toContain('id != $2');
// PostgreSQL branch: explicit `::timestamptz` cast on the param so
// the comparison is chronological, not lexical. SQLite branch wraps
// both sides in datetime() — covered by tests in adapters/sqlite.test.ts
// because this suite mocks getDatabaseType as 'postgresql'.
expect(query).toContain('started_at < $3::timestamptz');
expect(query).toContain('started_at = $3::timestamptz AND id < $2');
// selfStartedAt serialized to ISO — bun:sqlite rejects Date bindings.
expect(params).toEqual(['/repo/path', 'self-id', startedAt.toISOString()]);
});
test('skips self exclusion + tiebreaker when self is omitted (no caller context)', async () => {
mockQuery.mockResolvedValueOnce(createQueryResult([]));
await getActiveWorkflowRunByPath('/repo/path');
const [query, params] = mockQuery.mock.calls[0] as [string, unknown[]];
// Without `self`, neither the id-exclusion nor the tiebreaker apply.
expect(query).not.toContain('id !=');
expect(query).not.toContain('started_at <');
expect(params).toEqual(['/repo/path']);
});
test('orders by (started_at ASC, id ASC) so older-wins is deterministic', async () => {
mockQuery.mockResolvedValueOnce(createQueryResult([]));
await getActiveWorkflowRunByPath('/repo/path');
const [query] = mockQuery.mock.calls[0] as [string, unknown[]];
expect(query).toContain('ORDER BY started_at ASC, id ASC');
});
test('returns null when no active run on path', async () => {
mockQuery.mockResolvedValueOnce(createQueryResult([]));
@ -671,6 +725,22 @@ describe('workflows database', () => {
expect(selectParams).toEqual(['workflow-run-123']);
});
test('refreshes started_at to NOW so resumed row competes fairly in the path-lock tiebreaker', async () => {
// Without this refresh, a resumed row carries its original (potentially
// hours-old) started_at and sorts ahead of any currently-active holder
// in the older-wins tiebreaker — slipping past the lock and causing
// two active workflows on the same working_path.
mockQuery.mockResolvedValueOnce(createQueryResult([], 1));
mockQuery.mockResolvedValueOnce(
createQueryResult([{ ...mockWorkflowRun, status: 'running' as const }])
);
await resumeWorkflowRun('workflow-run-123');
const [updateQuery] = mockQuery.mock.calls[0] as [string, unknown[]];
expect(updateQuery).toContain('started_at = NOW()');
});
test('throws when no row matched (run not found)', async () => {
// UPDATE returns rowCount 0
mockQuery.mockResolvedValueOnce(createQueryResult([], 0));

@ -184,13 +184,76 @@ export async function getPausedWorkflowRun(conversationId: string): Promise<Work
}
}
export async function getActiveWorkflowRunByPath(workingPath: string): Promise<WorkflowRun | null> {
/**
* Find the workflow run currently holding the lock on `workingPath`.
*
* The lock is held by any row in `(running, paused)` or `pending` younger
* than `STALE_PENDING_AGE_MS` (orphaned pre-creates beyond that window are
* ignored; they're from crashed or resume-replaced dispatches).
*
* When called from a dispatch that already pre-created its own row, pass
* `excludeId` and `selfStartedAt` so:
* 1. Self is never returned.
* 2. If two dispatches both have rows, the deterministic older-wins
* tiebreaker `(started_at, id)` ensures both agree on which is "first."
* The newer dispatch sees the older row and aborts; the older dispatch
* sees nothing.
*
* Returns the holding row, or null if the path is free.
*/
export const STALE_PENDING_AGE_MS = 5 * 60 * 1000; // 5 minutes
export async function getActiveWorkflowRunByPath(
workingPath: string,
self?: { id: string; startedAt: Date }
): Promise<WorkflowRun | null> {
const isPostgres = getDatabaseType() === 'postgresql';
const stalePendingCutoff = isPostgres
? `NOW() - INTERVAL '${String(STALE_PENDING_AGE_MS)} milliseconds'`
: `datetime('now', '-${String(Math.floor(STALE_PENDING_AGE_MS / 1000))} seconds')`;
// Build params + clauses dynamically. Self exclusion + tiebreaker travel
// together — the tiebreaker references both ids and timestamps.
const params: unknown[] = [workingPath];
const clauses: string[] = [
'working_path = $1',
`(status IN ('running', 'paused') OR (status = 'pending' AND started_at > ${stalePendingCutoff}))`,
];
if (self !== undefined) {
params.push(self.id);
clauses.push(`id != $${String(params.length)}`);
}
if (self !== undefined) {
// Older-wins tiebreaker. (started_at, id) is a total order so both
// dispatches always agree on which is "first." Without this, two rows
// with similar timestamps could mutually see each other and both abort.
//
// Serialize Date to ISO string — bun:sqlite rejects Date bindings.
//
// Format-aware comparison:
// PostgreSQL: started_at is TIMESTAMPTZ; cast the ISO param to
// timestamptz so the comparison is chronological, not lexical.
// SQLite: started_at is TEXT in "YYYY-MM-DD HH:MM:SS" format. Our
// ISO param has "YYYY-MM-DDTHH:MM:SS.mmmZ". Lexical comparison is
// WRONG: char 11 is space (0x20) in the column vs T (0x54) in the
// param, so every column value lex-sorts before every ISO param —
// making `started_at < $param` always TRUE regardless of actual
// time. Wrap both sides in datetime() to force chronological
// comparison via SQLite's date/time functions.
params.push(self.startedAt.toISOString());
const startedAtParam = `$${String(params.length)}`;
const idParam = `$${String(params.length - 1)}`;
const colExpr = isPostgres ? 'started_at' : 'datetime(started_at)';
const paramExpr = isPostgres ? `${startedAtParam}::timestamptz` : `datetime(${startedAtParam})`;
clauses.push(`(${colExpr} < ${paramExpr} OR (${colExpr} = ${paramExpr} AND id < ${idParam}))`);
}
try {
const result = await pool.query<WorkflowRun>(
`SELECT * FROM remote_agent_workflow_runs
WHERE working_path = $1 AND status IN ('running', 'paused')
ORDER BY started_at DESC LIMIT 1`,
[workingPath]
WHERE ${clauses.join(' AND ')}
ORDER BY started_at ASC, id ASC LIMIT 1`,
params
);
const row = result.rows[0];
return row ? normalizeWorkflowRun(row) : null;
@ -309,9 +372,23 @@ export async function resumeWorkflowRun(id: string): Promise<WorkflowRun> {
// Each phase has its own try/catch to avoid string-sniffing own errors in a shared catch.
let updateResult: Awaited<ReturnType<typeof pool.query>>;
try {
// Refresh started_at to NOW so the resumed row competes fairly with
// currently-active rows in getActiveWorkflowRunByPath's older-wins
// tiebreaker. Without this, a resumed row carries its original
// (potentially hours-old) started_at and would sort ahead of any
// currently-running holder, slipping past the path lock and causing
// two active workflows on the same working_path.
//
// We accept losing the original creation time here — `started_at` for
// an active row semantically means "when did this active phase start."
// The original creation time can be recovered from workflow_events
// history if needed for analytics.
updateResult = await pool.query(
`UPDATE remote_agent_workflow_runs
SET status = 'running', completed_at = NULL, last_activity_at = ${dialect.now()}
SET status = 'running',
completed_at = NULL,
started_at = ${dialect.now()},
last_activity_at = ${dialect.now()}
WHERE id = $1`,
[id]
);

@ -20,7 +20,6 @@ const mockCreateCodebase = mock(() =>
repository_url: 'https://github.com/owner/repo',
default_cwd: '/home/test/.archon/workspaces/owner/repo/source',
ai_assistant_type: 'claude',
allow_env_keys: false,
commands: {},
created_at: new Date(),
updated_at: new Date(),
@ -67,20 +66,6 @@ mock.module('../utils/commands', () => ({
findMarkdownFilesRecursive: mockFindMarkdownFilesRecursive,
}));
// ── env-leak-scanner mock ───────────────────────────────────────────────────
class MockEnvLeakError extends Error {
constructor(public report: unknown) {
super('Cannot add codebase — /test/path contains keys that will leak into AI subprocesses');
this.name = 'EnvLeakError';
}
}
const mockScanPathForSensitiveKeys = mock(() => ({ path: '', findings: [] }));
mock.module('../utils/env-leak-scanner', () => ({
scanPathForSensitiveKeys: mockScanPathForSensitiveKeys,
EnvLeakError: MockEnvLeakError,
}));
// ── Import module under test AFTER mocks are registered ────────────────────
import { cloneRepository, registerRepository } from './clone';
@ -118,7 +103,6 @@ function clearMocks(): void {
mockFindCodebaseByName.mockReset();
mockUpdateCodebase.mockReset();
mockFindMarkdownFilesRecursive.mockReset();
mockScanPathForSensitiveKeys.mockReset();
mockLogger.info.mockClear();
mockLogger.debug.mockClear();
mockLogger.warn.mockClear();
@ -132,7 +116,6 @@ function clearMocks(): void {
mockFindCodebaseByName.mockResolvedValue(null);
mockUpdateCodebase.mockResolvedValue(undefined);
mockFindMarkdownFilesRecursive.mockResolvedValue([]);
mockScanPathForSensitiveKeys.mockReturnValue({ path: '', findings: [] });
}
afterAll(() => {
@ -157,7 +140,6 @@ function makeCodebase(
repository_url: 'https://github.com/owner/repo',
default_cwd: '/home/test/.archon/workspaces/owner/repo/source',
ai_assistant_type: 'claude',
allow_env_keys: false,
commands: {},
created_at: new Date(),
updated_at: new Date(),
@ -948,33 +930,4 @@ describe('RegisterResult shape', () => {
expect(result.alreadyExisted).toBe(true);
expect(result.commandCount).toBe(0);
});
describe('env leak gate', () => {
test('throws EnvLeakError when scanner finds sensitive keys and allowEnvKeys is false', async () => {
mockScanPathForSensitiveKeys.mockReturnValueOnce({
path: '/home/test/.archon/workspaces/owner/repo/source',
findings: [{ file: '.env', keys: ['ANTHROPIC_API_KEY'] }],
});
await expect(cloneRepository('https://github.com/owner/repo')).rejects.toThrow(
'Cannot add codebase'
);
});
test('does not throw when allowEnvKeys is true, even with scanner findings present', async () => {
mockCreateCodebase.mockResolvedValueOnce(makeCodebase() as ReturnType<typeof makeCodebase>);
// Scanner is still called for the audit-log payload (files/keys), but the
// gate must NOT throw — the per-call grant is the bypass.
mockScanPathForSensitiveKeys.mockReturnValueOnce({
path: '/home/test/.archon/workspaces/owner/repo/source',
findings: [{ file: '.env', keys: ['ANTHROPIC_API_KEY'] }],
});
const result = await cloneRepository('https://github.com/owner/repo', true);
expect(result.codebaseId).toBe('codebase-uuid-1');
// Scanner is called once — for the audit log, not as a gate
expect(mockScanPathForSensitiveKeys).toHaveBeenCalledTimes(1);
});
});
});

@ -16,12 +16,6 @@ import {
parseOwnerRepo,
} from '@archon/paths';
import { findMarkdownFilesRecursive } from '../utils/commands';
import {
scanPathForSensitiveKeys,
EnvLeakError,
type LeakErrorContext,
} from '../utils/env-leak-scanner';
import { loadConfig } from '../config/config-loader';
import { createLogger } from '@archon/paths';
/** Lazy-initialized logger (deferred so test mocks can intercept createLogger) */
@ -46,55 +40,14 @@ export interface RegisterResult {
async function registerRepoAtPath(
targetPath: string,
name: string,
repositoryUrl: string | null,
allowEnvKeys = false,
context: LeakErrorContext = 'register-ui'
repositoryUrl: string | null
): Promise<RegisterResult> {
// Scan for sensitive keys in auto-loaded .env files before registering.
// Two bypass paths exist (in order of precedence):
// 1. Per-call `allowEnvKeys=true` (Web UI checkbox or CLI --allow-env-keys)
// 2. Config-level `allow_target_repo_keys: true` (global YAML)
// When the per-call bypass is used we still emit an audit-log entry so the
// grant has a permanent breadcrumb (parity with the PATCH route's
// `env_leak_consent_granted` log).
if (!allowEnvKeys) {
const merged = await loadConfig(targetPath);
if (!merged.allowTargetRepoKeys) {
const report = scanPathForSensitiveKeys(targetPath);
if (report.findings.length > 0) {
throw new EnvLeakError(report, context);
}
}
} else {
// Per-call grant — emit audit log mirroring the PATCH route shape so the
// CLI/UI add-with-consent paths leave the same breadcrumbs.
let files: string[] = [];
let keys: string[] = [];
let scanStatus: 'ok' | 'skipped' = 'ok';
try {
const report = scanPathForSensitiveKeys(targetPath);
files = report.findings.map(f => f.file);
keys = Array.from(new Set(report.findings.flatMap(f => f.keys)));
} catch (scanErr) {
scanStatus = 'skipped';
getLog().warn({ err: scanErr, path: targetPath }, 'env_leak_consent_scan_skipped');
}
const actor = context === 'register-cli' ? 'user-cli' : 'user-ui';
getLog().warn(
{
name,
path: targetPath,
files,
keys,
scanStatus,
actor,
},
'env_leak_consent_granted'
);
}
// Auto-detect assistant type based on folder structure
let suggestedAssistant = 'claude';
// Auto-detect assistant type based on SDK folder conventions.
// Built-in providers use well-known folders (.claude/, .codex/).
// Falls back to first registered built-in provider if no folder detected.
const { getRegisteredProviders } = await import('@archon/providers');
const defaultProvider = getRegisteredProviders().find(p => p.builtIn)?.id ?? 'claude';
let suggestedAssistant = defaultProvider;
const codexFolder = join(targetPath, '.codex');
const claudeFolder = join(targetPath, '.claude');
@ -108,7 +61,7 @@ async function registerRepoAtPath(
suggestedAssistant = 'claude';
getLog().debug({ path: claudeFolder }, 'assistant_detected_claude');
} catch {
getLog().debug('assistant_default_claude');
getLog().debug({ provider: defaultProvider }, 'assistant_default_from_registry');
}
}
@ -173,7 +126,6 @@ async function registerRepoAtPath(
repository_url: repositoryUrl ?? undefined,
default_cwd: targetPath,
ai_assistant_type: suggestedAssistant,
allow_env_keys: allowEnvKeys,
});
// Auto-load commands if found
@ -242,15 +194,11 @@ function normalizeRepoUrl(rawUrl: string): {
* Local paths (starting with /, ~, or .) are delegated to registerRepository
* to avoid wrong owner/repo naming. See #383 for broader rethink.
*/
export async function cloneRepository(
repoUrl: string,
allowEnvKeys?: boolean,
context: LeakErrorContext = 'register-ui'
): Promise<RegisterResult> {
export async function cloneRepository(repoUrl: string): Promise<RegisterResult> {
// Local paths should be registered (symlink), not cloned (copied)
if (repoUrl.startsWith('/') || repoUrl.startsWith('~') || repoUrl.startsWith('.')) {
const resolvedPath = repoUrl.startsWith('~') ? expandTilde(repoUrl) : resolve(repoUrl);
return registerRepository(resolvedPath, allowEnvKeys, context);
return registerRepository(resolvedPath);
}
const { workingUrl, ownerName, repoName, targetPath } = normalizeRepoUrl(repoUrl);
@ -331,13 +279,7 @@ export async function cloneRepository(
await execFileAsync('git', ['config', '--global', '--add', 'safe.directory', targetPath]);
getLog().debug({ path: targetPath }, 'safe_directory_added');
const result = await registerRepoAtPath(
targetPath,
`${ownerName}/${repoName}`,
workingUrl,
allowEnvKeys,
context
);
const result = await registerRepoAtPath(targetPath, `${ownerName}/${repoName}`, workingUrl);
getLog().info({ url: workingUrl, targetPath }, 'clone_completed');
return result;
}
@ -345,11 +287,7 @@ export async function cloneRepository(
/**
* Register an existing local repository in the database (no git clone).
*/
export async function registerRepository(
localPath: string,
allowEnvKeys?: boolean,
context: LeakErrorContext = 'register-ui'
): Promise<RegisterResult> {
export async function registerRepository(localPath: string): Promise<RegisterResult> {
// Validate path exists and is a git repo
try {
await execFileAsync('git', ['-C', localPath, 'rev-parse', '--git-dir']);
@ -415,5 +353,5 @@ export async function registerRepository(
);
// default_cwd is the real local path (not the symlink)
return registerRepoAtPath(localPath, name, remoteUrl, allowEnvKeys, context);
return registerRepoAtPath(localPath, name, remoteUrl);
}
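The registry-backed default above replaces a hardcoded `'claude'`. A minimal sketch of that fallback, with a local `Provider` shape and `registry` array standing in for the real `@archon/providers` exports:

```typescript
// Sketch of picking the default assistant from a provider registry instead
// of hardcoding 'claude'. The Provider shape and registry contents are
// illustrative assumptions, not the real @archon/providers types.
interface Provider {
  id: string;
  builtIn: boolean;
}

function defaultAssistant(registry: Provider[]): string {
  // First registered built-in provider wins; 'claude' is the last resort.
  return registry.find(p => p.builtIn)?.id ?? 'claude';
}

const registry: Provider[] = [
  { id: 'codex', builtIn: true },
  { id: 'custom', builtIn: false },
];
console.log(defaultAssistant(registry)); // 'codex'
console.log(defaultAssistant([])); // 'claude'
```

Folder detection (`.claude/`, `.codex/`) still overrides this default when a well-known SDK folder exists at the target path.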


@ -511,7 +511,6 @@ describe('CommandHandler', () => {
repository_url: 'https://github.com/user/my-repo',
default_cwd: '/workspace/my-repo',
ai_assistant_type: 'claude',
allow_env_keys: false,
commands: {},
created_at: new Date(),
updated_at: new Date(),
@ -567,7 +566,6 @@ describe('CommandHandler', () => {
repository_url: 'https://github.com/owner/repo',
default_cwd: '/workspace/repo',
ai_assistant_type: 'claude',
allow_env_keys: false,
commands: {},
created_at: new Date(),
updated_at: new Date(),
@ -606,7 +604,6 @@ describe('CommandHandler', () => {
repository_url: 'https://github.com/owner/orphaned-repo',
default_cwd: '/workspace/orphaned-repo',
ai_assistant_type: 'claude',
allow_env_keys: false,
commands: {},
created_at: new Date(),
updated_at: new Date(),
@ -721,7 +718,6 @@ describe('CommandHandler', () => {
repository_url: 'https://github.com/user/my-repo',
default_cwd: '/workspace/my-repo',
ai_assistant_type: 'claude',
allow_env_keys: false,
commands: {},
created_at: new Date(),
updated_at: new Date(),


@ -24,8 +24,6 @@ export {
type IWebPlatformAdapter,
isWebAdapter,
type MessageMetadata,
type MessageChunk,
type IAssistantClient,
} from './types';
// =============================================================================
@ -52,13 +50,6 @@ export * as messageDb from './db/messages';
// Re-export SessionNotFoundError for error handling
export { SessionNotFoundError } from './db/sessions';
// =============================================================================
// AI Clients
// =============================================================================
export { ClaudeClient } from './clients/claude';
export { CodexClient } from './clients/codex';
export { getAssistantClient } from './clients/factory';
// =============================================================================
// Workflows
// =============================================================================
@ -145,15 +136,6 @@ export { toError } from './utils/error';
// Credential sanitization
export { sanitizeCredentials, sanitizeError } from './utils/credential-sanitizer';
// Env leak scanner
export {
EnvLeakError,
scanPathForSensitiveKeys,
formatLeakError,
type LeakReport,
type LeakErrorContext,
} from './utils/env-leak-scanner';
// GitHub GraphQL
export { getLinkedIssueNumbers } from './utils/github-graphql';


@ -34,6 +34,8 @@ export interface ApprovalOperationResult {
workingPath: string | null;
userMessage: string | null;
codebaseId: string | null;
/** Internal DB UUID — resolve via getConversationById() to get platform_conversation_id. */
conversationId: string;
type: 'interactive_loop' | 'approval_gate';
}
@ -42,6 +44,8 @@ export interface RejectionOperationResult {
workingPath: string | null;
userMessage: string | null;
codebaseId: string | null;
/** Internal DB UUID — resolve via getConversationById() to get platform_conversation_id. */
conversationId: string;
/** true = run cancelled; false = transitioning to failed for retry (has onRejectPrompt) */
cancelled: boolean;
/** true when cancelled specifically because max rejection attempts were reached */
@ -168,6 +172,7 @@ export async function approveWorkflow(
workingPath: run.working_path,
userMessage: run.user_message,
codebaseId: run.codebase_id,
conversationId: run.conversation_id,
type: 'interactive_loop',
};
}
@ -204,6 +209,7 @@ export async function approveWorkflow(
workingPath: run.working_path,
userMessage: run.user_message,
codebaseId: run.codebase_id,
conversationId: run.conversation_id,
type: 'approval_gate',
};
}
@ -248,6 +254,7 @@ export async function rejectWorkflow(
workingPath: run.working_path,
userMessage: run.user_message,
codebaseId: run.codebase_id,
conversationId: run.conversation_id,
cancelled: true,
maxAttemptsReached: true,
};
@ -261,6 +268,7 @@ export async function rejectWorkflow(
workingPath: run.working_path,
userMessage: run.user_message,
codebaseId: run.codebase_id,
conversationId: run.conversation_id,
cancelled: false,
maxAttemptsReached: false,
};
@ -280,6 +288,7 @@ export async function rejectWorkflow(
workingPath: run.working_path,
userMessage: run.user_message,
codebaseId: run.codebase_id,
conversationId: run.conversation_id,
cancelled: true,
maxAttemptsReached: false,
};
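The doc comments above note that `conversationId` is the internal DB UUID, not the platform-facing id. A sketch of the two-step resolution a caller would do — `getConversationById` and the `Map`-backed table here are simplified stand-ins, not the real `db/conversations` module:

```typescript
// Sketch of resolving the internal UUID carried on approval/rejection
// results to the platform-facing conversation id. Shapes are assumptions.
interface Conversation {
  id: string; // internal DB UUID
  platform_conversation_id: string;
}

const table = new Map<string, Conversation>([
  ['uuid-1', { id: 'uuid-1', platform_conversation_id: 'chat-456' }],
]);

async function getConversationById(id: string): Promise<Conversation | null> {
  return table.get(id) ?? null;
}

async function platformIdFor(result: { conversationId: string }): Promise<string | null> {
  const conv = await getConversationById(result.conversationId);
  return conv?.platform_conversation_id ?? null;
}

platformIdFor({ conversationId: 'uuid-1' }).then(id => console.log(id)); // 'chat-456'
```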


@ -37,6 +37,17 @@ const mockExecuteWorkflow = mock(() => Promise.resolve());
const mockHandleCommand = mock(() =>
Promise.resolve({ success: true, message: 'ok', workflow: undefined })
);
const mockSendQuery = mock(async function* () {
yield { type: 'assistant', content: 'test response' };
yield { type: 'result', sessionId: 'session-1' };
});
const mockGetCodebaseEnvVars = mock(() => Promise.resolve({}));
const mockLoadConfig = mock(() =>
Promise.resolve({
assistants: { claude: {}, codex: {} },
envVars: {},
})
);
const mockLogger = createMockLogger();
@ -93,11 +104,17 @@ mock.module('@archon/workflows/executor', () => ({
executeWorkflow: mockExecuteWorkflow,
}));
mock.module('../clients/factory', () => ({
getAssistantClient: mock(() => ({
sendQuery: mock(async function* () {}),
mock.module('@archon/providers', () => ({
getAgentProvider: mock(() => ({
sendQuery: mockSendQuery,
getType: mock(() => 'claude'),
getCapabilities: mock(() => ({})),
})),
getProviderCapabilities: mock(() => ({ envInjection: true })),
}));
mock.module('../db/env-vars', () => ({
getCodebaseEnvVars: mockGetCodebaseEnvVars,
}));
mock.module('../utils/error-formatter', () => ({
@ -126,7 +143,7 @@ mock.module('../db/workflow-events', () => ({
}));
mock.module('../config/config-loader', () => ({
loadConfig: mock(() => Promise.resolve({})),
loadConfig: mockLoadConfig,
}));
mock.module('../services/title-generator', () => ({
@ -142,6 +159,16 @@ mock.module('./orchestrator', () => ({
mock.module('./prompt-builder', () => ({
buildOrchestratorPrompt: mock(() => 'orchestrator system prompt'),
buildProjectScopedPrompt: mock(() => 'project scoped system prompt'),
formatWorkflowContextSection: mock((results: unknown[]) =>
results.length > 0 ? '## Recent Workflow Results\n\n...' : ''
),
}));
const mockGetRecentWorkflowResultMessages = mock(() => Promise.resolve([]));
mock.module('../db/messages', () => ({
addMessage: mock(() => Promise.resolve()),
listMessages: mock(() => Promise.resolve([])),
getRecentWorkflowResultMessages: mockGetRecentWorkflowResultMessages,
}));
mock.module('@archon/isolation', () => ({
@ -181,7 +208,6 @@ function makeCodebase(name: string, id = `id-${name}`): Codebase {
repository_url: null,
default_cwd: `/repos/${name}`,
ai_assistant_type: 'claude',
allow_env_keys: false,
commands: {},
created_at: new Date(),
updated_at: new Date(),
@ -805,7 +831,6 @@ function makeCodebaseForSync() {
repository_url: 'https://github.com/test/repo',
default_cwd: '/repos/test-repo',
ai_assistant_type: 'claude',
allow_env_keys: false,
commands: {},
created_at: new Date(),
updated_at: new Date(),
@ -874,9 +899,19 @@ describe('discoverAllWorkflows — remote sync', () => {
mockToRepoPath.mockClear();
mockGetOrCreateConversation.mockReset();
mockGetCodebase.mockReset();
mockSendQuery.mockClear();
mockGetCodebaseEnvVars.mockReset();
mockLoadConfig.mockReset();
// Reset mocks between tests in this suite and restore safe defaults
mockGetOrCreateConversation.mockImplementation(() => Promise.resolve(null));
mockGetCodebase.mockImplementation(() => Promise.resolve(null));
mockGetCodebaseEnvVars.mockImplementation(() => Promise.resolve({}));
mockLoadConfig.mockImplementation(() =>
Promise.resolve({
assistants: { claude: {}, codex: {} },
envVars: {},
})
);
});
test('calls syncWorkspace with codebase.default_cwd when conversation has codebase_id', async () => {
@ -955,6 +990,59 @@ describe('discoverAllWorkflows — remote sync', () => {
'workspace.sync_failed'
);
});
test('passes merged repo and DB env vars to provider for codebase-scoped chat', async () => {
const conversation = makeConversation({ codebase_id: 'codebase-1' });
const codebase = makeCodebaseForSync();
mockGetOrCreateConversation.mockReturnValueOnce(Promise.resolve(conversation));
mockGetCodebase.mockReturnValueOnce(Promise.resolve(codebase));
mockGetCodebaseEnvVars.mockResolvedValueOnce({ DB_SECRET: 'db-value' });
mockLoadConfig.mockResolvedValueOnce({
assistants: { claude: {}, codex: {} },
envVars: { FILE_SECRET: 'file-value' },
});
const platform = makePlatform();
await handleMessage(platform, 'conv-1', 'What is the latest commit?');
expect(mockSendQuery).toHaveBeenCalled();
const requestOptions = mockSendQuery.mock.calls[0][3] as Record<string, unknown>;
expect(requestOptions.env).toEqual({
FILE_SECRET: 'file-value',
DB_SECRET: 'db-value',
});
});
test('does not load codebase env vars when conversation has no codebase_id', async () => {
mockGetOrCreateConversation.mockReturnValueOnce(Promise.resolve(makeConversation()));
const platform = makePlatform();
await handleMessage(platform, 'conv-1', 'Hello');
expect(mockGetCodebaseEnvVars).not.toHaveBeenCalled();
});
test('falls back to config env when codebase env loading fails', async () => {
const conversation = makeConversation({ codebase_id: 'codebase-1' });
const codebase = makeCodebaseForSync();
mockGetOrCreateConversation.mockReturnValueOnce(Promise.resolve(conversation));
mockGetCodebase.mockReturnValueOnce(Promise.resolve(codebase));
mockGetCodebaseEnvVars.mockRejectedValueOnce(new Error('db unavailable'));
mockLoadConfig.mockResolvedValueOnce({
assistants: { claude: {}, codex: {} },
envVars: { FILE_SECRET: 'file-value' },
});
const platform = makePlatform();
await handleMessage(platform, 'conv-1', 'What is the latest commit?');
expect(mockLogger.warn).toHaveBeenCalledWith(
expect.objectContaining({ codebaseId: 'codebase-1' }),
'codebase_env_vars_load_failed'
);
const requestOptions = mockSendQuery.mock.calls[0][3] as Record<string, unknown>;
expect(requestOptions.env).toEqual({ FILE_SECRET: 'file-value' });
});
});
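The tests above assert that DB-stored env vars override file-config env vars. A sketch of that merge and its precedence, with `mergeEnv` as an illustrative helper name (the real code inlines the spread):

```typescript
// Sketch of the precedence the tests above assert: per-codebase env vars
// from the DB override env vars from the config file, and an empty merge
// is passed as `undefined` so the provider sees no env option at all.
function mergeEnv(
  fileEnv: Record<string, string>,
  dbEnv: Record<string, string>
): Record<string, string> | undefined {
  const merged = { ...fileEnv, ...dbEnv }; // later spread (DB) wins on key clashes
  return Object.keys(merged).length > 0 ? merged : undefined;
}

console.log(mergeEnv({ SECRET: 'file' }, { SECRET: 'db' })); // { SECRET: 'db' }
console.log(mergeEnv({}, {})); // undefined
```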
// ─── Workflow dispatch routing — interactive flag ─────────────────────────────
@ -971,7 +1059,6 @@ describe('workflow dispatch routing — interactive flag', () => {
repository_url: null,
default_cwd: '/repos/test-repo',
ai_assistant_type: 'claude' as const,
allow_env_keys: false,
commands: {},
created_at: new Date(),
updated_at: new Date(),
@ -1072,7 +1159,6 @@ describe('natural-language approval routing', () => {
repository_url: null,
default_cwd: '/repos/test-repo',
ai_assistant_type: 'claude' as const,
allow_env_keys: false,
commands: {},
created_at: new Date(),
updated_at: new Date(),
@ -1407,3 +1493,76 @@ describe('discoverAllWorkflows — merge repo workflows over global', () => {
expect(mockDiscoverWorkflowsWithConfig).toHaveBeenCalledTimes(2);
});
});
// ─── handleMessage — workflow context injection ───────────────────────────────
describe('handleMessage — workflow context injection', () => {
beforeEach(() => {
mockGetRecentWorkflowResultMessages.mockClear();
mockGetOrCreateConversation.mockReset();
mockListCodebases.mockReset();
mockDiscoverWorkflowsWithConfig.mockReset();
mockLogger.warn.mockClear();
mockGetOrCreateConversation.mockImplementation(() => Promise.resolve(makeConversation()));
mockListCodebases.mockImplementation(() => Promise.resolve([]));
mockDiscoverWorkflowsWithConfig.mockImplementation(() =>
Promise.resolve({ workflows: [], errors: [] })
);
mockGetRecentWorkflowResultMessages.mockImplementation(() => Promise.resolve([]));
});
test('calls getRecentWorkflowResultMessages for the conversation', async () => {
const platform = makePlatform();
await handleMessage(platform, 'conv-1', 'What happened?');
expect(mockGetRecentWorkflowResultMessages).toHaveBeenCalledWith('conv-1', 3);
});
test('does not throw when getRecentWorkflowResultMessages returns empty array', async () => {
mockGetRecentWorkflowResultMessages.mockResolvedValueOnce([]);
const platform = makePlatform();
await expect(handleMessage(platform, 'conv-1', 'Hello')).resolves.toBeUndefined();
});
test('handles malformed metadata JSON without throwing', async () => {
const badRow = {
id: 'msg-1',
conversation_id: 'conv-1',
role: 'assistant' as const,
content: 'Summary.',
metadata: 'not-valid-json',
created_at: '2026-01-01T00:00:00Z',
};
mockGetRecentWorkflowResultMessages.mockResolvedValueOnce([badRow]);
const platform = makePlatform();
await expect(
handleMessage(platform, 'conv-1', 'What did the workflow do?')
).resolves.toBeUndefined();
});
test('handles metadata with missing workflowResult key gracefully', async () => {
const rowNoWorkflowResult = {
id: 'msg-2',
conversation_id: 'conv-1',
role: 'assistant' as const,
content: 'Summary.',
metadata: '{"someOtherKey":"value"}',
created_at: '2026-01-01T00:00:00Z',
};
mockGetRecentWorkflowResultMessages.mockResolvedValueOnce([rowNoWorkflowResult]);
const platform = makePlatform();
await expect(handleMessage(platform, 'conv-1', 'Follow-up')).resolves.toBeUndefined();
});
test('continues without workflow context when outer fetch throws', async () => {
mockGetRecentWorkflowResultMessages.mockRejectedValueOnce(new Error('unexpected'));
const platform = makePlatform();
// Non-critical path — must not block message handling
await expect(handleMessage(platform, 'conv-1', 'Hello')).resolves.toBeUndefined();
});
});
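The malformed-metadata tests above exercise a parse-with-defaults pattern. A self-contained sketch of that defensive parse — the row shape and `toResultContext` name are simplified assumptions mirroring the mapping in the orchestrator:

```typescript
// Sketch of the defensive metadata parse these tests exercise: malformed
// JSON or a missing workflowResult key degrades to 'unknown' rather than
// throwing. Row shape is a simplified assumption.
interface ResultContext {
  workflowName: string;
  runId: string;
  summary: string;
}

function toResultContext(row: { content: string; metadata: unknown }): ResultContext {
  let workflowName = 'unknown';
  let runId = 'unknown';
  try {
    const parsed =
      typeof row.metadata === 'string' ? JSON.parse(row.metadata) : row.metadata;
    const meta = parsed as { workflowResult?: { workflowName?: string; runId?: string } };
    workflowName = meta.workflowResult?.workflowName ?? 'unknown';
    runId = meta.workflowResult?.runId ?? 'unknown';
  } catch {
    // Malformed metadata — keep defaults; the caller logs a warning.
  }
  return { workflowName, runId, summary: row.content };
}

console.log(toResultContext({ content: 'Summary.', metadata: 'not-valid-json' }).workflowName); // 'unknown'
```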


@ -13,9 +13,9 @@ import type {
HandleMessageContext,
Conversation,
Codebase,
AssistantRequestOptions,
AttachedFile,
} from '../types';
import type { SendQueryOptions } from '@archon/providers/types';
import { ConversationNotFoundError } from '../types';
import * as db from '../db/conversations';
import * as codebaseDb from '../db/codebases';
@ -24,8 +24,8 @@ import * as commandHandler from '../handlers/command-handler';
import { formatToolCall } from '@archon/workflows/utils/tool-formatter';
import { classifyAndFormatError } from '../utils/error-formatter';
import { toError } from '../utils/error';
import { getAssistantClient } from '../clients/factory';
import { getArchonHome, getArchonWorkspacesPath } from '@archon/paths';
import { getAgentProvider, getProviderCapabilities } from '@archon/providers';
import { getArchonWorkspacesPath } from '@archon/paths';
import { syncArchonToWorktree } from '../utils/worktree-sync';
import { syncWorkspace, toRepoPath } from '@archon/git';
import type { WorkspaceSyncResult } from '@archon/git';
@ -43,9 +43,16 @@ import type { MergedConfig } from '../config/config-types';
import { generateAndSetTitle } from '../services/title-generator';
import { validateAndResolveIsolation, dispatchBackgroundWorkflow } from './orchestrator';
import { IsolationBlockedError } from '@archon/isolation';
import { buildOrchestratorPrompt, buildProjectScopedPrompt } from './prompt-builder';
import {
buildOrchestratorPrompt,
buildProjectScopedPrompt,
formatWorkflowContextSection,
} from './prompt-builder';
import type { WorkflowResultContext } from './prompt-builder';
import * as messageDb from '../db/messages';
import * as workflowDb from '../db/workflows';
import * as workflowEventDb from '../db/workflow-events';
import { getCodebaseEnvVars } from '../db/env-vars';
import type { ApprovalContext } from '@archon/workflows/schemas/workflow-run';
/** Lazy-initialized logger (deferred so test mocks can intercept createLogger) */
@ -221,31 +228,43 @@ async function dispatchOrchestratorWorkflow(
codebase_id: codebase.id,
});
// Validate and resolve isolation
// Validate and resolve isolation.
// A workflow with `worktree.enabled: false` short-circuits the resolver entirely
// and runs in the live checkout — no worktree creation, no env row. This is the
// declarative equivalent of CLI `--no-worktree` for workflows that should always
// run live (e.g. read-only triage, docs generation on the main checkout).
let cwd: string;
try {
const result = await validateAndResolveIsolation(
{ ...conversation, codebase_id: codebase.id },
codebase,
platform,
conversationId,
isolationHints
if (workflow.worktree?.enabled === false) {
getLog().info(
{ workflowName: workflow.name, conversationId, codebaseId: codebase.id },
'workflow.worktree_disabled_by_policy'
);
cwd = result.cwd;
} catch (error) {
if (error instanceof IsolationBlockedError) {
getLog().warn(
{
reason: error.reason,
conversationId,
codebaseId: codebase.id,
workflowName: workflow.name,
},
'isolation_blocked'
cwd = codebase.default_cwd;
} else {
try {
const result = await validateAndResolveIsolation(
{ ...conversation, codebase_id: codebase.id },
codebase,
platform,
conversationId,
isolationHints
);
return;
cwd = result.cwd;
} catch (error) {
if (error instanceof IsolationBlockedError) {
getLog().warn(
{
reason: error.reason,
conversationId,
codebaseId: codebase.id,
workflowName: workflow.name,
},
'isolation_blocked'
);
return;
}
throw error;
}
throw error;
}
// Dispatch workflow
@ -381,9 +400,9 @@ async function discoverAllWorkflows(conversation: Conversation): Promise<Discove
let config: MergedConfig | undefined;
try {
const result = await discoverWorkflowsWithConfig(getArchonWorkspacesPath(), loadConfig, {
globalSearchPath: getArchonHome(),
});
// Home-scoped workflows at ~/.archon/workflows/ are discovered automatically
// by discoverWorkflowsWithConfig — no option needed.
const result = await discoverWorkflowsWithConfig(getArchonWorkspacesPath(), loadConfig);
workflows = [...result.workflows];
allErrors.push(...result.errors);
} catch (error) {
@ -451,7 +470,8 @@ function buildFullPrompt(
message: string,
issueContext: string | undefined,
threadContext: string | undefined,
attachedFiles?: AttachedFile[]
attachedFiles?: AttachedFile[],
workflowContext?: string
): string {
const scopedCodebase = conversation.codebase_id
? codebases.find(c => c.id === conversation.codebase_id)
@ -471,11 +491,14 @@ function buildFullPrompt(
.join('\n')
: '';
const workflowContextSuffix = workflowContext ? '\n\n---\n\n' + workflowContext : '';
if (threadContext) {
return (
systemPrompt +
'\n\n---\n\n## Thread Context (previous messages)\n\n' +
threadContext +
workflowContextSuffix +
'\n\n---\n\n## Current Request\n\n' +
message +
contextSuffix +
@ -483,7 +506,14 @@ function buildFullPrompt(
);
}
return systemPrompt + '\n\n---\n\n## User Message\n\n' + message + contextSuffix + fileSuffix;
return (
systemPrompt +
workflowContextSuffix +
'\n\n---\n\n## User Message\n\n' +
message +
contextSuffix +
fileSuffix
);
}
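The `buildFullPrompt` change above threads an optional workflow-context section between the system prompt (or thread context) and the current message. A reduced sketch of just that assembly order — a simplification, not the full signature:

```typescript
// Reduced sketch of the prompt assembly above: the workflow-context section
// is optional and slots in after the system prompt, before the user message.
// Section titles are copied from the diff; everything else is trimmed.
function buildPrompt(
  systemPrompt: string,
  message: string,
  workflowContext?: string
): string {
  const workflowContextSuffix = workflowContext ? '\n\n---\n\n' + workflowContext : '';
  return systemPrompt + workflowContextSuffix + '\n\n---\n\n## User Message\n\n' + message;
}

console.log(buildPrompt('SYS', 'hi', '## Recent Workflow Results').includes('Recent')); // true
```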
// ─── Main Handler ───────────────────────────────────────────────────────────
@ -731,6 +761,44 @@ export async function handleMessage(
});
}
// Build workflow context for follow-up awareness
let workflowContext: string | undefined;
try {
const recentResultMessages = await messageDb.getRecentWorkflowResultMessages(
conversation.id,
3
);
if (recentResultMessages.length > 0) {
const workflowResults: WorkflowResultContext[] = recentResultMessages.map(msg => {
let workflowName = 'unknown';
let runId = 'unknown';
try {
const parsed =
typeof msg.metadata === 'string' ? JSON.parse(msg.metadata) : msg.metadata;
const meta = parsed as {
workflowResult?: { workflowName?: string; runId?: string };
};
workflowName = meta.workflowResult?.workflowName ?? 'unknown';
runId = meta.workflowResult?.runId ?? 'unknown';
} catch (metaErr) {
// Malformed metadata — use defaults
getLog().warn(
{ err: metaErr as Error, conversationId, messageId: msg.id },
'orchestrator.workflow_result_metadata_parse_failed'
);
}
return { workflowName, runId, summary: msg.content };
});
workflowContext = formatWorkflowContextSection(workflowResults);
}
} catch (error) {
getLog().warn(
{ err: error as Error, conversationId },
'orchestrator.workflow_context_fetch_failed'
);
// Non-critical — continue without context
}
const fullPrompt = buildFullPrompt(
conversation,
codebases,
@ -738,7 +806,8 @@ export async function handleMessage(
message,
issueContext,
threadContext,
attachedFiles
attachedFiles,
workflowContext
);
const cwd = getArchonWorkspacesPath();
@ -751,17 +820,41 @@ export async function handleMessage(
});
}
// 5. Send to AI client
const aiClient = getAssistantClient(conversation.ai_assistant_type);
// 5. Send to AI provider
const aiClient = getAgentProvider(conversation.ai_assistant_type);
getLog().debug({ assistantType: conversation.ai_assistant_type }, 'sending_to_ai');
// Reuse the config already loaded during workflow discovery (avoids a second disk read).
// Fall back to loadConfig only when no codebase is scoped (discoveredConfig is undefined).
const config = discoveredConfig ?? (await loadConfig());
const requestOptions: AssistantRequestOptions = {
...(conversation.ai_assistant_type === 'claude' && config.assistants.claude.settingSources
? { settingSources: config.assistants.claude.settingSources }
: {}),
const providerKey = conversation.ai_assistant_type;
let dbEnvVars: Record<string, string> = {};
if (conversation.codebase_id) {
try {
dbEnvVars = await getCodebaseEnvVars(conversation.codebase_id);
} catch (error) {
getLog().warn(
{ err: error as Error, codebaseId: conversation.codebase_id },
'codebase_env_vars_load_failed'
);
}
}
const effectiveEnv = { ...(config.envVars ?? {}), ...dbEnvVars };
// Warn if provider doesn't support env injection but env vars are configured
if (Object.keys(effectiveEnv).length > 0) {
const providerCaps = getProviderCapabilities(providerKey);
if (!providerCaps.envInjection) {
getLog().warn(
{ provider: providerKey, envVarCount: Object.keys(effectiveEnv).length },
'orchestrator.unsupported_env_injection'
);
}
}
const requestOptions: SendQueryOptions = {
assistantConfig: config.assistants[providerKey] ?? {},
env: Object.keys(effectiveEnv).length > 0 ? effectiveEnv : undefined,
};
const mode = platform.getStreamingMode();
@ -824,14 +917,14 @@ async function handleStreamMode(
originalMessage: string,
codebases: readonly Codebase[],
workflows: readonly WorkflowDefinition[],
aiClient: ReturnType<typeof getAssistantClient>,
aiClient: ReturnType<typeof getAgentProvider>,
fullPrompt: string,
cwd: string,
session: { id: string; assistant_session_id: string | null },
isolationHints: HandleMessageContext['isolationHints'],
conversation: Conversation,
issueContext?: string,
requestOptions?: AssistantRequestOptions
requestOptions?: SendQueryOptions
): Promise<void> {
const allMessages: string[] = [];
let newSessionId: string | undefined;
@ -873,8 +966,19 @@ async function handleStreamMode(
if (!commandDetected && platform.sendStructuredEvent) {
await platform.sendStructuredEvent(conversationId, msg);
}
} else if (msg.type === 'result' && msg.sessionId) {
newSessionId = msg.sessionId;
} else if (msg.type === 'result') {
if (msg.sessionId) {
newSessionId = msg.sessionId;
}
if (msg.isError) {
getLog().warn({ conversationId, errorSubtype: msg.errorSubtype }, 'ai_result_error');
const syntheticError = new Error(msg.errorSubtype ?? 'AI result error');
await platform.sendMessage(conversationId, classifyAndFormatError(syntheticError));
if (newSessionId) {
await tryPersistSessionId(session.id, newSessionId);
}
return;
}
if (!commandDetected && platform.sendStructuredEvent) {
await platform.sendStructuredEvent(conversationId, msg);
}
@ -940,14 +1044,14 @@ async function handleBatchMode(
originalMessage: string,
codebases: readonly Codebase[],
workflows: readonly WorkflowDefinition[],
aiClient: ReturnType<typeof getAssistantClient>,
aiClient: ReturnType<typeof getAgentProvider>,
fullPrompt: string,
cwd: string,
session: { id: string; assistant_session_id: string | null },
isolationHints: HandleMessageContext['isolationHints'],
conversation: Conversation,
issueContext?: string,
requestOptions?: AssistantRequestOptions
requestOptions?: SendQueryOptions
): Promise<void> {
const allChunks: { type: string; content: string }[] = [];
const assistantMessages: string[] = [];
@ -985,8 +1089,19 @@ async function handleBatchMode(
allChunks.push({ type: 'tool', content: toolMessage });
getLog().debug({ toolName: msg.toolName }, 'tool_call');
}
} else if (msg.type === 'result' && msg.sessionId) {
newSessionId = msg.sessionId;
} else if (msg.type === 'result') {
if (msg.sessionId) {
newSessionId = msg.sessionId;
}
if (msg.isError) {
getLog().warn({ conversationId, errorSubtype: msg.errorSubtype }, 'ai_result_error');
const syntheticError = new Error(msg.errorSubtype ?? 'AI result error');
await platform.sendMessage(conversationId, classifyAndFormatError(syntheticError));
if (newSessionId) {
await tryPersistSessionId(session.id, newSessionId);
}
return;
}
}
if (!commandDetected && allChunks.length > MAX_BATCH_TOTAL_CHUNKS) {
@ -1189,11 +1304,12 @@ async function handleRegisterProject(
return `Project "${projectName}" is already registered (path: ${alreadyExists.default_cwd}).`;
}
// Create codebase record
// Use config default provider instead of hardcoding 'claude'
const config = await loadConfig();
const codebase = await codebaseDb.createCodebase({
name: projectName,
default_cwd: projectPath,
ai_assistant_type: 'claude',
ai_assistant_type: config.assistant,
});
getLog().info(

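Both stream and batch handlers above reshape the `result` branch: the session id is captured whether or not the result is an error, and an error result short-circuits with a synthetic `Error`. A sketch of that branch in isolation — the message and return shapes are simplified assumptions:

```typescript
// Sketch of the reshaped `result` handling: sessionId is captured even when
// isError is set (so it can still be persisted), and an error result yields
// a synthetic Error for the platform to format. Shapes are assumptions.
interface ResultMsg {
  type: 'result';
  sessionId?: string;
  isError?: boolean;
  errorSubtype?: string;
}

function handleResult(msg: ResultMsg): { sessionId?: string; error?: Error } {
  const out: { sessionId?: string; error?: Error } = {};
  if (msg.sessionId) {
    out.sessionId = msg.sessionId; // persist even when the result is an error
  }
  if (msg.isError) {
    out.error = new Error(msg.errorSubtype ?? 'AI result error');
  }
  return out;
}

const res = handleResult({ type: 'result', sessionId: 's1', isError: true, errorSubtype: 'overloaded' });
console.log(res.sessionId, res.error?.message); // s1 overloaded
```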

@ -50,14 +50,14 @@ mock.module('../handlers/command-handler', () => ({
})),
}));
mock.module('../clients/factory', () => ({
getAssistantClient: mock(() => null),
mock.module('@archon/providers', () => ({
getAgentProvider: mock(() => null),
}));
mock.module('../workflows/store-adapter', () => ({
createWorkflowDeps: mock(() => ({
store: {},
getAssistantClient: () => ({}),
getAgentProvider: () => ({}),
loadConfig: async () => ({}),
})),
}));
@ -176,7 +176,6 @@ function makeCodebase(overrides?: Partial<Codebase>): Codebase {
id: 'cb-1',
name: 'test-repo',
default_cwd: '/workspace/test-repo',
allow_env_keys: false,
commands: {},
created_at: new Date(),
updated_at: new Date(),


@ -79,11 +79,11 @@ mock.module('../handlers/command-handler', () => ({
parseCommand: mockParseCommand,
}));
// AI client mock
const mockGetAssistantClient = mock(() => null);
// AI provider mock
const mockGetAgentProvider = mock(() => null);
mock.module('../clients/factory', () => ({
getAssistantClient: mockGetAssistantClient,
mock.module('@archon/providers', () => ({
getAgentProvider: mockGetAgentProvider,
}));
// Workflow mocks
@ -96,7 +96,7 @@ const mockFindWorkflow = mock((name: string, workflows: readonly WorkflowDefinit
mock.module('../workflows/store-adapter', () => ({
createWorkflowDeps: mock(() => ({
store: {},
getAssistantClient: () => ({}),
getAgentProvider: () => ({}),
loadConfig: async () => ({}),
})),
}));
@ -216,7 +216,6 @@ const mockCodebase: Codebase = {
repository_url: 'https://github.com/user/repo',
default_cwd: '/workspace/test-project',
ai_assistant_type: 'claude',
allow_env_keys: false,
commands: {},
created_at: new Date(),
updated_at: new Date(),
@ -274,7 +273,7 @@ function clearAllMocks(): void {
mockTransitionSession.mockClear();
mockHandleCommand.mockClear();
mockParseCommand.mockClear();
mockGetAssistantClient.mockClear();
mockGetAgentProvider.mockClear();
mockDiscoverWorkflows.mockClear();
mockExecuteWorkflow.mockClear();
mockFindWorkflow.mockClear();
@ -457,7 +456,7 @@ describe('orchestrator-agent handleMessage', () => {
mockGetActiveSession.mockResolvedValue(null);
mockCreateSession.mockResolvedValue(mockSession);
mockTransitionSession.mockResolvedValue(mockSession);
mockGetAssistantClient.mockReturnValue(mockClient);
mockGetAgentProvider.mockReturnValue(mockClient);
mockDiscoverWorkflows.mockResolvedValue({ workflows: [], errors: [] });
mockParseCommand.mockImplementation((message: string) => {
const parts = message.split(/\s+/);
@ -479,7 +478,7 @@ describe('orchestrator-agent handleMessage', () => {
expect(mockHandleCommand).toHaveBeenCalled();
expect(platform.sendMessage).toHaveBeenCalledWith('chat-456', 'Status info');
expect(mockGetAssistantClient).not.toHaveBeenCalled();
expect(mockGetAgentProvider).not.toHaveBeenCalled();
});
test('delegates /help to command handler', async () => {
@ -676,7 +675,7 @@ describe('orchestrator-agent handleMessage', () => {
await handleMessage(platform, 'chat-456', 'hello');
expect(mockTransitionSession).not.toHaveBeenCalled();
// Should pass existing assistant_session_id to AI client
// Should pass existing assistant_session_id to AI provider
expect(mockClient.sendQuery).toHaveBeenCalledWith(
expect.any(String),
expect.any(String),
@ -699,8 +698,8 @@ describe('orchestrator-agent handleMessage', () => {
// ─── settingSources forwarding ────────────────────────────────────────
describe('settingSources forwarding', () => {
test('passes settingSources from config to AI client for claude', async () => {
describe('assistantConfig forwarding', () => {
test('passes assistantConfig with settingSources for claude', async () => {
mockLoadConfig.mockResolvedValueOnce({
botName: 'Archon',
assistant: 'claude',
@ -725,11 +724,13 @@ describe('orchestrator-agent handleMessage', () => {
expect.any(String),
expect.any(String),
expect.anything(),
expect.objectContaining({ settingSources: ['project', 'user'] })
expect.objectContaining({
assistantConfig: expect.objectContaining({ settingSources: ['project', 'user'] }),
})
);
});
test('does not pass settingSources for non-claude assistant', async () => {
test('passes codex assistantConfig for codex assistant', async () => {
const codexConversation: Conversation = {
...mockConversation,
ai_assistant_type: 'codex',
@ -754,15 +755,16 @@ describe('orchestrator-agent handleMessage', () => {
yield { type: 'result', sessionId: 'codex-session' };
}),
};
mockGetAssistantClient.mockReturnValueOnce(codexClient);
mockGetAgentProvider.mockReturnValueOnce(codexClient);
await handleMessage(platform, 'chat-456', 'hello');
// settingSources should NOT be in requestOptions since assistant type is codex
// Should pass codex assistantConfig, not claude's
const callArgs = codexClient.sendQuery.mock.calls[0];
const requestOptions = callArgs?.[3] as Record<string, unknown> | undefined;
expect(requestOptions).toBeDefined();
expect(requestOptions).not.toHaveProperty('settingSources');
expect(requestOptions?.assistantConfig).toBeDefined();
});
});
@ -1151,10 +1153,11 @@ describe('orchestrator-agent handleMessage', () => {
await handleMessage(platform, 'chat-456', 'help');
// Discovery is called positionally with (cwd, loadConfig) — no options arg.
// Home-scoped workflows (~/.archon/workflows/) are discovered internally.
expect(mockDiscoverWorkflows).toHaveBeenCalledWith(
'/home/test/.archon/workspaces',
expect.any(Function),
{ globalSearchPath: '/home/test/.archon' }
expect.any(Function)
);
});
