- Fix @archon/paths test that expected 'remote-coding-agent' in path
(fails on an Archon repo checkout). It now checks that the path exists instead.
- Fix @archon/cli tests that matched 'remote-coding-agent' directory name.
- Remove .agents/examples/codex-telegram-bot/package-lock.json that
snuck past .gitignore.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Astro 6 requires Node.js >= 22.12.0. Bun bundles Node 20.x which
is too old. Add actions/setup-node@v4 step before the build.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace all em dashes with regular dashes in README. Move the
"Previous Version" section to right before Quickstart so existing
users find it immediately.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Points users to the archive/v1-task-management-rag branch for
the original Python-based Archon (task management + RAG tool).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(codex): emit paired tool_result chunks so UI tool cards close
Tool cards in the web UI sometimes spin forever for Codex workflows.
The Codex client only yielded { type: 'tool', ... } on item.completed
events, never the paired tool_result chunk. The web adapter's running
tool entry then had nothing to close it, leaving the UI relying on the
emitLockEvent fallback at lock release — which never fires inside a
multi-node DAG, on cancel, or when SSE is briefly disconnected.
The Codex SDK only emits item.completed once a command_execution,
web_search, or mcp_tool_call is fully done (it carries aggregated_output,
exit_code, status, etc). So we can emit the start and the result
back-to-back in the same handler.
Changes:
- command_execution: emit tool_result with aggregated_output, append
[exit code: N] when non-zero so failures are visible.
- web_search: emit empty tool_result so the searching card closes.
- mcp_tool_call: always emit tool + tool_result, including for the
status === 'completed' branch which previously emitted nothing at all
(so completed MCP calls were invisible) and for status === 'failed'
where we previously emitted only a system message (which left no card to
close, but was inconsistent with command_execution failures).
- Update codex.test.ts assertions to cover paired chunks and exit codes.
Note: tool_result is paired to its tool by the web adapter's name-based
reverse-scan in web.ts. Since these chunks are yielded back-to-back with
no other tools in between, the match is unambiguous. PR #1031 will add
stable tool_use_id pairing for Claude; a follow-up can plumb Codex's
item.id through once that lands.
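A minimal sketch of the back-to-back emission for command_execution, assuming hypothetical chunk and item shapes (the field names toolName, toolInput, and toolOutput and the helper chunksForCompletedCommand are illustrative, not the actual Codex client code):

```typescript
// Hypothetical chunk and item shapes; the real client and SDK types may differ.
type Chunk =
  | { type: 'tool'; toolName: string; toolInput: Record<string, unknown> }
  | { type: 'tool_result'; toolName: string; toolOutput: string };

interface CommandExecutionItem {
  type: 'command_execution';
  command: string;
  aggregated_output: string;
  exit_code?: number | null;
}

// Build the tool start and its paired result for one completed item.
// Both chunks come from the same item.completed event, so they are adjacent.
function chunksForCompletedCommand(item: CommandExecutionItem): Chunk[] {
  let output = item.aggregated_output;
  // Make failures visible: append the exit code only when present and non-zero.
  if (item.exit_code != null && item.exit_code !== 0) {
    output += `\n[exit code: ${item.exit_code}]`;
  }
  return [
    { type: 'tool', toolName: item.command, toolInput: {} },
    { type: 'tool_result', toolName: item.command, toolOutput: output },
  ];
}
```

Because the result immediately follows its start chunk, a name-based reverse-scan on the consumer side has exactly one candidate to close.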
* fix(codex): log silent drops and assert paired web_search tool_result
- command_execution: warn when item.command is falsy (was silently dropped)
- mcp_tool_call: warn when result.content has unexpected shape (was silent empty)
- Simplify exit_code guard to != null, drop redundant String() cast
- Test: assert paired tool_result chunk for web_search
Addresses review feedback on #1032.
The server fatally exits when no AI credentials are configured.
The smoke test only needs to verify the container starts and serves
HTTP, not process AI requests. Pass CLAUDE_USE_GLOBAL_AUTH=true to
satisfy the credential check.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the smoke test fails, capture docker logs to diagnose container
startup crashes before the cleanup step removes the container.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Fix: prevent target repo .env leakage into Claude subprocess (#1029)
Bun auto-loads CWD .env before user code runs. When archon runs from a
target repo, that repo's ANTHROPIC_API_KEY (and other secrets) leaked into
process.env and were passed through to the Claude Code subprocess, billing
the wrong account.
Changes:
- Add SUBPROCESS_ENV_ALLOWLIST + buildCleanSubprocessEnv() utility
- claude.ts buildSubprocessEnv now starts from allowlist, not process.env
- CLI and server entry points strip all keys parsed from CWD .env on startup
(replaces DATABASE_URL-only patch)
- Update claude.test.ts to assert ANTHROPIC_API_KEY no longer reaches subprocess
- Add env-allowlist tests
Fixes #1029
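A rough sketch of the allowlist approach, under stated assumptions: the allowlist contents, the function signatures, and the stripCwdDotenvKeys helper are hypothetical and only illustrate the idea of starting from an empty env rather than from process.env.

```typescript
// Hypothetical allowlist; the real SUBPROCESS_ENV_ALLOWLIST contents are not shown here.
const SUBPROCESS_ENV_ALLOWLIST: readonly string[] = ['PATH', 'HOME', 'LANG', 'TERM'];

type Env = Record<string, string | undefined>;

// Start from an empty object and copy only allowlisted keys,
// so secrets in the parent env never reach the subprocess by default.
function buildCleanSubprocessEnv(
  parentEnv: Env,
  allowlist: readonly string[] = SUBPROCESS_ENV_ALLOWLIST
): Record<string, string> {
  const clean: Record<string, string> = {};
  for (const key of allowlist) {
    const value = parentEnv[key];
    if (value !== undefined) clean[key] = value;
  }
  return clean;
}

// Hypothetical startup helper: remove every key that the CWD .env file defined.
function stripCwdDotenvKeys(env: Env, keysFromCwdDotenv: readonly string[]): void {
  for (const key of keysFromCwdDotenv) {
    delete env[key];
  }
}
```

The two steps are complementary: stripping cleans the parent process, while the allowlist keeps anything that slips through from being forwarded.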
* fix: address review findings for env allowlist PR
- Register env-allowlist.test.ts in @archon/core test batch so security tests run in CI
- Remove CLAUDECODE, NODE_OPTIONS, VSCODE_INSPECTOR_OPTIONS from allowlist (they are always stripped by caller — listing them was semantically contradictory)
- Add intent comment to silent dotenv parse fallback in cli.ts and server/index.ts
- Add test for useGlobalAuth=false path verifying ANTHROPIC_API_KEY excluded from subprocess env
- Update security.md to document full CWD .env isolation and subprocess allowlist (was only describing DATABASE_URL)
* fix(claude): pair tool_call/tool_result by stable tool_use_id
Tool cards in the web UI sometimes spin forever for Claude workflows.
Two independent causes both rooted in tool-call pairing:
1. PostToolUseFailure hook was never registered, so any tool that
errored, was interrupted, or got permission-denied left an entry in
the web adapter's runningTools map with no closing tool_result chunk.
2. The web adapter paired tool_result -> tool_call by name via a
reverse-scan, which mis-attributes results when two tools with the
same name run concurrently (parallel DAG nodes).
Changes:
- Add optional toolCallId to MessageChunk tool/tool_result variants.
- Plumb Anthropic tool_use_id from assistant blocks and PostToolUse
hook input through the chunk pipeline.
- Register PostToolUseFailure hook so errored/interrupted tools also
produce a paired tool_result chunk (carrying the same tool_use_id).
- web adapter now uses the stable ID directly when present and looks
up tool_result matches by ID first, falling back to the name
reverse-scan only for clients that don't supply an ID (e.g. Codex).
The UI side already prefers toolCallId for matching
(ChatInterface.tsx:422-425), so no UI change is needed.
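The ID-first lookup with a name fallback could look roughly like this. The RunningTool shape and findRunningTool are hypothetical names for illustration, and the sketch assumes the fallback is used only when the result carries no ID:

```typescript
// Hypothetical entry shape for the adapter's running-tool bookkeeping.
interface RunningTool {
  toolCallId?: string;
  name: string;
}

// Return the index of the running entry that a tool_result should close, or -1.
function findRunningTool(
  running: RunningTool[],
  result: { toolCallId?: string; name: string }
): number {
  if (result.toolCallId !== undefined) {
    // Stable ID present: match on it exactly, never by name.
    for (let i = 0; i < running.length; i++) {
      if (running[i].toolCallId === result.toolCallId) return i;
    }
    return -1;
  }
  // No ID (e.g. Codex): fall back to the most recent entry with the same name.
  for (let i = running.length - 1; i >= 0; i--) {
    if (running[i].name === result.name) return i;
  }
  return -1;
}
```

With two concurrent tools of the same name, the ID path picks the right entry regardless of order, while the name path can only guess the most recent one.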
* fix(claude): propagate toolCallId in post-loop drain + harden failure hook
- Mirror in-loop drain at end of stream so PostToolUseFailure results
(which typically arrive just before the SDK terminal result message)
don't lose toolCallId and fall back to the broken name reverse-scan.
- Wrap PostToolUseFailure hook in try/catch with structured logging so
malformed SDK payloads can never crash the SDK hook dispatch silently.
- Log a debug breadcrumb when the SDK omits the expected error field.
- Tighten truthy guards (toolUseId/block.id) to !== undefined so empty
strings are not silently dropped.
- Warn in web adapter when a tool_result cannot be matched by stable ID
or name reverse-scan, so missing tool_call emissions are debuggable
instead of leaking runningTools entries silently.
Windows: `env.PATH` is undefined on Windows CI (uses `Path` instead).
Use `env.PATH ?? env.Path` for cross-platform compatibility.
Docker: smoke test hit wrong endpoint (`/health` vs `/api/health`),
and `--retry-connrefused` doesn't catch connection-reset errors (code 56).
Fix URL to `/api/health`, add `sleep 5` for container startup,
and use `--retry-all-errors` for robust retries.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The greedy tool_called→tool_completed matching algorithm matched across DAG
steps by tool name only, causing cross-step "stealing" in parallel workflows.
Combined with strict > timestamp comparison, the last tool in later steps had
no completion match and displayed an ever-increasing timer (405,000+ seconds).
Scope matching to same step_name and use >= for same-second timestamps.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
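To illustrate the scoped matching, here is a small sketch assuming a hypothetical event shape (kind, stepName, toolName, timestamp in whole seconds) and events already sorted by time; it is not the actual implementation:

```typescript
// Hypothetical event shape; field names are illustrative, not the stored schema.
interface ToolEvent {
  kind: 'tool_called' | 'tool_completed';
  stepName: string;
  toolName: string;
  timestamp: number; // whole seconds
}

interface ToolPair {
  called: ToolEvent;
  completed: ToolEvent | undefined;
}

// Greedy matching: each tool_called claims the first unclaimed tool_completed
// with the same step AND tool name whose timestamp is >= the call's timestamp.
function pairToolEvents(events: ToolEvent[]): ToolPair[] {
  const claimed: boolean[] = events.map(() => false);
  const pairs: ToolPair[] = [];
  for (const called of events) {
    if (called.kind !== 'tool_called') continue;
    let completed: ToolEvent | undefined;
    for (let j = 0; j < events.length; j++) {
      const candidate = events[j];
      if (claimed[j] || candidate.kind !== 'tool_completed') continue;
      const sameScope =
        candidate.stepName === called.stepName && candidate.toolName === called.toolName;
      // >= rather than > so a completion in the same second still matches.
      if (sameScope && candidate.timestamp >= called.timestamp) {
        claimed[j] = true;
        completed = candidate;
        break;
      }
    }
    pairs.push({ called, completed });
  }
  return pairs;
}
```

Without the step scope, a call in one step can claim another step's completion; with strict >, a same-second completion is skipped. Either way the last call ends up with no match.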
Adds an update-homebrew job to release.yml that runs after the
GitHub release is created. Downloads checksums from the release,
updates homebrew/archon.rb via the existing update-homebrew.sh
script, and commits the result back to dev.
Relates to #980
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Update Dockerfile.user.example image refs (dynamous → coleam00/archon)
- Update docker-compose.yml image name (remote-coding-agent → archon)
- Fix clone URL and dir name in Book of Archon first-five-minutes
- Update prompt-builder example project name
- Add elevator pitch to README intro
- Fix all README doc links to point to archon.diy (old docs/ dir was deleted)
- Add install scripts to docs-web public dir for GitHub Pages serving
Relates to #980
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(docker): update Bun base image from 1.2 to 1.3
The lockfile was generated with Bun 1.3.x locally but the Docker image
used oven/bun:1.2-slim. Bun 1.3 changed the lockfile format, causing
--frozen-lockfile to fail during docker build.
* fix(docker): pin Bun to exact version 1.3.9 matching lockfile
Floating tag 1.3-slim resolved to 1.3.11 which has a different lockfile
format than 1.3.9 used to generate bun.lock. Pin to exact patch version
to prevent --frozen-lockfile failures.
* fix(docker): add missing docs-web workspace package.json
The docs-web package was added as a workspace member but its
package.json was never added to the Dockerfile COPY steps. This caused
bun install --frozen-lockfile to fail because the workspace layout
in Docker didn't match the lockfile.
* fix(docker): use hoisted linker for Vite/Rollup compatibility
Bun's default "isolated" linker stores packages in node_modules/.bun/
with symlinks that Vite's Rollup bundler cannot resolve during
production builds (e.g., remark-gfm → mdast-util-gfm chain).
Using --linker=hoisted gives the classic flat node_modules layout
that Rollup expects. Local dev is unaffected (Vite dev server handles
the isolated layout fine).
* ci: pin Bun version to 1.3.9 and add Docker build check
- Align CI Bun version (was 1.3.11) with Dockerfile and local dev
(1.3.9) to prevent lockfile format mismatches between environments
- Add docker-build job to test.yml that builds the Docker image on
every PR — catches Dockerfile regressions (missing workspace
packages, linker issues, build failures) before they reach deploy
* fix(ci): add permissions for GHA cache and tighten Bun engine
- Add actions: write permission to docker-build job so GHA layer cache
writes succeed on PRs from forks
- Tighten package.json engines.bun from >=1.0.0 to >=1.3.9 to document
the minimum version that matches the lockfile format
* fix(ci): add smoke test, align Bun version across all workflows
Review fixes:
- Add load: true + health endpoint smoke test to docker-build CI job
so we verify the image actually starts, not just compiles
- Align Bun 1.3.9 in deploy-docs.yml and release.yml (were still 1.3.11)
- Document why docs-web source is intentionally omitted from Docker
* chore: float Docker to bun:1.3 and align CI to 1.3.11
- Dockerfile: oven/bun:1.3-slim (auto-tracks latest 1.3.x patches)
- CI workflows: bun-version 1.3.11 (current latest, reproducible)
- engines.bun: >=1.3.9 (minimum for local devs)
Lockfile format is stable across 1.3.x patches, so this is safe.
* fix(docker,ci): pin Docker to 1.3.11, loosen engines, harden smoke test
- Dockerfile: pin oven/bun:1.3.11-slim (was floating 1.3-slim) so Docker
builds are reproducible and match CI exactly.
- package.json: loosen engines to ^1.3.0 so end users on any 1.3.x can
run the CLI; CI/Docker remain pinned to the canonical latest.
- CI smoke test: replace 'sleep 5' with curl --retry-connrefused, and
move container cleanup to an 'if: always()' step so a failed health
check no longer leaks the named container.
---------
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
- Fix stale /workflow status footer — now shows approve/reject guidance
for paused runs alongside cancel guidance for running runs; also shows
status label per run entry
- Remove dead listWorkflows/WorkflowListData/LoadConfigFn (YAGNI: no callers)
- Add try-catch + logging around all DB mutation calls in operations layer
(abandonWorkflow, approveWorkflow, rejectWorkflow) following the same
pattern as getRunOrThrow
- Add errorType: err.constructor.name to all error log calls in operations
- Fix log event domain prefix: ghost_environment_reconciled →
isolation.ghost_reconciled, ghost_reconciliation_failed →
isolation.ghost_reconciliation_failed
- Add N+1 comment in listEnvironments() explaining the dual-query pattern
- Add isolation-operations.test.ts with 8 tests covering reconcileGhosts
error swallowing, ghost detection, no-re-fetch-when-no-ghosts, and
cleanupStale/cleanupMerged delegation
- Register isolation-operations.test.ts as its own batch in package.json
- Add DB-throws test for getRunOrThrow error path in workflow-operations
- Eliminate second getWorkflowRun re-fetch in CLI approve/reject by
including workingPath/userMessage/codebaseId in operation result types
- Fix mock setup in CLI test: add createWorkflowEvent to workflow-events
module mock (was missing, caused test failures)
- Strengthen weak toThrow() assertion in workflowApproveCommand test
- Use nodeName instead of nodeId in progress output (shows command name
for command: nodes rather than the internal DAG key)
- Merge split imports from @archon/workflows/event-emitter into one line
- Add JSDoc @param verbose documentation to renderWorkflowEvent
- Add intent comment on default: break case in renderWorkflowEvent switch
- Add comment explaining why subscription is set up before the try block
- Add test for unsubscribe called when executeWorkflow throws (finally path)
- Add tests for formatDuration sub-second (500ms) and minutes (90s) branches
- Document workflow run progress output in cli.md (--quiet/--verbose flags)
- Update CLAUDE.md verbosity section to reflect dual effect of flags
Approve/reject/status/resume/abandon operations were duplicated between CLI
and command-handler with subtle behavioral drift. This extracts the shared
business logic into packages/core/src/operations/ so both callers are thin
formatting adapters over a single implementation.
Changes:
- Create workflow-operations.ts with 6 shared operations
- Create isolation-operations.ts with list/cleanup operations
- Thin command-handler cases to delegate to operations
- Thin CLI workflow/isolation commands to delegate to operations
- Add 15 unit tests for operations layer
- Update docs to reflect operations layer
- Add TODO for future dispatchOrchestratorWorkflow extraction
Fixes #988
Subscribe to WorkflowEventEmitter in workflowRunCommand() and render
node lifecycle events (started, completed, failed, skipped, approval)
to stderr. Tool events shown only with --verbose; --quiet suppresses
all progress output.
Changes:
- Add quiet/verbose fields to WorkflowRunOptions interface
- Thread --quiet/--verbose flags from cli.ts to workflowRunCommand
- Add renderWorkflowEvent() to format events as stderr progress lines
- Subscribe before executeWorkflow, unsubscribe in finally block
- Add 10 tests for progress rendering and flag behavior
Fixes #390
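A sketch of how the quiet/verbose gating might work. The event names, fields, and output format here are assumptions for illustration; only the function name renderWorkflowEvent and the flag semantics come from the notes above.

```typescript
// Hypothetical event union; the real emitter's event names and fields may differ.
type WorkflowEvent =
  | { type: 'node_started'; nodeName: string }
  | { type: 'node_completed'; nodeName: string; durationMs: number }
  | { type: 'node_failed'; nodeName: string; error: string }
  | { type: 'tool_called'; nodeName: string; toolName: string };

interface RenderOptions {
  quiet?: boolean;
  verbose?: boolean;
}

// Return the progress line to write to stderr, or null when it is suppressed.
function renderWorkflowEvent(event: WorkflowEvent, opts: RenderOptions = {}): string | null {
  // --quiet suppresses all progress output.
  if (opts.quiet) return null;
  switch (event.type) {
    case 'node_started':
      return `[${event.nodeName}] started`;
    case 'node_completed':
      return `[${event.nodeName}] completed (${event.durationMs}ms)`;
    case 'node_failed':
      return `[${event.nodeName}] failed: ${event.error}`;
    case 'tool_called':
      // Tool events are shown only with --verbose.
      return opts.verbose ? `[${event.nodeName}] tool: ${event.toolName}` : null;
    default:
      // Unknown event types are ignored rather than treated as errors.
      return null;
  }
}
```

Returning a string (or null) instead of writing directly keeps the renderer easy to test; the caller decides to write non-null lines to stderr.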
* feat: use 1M context model for implement nodes in bundled workflows (#1016)
Large codebases fill the 200k context window during implementation, triggering
SDK auto-compaction which loses context detail and slows execution. Setting
claude-opus-4-6[1m] on implementation nodes gives 5x more room before compaction.
Changes:
- Set model: claude-opus-4-6[1m] on implement nodes in 8 bundled workflows
- Fix loop nodes to respect per-node model overrides (previously only used
workflow-level model)
- Review/classify/report nodes stay on sonnet/haiku for cost efficiency
Fixes #1016
* fix: resolve per-node provider/model overrides for loop nodes
Move provider and model resolution for loop nodes to the call site,
matching the pattern used by command/prompt/bash nodes. This fixes:
- Loop nodes now respect per-node `provider` overrides (previously ignored)
- Model/provider compatibility is validated before execution
- JSDoc on buildLoopNodeOptions accurately describes the function
- Add missing blank line before JSDoc in CodexClient constructor
to match ClaudeClient style
- Update dag-executor test comment from "Default retry" to reflect
that retry config is now explicit (max_attempts: 2 = 2 retries → 3 total)
The test suite took ~131s because retry/reconnect tests waited on real
setTimeout delays (2-8s per retry). Make each delay injectable via
constructor options so tests can pass 1ms, and add --parallel to run
package test batches concurrently.
Changes:
- Add --parallel flag to root test script for concurrent package runs
- Add retryBaseDelayMs option to ClaudeClient and CodexClient constructors
- Add graceMs param to SSETransport constructor
- Add retryDelayMs option to GitHub and Gitea adapter constructors
- Update all retry/reconnect tests to use 1ms delays
- Add delay_ms: 1 to DAG executor retry test nodes
- Reduce test timeouts from 30-60s to 1-5s
Fixes #1008
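The injectable-delay pattern can be sketched as follows. RetryingClient, its maxRetries option, the 2000ms default, and the exponential schedule are assumptions chosen to be consistent with the 2-8s delays mentioned above; only the retryBaseDelayMs option name comes from the change list.

```typescript
interface RetryOptions {
  retryBaseDelayMs?: number; // tests can pass 1 to avoid real waits
  maxRetries?: number;       // hypothetical option, for illustration only
}

// Hypothetical stand-in for a client with retry logic.
class RetryingClient {
  private readonly retryBaseDelayMs: number;
  private readonly maxRetries: number;

  constructor(opts: RetryOptions = {}) {
    this.retryBaseDelayMs = opts.retryBaseDelayMs ?? 2000;
    this.maxRetries = opts.maxRetries ?? 3;
  }

  // Delay before retry number `attempt` (1-based): base, 2x base, 4x base, ...
  delayForAttempt(attempt: number): number {
    return this.retryBaseDelayMs * 2 ** (attempt - 1);
  }

  // Run `op`, retrying up to maxRetries times with the delay schedule above.
  async run<T>(op: () => Promise<T>): Promise<T> {
    for (let attempt = 1; ; attempt++) {
      try {
        return await op();
      } catch (err) {
        if (attempt > this.maxRetries) throw err;
        const delayMs = this.delayForAttempt(attempt);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
}
```

In a test, `new RetryingClient({ retryBaseDelayMs: 1 })` exercises the same retry path with waits of a few milliseconds instead of seconds.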
- Update /init and /worktree error messages to reference /register-project instead of removed /clone and /setcwd commands
- Update .claude/rules/orchestrator.md: fix deterministic gate count (7→10), add /commands, /init, /worktree to table, remove 9 deleted commands, fix getTriggerForCommand example, update TransitionTrigger values list, fix anti-pattern count
- Add isError handling to environments query in ProjectDetail.tsx, matching the established pattern used by the conversations and runs queries
- Add tests for result chunk cost/stopReason/numTurns/modelUsage fields in claude.test.ts
- Add tests for rate_limit_event chunk (with and without rate_limit_info)
- Add tests for total_cost_usd accumulation in completeWorkflowRun (single-node, multi-node, no-cost)
- Add test for loop node cost accumulation across iterations
- Fix stale JSDoc in executeNodeInternal and executeLoopNode (NodeOutput -> NodeExecutionResult)
- Change stopReason guard from truthiness to !== undefined for consistency
- Add comment explaining totalCostUsd > 0 guard intent
- Add comment noting resume runs only accumulate cost for the current invocation
- Add rate_limit fallthrough comment in executeNodeInternal and executeLoopNode
- Fix tier 2 case-insensitive match to use filter+length check
(consistent with tiers 3/4, fulfils JSDoc ambiguity-throws contract)
- Add warn log on ambiguous match in command-handler catch block
- Add explicit WorkflowDefinition type annotation on let workflow
- Fix pre-existing parseWorkflowInvocation log event names to use
domain prefix (workflow.invoke_case_insensitive_match, workflow.invoke_unknown)
- Update CLAUDE.md to reflect unified resolveWorkflowName() across all platforms
Remove /clone, /getcwd, /setcwd, /repos, /repo, /repo-remove, /command-set,
/load-commands, and /reset-context from the command handler — these were
unreachable (fell through to AI router) and superseded by newer workflows and
auto-discovery. Promote /commands, /init, and /worktree to the deterministic
gate so they work reliably. Clean up 5 dead TransitionTrigger variants and
their DeactivatingCommand mappings. Add active worktree list to the web UI
project sidebar using the existing environments API endpoint.
Fixes #983
Chat platforms (Slack, Telegram, Web, GitHub, Discord) only supported
exact and case-insensitive workflow name matching, while the CLI had
full 4-tier resolution (exact, case-insensitive, suffix, substring).
Extract resolveWorkflowName() into @archon/workflows/router and use it
from both CLI and command-handler for consistent behavior everywhere.
Changes:
- Add resolveWorkflowName() to packages/workflows/src/router.ts
- Replace 44 lines of inline matching in CLI with shared function call
- Replace 2-tier matching in command-handler with shared function call
- Add suffix, substring, and ambiguity tests for command-handler
- Add comprehensive unit tests for resolveWorkflowName in router.test.ts
- Update docs to note matching applies to all platforms
Fixes #986
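A minimal sketch of 4-tier resolution with ambiguity detection. The signature and the exact suffix rule (matching on a `-name` suffix) are assumptions for illustration; the real resolveWorkflowName in router.ts may differ.

```typescript
// True when `value` ends with `suffix`.
function endsWithText(value: string, suffix: string): boolean {
  return value.length >= suffix.length && value.slice(value.length - suffix.length) === suffix;
}

// Try each tier in order; the first tier with exactly one match wins.
// A tier with several matches is ambiguous and throws instead of guessing.
function resolveWorkflowName(input: string, names: readonly string[]): string | undefined {
  const needle = input.toLowerCase();
  const tiers: Array<(name: string) => boolean> = [
    (name) => name === input,                                 // 1: exact
    (name) => name.toLowerCase() === needle,                  // 2: case-insensitive
    (name) => endsWithText(name.toLowerCase(), `-${needle}`), // 3: suffix (assumed rule)
    (name) => name.toLowerCase().indexOf(needle) !== -1,      // 4: substring
  ];
  for (const tier of tiers) {
    const matches = names.filter(tier);
    if (matches.length === 1) return matches[0];
    if (matches.length > 1) {
      throw new Error(`Ambiguous workflow name "${input}": ${matches.join(', ')}`);
    }
  }
  return undefined;
}
```

Using the same filter-then-count check in every tier keeps the ambiguity behavior uniform, which is the point of sharing one function between the CLI and the chat platforms.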
Extract total_cost_usd, stop_reason, num_turns, and model_usage from
Claude SDK result messages. Accumulate per-node cost in dag-executor
(regular + loop nodes), aggregate to per-run total via completeWorkflowRun
metadata, and display cost in the Web UI WorkflowRunCard. Rate limit
events are logged as warnings and yielded as a new message chunk type.
Fixes #932
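The per-run aggregation might be sketched like this, assuming a hypothetical result shape with an optional costUsd field and a helper named sumRunCostUsd (neither is taken from the actual dag-executor code):

```typescript
// Hypothetical per-node result; only the cost field matters for this sketch.
interface NodeCostResult {
  nodeId: string;
  costUsd?: number; // absent when the provider reported no cost
}

// Sum the per-node costs into a per-run total.
// Returns undefined when nothing was reported, so the UI can hide the cost
// instead of displaying a misleading $0.00.
function sumRunCostUsd(results: NodeCostResult[]): number | undefined {
  let total = 0;
  for (const result of results) {
    if (result.costUsd !== undefined) total += result.costUsd;
  }
  return total > 0 ? total : undefined;
}
```

A loop node would contribute one entry per iteration under this model, so its cost accumulates the same way as a sequence of regular nodes.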
The PUT fallback test was calling getArchonHome() unmocked and writing a
real file to ~/.archon/.archon/workflows/ with no cleanup. Redirect writes
to a throwaway tmpdir via ARCHON_HOME env var override (the canonical
override path used in Docker/CI), with try/finally cleanup.
Also strengthen the DELETE fallback test to assert the error body contains
the workflow name, matching the adjacent non-fallback 404 test.
Move section 3.4 (Identify New Abstractions) before PHASE_3_CHECKPOINT so
the checkpoint properly gates the step. Add a new checkbox for it.
Fix the grep pipeline: insert sed 's/^+//' to strip the diff '+' prefix
before the anchored grep -E patterns, which previously matched nothing.
Default planning and investigation prompts guide WHAT to do but not HOW
to reason about it. This adds primitives-level analysis sections so the
AI reasons from fundamentals before proposing fixes or plans.
Changes:
- archon-investigate-issue.md: Phase 3.0 First-Principles Analysis
- archon-create-plan.md: Phase 5.0 Primitives Inventory
- archon-web-research.md: ecosystem primitives research target
- archon-pr-review-scope.md: Phase 3.4 Identify New Abstractions
- archon-code-review-agent.md: Phase 2.5 Check for Primitive Duplication
Fixes #955