Archon

mirror of https://github.com/coleam00/Archon synced 2026-04-21 13:37:41 +00:00

Author	SHA1	Message	Date
Cole Medin	2732288f07	Merge pull request #1065 from coleam00/archon/task-fix-issue-1055 feat(core): inject workflow run context into orchestrator prompt	2026-04-16 07:55:00 -05:00
Cole Medin	b100cd4b48	Merge pull request #1064 from coleam00/archon/task-fix-issue-1054 fix(web): interleave tool calls with text during SSE streaming	2026-04-16 07:44:48 -05:00
Cole Medin	5acf5640c8	Merge pull request #1063 from coleam00/archon/task-fix-issue-1035 fix: archon setup --spawn fails on Windows with spaces in repo path	2026-04-16 07:36:58 -05:00
Cole Medin	68ecb75f0f	Merge pull request #1052 from coleam00/archon/task-fix-github-issue-1775831868291 fix(cli): send workflow dispatch/result messages for Web UI cards	2026-04-16 07:32:52 -05:00
Cole Medin	51b8652d43	fix: complete defensive chaining and add missing test coverage for PR #1052
- Fix half-applied optional chaining in WorkflowProgressCard refetchInterval (query.state.data?.run.status → ?.run?.status) preventing TypeError in polling
- Add dispatch-failure test verifying executeWorkflow still runs when dispatch sendMessage fails
- Add paused-workflow test proving paused guard fires before summary check
- Strengthen dispatch metadata assertion to verify workerConversationId format
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 07:32:37 -05:00
jinglesthula	3dedc22537	Fix incorrect substep numbering in setup.md (#1013) Substeps for Step 4 were: 4a, 4b, 5c, 5d Co-authored-by: Jon Anderson <jonathan.anderson@byu.edu>	2026-04-15 12:15:35 +03:00
Rasmus Widing	882fc58f7c	fix: stop server startup from auto-failing in-flight workflow runs (#1216) (#1231)
* fix: stop server startup from auto-failing in-flight workflow runs (#1216)
`failOrphanedRuns()` at server startup unconditionally flipped every `running` workflow row to `failed`, including runs actively executing in another process (CLI / adapters). The dag-executor's between-layer status check then bailed out of the run with exit code 1, even though every node had completed successfully. Same class of bug the CLI already learned (see comment at packages/cli/src/cli.ts:256-258).
Per the new CLAUDE.md principle "No Autonomous Lifecycle Mutation Across Process Boundaries", we don't replace the call with a timer-based heuristic. Instead we remove it and surface running workflows to the user with one-click actions.
Backend
- `packages/server/src/index.ts`: remove the `failOrphanedRuns()` call at startup. Replace with explanatory comment referencing the CLI precedent and the CLAUDE.md principle. The function in `packages/core/src/db/workflows.ts:911` is preserved for use by the explicit `archon workflow cleanup` command.
UI
- `packages/web/src/components/layout/TopNav.tsx`: replace the binary pulse dot on the Dashboard nav with a numeric count badge sourced from `/api/dashboard/runs` `counts.running`. Hidden when count is 0. Same 10s polling interval as before. No animation: a steady factual count is honest; a pulse would imply system judgment.
- `packages/web/src/components/dashboard/ConfirmRunActionDialog.tsx` (new): shadcn AlertDialog wrapper for destructive workflow-run actions, mirroring the codebase-delete pattern in `sidebar/ProjectSelector.tsx`. Caller passes the existing button as `trigger` slot; dialog handles open/close via Radix.
- `packages/web/src/components/dashboard/WorkflowRunCard.tsx`: replace 4 `window.confirm()` callsites (Reject, Abandon, Cancel, Delete) with ConfirmRunActionDialog. Each gets a context-appropriate description.
- `packages/web/src/components/dashboard/WorkflowHistoryTable.tsx`: replace 1 `window.confirm()` (Delete) with the same dialog.
CHANGELOG entries under [Unreleased]: Fixed for #1216, two Changed entries for the nav badge and dialog upgrade.
No new tests: the web package has no React component testing infrastructure (existing `bun test` covers `src/lib/` and `src/stores/` only). Type-check + lint + manual UI verification + the backend reproducer are the verification levels.
Closes #1216.
* review: address PR #1231 nits (stale doc + 3 code polish)
PR review surfaced one real correctness issue in docs and three small code polish items. None block merge; addressing for cleanliness.
- packages/docs-web/src/content/docs/guides/authoring-workflows.md:486 removed the "auto-marked as failed on next startup" paragraph that described the now-deleted behavior. Replaced with a "Crashed servers / orphaned runs" note pointing users at `archon workflow cleanup` and the dashboard Cancel/Abandon buttons; explains the auto-resume mechanism still works once the row reaches a terminal status.
- ConfirmRunActionDialog: narrow `onConfirm` from `() => void | Promise<void>` to `() => void`. All five callsites are synchronous wrappers around React Query mutations whose error handling lives at the page level (`runAction` in DashboardPage). The union widened the API for no current caller. Documented in the JSDoc what to do if an awaiting caller appears later.
- TopNav: dropped the redundant `String(runningCount)` cast in the aria-label, since the template literal coerces. Also rewrote the comment above the `listDashboardRuns` query: the previous version implied `limit=1` constrained `counts.running`; in fact `counts` is a server-side aggregate independent of `limit`, and `limit=1` only minimises the `runs` array we discard.
* review: correct remediation docs (cleanup ≠ abandon)
CodeRabbit caught a factual error I introduced in the doc update: `archon workflow cleanup` calls `deleteOldWorkflowRuns(days)` which DELETEs old terminal rows (`completed`/`failed`/`cancelled` older than N days) for disk hygiene. It does NOT transition stuck `running` rows. The correct remediation for a stuck `running` row is either the dashboard's per-row Cancel/Abandon button (already documented) or `archon workflow abandon <run-id>` from the CLI (existing subcommand, see packages/cli/src/cli.ts:366-374).
Fixed three locations:
- packages/docs-web/.../guides/authoring-workflows.md: replaced the vague "clean up explicitly" with concrete Web UI / CLI instructions and an explicit "Not to be confused with `archon workflow cleanup`" callout to close off the ambiguity CodeRabbit flagged.
- packages/server/src/index.ts: comment updated to point at the correct remediation (`archon workflow abandon`) and clarify that `archon workflow cleanup` is unrelated disk-hygiene.
- CHANGELOG.md: same correction in the [Unreleased] Fixed entry.	2026-04-15 12:05:41 +03:00
Rasmus Widing	5c8c39e5c9	fix(test): update stale mocks in cleanup-service 'continues processing' test (#1230) (#1232) After PR #1034 changed worktree existence checks from execFileAsync to fs/promises.access, the mockExecFileAsync rejections had no effect. removeEnvironment needs getById + getCodebase mocks to proceed past the early-return guard, otherwise envs route to report.skipped instead of report.removed. Replace the two stale mockExecFileAsync rejection calls with proper mockGetById and mockGetCodebase return values for both test environments. Fixes #1230	2026-04-15 11:53:02 +03:00
Shane McCarron	f61d576a4d	feat(isolation): auto-init submodules in worktrees (#1189)
Worktrees created via `git worktree add` do not initialize submodules, so monorepo workflows that need submodule content find empty directories. Auto-detect `.gitmodules` and run `git submodule update --init --recursive` after worktree creation; classify failures through the isolation error pipeline.
Behavior:
- `.gitmodules` absent → skip silently (zero-cost probe, no effect on non-submodule repos)
- `.gitmodules` present → run submodule init by default (opt out via `worktree.initSubmodules: false`)
- submodule init or `.gitmodules` read failure → throw with classified error including opt-out guidance
- Only `ENOENT` on `.gitmodules` is treated as "no submodules"; other access errors (EACCES, EIO) surface as failures to prevent silent empty-dir worktrees
Changes:
- `packages/isolation/src/providers/worktree.ts`: `initSubmodules()` method + call site in `createWorktree()`
- `packages/isolation/src/errors.ts`: collapsed `errorPatterns` + `knownPatterns` into single `ERROR_PATTERNS` source of truth with `known: boolean` per entry; added submodule pattern with opt-out guidance
- `packages/isolation/src/types.ts` + `packages/core/src/config/config-types.ts`: new `initSubmodules?: boolean` config option
- `packages/docs-web/src/content/docs/reference/configuration.md`: documented the new option and submodule behavior
- Tests: default-on, explicit opt-in, explicit opt-out, skip-when-absent, fail-fast on EACCES, fail-fast on git failure, fail-fast on timeout
Credit to @halindrome for the original implementation and root-cause mapping across #1183, #1187, #1188, #1192.
Follow-up: #1192 (codebase identity rearchitect) would retire the cross-clone guard code in `resolver.ts` and `worktree.ts` that #1198, #1206 added. Separate PR.
Closes #1187	2026-04-15 09:48:18 +03:00
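The `.gitmodules` probe rules listed in f61d576a4d can be sketched as a small decision function. This is an illustration only: `decideSubmoduleInit` and `SubmoduleDecision` are hypothetical names, checking the opt-out before the probe result is an assumption, and the real logic in `initSubmodules()` may be structured differently.

```typescript
// Hypothetical sketch of the probe decision; not the project's actual code.
type SubmoduleDecision = 'skip' | 'init';

// `probeError` is the error raised while reading `.gitmodules`,
// or null when the file was read successfully.
function decideSubmoduleInit(
  probeError: { code?: string } | null,
  initSubmodules: boolean = true, // mirrors `worktree.initSubmodules` (default on)
): SubmoduleDecision {
  if (!initSubmodules) return 'skip'; // explicit opt-out (assumed to win over the probe)
  if (probeError === null) return 'init'; // `.gitmodules` is present and readable
  if (probeError.code === 'ENOENT') return 'skip'; // repo has no submodules
  // EACCES, EIO and similar must fail loudly instead of leaving empty directories.
  throw new Error(
    `Cannot read .gitmodules (${probeError.code ?? 'unknown error'}); ` +
      'set worktree.initSubmodules: false to opt out',
  );
}
```

Treating only `ENOENT` as "no submodules" keeps permission and I/O problems visible to the user.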
Rasmus Widing	c4ab0a2333	docs(claude.md): codify "no autonomous lifecycle mutation across process boundaries" Generalize the lesson from #1216 (and the CLI precedent at packages/cli/src/cli.ts:256-258) into a project-wide engineering principle. When a process cannot reliably distinguish "actively running elsewhere" from "orphaned by a crash" — typically because the work was started by a different process or input source (CLI, adapter, webhook, web UI, cron) — it must not autonomously mutate that work based on a timer or staleness guess. Surface and ask instead. Phrased to be specific about what is still allowed: heuristics for recoverable operations (retry backoff, subprocess timeouts, hygiene cleanup of terminal-status data) are not banned. The rule targets destructive mutation of non-terminal state owned by an unknowable other party.	2026-04-15 09:14:15 +03:00
Kagura	73d9240eb3	fix(isolation): complete reports false success when worktree remains on disk (fixes #964) (#1034)
* fix(isolation): complete reports false success when worktree remains on disk (fixes #964)
Three changes to prevent ghost worktrees:
1. isolationCompleteCommand now checks result.worktreeRemoved: if the worktree was not actually removed (partial failure), it reports 'Partial' with warnings and counts as failed, not completed. Previously only skippedReason was checked; a destroy that returned successfully but with worktreeRemoved=false would still print 'Completed'.
2. WorktreeProvider.destroy() now runs 'git worktree prune' after removal to clean up stale worktree references that git may keep even after the directory is removed.
3. WorktreeProvider.destroy() adds post-removal verification: after git worktree remove, it checks 'git worktree list --porcelain' to confirm the worktree is actually unregistered. If still registered, worktreeRemoved is set back to false with a descriptive warning.
* fix: address CodeRabbit review (ghost worktree prune, partial cleanup callers, accurate messages)
* test: add regression test for Partial branch in isolation complete
Exercises the !result.worktreeRemoved path (without skippedReason) that was flagged as uncovered by CodeRabbit review.	2026-04-14 17:58:45 +03:00
Matt Chapman	28b258286f	Extra backticks for markdown block to fix formatting (#1218) of nested code blocks.	2026-04-14 17:58:31 +03:00
Rasmus Widing	81859d6842	fix(providers): replace Claude SDK embed with explicit binary-path resolver (#1217)
* feat(providers): replace Claude SDK embed with explicit binary-path resolver
Drop `@anthropic-ai/claude-agent-sdk/embed` and resolve Claude Code via CLAUDE_BIN_PATH env → assistants.claude.claudeBinaryPath config → throw with install instructions. The embed's silent failure modes on macOS (#1210) and Windows (#1087) become actionable errors with a documented recovery path. Dev mode (bun run) remains auto-resolved via node_modules.
The setup wizard auto-detects Claude Code by probing the native installer path (~/.local/bin/claude), npm global cli.js, and PATH, then writes CLAUDE_BIN_PATH to ~/.archon/.env. Dockerfile pre-sets CLAUDE_BIN_PATH so extenders using the compiled binary keep working. Release workflow gets negative and positive resolver smoke tests.
Docs, CHANGELOG, README, .env.example, CLAUDE.md, test-release and archon skills all updated to reflect the curl-first install story.
Retires #1210, #1087, #1091 (never merged, now obsolete). Implements #1176.
* fix(providers): only pass --no-env-file when spawning Claude via Bun/Node
`--no-env-file` is a Bun flag that prevents Bun from auto-loading `.env` from the subprocess cwd. It is only meaningful when the Claude Code executable is a `cli.js` file, in which case the SDK spawns it via `bun`/`node` and the flag reaches the runtime.
When `CLAUDE_BIN_PATH` points at a native compiled Claude binary (e.g. `~/.local/bin/claude` from the curl installer, which is Anthropic's recommended default), the SDK executes the binary directly. Passing `--no-env-file` then goes straight to the native binary, which rejects it with `error: unknown option '--no-env-file'` and the subprocess exits with code 1.
Emit `executableArgs` only when the target is a `.js` file (dev mode or explicit cli.js path). Caught by end-to-end smoke testing against the curl-installed native Claude binary.
* docs: record env-leak validation result in provider comment Verified end-to-end with sentinel `.env` and `.env.local` files in a workflow CWD that the native Claude binary (curl installer) does not auto-load `.env` files. With Archon's full spawn pathway and parent env stripped, the subprocess saw both sentinels as UNSET. The first-layer protection in `@archon/paths` (#1067) handles the inheritance leak; `--no-env-file` only matters for the Bun-spawned cli.js path, where it is still emitted. * chore(providers): cleanup pass — exports, docs, troubleshooting Final-sweep cleanup tied to the binary-resolver PR: - Mirror Codex's package surface for the new Claude resolver: add `./claude/binary-resolver` subpath export and re-export `resolveClaudeBinaryPath` + `claudeFileExists` from the package index. Renames the previously single `fileExists` re-export to `codexFileExists` for symmetry; nothing outside the providers package was importing it. - Add a "Claude Code not found" entry to the troubleshooting reference doc with platform-specific install snippets and pointers to the AI Assistants binary-path section. - Reframe the example claudeBinaryPath in reference/configuration.md away from cli.js-only language; it accepts either the native binary or cli.js. * test+refactor(providers, cli): address PR review feedback Two test gaps and one doc nit from the PR review (#1217): - Extract the `--no-env-file` decision into a pure exported helper `shouldPassNoEnvFile(cliPath)` so the native-binary branch is unit testable without mocking `BUNDLED_IS_BINARY` or running the full sendQuery pathway. Six new tests cover undefined, cli.js, native binary (Linux + Windows), Homebrew symlink, and suffix-only matching. Also adds a `claude.subprocess_env_file_flag` debug log so the security-adjacent decision is auditable. 
- Extract the three install-location probes in setup.ts into exported wrappers (`probeFileExists`, `probeNpmRoot`, `probeWhichClaude`) and export `detectClaudeExecutablePath` itself, so the probe order can be spied on. Six new tests cover each tier winning, fall-through ordering, npm-tier skip when not installed, and the which-resolved-but-stale-path edge case. - CLAUDE.md `claudeBinaryPath` placeholder updated to reflect that the field accepts either the native binary or cli.js (the example value was previously `/absolute/path/to/cli.js`, slightly misleading now that the curl-installer native binary is the default). Skipped from the review by deliberate scope decision: - `resolveClaudeBinaryPath` async-with-no-await: matches Codex's resolver signature exactly. Changing only Claude breaks symmetry; if pursued, do both providers in a separate cleanup PR. - `isAbsolute()` validation in parseClaudeConfig: Codex doesn't do it either. Resolver throws on non-existence already. - Atomic `.env` writes in setup wizard: pre-existing pattern this PR touched only adjacently. File as separate issue if needed. - classifyError branch in dag-executor for setup errors: scope creep. - `.env.example` "missing #" claim: false positive (verified all CLAUDE_BIN_PATH lines have proper comment prefixes). * fix(test): use path.join in Windows-compatible probe-order test The "tier 2 wins (npm cli.js)" test hardcoded forward-slash path comparisons, but `path.join` produces backslashes on Windows. Caused the Windows CI leg of the test suite to fail while macOS and Linux passed. Use `path.join` for both the mock return value and the expectation so the separator matches whatever the platform produces.	2026-04-14 17:56:37 +03:00
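The `--no-env-file` decision described in 81859d6842 can be sketched as a pure helper. Only the name `shouldPassNoEnvFile(cliPath)` and the "`.js` targets only" rule come from the commit message; the body below is a guess at the shape, `buildExecutableArgs` is a hypothetical wrapper, and treating an undefined path as the dev-mode `cli.js` case is an assumption.

```typescript
// Sketch only; the real helper in the providers package may differ.
function shouldPassNoEnvFile(cliPath: string | undefined): boolean {
  // Assumption: no explicit path means the SDK falls back to its own cli.js (dev mode).
  if (cliPath === undefined) return true;
  // Suffix-only match: a JavaScript entry point is run through Bun/Node,
  // which understands the flag; a native binary would reject it.
  return cliPath.endsWith('.js');
}

// Hypothetical wrapper showing how the flag would be emitted conditionally.
function buildExecutableArgs(cliPath: string | undefined): string[] {
  return shouldPassNoEnvFile(cliPath) ? ['--no-env-file'] : [];
}

const nativeArgs = buildExecutableArgs('/home/user/.local/bin/claude'); // []
const scriptArgs = buildExecutableArgs('/usr/lib/node_modules/pkg/cli.js'); // ['--no-env-file']
```

Keeping the decision in a pure function makes the native-binary branch testable without spawning a subprocess.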
Rasmus Widing	33d31c44f1	fix: lock workflow runs by working_path (#1036, #1188 part 2) (#1212)
* fix: lock workflow runs by working_path (#1036, #1188 part 2)
Both bugs reduce to the same primitive: there's no enforced lock on working_path, so two dispatches that resolve to the same filesystem location can race. The DB row is the lock token; pending/running/paused are "lock held"; terminal statuses release.
Changes:
- getActiveWorkflowRunByPath includes `pending` (with 5-min stale-orphan age window), accepts excludeId + selfStartedAt, and orders by (started_at ASC, id ASC) for a deterministic older-wins tiebreaker. Eliminates the both-abort race where two near-simultaneous dispatches with similar timestamps could mutually abort each other.
- Move the executor's guard call site to AFTER workflowRun is finalized (preCreated, resumed, or freshly created). This guarantees we always have self-ID + started_at to pass to the lock query.
- On guard fire after row creation: mark self as 'cancelled' so we don't leave a zombie pending row that would then become its own lock holder.
- New error message includes workflow name, duration, short run id, and three concrete next-action commands (status / cancel / different branch). Replaces the vague "Workflow already running".
- Resume orphan fix: when executor activates a resumable run, mark the orchestrator's pre-created row as 'cancelled'. Without this, every resume leaks a pending row that would block the user's own back-to-back resume until the 5-min stale window.
- New formatDuration helper for the error message (8 unit tests).
Tests:
- 5 new tests in db/workflows.test.ts: pending in active set, age window, excludeId exclusion, tiebreaker SQL shape, ordering.
- 5 new tests in executor.test.ts: self-id passed to query, self-cancel on guard fire, new message format, resume orphan cancellation, resume proceeds even if orphan cancel fails.
- Updated 2 executor-preamble tests for new structural behavior (row-then-guard, new message format). - 8 new tests for formatDuration. Deferred (kept scope tight): - Worktree-layer advisory lockfile (residual #1188.2 microsecond race where both dispatches reach provider.create — bounded by git's own atomicity for `worktree add`). - Startup cleanup of pre-existing stale pending rows (5-min age window makes them harmless). - DB partial UNIQUE constraint migration (code-only is sufficient). Fixes #1036 Fixes #1188 (part 2) * fix: SQLite Date binding + UTC timestamp parse for path lock guard Two issues found during E2E smoke testing: 1. bun:sqlite rejects Date objects as bindings ("Binding expected string, TypedArray, boolean, number, bigint or null"). Serialize selfStartedAt to ISO string before passing — PostgreSQL accepts ISO strings for TIMESTAMPTZ comparison too. 2. SQLite returns datetimes as plain strings without timezone suffix ("YYYY-MM-DD HH:MM:SS"), and JS new Date() parses such strings as local time. The blocking message was showing "running 3h" for workflows started seconds ago in a UTC+3 timezone. Added parseDbTimestamp helper that: - Returns Date.getTime() unchanged for Date inputs (PG path) - Treats SQLite-style strings as UTC by appending Z Used at both call sites: the lock query (selfStartedAt) and the blocking message duration. Tests: - 4 new tests in duration.test.ts for parseDbTimestamp covering Date input, SQLite UTC interpretation, explicit Z, and explicit +/-HH:MM offsets. - Updated workflows.test.ts assertion for ISO serialization. E2E smoke verified end-to-end: - Sanity (single dispatch) succeeds. - Two concurrent --no-worktree dispatches: one wins, one blocked with actionable message showing correct "Xs" duration. - Resume + back-to-back resume both succeed (orphan correctly cancelled when resume activates). 
* fix: address review — resume timestamp, lock-leak paths, status copy CodeRabbit review on #1212 surfaced three real correctness gaps: CRITICAL — resumeWorkflowRun preserved historical started_at, letting a resumed row sort ahead of a currently-active holder in the lock query's older-wins tiebreaker. Two active workflows could end up on the same working_path. Fix: refresh started_at to NOW() in resumeWorkflowRun. Original creation time is recoverable from workflow_events history if needed for analytics. MAJOR — lock-leak failure paths: - If resumeWorkflowRun() throws, the orchestrator's pre-created row was left as 'pending' until the 5-min stale window. Fix: cancel preCreatedRun in the resume catch. - If getActiveWorkflowRunByPath() throws, workflowRun (possibly already promoted to 'running' via resume) was left active with no auto-cleanup. Fix: cancel workflowRun in the guard catch. MINOR — the blocking message always said "running" but the lock query returns running, paused, AND fresh-pending rows. Telling a user to "wait for it to finish" on a paused run (waiting on user approval) would block them indefinitely. Fix: status-aware copy: - paused: "paused waiting for user input" + approve/reject actions - pending: "starting" verb - running: keep current Tests: - New: resume refreshes started_at (asserts SQL contains `started_at = NOW()`) - New: cancels preCreatedRun when resumeWorkflowRun throws - New: cancels workflowRun when guard query throws - New: paused message uses approve/reject actions, NOT "wait" - New: pending message uses "starting" verb - New: running message uses default copy - Updated: existing tests for new error string ("already active" reflects status-aware semantics, not just "running") Note: the user-facing error string changed from "already running on this path" to "already active on this path (status)". Internal use only — surfaced via getResult().error, not directly to users. 
* fix: SQLite tiebreaker dialect bug + paired self struct + UX polish CodeRabbit second review found one critical issue and several polish items not addressed in `008013da`. CRITICAL — SQLite tiebreaker silently broken under default deployment. SQLite stores started_at as TEXT "YYYY-MM-DD HH:MM:SS" (space sep). Our ISO param is "YYYY-MM-DDTHH:MM:SS.mmmZ" (T sep). SQLite compares text lexically: char 11 is space (0x20) in column vs T (0x54) in param, so EVERY column value lex-sorts before EVERY ISO param. Result: `started_at < $param` is always TRUE regardless of actual time. In true concurrent dispatches, both sides see each other as "older" and both abort — defeating the older-wins guarantee under SQLite, which is the default deployment. Fix: dialect-aware comparison in getActiveWorkflowRunByPath: - PostgreSQL: `started_at < $3::timestamptz` (TIMESTAMPTZ + cast) - SQLite: `datetime(started_at) < datetime($3)` (forces chronological via SQLite's date/time functions) Documented with reproducer tests in adapters/sqlite.test.ts: lexical returns wrong answer for "2026-04-14 12:00:00" < "2026-04-14T10:00:00Z"; datetime() returns correct answer. Type design — collapse paired params into struct. `excludeId` and `selfStartedAt` had to travel together (tiebreaker references both) but were two independent optionals — future callers could pass one without the other and silently degrade. Replaced with a single `self?: { id: string; startedAt: Date }` to make the paired-or-nothing invariant structural. formatDuration(0) consistency. Old: `if (ms <= 0) return '0s'` — special-cased 0ms despite the "sub-second rounds up to 1s" comment. Fixed to `ms < 0` so 0ms returns '1s' (a run that just started in the same DB second should display as active, not literal zero). Comment fix: "We acquired the lock via createWorkflowRun" was misleading — createWorkflowRun creates a row; the lock is determined later by the query. 
Log context: added cwd to workflow.guard_self_cancel_failed and pendingRunId to db_active_workflow_check_failed so operators can correlate leaked rows. Doc fixes: - /workflow abandon doc said "marks as failed" — actually 'cancelled' - database.md "Prevents concurrent workflow execution" → accurate description of path-based lock with stale-pending tolerance Test additions: - 3 SQLite-direct tests in adapters/sqlite.test.ts proving the lexical-vs-chronological bug and the datetime() fix - Guard self-cancel update throw still surfaces failure to user Signature change rippled through: - IWorkflowStore.getActiveWorkflowRunByPath now takes (path, self?) - All internal callers updated	2026-04-14 15:19:38 +03:00
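The lexical-versus-chronological bug described in 33d31c44f1 can be reproduced outside the database. The sketch below is a hypothetical re-implementation of a `parseDbTimestamp`-style helper for illustration; the actual fix compares with `datetime()` in SQL, and the project's helper may differ in detail.

```typescript
// Hypothetical re-implementation; not the project's actual helper.
function parseDbTimestamp(value: Date | string): number {
  if (value instanceof Date) return value.getTime(); // PostgreSQL path
  // Normalise the SQLite-style space separator to ISO "T".
  const normalised = value.replace(' ', 'T');
  // An explicit zone (Z or +/-HH:MM) is respected; otherwise assume UTC.
  const hasZone = /(Z|[+-]\d{2}:\d{2})$/.test(normalised);
  return new Date(hasZone ? normalised : normalised + 'Z').getTime();
}

const column = '2026-04-14 12:00:00'; // how SQLite stores started_at
const param = '2026-04-14T10:00:00Z'; // ISO parameter, two hours earlier

// Lexical comparison: index 10 is ' ' (0x20) in the column and 'T' (0x54)
// in the parameter, so the column always sorts first.
const lexicallyOlder = column < param; // true, which is the wrong answer

// Chronological comparison on parsed instants gives the right answer.
const chronologicallyOlder = parseDbTimestamp(column) < parseDbTimestamp(param); // false
```

Because every stored value sorts before every ISO parameter, each of two concurrent dispatches would see the other as older, which is the both-abort race the commit describes.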
Rasmus Widing	5a4541b391	fix: route canonical path failures through blocked classification (#1211)
Follow-up to #1206 review: the early getCanonicalRepoPath() wrap in resolve() threw directly, escaping the classification flow that createNewEnvironment uses. Permission errors, malformed worktree pointers, ENOENT, etc. surfaced as unclassified crashes instead of becoming an actionable `blocked` result.
Mirror createNewEnvironment's contract:
- isKnownIsolationError → return { status: 'blocked', reason: 'creation_failed', userMessage: classifyIsolationError(err) + suffix }
- unknown errors → throw (programming bugs stay visible as crashes, not silent isolation failures)
Adds two tests in resolver.test.ts:
- EACCES classifies to "Permission denied" blocked message
- Unknown error propagates as throw
Addresses CodeRabbit review comment on #1206.	2026-04-14 15:19:13 +03:00
Rasmus Widing	fd3f043125	fix: extend worktree ownership guard to resolver adoption paths (#1206)
* fix: extend worktree ownership guard to resolver adoption paths (#1183, #1188)
PR #1198 guarded WorktreeProvider.findExisting(), but IsolationResolver has three earlier adoption paths that bypass the provider layer:
- findReusable (DB lookup by workflow identity)
- findLinkedIssueEnv (cross-reference via linked issues)
- tryBranchAdoption (PR branch discovery)
Two clones of the same remote share codebase_id (identity is derived from owner/repo). Without these guards, clone B silently adopts clone A's worktree via any of the three paths.
Changes:
- Extract verifyWorktreeOwnership from WorktreeProvider (private) to @archon/git/src/worktree.ts as an exported function, sitting next to getCanonicalRepoPath which parses the same .git file format
- Call the shared function from all three resolver paths; throw on cross-clone mismatch (DB rows are preserved because they legitimately belong to the other clone)
- Compute canonicalRepoPath once at the top of resolve()
- Six new tests in resolver.test.ts covering each guarded path's cross-checkout and same-clone behaviors
Fixes #1183
Fixes #1188 (part 1: cross-checkout; part 2 parallel collision deferred to follow-up alongside #1036)
* fix: address PR review (polish, observability, secondary gap, docs)
Addresses the multi-agent review on #1206:
Code fixes:
- worktree.adoption_refused_cross_checkout log event renamed to match CLAUDE.md {domain}.{action}_{state} convention
- verifyWorktreeOwnership now preserves err.code and err via { cause } when wrapping fs errors, so classifyIsolationError is robust to Node message format changes
- Structured fields (codebaseId, canonicalRepoPath) added to all cross-clone rejection logs for incident debugging
- Wrap getCanonicalRepoPath at top of resolve() with classified error instead of letting it propagate as an unclassified crash
- Extract assertWorktreeOwnership helper on IsolationResolver:
centralizes warn-then-rethrow contract, removes duplication - Dedupe toWorktreePath(existing.working_path) calls in resolver paths - Add code comment on findLinkedIssueEnv explaining why throw-on-first is intentional (user decision — surfaces anomaly instead of masking) Secondary gap closed: - WorktreeProvider.findExisting PR-branch adoption path (findWorktreeByBranch) now also verifies ownership — same class of bug as the main path, just via a different lookup Tests: - 8 new unit tests for verifyWorktreeOwnership in @archon/git (matching pointer, different clone, EISDIR/ENOENT errno preservation, submodule pointer, corrupted .git, trailing-slash normalization, cause chain) - tryBranchAdoption cross-clone test now asserts store.create was never called (symmetry with paths 1+2 asserting updateStatus) - New test for cross-clone rejection in the PR-branch-adoption secondary path in worktree.test.ts Docs: - CHANGELOG.md Unreleased entry for the cross-clone fix series - troubleshooting.md "Worktree Belongs to a Different Clone" section documenting all four new error patterns with resolution steps and pointer to #1192 for the architectural fix * fix(git): use raw .git pointer in cross-clone error message verifyWorktreeOwnership previously called path.resolve() on the gitdir path before embedding it in the error message. On Windows, resolve() prepends a drive letter to a POSIX-style path (e.g., /other/clone → C:\other\clone), which: 1. Misled users by showing a path that doesn't match what's actually in their .git file 2. Broke a Windows-only test asserting the error contains the literal /other/clone path Compare on resolved paths (correct — normalizes trailing slashes and relative components for the equality check) but display the raw match in the error message (recognizable to the user).	2026-04-14 12:10:19 +03:00
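The ownership check described in fd3f043125 relies on the `gitdir:` pointer that git writes into a linked worktree's `.git` file. A minimal sketch of the idea follows; `ownerRepoFromPointer`, `stripTrailingSeparators`, and `verifyOwnershipSketch` are hypothetical names, the error wording is invented, and the real `verifyWorktreeOwnership` also handles filesystem errors that are omitted here.

```typescript
// Matches "gitdir: <repo>/.git/worktrees/<name>" and captures <repo>.
const GITDIR_POINTER = /^gitdir: (.+?)[\\/]\.git[\\/]worktrees[\\/][^\\/]+$/;

// Returns the owning repo path exactly as written in the pointer, or null
// for anything else (submodule pointer, malformed file).
function ownerRepoFromPointer(gitFileContent: string): string | null {
  const match = GITDIR_POINTER.exec(gitFileContent.trim());
  return match?.[1] ?? null;
}

// Normalise trailing slashes so "/repo" and "/repo/" compare equal.
function stripTrailingSeparators(p: string): string {
  return p.replace(/[\\/]+$/, '') || p;
}

function verifyOwnershipSketch(gitFileContent: string, expectedRepoPath: string): void {
  const rawOwner = ownerRepoFromPointer(gitFileContent);
  if (rawOwner === null) {
    throw new Error(`Not a worktree pointer: ${gitFileContent.trim()}`);
  }
  // Compare on normalised paths, but show the raw pointer value in the message
  // so it matches what the user sees in their .git file.
  if (stripTrailingSeparators(rawOwner) !== stripTrailingSeparators(expectedRepoPath)) {
    throw new Error(`Worktree belongs to a different clone: ${rawOwner}`);
  }
}
```

Displaying the raw pointer while comparing normalised paths mirrors the Windows fix noted in the same commit.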
Rasmus Widing	af9ed84157	fix: prevent worktree isolation bypass via prompt and git-level adoption (#1198)
* fix: prevent worktree isolation bypass via prompt and git-level adoption (#1193, #1188)
Three fixes for workflows operating on wrong branches:
- archon-implement prompt: replace ambiguous branch table with decision tree that trusts the worktree isolation system, uses $BASE_BRANCH explicitly, and instructs AI to never switch branches
- WorktreeProvider.findExisting: verify worktree's parent repo matches the request before adopting, preventing cross-clone adoption
- WorktreeProvider.createNewBranch: reset stale orphan branches to the intended start-point instead of silently inheriting old commits
Fixes #1193
Relates to #1188
* fix: address PR review (strict worktree verification, align sibling prompts)
Address CodeRabbit + self-review findings on #1198:
Code fixes:
- findExisting now throws on cross-checkout or unverifiable state instead of returning null, avoiding a confusing cascade through createNewBranch
- verifyWorktreeOwnership handles .git errors precisely: ENOENT/EACCES/EIO throw a fail-fast error; EISDIR (full checkout at path) throws a clear "not a worktree" error; unmatched gitdir (submodule, malformed) throws
- Path comparison uses resolve() to normalize trailing slashes
- Added classifyIsolationError patterns so new errors produce actionable user messages
Test fixes:
- mockClear readFile/rm in afterEach
- New tests: cross-checkout throws, EISDIR throws, EACCES throws, submodule pointer throws, trailing-slash normalization, branch -f reset failure propagates without retry
- Updated existing tests that relied on permissive adoption to provide valid matching gitdir
Prompt fixes (sweep of all default commands):
- archon-implement.md: clarify "never switch branches" applies to worktree context; non-worktree branch creation still allowed
- archon-fix-issue.md + archon-implement-issue.md: aligned decision tree with archon-implement pattern; use
$BASE_BRANCH instead of MAIN/MASTER - archon-plan-setup.md: converted table to ordered decision tree with IN WORKTREE? first; removed ambiguous "already on correct feature branch" row	2026-04-14 09:44:12 +03:00
Rasmus Widing	d6e24f5075	feat: Phase 2 — community-friendly provider registry system (#1195 ) * feat: replace hardcoded provider factory with typed registry system Replace the built-in-only factory switch with a typed ProviderRegistration registry where entries carry metadata (displayName, capabilities, isModelCompatible) alongside the factory function. This enables community providers to register without modifying core code. - Add ProviderRegistration and ProviderInfo types to contract layer - Create registry.ts with register/get/list/clear API, delete factory.ts - Bootstrap registerBuiltinProviders() at server and CLI entrypoints - Widen provider unions from 'claude' \| 'codex' to string across schemas, config types, deps, executors, and API validation - Replace hardcoded model-validation with registry-driven isModelCompatible and inferProviderFromModel (built-in only inference) - Add GET /api/providers endpoint returning registry metadata - Dynamic provider dropdowns in Web UI (BuilderToolbar, NodeInspector, WorkflowBuilder, SettingsPage) via useProviders hook - Dynamic provider selection in CLI setup command - Registry test suite covering full lifecycle * feat: generalize assistant config and tighten registry validation - Add ProviderDefaults/ProviderDefaultsMap generic types to contract layer - Add index signatures to ClaudeProviderDefaults/CodexProviderDefaults - Introduce AssistantDefaults/AssistantDefaultsConfig intersection types that combine ProviderDefaultsMap with typed built-in entries - Replace hardcoded claude/codex config merging with generic mergeAssistantDefaults() that iterates all provider entries - Replace hardcoded toSafeConfig projection with generic toSafeAssistantDefaults() that strips server-internal fields - Validate provider strings at all config-entry surfaces: env override, global config, repo config all throw on unknown providers - Validate provider on PATCH /api/config/assistants (400 on unknown) - Move validator.ts from hardcoded 
Codex checks to capability-driven warnings using registry getProviderCapabilities() - Remove resolveProvider() default to 'claude' — returns undefined when no provider is set, skipping capability warnings for unresolved nodes - Widen config API schemas to generic Record<string, ProviderDefaults> - Rewrite SettingsPage to iterate providers dynamically with built-in specific UI for Claude/Codex and generic JSON view for community - Extract bootstrap to provider-bootstrap modules in CLI and server - Remove all as Record<...> casts from dag-executor, executor, orchestrator — clean indexing via ProviderDefaultsMap intersection * fix: remove remaining hardcoded provider assumptions and regenerate types - Replace hardcoded 'claude' defaults in CLI setup with registry lookup (getRegisteredProviders().find(p => p.builtIn)?.id) - Replace hardcoded 'claude' default in clone.ts folder detection with registry-driven fallback - Update config YAML comment from "claude or codex" to "registered provider" - Make bootstrap test assertions use toContain instead of exact toEqual so they don't break when community providers are registered - Widen validator.test.ts helper from 'claude' \| 'codex' to string - Remove unnecessary type casts in NodeInspector, WorkflowBuilder, SettingsPage now that generated types use string - Regenerate api.generated.d.ts from updated OpenAPI spec — all provider fields are now string instead of 'claude' \| 'codex' union * fix: address PR review findings — consistency, tests, docs Critical fixes: - isModelCompatible now throws on unknown providers (fail-fast parity with getProviderCapabilities) instead of silently returning true - Schema provider fields use z.string().trim().min(1) to reject whitespace-only values - validator.ts resolveProvider accepts defaultProvider param so capability warnings fire for config-inherited providers - PATCH /api/config/assistants validates assistants keys against registry (rejects unknown provider IDs in the map) YAGNI 
cleanup: - Delete provider-bootstrap.ts wrappers in CLI and server — call registerBuiltinProviders() directly - Remove no-op .map(provider => provider) in SettingsPage Test coverage: - Add GET /api/providers endpoint tests (shape, projection, capabilities) - Add config-loader throw-path tests for unknown providers in env var, global config, and repo config - Add isModelCompatible throw test for unknown providers Docs: - CLAUDE.md: factory.ts → registry.ts in directory tree, add GET /api/providers to API endpoints section - .env.example: update DEFAULT_AI_ASSISTANT comment - docs-web configuration reference: update provider constraint docs UI: - Settings default-assistant dropdown uses allProviderEntries fallback (no longer silently empty on API failure) - clearRegistry marked @internal in JSDoc * fix: use registry defaults in getDefaults/registerProject, document type design - getDefaults() initializes assistant defaults from registered providers instead of hardcoding { claude: {}, codex: {} } - getDefaults() uses first registered built-in as default assistant instead of hardcoding 'claude' - handleRegisterProject uses config.assistant instead of hardcoded 'claude' for new codebase ai_assistant_type - Document AssistantDefaults/AssistantDefaultsConfig intersection types: built-in keys are typed for parseClaudeConfig/parseCodexConfig type safety; community providers use the generic [string] index - Document WorkflowConfig.assistants intersection type with same rationale * docs: update stale provider references to reflect registry system - architecture.md: DB schema comment now says 'registered provider' - first-workflow.md: provider field accepts any registered provider - quick-reference.md: provider type changed from enum to string - authoring-workflows.md: provider type changed from enum to string - title-generator.ts: @param doc updated from 'claude or codex' to generic provider identifier * docs: fix remaining stale provider references in quick-reference and 
authoring guide - quick-reference.md: per-node provider type changed from enum to string - quick-reference.md: model mismatch guidance updated for registry pattern - authoring-workflows.md: provider comment says 'any registered provider'	2026-04-13 21:27:11 +03:00
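A typed registry of the kind this commit describes might look roughly like the sketch below. The field names (`displayName`, `capabilities`, `isModelCompatible`, `builtIn`) follow the commit message, but the exact shapes, the function names, and the `echo` provider used in the usage note are assumptions made for illustration, not the real `registry.ts` API.

```typescript
// Hypothetical shapes; the real ProviderRegistration type may differ.
interface SketchProviderRegistration {
  id: string;
  displayName: string;
  builtIn: boolean;
  capabilities: Record<string, boolean>;
  isModelCompatible: (model: string) => boolean;
  factory: () => unknown;
}

class SketchUnknownProviderError extends Error {
  constructor(id: string, known: string[]) {
    super(`Unknown provider "${id}". Registered: ${known.join(", ") || "(none)"}`);
    this.name = "SketchUnknownProviderError";
  }
}

const providerRegistry = new Map<string, SketchProviderRegistration>();

function registerProvider(reg: SketchProviderRegistration): void {
  providerRegistry.set(reg.id, reg);
}

// Fail fast on unknown providers instead of returning a permissive default.
function getRegistration(id: string): SketchProviderRegistration {
  const reg = providerRegistry.get(id);
  if (!reg) throw new SketchUnknownProviderError(id, [...providerRegistry.keys()]);
  return reg;
}

function listProviders(): SketchProviderRegistration[] {
  return [...providerRegistry.values()];
}

// Intended for tests only, mirroring the "@internal" note in the commit.
function clearProviderRegistry(): void {
  providerRegistry.clear();
}

function isModelCompatibleWith(providerId: string, model: string): boolean {
  return getRegistration(providerId).isModelCompatible(model);
}
```

Under these assumptions, a community provider would call `registerProvider({ id: "echo", ... })` at startup, and validation code would go through `getRegistration`, so an unknown provider string surfaces as an error rather than silently passing.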
Rasmus Widing	b5c5f81c8a	refactor: extract provider metadata seam for Phase 2 registry readiness (#1185 ) * refactor: extract provider metadata seam for Phase 2 registry readiness - Add static capability constants (capabilities.ts) for Claude and Codex - Export getProviderCapabilities() from @archon/providers for capability queries without provider instantiation - Add inferProviderFromModel() to model-validation.ts, replacing three copy-pasted inline inference blocks in executor.ts and dag-executor.ts - Replace throwaway provider instantiation in dag-executor with static capability lookup (getProviderCapabilities) - Add orchestrator warning when env vars are configured but provider doesn't support envInjection * refactor: address LOW findings from code review - Remove CLAUDE_CAPABILITIES/CODEX_CAPABILITIES from public index (YAGNI — callers should use getProviderCapabilities(), not raw constants) - Remove dead _deps parameter from resolveNodeProviderAndModel and its two call-sites (no longer needed after static capability lookup refactor) - Update factory.ts module JSDoc to mention both exported functions - Add edge-case tests for getProviderCapabilities: empty string and case-sensitive throws (parity with existing getAgentProvider tests) - Add test for inferProviderFromModel with empty string (returns default, documenting the falsy-string shortcut)	2026-04-13 16:10:48 +03:00
Rasmus Widing	bf20063e5a	feat: propagate managed execution env to all workflow surfaces (#1161) * Implement managed execution env propagation * Address managed env review feedback	2026-04-13 15:21:57 +03:00
Rasmus Widing	a8ac3f057b	security: prevent target repo .env from leaking into subprocesses (#1135) Remove the entire env-leak scanning/consent infrastructure: scanner, allow_env_keys DB column usage, allow_target_repo_keys config, PATCH consent route, --allow-env-keys CLI flag, and UI consent toggle. The env-leak gate was the wrong primitive. Target repo .env protection is already structural: - stripCwdEnv() at boot removes Bun-auto-loaded CWD .env keys - Archon loads its own env sources afterward (~/.archon/.env) - process.env is clean before any subprocess spawns - Managed env injection (config.yaml env: + DB vars) is unchanged No scanning, no consent, no blocking. Any repo can be registered and used. Subprocesses receive the already-clean process.env.	2026-04-13 13:46:24 +03:00
Rasmus Widing	c9c6ab47cb	test: add comprehensive e2e smoke test workflows - e2e-all-nodes: exercises bash, prompt, script (bun), structured output, model override (haiku), effort control, and $nodeId.output refs - e2e-mixed-providers: tests Claude + Codex in the same workflow with cross-provider output references - echo-args.js: simple script node test helper	2026-04-13 11:26:05 +03:00
Rasmus Widing	37aeadb8c8	refactor: decompose provider sendQuery() into explicit helper boundaries (#1162 ) * refactor: decompose provider sendQuery() into explicit helper boundaries (#1139) sendQuery() in both Claude and Codex providers was a monolith mixing SDK option building, nodeConfig translation, stream normalization, and error classification. This makes it hard to safely extend for Phase 2 provider extensibility. Decompose both providers into focused internal helpers: Claude: - buildBaseClaudeOptions: SDK option construction - buildToolCaptureHooks: PostToolUse/PostToolUseFailure hook setup - applyNodeConfig: workflow nodeConfig → SDK translation + structured warnings - streamClaudeMessages: raw SDK event → MessageChunk normalization - classifyAndEnrichError: error classification with retry decisions Codex: - buildTurnOptions: per-turn option construction (output schema, abort) - streamCodexEvents: raw SDK event → MessageChunk normalization - classifyAndEnrichCodexError: error classification with retry decisions Also introduces ProviderWarning { code, message } replacing raw string warnings for machine-readable provider translation warnings. Adds 43 focused unit tests covering the extracted helpers directly. Fixes #1139 * fix: export ToolResultEntry type used in public buildBaseClaudeOptions API * fix: unexport internal helpers to prevent API surface leakage, fix retry state bug Review findings: 1. Internal helpers were exported and reachable through package.json subpath exports (./claude/provider, ./codex/provider), widening the public API. All new helpers are now file-local — the only public exports remain ClaudeProvider, CodexProvider, loadMcpConfig, buildSDKHooksFromYAML, withFirstMessageTimeout, getProcessUid. 2. Codex streamState (lastTodoListSignature) was shared across retry attempts, causing todo-list dedup to suppress output on retry. Now creates fresh state per attempt. 
Removed direct helper test imports — existing sendQuery e2e tests (51 Claude + 42 Codex) cover all behavior paths. * fix: address review findings — abort handling, retry bugs, error swallowing Fixes from CodeRabbit + multi-agent review: 1. classifyAndEnrichError preserves first-event timeout diagnostic instead of collapsing it into generic "Query aborted" (the timeout aborts the controller, but the original error carries the #1067 breadcrumb) 2. nodeConfigWarnings emitted once before retry loop, not per attempt 3. buildSubprocessEnv() called once before retry loop (was re-logging auth mode and rebuilding { ...process.env } per attempt) 4. Abort signal listener registered once with forwarding to current controller (was accumulating per-retry listeners) 5. PostToolUse hook wrapped in try/catch (JSON.stringify can throw on circular refs — was asymmetric with PostToolUseFailure which had it) 6. Codex streamCodexEvents throws on abort instead of silent break (callers were getting truncated stream with no result/error) 7. Both providers store enrichedError (not raw error) for retry exhaustion — preserves stderr context in final throw 8. 
Log is_error result events at error level in Claude stream normalizer * test: add black-box behavioral tests for sendQuery decomposition fixes Restore test coverage for the specific fixes from the decomposition review, exercised through sendQuery (black-box) since helpers are file-local: Claude (6 tests): - Timeout error preserved (not collapsed into "Query aborted") - nodeConfig warnings emitted once even when retries occur - Abort signal cancels across retries via single forwarding listener - Enriched error (with stderr) thrown at retry exhaustion - PostToolUse hook handles circular reference without crashing - is_error result events logged at error level Codex (3 tests): - Abort signal throws instead of silently truncating stream - Enriched error thrown at retry exhaustion - Todo-list dedup state resets between retry attempts	2026-04-13 11:24:36 +03:00
Rasmus Widing	6a6740af38	fix: make env-integration test cross-platform (Windows CI) (#1160 ) * fix: make env-integration test cross-platform (Windows CI) Check for Windows env var equivalents (Path instead of PATH, USERPROFILE instead of HOME) in scenario 3 assertions. Closes #1128 * fix: Windows PATH/HOME casing in provider subprocess env test Same cross-platform fix for ClaudeProvider test — spread objects lose Windows case-insensitive behavior (Path vs PATH, USERPROFILE vs HOME).	2026-04-13 09:44:58 +03:00
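The casing problem mentioned here arises because `process.env` is case-insensitive on Windows, while a spread copy such as `{ ...process.env }` is a plain object with whatever casing the OS reported (typically `Path`). One way a test or helper could cope is a case-insensitive lookup, sketched below; `getEnvCaseInsensitive` and `homeDirFrom` are hypothetical names, not functions from the Archon codebase.

```typescript
type EnvSnapshot = Record<string, string | undefined>;

// Looks up `key` in a plain env object, ignoring case.
// An exact match wins; otherwise the first case-insensitive match is returned.
function getEnvCaseInsensitive(env: EnvSnapshot, key: string): string | undefined {
  const exact = env[key];
  if (exact !== undefined) return exact;
  const wanted = key.toLowerCase();
  for (const [name, value] of Object.entries(env)) {
    if (name.toLowerCase() === wanted) return value;
  }
  return undefined;
}

// HOME on POSIX systems, USERPROFILE on Windows.
function homeDirFrom(env: EnvSnapshot): string | undefined {
  return getEnvCaseInsensitive(env, "HOME") ?? getEnvCaseInsensitive(env, "USERPROFILE");
}
```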
Rasmus Widing	c1ed76524b	refactor: extract providers from @archon/core into @archon/providers (#1137 ) * refactor: extract providers from @archon/core into @archon/providers Move Claude and Codex provider implementations, factory, and SDK dependencies into a new @archon/providers package. This establishes a clean boundary: providers own SDK translation, core owns business logic. Key changes: - New @archon/providers package with zero-dep contract layer (types.ts) - @archon/workflows imports from @archon/providers/types — no mirror types - dag-executor delegates option building to providers via nodeConfig - IAgentProvider gains getCapabilities() for provider-agnostic warnings - @archon/core no longer depends on SDK packages directly - UnknownProviderError standardizes error shape across all surfaces Zero user-facing changes — same providers, same config, same behavior. * refactor: remove config type duplication and backward-compat re-exports Address review findings: - Move ClaudeProviderDefaults and CodexProviderDefaults to the @archon/providers/types contract layer as the single source of truth. @archon/core/config/config-types.ts now imports from there. - Remove provider re-exports from @archon/core (index.ts and types/). Consumers should import from @archon/providers directly. - Update @archon/server to depend on @archon/providers for MessageChunk. * refactor: move structured output validation into providers Each provider now normalizes its own structured output semantics: - Claude already yields structuredOutput from the SDK's native field - Codex now parses inline agent_message text as JSON when outputFormat is set, populating structuredOutput on the result chunk This eliminates the last provider === 'codex' branch from dag-executor, making it fully provider-agnostic. The dag-executor checks structuredOutput uniformly regardless of provider. Also removes the ClaudeCodexProviderDefaults deprecated alias — all consumers now use ClaudeProviderDefaults directly. 
* fix: address PR review — restore warnings, fix loop options, cleanup Critical fixes: - Restore MCP missing env vars user-facing warning (was silently dropped) - Restore Haiku + MCP tool search warning - Fix buildLoopNodeOptions to pass workflow-level nodeConfig (effort, thinking, betas, sandbox were silently lost for loop nodes) - Add TODO(#1135) comments documenting env-leak gate gap Cleanup: - Remove backward-compat type aliases from deps.ts (keep WorkflowTokenUsage) - Remove 26 unnecessary eslint-disable comments from test files - Trim internal helpers from providers barrel (withFirstMessageTimeout, getProcessUid, loadMcpConfig, buildSDKHooksFromYAML) - Add @archon/providers dep to CLI package.json - Fix 8 stale documentation paths pointing to deleted core/src/providers/ - Add E2E smoke test workflows for both Claude and Codex providers * fix: forward provider system warnings to users in dag-executor The dag-executor only forwarded system chunks starting with "MCP server connection failed:" — all other provider warnings (missing env vars, Haiku+MCP, structured output issues) were logged but never reached the user. Now forwards all system chunks starting with ⚠️ (the prefix providers use for user-actionable warnings). * fix: add providers package to Dockerfile and fix CI module resolution - Add packages/providers/ to all three Dockerfile stages (deps, production package.json copy, production source copy) - Replace wildcard export map (./) with explicit subpath entries to fix module resolution in CI (bun workspace linking) chore: update bun.lock for providers package exports	2026-04-13 09:21:36 +03:00
Rasmus Widing	eb75ab60e5	Merge pull request #1130 from coleam00/rules-cleanup docs: consolidate Claude guidance into CLAUDE.md	2026-04-12 20:31:49 +03:00
Rasmus Widing	39c6f05bad	docs: consolidate Claude guidance into CLAUDE.md	2026-04-12 20:21:16 +03:00
Rasmus Widing	a4242e6b49	Merge pull request #1116 from coleam00/rename-iassistantclient-to-iagentprovider refactor: rename IAssistantClient to IAgentProvider	2026-04-12 20:02:49 +03:00
Rasmus Widing	a7b3b94388	refactor: simplify provider rename follow-through - ProviderDefaults → CodexProviderDefaults (symmetric with ClaudeProviderDefaults) - Fix stale "AI client" comments in orchestrator-agent.ts and orchestrator.test.ts - Remove dead createMockAgentProvider in test/mocks/streaming.ts (zero importers, wrong method names) - Fix irregular whitespace in .claude/rules/workflows.md	2026-04-12 13:51:45 +03:00
Rasmus Widing	b9a70a5d17	refactor: complete provider rename in config types, logger domains, and docs - AssistantDefaults → ProviderDefaults, ClaudeAssistantDefaults → ClaudeProviderDefaults - Logger domains: client.claude → provider.claude, client.codex → provider.codex - Fix stale JSDoc, error messages, and references in architecture docs, CHANGELOG, testing rules	2026-04-12 13:47:05 +03:00
Rasmus Widing	91c184af57	refactor: rename IAssistantClient to IAgentProvider Rename the core AI provider interface and all related types, classes, factory functions, and directory from clients/ to providers/. Rename map: - IAssistantClient → IAgentProvider - ClaudeClient → ClaudeProvider - CodexClient → CodexProvider - getAssistantClient → getAgentProvider - AssistantRequestOptions → AgentRequestOptions - IWorkflowAssistantClient → IWorkflowAgentProvider - AssistantClientFactory → AgentProviderFactory - WorkflowAssistantOptions → WorkflowAgentOptions - packages/core/src/clients/ → packages/core/src/providers/ NOT renamed (user-facing/DB-stored): assistant config key, DEFAULT_AI_ASSISTANT env var, ai_assistant_type DB column. No behavioral changes — purely naming.	2026-04-12 13:11:21 +03:00
github-actions[bot]	c2089117fa	chore: update Homebrew formula for v0.3.6	2026-04-12 09:19:27 +00:00
Rasmus Widing	59cda08efa	Merge pull request #1114 from coleam00/dev Release 0.3.6	2026-04-12 12:17:34 +03:00
Rasmus Widing	883d1369f4	Release 0.3.6	2026-04-12 12:16:49 +03:00
Cole Medin	6da994815c	fix: strip CWD .env leak, remove subprocess allowlist, add first-event timeout (#1067 , #1030 , #1098 , #1070 ) * fix: strip CWD .env leak, enable platform adapters in serve, add first-event timeout (#1067) Three bugs fixed: (1) Bun auto-loads CWD .env files before user code, leaking non-overlapping keys into the Archon process — new stripCwdEnv() boot import removes them before any module reads env. (2) archon serve hardcoded skipPlatformAdapters:true, preventing Slack/Telegram/Discord from starting. (3) Claude SDK query had no first-event timeout, causing silent 30-min hangs when the subprocess wedges — new withFirstMessageTimeout wrapper races the first event against a configurable deadline (default 60s). Changes: - Add @archon/paths/strip-cwd-env and strip-cwd-env-boot modules - Import boot module as first import in CLI entry point - Remove skipPlatformAdapters: true from serve.ts - Add withFirstMessageTimeout + diagnostics to ClaudeClient - Add CLAUDECODE=1 nested-session warning to CLI - Add 9 unit tests (6 strip-cwd-env + 3 timeout) Fixes #1067 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address review findings for PR #1092 Fixed: - Clear setTimeout timer in withFirstMessageTimeout finally block (HIGH-1) - Add strip-cwd-env-boot to server/src/index.ts for direct dev:server path (MEDIUM-1) - Warn to stderr on non-ENOENT errors in stripCwdEnv (MEDIUM-2) - Update stale configuration.md docs for new env-loading mechanism (HIGH-2) - Add ARCHON_CLAUDE_FIRST_EVENT_TIMEOUT_MS and ARCHON_SUPPRESS_NESTED_CLAUDE_WARNING env vars to docs - Add nested Claude Code hang troubleshooting entry - Fix boot module JSDoc: "CLI and server" → "CLI" only - Fix stripCwdEnv JSDoc: remove stale "override: true" reference - Update .claude/rules/cli.md startup behavior section - Update CLAUDE.md @archon/paths description with new exports Tests added: - Assert controller.signal.aborted on timeout - Handle generator that completes 
immediately without yielding - Strip distinct keys from different .env files Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * simplify: replace string sentinel with typed error class in withFirstMessageTimeout Replace the '__timeout__' string sentinel used to identify timeout rejections with a dedicated FirstEventTimeoutError class. instanceof checks are more explicit and robust than string comparison on error messages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: address review findings — dotenv version, docs, server warning, marker strip, tests 1. Align dotenv to ^17 (was ^16, rest of monorepo uses ^17.2.3) 2. Remove incorrect SUBPROCESS_ENV_ALLOWLIST claim from docs — the SDK bypasses the env option and uses process.env directly (#1097) 3. Add CLAUDECODE=1 warning to server entry point (was only in CLI) 4. Add diagnostic payload content test for withFirstMessageTimeout 5. Integrate #1097's finding: strip CLAUDECODE + CLAUDE_CODE_* session markers (except auth vars) + NODE_OPTIONS + VSCODE_INSPECTOR_OPTIONS from process.env at entry point. Pattern-matched on CLAUDE_CODE_* prefix rather than hardcoding 6 names, so future Claude Code markers are handled automatically. Auth vars (CLAUDE_CODE_OAUTH_TOKEN, CLAUDE_CODE_USE_BEDROCK, CLAUDE_CODE_USE_VERTEX) are preserved. Root cause per #1097: the Claude Agent SDK leaks process.env into the spawned child regardless of the explicit env option, so the only way to prevent the nested-session deadlock is to delete the markers from process.env at the entry point. Validation: bun run validate passes, 125 paths tests (6 new marker tests), 60 claude tests (1 new diagnostic test), DATABASE_URL leak verified stripped (target repo .env DATABASE_URL does not affect Archon DB selection). 
* refactor: remove SUBPROCESS_ENV_ALLOWLIST — trust user config, strip only CWD The allowlist was wrong for a single-developer tool: - It blocked keys the user intentionally set in ~/.archon/.env (ANTHROPIC_API_KEY, AWS_, CLAUDE_CONFIG_DIR, MiniMax vars, etc.) - It was bypassed by the SDK anyway (process.env leaks to subprocess regardless of the env option — see #1097) - It attracted a constant stream of PRs adding keys (#1060, #1093, #1099) New model: CWD .env keys are the only untrusted source. stripCwdEnv() at entry point handles that. Everything in ~/.archon/.env + shell env passes through to the subprocess. No filtering, no second-guessing. Changes: - Delete env-allowlist.ts and env-allowlist.test.ts - Simplify buildSubprocessEnv() to return { ...process.env } with auth-mode logging (no token stripping — user controls their config) - Replace 4 allowlist-based tests with 1 pass-through test - Remove env-allowlist.test.ts from core test batch - Update security.md and cli.md docs to reflect the new model The CLAUDECODE + CLAUDE_CODE_ marker strip and NODE_OPTIONS strip remain in stripCwdEnv() at entry point — those are process-level safety (not per-subprocess filtering) and are needed regardless. * fix: restore override:true for archon env, add integration tests The integration tests caught a real issue: without override:true, the ~/.archon/.env load doesn't win over shell-inherited env vars. If the user's shell profile exports PORT=9999 and ~/.archon/.env has PORT=3000, the user expects Archon to use 3000. stripCwdEnv() handles CWD .env files (untrusted). override:true handles shell-inherited vars (trusted but less specific than ~/.archon/.env). Different concerns, both needed. Also adds 6 integration tests covering the full entry-point flow: 1. Global auth user with ANTHROPIC_API_KEY in CWD .env — stripped 2. OAuth token in archon env + random key in CWD — CWD stripped, archon kept 3. General leak test — nothing from CWD reaches subprocess 4. 
Same key in both CWD and archon — archon value wins 5. CLAUDECODE markers stripped even when not from CWD .env 6. CLAUDE_CODE_OAUTH_TOKEN survives marker strip * test: add DATABASE_URL leak scenarios to env integration tests * fix: move CLAUDECODE warning into stripCwdEnv, remove dead useGlobalAuth logic Review findings addressed: 1. CLAUDECODE warning was dead code — the boot import deleted CLAUDECODE from process.env before the warning check in cli.ts/server/index.ts could fire. Moved the warning into stripCwdEnv() itself, emitted BEFORE the deletion. Removed duplicate warning code from both entry points. 2. useGlobalAuth token stripping removed (intentional, not regression) — the old code stripped CLAUDE_CODE_OAUTH_TOKEN and CLAUDE_API_KEY when useGlobalAuth=true. Per design discussion: the user controls ~/.archon/.env and all keys they set are intentional. If they want global auth, they just don't set tokens. Simplified buildSubprocessEnv to log auth mode for diagnostics only, no filtering. 3. Docs "no override needed" corrected — cli.md and configuration.md now reflect the actual code (override: true). --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>	2026-04-12 12:11:16 +03:00
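The marker strip described in this commit (remove `CLAUDECODE`, every `CLAUDE_CODE_*` variable except the three auth variables, plus `NODE_OPTIONS` and `VSCODE_INSPECTOR_OPTIONS`) can be sketched as a pure function over an env object. The function name `stripSessionMarkers` and its return value are assumptions for illustration; the real logic lives inside `stripCwdEnv()` and may be organized differently.

```typescript
type MutableEnv = Record<string, string | undefined>;

// Auth variables the commit says must survive the CLAUDE_CODE_* prefix match.
const PRESERVED_AUTH_VARS = new Set([
  "CLAUDE_CODE_OAUTH_TOKEN",
  "CLAUDE_CODE_USE_BEDROCK",
  "CLAUDE_CODE_USE_VERTEX",
]);

// Variables removed regardless of prefix.
const ALWAYS_STRIPPED = new Set(["CLAUDECODE", "NODE_OPTIONS", "VSCODE_INSPECTOR_OPTIONS"]);

// Hypothetical helper: deletes session markers in place and returns the removed keys.
function stripSessionMarkers(env: MutableEnv): string[] {
  const removed: string[] = [];
  for (const key of Object.keys(env)) {
    const isPrefixedMarker = key.startsWith("CLAUDE_CODE_") && !PRESERVED_AUTH_VARS.has(key);
    if (ALWAYS_STRIPPED.has(key) || isPrefixedMarker) {
      delete env[key];
      removed.push(key);
    }
  }
  return removed;
}
```

Matching on the prefix rather than a fixed list of names is the design choice the commit calls out: markers added by future Claude Code versions are stripped without a code change, while the explicit preserve-list keeps authentication working.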
Cole Medin	b620c04e27	fix(web): add defensive optional chaining for workflow run data access Prevents "Cannot read properties of undefined (reading 'status')" crash when navigating between chat and workflow execution views during race conditions where run data may be transiently undefined. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 20:09:09 -05:00
Cole Medin	bf8bc8e4ae	fix: address review findings for workflow context injection - CRITICAL: fix metadata filter in getRecentWorkflowResultMessages to check for workflowResult key presence instead of category (which is never persisted to DB); feature was completely non-functional on every call - HIGH: guard JSON.parse(msg.metadata) with typeof check to handle PostgreSQL JSONB columns returned as objects (not strings) by node-postgres - MEDIUM: add structured warn log inside inner metadata parse catch block - LOW: use SELECT id, content, metadata instead of SELECT * in new DB query - LOW: update comments in messages.ts and prompt-builder.ts for accuracy - Tests: add formatWorkflowContextSection unit tests (pure function coverage) - Tests: add getRecentWorkflowResultMessages tests (dialect switch + contract) - Tests: add getDatabaseType mock to messages.test.ts connection mock - Tests: add ../db/messages mock and formatWorkflowContextSection to prompt-builder mock in orchestrator-agent.test.ts - Tests: add handleMessage workflow context injection behavioral tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 17:59:19 -05:00
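The two metadata fixes in this commit (filter on the presence of a `workflowResult` key, and only call `JSON.parse` when the column value is a string) can be illustrated with the sketch below. `parseMessageMetadata` and `hasWorkflowResult` are hypothetical names, and the sketch returns `null` on malformed input where the real code reportedly logs a structured warning.

```typescript
type MessageMetadata = Record<string, unknown>;

function isPlainObject(value: unknown): value is MessageMetadata {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// SQLite-style drivers return JSON columns as strings, while node-postgres
// returns JSONB columns as already-parsed objects, so both must be accepted.
function parseMessageMetadata(raw: unknown): MessageMetadata | null {
  if (typeof raw === "string") {
    try {
      const parsed: unknown = JSON.parse(raw);
      return isPlainObject(parsed) ? parsed : null;
    } catch {
      return null; // malformed JSON: treat as "no metadata"
    }
  }
  return isPlainObject(raw) ? raw : null;
}

// Filter on key presence rather than on a field that is never persisted.
function hasWorkflowResult(raw: unknown): boolean {
  const metadata = parseMessageMetadata(raw);
  return metadata !== null && "workflowResult" in metadata;
}
```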
Cole Medin	4292c3a24b	simplify: replace nested ternary with if/else for headerTitle in WorkflowResultCard Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 17:49:55 -05:00
Cole Medin	e4555a769b	simplify: reduce complexity in changed files - Parallelize checksums + tarball fetch in serve.ts (removes waterfall latency) - Remove redundant existsSync before readFileSync in update-check.ts (catch already handles ENOENT) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 17:47:53 -05:00
Cole Medin	dbe559efd1	fix(web): address review findings — logging and test extraction - Add console.error logging to silent .catch on SSE reconnect re-fetch (ChatInterface.tsx:~544) so production failures are visible in logs - Extract onText setMessages reducer to chat-message-reducer.ts as a pure function (applyOnText) with 14 unit tests covering all 6 segmentation rules including the new tool-call boundary (issue #1054) - Refactor ChatInterface.onText to delegate to applyOnText Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 17:45:08 -05:00
Cole Medin	3e3ddf25d5	feat: inject workflow run context into orchestrator prompt (#1055) After a workflow completes, the AI had no awareness of results when answering follow-up questions. This adds a "Recent Workflow Results" section to the orchestrator prompt by querying persisted workflow_result messages from the conversation. Changes: - Add getRecentWorkflowResultMessages() to db/messages.ts - Add WorkflowResultContext type and formatWorkflowContextSection() to prompt-builder.ts - Extend buildFullPrompt() with optional workflowContext parameter - Fetch and inject workflow context in handleMessage() before prompt building Fixes #1055 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 17:34:17 -05:00
Cole Medin	4ee5232da3	fix(web): interleave tool calls with text during SSE streaming (#1054) During SSE streaming, tool calls always appeared below all text because onText appended to the existing message even when it already had tool calls. The server-side persistence already segments at this boundary. Mirror that rule in the client's onText handler: when the last streaming message has tool calls, seal it and start a new message for incoming text. Fixes #1054 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 17:31:38 -05:00
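The segmentation rule in this fix ("when the last streaming message has tool calls, seal it and start a new message") can be written as a small pure reducer. The sketch below is illustrative only: the `StreamedMessage` shape and the name `appendStreamedText` are assumptions, and the real reducer reportedly handles more segmentation rules than this one.

```typescript
// Hypothetical message shape for illustration.
interface StreamedMessage {
  id: string;
  text: string;
  toolCalls: string[];
  isStreaming: boolean;
}

function newStreamingMessage(id: string, text: string): StreamedMessage {
  return { id, text, toolCalls: [], isStreaming: true };
}

// Returns a new message list with `chunk` applied; never mutates the input.
function appendStreamedText(
  messages: readonly StreamedMessage[],
  chunk: string,
  nextId: string
): StreamedMessage[] {
  const last = messages[messages.length - 1];
  // No open streaming message: start one.
  if (last === undefined || !last.isStreaming) {
    return [...messages, newStreamingMessage(nextId, chunk)];
  }
  // Tool-call boundary: seal the current message and start a new one,
  // so later text renders below the tool calls instead of above them.
  if (last.toolCalls.length > 0) {
    return [
      ...messages.slice(0, -1),
      { ...last, isStreaming: false },
      newStreamingMessage(nextId, chunk),
    ];
  }
  // Plain text continues the current message.
  return [...messages.slice(0, -1), { ...last, text: last.text + chunk }];
}
```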
Cole Medin	16b47d3dde	fix: archon setup --spawn fails on Windows when repo path contains spaces (#1035) The cmd.exe fallback in spawnWindowsTerminal() used shell: true, which caused Bun/Node to flatten args into a single string without proper quoting. Paths with spaces were split at whitespace, breaking the /D argument to start. Changes: - Remove shell: true from cmd.exe fallback spawn options - Remove shell?: boolean from trySpawn options type (no callers need it) Fixes #1035 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 17:29:25 -05:00
Cole Medin	536584db8f	Merge pull request #1026 from coleam00/archon/task-fix-issue-1014 feat(web): loop node iteration visibility in workflow execution view	2026-04-10 16:14:12 -05:00
Cole Medin	60ddda3a12	revert: remove incorrect remainingMessage suppression in stream mode The suppression broke the "sends remaining message before dispatching workflow" test — when the AI response contains both text and a command in a single chunk, the text was never streamed, so suppressing remainingMessage loses it entirely. The actual duplicate was in the WorkflowLogs execution view, not the routing AI path, and is already fixed by the onText message splitting and text content dedup. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 16:13:45 -05:00
Cole Medin	1eddf3e6aa	fix(web): split workflow status messages in WorkflowLogs onText handler WorkflowLogs' onText handler was blindly concatenating all SSE text into a single streaming message, unlike ChatInterface which splits on workflow status text (🚀/✅). This caused the "Starting workflow" text to merge with subsequent text into one giant message, breaking text dedup against DB messages (which are stored as separate segments). The SSE message content never matched any single DB message exactly, so both appeared. Add the same workflow status boundary detection from ChatInterface: close the current streaming message and start a new one when a workflow status message arrives or when regular text follows a status message. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 15:51:40 -05:00
Cole Medin	4e56c86dff	fix: eliminate duplicate text and tool calls in workflow execution view Three fixes for message duplication during live workflow execution: 1. dag-executor: Add missing `tool_call_formatted` category to loop iteration tool messages. Without this, the web adapter sent tool text as both a regular SSE text event AND a structured tool_call event, causing each tool to appear twice (raw text + rendered card). Regular DAG nodes already had this metadata. 2. WorkflowLogs: Add text content dedup in SSE/DB merge. During live execution, the same text (e.g. "Starting workflow...") can appear in both DB (REST fetch) and SSE (event buffer replay). Collects DB text into a Set and skips matching SSE text messages. 3. orchestrator-agent: Suppress remainingMessage re-send in stream mode. The routing AI streams text chunks before /invoke-workflow is detected, then retracts them. Without suppression, remainingMessage re-sends the same text. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 15:48:40 -05:00
Cole Medin	5685b41d18	fix(cli): add cli. domain prefix to log event names Apply review finding: rename flat log event names to use the cli.{action}_{state} convention matching the rest of the file. - workflow_dispatch_surface_failed → cli.workflow_dispatch_surface_failed - workflow_output_surface_failed → cli.workflow_result_surface_failed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 10:27:12 -05:00
Cole Medin	b8e367f35d	simplify: reduce complexity in changed files Deduplicate JSON branch in workflowStatusCommand by computing the output array once with a single console.log call, removing the duplicated verbose/non-verbose conditional branches. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 10:16:27 -05:00
Cole Medin	e8334b313a	Merge branch 'archon/task-fix-issue-1015' into dev Resolve merge conflict in MessageList.tsx by combining: - PR #1025: status/duration/nodes/artifacts enrichment for WorkflowResultCard - PR #1023: ArtifactViewerModal clickable file paths in result card content Both features now work together — the result card shows status-aware headers, node counts, duration, and artifact summaries while also supporting clickable artifact file paths in the markdown content. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 10:16:06 -05:00
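
Commit 4e56c86dff above describes a text content dedup in the SSE/DB merge: DB text is collected into a Set and matching SSE text messages are skipped. The following is a minimal sketch of that idea only, not the repository's actual code. The `ChatMessage` shape and the `mergeDbAndSse` name are hypothetical and chosen for illustration.

```typescript
// Hypothetical message shape; the real WorkflowLogs types are not shown in the log.
interface ChatMessage {
  id: string;
  kind: "text" | "tool_call";
  content: string;
}

// Merge messages fetched from the DB (REST) with messages replayed over SSE.
// SSE text messages whose content exactly matches a DB text message are dropped,
// so the same text (e.g. "Starting workflow...") is not rendered twice.
function mergeDbAndSse(
  dbMessages: ChatMessage[],
  sseMessages: ChatMessage[],
): ChatMessage[] {
  // Collect the content of every DB text message for O(1) lookups.
  const dbText = new Set<string>(
    dbMessages.filter((m) => m.kind === "text").map((m) => m.content),
  );

  // Keep non-text SSE messages as-is; keep SSE text only if the DB lacks it.
  const liveOnly = sseMessages.filter(
    (m) => m.kind !== "text" || !dbText.has(m.content),
  );

  return [...dbMessages, ...liveOnly];
}
```

Exact-match dedup of this kind only works when the SSE text is segmented the same way as the stored DB messages, which is why commit 1eddf3e6aa splits the streaming message at workflow status boundaries.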

1284 commits