Commit graph

674 commits

Author SHA1 Message Date
Alex Siri
7ea321419f
fix: initialize options.hooks before merging YAML node hooks (#1177)
When a workflow node defines hooks (PreToolUse/PostToolUse) in YAML but
no hooks exist yet on the options object, applyNodeConfig crashes with
"undefined is not an object" because it tries to assign properties on
the undefined options.hooks.

Initialize options.hooks to {} before the merge loop.

Reproduces with: archon workflow run archon-architect (which uses
per-node hooks extensively).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-21 14:52:56 +03:00
Rasmus Widing
ba4b9b47e6
docs(worktree): fix stale rename example + document copyFiles properly (#1328)
Three related fixes around the `worktree.copyFiles` primitive:

1. Remove the `.env.example -> .env` rename example from
   reference/configuration.md and getting-started/overview.md. The
   `->` parser was removed in #739 (2026-03-19) because it caused
   the stale-credentials production bug in #228 — but the docs kept
   advertising it. A user writing `.env.example -> .env` today gets
   `parseCopyFileEntry` returning `{source: '.env.example -> .env',
   destination: '.env.example -> .env'}`, stat() fails with ENOENT,
   and the copy silently no-ops at debug level.

2. Replace the single-line "Default behavior: .archon/ is always
   copied" note with a proper "Worktree file copying" subsection
   that explains:
   - Why this exists (git worktree add = tracked files only; gitignored
     workflow inputs need this hook)
   - The `.archon/` default (no config needed for the common case)
   - Common entries: .env, .vscode/, .claude/, plans/, reports/,
     data fixtures
   - Semantics: source=destination, ENOENT silently skipped, per-entry
     error isolation, path-traversal rejected
   - Interaction with `worktree.path` (both layouts get the same
     treatment)

3. Update the overview example to drop the `.env.example + .env` pair
   (which implied rename semantics) in favor of `.env + plans/`, and
   call out that `.archon/` is auto-copied so users don't list it.
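The parse behavior point 1 describes can be sketched as a stand-in — `parseCopyFileEntry` is named in the message, but this body is a guess at the post-#739 semantics, not the shipped code:

```typescript
interface CopyFileEntry {
  source: string;
  destination: string;
}

// No `->` rename syntax anymore: the raw string is both sides, so a
// legacy ".env.example -> .env" entry stats a file that doesn't exist
// and the copy silently no-ops (ENOENT, logged at debug level).
function parseCopyFileEntry(entry: string): CopyFileEntry {
  const trimmed = entry.trim();
  return { source: trimmed, destination: trimmed };
}
```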

No code changes. `bun run format:check` and `bun run lint` green.
2026-04-21 12:15:37 +03:00
Lior Franko
08de8ee5c6
fix(web,server): show real platform connection status in Settings (#1061)
The Settings page's Platform Connections section hardcoded every platform
except Web to 'Not configured', so users couldn't tell whether their Slack/
Telegram/Discord/GitHub/Gitea/GitLab adapters had actually started.

- Server: /api/health now returns an activePlatforms array populated live
  as each adapter's start() resolves. Passed into registerApiRoutes so the
  reference stays mutable — Telegram starts after the HTTP listener is
  already accepting requests, so a snapshot would miss it.
- Web: SettingsPage.PlatformConnectionsSection now reads activePlatforms
  from /api/health and looks each platform up in a Set. Also adds Gitea
  and GitLab to the list (they already ship as adapters).

Closes #1031

Co-authored-by: Lior Franko <liorfr@dreamgroup.com>
2026-04-21 11:47:32 +03:00
Rasmus Widing
5ed38dc765
feat(isolation,workflows): worktree location + per-workflow isolation policy (#1310)
* feat(isolation): per-project worktree.path + collapse to two layouts

Adds an opt-in `worktree.path` to .archon/config.yaml so a repo can co-locate
worktrees with its own checkout (`<repoRoot>/<path>/<branch>`) instead of the
default `~/.archon/workspaces/<owner>/<repo>/worktrees/<branch>`. Requested in
joelsb's #1117.

Primitive changes (clean up the graveyard rather than add parallel code paths):

- Collapse worktree layouts from three to two. The old "legacy global" layout
  (`~/.archon/worktrees/<owner>/<repo>/<branch>`) is gone — every repo resolves
  to the workspace-scoped layout (`~/.archon/workspaces/<owner>/<repo>/worktrees/<branch>`),
  whether it was archon-cloned or locally registered. `extractOwnerRepo()` on
  the repo path is the stable identity fallback. Ends the divergence where
  workspace-cloned and local repos had visibly different worktree trees.

- `getWorktreeBase()` in @archon/git now returns `{ base, layout }` and accepts
  an optional `{ repoLocal }` override. The layout value replaces the old
  `isProjectScopedWorktreeBase()` classification at the call sites
  (`isProjectScopedWorktreeBase` stays exported as deprecated back-compat).

- `WorktreeCreateConfig.path` carries the validated override from repo config.
  `resolveRepoLocalOverride()` fails loudly on absolute paths, `..` escapes,
  and resolve-escape edge cases (Fail Fast — no silent default fallback when
  the config is syntactically wrong).

- `WorktreeProvider.create()` now loads repo config exactly once and threads it
  through `getWorktreePath()` + `createWorktree()`. Replaces the prior
  swallow-then-retry pattern flagged on #1117. `generateEnvId()` is gone —
  envId is assigned directly from the resolved path (the invariant was already
  documented on `destroy(envId)`).
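The two-layout return shape can be sketched like this. The real helper lives in `@archon/git`; the layout names and path segments below are assumptions drawn from the message:

```typescript
import * as path from "node:path";
import * as os from "node:os";

type WorktreeLayout = "workspace" | "repo-local";

function getWorktreeBase(
  owner: string,
  repo: string,
  opts?: { repoLocal?: string },
): { base: string; layout: WorktreeLayout } {
  if (opts?.repoLocal) {
    // Per-repo override from .archon/config.yaml `worktree.path`.
    return { base: opts.repoLocal, layout: "repo-local" };
  }
  // Default: workspace-scoped layout for every repo, archon-cloned or
  // locally registered.
  return {
    base: path.join(os.homedir(), ".archon", "workspaces", owner, repo, "worktrees"),
    layout: "workspace",
  };
}
```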

Tests (packages/git + packages/isolation):
- Update the pre-existing `getWorktreeBase` / `isProjectScopedWorktreeBase`
  suite for the new two-layout return shape and precedence.
- Add 8 tests for `worktree.path`: default fallthrough, empty/whitespace
  ignored, override wins for workspace-scoped repos, rejects absolute, rejects
  `../` escapes (three variants), accepts nested relative paths.

Docs: add `worktree.path` to the repo config reference with explicit precedence
and the `.gitignore` responsibility note.

Co-authored-by: Joel Bastos <joelsb2001@gmail.com>

* feat(workflows): per-workflow worktree.enabled policy

Introduces a declarative top-level `worktree:` block on a workflow so
authors can pin isolation behavior regardless of invocation surface. Solves
the case where read-only workflows (e.g. `repo-triage`) should always run in
the live checkout, without every CLI/web/scheduled-trigger caller having to
remember to set the right flag.

Schema (packages/workflows/src/schemas/workflow.ts + loader.ts):

- New optional `worktree.enabled: boolean` on `workflowBaseSchema`. Loader
  parses with the same warn-and-ignore discipline used for `interactive`
  and `modelReasoningEffort` — invalid shapes log and drop rather than
  killing workflow discovery.

Policy reconciliation (packages/cli/src/commands/workflow.ts):

- Three hard-error cases when YAML policy contradicts invocation flags:
  • `enabled: false` + `--branch`       (worktree required by flag, forbidden by policy)
  • `enabled: false` + `--from`         (start-point only meaningful with worktree)
  • `enabled: true`  + `--no-worktree`  (policy requires worktree, flag forbids it)
- `enabled: false` + `--no-worktree` is redundant, accepted silently.
- `--resume` ignores the pinned policy (it reuses the existing run's worktree
  even when policy would disable — avoids disturbing a paused run).
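The reconciliation matrix above can be sketched as one function. Flag and field names mirror the message; the function itself is illustrative, not the CLI code:

```typescript
interface Invocation {
  branch?: string;
  from?: string;
  noWorktree?: boolean;
  resume?: boolean;
}

function reconcileWorktreePolicy(enabled: boolean | undefined, inv: Invocation): void {
  // --resume reuses the existing run's worktree; pinned policy is ignored.
  if (inv.resume) return;
  if (enabled === false && inv.branch) {
    throw new Error("--branch requires a worktree, but policy disables it");
  }
  if (enabled === false && inv.from) {
    throw new Error("--from is only meaningful with a worktree");
  }
  if (enabled === true && inv.noWorktree) {
    throw new Error("policy requires a worktree, --no-worktree forbids it");
  }
  // enabled: false + --no-worktree is redundant: accepted silently.
}
```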

Orchestrator wiring (packages/core/src/orchestrator/orchestrator-agent.ts):

- `dispatchOrchestratorWorkflow` short-circuits `validateAndResolveIsolation`
  when `workflow.worktree?.enabled === false` and runs directly in
  `codebase.default_cwd`. Web chat/slack/telegram callers have no flag
  equivalent to `--no-worktree`, so the YAML field is their only control.
- Logged as `workflow.worktree_disabled_by_policy` for operator visibility.

First consumer (.archon/workflows/repo-triage.yaml):

- `worktree: { enabled: false }` — triage reads issues/PRs and writes gh
  labels; no code mutations, no reason to spin up a worktree per run.

Tests:

- Loader: parses `worktree.enabled: true|false`, omits block when absent.
- CLI: four new integration tests for the reconciliation matrix (skip when
  policy false, three hard-error cases, redundant `--no-worktree` accepted,
  `--no-worktree` + `enabled: true` rejected).

Docs: authoring-workflows.md gets the new top-level field in the schema
example with a comment explaining the precedence and the `enabled: true|false`
semantics.

* fix(isolation): use path.sep for repo-containment check on Windows

resolveRepoLocalOverride was hardcoding '/' as the separator in the
startsWith check, so on Windows (where `resolve()` returns backslash
paths like `D:\Users\dev\Projects\myapp`) every otherwise-valid
relative `worktree.path` was rejected with "resolves outside the repo
root". Fixed by importing `path.sep` and using it in the sentinel.

Fixes the 3 Windows CI failures in `worktree.path repo-local override`.
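The gist of the fix, sketched with a hypothetical helper name — build the containment sentinel with `path.sep` so backslash paths from `resolve()` on Windows still match:

```typescript
import * as path from "node:path";

function isInsideRepo(repoRoot: string, resolved: string): boolean {
  // Hardcoding '/' here broke on Windows, where resolve() returns
  // backslash paths like D:\Users\dev\Projects\myapp. The path.sep
  // suffix also rejects sibling-prefix tricks (/repo vs /repository).
  return resolved === repoRoot || resolved.startsWith(repoRoot + path.sep);
}
```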

---------

Co-authored-by: Joel Bastos <joelsb2001@gmail.com>
2026-04-20 21:54:10 +03:00
Rasmus Widing
7be4d0a35e
feat(paths,workflows): unify ~/.archon/{workflows,commands,scripts} + drop globalSearchPath (closes #1136) (#1315)
* feat(paths,workflows): unify ~/.archon/{workflows,commands,scripts} + drop globalSearchPath

Collapses the awkward `~/.archon/.archon/workflows/` convention to a direct
`~/.archon/workflows/` child (matching `workspaces/`, `archon.db`, etc.), adds
home-scoped commands and scripts with the same loading story, and kills the
opt-in `globalSearchPath` parameter so every call site gets home-scope for free.

Closes #1136 (supersedes @jonasvanderhaegen's tactical fix — the bug was the
primitive itself: an easy-to-forget parameter that five of six call sites on
dev dropped).

Primitive changes:

- Home paths are direct children of `~/.archon/`. New helpers in `@archon/paths`:
  `getHomeWorkflowsPath()`, `getHomeCommandsPath()`, `getHomeScriptsPath()`,
  and `getLegacyHomeWorkflowsPath()` (detection-only for migration).
- `discoverWorkflowsWithConfig(cwd, loadConfig)` reads home-scope internally.
  The old `{ globalSearchPath }` option is removed. Chat command handler, Web
  UI workflow picker, orchestrator resolve path — all inherit home-scope for
  free without maintainer patches at every new site.
- `discoverScriptsForCwd(cwd)` merges home + repo scripts (repo wins on name
  collision). dag-executor and validator use it; the hardcoded
  `resolve(cwd, '.archon', 'scripts')` single-scope path is gone.
- Command resolution is now walked-by-basename in each scope. `loadCommand`
  and `resolveCommand` walk 1 subfolder deep and match by `.md` basename, so
  `.archon/commands/triage/review.md` resolves as `review` — closes the
  latent bug where subfolder commands were listed but unresolvable.
- All three (`workflows/`, `commands/`, `scripts/`) enforce a 1-level
  subfolder cap (matches the existing `defaults/` convention). Deeper
  nesting is silently skipped.
- `WorkflowSource` gains `'global'` alongside `'bundled'` and `'project'`.
  Web UI node palette shows a dedicated "Global (~/.archon/commands/)"
  section; badges updated.
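The shared 1-level subfolder cap can be sketched as a depth-limited scan. Helper name and structure are assumptions; only the cap semantics (depth 1, deeper nesting silently skipped, missing dir tolerated) come from the message:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

const MAX_DISCOVERY_DEPTH = 1; // matches the existing defaults/ convention

function listFiles(dir: string, ext: string, depth = 0): string[] {
  let entries: fs.Dirent[];
  try {
    entries = fs.readdirSync(dir, { withFileTypes: true });
  } catch {
    return []; // a missing scope dir (e.g. no ~/.archon/scripts/) is fine
  }
  const out: string[] = [];
  for (const e of entries) {
    const full = path.join(dir, e.name);
    if (e.isDirectory()) {
      // Deeper nesting is silently skipped.
      if (depth < MAX_DISCOVERY_DEPTH) out.push(...listFiles(full, ext, depth + 1));
    } else if (e.name.endsWith(ext)) {
      out.push(full);
    }
  }
  return out;
}
```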

Migration (clean cut — no fallback read):

- First use after upgrade: if `~/.archon/.archon/workflows/` exists, Archon
  logs a one-time WARN per process with the exact `mv` command:
  `mv ~/.archon/.archon/workflows ~/.archon/workflows && rmdir ~/.archon/.archon`
  The legacy path is NOT read — users migrate manually. Rollback caveat
  noted in CHANGELOG.

Tests:

- `@archon/paths/archon-paths.test.ts`: new helper tests (default HOME,
  ARCHON_HOME override, Docker), plus regression guards for the double-`.archon/`
  path.
- `@archon/workflows/loader.test.ts`: home-scoped workflows, precedence,
  subfolder 1-depth cap, legacy-path deprecation warning fires exactly once
  per process.
- `@archon/workflows/validator.test.ts`: home-scoped commands + subfolder
  resolution.
- `@archon/workflows/script-discovery.test.ts`: depth cap + merge semantics
  (repo wins, home-missing tolerance).
- Existing CLI + orchestrator tests updated to drop `globalSearchPath`
  assertions.

E2E smoke (verified locally, before cleanup):

- `.archon/workflows/e2e-home-scope.yaml` + scratch repo at /tmp
- Home-scoped workflow discovered from an unrelated git repo
- Home-scoped script (`~/.archon/scripts/*.ts`) executes inside a script node
- 1-level subfolder workflow (`~/.archon/workflows/triage/*.yaml`) listed
- Legacy path warning fires with actionable `mv` command; workflows there
  are NOT loaded

Docs: `CLAUDE.md`, `docs-web/guides/global-workflows.md` (full rewrite for
three-type scope + subfolder convention + migration), `docs-web/reference/
configuration.md` (directory tree), `docs-web/reference/cli.md`,
`docs-web/guides/authoring-workflows.md`.

Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>

* test(script-discovery): normalize path separators in mocks for Windows

The 4 new tests in `scanScriptDir depth cap` and `discoverScriptsForCwd —
merge repo + home with repo winning` compared incoming mock paths with
hardcoded forward-slash strings (`if (path === '/scripts/triage')`). On
Windows, `path.join('/scripts', 'triage')` produces `\scripts\triage`, so
those branches never matched, readdir returned `[]`, and the tests failed.

Added a `norm()` helper at module scope and wrapped the incoming `path`
argument in every `mockImplementation` before comparing. Stored paths go
through `normalizeSep()` in production code, so the existing equality
assertions on `script.path` remain OS-independent.

Fixes Windows CI job `test (windows-latest)` on PR #1315.
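The `norm()` helper described above amounts to collapsing the OS separator before comparing against hardcoded forward-slash mock paths (sketch, not the exact test code):

```typescript
import * as path from "node:path";

// On Windows path.join('/scripts', 'triage') yields '\scripts\triage';
// normalize to '/' so mock-path equality checks are OS-independent.
const norm = (p: string): string => p.split(path.sep).join("/");
```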

* address review feedback: home-scope error handling, depth cap, and tests

Critical fixes:
- api.ts: add `maxDepth: 1` to all 3 findMarkdownFilesRecursive calls in
  GET /api/commands (bundled/home/project). Without this the UI palette
  surfaced commands from deep subfolders that the executor (capped at 1)
  could not resolve — silent "command not found" at runtime.
- validator.ts: wrap home-scope findMarkdownFilesRecursive and
  resolveCommandInDir calls in try/catch so EACCES/EPERM on
  ~/.archon/commands/ doesn't crash the validator with a raw filesystem
  error. ENOENT still returns [] via the underlying helper.

Error handling fixes:
- workflow-discovery.ts: maybeWarnLegacyHomePath now sets the
  "warned-once" flag eagerly before `await access()`, so concurrent
  discovery calls (server startup with parallel codebase resolution)
  can't double-warn. Non-ENOENT probe errors (EACCES/EPERM) now log at
  WARN instead of DEBUG so permission issues on the legacy dir are
  visible in default operation.
- dag-executor.ts: wrap discoverScriptsForCwd in its own try/catch so
  an EACCES on ~/.archon/scripts/ routes through safeSendMessage /
  logNodeError with a dedicated "failed to discover scripts" message
  instead of being mis-attributed by the outer catch's
  "permission denied (check cwd permissions)" branch.

Tests:
- load-command-prompt.test.ts (new): 6 tests covering the executor's
  command resolution hot path — home-scope resolves when repo misses,
  repo shadows home, 1-level subfolder resolvable by basename, 2-level
  rejected, not-found, empty-file. Runs in its own bun test batch.
- archon-paths.test.ts: add getHomeScriptsPath describe block to match
  the existing getHomeCommandsPath / getHomeWorkflowsPath coverage.

Comment clarity:
- workflow-discovery.ts: MAX_DISCOVERY_DEPTH comment now leads with the
  actual value (1) before describing what 0 would mean.
- script-discovery.ts: copy the "routing ambiguity" rationale from
  MAX_DISCOVERY_DEPTH to MAX_SCRIPT_DISCOVERY_DEPTH.

Cleanup:
- Remove .archon/workflows/e2e-home-scope.yaml — one-off smoke test that
  would ship permanently in every project's workflow list. Equivalent
  coverage exists in loader.test.ts.

Addresses all blocking and important feedback from the multi-agent
review on PR #1315.

---------

Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>
2026-04-20 21:45:32 +03:00
Rasmus Widing
cc78071ff6
fix(isolation): raise worktree git-operation timeout to 5m (#1306)
All 15 worktree git-subprocess timeouts in WorktreeProvider were hardcoded
at 30000ms. Repos with heavy post-checkout hooks (lint, dependency install,
submodule init) routinely exceed that budget and fail worktree creation.

Consolidate them onto a single GIT_OPERATION_TIMEOUT_MS constant at 5 min.
Generous enough to cover reported cases while still catching genuine hangs
(credential prompts in non-TTY, stalled fetches).

Chosen over the config-key approach in #1029 to avoid adding permanent
.archon/config.yaml surface for a problem a raised default solves cleanly.
If 5 min turns out to also be too tight for real-world use, we'll revisit.
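The consolidation can be sketched as one constant threaded through a subprocess wrapper — the wrapper below is illustrative, not the WorktreeProvider code:

```typescript
import { execFile } from "node:child_process";

// One constant replaces 15 hardcoded 30000ms literals.
const GIT_OPERATION_TIMEOUT_MS = 5 * 60 * 1000;

function runGit(args: string[], cwd: string): Promise<string> {
  return new Promise((resolve, reject) => {
    // timeout kills the subprocess if a hook or fetch genuinely hangs.
    execFile("git", args, { cwd, timeout: GIT_OPERATION_TIMEOUT_MS }, (err, stdout) =>
      err ? reject(err) : resolve(stdout),
    );
  });
}
```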

Closes #1119
Supersedes #1029

Co-authored-by: Shay Elmualem <12733941+norbinsh@users.noreply.github.com>
2026-04-20 21:45:24 +03:00
Kagura
39a05b762f
fix(db): throw on corrupt commands JSON instead of silent empty fallback (#1033)
* fix(db): throw on corrupt commands JSON instead of silent empty fallback (#967)

getCodebaseCommands() silently returned {} when the commands column
contained corrupt JSON. Callers had no way to distinguish 'no commands'
from 'unreadable data', violating fail-fast principles.

Now throws a descriptive error with the codebase ID and a recovery hint.
The error is still logged for observability before throwing.

Adds two test cases: corrupt JSON throws, valid JSON string parses.
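The fail-fast shape can be sketched like this; the function name and error text are illustrative, not the shipped db layer:

```typescript
function parseCodebaseCommands(
  codebaseId: string,
  raw: string | null,
): Record<string, string> {
  if (raw == null || raw === "") return {}; // genuinely no commands
  try {
    return JSON.parse(raw);
  } catch (err) {
    // Still logged for observability, then thrown so callers can
    // distinguish "no commands" from "unreadable data".
    console.error(`corrupt commands JSON for codebase ${codebaseId}:`, err);
    throw new Error(
      `Corrupt commands JSON for codebase ${codebaseId}; ` +
        `inspect or clear the commands column to recover`,
    );
  }
}
```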

* fix: include parse error in log for better diagnostics
2026-04-20 16:19:50 +03:00
Cole Medin
cb44b96f7b
feat(providers/pi): interactive flag binds UIContext for extensions (#1299)
* feat(providers/pi): interactive flag binds UIContext for extensions

Adds `interactive: true` opt-in to Pi provider (in `.archon/config.yaml`
under `assistants.pi`) that binds a minimal `ExtensionUIContext` stub to
each session. Without this, Pi's `ExtensionRunner.hasUI()` reports false,
causing extensions like `@plannotator/pi-extension` to silently auto-approve
every plan instead of opening their browser review UI.

Semantics: clamped to `enableExtensions: true` — no extensions loaded
means nothing would consume `hasUI`, so `interactive` alone is silently
dropped. Stub forwards `notify()` to Archon's event stream; interactive
dialogs (select/confirm/input/editor/custom) resolve to undefined/false;
TUI-only setters (widgets/headers/footers/themes) no-op. Theme access
throws with a clear diagnostic — Pi's theme singleton is coupled to its
own `Symbol.for()` registry which Archon doesn't own.

Trust boundary: only binds when the operator has explicitly enabled
both flags. Extensions gated on `ctx.hasUI` (plannotator and similar)
get a functional UI context; extensions that reach for TUI features
still fail loudly rather than rendering garbage.

Includes smoke-test workflow documenting the integration surface.
End-to-end plannotator UI rendering requires plan-mode activation
(Pi `--plan` CLI flag or `/plannotator` TUI slash command) which is
out of reach for programmatic Archon sessions — manual test only.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(providers/pi): end-to-end interactive extension UI

Three fixes that together get plannotator's browser review UI to actually
render from an Archon workflow and reach the reviewer's browser.

1. Call resourceLoader.reload() when enableExtensions is true.
   createAgentSession's internal reload is gated on `!resourceLoader`, so
   caller-supplied loaders must reload themselves. Without this,
   getExtensions() returns the empty default, no ExtensionRunner is built,
   and session.extensionRunner.setFlagValue() silently no-ops.

2. Set PLANNOTATOR_REMOTE=1 in interactive mode.
   plannotator-browser.ts only calls ctx.ui.notify(url) when openBrowser()
   returns { isRemote: true }; otherwise it spawns xdg-open/start on the
   Archon server host — invisible to the user and untestable from bash
   asserts. From the workflow runner's POV every Archon execution IS
   remote; flipping the heuristic routes the URL through notify(), which
   the ExtensionUIContext stub forwards into the event stream. Respect
   explicit operator overrides.

3. notify() emits as assistant chunks, not system chunks.
   The DAG executor's system-chunk filter only forwards warnings/MCP
   prefixes, and only assistant chunks accumulate into $nodeId.output.
   Emitting as assistant makes the URL available both in the user's
   stream and in downstream bash/script nodes via output substitution.

Plus: extensionFlags config pass-through (equivalent to `pi --plan` on the
CLI) applied via ExtensionRunner.setFlagValue() BEFORE bindExtensions
fires session_start, so extensions reading flags in their startup handler
actually see them. Also bind extensions with an empty binding when
enableExtensions is on but interactive is off, so session_start still
fires for flag-driven but UI-less extensions.

Smoke test (.archon/workflows/e2e-plannotator-smoke.yaml) uses
openai-codex/gpt-5.4-mini (ChatGPT Plus OAuth compatible) and bumps
idle_timeout to 600000ms so plannotator's server survives while a human
approves in the browser.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(providers/pi): keep Archon extension-agnostic

Remove the plannotator-specific PLANNOTATOR_REMOTE=1 env var write from
the Pi provider. Archon's provider layer shouldn't know about any
specific extension's internals. Document the env var in the plannotator
smoke test instead — operators who use plannotator set it via their shell
or per-codebase env config.

Workflow smoke test updated with:
- Instructions for setting PLANNOTATOR_REMOTE=1 externally
- Simpler assertion (URL emission only) — validated in a real
  reject-revise-approve run: reviewer annotated, clicked Send Feedback,
  Pi received the feedback as a tool result, revised the plan (added
  aria-label and WCAG contrast per the annotation), resubmitted, and
  reviewer approved. Plannotator's tool result signals approval but
  doesn't return the plan text, so the bash assertion now only checks
   that the review URL reached the stream (not that plan content flowed
   into $nodeId.output — it can't).
- Known-limitation note documenting the tool-result shape so downstream
  workflow authors know to Write the plan separately if they need it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(providers/pi): keep e2e-plannotator-smoke workflow local-only

The smoke test is plannotator-specific (calls plannotator_submit_plan,
expects PLAN.md on disk, requires PLANNOTATOR_REMOTE=1) and is better
kept out of the PR while the extension-agnostic infra lands.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style(providers/pi): trim verbose inline comments

Collapse multi-paragraph SDK explanations to 1-2 line "why" notes across
provider.ts, types.ts, ui-context-stub.ts, and event-bridge.ts. No
behavior change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(providers/pi): wire assistants.pi.env + theme-proxy identity

Two end-to-end fixes discovered while exercising the combined
plannotator + @pi-agents/loop smoke flow:

- PiProviderDefaults gains an optional `env` map; parsePiConfig picks
  it up and the provider applies it to process.env at session start
  (shell env wins, no override). Needed so extensions like plannotator
  can read PLANNOTATOR_REMOTE=1 from config.yaml without requiring a
  shell export before `archon workflow run`.

- ui-context-stub theme proxy returns identity decorators instead of
  throwing on unknown methods. Styled strings flow into no-op
  setStatus/setWidget sinks anyway, so the throw was blocking
  plannotator_submit_plan after HTTP approval with no benefit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(providers/pi): flush notify() chunks immediately in batch mode

Batch-mode adapters (CLI) accumulate assistant chunks and only flush on
node completion. That broke plannotator's review-URL flow: Pi's notify()
emitted the URL as an assistant chunk, but the user needed the URL to
POST /api/approve — which is what unblocks the node in the first place.

Adds an optional `flush` flag on assistant MessageChunks. notify() sets
it, and the DAG executor drains pending batched content before surfacing
the flushed chunk so ordering is preserved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: mention Pi alongside Claude and Codex in README + top-level docs

The AI assistants docs page already covers Pi in depth, but the README
architecture diagram + docs table, overview "Further Reading" section,
and local-deployment .env comment still listed only Claude/Codex.

Left feature-specific mentions alone where Pi genuinely lacks support
(e.g. structured output — Claude + Codex only).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: note Pi structured output (best-effort) in matrix + workflow docs

Pi gained structured output support via prompt augmentation + JSON
extraction (see packages/providers/src/community/pi/capabilities.ts).
Unlike Claude/Codex, which use SDK-enforced JSON mode, Pi appends the
schema to the prompt and parses JSON out of the result text (bare or
fenced). Updates four stale references that still said Claude/Codex-only:

- ai-assistants.md capabilities matrix
- authoring-workflows.md (YAML example + field table)
- workflow-dag.md skill reference
- CLAUDE.md DAG-format node description

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(providers/pi): default extensions + interactive to on

Extensions (community packages like @plannotator/pi-extension and
user-authored ones) are a core reason users pick Pi. Defaulting
enableExtensions and interactive to false previously silenced installed
extensions with no signal, leading to "did my extension even load?"
confusion.

Opt out in .archon/config.yaml when you want the prior behavior:

  assistants:
    pi:
      enableExtensions: false   # skip extension discovery entirely
      # interactive: false       # load extensions, but no UI bridge

Docs gain a new "Extensions (on by default)" section in
getting-started/ai-assistants.md that documents the three config
surfaces (extensionFlags, env, workflow-level interactive) and uses
plannotator as a concrete walk-through example.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-20 07:37:40 -05:00
Cocoon-Break
45682bd2c8
fix(providers/claude): use || instead of ?? in hasExplicitTokens to handle empty-string env vars (#1028)
Closes #1027
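A minimal sketch of why `||` matters here, with hypothetical env var names (the real guard checks the provider's actual token variables):

```typescript
// `env.TOKEN ?? fallback` treats TOKEN="" as an explicit (empty) token,
// because ?? only falls through on null/undefined. `||` falls through
// on any falsy value, including the empty string.
function hasExplicitTokens(env: Record<string, string | undefined>): boolean {
  return Boolean(env.OAUTH_TOKEN || env.API_KEY);
}
```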
2026-04-20 14:15:27 +03:00
Rasmus Widing
28908f0c75
feat(paths/cli/setup): unify env load + write on three-path model (#1302, #1303) (#1304)
* feat(paths/cli/setup): unify env load + write on three-path model (#1302, #1303)

Key env handling on directory ownership rather than filename. `.archon/` (at
`~/` or `<cwd>/`) is archon-owned; anything else is the user's.

- `<repo>/.env` — stripped at boot (guard kept), never loaded, never written
- `<repo>/.archon/.env` — loaded at repo scope (wins over home), writable via
  `archon setup --scope project`
- `~/.archon/.env` — loaded at home scope, writable via `--scope home` (default)

Read side (#1302):
- New `@archon/paths/env-loader` with `loadArchonEnv(cwd)` shared by CLI and
  server entry points. Loads both archon-owned files with `override: true`;
  repo scope wins.
- Replaced `[dotenv@17.3.1] injecting env (0) from .env` (always lied about
  stripped keys) with `[archon] stripped N keys from <cwd> (...)` and
  `[archon] loaded N keys from <path>` lines, emitted only when N > 0.
  `quiet: true` passed to dotenv to silence its own output.
- `stripCwdEnv` unchanged in semantics — still the only source that deletes
  keys from `process.env`; now logs what it did.
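The load order can be sketched with a minimal stand-in. The real loader uses dotenv with `override: true` and `quiet: true`; this version parses bare `KEY=VALUE` lines directly and is illustrative only:

```typescript
import * as fs from "node:fs";

// Load home scope first, then repo scope: later loads override, so
// <repo>/.archon/.env wins over ~/.archon/.env. Returns the key count
// so the caller can emit "[archon] loaded N keys from <path>" when N > 0.
function loadEnvFile(file: string, env: Record<string, string | undefined>): number {
  let text: string;
  try {
    text = fs.readFileSync(file, "utf8");
  } catch {
    return 0; // a missing scope file is fine
  }
  let loaded = 0;
  for (const line of text.split("\n")) {
    const m = /^([A-Za-z_][A-Za-z0-9_]*)=(.*)$/.exec(line.trim());
    if (m) {
      env[m[1]] = m[2]; // override semantics: later scopes win
      loaded++;
    }
  }
  return loaded;
}
```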

Write side (#1303):
- `archon setup` never writes to `<repo>/.env`. Writing there was incoherent
  because `stripCwdEnv` deletes those keys on every run.
- New `--scope home|project` (default home) targets exactly one archon-owned
  file. New `--force` overrides the merge; backup still written.
- Merge-only by default: existing non-empty values win, user-added custom keys
  survive, `<path>.archon-backup-<ISO-ts>` written before every rewrite. Fixes
  silent PostgreSQL→SQLite downgrade and silent token loss in Add mode.
- One-time migration note emitted when `<cwd>/.env` exists at setup start.

Tests: new `env-loader.test.ts` (6), extended `strip-cwd-env.test.ts` (+4 for
the log line), extended `setup.test.ts` (+10 for scope/merge/backup/force/
repo-untouched), extended `cli.test.ts` (+5 for flag parsing).

Docs: configuration.md, cli.md, security.md, cli-internals.md, setup skill —
all updated to the three-path model.

* fix(cli/setup): address PR review — scope/path/secret-handling edge cases

- cli: resolve --scope project to git repo root so running setup from a
  subdir writes to <repo-root>/.archon/.env (what loadArchonEnv reads at
  boot), not <subdir>/.archon/.env. Fail fast with a useful message when
  --scope project is used outside a git repo.
- setup: resolveScopedEnvPath() now delegates to @archon/paths helpers
  (getArchonEnvPath / getRepoArchonEnvPath) so Docker's /.archon home,
  ARCHON_HOME overrides, and the "undefined" literal guard all behave
  identically between the loader and the writer.
- setup: wrap the writeScopedEnv call in try/catch so an fs exception
  (permission denied, read-only FS, backup copy failure) stops the clack
  spinner cleanly and emits an actionable error instead of a raw stack
  trace after the user has completed the entire wizard.
- setup: checkExistingConfig(envPath?) — scope-aware existing-config read.
  Add/Update/Fresh now reflects the actual write target, not an
  unconditional ~/.archon/.env.
- setup: serializeEnv escapes \r (was only \n) so values with bare CR or
  CRLF round-trip through dotenv.parse without corruption. Regression
  test added.
- setup: merge path treats whitespace-only existing values ('   ') as
  empty, so a copy-paste stray space doesn't silently defeat the wizard
  update for that key forever. Regression test added.
- setup: 0o600 mode on the written env file AND on backup copies —
  writeFileSync+copyFileSync default to 0o666 & ~umask, which can leave
  secrets group/world-readable on a permissive umask.
- docs/cli.md + setup skill: appendix sections that still described the
  pre-#1303 two-file symlink model now reflect the three-path model.
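The secret-safe write and the `\r` escaping can be sketched together; the serializer below is illustrative, not the setup command's code:

```typescript
import * as fs from "node:fs";

// Explicit 0o600 instead of the writeFileSync default (0o666 & ~umask),
// which can leave secrets group/world-readable on a permissive umask.
// \r is escaped alongside \n so CRLF values round-trip through
// dotenv-style parsing without corruption.
function writeEnvFile(file: string, entries: Record<string, string>): void {
  const body = Object.entries(entries)
    .map(([k, v]) => {
      const escaped = v
        .replace(/\\/g, "\\\\")
        .replace(/\n/g, "\\n")
        .replace(/\r/g, "\\r");
      return `${k}="${escaped}"`;
    })
    .join("\n");
  fs.writeFileSync(file, body + "\n", { mode: 0o600 });
}
```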

* fix(paths/env-loader): Windows-safe assertion for home-scope load line

The test asserted the log line contained `from ~/`, which is opportunistic
tilde-shortening that only happens when the tmpdir lives under `homedir()`.
On Windows CI the tmpdir is on `D:\\` while homedir is `C:\\Users\\...`, so
the path renders absolute and the `~/` never appears.

Match on the count and the archon-home tmpdir segment instead — robust on
both Unix tilde-short paths and Windows absolute paths.
2026-04-20 12:49:14 +03:00
Rasmus Widing
8ae4a56193
feat(workflows): add repo-triage — periodic maintenance via inline Haiku sub-agents (#1293)
* feat(workflows): add repo-triage — 6-node periodic maintenance workflow

Adds .archon/workflows/repo-triage.yaml: a self-contained periodic
maintenance workflow that uses inline sub-agents (Claude SDK agents:
field introduced in #1276) for map-reduce across open issues and PRs.

Six DAG nodes, three-layer topology:
- Layer 1 (parallel): triage-issues, link-prs, closed-pr-dedup-check,
  stale-nudge
- Layer 2: closed-dedup-check (reads triage-issues state)
- Layer 3: digest (synthesises all prior nodes + writes markdown)

Capabilities per node:
- triage-issues: delegates labeling to on-disk triage-agent; inline
  brief-gen Haiku for duplicate detection; 3-day auto-close clock
  for unanswered duplicate warnings
- link-prs: conservative PR ↔ issue cross-refs via inline pr-issue-
  matcher Haiku, Sonnet re-verifies fully-addresses claims before
  suggesting Closes #X; auto-nudges on low-quality PR template fill
  with first-run grandfather guard (snapshot-only, no nudge spam)
- closed-dedup-check: cross-matches open issues against recently-
  closed ones via inline closed-brief-gen Haiku; same 3-day clock
- closed-pr-dedup-check: flags open PRs duplicating recently-closed
  PRs via inline pr-brief-gen Haiku; comment-only, never closes PRs
- stale-nudge: 60-day inactivity pings (configurable); no auto-close
- digest: synthesises per-node outputs + reads state files to emit
  $ARTIFACTS_DIR/digest.md with clickable GitHub comment links

Env-gated rollout knobs:
- DRY_RUN=1 (read-only; prints [DRY] lines, no gh/state mutations)
- SKIP_PR_LINK=1, SKIP_CLOSED_DEDUP=1, SKIP_CLOSED_PR_DEDUP=1,
  SKIP_STALE_NUDGE=1
- STALE_DAYS=N (stale-nudge window; default 60)

Cross-run state under .archon/state/ (gitignored):
- triage-state.json        briefs + pendingDedupComments
- closed-dedup-state.json  closedBriefs + closedMatchComments
- closed-pr-dedup-state.json openBriefs + closedBriefs + matches
- pr-state.json            linkedPrs + commentIds + templateAdherence
- stale-nudge-state.json   nudged (with updatedAtAtNudge for re-nudge)

Every bot comment:
- @-tags the target human (reporter for issues, author for PRs)
- Tracks comment ID in state for traceability
- Is idempotent — re-runs skip existing comments

Intended use: invoke periodically (`archon workflow run repo-triage
--no-worktree`) once a scheduler lands; live state persists across
runs so previously-flagged items reconcile correctly.

.gitignore: adds .archon/state/ for cross-run memory files.

* feat(workflows/repo-triage): post digest to Slack when SLACK_WEBHOOK is set

Extends the digest node with an optional Slack-post step after the
canonical digest.md artifact is written. Uses Slack incoming webhook
(no bot token required beyond the incoming-webhook scope).

Behavior:
- SLACK_WEBHOOK unset → skipped silently with a one-line note
- DRY_RUN=1 → prints full payload, does not curl
- Otherwise → POSTs a compact (<3500 char) mrkdwn-formatted summary
  containing headline numbers, this-run comment index (clickable
  GitHub URLs), pending items, and a path reference to digest.md
- curl failure or non-ok Slack response is logged but does not fail
  the node — digest.md on disk remains authoritative
- Intermediate Slack text written to $ARTIFACTS_DIR/digest-slack.txt
  for traceability; payload JSON assembled via jq and written to
  $ARTIFACTS_DIR/slack-payload.json before curl posts it

Slack mrkdwn conversion rules baked into the prompt (no tables, link
shape <url|text>, single-asterisk bold) so Sonnet emits a variant
that renders cleanly in Slack rather than being sent raw.

The webhook URL is read from the operator's environment (Archon
auto-loads ~/.archon/.env on CLI startup — put SLACK_WEBHOOK=... there).

* fix(workflows/repo-triage): address PR #1293 review feedback

Critical (3):
- `gh issue close --reason "not planned"` (space, not underscore) — the
  CLI expects lowercase with a space; `not_planned` fails at runtime.
  Fixed in both auto-close paths (triage-issues step 8, closed-dedup-
  check step 7).
- link-prs step 7 state save was sparse `{ sha, processedAt, related,
  fullyAddresses }`, overwriting `commentIds` / `templateNudgedAt` /
  `templateAdherence`. Changed to explicit merge that spreads existing
  entry first so per-run captured fields survive.
- Corrupt-JSON state files previously treated as first-run default
  (silent `pendingDedupComments` reset → 3-day clock restarts forever).
  All five state-load sites now abort loudly on JSON.parse throw;
  ENOENT/empty continue to default-shape.
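The state-load policy above — ENOENT/empty → default shape, corrupt JSON → loud abort — roughly looks like this sketch; function and parameter names are assumptions, not the actual workflow code:

```typescript
import { readFileSync } from "node:fs";

// Hedged sketch: ENOENT and empty files are legitimate first-run states;
// a JSON.parse failure means corruption and must fail the run.
function loadState<T>(path: string, defaults: T): T {
  let raw: string;
  try {
    raw = readFileSync(path, "utf8");
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "ENOENT") {
      return defaults; // first run: no state file yet
    }
    throw err; // permission errors etc. propagate
  }
  if (raw.trim() === "") return defaults; // empty file counts as first run
  // Deliberately NOT caught: silently resetting pendingDedupComments
  // would restart the 3-day clock forever.
  return JSON.parse(raw) as T;
}
```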

Important (7):
- Sub-agents (`brief-gen`, `closed-brief-gen`, `pr-brief-gen`,
  `pr-issue-matcher`) emit `ERROR: <reason>` on gh failures rather than
  partial/fabricated JSON. Orchestrator detects the sentinel, logs the
  failed ID + first 200 chars of raw response, tracks in a failed-list,
  and aborts the cluster/match pass if ≥50% of items failed (avoids
  acting on bad data).
- `pr-brief-gen` now sets `diffTruncated: true` when the 30k-char diff
  cap hits; link-prs verify pass downgrades any `fully-addresses` claim
  to `related` when either side's brief was truncated.
- 3-day auto-close validates `postedAt` parses as ISO-8601 before the
  elapsed-time comparison; corrupt timestamps are logged and skipped,
  never acted on.
- `gh issue close` failure path no longer drops state — sets
  `closeAttemptFailed: true` on the entry for next-run retry. Only
  drops on exit 0.
- `closed-pr-dedup-check` idempotency check (`gh pr view --json comments`)
  now aborts the post on fetch failure rather than falling through —
  prevents double-posts on gh hiccups.
- `triage-agent` label pass has preflight `test -f` check for
  `.claude/agents/triage-agent.md`; skips the pass with a clear log if
  the file is missing rather than firing Task calls that fail obscurely.
- `brief-gen` template-adherence wording flipped from "Ignore … as
  'filled'" (ambiguous, read as affirmative) to explicit "A section
  counts as MISSING when …", matching the `pr-issue-matcher` phrasing.

Minor:
- `stale-nudge` idempotency check uses substring "has been quiet for"
  instead of a prefix check that never matched (posted body starts
  with @<author>).
- `closed-dedup-check` distinguishes "upstream crashed" (missing/corrupt
  triage-state.json, or `lastRunAt == null`) from "legitimately quiet
  day" (state present, briefs empty) — different log lines.
- Slack curl adds `-w "\nHTTP_STATUS:%{http_code}"` + `2>&1` so TLS /
  4xx / 5xx errors are visible in captured output.
- `stateReason` values from `gh issue view --json stateReason` are
  UPPERCASE (`COMPLETED`, `NOT_PLANNED`); documented this and instructed
  the sub-agent to normalize to lowercase for consistency.
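The `-w "\nHTTP_STATUS:%{http_code}"` pattern works because curl appends the status line to the captured output; the non-fatal parse step can be sketched like this (the response value is simulated here, not a real webhook call):

```shell
# Simulated captured curl output: body first, then the -w suffix line.
response="ok
HTTP_STATUS:200"

# Everything after the last HTTP_STATUS: marker is the status code.
status="${response##*HTTP_STATUS:}"

if [ "$status" = "200" ]; then
  echo "slack post ok"
else
  # Logged but non-fatal: digest.md on disk remains authoritative.
  echo "slack post failed: status=$status"
fi
```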

Docs:
- CLAUDE.md repo-level `.archon/` tree now lists `state/`.
- archon-directories.md tree adds `state/` + `scripts/` (both were
  missing) with purpose descriptions.

Deferred (worth doing as a follow-up, not blocking):
- DRY/SKIP preamble duplication (~30-50 lines across 5 nodes).
- Explicit `BASELINE_IS_EMPTY` capture in link-prs (current derived
  check works but is a load-bearing model instruction).
- Digest `WARNING` prefix block when upstream nodes are missing
  outputs — today's "(output unavailable)" sub-line is functional.
- Pre-existing README workflow-count (17 → 20) and table gaps — not
  caused by this PR.
2026-04-20 11:34:38 +03:00
Fly Lee
eb730c0b82
fix(docs): prevent theme reset to dark after user switches to auto/light (#1079)
Starlight removes the `starlight-theme` localStorage key when the user
selects "auto" mode. The old init script checked that key, so every
navigation or refresh re-forced dark theme. Use a separate
`archon-theme-init` sentinel that persists across theme changes.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 10:01:27 +03:00
Cole Medin
ec5e5a5cf9
feat(providers/pi): opt-in extension discovery via config flag (#1298)
Adds `assistants.pi.enableExtensions` (default false) to `.archon/config.yaml`.
When true, Pi's `noExtensions` guard is lifted so the session loads tools and
lifecycle hooks from `~/.pi/agent/extensions/`, packages installed via
`pi install npm:<pkg>`, and the workflow's cwd `.pi/` directory — opening up
the community extension ecosystem at https://shittycodingagent.ai/packages.

Default stays suppressed to preserve the "Archon is source of truth" trust
boundary: enabling this loads arbitrary JS under the Archon server's OS
permissions, including whatever extension code the target repo happens to
ship. Operators opt in explicitly, per-host.

Skills, prompt templates, themes, and context files remain suppressed even
when extensions are enabled — only the extensions gate opens.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-19 14:35:52 -05:00
Cole Medin
fb73a500d7
feat(providers/pi): best-effort structured output via prompt engineering (#1297)
Pi's SDK has no native JSON-schema mode (unlike Claude's outputFormat /
Codex's outputSchema). Previously Pi declared structuredOutput: false
and any workflow using output_format silently degraded — the node ran,
the transcript was treated as free text, and downstream $nodeId.output.field
refs resolved to empty strings. 8 bundled/repo workflows across 10 nodes
were affected (archon-create-issue, archon-fix-github-issue,
archon-smart-pr-review, archon-workflow-builder, archon-validate-pr, etc.).

This PR closes the gap via prompt engineering + post-parse:

1. When requestOptions.outputFormat is present, the provider appends a
   "respond with ONLY a JSON object matching this schema" instruction plus
   JSON.stringify(schema) to the prompt before calling session.prompt().

2. bridgeSession accepts an optional jsonSchema param. When set, it buffers
   every assistant text_delta and — on the terminal result chunk — parses
   the buffer via tryParseStructuredOutput (trims whitespace, strips
   ```json / ``` fences, JSON.parse). On success, attaches
   structuredOutput to the result chunk (matching Claude's shape). On
   failure, emits a warn event and leaves structuredOutput undefined so
   the executor's existing dag.structured_output_missing path handles it.

3. Flipped PI_CAPABILITIES.structuredOutput to true. Unlike Claude/Codex
   this is best-effort, not SDK-enforced — reliable on GPT-5, Claude,
   Gemini 2.x, recent Qwen Coder, DeepSeek V3, less reliable on smaller
   or older models that ignore JSON-only instructions.
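A minimal sketch of the trim / strip-fence / parse sequence in (2) — the shipped tryParseStructuredOutput may handle more edge cases than shown:

```typescript
// Hedged sketch, not the Pi provider's actual implementation.
function tryParseStructuredOutput(raw: string): unknown | undefined {
  let text = raw.trim();
  // Strip a leading ```json (or bare ```) fence and its closing fence.
  const fence = text.match(/^```(?:json)?\s*\n([\s\S]*?)\n```$/);
  if (fence) text = fence[1].trim();
  try {
    return JSON.parse(text);
  } catch {
    // Caller emits a warn event; structuredOutput stays undefined so the
    // executor's structured_output_missing path takes over.
    return undefined;
  }
}
```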

Tests added (14 total):
- tryParseStructuredOutput: clean JSON, fenced, bare fences, arrays,
  whitespace, empty, prose-wrapped (fails), malformed, inner backticks
- augmentPromptForJsonSchema via provider integration: schema appended,
  prompt unchanged when absent
- End-to-end: clean JSON → structuredOutput parsed; fenced JSON parses;
  prose-wrapped → no structuredOutput + no crash; no outputFormat →
  never sets structuredOutput even if assistant happens to emit JSON

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-19 10:16:02 -05:00
Cole Medin
83c119af78
fix(providers/pi): wire env injection + harden silent-failure paths (#1296)
Four defensive fixes to the Pi community provider to match the
Claude/Codex contract and eliminate silent error swallowing.

1. envInjection now actually wired (capability was declared but unused)
   Pi's SDK has no top-level `env` option on createAgentSession, so
   per-project env vars were being dropped. Routes requestOptions.env
   through a BashSpawnHook that merges caller env over the inherited
   baseline (caller wins, matching Claude/Codex semantics). When env is
   present with no allow/deny, resolvePiTools now explicitly returns Pi's
   4 default tools so the pre-constructed default bashTool is replaced
   with an env-aware one.

2. AsyncQueue no longer leaks on consumer abort. Added close() that
   drains pending waiters with { done: true } so iterate() exits instead
   of hanging forever when the producer's finally fires before the next
   push. bridgeSession calls queue.close() in its finally block.

3. buildResultChunk no longer reports silent success when agent_end fires
   with no assistant message. Now returns { isError: true, errorSubtype:
   'missing_assistant_message' } and logs a warn event so broken Pi
   sessions don't masquerade as clean completions.

4. session-resolver no longer swallows arbitrary errors from
   SessionManager.list(). Narrowed the catch to ENOENT/ENOTDIR (the only
   "session dir doesn't exist yet" signals); permission errors, parse
   failures, and other unexpected errors now propagate.
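The close() semantics in (2) can be sketched as follows — a simplified AsyncQueue, with field names assumed for illustration:

```typescript
// Hedged sketch of a single-producer async queue whose close() drains
// blocked consumers instead of leaving them hanging forever.
class AsyncQueue<T> {
  private buffer: T[] = [];
  private waiters: Array<(r: IteratorResult<T>) => void> = [];
  private closed = false;

  push(item: T): void {
    const waiter = this.waiters.shift();
    if (waiter) waiter({ value: item, done: false });
    else this.buffer.push(item);
  }

  // Resolve every pending waiter with { done: true } so a consumer
  // blocked in `for await` exits when the producer's finally fires.
  close(): void {
    this.closed = true;
    for (const waiter of this.waiters.splice(0)) {
      waiter({ value: undefined, done: true });
    }
  }

  async *[Symbol.asyncIterator](): AsyncGenerator<T> {
    while (true) {
      if (this.buffer.length > 0) {
        yield this.buffer.shift()!;
      } else if (this.closed) {
        return;
      } else {
        const result = await new Promise<IteratorResult<T>>((resolve) => {
          this.waiters.push(resolve);
        });
        if (result.done) return;
        yield result.value;
      }
    }
  }
}
```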

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-19 09:20:32 -05:00
Rasmus Widing
60eeb00e42
feat(workflows): inline sub-agent definitions on DAG nodes (#1276)
* feat(workflows): inline sub-agent definitions on DAG nodes

Add `agents:` node field letting workflow YAML define Claude Agent SDK
sub-agents inline, keyed by kebab-case ID. The main agent can spawn
them via the Task tool — useful for map-reduce patterns where a cheap
model briefs items and a stronger model reduces.

Authors no longer need standalone `.claude/agents/*.md` files for
workflow-scoped helpers; the definitions live with the workflow.

Claude only. Codex and community providers without the capability
emit a capability warning and ignore the field. Merges with the
internal `dag-node-skills` wrapper when `skills:` is also set —
user-defined agents win on ID collision.

* fix(workflows): address PR #1276 review feedback

Critical:
- Re-export agentDefinitionSchema + AgentDefinition from schemas/index.ts
  (matches the "schemas/index.ts re-exports all" convention).

Important:
- Surface user-override of internal 'dag-node-skills' wrapper: warn-level
  provider log + platform message to the user when agents: redefines the
  reserved ID alongside skills:. User-wins behavior preserved (by design)
  but silent capability removal is now observable.
- Add validator test coverage for the agents-capability warning (codex
  node with agents: → warning; claude node → no warning; no-agents
  field → no warning).
- Strengthen NodeConfig.agents duplicate-type comment explaining the
  intentional circular-dep avoidance and pointing at the Zod schema as
  authoritative source. Actual extraction is follow-up work.

Simplifications:
- Drop redundant typeof check in validator (schema already enforces).
- Drop unreachable Object.keys(...).length > 0 check in dag-executor.
- Drop rot-prone "(out of v1 scope)" parenthetical.
- Drop WHAT-only comment on AGENT_ID_REGEX.
- Tighten AGENT_ID_REGEX to reject trailing/double hyphens
  (/^[a-z0-9]+(-[a-z0-9]+)*$/).
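The tightened regex quoted above accepts only kebab-case IDs; a quick illustration of what it admits and rejects:

```typescript
// Regex from the commit above; the sample IDs are illustrative.
const AGENT_ID_REGEX = /^[a-z0-9]+(-[a-z0-9]+)*$/;

console.log(AGENT_ID_REGEX.test("brief-gen"));  // true
console.log(AGENT_ID_REGEX.test("brief--gen")); // false: double hyphen
console.log(AGENT_ID_REGEX.test("brief-gen-")); // false: trailing hyphen
```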

Tests:
- parseWorkflow strips agents on script: and loop: nodes (parallel to
  the existing bash: coverage).
- provider emits warn log on dag-node-skills collision; no warn on
  non-colliding inline agents.

Docs:
- Renumber authoring-workflows Summary section (12b → 13; bump 13-19).
- Add Pi capability-table row for inline agents (Claude-only).
- Add when-to-use guidance (agents: vs .claude/agents/*.md) in the
  new "Inline sub-agents" section.
- Cross-link skills.md Related → inline-sub-agents.
- CHANGELOG [Unreleased] Added entry for #1276.
2026-04-19 09:16:01 +03:00
Cole Medin
4c6ddd994f
fix(workflows): fail loudly on SDK isError results (#1208) (#1291)
Previously, `dag-executor` only failed nodes/iterations when the SDK
returned an `error_max_budget_usd` result. Every other `isError: true`
subtype — including `error_during_execution` — was silently `break`ed
out of the stream with whatever partial output had accumulated, letting
failed runs masquerade as successful ones with empty output.

This is the most likely explanation for the "5-second crash" symptom in
#1208: iterations finish instantly with empty text, the loop keeps
going, and only the `claude.result_is_error` log tips the user off.

Changes:
- Capture the SDK's `errors: string[]` detail on result messages
  (previously discarded) and surface it through `MessageChunk.errors`.
- Log `errors`, `stopReason` alongside `errorSubtype` in
  `claude.result_is_error` so users can see what actually failed.
- Throw from both the general node path and the loop iteration path
  on any `isError: true` result, including the subtype and SDK errors
  detail in the thrown message.

Note: this does not implement auto-retry. See PR comments on #1121 and
the analysis on #1208 — a retry-with-fresh-session approach for loop
iterations is not obviously correct until we see what
`error_during_execution` actually carries in the reporter's env.
This change is the observability + fail-loud step that has to come
first so that signal is no longer silent.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 15:02:35 -05:00
DIY Smart Code
d89bc767d2
fix(setup): align PORT default on 3090 across .env.example, wizard, and JSDoc (#1152) (#1271)
The server's getPort() fallback changed from 3000 to 3090 in the Hono
migration (#318), but .env.example, the setup wizard's generated .env,
and the JSDoc describing the fallback were not updated — leaving three
different sources of truth for "the default PORT."

When the wizard writes PORT=3000 to ~/.archon/.env (which the Hono
server loads with override: true, while Vite only reads repo-local
.env), the two processes can land on different ports silently. That
mismatch is the real mechanism behind the failure described in #1152.

- .env.example: comment out PORT, document 3090 as the default
- packages/cli/src/commands/setup.ts: wizard no longer writes PORT=3000
  into the generated .env; fix the "Additional Options" note
- packages/cli/src/commands/setup.test.ts: assert no bare PORT= line and
  the commented default is present
- packages/core/src/utils/port-allocation.ts: fix stale JSDoc "default
  3000" -> "default 3090"
- deploy/.env.example: keep Docker default at 3000 (compose/Caddy target
  that) but annotate it so users don't copy it for local dev

Single source of truth for the local-dev default is now basePort in
port-allocation.ts.
2026-04-17 14:15:37 +02:00
Rasmus Widing
c864d8e427
refactor(providers/pi): drop rot-prone file:line refs from code comments (#1275)
Applies the CLAUDE.md comment rule ("don't embed paths/callers that rot
as the codebase evolves") flagged by the PR #1271 review to the Pi
provider's inline comments.

Three spots in the merged Pi code embed `packages/.../provider.ts:N-M`
line ranges pointing at the Claude and Codex providers. These ranges
will drift the moment those files change — the Claude auth-merge
pattern's line numbers are already off-by-a-few in some local branches.

Keep the conceptual cross-reference ("mirrors Claude's process-env +
request-env merge pattern", "matches the Codex provider's fallback
pattern for the same condition") — that's the load-bearing part of the
comment — drop the fragile line numbers and file paths.

Same treatment for the upstream Pi auth-storage.ts:424-485 reference,
which points at a specific line range in a moving dependency.

No behavior change; comment-only refactor.
2026-04-17 14:08:43 +02:00
Rasmus Widing
4e56991b72
feat(providers): add Pi community provider (@mariozechner/pi-coding-agent) (#1270)
* feat(providers): add Pi community provider (@mariozechner/pi-coding-agent)

Introduces Pi as the first community provider under the Phase 2 registry,
registered with builtIn: false. Wraps Pi's full coding-agent harness the
same way ClaudeProvider wraps @anthropic-ai/claude-agent-sdk and
CodexProvider wraps @openai/codex-sdk.

- PiProvider implements IAgentProvider; fresh AgentSession per sendQuery call
- AsyncQueue bridges Pi's callback-based session.subscribe() to Archon's
  AsyncGenerator<MessageChunk> contract
- Server-safe: AuthStorage.inMemory + SessionManager.inMemory +
  SettingsManager.inMemory + DefaultResourceLoader with all no* flags —
  no filesystem access, no cross-request state
- API key seeded per-call from options.env → process.env fallback
- Model refs: '<pi-provider-id>/<model-id>' (e.g. google/gemini-2.5-pro,
  openrouter/qwen/qwen3-coder) with syntactic compatibility check
- registerPiProvider() wired at CLI, server, and config-loader entrypoints,
  kept separate from registerBuiltinProviders() since builtIn: false is
  load-bearing for the community-provider validation story
- All 12 capability flags declared false in v1 — dag-executor warnings fire
  honestly for any unmapped nodeConfig field
- 58 new tests covering event mapping, async-queue semantics, model-ref
  parsing, defensive config parsing, registry integration

Supported Pi providers (v1): anthropic, openai, google, groq, mistral,
cerebras, xai, openrouter, huggingface. Extend PI_PROVIDER_ENV_VARS as
needed.

Out of scope (v1): session resume, MCP, hooks, skills mapping, thinking
level mapping, structured output, OAuth flows, model catalog validation.
These remain false on PI_CAPABILITIES until intentionally wired.

* feat(providers/pi): read ~/.pi/agent/auth.json for OAuth + api_key passthrough

Replaces the v1 env-var-only auth flow with AuthStorage.create(), which
reads ~/.pi/agent/auth.json. This transparently picks up credentials the
user has populated via `pi` → `/login` (OAuth subscriptions: Claude
Pro/Max, ChatGPT Plus, GitHub Copilot, Gemini CLI, Antigravity) or by
editing the file directly.

Env-var behavior preserved: when ANTHROPIC_API_KEY / GEMINI_API_KEY /
etc. is set (in process.env or per-request options.env), the adapter
calls setRuntimeApiKey which is priority #1 in Pi's resolution chain.
Auth.json entries are priority #2-#3. Pi's internal env-var fallback
remains priority #4 as a safety net.

Archon does not implement OAuth flows itself — it only rides on creds
the user created via the Pi CLI. OAuth refresh still happens inside Pi
(auth-storage.ts:369-413) under a file lock; concurrent refreshes
between the Pi CLI and Archon are race-safe by Pi's own design.

- Fail-fast error now mentions both the env-var path and `pi /login`
- 2 new tests: OAuth cred from auth.json; env var wins over auth.json
- 12 existing tests still pass (env-var-only path unchanged)

CI compatibility: no auth.json in CI, no change — env-var (secrets)
flows through Pi's getEnvApiKey fallback identically to v1.

* test(e2e): add Pi provider smoke test workflow

Mirrors e2e-claude-smoke.yaml: single prompt node + bash assert.
Targets `anthropic/claude-haiku-4-5` via `provider: pi`; works in CI
(ANTHROPIC_API_KEY secret) and locally (user's `pi /login` OAuth).

Verified locally with an Anthropic OAuth subscription — full run takes
~4s from session_started to assert PASS, exercising the async-queue
bridge and agent_end → result-chunk assembly under real Pi event timing.

Not yet wired into .github/workflows/e2e-smoke.yml — separate PR once
this lands, to keep the Pi provider PR minimal.

* feat(providers/pi): v2 — thinkingLevel, tool restrictions, systemPrompt

Extends the Pi adapter with three node-level translations, flipping the
corresponding capability flags from false → true so the dag-executor no
longer emits warnings for these fields on Pi nodes.

1. effort / thinking → Pi thinkingLevel (options-translator.ts)
   - Archon EffortLevel enum: low|medium|high|max (from
     packages/workflows/src/schemas/dag-node.ts). `max` maps to Pi's
     `xhigh` since Archon's enum lacks it.
   - Pi-native strings (minimal, xhigh, off) also accepted for
     programmatic callers bypassing the schema.
   - `off` on either field → no thinkingLevel (Pi's implicit off).
   - Claude-shape object `thinking: {type:'enabled', budget_tokens:N}`
     yields a system warning and is not applied.

2. allowed_tools / denied_tools → filtered Pi built-in tools
   - Supports all 7 Pi tools: read, bash, edit, write, grep, find, ls.
   - Case-insensitive normalization.
   - Empty `allowed_tools: []` means no tools (LLM-only), matching
     e2e-claude-smoke's idiom.
   - Unknown names (Claude-specific like `WebFetch`) collected and
     surfaced as a system warning; ignored tools don't fail the run.

3. systemPrompt (AgentRequestOptions + nodeConfig.systemPrompt)
   - Threaded through `DefaultResourceLoader({systemPrompt})`; Pi's
     default prompt is replaced entirely. Request-level wins over
     node-level.
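The effort → thinkingLevel translation in (1) can be sketched as a small mapping — the exact table and handling of the `thinking` field are assumptions for illustration:

```typescript
// Hedged sketch: Archon's EffortLevel enum is low|medium|high|max;
// Pi additionally accepts minimal/xhigh/off for programmatic callers.
type PiThinkingLevel = "minimal" | "low" | "medium" | "high" | "xhigh";

function toThinkingLevel(effort?: string): PiThinkingLevel | undefined {
  // "off" (or absence) → no thinkingLevel, which is Pi's implicit off.
  if (!effort || effort === "off") return undefined;
  // Archon's enum lacks xhigh, so max maps onto it.
  if (effort === "max") return "xhigh";
  const piNative: PiThinkingLevel[] = ["minimal", "low", "medium", "high", "xhigh"];
  return piNative.includes(effort as PiThinkingLevel)
    ? (effort as PiThinkingLevel)
    : undefined;
}
```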

Capability flag changes:
- thinkingControl: false → true
- effortControl:   false → true
- toolRestrictions: false → true

Package delta:
- +1 direct dep: @sinclair/typebox (Pi types reference it; adding as
  direct dep resolves the TS portable-type error).
- +1 test file: options-translator.test.ts (19 tests, 100% coverage).
- provider.test.ts extended with 11 new tests covering all three paths.
- registry.test.ts updated: capability assertion reflects new flags.

Live-verified: `bun run cli workflow run e2e-pi-smoke --no-worktree`
succeeds in 1.2s with thinkingLevel=low, toolCount=0. Smoke YAML updated
to use `effort: low` (schema-valid) + `allowed_tools: []` (LLM-only).

* test(e2e): add comprehensive Pi smoke covering every CI-compatible node type

Exercises every node type Archon supports under `provider: pi`, except
`approval:` (pauses for human input, incompatible with CI):
  1. prompt   — inline AI prompt
  2. command  — named command file (uses e2e-echo-command.md)
  3. loop     — bounded iterative AI prompt (max_iterations: 2)
  4. bash     — shell script with JSON output
  5. script   — bun runtime (echo-args.js)
  6. script   — uv / Python runtime (echo-py.py)

Plus DAG features on top of Pi:
  - depends_on + $nodeId.output substitution
  - when: conditional with JSON dot-access
  - trigger_rule: all_success merge
  - final assert node validates every upstream output is non-empty

Complements the minimal e2e-pi-smoke.yaml — that stays as the fast-path
smoke for connectivity checks; this one is the broader surface coverage.

Verified locally end-to-end against Anthropic OAuth (pi /login): PASS,
all 9 non-final nodes produce output, assert succeeds.

* feat(providers/pi): resolve Archon `skills:` names to Pi skill paths

Flips capabilities.skills: false → true by translating Archon's name-based
`skills:` nodeConfig (e.g. `skills: [agent-browser]`) to absolute directory
paths Pi's DefaultResourceLoader can consume via additionalSkillPaths.

Search order for each skill name (first match wins):
  1. <cwd>/.agents/skills/<name>/      — project-local, agentskills.io
  2. <cwd>/.claude/skills/<name>/      — project-local, Claude convention
  3. ~/.agents/skills/<name>/          — user-global, agentskills.io
  4. ~/.claude/skills/<name>/          — user-global, Claude convention

A directory resolves only if it contains a SKILL.md. Unresolved names are
collected and surfaced as a system-chunk warning (e.g. "Pi could not
resolve skill names: foo, bar. Searched .agents/skills and .claude/skills
(project + user-global)."), matching the semantic of "requested but not
found" without aborting the run.

Pi's buildSystemPrompt auto-appends the agentskills.io XML block for each
loaded skill, so the model sees them — no separate prompt injection needed
(Pi differs from Claude here; Claude wraps in an AgentDefinition with a
preloaded prompt, Pi uses XML block in system prompt).

Ancestor directory traversal above cwd is deliberately skipped in this
pass — matches the Pi provider's cwd-bound scope and avoids ambiguity
about which repo's skills win when Archon runs from a subdirectory.

Bun's os.homedir() bypasses the HOME env var; the resolver uses
`process.env.HOME ?? homedir()` so tests can stage a synthetic home dir.
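The four-location search order above can be sketched as follows — helper name is an assumption; a directory counts only if it contains a SKILL.md:

```typescript
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Hedged sketch of the resolver: first match wins, project beats
// user-global, .agents/ (agentskills.io) beats .claude/ at each level.
function resolveSkillPath(name: string, cwd: string): string | undefined {
  const home = process.env.HOME ?? homedir(); // lets tests stage a fake home
  const candidates = [
    join(cwd, ".agents", "skills", name),
    join(cwd, ".claude", "skills", name),
    join(home, ".agents", "skills", name),
    join(home, ".claude", "skills", name),
  ];
  return candidates.find((dir) => existsSync(join(dir, "SKILL.md")));
}
```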

Tests:
- 11 new tests in options-translator.test.ts cover project/user, .agents/
  vs .claude/, project-wins-over-user, SKILL.md presence check, dedup,
  missing-name collection.
- 2 new integration tests in provider.test.ts cover the missing-skill
  warning path and the "no skills configured → no additionalSkillPaths"
  path.
- registry.test.ts updated to assert skills: true in capabilities.

Live-verified locally: `.claude/skills/archon-dev/SKILL.md` resolves,
pi.session_started log shows `skillCount: 1, missingSkillCount: 0`,
smoke workflow passes in 1.2s.

* feat(providers/pi): session resume via Pi session store

Flips capabilities.sessionResume: false → true. Pi now persists sessions
under ~/.pi/agent/sessions/<encoded-cwd>/<uuid>.jsonl by default — same
pattern Claude and Codex use for their respective stores, same blast
radius as those providers.

Flow:
  - No resumeSessionId → SessionManager.create(cwd) (fresh, persisted)
  - resumeSessionId + match in SessionManager.list(cwd) → open(path)
  - resumeSessionId + no match → fresh session + system warning
    ("⚠️ Could not resume Pi session. Starting fresh conversation.")
    Matches Codex's resume_thread_failed fallback at
    packages/providers/src/codex/provider.ts:553-558.
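The three-way flow above (create / open / fresh-with-warning) can be sketched as a pure resolver — the SessionManager shape here is an assumption for illustration:

```typescript
// Hedged sketch of the resume decision, factored as in session-resolver.ts.
interface SessionInfo {
  id: string;
  path: string;
}

function resolveSession(
  resumeId: string | undefined,
  list: () => SessionInfo[],
): { action: "create" | "open"; path?: string; resumeFailed: boolean } {
  // Empty string counts as "no resume requested".
  if (!resumeId) return { action: "create", resumeFailed: false };
  let sessions: SessionInfo[];
  try {
    sessions = list();
  } catch {
    // Session dir unreadable → graceful fallback to a fresh session.
    return { action: "create", resumeFailed: true };
  }
  const match = sessions.find((s) => s.id === resumeId);
  return match
    ? { action: "open", path: match.path, resumeFailed: false }
    : { action: "create", resumeFailed: true };
}
```

resumeFailed: true is what triggers the "⚠️ Could not resume Pi session" system warning.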

The sessionId flows back to Archon via the terminal `result` chunk —
bridgeSession annotates it with session.sessionId unconditionally so
Archon's orchestrator can persist it and pass it as resumeSessionId on
the next turn. Same mechanism used for Claude/Codex.

Cross-cwd resume (e.g. worktree switch) is deliberately not supported in
this pass: list(cwd) scans only the current cwd's session dir. A workflow
that changes cwd mid-run lands on a fresh session, which matches Pi's
mental model.

Bridge sessionId annotation uses session.sessionId, which Pi always
populates (UUID) — so no special-case for inMemory sessions is needed.

Factored the resolver into session-resolver.ts (5 unit tests):
  - no id → create
  - id + match → open
  - id + no match → create with resumeFailed: true
  - list() throws → resumeFailed: true (graceful)
  - empty-string id → treated as "no resume requested"

Integration tests in provider.test.ts add 3 cases:
  - resume-not-found yields warning + calls create
  - resume-match calls open with the file path, no warning
  - result chunk always carries sessionId

Verified live end-to-end against Anthropic OAuth:
  - first call → sessionId 019d...; model replies "noted"
  - second call with that sessionId → "resumed: true" in logs; model
    correctly recalls prior turn ("Crimson.")
  - bogus sessionId → "⚠️ Could not resume..." warning + fresh UUID

* refactor(providers,core): generalize community-provider registration

Addresses the community-pattern regression flagged in the PR #1270 review:
a second community provider should require editing only its own directory,
not seven files across providers/ + core/ + cli/ + server/.

Three changes:

1. Drop typed `pi` slot from AssistantDefaultsConfig + AssistantDefaults.
   Community providers live behind the generic `[string]` index that
   `ProviderDefaultsMap` was explicitly designed to provide. The typed
   claude/codex slots stay — they give IDE autocomplete for built-in
   config access without `as` casts, which was the whole reason the
   intersection exists. Community providers parse their own config via
   Record<string, unknown> anyway, so the typed slot added no real
   parser safety.

2. Loop-based getDefaults + mergeAssistantDefaults. No more hardcoded
   `pi: {}` spreads. getDefaults() seeds from `getRegisteredProviders()`;
   mergeAssistantDefaults clones every slot present in `base`. Adding a
   new provider requires zero edits to this function.

3. New `registerCommunityProviders()` aggregator in registry.ts.
   Entrypoints (CLI, server, config-loader) call ONE function after
   `registerBuiltinProviders()` rather than one call per community
   provider. Adding a new community provider is now a single-line edit
   to registerCommunityProviders().

This makes Pi (and future community providers) actually behave like
Phase 2 (#1195) advertised: drop the implementation under
packages/providers/src/community/<id>/, export a `register<Id>Provider`,
add one line to the aggregator.
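
The aggregator pattern described above reduces to a few lines. The registry shape and the `registerPiProvider` body here are assumptions for illustration, not Archon's actual internals.

```typescript
// Illustrative sketch of the registerCommunityProviders aggregator.
const registry = new Map<string, () => unknown>();

function registerProvider(id: string, factory: () => unknown): void {
  registry.set(id, factory); // Map.set makes re-registration idempotent
}

function registerPiProvider(): void {
  registerProvider('pi', () => ({ name: 'pi' }));
}

// Entrypoints call this ONE function after registerBuiltinProviders().
// Adding a future community provider is a single line in this body.
function registerCommunityProviders(): void {
  registerPiProvider();
}

function getRegisteredProviders(): string[] {
  return [...registry.keys()];
}
```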

Tests:
- New `registerCommunityProviders` suite (2 tests: registers pi,
  idempotent).
- config-loader.test updated: assert built-in slots explicitly rather
  than exhaustive map shape.

No functional change for Pi end-users. Purely structural.

* fix(providers/pi,core): correctness + hygiene fixes from PR #1270 review

Addresses six of the review's important findings, all within the same
PR branch:

1. envInjection: false → true
   The provider reads requestOptions.env on every call (for API-key
   passthrough). Declaring the capability false caused a spurious
   dag-executor warning for every Pi user who configured codebase env
   vars — which is the MAIN auth path. Flipping to true removes the
   false positive.

2. toSafeAssistantDefaults: denylist → allowlist
   The old shape deleted `additionalDirectories`, `settingSources`,
   `codexBinaryPath` before sending defaults to the web UI. Any future
   sensitive provider field (OAuth token, absolute path, internal
   metadata) would silently leak via the `[key: string]: unknown` index
   signature. New SAFE_ASSISTANT_FIELDS map lists exactly what to
   expose per provider; unknown providers get an empty allowlist so
   the web UI sees "provider exists" but no config details.

3. AsyncQueue single-consumer invariant
   The type was documented single-consumer but unenforced. A second
   `for await` would silently race with the first over buffer +
   waiters. Added a synchronous guard in Symbol.asyncIterator that
   throws on second call — copy-paste mistakes now fail fast with a
   clear message instead of dropping items.
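
A minimal sketch of that guard, assuming a heavily simplified queue (the real AsyncQueue also manages waiters and close()); only the second-consumer check is the point here.

```typescript
// Simplified sketch: a second `for await` throws instead of racing the first.
class AsyncQueue<T> {
  private buffer: T[] = [];
  private consuming = false;

  push(item: T): void {
    this.buffer.push(item);
  }

  [Symbol.asyncIterator](): AsyncIterator<T> {
    if (this.consuming) {
      // Fail fast and loudly instead of letting two loops race over buffer.
      throw new Error('AsyncQueue is single-consumer: second iteration detected');
    }
    this.consuming = true;
    const buffer = this.buffer;
    return {
      next(): Promise<IteratorResult<T>> {
        return Promise.resolve(
          buffer.length > 0
            ? { value: buffer.shift() as T, done: false as const }
            : { value: undefined, done: true as const },
        );
      },
    };
  }
}
```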

4. session.dispose() / session.abort() silent catches
   Both catch blocks now log at debug via a module-scoped logger so
   SDK regressions surface without polluting normal output.

5. Type scripted events as AgentSessionEvent in provider.test.ts
   Was `Record<string, unknown>` — Pi field renames would silently
   keep tests passing. Now typed against Pi's actual event union.

6. Leaked /tmp/pi-research/... path in provider.ts comment
   Local-machine path that crept in during research. Replaced with
   the upstream GitHub URL (matches convention at provider.ts:110).

Plus review-flagged simplifications:
  - Extract lookupPiModel wrapper — isolates the `as unknown as` cast
    behind one searchable name.
  - Hoist QueueItem → BridgeQueueItem at module scope (exported for
    test visibility; not used externally yet but enables unit testing
    the mapping in isolation if needed later).
  - Hoist QueueItem → BridgeQueueItem at module scope (exported for
    test visibility; not used externally yet but enables unit testing
    the mapping in isolation if needed later).
  - getRegisteredProviderNames: remove side-effecting registration
    calls. `loadConfig()` already bootstraps the registry before any
    caller can observe this helper — the hidden coupling was
    misleading.

Plus missing-coverage tests from the review (pr-test-analyzer):
  - session.prompt() rejection → error surfaces to consumer
  - pre-aborted signal → session.abort() called
  - mid-stream abort → session.abort() called
  - modelFallbackMessage → system chunk yielded
  - AsyncQueue second-consumer → throws synchronously

No behavioral changes for end users beyond the envInjection warning
fix.

* docs: Pi provider + community-provider contributor guide

Addresses the PR #1270 review's docs-impact findings: the original Pi
PR had no user-facing or contributor-facing documentation, and
architecture.md still referenced the pre-Phase-2 factory.ts pattern
(factory.ts was deleted in #1195).

1. packages/docs-web/src/content/docs/reference/architecture.md
   - Replace stale factory.ts references with the registry pattern.
   - Update inline IAgentProvider block: add getCapabilities, add
     options parameter.
   - Rewrite MessageChunk block as the actual discriminated union
     (was a placeholder with optional fields that didn't match the
     current type).
   - "Adding a New AI Agent Provider" checklist now distinguishes
     built-in (register in registerBuiltinProviders) from community
     (separate guide). Links to the new contributor guide.

2. packages/docs-web/src/content/docs/contributing/adding-a-community-provider.md (new)
   - Step-by-step guide using Pi as the reference implementation.
   - Covers: directory layout, capability discipline (start false,
     flip one at a time), provider class skeleton, registration via
     aggregator, test isolation (Bun mock.module pollution), what
     NOT to do (no edits to AssistantDefaultsConfig, no direct
     registerProvider from entrypoints, no overclaiming capabilities).

3. packages/docs-web/src/content/docs/getting-started/ai-assistants.md
   - New "Pi (Community Provider)" section: install, OAuth +
     API-key table per Pi backend, model ref format, workflow
     examples, capability matrix showing what Pi supports (session
     resume, tool restrictions, effort/thinking, skills, system
     prompt, envInjection) and what it doesn't (MCP, hooks,
     structured output, cost control, fallback model, sandbox).

4. .env.example
   - New Pi section with commented env vars for each supported
     backend (ANTHROPIC_API_KEY through HUGGINGFACE_API_KEY), each
     paired with its Pi provider id. OAuth flow (pi /login → auth.json)
     is explicitly called out — Archon reads that file too.

5. CHANGELOG.md
   - Unreleased entry for Pi, registerCommunityProviders aggregator,
     and the new contributor guide.
2026-04-17 13:52:03 +02:00
Rasmus Widing
301a139e5a
fix(core/test): split connection.test.ts from DB-test batch to avoid mock pollution (#1269)
messages.test.ts uses mock.module('./connection', ...) at module-load time.
Per CLAUDE.md:131 (Bun issue oven-sh/bun#7823), mock.module() is process-
global and irreversible. When Bun pre-loads all test files in a batch, the
mock shadows the real connection module before connection.test.ts runs,
causing getDatabaseType() to always return the mocked value regardless of
DATABASE_URL.

Move connection.test.ts into its own `bun test` invocation immediately
after postgres.test.ts (which runs alone) and before the big DB/utils/
config/state batch that contains messages.test.ts. This follows the same
isolation pattern already used for command-handler, clone, postgres, and
path-validation tests.
2026-04-17 09:33:52 +02:00
Cole Medin
bed36ca4ad
fix(workflows): add word boundary to context variable substitution regex (#1256)
* fix(workflows): add word boundary to context variable substitution regex (#1112)

Variable substitution for $CONTEXT, $EXTERNAL_CONTEXT, and $ISSUE_CONTEXT
was matching as a prefix of longer identifiers like $CONTEXT_FILE, silently
corrupting bash node scripts. Added negative lookahead (?![A-Za-z0-9_]) to
CONTEXT_VAR_PATTERN_STR so only exact variable names are substituted.

Changes:
- Add negative lookahead to CONTEXT_VAR_PATTERN_STR regex in executor-shared.ts
- Add regression test for prefix-match boundary case

Fixes #1112
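
The fix can be demonstrated in miniature. The pattern string mirrors the commit's description; the real `CONTEXT_VAR_PATTERN_STR` lives in executor-shared.ts and the surrounding substitution logic is more involved than this helper.

```typescript
// Sketch of the word-boundary fix: the negative lookahead stops $CONTEXT
// from matching inside longer identifiers like $CONTEXT_FILE.
const CONTEXT_VAR_PATTERN_STR =
  '\\$(CONTEXT|EXTERNAL_CONTEXT|ISSUE_CONTEXT)(?![A-Za-z0-9_])';

function substituteContext(script: string, values: Record<string, string>): string {
  return script.replace(
    new RegExp(CONTEXT_VAR_PATTERN_STR, 'g'),
    (match, name: string) => values[name] ?? match, // leave unknown vars untouched
  );
}
```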

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test(workflows): add missing boundary cases for context variable substitution

Add three new test cases that complete coverage of the word-boundary fix
from #1112: $ISSUE_CONTEXT with suffix variants, $ISSUE_CONTEXT with multiple
suffixes, and contextSubstituted=false for suffix-only prompts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 18:32:06 -05:00
Cole Medin
df828594d7 fix(test): normalize on-disk content to LF in bundled-defaults test
Companion to 75427c7c. The bundle-completeness test compared
BUNDLED_* strings (now LF-normalized by the generator) against raw
readFileSync output, which is CRLF on Windows checkouts. Apply the
same normalization to the on-disk side so the defense-in-depth check
stays meaningful on every platform.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-16 17:59:41 -05:00
Leex
9dd57b2f3c fix(web): unify Add Project URL/path classification across UI entry points
Settings → Projects Add Project only submitted { path }, so GitHub URLs
entered there failed even though the API and the Sidebar Add Project
already accepted them. Closes #1108.

Changes:
- Add packages/web/src/lib/codebase-input.ts: shared getCodebaseInput()
  helper returning a discriminated { path } | { url } union (re-exported
  from api.ts for convenience).
- Use the helper from all three Add Project entry points: Sidebar,
  Settings, and ChatPage. Removes three divergent inline heuristics.
- SettingsPage: rename addPath → addValue (state now holds either URL
  or local path) and update placeholder text.
- Tests: cover https://, git@ shorthand, ssh://, git://, whitespace,
  unix/relative/home/Windows/UNC paths.
- Docs: document the unified Add Project entry point in adapters/web.md.

Heuristic flips from "assume URL unless explicitly local" to "assume
local unless explicitly remote" — only inputs starting with https?://,
ssh://, git@, or git:// are sent as { url }; everything else is sent
as { path }. The server already resolves tilde/relative paths.
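
The heuristic flip reads naturally as a small helper. The prefix list is taken from the commit message; the real implementation is packages/web/src/lib/codebase-input.ts and may differ in detail.

```typescript
// Sketch of "assume local unless explicitly remote" classification.
type CodebaseInput = { path: string } | { url: string };

const REMOTE_PREFIX = /^(https?:\/\/|ssh:\/\/|git:\/\/|git@)/;

function getCodebaseInput(raw: string): CodebaseInput {
  const value = raw.trim(); // tolerate pasted whitespace
  return REMOTE_PREFIX.test(value) ? { url: value } : { path: value };
}
```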

Co-authored-by: Nguyen Huu Loc <lockbkbang@gmail.com>
2026-04-16 23:43:19 +02:00
Rasmus Widing
86e4c8d605
fix(bundled-defaults): auto-generate import list, emit inline strings (#1263)
* fix(bundled-defaults): auto-generate import list, emit inline strings

Root-cause fix for bundle drift (15 commands + 7 workflows previously
missing from binary distributions) and a prerequisite for packaging
@archon/workflows as a Node-loadable SDK.

The hand-maintained `bundled-defaults.ts` import list is replaced by
`scripts/generate-bundled-defaults.ts`, which walks
`.archon/{commands,workflows}/defaults/` and emits a generated source
file with inline string literals. `bundled-defaults.ts` becomes a thin
facade that re-exports the generated records and keeps the
`isBinaryBuild()` helper.

Inline strings (via JSON.stringify) replace Bun's
`import X from '...' with { type: 'text' }` attributes. The binary build
still embeds the data at compile time, but the module now loads under
Node too — removing SDK blocker #2.

- Generator: `scripts/generate-bundled-defaults.ts` (+ `--check` mode for CI)
- `package.json`: `generate:bundled`, `check:bundled`; wired into `validate`
- `build-binaries.sh`: regenerates defaults before compile
- Test: `bundle completeness` now derives expected set from on-disk files
- All 56 defaults (36 commands + 20 workflows) now in the bundle
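
The emit step described above boils down to JSON.stringify per file. This is a rough sketch with illustrative names; the real generator is scripts/generate-bundled-defaults.ts.

```typescript
// Inline each file's content as a JSON string literal so the generated
// module loads under plain Node, with no Bun-only `with { type: 'text' }`
// import attributes.
function emitBundledRecord(
  exportName: string,
  entries: Array<{ name: string; content: string }>,
): string {
  const lines = entries.map(
    (e) => `  ${JSON.stringify(e.name)}: ${JSON.stringify(e.content)},`,
  );
  return [
    '// GENERATED FILE: regenerate with `bun run generate:bundled`',
    `export const ${exportName}: Record<string, string> = {`,
    ...lines,
    '};',
    '',
  ].join('\n');
}
```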

* fix(bundled-defaults): address PR review feedback

Review: https://github.com/coleam00/Archon/pull/1263#issuecomment-4262719090

Generator:
- Guard against .yaml/.yml name collisions (previously silent overwrite)
- Add early access() check with actionable error when run from wrong cwd
- Type top-level catch as unknown; print only message for Error instances
- Drop redundant /* eslint-disable */ emission (global ignore covers it)
- Fix misleading CI-mechanism claim in header comment
- Collapse dead `if (!ext) continue` guard into a single typed pass

Scripts get real type-checking + linting:
- New scripts/tsconfig.json extending root config
- type-check now includes scripts/ via `tsc --noEmit -p scripts/tsconfig.json`
- Drop `scripts/**` from eslint ignores; add to projectService file scope

Tests:
- Inline listNames helper (Rule of Three)
- Drop redundant toBeDefined/typeof assertions; the Record<string, string>
  type plus length > 50 already cover them
- Add content-fidelity round-trip assertion (defense against generator
  content bugs, not just key-set drift)

Facade comment: drop dead reference to .claude/rules/dx-quirks.md.

CI: wire `bun run check:bundled` into .github/workflows/test.yml so the
header's CI-verification claim is truthful.

Docs: CLAUDE.md step count four→five; add contributor bullet about
`bun run generate:bundled` in the Defaults section and CONTRIBUTING.md.

* chore(e2e): bump Codex model to gpt-5.2

gpt-5.1-codex-mini is deprecated and unavailable on ChatGPT-account Codex
auth. Plain gpt-5.2 works. Verified end-to-end:

- e2e-codex-smoke: structured output returns {category:'math'}
- e2e-mixed-providers: claude+codex both return expected tokens
2026-04-16 21:27:51 +02:00
Cole Medin
d535c832e3
feat(telemetry): anonymous PostHog workflow-invocation tracking (#1262)
* feat(telemetry): add anonymous PostHog workflow-invocation tracking

Emits one `workflow_invoked` event per run with workflow name/description,
platform, and Archon version. Uses a stable random UUID persisted to
`$ARCHON_HOME/telemetry-id` for distinct-install counting, with
`$process_person_profile: false` to stay in PostHog's anonymous tier.
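
The stable-id mechanism is simple enough to sketch. The file name `telemetry-id` and the function name `getOrCreateTelemetryId` come from the commits below; the exact signature and surrounding logic here are assumptions.

```typescript
import { existsSync, mkdirSync, readFileSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';
import { randomUUID } from 'node:crypto';

// One random UUID, written to $ARCHON_HOME/telemetry-id on first use
// and reused forever after, so installs can be counted without
// identifying anyone.
function getOrCreateTelemetryId(archonHome: string): string {
  mkdirSync(archonHome, { recursive: true });
  const idFile = join(archonHome, 'telemetry-id');
  if (existsSync(idFile)) return readFileSync(idFile, 'utf8').trim();
  const id = randomUUID();
  writeFileSync(idFile, id);
  return id;
}
```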

Opt out with `ARCHON_TELEMETRY_DISABLED=1` or `DO_NOT_TRACK=1`. Self-host
via `POSTHOG_API_KEY` / `POSTHOG_HOST`.

Closes #1261

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(telemetry): stop leaking test events to production PostHog

The `telemetry-id preservation` test exercised the real capture path with
the embedded production key, so every `bun run validate` published a
tombstone `workflow_name: "w"` event. Redirect POSTHOG_HOST to loopback
so the flush fails silently; bump test timeout to accommodate the
retry-then-give-up window.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(telemetry): silence posthog-node stderr leak on network failure

The PostHog SDK's internal logFlushError() writes 'Error while flushing
PostHog' directly to stderr via console.error on any network or HTTP
error, bypassing logger config. For a fire-and-forget telemetry path
this leaked stack traces to users' terminals whenever PostHog was
unreachable (offline, firewalled, DNS broken, rate-limited).

Pass a silentFetch wrapper to the PostHog client that masks failures as
fake 200 responses. The SDK never sees an error, so it never logs.
Original failure is still recorded at debug level for diagnostics.

Side benefit: shutdown is now fast on network failure (no retry loop),
so offline CLI commands no longer hang ~10s on exit.
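
The wrapper idea reduces to a try/catch around fetch. This sketch uses a minimal response shape for clarity; the real wrapper returns an actual `Response` compatible with the posthog-node client's fetch option.

```typescript
// Mask transport failures as a fake success so the SDK's internal error
// logger never fires. Only safe for fire-and-forget telemetry where the
// response body is never inspected.
type MinimalResponse = { status: number };
type FetchLike = (url: string) => Promise<MinimalResponse>;

function makeSilentFetch(
  realFetch: FetchLike,
  debugLog: (err: unknown) => void,
): FetchLike {
  return async (url) => {
    try {
      return await realFetch(url);
    } catch (err) {
      debugLog(err); // original failure still recorded for diagnostics
      return { status: 200 }; // fake 200: SDK sees success, logs nothing
    }
  };
}
```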

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(telemetry): make id-preservation test deterministic

Replace the fire-and-forget capture + setTimeout + POSTHOG_HOST-loopback
dance with a direct synchronous call to getOrCreateTelemetryId(). Export
the function with an @internal marker so tests can exercise the id path
without spinning up the PostHog client. No network, no timer, no flake.

Addresses CodeRabbit feedback on #1262.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-16 13:45:55 -05:00
Cole Medin
367de7a625 test(ci): inject deliberate failure to verify CI red X
Injects exit 1 into e2e-deterministic bash-echo node to prove the engine
fix (failWorkflowRun on anyFailed) propagates to a non-zero CLI exit code
and a red X in GitHub Actions. Will be reverted in the next commit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:40:55 -05:00
Cole Medin
7721259bdc
fix(core): surface auth errors instead of silently dropping them (#1089)
* fix: surface auth errors instead of silently dropping them (#1076)

When Claude OAuth refresh token is expired, the SDK yields a result chunk
with is_error=true and no session_id. Both handleStreamMode and
handleBatchMode guarded the result branch with `&& msg.sessionId`,
silently dropping the error. Users saw no response at all.

Changes:
- Remove sessionId guard from result branches in orchestrator-agent.ts
- Add isError early-exit that sends error message to user
- Add 4 OAuth patterns to AUTH_PATTERNS in claude.ts and codex.ts
- Add OAuth refresh-token handler to error-formatter.ts
- Add tests for new error-formatter branches

Fixes #1076

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add structured logging to isError path and remove overly broad auth pattern

- Add getLog().warn({ conversationId, errorSubtype }, 'ai_result_error') in both
  handleStreamMode and handleBatchMode isError branches so auth failures are
  visible server-side instead of silently swallowed
- Remove 'access token' from AUTH_PATTERNS in claude.ts and codex.ts; the real
  OAuth refresh error is already covered by 'refresh token' and 'could not be
  refreshed', eliminating false-positive auth classification risk

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: route isError results through classifyAndFormatError with provider-specific messages

The isError path in stream/batch mode used a hardcoded generic message,
bypassing the classifyAndFormatError infrastructure. Now constructs a
synthetic Error from errorSubtype and routes through the formatter.

Error formatter updated with provider-specific auth detection:
- Claude: OAuth token refresh, sign-in expired → guidance to run /login
- Codex: 401 retry exhaustion → guidance to run codex login
- General: tightened patterns (removed broad 'auth error' substring match)

Also persists session ID before early-returning on isError.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:36:40 -05:00
Cole Medin
818854474f
fix(workflows): stop warning about model/provider on loop nodes (#1090)
* fix(workflows): stop warning about model/provider on loop nodes (#1082)

The loader incorrectly classified loop nodes as "non-AI nodes" and warned
that model/provider fields were ignored, even though the DAG executor has
supported these fields on loop nodes since commit 594d5daa.

Changes:
- Add LOOP_NODE_AI_FIELDS constant excluding model/provider from the warn list
- Update loader to use LOOP_NODE_AI_FIELDS for loop node field checking
- Fix BASH_NODE_AI_FIELDS comment that incorrectly referenced loop nodes
- Add tests for loop node model/provider acceptance and unsupported field warnings

Fixes #1082

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(workflows): update stale comment and add LOOP_NODE_AI_FIELDS unit tests

- Update section comment from "bash/loop nodes" to "non-AI nodes" since loop
  nodes do support model/provider (the fix in this PR)
- Export LOOP_NODE_AI_FIELDS from schemas/index.ts alongside BASH/SCRIPT variants
- Add dedicated describe block in schemas.test.ts verifying that model and
  provider are excluded and all other BASH_NODE_AI_FIELDS are still present

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* simplify: merge nodeType and aiFields into a single if/else chain in parseDagNode

Eliminates the separate isNonAiNode predicate and nested ternary for aiFields
selection by combining both into one explicit if/else block — each branch sets
nodeType and aiFields together, removing the need to check the node type twice.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:19:18 -05:00
Cole Medin
a5e5d5ceeb fix: address review findings for grammY Telegram adapter
- Fix misleading 'unde***' log when ctx.from is undefined; use 'unknown'
  to match the Slack/Discord adapter pattern
- Log post-startup bot runtime errors before reject() (no-op after
  onStart fires but errors are now visible in logs)
- Add debug log when message is dropped due to no handler registered
- Add stop() unit test to guard against grammY API rename regressions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 09:15:47 -05:00
Cole Medin
da1f8b7d97 fix: replace Telegraf with grammY to fix Bun TypeError crash (#1042)
Telegraf v4's internal `redactToken()` assigns to readonly `error.message`
properties, which crashes under Bun's strict ESM mode. Telegraf is EOL.

Changes:
- Replace `telegraf` dependency with `grammy` ^1.36.0
- Migrate adapter from Telegraf API to grammY API (Bot, bot.api, bot.start)
- Use grammY's `onStart` callback pattern for async polling launch
- Preserve 409 retry logic and all existing behavior
- Update test mocks from telegraf types to grammy types

Fixes #1042

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:15:46 -05:00
Cole Medin
2732288f07
Merge pull request #1065 from coleam00/archon/task-fix-issue-1055
feat(core): inject workflow run context into orchestrator prompt
2026-04-16 07:55:00 -05:00
Cole Medin
b100cd4b48
Merge pull request #1064 from coleam00/archon/task-fix-issue-1054
fix(web): interleave tool calls with text during SSE streaming
2026-04-16 07:44:48 -05:00
Cole Medin
5acf5640c8
Merge pull request #1063 from coleam00/archon/task-fix-issue-1035
fix: archon setup --spawn fails on Windows with spaces in repo path
2026-04-16 07:36:58 -05:00
Cole Medin
68ecb75f0f
Merge pull request #1052 from coleam00/archon/task-fix-github-issue-1775831868291
fix(cli): send workflow dispatch/result messages for Web UI cards
2026-04-16 07:32:52 -05:00
Cole Medin
51b8652d43 fix: complete defensive chaining and add missing test coverage for PR #1052
- Fix half-applied optional chaining in WorkflowProgressCard refetchInterval
  (query.state.data?.run.status → ?.run?.status) preventing TypeError in polling
- Add dispatch-failure test verifying executeWorkflow still runs when
  dispatch sendMessage fails
- Add paused-workflow test proving paused guard fires before summary check
- Strengthen dispatch metadata assertion to verify workerConversationId format

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 07:32:37 -05:00
Rasmus Widing
882fc58f7c
fix: stop server startup from auto-failing in-flight workflow runs (#1216) (#1231)
* fix: stop server startup from auto-failing in-flight workflow runs (#1216)

`failOrphanedRuns()` at server startup unconditionally flipped every
`running` workflow row to `failed`, including runs actively executing in
another process (CLI / adapters). The dag-executor's between-layer
status check then bailed out of the run, exit code 1 — even though every
node had completed successfully. Same class of bug the CLI already
learned (see comment at packages/cli/src/cli.ts:256-258).

Per the new CLAUDE.md principle "No Autonomous Lifecycle Mutation Across
Process Boundaries", we don't replace the call with a timer-based
heuristic. Instead we remove it and surface running workflows to the
user with one-click actions.

Backend
- `packages/server/src/index.ts` — remove the `failOrphanedRuns()` call
  at startup. Replace with explanatory comment referencing the CLI
  precedent and the CLAUDE.md principle. The function in
  `packages/core/src/db/workflows.ts:911` is preserved for use by the
  explicit `archon workflow cleanup` command.

UI
- `packages/web/src/components/layout/TopNav.tsx` — replace the binary
  pulse dot on the Dashboard nav with a numeric count badge sourced
  from `/api/dashboard/runs` `counts.running`. Hidden when count is 0.
  Same 10s polling interval as before. No animation — a steady factual
  count is honest; a pulse would imply system judgment.

- `packages/web/src/components/dashboard/ConfirmRunActionDialog.tsx`
  (new) — shadcn AlertDialog wrapper for destructive workflow-run
  actions, mirroring the codebase-delete pattern in
  `sidebar/ProjectSelector.tsx`. Caller passes the existing button as
  `trigger` slot; dialog handles open/close via Radix.

- `packages/web/src/components/dashboard/WorkflowRunCard.tsx` — replace
  4 `window.confirm()` callsites (Reject, Abandon, Cancel, Delete) with
  ConfirmRunActionDialog. Each gets a context-appropriate description.

- `packages/web/src/components/dashboard/WorkflowHistoryTable.tsx` —
  replace 1 `window.confirm()` (Delete) with the same dialog.

CHANGELOG entries under [Unreleased]: Fixed for #1216, two Changed
entries for the nav badge and dialog upgrade.

No new tests: the web package has no React component testing
infrastructure (existing `bun test` covers `src/lib/` and `src/stores/`
only). Type-check + lint + manual UI verification + the backend
reproducer are the verification levels.

Closes #1216.

* review: address PR #1231 nits — stale doc + 3 code polish

PR review surfaced one real correctness issue in docs and three small
code polish items. None block merge; addressing for cleanliness.

- packages/docs-web/src/content/docs/guides/authoring-workflows.md:486
  removed the "auto-marked as failed on next startup" paragraph that
  described the now-deleted behavior. Replaced with a "Crashed servers /
  orphaned runs" note pointing users at `archon workflow cleanup` and
  the dashboard Cancel/Abandon buttons; explains the auto-resume
  mechanism still works once the row reaches a terminal status.

- ConfirmRunActionDialog: narrow `onConfirm` from
  `() => void | Promise<void>` to `() => void`. All five callsites are
  synchronous wrappers around React Query mutations whose error
  handling lives at the page level (`runAction` in DashboardPage). The
  union widened the API for no current caller. Documented in the JSDoc
  what to do if an awaiting caller appears later.

- TopNav: dropped the redundant `String(runningCount)` cast in the
  aria-label — template literal coerces. Also rewrote the comment above
  the `listDashboardRuns` query: the previous version implied `limit=1`
  constrained `counts.running`; in fact `counts` is a server-side
  aggregate independent of `limit`, and `limit=1` only minimises the
  `runs` array we discard.

* review: correct remediation docs — cleanup ≠ abandon

CodeRabbit caught a factual error I introduced in the doc update:
`archon workflow cleanup` calls `deleteOldWorkflowRuns(days)` which
DELETEs old terminal rows (`completed`/`failed`/`cancelled` older than
N days) for disk hygiene. It does NOT transition stuck `running` rows.

The correct remediation for a stuck `running` row is either the
dashboard's per-row Cancel/Abandon button (already documented) or
`archon workflow abandon <run-id>` from the CLI (existing subcommand,
see packages/cli/src/cli.ts:366-374).

Fixed three locations:
- packages/docs-web/.../guides/authoring-workflows.md — replaced the
  vague "clean up explicitly" with concrete Web UI / CLI instructions
  and an explicit "Not to be confused with `archon workflow cleanup`"
  callout to close off the ambiguity CodeRabbit flagged.
- packages/server/src/index.ts — comment updated to point at the
  correct remediation (`archon workflow abandon`) and clarify that
  `archon workflow cleanup` is unrelated disk-hygiene.
- CHANGELOG.md — same correction in the [Unreleased] Fixed entry.
2026-04-15 12:05:41 +03:00
Rasmus Widing
5c8c39e5c9
fix(test): update stale mocks in cleanup-service 'continues processing' test (#1230) (#1232)
After PR #1034 changed worktree existence checks from execFileAsync to
fs/promises.access, the mockExecFileAsync rejections had no effect.
removeEnvironment needs getById + getCodebase mocks to proceed past
the early-return guard, otherwise envs route to report.skipped instead
of report.removed.

Replace the two stale mockExecFileAsync rejection calls with proper
mockGetById and mockGetCodebase return values for both test environments.

Fixes #1230
2026-04-15 11:53:02 +03:00
Shane McCarron
f61d576a4d
feat(isolation): auto-init submodules in worktrees (#1189)
Worktrees created via `git worktree add` do not initialize submodules — monorepo workflows that need submodule content find empty directories. Auto-detect `.gitmodules` and run `git submodule update --init --recursive` after worktree creation; classify failures through the isolation error pipeline.

Behavior:
- `.gitmodules` absent → skip silently (zero-cost probe, no effect on non-submodule repos)
- `.gitmodules` present → run submodule init by default (opt out via `worktree.initSubmodules: false`)
- submodule init or `.gitmodules` read failure → throw with classified error including opt-out guidance
- Only `ENOENT` on `.gitmodules` is treated as "no submodules"; other access errors (EACCES, EIO) surface as failures to prevent silent empty-dir worktrees
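
The error-classification rule above can be isolated as a pure decision function. This is simplified from the real `initSubmodules()` in worktree.ts, which also runs the git command; the helper name is illustrative.

```typescript
// Only ENOENT on .gitmodules means "no submodules"; any other probe
// error must fail fast, or the worktree silently ships empty
// submodule directories.
type ProbeDecision = 'skip' | 'init' | 'fail';

function classifyGitmodulesProbe(
  err: { code?: string } | null,
  initSubmodules: boolean, // worktree.initSubmodules, default true
): ProbeDecision {
  if (err === null) return initSubmodules ? 'init' : 'skip'; // .gitmodules present
  if (err.code === 'ENOENT') return 'skip'; // genuinely no submodules
  return 'fail'; // EACCES, EIO, ...: never mask as "no submodules"
}
```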

Changes:
- `packages/isolation/src/providers/worktree.ts` — `initSubmodules()` method + call site in `createWorktree()`
- `packages/isolation/src/errors.ts` — collapsed `errorPatterns` + `knownPatterns` into single `ERROR_PATTERNS` source of truth with `known: boolean` per entry; added submodule pattern with opt-out guidance
- `packages/isolation/src/types.ts` + `packages/core/src/config/config-types.ts` — new `initSubmodules?: boolean` config option
- `packages/docs-web/src/content/docs/reference/configuration.md` — documented the new option and submodule behavior
- Tests: default-on, explicit opt-in, explicit opt-out, skip-when-absent, fail-fast on EACCES, fail-fast on git failure, fail-fast on timeout

Credit to @halindrome for the original implementation and root-cause mapping across #1183, #1187, #1188, #1192.

Follow-up: #1192 (codebase identity rearchitect) would retire the cross-clone guard code in `resolver.ts` and `worktree.ts` that #1198 and #1206 added. Separate PR.

Closes #1187
2026-04-15 09:48:18 +03:00
Kagura
73d9240eb3
fix(isolation): complete reports false success when worktree remains on disk (fixes #964) (#1034)
* fix(isolation): complete reports false success when worktree remains on disk (fixes #964)

Three changes to prevent ghost worktrees:

1. isolationCompleteCommand now checks result.worktreeRemoved — if the
   worktree was not actually removed (partial failure), it reports
   'Partial' with warnings and counts as failed, not completed.
   Previously only skippedReason was checked; a destroy that returned
   successfully but with worktreeRemoved=false would still print
   'Completed'.

2. WorktreeProvider.destroy() now runs 'git worktree prune' after
   removal to clean up stale worktree references that git may keep
   even after the directory is removed.

3. WorktreeProvider.destroy() adds post-removal verification: after
   git worktree remove, it checks 'git worktree list --porcelain' to
   confirm the worktree is actually unregistered. If still registered,
   worktreeRemoved is set back to false with a descriptive warning.

* fix: address CodeRabbit review — ghost worktree prune, partial cleanup callers, accurate messages

* test: add regression test for Partial branch in isolation complete

Exercises the !result.worktreeRemoved path (without skippedReason)
that was flagged as uncovered by CodeRabbit review.
2026-04-14 17:58:45 +03:00
Matt Chapman
28b258286f
Extra backticks for markdown block to fix formatting (#1218)
Adds extra backticks to the outer fence so nested code blocks render correctly.
2026-04-14 17:58:31 +03:00
Rasmus Widing
81859d6842
fix(providers): replace Claude SDK embed with explicit binary-path resolver (#1217)
* feat(providers): replace Claude SDK embed with explicit binary-path resolver

Drop `@anthropic-ai/claude-agent-sdk/embed` and resolve Claude Code via
CLAUDE_BIN_PATH env → assistants.claude.claudeBinaryPath config → throw
with install instructions. The embed's silent failure modes on macOS
(#1210) and Windows (#1087) become actionable errors with a documented
recovery path.

Dev mode (bun run) remains auto-resolved via node_modules. The setup
wizard auto-detects Claude Code by probing the native installer path
(~/.local/bin/claude), npm global cli.js, and PATH, then writes
CLAUDE_BIN_PATH to ~/.archon/.env. Dockerfile pre-sets CLAUDE_BIN_PATH
so extenders using the compiled binary keep working. Release workflow
gets negative and positive resolver smoke tests.

Docs, CHANGELOG, README, .env.example, CLAUDE.md, test-release and
archon skills all updated to reflect the curl-first install story.

Retires #1210, #1087, #1091 (never merged, now obsolete).
Implements #1176.

* fix(providers): only pass --no-env-file when spawning Claude via Bun/Node

`--no-env-file` is a Bun flag that prevents Bun from auto-loading
`.env` from the subprocess cwd. It is only meaningful when the Claude
Code executable is a `cli.js` file — in which case the SDK spawns it
via `bun`/`node` and the flag reaches the runtime.

When `CLAUDE_BIN_PATH` points at a native compiled Claude binary (e.g.
`~/.local/bin/claude` from the curl installer, which is Anthropic's
recommended default), the SDK executes the binary directly. Passing
`--no-env-file` then goes straight to the native binary, which
rejects it with `error: unknown option '--no-env-file'` and the
subprocess exits code 1.

Emit `executableArgs` only when the target is a `.js` file (dev mode
or explicit cli.js path). Caught by end-to-end smoke testing against
the curl-installed native Claude binary.

* docs: record env-leak validation result in provider comment

Verified end-to-end with sentinel `.env` and `.env.local` files in a
workflow CWD that the native Claude binary (curl installer) does not
auto-load `.env` files. With Archon's full spawn pathway and parent
env stripped, the subprocess saw both sentinels as UNSET. The
first-layer protection in `@archon/paths` (#1067) handles the
inheritance leak; `--no-env-file` only matters for the Bun-spawned
cli.js path, where it is still emitted.

* chore(providers): cleanup pass — exports, docs, troubleshooting

Final-sweep cleanup tied to the binary-resolver PR:

- Mirror Codex's package surface for the new Claude resolver: add
  `./claude/binary-resolver` subpath export and re-export
  `resolveClaudeBinaryPath` + `claudeFileExists` from the package
  index. Renames the previously single `fileExists` re-export to
  `codexFileExists` for symmetry; nothing outside the providers
  package was importing it.
- Add a "Claude Code not found" entry to the troubleshooting reference
  doc with platform-specific install snippets and pointers to the
  AI Assistants binary-path section.
- Reframe the example claudeBinaryPath in reference/configuration.md
  away from cli.js-only language; it accepts either the native binary
  or cli.js.

* test+refactor(providers, cli): address PR review feedback

Two test gaps and one doc nit from the PR review (#1217):

- Extract the `--no-env-file` decision into a pure exported helper
  `shouldPassNoEnvFile(cliPath)` so the native-binary branch is unit
  testable without mocking `BUNDLED_IS_BINARY` or running the full
  sendQuery pathway. Six new tests cover undefined, cli.js, native
  binary (Linux + Windows), Homebrew symlink, and suffix-only matching.
  Also adds a `claude.subprocess_env_file_flag` debug log so the
  security-adjacent decision is auditable.

- Extract the three install-location probes in setup.ts into exported
  wrappers (`probeFileExists`, `probeNpmRoot`, `probeWhichClaude`) and
  export `detectClaudeExecutablePath` itself, so the probe order can be
  spied on. Six new tests cover each tier winning, fall-through
  ordering, npm-tier skip when not installed, and the
  which-resolved-but-stale-path edge case.

- CLAUDE.md `claudeBinaryPath` placeholder updated to reflect that the
  field accepts either the native binary or cli.js (the example value
  was previously `/absolute/path/to/cli.js`, slightly misleading now
  that the curl-installer native binary is the default).
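The suffix-only decision described above can be sketched as a pure helper. This is an illustrative reconstruction based on the commit text, not the actual export from the providers package:

```typescript
// Sketch: pass Bun's --no-env-file only when the Claude entrypoint is
// a .js file that will be spawned via bun/node. Native compiled
// binaries reject the flag with "unknown option", so they must not
// receive it. Name and signature are assumptions from the commit text.
function shouldPassNoEnvFile(cliPath: string | undefined): boolean {
  if (!cliPath) return false; // no explicit path: nothing to decide
  return cliPath.toLowerCase().endsWith(".js"); // dev mode or explicit cli.js
}
```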

Skipped from the review by deliberate scope decision:

- `resolveClaudeBinaryPath` async-with-no-await: matches Codex's
  resolver signature exactly. Changing only Claude breaks symmetry;
  if pursued, do both providers in a separate cleanup PR.
- `isAbsolute()` validation in parseClaudeConfig: Codex doesn't do it
  either. Resolver throws on non-existence already.
- Atomic `.env` writes in setup wizard: pre-existing pattern this PR
  touched only adjacently. File as separate issue if needed.
- classifyError branch in dag-executor for setup errors: scope creep.
- `.env.example` "missing #" claim: false positive (verified all
  CLAUDE_BIN_PATH lines have proper comment prefixes).

* fix(test): use path.join in Windows-compatible probe-order test

The "tier 2 wins (npm cli.js)" test hardcoded forward-slash path
comparisons, but `path.join` produces backslashes on Windows. Caused
the Windows CI leg of the test suite to fail while macOS and Linux
passed. Use `path.join` for both the mock return value and the
expectation so the separator matches whatever the platform produces.
2026-04-14 17:56:37 +03:00
Rasmus Widing
33d31c44f1
fix: lock workflow runs by working_path (#1036, #1188 part 2) (#1212)
* fix: lock workflow runs by working_path (#1036, #1188 part 2)

Both bugs reduce to the same primitive: there's no enforced lock on
working_path, so two dispatches that resolve to the same filesystem
location can race. The DB row is the lock token; pending/running/paused
are "lock held"; terminal statuses release.

Changes:

- getActiveWorkflowRunByPath includes `pending` (with 5-min stale-orphan
  age window), accepts excludeId + selfStartedAt, and orders by
  (started_at ASC, id ASC) for a deterministic older-wins tiebreaker.
  Eliminates the both-abort race where two near-simultaneous dispatches
  with similar timestamps could mutually abort each other.

- Move the executor's guard call site to AFTER workflowRun is finalized
  (preCreated, resumed, or freshly created). This guarantees we always
  have self-ID + started_at to pass to the lock query.

- On guard fire after row creation: mark self as 'cancelled' so we don't
  leave a zombie pending row that would then become its own lock holder.

- New error message includes workflow name, duration, short run id, and
  three concrete next-action commands (status / cancel / different
  branch). Replaces the vague "Workflow already running".

- Resume orphan fix: when executor activates a resumable run, mark the
  orchestrator's pre-created row as 'cancelled'. Without this, every
  resume leaks a pending row that would block the user's own
  back-to-back resume until the 5-min stale window.

- New formatDuration helper for the error message (8 unit tests).
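The "DB row is the lock token" semantics above can be sketched as a predicate. Status names come from the commit; the field names and the exact stale-window check are assumptions:

```typescript
// Sketch of the path-lock semantics: pending/running/paused hold the
// lock, terminal statuses release it, and pending rows older than the
// 5-minute stale-orphan window are ignored.
const STALE_PENDING_MS = 5 * 60 * 1000;

function holdsPathLock(
  row: { status: string; startedAtMs: number },
  nowMs: number,
): boolean {
  if (row.status === "running" || row.status === "paused") return true;
  if (row.status === "pending") {
    return nowMs - row.startedAtMs < STALE_PENDING_MS; // fresh pending only
  }
  return false; // completed/failed/cancelled release the lock
}
```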

Tests:

- 5 new tests in db/workflows.test.ts: pending in active set, age window,
  excludeId exclusion, tiebreaker SQL shape, ordering.
- 5 new tests in executor.test.ts: self-id passed to query, self-cancel
  on guard fire, new message format, resume orphan cancellation,
  resume proceeds even if orphan cancel fails.
- Updated 2 executor-preamble tests for new structural behavior
  (row-then-guard, new message format).
- 8 new tests for formatDuration.

Deferred (kept scope tight):
- Worktree-layer advisory lockfile (residual #1188.2 microsecond race
  where both dispatches reach provider.create — bounded by git's own
  atomicity for `worktree add`).
- Startup cleanup of pre-existing stale pending rows (5-min age window
  makes them harmless).
- DB partial UNIQUE constraint migration (code-only is sufficient).

Fixes #1036
Fixes #1188 (part 2)

* fix: SQLite Date binding + UTC timestamp parse for path lock guard

Two issues found during E2E smoke testing:

1. bun:sqlite rejects Date objects as bindings ("Binding expected
   string, TypedArray, boolean, number, bigint or null"). Serialize
   selfStartedAt to ISO string before passing — PostgreSQL accepts
   ISO strings for TIMESTAMPTZ comparison too.

2. SQLite returns datetimes as plain strings without timezone suffix
   ("YYYY-MM-DD HH:MM:SS"), and JS new Date() parses such strings as
   local time. The blocking message was showing "running 3h" for
   workflows started seconds ago in a UTC+3 timezone.

   Added parseDbTimestamp helper that:
   - Returns Date.getTime() unchanged for Date inputs (PG path)
   - Treats SQLite-style strings as UTC by appending Z

   Used at both call sites: the lock query (selfStartedAt) and the
   blocking message duration.
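The helper's behavior as described can be sketched roughly as follows (an assumed shape, not the actual implementation in the duration module):

```typescript
// Sketch of parseDbTimestamp: Date objects pass through (PG path);
// strings with an explicit Z or +/-HH:MM offset are trusted as-is;
// SQLite-style "YYYY-MM-DD HH:MM:SS" strings are interpreted as UTC
// by normalizing to ISO form with a trailing Z.
function parseDbTimestamp(value: Date | string): number {
  if (value instanceof Date) return value.getTime(); // PG path
  if (/(?:Z|[+-]\d{2}:\d{2})$/.test(value)) return new Date(value).getTime();
  return new Date(value.replace(" ", "T") + "Z").getTime(); // SQLite → UTC
}
```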

Tests:
- 4 new tests in duration.test.ts for parseDbTimestamp covering
  Date input, SQLite UTC interpretation, explicit Z, and explicit
  +/-HH:MM offsets.
- Updated workflows.test.ts assertion for ISO serialization.

E2E smoke verified end-to-end:
- Sanity (single dispatch) succeeds.
- Two concurrent --no-worktree dispatches: one wins, one blocked
  with actionable message showing correct "Xs" duration.
- Resume + back-to-back resume both succeed (orphan correctly
  cancelled when resume activates).

* fix: address review — resume timestamp, lock-leak paths, status copy

CodeRabbit review on #1212 surfaced three real correctness gaps:

CRITICAL — resumeWorkflowRun preserved historical started_at, letting
a resumed row sort ahead of a currently-active holder in the lock
query's older-wins tiebreaker. Two active workflows could end up on
the same working_path. Fix: refresh started_at to NOW() in
resumeWorkflowRun. Original creation time is recoverable from
workflow_events history if needed for analytics.

MAJOR — lock-leak failure paths:
- If resumeWorkflowRun() throws, the orchestrator's pre-created row
  was left as 'pending' until the 5-min stale window. Fix: cancel
  preCreatedRun in the resume catch.
- If getActiveWorkflowRunByPath() throws, workflowRun (possibly
  already promoted to 'running' via resume) was left active with no
  auto-cleanup. Fix: cancel workflowRun in the guard catch.

MINOR — the blocking message always said "running" but the lock
query returns running, paused, AND fresh-pending rows. Telling a
user to "wait for it to finish" on a paused run (waiting on user
approval) would block them indefinitely. Fix: status-aware copy:
- paused: "paused waiting for user input" + approve/reject actions
- pending: "starting" verb
- running: keep current

Tests:
- New: resume refreshes started_at (asserts SQL contains
  `started_at = NOW()`)
- New: cancels preCreatedRun when resumeWorkflowRun throws
- New: cancels workflowRun when guard query throws
- New: paused message uses approve/reject actions, NOT "wait"
- New: pending message uses "starting" verb
- New: running message uses default copy
- Updated: existing tests for new error string ("already active"
  reflects status-aware semantics, not just "running")

Note: the user-facing error string changed from "already running on
this path" to "already active on this path (status)". Internal use
only — surfaced via getResult().error, not directly to users.

* fix: SQLite tiebreaker dialect bug + paired self struct + UX polish

CodeRabbit second review found one critical issue and several polish
items not addressed in 008013da.

CRITICAL — SQLite tiebreaker silently broken under default deployment.
SQLite stores started_at as TEXT "YYYY-MM-DD HH:MM:SS" (space sep).
Our ISO param is "YYYY-MM-DDTHH:MM:SS.mmmZ" (T sep). SQLite compares
text lexically: char 11 is space (0x20) in column vs T (0x54) in param,
so EVERY column value lex-sorts before EVERY ISO param. Result:
`started_at < $param` is always TRUE regardless of actual time. In
true concurrent dispatches, both sides see each other as "older" and
both abort — defeating the older-wins guarantee under SQLite, which
is the default deployment.

Fix: dialect-aware comparison in getActiveWorkflowRunByPath:
  - PostgreSQL: `started_at < $3::timestamptz` (TIMESTAMPTZ + cast)
  - SQLite: `datetime(started_at) < datetime($3)` (forces chronological
    via SQLite's date/time functions)

Documented with reproducer tests in adapters/sqlite.test.ts: lexical
returns wrong answer for "2026-04-14 12:00:00" < "2026-04-14T10:00:00Z";
datetime() returns correct answer.
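The lexical-vs-chronological mismatch is reproducible with plain string comparison, using the values from the reproducer tests:

```typescript
// SQLite compares TEXT lexically. The first 10 chars agree, then the
// column has a space (0x20) where the ISO binding has 'T' (0x54), so
// the column always sorts before the parameter regardless of time.
const column = "2026-04-14 12:00:00";     // SQLite TEXT storage format
const param = "2026-04-14T10:00:00.000Z"; // ISO-serialized binding

const lexicallyOlder = column < param; // true — the bug
const chronologicallyOlder =
  new Date(column.replace(" ", "T") + "Z").getTime() <
  new Date(param).getTime(); // false — 12:00Z is after 10:00Z
```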

Type design — collapse paired params into struct.
`excludeId` and `selfStartedAt` had to travel together (tiebreaker
references both) but were two independent optionals — future callers
could pass one without the other and silently degrade. Replaced with
a single `self?: { id: string; startedAt: Date }` to make the
paired-or-nothing invariant structural.

formatDuration(0) consistency.
Old: `if (ms <= 0) return '0s'` — special-cased 0ms despite the
"sub-second rounds up to 1s" comment. Fixed to `ms < 0` so 0ms
returns '1s' (a run that just started in the same DB second should
display as active, not literal zero).
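The fixed boundary behavior can be sketched as follows (seconds/minutes only; the real helper presumably also renders hours, given the "running 3h" message elsewhere in this PR):

```typescript
// Sketch: only strictly negative input clamps to '0s'; 0ms and any
// sub-second value round up to '1s', consistent with the comment the
// old `ms <= 0` special case contradicted.
function formatDuration(ms: number): string {
  if (ms < 0) return "0s"; // was `ms <= 0` before the fix
  const totalSeconds = Math.max(1, Math.ceil(ms / 1000)); // sub-second → 1s
  if (totalSeconds < 60) return `${totalSeconds}s`;
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return seconds === 0 ? `${minutes}m` : `${minutes}m ${seconds}s`;
}
```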

Comment fix: "We acquired the lock via createWorkflowRun" was
misleading — createWorkflowRun creates a row; the lock is determined
later by the query.

Log context: added cwd to workflow.guard_self_cancel_failed and
pendingRunId to db_active_workflow_check_failed so operators can
correlate leaked rows.

Doc fixes:
- /workflow abandon doc said "marks as failed" — actually 'cancelled'
- database.md "Prevents concurrent workflow execution" → accurate
  description of path-based lock with stale-pending tolerance

Test additions:
- 3 SQLite-direct tests in adapters/sqlite.test.ts proving the
  lexical-vs-chronological bug and the datetime() fix
- Guard self-cancel update throw still surfaces failure to user

Signature change rippled through:
- IWorkflowStore.getActiveWorkflowRunByPath now takes (path, self?)
- All internal callers updated
2026-04-14 15:19:38 +03:00
Rasmus Widing
5a4541b391
fix: route canonical path failures through blocked classification (#1211)
Follow-up to #1206 review: the early getCanonicalRepoPath() wrap in
resolve() threw directly, escaping the classification flow that
createNewEnvironment uses. Permission errors, malformed worktree
pointers, ENOENT, etc. surfaced as unclassified crashes instead of
becoming an actionable `blocked` result.

Mirror createNewEnvironment's contract:
- isKnownIsolationError → return { status: 'blocked', reason:
  'creation_failed', userMessage: classifyIsolationError(err) + suffix }
- unknown errors → throw (programming bugs stay visible as crashes,
  not silent isolation failures)
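The classify-or-throw contract mirrored here can be sketched as below. The error codes, result shape, and message text are illustrative, not the actual classifyIsolationError table:

```typescript
type ResolveResult =
  | { status: "resolved"; path: string }
  | { status: "blocked"; reason: "creation_failed"; userMessage: string };

// Codes treated as "known isolation errors" here are illustrative.
const KNOWN_ISOLATION_CODES = new Set(["EACCES", "EPERM", "ENOENT"]);

function classifyOrThrow(err: unknown): ResolveResult {
  const code = (err as { code?: string } | null)?.code;
  if (code && KNOWN_ISOLATION_CODES.has(code)) {
    return {
      status: "blocked",
      reason: "creation_failed",
      userMessage: `Could not access the repository (${code}). Check path and permissions.`,
    };
  }
  throw err; // unknown errors stay visible as crashes, not silent failures
}
```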

Adds two tests in resolver.test.ts:
- EACCES classifies to "Permission denied" blocked message
- Unknown error propagates as throw

Addresses CodeRabbit review comment on #1206.
2026-04-14 15:19:13 +03:00
Rasmus Widing
fd3f043125
fix: extend worktree ownership guard to resolver adoption paths (#1206)
* fix: extend worktree ownership guard to resolver adoption paths (#1183, #1188)

PR #1198 guarded WorktreeProvider.findExisting(), but IsolationResolver
has three earlier adoption paths that bypass the provider layer:

- findReusable (DB lookup by workflow identity)
- findLinkedIssueEnv (cross-reference via linked issues)
- tryBranchAdoption (PR branch discovery)

Two clones of the same remote share codebase_id (identity is derived
from owner/repo). Without these guards, clone B silently adopts clone
A's worktree via any of the three paths.

Changes:
- Extract verifyWorktreeOwnership from WorktreeProvider (private) to
  @archon/git/src/worktree.ts as an exported function, sitting next to
  getCanonicalRepoPath which parses the same .git file format
- Call the shared function from all three resolver paths; throw on
  cross-clone mismatch (DB rows are preserved — they legitimately
  belong to the other clone)
- Compute canonicalRepoPath once at the top of resolve()
- Six new tests in resolver.test.ts covering each guarded path's
  cross-checkout and same-clone behaviors

Fixes #1183
Fixes #1188 (part 1 — cross-checkout; part 2 parallel collision deferred
to follow-up alongside #1036)

* fix: address PR review — polish, observability, secondary gap, docs

Addresses the multi-agent review on #1206:

Code fixes:
- worktree.adoption_refused_cross_checkout log event renamed to match
  CLAUDE.md {domain}.{action}_{state} convention
- verifyWorktreeOwnership now preserves err.code and err via { cause }
  when wrapping fs errors, so classifyIsolationError is robust to Node
  message format changes
- Structured fields (codebaseId, canonicalRepoPath) added to all
  cross-clone rejection logs for incident debugging
- Wrap getCanonicalRepoPath at top of resolve() with classified error
  instead of letting it propagate as an unclassified crash
- Extract assertWorktreeOwnership helper on IsolationResolver —
  centralizes warn-then-rethrow contract, removes duplication
- Dedupe toWorktreePath(existing.working_path) calls in resolver paths
- Add code comment on findLinkedIssueEnv explaining why throw-on-first
  is intentional (user decision — surfaces anomaly instead of masking)

Secondary gap closed:
- WorktreeProvider.findExisting PR-branch adoption path
  (findWorktreeByBranch) now also verifies ownership — same class of
  bug as the main path, just via a different lookup

Tests:
- 8 new unit tests for verifyWorktreeOwnership in @archon/git
  (matching pointer, different clone, EISDIR/ENOENT errno preservation,
  submodule pointer, corrupted .git, trailing-slash normalization,
  cause chain)
- tryBranchAdoption cross-clone test now asserts store.create was
  never called (symmetry with paths 1+2 asserting updateStatus)
- New test for cross-clone rejection in the PR-branch-adoption
  secondary path in worktree.test.ts

Docs:
- CHANGELOG.md Unreleased entry for the cross-clone fix series
- troubleshooting.md "Worktree Belongs to a Different Clone" section
  documenting all four new error patterns with resolution steps and
  pointer to #1192 for the architectural fix

* fix(git): use raw .git pointer in cross-clone error message

verifyWorktreeOwnership previously called path.resolve() on the gitdir
path before embedding it in the error message. On Windows, resolve()
prepends a drive letter to a POSIX-style path (e.g., /other/clone →
C:\other\clone), which:

1. Misled users by showing a path that doesn't match what's actually
   in their .git file
2. Broke a Windows-only test asserting the error contains the literal
   /other/clone path

Compare on resolved paths (correct — normalizes trailing slashes and
relative components for the equality check) but display the raw match
in the error message (recognizable to the user).
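The pointer-file handling can be sketched with pure helpers. Names are hypothetical; the real verifyWorktreeOwnership also stats the file and classifies errno:

```typescript
// A linked worktree's .git is a FILE, not a directory, containing a
// single "gitdir: <path>" line that points into the parent clone's
// .git/worktrees/<name>. Keep the raw pointer for error messages;
// derive the parent root for the resolved-path equality check.
function parseWorktreeGitdir(pointerFileContents: string): string | null {
  const match = pointerFileContents.match(/^gitdir:\s*(.+)$/m);
  return match ? match[1].trim() : null;
}

function parentCloneRoot(gitdir: string): string {
  // "/clone-a/.git/worktrees/feature-x" -> "/clone-a"
  return gitdir.replace(/[\\/]\.git[\\/]worktrees[\\/][^\\/]+$/, "");
}
```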
2026-04-14 12:10:19 +03:00
Rasmus Widing
af9ed84157
fix: prevent worktree isolation bypass via prompt and git-level adoption (#1198)
* fix: prevent worktree isolation bypass via prompt and git-level adoption (#1193, #1188)

Three fixes for workflows operating on wrong branches:

- archon-implement prompt: replace ambiguous branch table with decision
  tree that trusts the worktree isolation system, uses $BASE_BRANCH
  explicitly, and instructs AI to never switch branches
- WorktreeProvider.findExisting: verify worktree's parent repo matches
  the request before adopting, preventing cross-clone adoption
- WorktreeProvider.createNewBranch: reset stale orphan branches to the
  intended start-point instead of silently inheriting old commits

Fixes #1193
Relates to #1188

* fix: address PR review — strict worktree verification, align sibling prompts

Address CodeRabbit + self-review findings on #1198:

Code fixes:
- findExisting now throws on cross-checkout or unverifiable state instead of
  returning null, avoiding a confusing cascade through createNewBranch
- verifyWorktreeOwnership handles .git errors precisely: ENOENT/EACCES/EIO
  throw a fail-fast error; EISDIR (full checkout at path) throws a clear
  "not a worktree" error; unmatched gitdir (submodule, malformed) throws
- Path comparison uses resolve() to normalize trailing slashes
- Added classifyIsolationError patterns so new errors produce actionable
  user messages

Test fixes:
- mockClear readFile/rm in afterEach
- New tests: cross-checkout throws, EISDIR throws, EACCES throws,
  submodule pointer throws, trailing-slash normalization, branch -f
  reset failure propagates without retry
- Updated existing tests that relied on permissive adoption to provide
  valid matching gitdir

Prompt fixes (sweep of all default commands):
- archon-implement.md: clarify "never switch branches" applies to worktree
  context; non-worktree branch creation still allowed
- archon-fix-issue.md + archon-implement-issue.md: aligned decision tree
  with archon-implement pattern; use $BASE_BRANCH instead of MAIN/MASTER
- archon-plan-setup.md: converted table to ordered decision tree with
  IN WORKTREE? first; removed ambiguous "already on correct feature
  branch" row
2026-04-14 09:44:12 +03:00
Rasmus Widing
d6e24f5075
feat: Phase 2 — community-friendly provider registry system (#1195)
* feat: replace hardcoded provider factory with typed registry system

Replace the built-in-only factory switch with a typed ProviderRegistration
registry where entries carry metadata (displayName, capabilities,
isModelCompatible) alongside the factory function. This enables community
providers to register without modifying core code.

- Add ProviderRegistration and ProviderInfo types to contract layer
- Create registry.ts with register/get/list/clear API, delete factory.ts
- Bootstrap registerBuiltinProviders() at server and CLI entrypoints
- Widen provider unions from 'claude' | 'codex' to string across schemas,
  config types, deps, executors, and API validation
- Replace hardcoded model-validation with registry-driven isModelCompatible
  and inferProviderFromModel (built-in only inference)
- Add GET /api/providers endpoint returning registry metadata
- Dynamic provider dropdowns in Web UI (BuilderToolbar, NodeInspector,
  WorkflowBuilder, SettingsPage) via useProviders hook
- Dynamic provider selection in CLI setup command
- Registry test suite covering full lifecycle
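A minimal sketch of the registry surface described above. The `ProviderRegistration` fields beyond id/factory are assumptions loosely based on the commit text (displayName, capabilities, isModelCompatible):

```typescript
interface ProviderRegistration {
  id: string;
  displayName: string;
  builtIn: boolean;
  isModelCompatible?: (model: string) => boolean;
  factory: (config?: unknown) => unknown;
}

const providers = new Map<string, ProviderRegistration>();

function registerProvider(reg: ProviderRegistration): void {
  providers.set(reg.id, reg); // community providers call this too
}

function getProvider(id: string): ProviderRegistration {
  const entry = providers.get(id);
  if (!entry) throw new Error(`Unknown provider: ${id}`); // fail fast
  return entry;
}

function listProviders(): ProviderRegistration[] {
  return [...providers.values()]; // metadata for GET /api/providers
}
```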

* feat: generalize assistant config and tighten registry validation

- Add ProviderDefaults/ProviderDefaultsMap generic types to contract layer
- Add index signatures to ClaudeProviderDefaults/CodexProviderDefaults
- Introduce AssistantDefaults/AssistantDefaultsConfig intersection types
  that combine ProviderDefaultsMap with typed built-in entries
- Replace hardcoded claude/codex config merging with generic
  mergeAssistantDefaults() that iterates all provider entries
- Replace hardcoded toSafeConfig projection with generic
  toSafeAssistantDefaults() that strips server-internal fields
- Validate provider strings at all config-entry surfaces: env override,
  global config, repo config all throw on unknown providers
- Validate provider on PATCH /api/config/assistants (400 on unknown)
- Move validator.ts from hardcoded Codex checks to capability-driven
  warnings using registry getProviderCapabilities()
- Remove resolveProvider() default to 'claude' — returns undefined when
  no provider is set, skipping capability warnings for unresolved nodes
- Widen config API schemas to generic Record<string, ProviderDefaults>
- Rewrite SettingsPage to iterate providers dynamically with built-in
  specific UI for Claude/Codex and generic JSON view for community
- Extract bootstrap to provider-bootstrap modules in CLI and server
- Remove all as Record<...> casts from dag-executor, executor,
  orchestrator — clean indexing via ProviderDefaultsMap intersection

* fix: remove remaining hardcoded provider assumptions and regenerate types

- Replace hardcoded 'claude' defaults in CLI setup with registry lookup
  (getRegisteredProviders().find(p => p.builtIn)?.id)
- Replace hardcoded 'claude' default in clone.ts folder detection with
  registry-driven fallback
- Update config YAML comment from "claude or codex" to "registered provider"
- Make bootstrap test assertions use toContain instead of exact toEqual
  so they don't break when community providers are registered
- Widen validator.test.ts helper from 'claude' | 'codex' to string
- Remove unnecessary type casts in NodeInspector, WorkflowBuilder,
  SettingsPage now that generated types use string
- Regenerate api.generated.d.ts from updated OpenAPI spec — all provider
  fields are now string instead of 'claude' | 'codex' union

* fix: address PR review findings — consistency, tests, docs

Critical fixes:
- isModelCompatible now throws on unknown providers (fail-fast parity
  with getProviderCapabilities) instead of silently returning true
- Schema provider fields use z.string().trim().min(1) to reject
  whitespace-only values
- validator.ts resolveProvider accepts defaultProvider param so
  capability warnings fire for config-inherited providers
- PATCH /api/config/assistants validates assistants keys against
  registry (rejects unknown provider IDs in the map)

YAGNI cleanup:
- Delete provider-bootstrap.ts wrappers in CLI and server — call
  registerBuiltinProviders() directly
- Remove no-op .map(provider => provider) in SettingsPage

Test coverage:
- Add GET /api/providers endpoint tests (shape, projection, capabilities)
- Add config-loader throw-path tests for unknown providers in env var,
  global config, and repo config
- Add isModelCompatible throw test for unknown providers

Docs:
- CLAUDE.md: factory.ts → registry.ts in directory tree, add
  GET /api/providers to API endpoints section
- .env.example: update DEFAULT_AI_ASSISTANT comment
- docs-web configuration reference: update provider constraint docs

UI:
- Settings default-assistant dropdown uses allProviderEntries fallback
  (no longer silently empty on API failure)
- clearRegistry marked @internal in JSDoc

* fix: use registry defaults in getDefaults/registerProject, document type design

- getDefaults() initializes assistant defaults from registered providers
  instead of hardcoding { claude: {}, codex: {} }
- getDefaults() uses first registered built-in as default assistant
  instead of hardcoding 'claude'
- handleRegisterProject uses config.assistant instead of hardcoded 'claude'
  for new codebase ai_assistant_type
- Document AssistantDefaults/AssistantDefaultsConfig intersection types:
  built-in keys are typed for parseClaudeConfig/parseCodexConfig type
  safety; community providers use the generic [string] index
- Document WorkflowConfig.assistants intersection type with same rationale

* docs: update stale provider references to reflect registry system

- architecture.md: DB schema comment now says 'registered provider'
- first-workflow.md: provider field accepts any registered provider
- quick-reference.md: provider type changed from enum to string
- authoring-workflows.md: provider type changed from enum to string
- title-generator.ts: @param doc updated from 'claude or codex' to
  generic provider identifier

* docs: fix remaining stale provider references in quick-reference and authoring guide

- quick-reference.md: per-node provider type changed from enum to string
- quick-reference.md: model mismatch guidance updated for registry pattern
- authoring-workflows.md: provider comment says 'any registered provider'
2026-04-13 21:27:11 +03:00
Rasmus Widing
b5c5f81c8a
refactor: extract provider metadata seam for Phase 2 registry readiness (#1185)
* refactor: extract provider metadata seam for Phase 2 registry readiness

- Add static capability constants (capabilities.ts) for Claude and Codex
- Export getProviderCapabilities() from @archon/providers for capability
  queries without provider instantiation
- Add inferProviderFromModel() to model-validation.ts, replacing three
  copy-pasted inline inference blocks in executor.ts and dag-executor.ts
- Replace throwaway provider instantiation in dag-executor with static
  capability lookup (getProviderCapabilities)
- Add orchestrator warning when env vars are configured but provider
  doesn't support envInjection

* refactor: address LOW findings from code review

- Remove CLAUDE_CAPABILITIES/CODEX_CAPABILITIES from public index (YAGNI —
  callers should use getProviderCapabilities(), not raw constants)
- Remove dead _deps parameter from resolveNodeProviderAndModel and its
  two call-sites (no longer needed after static capability lookup refactor)
- Update factory.ts module JSDoc to mention both exported functions
- Add edge-case tests for getProviderCapabilities: empty string and
  case-sensitive throws (parity with existing getAgentProvider tests)
- Add test for inferProviderFromModel with empty string (returns default,
  documenting the falsy-string shortcut)
2026-04-13 16:10:48 +03:00
Rasmus Widing
bf20063e5a
feat: propagate managed execution env to all workflow surfaces (#1161)
* Implement managed execution env propagation

* Address managed env review feedback
2026-04-13 15:21:57 +03:00
Rasmus Widing
a8ac3f057b
security: prevent target repo .env from leaking into subprocesses (#1135)
Remove the entire env-leak scanning/consent infrastructure: scanner,
allow_env_keys DB column usage, allow_target_repo_keys config, PATCH
consent route, --allow-env-keys CLI flag, and UI consent toggle.

The env-leak gate was the wrong primitive. Target repo .env protection
is already structural:
- stripCwdEnv() at boot removes Bun-auto-loaded CWD .env keys
- Archon loads its own env sources afterward (~/.archon/.env)
- process.env is clean before any subprocess spawns
- Managed env injection (config.yaml env: + DB vars) is unchanged

No scanning, no consent, no blocking. Any repo can be registered and
used. Subprocesses receive the already-clean process.env.
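The structural protection can be sketched with pure helpers. The real stripCwdEnv handles more .env syntax; this simplification ignores quoting edge cases:

```typescript
// Parse the key names a CWD .env file would have auto-loaded, then
// delete exactly those keys from the environment before any
// subprocess spawns. Comments and malformed lines are skipped.
function cwdEnvKeys(dotEnvContents: string): string[] {
  const keys: string[] = [];
  for (const line of dotEnvContents.split("\n")) {
    const m = line.match(/^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=/);
    if (m) keys.push(m[1]);
  }
  return keys;
}

function stripCwdEnv(
  dotEnvContents: string,
  env: Record<string, string | undefined>,
): void {
  for (const key of cwdEnvKeys(dotEnvContents)) delete env[key];
}
```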
2026-04-13 13:46:24 +03:00