mirror of
https://github.com/coleam00/Archon
synced 2026-04-21 13:37:41 +00:00
674 commits
7ea321419f
fix: initialize options.hooks before merging YAML node hooks (#1177)
When a workflow node defines hooks (PreToolUse/PostToolUse) in YAML but
no hooks exist yet on the options object, applyNodeConfig crashes with
"undefined is not an object" because it tries to assign properties on
the undefined options.hooks.
Initialize options.hooks to {} before the merge loop.
Reproduces with: archon workflow run archon-architect (which uses
per-node hooks extensively).
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ba4b9b47e6
docs(worktree): fix stale rename example + document copyFiles properly (#1328)
Three related fixes around the `worktree.copyFiles` primitive:

1. Remove the `.env.example -> .env` rename example from reference/configuration.md and getting-started/overview.md. The `->` parser was removed in #739 (2026-03-19) because it caused the stale-credentials production bug in #228, but the docs kept advertising it. A user writing `.env.example -> .env` today gets `parseCopyFileEntry` returning `{source: '.env.example -> .env', destination: '.env.example -> .env'}`, stat() fails with ENOENT, and the copy silently no-ops at debug level.

2. Replace the single-line "Default behavior: .archon/ is always copied" note with a proper "Worktree file copying" subsection that explains:
   - Why this exists (git worktree add = tracked files only; gitignored workflow inputs need this hook)
   - The `.archon/` default (no config needed for the common case)
   - Common entries: .env, .vscode/, .claude/, plans/, reports/, data fixtures
   - Semantics: source = destination, ENOENT silently skipped, per-entry error isolation, path traversal rejected
   - Interaction with `worktree.path` (both layouts get the same treatment)

3. Update the overview example to drop the `.env.example + .env` pair (which implied rename semantics) in favor of `.env + plans/`, and call out that `.archon/` is auto-copied so users don't need to list it.

No code changes. `bun run format:check` and `bun run lint` green.
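A minimal config sketch of the documented semantics. The entry names are illustrative; the commit only confirms that source always equals destination (no rename syntax) and that `.archon/` is copied by default.

```yaml
# .archon/config.yaml: illustrative worktree.copyFiles entries.
# Source and destination are always the same path; the `->` rename
# syntax was removed in #739 and is no longer parsed.
worktree:
  copyFiles:
    - .env      # gitignored secrets the worktree still needs
    - plans/    # gitignored workflow inputs
# .archon/ itself is always copied; no entry needed for it.
```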
08de8ee5c6
fix(web,server): show real platform connection status in Settings (#1061)
The Settings page's Platform Connections section hardcoded every platform except Web to 'Not configured', so users couldn't tell whether their Slack/Telegram/Discord/GitHub/Gitea/GitLab adapters had actually started.

- Server: /api/health now returns an activePlatforms array populated live as each adapter's start() resolves. It is passed into registerApiRoutes so the reference stays mutable; Telegram starts after the HTTP listener is already accepting requests, so a snapshot would miss it.
- Web: SettingsPage.PlatformConnectionsSection now reads activePlatforms from /api/health and looks each platform up in a Set. Also adds Gitea and GitLab to the list (they already ship as adapters).

Closes #1031
Co-authored-by: Lior Franko <liorfr@dreamgroup.com>
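The mutable-reference pattern on the server side can be sketched like this. Function and type names are illustrative stand-ins, not Archon's actual API.

```typescript
// Illustrative: the health route closes over the SAME array that adapter
// startup pushes into, so late-starting adapters (e.g. Telegram) show up
// in later /api/health responses. A snapshot copy would miss them.
const activePlatforms: string[] = [];

// Stand-in for registerApiRoutes: captures the array by reference.
function makeHealthHandler(platforms: string[]) {
  return () => ({ status: "ok", activePlatforms: platforms });
}

const healthHandler = makeHealthHandler(activePlatforms);

// Adapters report in as their start() resolves, even after the HTTP
// listener is already serving requests.
async function startAdapter(name: string): Promise<void> {
  // ...real adapter startup would happen here...
  activePlatforms.push(name);
}
```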
5ed38dc765
feat(isolation,workflows): worktree location + per-workflow isolation policy (#1310)
* feat(isolation): per-project worktree.path + collapse to two layouts

Adds an opt-in `worktree.path` to .archon/config.yaml so a repo can co-locate worktrees with its own checkout (`<repoRoot>/<path>/<branch>`) instead of the default `~/.archon/workspaces/<owner>/<repo>/worktrees/<branch>`. Requested in joelsb's #1117.

Primitive changes (clean up the graveyard rather than add parallel code paths):
- Collapse worktree layouts from three to two. The old "legacy global" layout (`~/.archon/worktrees/<owner>/<repo>/<branch>`) is gone; every repo resolves to the workspace-scoped layout (`~/.archon/workspaces/<owner>/<repo>/worktrees/<branch>`), whether it was archon-cloned or locally registered. `extractOwnerRepo()` on the repo path is the stable identity fallback. This ends the divergence where workspace-cloned and local repos had visibly different worktree trees.
- `getWorktreeBase()` in @archon/git now returns `{ base, layout }` and accepts an optional `{ repoLocal }` override. The layout value replaces the old `isProjectScopedWorktreeBase()` classification at the call sites (`isProjectScopedWorktreeBase` stays exported as deprecated back-compat).
- `WorktreeCreateConfig.path` carries the validated override from repo config. `resolveRepoLocalOverride()` fails loudly on absolute paths, `..` escapes, and resolve-escape edge cases (Fail Fast: no silent default fallback when the config is syntactically wrong).
- `WorktreeProvider.create()` now loads repo config exactly once and threads it through `getWorktreePath()` + `createWorktree()`. Replaces the prior swallow-then-retry pattern flagged on #1117. `generateEnvId()` is gone; envId is assigned directly from the resolved path (the invariant was already documented on `destroy(envId)`).

Tests (packages/git + packages/isolation):
- Update the pre-existing `getWorktreeBase` / `isProjectScopedWorktreeBase` suite for the new two-layout return shape and precedence.
- Add 8 tests for `worktree.path`: default fallthrough, empty/whitespace ignored, override wins for workspace-scoped repos, rejects absolute, rejects `../` escapes (three variants), accepts nested relative paths.

Docs: add `worktree.path` to the repo config reference with explicit precedence and the `.gitignore` responsibility note.

Co-authored-by: Joel Bastos <joelsb2001@gmail.com>

* feat(workflows): per-workflow worktree.enabled policy

Introduces a declarative top-level `worktree:` block on a workflow so authors can pin isolation behavior regardless of invocation surface. Solves the case where read-only workflows (e.g. `repo-triage`) should always run in the live checkout, without every CLI/web/scheduled-trigger caller having to remember to set the right flag.

Schema (packages/workflows/src/schemas/workflow.ts + loader.ts):
- New optional `worktree.enabled: boolean` on `workflowBaseSchema`. The loader parses with the same warn-and-ignore discipline used for `interactive` and `modelReasoningEffort`; invalid shapes log and drop rather than killing workflow discovery.

Policy reconciliation (packages/cli/src/commands/workflow.ts):
- Three hard-error cases when YAML policy contradicts invocation flags:
  - `enabled: false` + `--branch` (worktree required by flag, forbidden by policy)
  - `enabled: false` + `--from` (start-point only meaningful with a worktree)
  - `enabled: true` + `--no-worktree` (policy requires a worktree, flag forbids it)
- `enabled: false` + `--no-worktree` is redundant and accepted silently.
- `--resume` ignores the pinned policy (it reuses the existing run's worktree even when policy would disable it; this avoids disturbing a paused run).

Orchestrator wiring (packages/core/src/orchestrator/orchestrator-agent.ts):
- `dispatchOrchestratorWorkflow` short-circuits `validateAndResolveIsolation` when `workflow.worktree?.enabled === false` and runs directly in `codebase.default_cwd`. Web chat/Slack/Telegram callers have no flag equivalent to `--no-worktree`, so the YAML field is their only control.
- Logged as `workflow.worktree_disabled_by_policy` for operator visibility.

First consumer (.archon/workflows/repo-triage.yaml):
- `worktree: { enabled: false }`: triage reads issues/PRs and writes gh labels; no code mutations, no reason to spin up a worktree per run.

Tests:
- Loader: parses `worktree.enabled: true|false`, omits the block when absent.
- CLI: four new integration tests for the reconciliation matrix (skip when policy is false, three hard-error cases, redundant `--no-worktree` accepted, `--no-worktree` + `enabled: true` rejected).

Docs: authoring-workflows.md gets the new top-level field in the schema example with a comment explaining the precedence and the `enabled: true|false` semantics.

* fix(isolation): use path.sep for repo-containment check on Windows

resolveRepoLocalOverride was hardcoding '/' as the separator in the startsWith check, so on Windows (where `resolve()` returns backslash paths like `D:\Users\dev\Projects\myapp`) every otherwise-valid relative `worktree.path` was rejected with "resolves outside the repo root". Fixed by importing `path.sep` and using it in the sentinel. Fixes the 3 Windows CI failures in `worktree.path repo-local override`.

---------

Co-authored-by: Joel Bastos <joelsb2001@gmail.com>
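The validation rules attributed to `resolveRepoLocalOverride` (empty/whitespace ignored, absolute paths and `..` escapes rejected, containment checked with `path.sep` so Windows backslash paths work) might look roughly like this. The real function lives in Archon's isolation package; this is only a sketch of the described behavior.

```typescript
import * as path from "node:path";

// Illustrative fail-fast validation for a repo-local worktree.path override.
// Returns the resolved base dir, or null when the override should be ignored
// (empty/whitespace), so the default workspace-scoped layout wins.
function resolveRepoLocalOverride(repoRoot: string, override: string): string | null {
  const trimmed = override.trim();
  if (trimmed === "") return null; // empty/whitespace: ignored, not an error

  if (path.isAbsolute(trimmed)) {
    throw new Error(`worktree.path must be relative, got: ${trimmed}`);
  }

  const root = path.resolve(repoRoot);
  const resolved = path.resolve(root, trimmed);
  // path.sep, not a hardcoded '/', so the containment sentinel also matches
  // on Windows where resolve() returns backslash paths.
  if (!resolved.startsWith(root + path.sep)) {
    throw new Error(`worktree.path resolves outside the repo root: ${trimmed}`);
  }
  return resolved;
}
```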
7be4d0a35e
feat(paths,workflows): unify ~/.archon/{workflows,commands,scripts} + drop globalSearchPath (closes #1136) (#1315)
* feat(paths,workflows): unify ~/.archon/{workflows,commands,scripts} + drop globalSearchPath
Collapses the awkward `~/.archon/.archon/workflows/` convention to a direct
`~/.archon/workflows/` child (matching `workspaces/`, `archon.db`, etc.), adds
home-scoped commands and scripts with the same loading story, and kills the
opt-in `globalSearchPath` parameter so every call site gets home-scope for free.
Closes #1136 (supersedes @jonasvanderhaegen's tactical fix — the bug was the
primitive itself: an easy-to-forget parameter that five of six call sites on
dev dropped).
Primitive changes:
- Home paths are direct children of `~/.archon/`. New helpers in `@archon/paths`:
`getHomeWorkflowsPath()`, `getHomeCommandsPath()`, `getHomeScriptsPath()`,
and `getLegacyHomeWorkflowsPath()` (detection-only for migration).
- `discoverWorkflowsWithConfig(cwd, loadConfig)` reads home-scope internally.
The old `{ globalSearchPath }` option is removed. Chat command handler, Web
UI workflow picker, orchestrator resolve path — all inherit home-scope for
free without maintainer patches at every new site.
- `discoverScriptsForCwd(cwd)` merges home + repo scripts (repo wins on name
collision). dag-executor and validator use it; the hardcoded
`resolve(cwd, '.archon', 'scripts')` single-scope path is gone.
- Command resolution is now walked-by-basename in each scope. `loadCommand`
and `resolveCommand` walk 1 subfolder deep and match by `.md` basename, so
`.archon/commands/triage/review.md` resolves as `review` — closes the
latent bug where subfolder commands were listed but unresolvable.
- All three (`workflows/`, `commands/`, `scripts/`) enforce a 1-level
subfolder cap (matches the existing `defaults/` convention). Deeper
nesting is silently skipped.
- `WorkflowSource` gains `'global'` alongside `'bundled'` and `'project'`.
Web UI node palette shows a dedicated "Global (~/.archon/commands/)"
section; badges updated.
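The merge semantics of `discoverScriptsForCwd` (home + repo, repo wins on name collision, missing scope tolerated) can be sketched as below. The entry shape is an assumption; only the precedence rule comes from the message.

```typescript
interface ScriptEntry {
  name: string;
  path: string;
}

// Illustrative merge: repo-scope scripts shadow home-scope scripts of the
// same name, and a scope that discovered nothing simply contributes nothing.
function mergeScripts(home: ScriptEntry[], repo: ScriptEntry[]): ScriptEntry[] {
  const byName = new Map<string, ScriptEntry>();
  for (const s of home) byName.set(s.name, s);
  for (const s of repo) byName.set(s.name, s); // repo wins on collision
  return [...byName.values()];
}
```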
Migration (clean cut — no fallback read):
- First use after upgrade: if `~/.archon/.archon/workflows/` exists, Archon
logs a one-time WARN per process with the exact `mv` command:
`mv ~/.archon/.archon/workflows ~/.archon/workflows && rmdir ~/.archon/.archon`
The legacy path is NOT read — users migrate manually. Rollback caveat
noted in CHANGELOG.
Tests:
- `@archon/paths/archon-paths.test.ts`: new helper tests (default HOME,
ARCHON_HOME override, Docker), plus regression guards for the double-`.archon/`
path.
- `@archon/workflows/loader.test.ts`: home-scoped workflows, precedence,
subfolder 1-depth cap, legacy-path deprecation warning fires exactly once
per process.
- `@archon/workflows/validator.test.ts`: home-scoped commands + subfolder
resolution.
- `@archon/workflows/script-discovery.test.ts`: depth cap + merge semantics
(repo wins, home-missing tolerance).
- Existing CLI + orchestrator tests updated to drop `globalSearchPath`
assertions.
E2E smoke (verified locally, before cleanup):
- `.archon/workflows/e2e-home-scope.yaml` + scratch repo at /tmp
- Home-scoped workflow discovered from an unrelated git repo
- Home-scoped script (`~/.archon/scripts/*.ts`) executes inside a script node
- 1-level subfolder workflow (`~/.archon/workflows/triage/*.yaml`) listed
- Legacy path warning fires with actionable `mv` command; workflows there
are NOT loaded
Docs: `CLAUDE.md`, `docs-web/guides/global-workflows.md` (full rewrite for
three-type scope + subfolder convention + migration),
`docs-web/reference/configuration.md` (directory tree),
`docs-web/reference/cli.md`, `docs-web/guides/authoring-workflows.md`.
Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>
* test(script-discovery): normalize path separators in mocks for Windows
The 4 new tests in `scanScriptDir depth cap` and `discoverScriptsForCwd —
merge repo + home with repo winning` compared incoming mock paths with
hardcoded forward-slash strings (`if (path === '/scripts/triage')`). On
Windows, `path.join('/scripts', 'triage')` produces `\scripts\triage`, so
those branches never matched, readdir returned `[]`, and the tests failed.
Added a `norm()` helper at module scope and wrapped the incoming `path`
argument in every `mockImplementation` before comparing. Stored paths go
through `normalizeSep()` in production code, so the existing equality
assertions on `script.path` remain OS-independent.
Fixes Windows CI job `test (windows-latest)` on PR #1315.
* address review feedback: home-scope error handling, depth cap, and tests
Critical fixes:
- api.ts: add `maxDepth: 1` to all 3 findMarkdownFilesRecursive calls in
GET /api/commands (bundled/home/project). Without this the UI palette
surfaced commands from deep subfolders that the executor (capped at 1)
could not resolve — silent "command not found" at runtime.
- validator.ts: wrap home-scope findMarkdownFilesRecursive and
resolveCommandInDir calls in try/catch so EACCES/EPERM on
~/.archon/commands/ doesn't crash the validator with a raw filesystem
error. ENOENT still returns [] via the underlying helper.
Error handling fixes:
- workflow-discovery.ts: maybeWarnLegacyHomePath now sets the
"warned-once" flag eagerly before `await access()`, so concurrent
discovery calls (server startup with parallel codebase resolution)
can't double-warn. Non-ENOENT probe errors (EACCES/EPERM) now log at
WARN instead of DEBUG so permission issues on the legacy dir are
visible in default operation.
- dag-executor.ts: wrap discoverScriptsForCwd in its own try/catch so
an EACCES on ~/.archon/scripts/ routes through safeSendMessage /
logNodeError with a dedicated "failed to discover scripts" message
instead of being mis-attributed by the outer catch's
"permission denied (check cwd permissions)" branch.
Tests:
- load-command-prompt.test.ts (new): 6 tests covering the executor's
command resolution hot path — home-scope resolves when repo misses,
repo shadows home, 1-level subfolder resolvable by basename, 2-level
rejected, not-found, empty-file. Runs in its own bun test batch.
- archon-paths.test.ts: add getHomeScriptsPath describe block to match
the existing getHomeCommandsPath / getHomeWorkflowsPath coverage.
Comment clarity:
- workflow-discovery.ts: MAX_DISCOVERY_DEPTH comment now leads with the
actual value (1) before describing what 0 would mean.
- script-discovery.ts: copy the "routing ambiguity" rationale from
MAX_DISCOVERY_DEPTH to MAX_SCRIPT_DISCOVERY_DEPTH.
Cleanup:
- Remove .archon/workflows/e2e-home-scope.yaml — one-off smoke test that
would ship permanently in every project's workflow list. Equivalent
coverage exists in loader.test.ts.
Addresses all blocking and important feedback from the multi-agent
review on PR #1315.
---------
Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>
cc78071ff6
fix(isolation): raise worktree git-operation timeout to 5m (#1306)
All 15 worktree git-subprocess timeouts in WorktreeProvider were hardcoded at 30000ms. Repos with heavy post-checkout hooks (lint, dependency install, submodule init) routinely exceed that budget and fail worktree creation.

Consolidate them onto a single GIT_OPERATION_TIMEOUT_MS constant at 5 min. Generous enough to cover reported cases while still catching genuine hangs (credential prompts in non-TTY, stalled fetches). Chosen over the config-key approach in #1029 to avoid adding permanent .archon/config.yaml surface for a problem a raised default solves cleanly. If 5 min turns out to also be too tight for real-world use, we'll revisit.

Closes #1119
Supersedes #1029
Co-authored-by: Shay Elmualem <12733941+norbinsh@users.noreply.github.com>
39a05b762f
fix(db): throw on corrupt commands JSON instead of silent empty fallback (#1033)
* fix(db): throw on corrupt commands JSON instead of silent empty fallback (#967)

getCodebaseCommands() silently returned {} when the commands column contained corrupt JSON. Callers had no way to distinguish 'no commands' from 'unreadable data', violating fail-fast principles. Now throws a descriptive error with the codebase ID and a recovery hint. The error is still logged for observability before throwing. Adds two test cases: corrupt JSON throws, valid JSON string parses.

* fix: include parse error in log for better diagnostics
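A hedged sketch of the fail-fast pattern. The real getCodebaseCommands reads from the database and uses Archon's logger; the function name and error text below are illustrative.

```typescript
// Illustrative: distinguish "no commands" (null/empty column) from
// "unreadable data" (corrupt JSON), instead of collapsing both to {}.
function parseCommandsColumn(
  codebaseId: string,
  raw: string | null,
): Record<string, string> {
  if (raw === null || raw.trim() === "") return {}; // genuinely no commands
  try {
    return JSON.parse(raw);
  } catch (err) {
    // Still log before throwing for observability (console stands in for
    // the real logger; the parse error is included for diagnostics).
    console.error(`corrupt commands JSON for codebase ${codebaseId}:`, err);
    throw new Error(
      `Corrupt commands JSON for codebase ${codebaseId}; ` +
        `inspect or clear the commands column to recover`,
    );
  }
}
```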
cb44b96f7b
feat(providers/pi): interactive flag binds UIContext for extensions (#1299)
* feat(providers/pi): interactive flag binds UIContext for extensions
Adds `interactive: true` opt-in to Pi provider (in `.archon/config.yaml`
under `assistants.pi`) that binds a minimal `ExtensionUIContext` stub to
each session. Without this, Pi's `ExtensionRunner.hasUI()` reports false,
causing extensions like `@plannotator/pi-extension` to silently auto-approve
every plan instead of opening their browser review UI.
Semantics: clamped to `enableExtensions: true` — no extensions loaded
means nothing would consume `hasUI`, so `interactive` alone is silently
dropped. Stub forwards `notify()` to Archon's event stream; interactive
dialogs (select/confirm/input/editor/custom) resolve to undefined/false;
TUI-only setters (widgets/headers/footers/themes) no-op. Theme access
throws with a clear diagnostic — Pi's theme singleton is coupled to its
own `Symbol.for()` registry which Archon doesn't own.
Trust boundary: only binds when the operator has explicitly enabled
both flags. Extensions gated on `ctx.hasUI` (plannotator and similar)
get a functional UI context; extensions that reach for TUI features
still fail loudly rather than rendering garbage.
Includes smoke-test workflow documenting the integration surface.
End-to-end plannotator UI rendering requires plan-mode activation
(Pi `--plan` CLI flag or `/plannotator` TUI slash command) which is
out of reach for programmatic Archon sessions — manual test only.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(providers/pi): end-to-end interactive extension UI
Three fixes that together get plannotator's browser review UI to actually
render from an Archon workflow and reach the reviewer's browser.
1. Call resourceLoader.reload() when enableExtensions is true.
createAgentSession's internal reload is gated on `!resourceLoader`, so
caller-supplied loaders must reload themselves. Without this,
getExtensions() returns the empty default, no ExtensionRunner is built,
and session.extensionRunner.setFlagValue() silently no-ops.
2. Set PLANNOTATOR_REMOTE=1 in interactive mode.
plannotator-browser.ts only calls ctx.ui.notify(url) when openBrowser()
returns { isRemote: true }; otherwise it spawns xdg-open/start on the
Archon server host — invisible to the user and untestable from bash
asserts. From the workflow runner's POV every Archon execution IS
remote; flipping the heuristic routes the URL through notify(), which
the ExtensionUIContext stub forwards into the event stream. Respect
explicit operator overrides.
3. notify() emits as assistant chunks, not system chunks.
The DAG executor's system-chunk filter only forwards warnings/MCP
prefixes, and only assistant chunks accumulate into $nodeId.output.
Emitting as assistant makes the URL available both in the user's
stream and in downstream bash/script nodes via output substitution.
Plus: extensionFlags config pass-through (equivalent to `pi --plan` on the
CLI) applied via ExtensionRunner.setFlagValue() BEFORE bindExtensions
fires session_start, so extensions reading flags in their startup handler
actually see them. Also bind extensions with an empty binding when
enableExtensions is on but interactive is off, so session_start still
fires for flag-driven but UI-less extensions.
Smoke test (.archon/workflows/e2e-plannotator-smoke.yaml) uses
openai-codex/gpt-5.4-mini (ChatGPT Plus OAuth compatible) and bumps
idle_timeout to 600000ms so plannotator's server survives while a human
approves in the browser.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* refactor(providers/pi): keep Archon extension-agnostic
Remove the plannotator-specific PLANNOTATOR_REMOTE=1 env var write from
the Pi provider. Archon's provider layer shouldn't know about any
specific extension's internals. Document the env var in the plannotator
smoke test instead — operators who use plannotator set it via their shell
or per-codebase env config.
Workflow smoke test updated with:
- Instructions for setting PLANNOTATOR_REMOTE=1 externally
- Simpler assertion (URL emission only) — validated in a real
reject-revise-approve run: reviewer annotated, clicked Send Feedback,
Pi received the feedback as a tool result, revised the plan (added
aria-label and WCAG contrast per the annotation), resubmitted, and
reviewer approved. Plannotator's tool result signals approval but
doesn't return the plan text, so the bash assertion now only checks
that the review URL reached the stream (not that plan content flowed
into $nodeId.output; it can't).
- Known-limitation note documenting the tool-result shape so downstream
workflow authors know to Write the plan separately if they need it.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* chore(providers/pi): keep e2e-plannotator-smoke workflow local-only
The smoke test is plannotator-specific (calls plannotator_submit_plan,
expects PLAN.md on disk, requires PLANNOTATOR_REMOTE=1) and is better
kept out of the PR while the extension-agnostic infra lands.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* style(providers/pi): trim verbose inline comments
Collapse multi-paragraph SDK explanations to 1-2 line "why" notes across
provider.ts, types.ts, ui-context-stub.ts, and event-bridge.ts. No
behavior change.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(providers/pi): wire assistants.pi.env + theme-proxy identity
Two end-to-end fixes discovered while exercising the combined
plannotator + @pi-agents/loop smoke flow:
- PiProviderDefaults gains an optional `env` map; parsePiConfig picks
it up and the provider applies it to process.env at session start
(shell env wins, no override). Needed so extensions like plannotator
can read PLANNOTATOR_REMOTE=1 from config.yaml without requiring a
shell export before `archon workflow run`.
- ui-context-stub theme proxy returns identity decorators instead of
throwing on unknown methods. Styled strings flow into no-op
setStatus/setWidget sinks anyway, so the throw was blocking
plannotator_submit_plan after HTTP approval with no benefit.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(providers/pi): flush notify() chunks immediately in batch mode
Batch-mode adapters (CLI) accumulate assistant chunks and only flush on
node completion. That broke plannotator's review-URL flow: Pi's notify()
emitted the URL as an assistant chunk, but the user needed the URL to
POST /api/approve — which is what unblocks the node in the first place.
Adds an optional `flush` flag on assistant MessageChunks. notify() sets
it, and the DAG executor drains pending batched content before surfacing
the flushed chunk so ordering is preserved.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
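The flush semantics can be sketched as follows. The chunk shape and accumulator are assumptions; only the rule "drain pending batched content, then surface the flushed chunk, preserving order" comes from the message.

```typescript
interface MessageChunk {
  role: "assistant" | "system";
  text: string;
  flush?: boolean; // set by notify() so batch-mode adapters emit immediately
}

// Illustrative batch accumulator: normally assistant chunks wait for node
// completion, but a flushed chunk drains everything buffered so far first,
// so ordering is preserved and the URL reaches the user mid-run.
function collectEmissions(chunks: MessageChunk[]): string[] {
  const emitted: string[] = [];
  let pending: string[] = [];
  for (const c of chunks) {
    pending.push(c.text);
    if (c.flush) {
      emitted.push(pending.join("")); // drain buffered content + flushed chunk
      pending = [];
    }
  }
  if (pending.length > 0) emitted.push(pending.join("")); // node completion
  return emitted;
}
```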
* docs: mention Pi alongside Claude and Codex in README + top-level docs
The AI assistants docs page already covers Pi in depth, but the README
architecture diagram + docs table, overview "Further Reading" section,
and local-deployment .env comment still listed only Claude/Codex.
Left feature-specific mentions alone where Pi genuinely lacks support
(e.g. structured output — Claude + Codex only).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs: note Pi structured output (best-effort) in matrix + workflow docs
Pi gained structured output support via prompt augmentation + JSON
extraction (see packages/providers/src/community/pi/capabilities.ts).
Unlike Claude/Codex, which use SDK-enforced JSON mode, Pi appends the
schema to the prompt and parses JSON out of the result text (bare or
fenced). Updates four stale references that still said Claude/Codex-only:
- ai-assistants.md capabilities matrix
- authoring-workflows.md (YAML example + field table)
- workflow-dag.md skill reference
- CLAUDE.md DAG-format node description
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(providers/pi): default extensions + interactive to on
Extensions (community packages like @plannotator/pi-extension and
user-authored ones) are a core reason users pick Pi. Defaulting
enableExtensions and interactive to false previously silenced installed
extensions with no signal, leading to "did my extension even load?"
confusion.
Opt out in .archon/config.yaml when you want the prior behavior:
assistants:
  pi:
    enableExtensions: false  # skip extension discovery entirely
    # interactive: false     # load extensions, but no UI bridge
Docs gain a new "Extensions (on by default)" section in
getting-started/ai-assistants.md that documents the three config
surfaces (extensionFlags, env, workflow-level interactive) and uses
plannotator as a concrete walk-through example.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
45682bd2c8
fix(providers/claude): use || instead of ?? in hasExplicitTokens to handle empty-string env vars (#1028)
Closes #1027
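The distinction in one line: `??` only falls through on null/undefined, so an env var that is set but empty (e.g. after `export TOKEN=`) counts as an explicit value, while `||` also falls through on `""`. A minimal illustration (variable names are made up, not the provider's actual code):

```typescript
// An env var that exists but is empty, as after `export CLAUDE_TOKEN=`.
const primary: string | undefined = "";
const fallback = "sk-fallback";

// `??` keeps the empty string: the old hasExplicitTokens behavior, which
// wrongly treated an empty env var as an explicitly provided token.
const withNullish = primary ?? fallback; // ""

// `||` treats "" as falsy and falls through: the fixed behavior.
const withOr = primary || fallback; // "sk-fallback"
```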
28908f0c75
feat(paths/cli/setup): unify env load + write on three-path model (#1302, #1303) (#1304)
* feat(paths/cli/setup): unify env load + write on three-path model (#1302, #1303)

Key env handling on directory ownership rather than filename. `.archon/` (at `~/` or `<cwd>/`) is archon-owned; anything else is the user's.
- `<repo>/.env`: stripped at boot (guard kept), never loaded, never written
- `<repo>/.archon/.env`: loaded at repo scope (wins over home), writable via `archon setup --scope project`
- `~/.archon/.env`: loaded at home scope, writable via `--scope home` (default)

Read side (#1302):
- New `@archon/paths/env-loader` with `loadArchonEnv(cwd)` shared by CLI and server entry points. Loads both archon-owned files with `override: true`; repo scope wins.
- Replaced `[dotenv@17.3.1] injecting env (0) from .env` (which always lied about stripped keys) with `[archon] stripped N keys from <cwd> (...)` and `[archon] loaded N keys from <path>` lines, emitted only when N > 0. `quiet: true` is passed to dotenv to silence its own output.
- `stripCwdEnv` unchanged in semantics: still the only source that deletes keys from `process.env`; it now logs what it did.

Write side (#1303):
- `archon setup` never writes to `<repo>/.env`. Writing there was incoherent because `stripCwdEnv` deletes those keys on every run.
- New `--scope home|project` (default home) targets exactly one archon-owned file. New `--force` overrides the merge; a backup is still written.
- Merge-only by default: existing non-empty values win, user-added custom keys survive, and `<path>.archon-backup-<ISO-ts>` is written before every rewrite. Fixes the silent PostgreSQL→SQLite downgrade and silent token loss in Add mode.
- A one-time migration note is emitted when `<cwd>/.env` exists at setup start.

Tests: new `env-loader.test.ts` (6), extended `strip-cwd-env.test.ts` (+4 for the log line), extended `setup.test.ts` (+10 for scope/merge/backup/force/repo-untouched), extended `cli.test.ts` (+5 for flag parsing).

Docs: configuration.md, cli.md, security.md, cli-internals.md, setup skill, all updated to the three-path model.

* fix(cli/setup): address PR review: scope/path/secret-handling edge cases

- cli: resolve `--scope project` to the git repo root so running setup from a subdir writes to `<repo-root>/.archon/.env` (what loadArchonEnv reads at boot), not `<subdir>/.archon/.env`. Fail fast with a useful message when `--scope project` is used outside a git repo.
- setup: `resolveScopedEnvPath()` now delegates to @archon/paths helpers (getArchonEnvPath / getRepoArchonEnvPath) so Docker's /.archon home, ARCHON_HOME overrides, and the "undefined" literal guard all behave identically between the loader and the writer.
- setup: wrap the writeScopedEnv call in try/catch so an fs exception (permission denied, read-only FS, backup copy failure) stops the clack spinner cleanly and emits an actionable error instead of a raw stack trace after the user has completed the entire wizard.
- setup: `checkExistingConfig(envPath?)` is now a scope-aware existing-config read. Add/Update/Fresh now reflects the actual write target, not an unconditional ~/.archon/.env.
- setup: serializeEnv escapes \r (it previously only escaped \n) so values with a bare CR or CRLF round-trip through dotenv.parse without corruption. Regression test added.
- setup: the merge path treats whitespace-only existing values (' ') as empty, so a copy-pasted stray space doesn't silently defeat the wizard update for that key forever. Regression test added.
- setup: 0o600 mode on the written env file AND on backup copies. writeFileSync+copyFileSync default to 0o666 & ~umask, which can leave secrets group/world-readable on a permissive umask.
- docs/cli.md + setup skill: appendix sections that still described the pre-#1303 two-file symlink model now reflect the three-path model.

* fix(paths/env-loader): Windows-safe assertion for home-scope load line

The test asserted that the log line contained `from ~/`, which is opportunistic tilde-shortening that only happens when the tmpdir lives under `homedir()`. On Windows CI the tmpdir is on `D:\` while homedir is `C:\Users\...`, so the path renders absolute and the `~/` never appears. Match on the count and the archon-home tmpdir segment instead; that is robust on both Unix tilde-short paths and Windows absolute paths.
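The load-order precedence can be sketched as a toy merge over parsed env maps. The real `loadArchonEnv` applies dotenv with `override: true` to `process.env`; this sketch only demonstrates the "home loads first, repo scope wins, repo keys don't erase home-only keys" rule from the message.

```typescript
type EnvMap = Record<string, string>;

// Illustrative three-path precedence: ~/.archon/.env loads at home scope,
// <repo>/.archon/.env loads second and overrides on conflict; <repo>/.env
// never participates (it is stripped at boot, not loaded).
function mergeArchonEnv(homeEnv: EnvMap, repoEnv: EnvMap): EnvMap {
  return { ...homeEnv, ...repoEnv }; // later spread wins = repo scope wins
}
```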
8ae4a56193
feat(workflows): add repo-triage — periodic maintenance via inline Haiku sub-agents (#1293)
* feat(workflows): add repo-triage — 6-node periodic maintenance workflow

Adds .archon/workflows/repo-triage.yaml: a self-contained periodic maintenance workflow that uses inline sub-agents (Claude SDK agents: field introduced in #1276) for map-reduce across open issues and PRs.

Six DAG nodes, three-layer topology:
- Layer 1 (parallel): triage-issues, link-prs, closed-pr-dedup-check, stale-nudge
- Layer 2: closed-dedup-check (reads triage-issues state)
- Layer 3: digest (synthesises all prior nodes + writes markdown)

Capabilities per node:
- triage-issues: delegates labeling to on-disk triage-agent; inline brief-gen Haiku for duplicate detection; 3-day auto-close clock for unanswered duplicate warnings
- link-prs: conservative PR ↔ issue cross-refs via inline pr-issue-matcher Haiku, Sonnet re-verifies fully-addresses claims before suggesting Closes #X; auto-nudges on low-quality PR template fill with first-run grandfather guard (snapshot-only, no nudge spam)
- closed-dedup-check: cross-matches open issues against recently-closed ones via inline closed-brief-gen Haiku; same 3-day clock
- closed-pr-dedup-check: flags open PRs duplicating recently-closed PRs via inline pr-brief-gen Haiku; comment-only, never closes PRs
- stale-nudge: 60-day inactivity pings (configurable); no auto-close
- digest: synthesises per-node outputs + reads state files to emit $ARTIFACTS_DIR/digest.md with clickable GitHub comment links

Env-gated rollout knobs:
- DRY_RUN=1 (read-only; prints [DRY] lines, no gh/state mutations)
- SKIP_PR_LINK=1, SKIP_CLOSED_DEDUP=1, SKIP_CLOSED_PR_DEDUP=1, SKIP_STALE_NUDGE=1
- STALE_DAYS=N (stale-nudge window; default 60)

Cross-run state under .archon/state/ (gitignored):
- triage-state.json: briefs + pendingDedupComments
- closed-dedup-state.json: closedBriefs + closedMatchComments
- closed-pr-dedup-state.json: openBriefs + closedBriefs + matches
- pr-state.json: linkedPrs + commentIds + templateAdherence
- stale-nudge-state.json: nudged (with updatedAtAtNudge for re-nudge)

Every bot comment:
- @-tags the target human (reporter for issues, author for PRs)
- Tracks comment ID in state for traceability
- Is idempotent — re-runs skip existing comments

Intended use: invoke periodically (`archon workflow run repo-triage --no-worktree`) once a scheduler lands; live state persists across runs so previously-flagged items reconcile correctly.

.gitignore: adds .archon/state/ for cross-run memory files.

* feat(workflows/repo-triage): post digest to Slack when SLACK_WEBHOOK is set

Extends the digest node with an optional Slack-post step after the canonical digest.md artifact is written. Uses a Slack incoming webhook (no bot token required beyond the incoming-webhook scope).

Behavior:
- SLACK_WEBHOOK unset → skipped silently with a one-line note
- DRY_RUN=1 → prints full payload, does not curl
- Otherwise → POSTs a compact (<3500 char) mrkdwn-formatted summary containing headline numbers, this-run comment index (clickable GitHub URLs), pending items, and a path reference to digest.md
- curl failure or non-ok Slack response is logged but does not fail the node — digest.md on disk remains authoritative
- Intermediate Slack text written to $ARTIFACTS_DIR/digest-slack.txt for traceability; payload JSON assembled via jq and written to $ARTIFACTS_DIR/slack-payload.json before curl posts it

Slack mrkdwn conversion rules are baked into the prompt (no tables, link shape <url|text>, single-asterisk bold) so Sonnet emits a variant that renders cleanly in Slack rather than being sent raw.

The webhook URL is read from the operator's environment (Archon auto-loads ~/.archon/.env on CLI startup — put SLACK_WEBHOOK=... there).

* fix(workflows/repo-triage): address PR #1293 review feedback

Critical (3):
- `gh issue close --reason "not planned"` (space, not underscore) — the CLI expects lowercase with a space; `not_planned` fails at runtime. Fixed in both auto-close paths (triage-issues step 8, closed-dedup-check step 7).
- link-prs step 7 state save was sparse `{ sha, processedAt, related, fullyAddresses }`, overwriting `commentIds` / `templateNudgedAt` / `templateAdherence`. Changed to an explicit merge that spreads the existing entry first so per-run captured fields survive.
- Corrupt-JSON state files were previously treated as first-run default (silent `pendingDedupComments` reset → 3-day clock restarts forever). All five state-load sites now abort loudly on JSON.parse throw; ENOENT/empty continue to the default shape.

Important (7):
- Sub-agents (`brief-gen`, `closed-brief-gen`, `pr-brief-gen`, `pr-issue-matcher`) emit `ERROR: <reason>` on gh failures rather than partial/fabricated JSON. The orchestrator detects the sentinel, logs the failed ID + first 200 chars of raw response, tracks it in a failed-list, and aborts the cluster/match pass if ≥50% of items failed (avoids acting on bad data).
- `pr-brief-gen` now sets `diffTruncated: true` when the 30k-char diff cap hits; the link-prs verify pass downgrades any `fully-addresses` claim to `related` when either side's brief was truncated.
- 3-day auto-close validates that `postedAt` parses as ISO-8601 before the elapsed-time comparison; corrupt timestamps are logged and skipped, never acted on.
- `gh issue close` failure path no longer drops state — sets `closeAttemptFailed: true` on the entry for next-run retry. Only drops on exit 0.
- `closed-pr-dedup-check` idempotency check (`gh pr view --json comments`) now aborts the post on fetch failure rather than falling through — prevents double-posts on gh hiccups.
- `triage-agent` label pass has a preflight `test -f` check for `.claude/agents/triage-agent.md`; skips the pass with a clear log if the file is missing rather than firing Task calls that fail obscurely.
- `brief-gen` template-adherence wording flipped from "Ignore … as 'filled'" (ambiguous, read as affirmative) to an explicit "A section counts as MISSING when …", matching the `pr-issue-matcher` phrasing.

Minor:
- `stale-nudge` idempotency check uses the substring "has been quiet for" instead of a prefix check that never matched (the posted body starts with @<author>).
- `closed-dedup-check` distinguishes "upstream crashed" (missing/corrupt triage-state.json, or `lastRunAt == null`) from "legitimately quiet day" (state present, briefs empty) — different log lines.
- Slack curl adds `-w "\nHTTP_STATUS:%{http_code}"` + `2>&1` so TLS / 4xx / 5xx errors are visible in captured output.
- `stateReason` values from `gh issue view --json stateReason` are UPPERCASE (`COMPLETED`, `NOT_PLANNED`); documented, and the sub-agent is instructed to normalize to lowercase for consistency.

Docs:
- CLAUDE.md repo-level `.archon/` tree now lists `state/`.
- archon-directories.md tree adds `state/` + `scripts/` (both were missing) with purpose descriptions.

Deferred (worth doing as a follow-up, not blocking):
- DRY/SKIP preamble duplication (~30-50 lines across 5 nodes).
- Explicit `BASELINE_IS_EMPTY` capture in link-prs (current derived check works but is a load-bearing model instruction).
- Digest `WARNING` prefix block when upstream nodes are missing outputs — today's "(output unavailable)" sub-line is functional.
- Pre-existing README workflow-count (17 → 20) and table gaps — not caused by this PR.
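The "validates `postedAt` parses as ISO-8601 before the elapsed-time comparison" fix can be sketched as follows. This is a hypothetical illustration (the function name and shape are not from the workflow itself): act only on valid timestamps that are at least 3 days old.

```typescript
// Hypothetical sketch of the 3-day auto-close guard described above.
const THREE_DAYS_MS = 3 * 24 * 60 * 60 * 1000;

function isDueForAutoClose(postedAt: string, now: Date = new Date()): boolean {
  const posted = new Date(postedAt);
  // Corrupt timestamps are logged and skipped upstream; here: never act.
  if (Number.isNaN(posted.getTime())) return false;
  return now.getTime() - posted.getTime() >= THREE_DAYS_MS;
}
```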
eb730c0b82
fix(docs): prevent theme reset to dark after user switches to auto/light (#1079)
Starlight removes the `starlight-theme` localStorage key when the user selects "auto" mode. The old init script checked that key, so every navigation or refresh re-forced the dark theme. Use a separate `archon-theme-init` sentinel that persists across theme changes.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
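A minimal sketch of the sentinel pattern, using an in-memory map in place of `window.localStorage` so it is self-contained. The key names are from the commit; the function shape is illustrative, not the actual init script.

```typescript
// In-memory stand-in for window.localStorage so the logic is testable.
type ThemeStore = Map<string, string>;

function resolveInitialTheme(store: ThemeStore): "dark" | "light" | "auto" {
  // First-ever visit: default to dark once, and remember that we did.
  if (!store.has("archon-theme-init")) {
    store.set("archon-theme-init", "1");
    store.set("starlight-theme", "dark");
    return "dark";
  }
  // Returning visit: Starlight deletes the starlight-theme key when the
  // user picks "auto", so a missing key means "auto", not "force dark".
  const stored = store.get("starlight-theme");
  return stored === "dark" || stored === "light" ? stored : "auto";
}
```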
ec5e5a5cf9
feat(providers/pi): opt-in extension discovery via config flag (#1298)
Some checks are pending
E2E Smoke Tests / e2e-deterministic (push) Waiting to run
Test Suite / test (ubuntu-latest) (push) Waiting to run
Test Suite / test (windows-latest) (push) Waiting to run
E2E Smoke Tests / e2e-claude (push) Waiting to run
E2E Smoke Tests / e2e-codex (push) Waiting to run
E2E Smoke Tests / e2e-mixed (push) Blocked by required conditions
Test Suite / docker-build (push) Waiting to run
Adds `assistants.pi.enableExtensions` (default false) to `.archon/config.yaml`. When true, Pi's `noExtensions` guard is lifted so the session loads tools and lifecycle hooks from `~/.pi/agent/extensions/`, packages installed via `pi install npm:<pkg>`, and the workflow's cwd `.pi/` directory — opening up the community extension ecosystem at https://shittycodingagent.ai/packages.

Default stays suppressed to preserve the "Archon is source of truth" trust boundary: enabling this loads arbitrary JS under the Archon server's OS permissions, including whatever extension code the target repo happens to ship. Operators opt in explicitly, per-host.

Skills, prompt templates, themes, and context files remain suppressed even when extensions are enabled — only the extensions gate opens.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
fb73a500d7
feat(providers/pi): best-effort structured output via prompt engineering (#1297)
Pi's SDK has no native JSON-schema mode (unlike Claude's outputFormat / Codex's outputSchema). Previously Pi declared structuredOutput: false and any workflow using output_format silently degraded — the node ran, the transcript was treated as free text, and downstream $nodeId.output.field refs resolved to empty strings. 8 bundled/repo workflows across 10 nodes were affected (archon-create-issue, archon-fix-github-issue, archon-smart-pr-review, archon-workflow-builder, archon-validate-pr, etc.).

This PR closes the gap via prompt engineering + post-parse:

1. When requestOptions.outputFormat is present, the provider appends a "respond with ONLY a JSON object matching this schema" instruction plus JSON.stringify(schema) to the prompt before calling session.prompt().
2. bridgeSession accepts an optional jsonSchema param. When set, it buffers every assistant text_delta and — on the terminal result chunk — parses the buffer via tryParseStructuredOutput (trims whitespace, strips ```json / ``` fences, JSON.parse). On success, attaches structuredOutput to the result chunk (matching Claude's shape). On failure, emits a warn event and leaves structuredOutput undefined so the executor's existing dag.structured_output_missing path handles it.
3. Flipped PI_CAPABILITIES.structuredOutput to true. Unlike Claude/Codex this is best-effort, not SDK-enforced — reliable on GPT-5, Claude, Gemini 2.x, recent Qwen Coder, DeepSeek V3; less reliable on smaller or older models that ignore JSON-only instructions.

Tests added (14 total):
- tryParseStructuredOutput: clean JSON, fenced, bare fences, arrays, whitespace, empty, prose-wrapped (fails), malformed, inner backticks
- augmentPromptForJsonSchema via provider integration: schema appended, prompt unchanged when absent
- End-to-end: clean JSON → structuredOutput parsed; fenced JSON parses; prose-wrapped → no structuredOutput + no crash; no outputFormat → never sets structuredOutput even if assistant happens to emit JSON

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
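The post-parse step can be approximated like this. A sketch matching the commit's description (trim, strip an optional fence, JSON.parse, return undefined instead of throwing); the provider's real implementation may differ in details.

```typescript
// Sketch of tryParseStructuredOutput as described above: best-effort
// extraction of a JSON payload from model text.
function tryParseStructuredOutput(text: string): unknown {
  let candidate = text.trim();
  // Strip a surrounding ```json ... ``` or bare ``` ... ``` fence.
  const fence = candidate.match(/^```(?:json)?\s*([\s\S]*?)\s*```$/);
  if (fence) candidate = fence[1];
  try {
    return JSON.parse(candidate);
  } catch {
    // Caller emits a warn event and leaves structuredOutput unset.
    return undefined;
  }
}
```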
83c119af78
fix(providers/pi): wire env injection + harden silent-failure paths (#1296)
Four defensive fixes to the Pi community provider to match the
Claude/Codex contract and eliminate silent error swallowing.
1. envInjection now actually wired (capability was declared but unused)
Pi's SDK has no top-level `env` option on createAgentSession, so
per-project env vars were being dropped. Routes requestOptions.env
through a BashSpawnHook that merges caller env over the inherited
baseline (caller wins, matching Claude/Codex semantics). When env is
present with no allow/deny, resolvePiTools now explicitly returns Pi's
4 default tools so the pre-constructed default bashTool is replaced
with an env-aware one.
2. AsyncQueue no longer leaks on consumer abort. Added close() that
drains pending waiters with { done: true } so iterate() exits instead
of hanging forever when the producer's finally fires before the next
push. bridgeSession calls queue.close() in its finally block.
3. buildResultChunk no longer reports silent success when agent_end fires
with no assistant message. Now returns { isError: true, errorSubtype:
'missing_assistant_message' } and logs a warn event so broken Pi
sessions don't masquerade as clean completions.
4. session-resolver no longer swallows arbitrary errors from
SessionManager.list(). Narrowed the catch to ENOENT/ENOTDIR (the only
"session dir doesn't exist yet" signals); permission errors, parse
failures, and other unexpected errors now propagate.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
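Item 4's narrowed catch amounts to the following pattern. This is a sketch: `listSessionsSafe` is a hypothetical name, and the real resolver reports a resume-failed flag rather than returning an empty list.

```typescript
// Sketch: swallow only "session dir doesn't exist yet" signals; let
// permission errors, parse failures, and anything else propagate.
function listSessionsSafe<T>(list: () => T[]): T[] {
  try {
    return list();
  } catch (err) {
    const code = (err as { code?: string }).code;
    if (code === "ENOENT" || code === "ENOTDIR") return []; // dir not created yet
    throw err; // unexpected errors surface loudly
  }
}
```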
60eeb00e42
feat(workflows): inline sub-agent definitions on DAG nodes (#1276)
* feat(workflows): inline sub-agent definitions on DAG nodes

Add `agents:` node field letting workflow YAML define Claude Agent SDK sub-agents inline, keyed by kebab-case ID. The main agent can spawn them via the Task tool — useful for map-reduce patterns where a cheap model briefs items and a stronger model reduces. Authors no longer need standalone `.claude/agents/*.md` files for workflow-scoped helpers; the definitions live with the workflow.

Claude only. Codex and community providers without the capability emit a capability warning and ignore the field. Merges with the internal `dag-node-skills` wrapper when `skills:` is also set — user-defined agents win on ID collision.

* fix(workflows): address PR #1276 review feedback

Critical:
- Re-export agentDefinitionSchema + AgentDefinition from schemas/index.ts (matches the "schemas/index.ts re-exports all" convention).

Important:
- Surface user-override of internal 'dag-node-skills' wrapper: warn-level provider log + platform message to the user when agents: redefines the reserved ID alongside skills:. User-wins behavior preserved (by design) but silent capability removal is now observable.
- Add validator test coverage for the agents-capability warning (codex node with agents: → warning; claude node → no warning; no-agents field → no warning).
- Strengthen NodeConfig.agents duplicate-type comment explaining the intentional circular-dep avoidance and pointing at the Zod schema as the authoritative source. Actual extraction is follow-up work.

Simplifications:
- Drop redundant typeof check in validator (schema already enforces).
- Drop unreachable Object.keys(...).length > 0 check in dag-executor.
- Drop rot-prone "(out of v1 scope)" parenthetical.
- Drop WHAT-only comment on AGENT_ID_REGEX.
- Tighten AGENT_ID_REGEX to reject trailing/double hyphens (/^[a-z0-9]+(-[a-z0-9]+)*$/).

Tests:
- parseWorkflow strips agents on script: and loop: nodes (parallel to the existing bash: coverage).
- provider emits warn log on dag-node-skills collision; no warn on non-colliding inline agents.

Docs:
- Renumber authoring-workflows Summary section (12b → 13; bump 13-19).
- Add Pi capability-table row for inline agents (❌, Claude-only).
- Add when-to-use guidance (agents: vs .claude/agents/*.md) in the new "Inline sub-agents" section.
- Cross-link skills.md Related → inline-sub-agents.
- CHANGELOG [Unreleased] Added entry for #1276.
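The tightened ID pattern is given in the message itself; for illustration, it accepts kebab-case IDs and rejects the trailing/double-hyphen shapes called out above.

```typescript
// Kebab-case agent ID check from the commit: one or more lowercase
// alphanumeric runs joined by single hyphens.
const AGENT_ID_REGEX = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// e.g. inline sub-agent IDs on a DAG node:
AGENT_ID_REGEX.test("brief-gen");  // valid
AGENT_ID_REGEX.test("brief-gen-"); // trailing hyphen: rejected
AGENT_ID_REGEX.test("brief--gen"); // double hyphen: rejected
```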
4c6ddd994f
fix(workflows): fail loudly on SDK isError results (#1208) (#1291)
Previously, `dag-executor` only failed nodes/iterations when the SDK returned an `error_max_budget_usd` result. Every other `isError: true` subtype — including `error_during_execution` — was silently `break`ed out of the stream with whatever partial output had accumulated, letting failed runs masquerade as successful ones with empty output.

This is the most likely explanation for the "5-second crash" symptom in #1208: iterations finish instantly with empty text, the loop keeps going, and only the `claude.result_is_error` log tips the user off.

Changes:
- Capture the SDK's `errors: string[]` detail on result messages (previously discarded) and surface it through `MessageChunk.errors`.
- Log `errors` and `stopReason` alongside `errorSubtype` in `claude.result_is_error` so users can see what actually failed.
- Throw from both the general node path and the loop iteration path on any `isError: true` result, including the subtype and SDK errors detail in the thrown message.

Note: this does not implement auto-retry. See PR comments on #1121 and the analysis on #1208 — a retry-with-fresh-session approach for loop iterations is not obviously correct until we see what `error_during_execution` actually carries in the reporter's env. This change is the observability + fail-loud step that has to come first so that signal is no longer silent.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
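The fail-loud rule can be sketched like this. The chunk shape and function name are illustrative, not the real MessageChunk type: any `isError: true` result throws, carrying the subtype and the SDK's errors detail.

```typescript
// Illustrative result-chunk shape; the real MessageChunk union differs.
interface ResultChunk {
  isError: boolean;
  errorSubtype?: string;
  errors?: string[]; // SDK error detail, previously discarded
}

// Throw on ANY isError result (not just error_max_budget_usd).
function assertResultOk(chunk: ResultChunk, nodeId: string): void {
  if (!chunk.isError) return;
  const detail = chunk.errors?.length ? `: ${chunk.errors.join("; ")}` : "";
  throw new Error(
    `node ${nodeId} failed (${chunk.errorSubtype ?? "unknown_error"})${detail}`,
  );
}
```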
d89bc767d2
fix(setup): align PORT default on 3090 across .env.example, wizard, and JSDoc (#1152) (#1271)
The server's getPort() fallback changed from 3000 to 3090 in the Hono migration (#318), but .env.example, the setup wizard's generated .env, and the JSDoc describing the fallback were not updated — leaving three different sources of truth for "the default PORT." When the wizard writes PORT=3000 to ~/.archon/.env (which the Hono server loads with override: true, while Vite only reads repo-local .env), the two processes can land on different ports silently. That mismatch is the real mechanism behind the failure described in #1152.

- .env.example: comment out PORT, document 3090 as the default
- packages/cli/src/commands/setup.ts: wizard no longer writes PORT=3000 into the generated .env; fix the "Additional Options" note
- packages/cli/src/commands/setup.test.ts: assert no bare PORT= line and that the commented default is present
- packages/core/src/utils/port-allocation.ts: fix stale JSDoc "default 3000" -> "default 3090"
- deploy/.env.example: keep Docker default at 3000 (compose/Caddy target that) but annotate it so users don't copy it for local dev

Single source of truth for the local-dev default is now basePort in port-allocation.ts.
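The single-source-of-truth fallback described above reduces to something like this. Names are assumptions; the real logic lives in port-allocation.ts.

```typescript
// Sketch: one constant holds the default; env PORT wins when it parses
// as a positive integer, otherwise fall back to 3090.
const BASE_PORT = 3090;

function getPort(env: Record<string, string | undefined>): number {
  const parsed = Number(env.PORT);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : BASE_PORT;
}
```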
c864d8e427
refactor(providers/pi): drop rot-prone file:line refs from code comments (#1275)
Applies the CLAUDE.md comment rule ("don't embed paths/callers that rot
as the codebase evolves") flagged by the PR #1271 review to the Pi
provider's inline comments.
Three spots in the merged Pi code embed `packages/.../provider.ts:N-M`
line ranges pointing at the Claude and Codex providers. These ranges
will drift the moment those files change — the Claude auth-merge
pattern's line numbers are already off-by-a-few in some local branches.
Keep the conceptual cross-reference ("mirrors Claude's process-env +
request-env merge pattern", "matches the Codex provider's fallback
pattern for the same condition") — that's the load-bearing part of the
comment — drop the fragile line numbers and file paths.
Same treatment for the upstream Pi auth-storage.ts:424-485 reference,
which points at a specific line range in a moving dependency.
No behavior change; comment-only refactor.
4e56991b72
feat(providers): add Pi community provider (@mariozechner/pi-coding-agent) (#1270)
* feat(providers): add Pi community provider (@mariozechner/pi-coding-agent) Introduces Pi as the first community provider under the Phase 2 registry, registered with builtIn: false. Wraps Pi's full coding-agent harness the same way ClaudeProvider wraps @anthropic-ai/claude-agent-sdk and CodexProvider wraps @openai/codex-sdk. - PiProvider implements IAgentProvider; fresh AgentSession per sendQuery call - AsyncQueue bridges Pi's callback-based session.subscribe() to Archon's AsyncGenerator<MessageChunk> contract - Server-safe: AuthStorage.inMemory + SessionManager.inMemory + SettingsManager.inMemory + DefaultResourceLoader with all no* flags — no filesystem access, no cross-request state - API key seeded per-call from options.env → process.env fallback - Model refs: '<pi-provider-id>/<model-id>' (e.g. google/gemini-2.5-pro, openrouter/qwen/qwen3-coder) with syntactic compatibility check - registerPiProvider() wired at CLI, server, and config-loader entrypoints, kept separate from registerBuiltinProviders() since builtIn: false is load-bearing for the community-provider validation story - All 12 capability flags declared false in v1 — dag-executor warnings fire honestly for any unmapped nodeConfig field - 58 new tests covering event mapping, async-queue semantics, model-ref parsing, defensive config parsing, registry integration Supported Pi providers (v1): anthropic, openai, google, groq, mistral, cerebras, xai, openrouter, huggingface. Extend PI_PROVIDER_ENV_VARS as needed. Out of scope (v1): session resume, MCP, hooks, skills mapping, thinking level mapping, structured output, OAuth flows, model catalog validation. These remain false on PI_CAPABILITIES until intentionally wired. * feat(providers/pi): read ~/.pi/agent/auth.json for OAuth + api_key passthrough Replaces the v1 env-var-only auth flow with AuthStorage.create(), which reads ~/.pi/agent/auth.json. 
This transparently picks up credentials the user has populated via `pi` → `/login` (OAuth subscriptions: Claude Pro/Max, ChatGPT Plus, GitHub Copilot, Gemini CLI, Antigravity) or by editing the file directly. Env-var behavior preserved: when ANTHROPIC_API_KEY / GEMINI_API_KEY / etc. is set (in process.env or per-request options.env), the adapter calls setRuntimeApiKey which is priority #1 in Pi's resolution chain. Auth.json entries are priority #2-#3. Pi's internal env-var fallback remains priority #4 as a safety net. Archon does not implement OAuth flows itself — it only rides on creds the user created via the Pi CLI. OAuth refresh still happens inside Pi (auth-storage.ts:369-413) under a file lock; concurrent refreshes between the Pi CLI and Archon are race-safe by Pi's own design. - Fail-fast error now mentions both the env-var path and `pi /login` - 2 new tests: OAuth cred from auth.json; env var wins over auth.json - 12 existing tests still pass (env-var-only path unchanged) CI compatibility: no auth.json in CI, no change — env-var (secrets) flows through Pi's getEnvApiKey fallback identically to v1. * test(e2e): add Pi provider smoke test workflow Mirrors e2e-claude-smoke.yaml: single prompt node + bash assert. Targets `anthropic/claude-haiku-4-5` via `provider: pi`; works in CI (ANTHROPIC_API_KEY secret) and locally (user's `pi /login` OAuth). Verified locally with an Anthropic OAuth subscription — full run takes ~4s from session_started to assert PASS, exercising the async-queue bridge and agent_end → result-chunk assembly under real Pi event timing. Not yet wired into .github/workflows/e2e-smoke.yml — separate PR once this lands, to keep the Pi provider PR minimal. * feat(providers/pi): v2 — thinkingLevel, tool restrictions, systemPrompt Extends the Pi adapter with three node-level translations, flipping the corresponding capability flags from false → true so the dag-executor no longer emits warnings for these fields on Pi nodes. 1. 
effort / thinking → Pi thinkingLevel (options-translator.ts) - Archon EffortLevel enum: low|medium|high|max (from packages/workflows/src/schemas/dag-node.ts). `max` maps to Pi's `xhigh` since Archon's enum lacks it. - Pi-native strings (minimal, xhigh, off) also accepted for programmatic callers bypassing the schema. - `off` on either field → no thinkingLevel (Pi's implicit off). - Claude-shape object `thinking: {type:'enabled', budget_tokens:N}` yields a system warning and is not applied. 2. allowed_tools / denied_tools → filtered Pi built-in tools - Supports all 7 Pi tools: read, bash, edit, write, grep, find, ls. - Case-insensitive normalization. - Empty `allowed_tools: []` means no tools (LLM-only), matching e2e-claude-smoke's idiom. - Unknown names (Claude-specific like `WebFetch`) collected and surfaced as a system warning; ignored tools don't fail the run. 3. systemPrompt (AgentRequestOptions + nodeConfig.systemPrompt) - Threaded through `DefaultResourceLoader({systemPrompt})`; Pi's default prompt is replaced entirely. Request-level wins over node-level. Capability flag changes: - thinkingControl: false → true - effortControl: false → true - toolRestrictions: false → true Package delta: - +1 direct dep: @sinclair/typebox (Pi types reference it; adding as direct dep resolves the TS portable-type error). - +1 test file: options-translator.test.ts (19 tests, 100% coverage). - provider.test.ts extended with 11 new tests covering all three paths. - registry.test.ts updated: capability assertion reflects new flags. Live-verified: `bun run cli workflow run e2e-pi-smoke --no-worktree` succeeds in 1.2s with thinkingLevel=low, toolCount=0. Smoke YAML updated to use `effort: low` (schema-valid) + `allowed_tools: []` (LLM-only). * test(e2e): add comprehensive Pi smoke covering every CI-compatible node type Exercises every node type Archon supports under `provider: pi`, except `approval:` (pauses for human input, incompatible with CI): 1. prompt — inline AI prompt 2. 
command — named command file (uses e2e-echo-command.md) 3. loop — bounded iterative AI prompt (max_iterations: 2) 4. bash — shell script with JSON output 5. script — bun runtime (echo-args.js) 6. script — uv / Python runtime (echo-py.py) Plus DAG features on top of Pi: - depends_on + $nodeId.output substitution - when: conditional with JSON dot-access - trigger_rule: all_success merge - final assert node validates every upstream output is non-empty Complements the minimal e2e-pi-smoke.yaml — that stays as the fast-path smoke for connectivity checks; this one is the broader surface coverage. Verified locally end-to-end against Anthropic OAuth (pi /login): PASS, all 9 non-final nodes produce output, assert succeeds. * feat(providers/pi): resolve Archon `skills:` names to Pi skill paths Flips capabilities.skills: false → true by translating Archon's name-based `skills:` nodeConfig (e.g. `skills: [agent-browser]`) to absolute directory paths Pi's DefaultResourceLoader can consume via additionalSkillPaths. Search order for each skill name (first match wins): 1. <cwd>/.agents/skills/<name>/ — project-local, agentskills.io 2. <cwd>/.claude/skills/<name>/ — project-local, Claude convention 3. ~/.agents/skills/<name>/ — user-global, agentskills.io 4. ~/.claude/skills/<name>/ — user-global, Claude convention A directory resolves only if it contains a SKILL.md. Unresolved names are collected and surfaced as a system-chunk warning (e.g. "Pi could not resolve skill names: foo, bar. Searched .agents/skills and .claude/skills (project + user-global)."), matching the semantic of "requested but not found" without aborting the run. Pi's buildSystemPrompt auto-appends the agentskills.io XML block for each loaded skill, so the model sees them — no separate prompt injection needed (Pi differs from Claude here; Claude wraps in an AgentDefinition with a preloaded prompt, Pi uses XML block in system prompt). 
Ancestor directory traversal above cwd is deliberately skipped in this pass — matches the Pi provider's cwd-bound scope and avoids ambiguity about which repo's skills win when Archon runs from a subdirectory. Bun's os.homedir() bypasses the HOME env var; the resolver uses `process.env.HOME ?? homedir()` so tests can stage a synthetic home dir. Tests: - 11 new tests in options-translator.test.ts cover project/user, .agents/ vs .claude/, project-wins-over-user, SKILL.md presence check, dedup, missing-name collection. - 2 new integration tests in provider.test.ts cover the missing-skill warning path and the "no skills configured → no additionalSkillPaths" path. - registry.test.ts updated to assert skills: true in capabilities. Live-verified locally: `.claude/skills/archon-dev/SKILL.md` resolves, pi.session_started log shows `skillCount: 1, missingSkillCount: 0`, smoke workflow passes in 1.2s. * feat(providers/pi): session resume via Pi session store Flips capabilities.sessionResume: false → true. Pi now persists sessions under ~/.pi/agent/sessions/<encoded-cwd>/<uuid>.jsonl by default — same pattern Claude and Codex use for their respective stores, same blast radius as those providers. Flow: - No resumeSessionId → SessionManager.create(cwd) (fresh, persisted) - resumeSessionId + match in SessionManager.list(cwd) → open(path) - resumeSessionId + no match → fresh session + system warning ("⚠️ Could not resume Pi session. Starting fresh conversation.") Matches Codex's resume_thread_failed fallback at packages/providers/src/codex/provider.ts:553-558. The sessionId flows back to Archon via the terminal `result` chunk — bridgeSession annotates it with session.sessionId unconditionally so Archon's orchestrator can persist it and pass it as resumeSessionId on the next turn. Same mechanism used for Claude/Codex. Cross-cwd resume (e.g. worktree switch) is deliberately not supported in this pass: list(cwd) scans only the current cwd's session dir. 
A workflow that changes cwd mid-run lands on a fresh session, which matches Pi's mental model. Bridge sessionId annotation uses session.sessionId, which Pi always populates (UUID) — so no special-case for inMemory sessions is needed. Factored the resolver into session-resolver.ts (5 unit tests): - no id → create - id + match → open - id + no match → create with resumeFailed: true - list() throws → resumeFailed: true (graceful) - empty-string id → treated as "no resume requested" Integration tests in provider.test.ts add 3 cases: - resume-not-found yields warning + calls create - resume-match calls open with the file path, no warning - result chunk always carries sessionId Verified live end-to-end against Anthropic OAuth: - first call → sessionId 019d...; model replies "noted" - second call with that sessionId → "resumed: true" in logs; model correctly recalls prior turn ("Crimson.") - bogus sessionId → "⚠️ Could not resume..." warning + fresh UUID * refactor(providers,core): generalize community-provider registration Addresses the community-pattern regression flagged in the PR #1270 review: a second community provider should require editing only its own directory, not seven files across providers/ + core/ + cli/ + server/. Three changes: 1. Drop typed `pi` slot from AssistantDefaultsConfig + AssistantDefaults. Community providers live behind the generic `[string]` index that `ProviderDefaultsMap` was explicitly designed to provide. The typed claude/codex slots stay — they give IDE autocomplete for built-in config access without `as` casts, which was the whole reason the intersection exists. Community providers parse their own config via Record<string, unknown> anyway, so the typed slot added no real parser safety. 2. Loop-based getDefaults + mergeAssistantDefaults. No more hardcoded `pi: {}` spreads. getDefaults() seeds from `getRegisteredProviders()`; mergeAssistantDefaults clones every slot present in `base`. 
Adding a new provider requires zero edits to this function.
3. New `registerCommunityProviders()` aggregator in registry.ts.
   Entrypoints (CLI, server, config-loader) call ONE function after
   `registerBuiltinProviders()` rather than one call per community provider.
   Adding a new community provider is now a single-line edit to
   registerCommunityProviders().
This makes Pi (and future community providers) actually behave like Phase 2
(#1195) advertised: drop the implementation under
packages/providers/src/community/<id>/, export a `register<Id>Provider`,
add one line to the aggregator.
Tests:
- New `registerCommunityProviders` suite (2 tests: registers pi, idempotent).
- config-loader.test updated: assert built-in slots explicitly rather than
  exhaustive map shape.
No functional change for Pi end-users. Purely structural.
* fix(providers/pi,core): correctness + hygiene fixes from PR #1270 review
Addresses six of the review's important findings, all within the same PR branch:
1. envInjection: false → true
   The provider reads requestOptions.env on every call (for API-key
   passthrough). Declaring the capability false caused a spurious
   dag-executor warning for every Pi user who configured codebase env
   vars — which is the MAIN auth path. Flipping to true removes the
   false positive.
2. toSafeAssistantDefaults: denylist → allowlist
   The old shape deleted `additionalDirectories`, `settingSources`,
   `codexBinaryPath` before sending defaults to the web UI. Any future
   sensitive provider field (OAuth token, absolute path, internal
   metadata) would silently leak via the `[key: string]: unknown` index
   signature. New SAFE_ASSISTANT_FIELDS map lists exactly what to expose
   per provider; unknown providers get an empty allowlist so the web UI
   sees "provider exists" but no config details.
3. AsyncQueue single-consumer invariant
   The type was documented single-consumer but unenforced. A second
   `for await` would silently race with the first over buffer + waiters.
   Added a synchronous guard in Symbol.asyncIterator that throws on
   second call — copy-paste mistakes now fail fast with a clear message
   instead of dropping items.
4. session.dispose() / session.abort() silent catches
   Both catch blocks now log at debug via a module-scoped logger so SDK
   regressions surface without polluting normal output.
5. Type scripted events as AgentSessionEvent in provider.test.ts
   Was `Record<string, unknown>` — Pi field renames would silently keep
   tests passing. Now typed against Pi's actual event union.
6. Leaked /tmp/pi-research/... path in provider.ts comment
   Local-machine path that crept in during research. Replaced with the
   upstream GitHub URL (matches convention at provider.ts:110).
Plus review-flagged simplifications:
- Extract lookupPiModel wrapper — isolates the `as unknown as` cast
  behind one searchable name.
- Hoist QueueItem → BridgeQueueItem at module scope (exported for test
  visibility; not used externally yet but enables unit testing the
  mapping in isolation if needed later).
- getRegisteredProviderNames: remove side-effecting registration calls.
  `loadConfig()` already bootstraps the registry before any caller can
  observe this helper — the hidden coupling was misleading.
Plus missing-coverage tests from the review (pr-test-analyzer):
- session.prompt() rejection → error surfaces to consumer
- pre-aborted signal → session.abort() called
- mid-stream abort → session.abort() called
- modelFallbackMessage → system chunk yielded
- AsyncQueue second-consumer → throws synchronously
No behavioral changes for end users beyond the envInjection warning fix.
* docs: Pi provider + community-provider contributor guide
Addresses the PR #1270 review's docs-impact findings: the original Pi PR
had no user-facing or contributor-facing documentation, and
architecture.md still referenced the pre-Phase-2 factory.ts pattern
(factory.ts was deleted in #1195).
1. packages/docs-web/src/content/docs/reference/architecture.md
   - Replace stale factory.ts references with the registry pattern.
   - Update inline IAgentProvider block: add getCapabilities, add options
     parameter.
   - Rewrite MessageChunk block as the actual discriminated union (was a
     placeholder with optional fields that didn't match the current type).
   - "Adding a New AI Agent Provider" checklist now distinguishes
     built-in (register in registerBuiltinProviders) from community
     (separate guide). Links to the new contributor guide.
2. packages/docs-web/src/content/docs/contributing/adding-a-community-provider.md (new)
   - Step-by-step guide using Pi as the reference implementation.
   - Covers: directory layout, capability discipline (start false, flip
     one at a time), provider class skeleton, registration via
     aggregator, test isolation (Bun mock.module pollution), what NOT to
     do (no edits to AssistantDefaultsConfig, no direct registerProvider
     from entrypoints, no overclaiming capabilities).
3. packages/docs-web/src/content/docs/getting-started/ai-assistants.md
   - New "Pi (Community Provider)" section: install, OAuth + API-key
     table per Pi backend, model ref format, workflow examples,
     capability matrix showing what Pi supports (session resume, tool
     restrictions, effort/thinking, skills, system prompt, envInjection)
     and what it doesn't (MCP, hooks, structured output, cost control,
     fallback model, sandbox).
4. .env.example
   - New Pi section with commented env vars for each supported backend
     (ANTHROPIC_API_KEY through HUGGINGFACE_API_KEY), each paired with
     its Pi provider id. OAuth flow (pi /login → auth.json) is explicitly
     called out — Archon reads that file too.
5. CHANGELOG.md
   - Unreleased entry for Pi, registerCommunityProviders aggregator, and
     the new contributor guide. |
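The synchronous single-consumer guard described for AsyncQueue can be sketched as follows. This is an illustrative shape under stated assumptions, not Archon's actual implementation: the class body, error message, and the fact that the real queue also parks waiters for late-arriving items are all simplified here to show only where the guard sits.

```typescript
// Hypothetical sketch of a single-consumer async queue guard.
// The check lives in Symbol.asyncIterator itself (not inside an async
// generator body), so a second `for await` throws synchronously instead
// of silently racing the first consumer over the buffer.
class AsyncQueue<T> {
  private buffer: T[] = [];
  private consumed = false;

  push(item: T): void {
    this.buffer.push(item);
  }

  [Symbol.asyncIterator](): AsyncIterator<T> {
    if (this.consumed) {
      throw new Error("AsyncQueue supports exactly one consumer");
    }
    this.consumed = true;
    return this.drain();
  }

  private async *drain(): AsyncGenerator<T> {
    // Real implementation would also await items pushed later; this
    // sketch only drains a pre-filled buffer.
    while (this.buffer.length > 0) {
      yield this.buffer.shift() as T;
    }
  }
}
```

Putting the guard in `[Symbol.asyncIterator]` rather than in the generator body is what makes the failure synchronous: an async generator would defer the throw until the first `next()` call.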
||
|
|
301a139e5a
|
fix(core/test): split connection.test.ts from DB-test batch to avoid mock pollution (#1269)
messages.test.ts uses mock.module('./connection', ...) at module-load time.
Per CLAUDE.md:131 (Bun issue oven-sh/bun#7823), mock.module() is process-
global and irreversible. When Bun pre-loads all test files in a batch, the
mock shadows the real connection module before connection.test.ts runs,
causing getDatabaseType() to always return the mocked value regardless of
DATABASE_URL.
Move connection.test.ts into its own `bun test` invocation immediately
after postgres.test.ts (which runs alone) and before the big DB/utils/
config/state batch that contains messages.test.ts. This follows the same
isolation pattern already used for command-handler, clone, postgres, and
path-validation tests.
|
||
|
|
bed36ca4ad
|
fix(workflows): add word boundary to context variable substitution regex (#1256)
* fix(workflows): add word boundary to context variable substitution regex (#1112)
Variable substitution for $CONTEXT, $EXTERNAL_CONTEXT, and $ISSUE_CONTEXT
was matching as a prefix of longer identifiers like $CONTEXT_FILE,
silently corrupting bash node scripts. Added negative lookahead
(?![A-Za-z0-9_]) to CONTEXT_VAR_PATTERN_STR so only exact variable names
are substituted.
Changes:
- Add negative lookahead to CONTEXT_VAR_PATTERN_STR regex in executor-shared.ts
- Add regression test for prefix-match boundary case
Fixes #1112
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test(workflows): add missing boundary cases for context variable substitution
Add three new test cases that complete coverage of the word-boundary fix
from #1112: $ISSUE_CONTEXT with suffix variants, $ISSUE_CONTEXT with
multiple suffixes, and contextSubstituted=false for suffix-only prompts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
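A minimal sketch of the lookahead fix: the constant name comes from the commit, but the surrounding substitution function is an illustrative assumption, not the executor-shared.ts code.

```typescript
// The negative lookahead (?![A-Za-z0-9_]) rejects any identifier
// character after the variable name, so $CONTEXT no longer matches as a
// prefix of $CONTEXT_FILE.
const CONTEXT_VAR_PATTERN_STR =
  "\\$(CONTEXT|EXTERNAL_CONTEXT|ISSUE_CONTEXT)(?![A-Za-z0-9_])";

function substituteContextVars(
  script: string,
  values: Record<string, string>,
): string {
  return script.replace(
    new RegExp(CONTEXT_VAR_PATTERN_STR, "g"),
    // Unknown variables are left untouched rather than replaced with "".
    (match, name: string) => values[name] ?? match,
  );
}
```

With the lookahead in place, `echo $CONTEXT $CONTEXT_FILE` substitutes only the first token and leaves `$CONTEXT_FILE` for the shell.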
||
|
|
df828594d7 |
fix(test): normalize on-disk content to LF in bundled-defaults test
Companion to
|
||
|
|
9dd57b2f3c |
fix(web): unify Add Project URL/path classification across UI entry points
Settings → Projects Add Project only submitted { path }, so GitHub URLs
entered there failed even though the API and the Sidebar Add Project
already accepted them. Closes #1108.
Changes:
- Add packages/web/src/lib/codebase-input.ts: shared getCodebaseInput()
helper returning a discriminated { path } | { url } union (re-exported
from api.ts for convenience).
- Use the helper from all three Add Project entry points: Sidebar,
Settings, and ChatPage. Removes three divergent inline heuristics.
- SettingsPage: rename addPath → addValue (state now holds either URL
or local path) and update placeholder text.
- Tests: cover https://, git@ shorthand, ssh://, git://, whitespace,
unix/relative/home/Windows/UNC paths.
- Docs: document the unified Add Project entry point in adapters/web.md.
Heuristic flips from "assume URL unless explicitly local" to "assume
local unless explicitly remote" — only inputs starting with https?://,
ssh://, git@, or git:// are sent as { url }; everything else is sent
as { path }. The server already resolves tilde/relative paths.
Co-authored-by: Nguyen Huu Loc <lockbkbang@gmail.com>
|
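The flipped heuristic ("assume local unless explicitly remote") can be sketched like this; a simplified illustration of the described behavior, not the actual `codebase-input.ts` code.

```typescript
// Discriminated union: a codebase is added either by local path or by
// remote URL, never both.
type CodebaseInput = { path: string } | { url: string };

function getCodebaseInput(raw: string): CodebaseInput {
  const value = raw.trim();
  // Only these four prefixes are classified as remote; everything else
  // (unix, relative, ~, Windows, UNC paths) is sent as { path } and the
  // server resolves tilde/relative paths.
  if (/^(https?:\/\/|ssh:\/\/|git:\/\/|git@)/.test(value)) {
    return { url: value };
  }
  return { path: value };
}
```

The previous heuristic inverted the default and mis-sent unusual local paths as URLs; an explicit prefix allowlist keeps the remote case narrow and predictable.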
||
|
|
86e4c8d605
|
fix(bundled-defaults): auto-generate import list, emit inline strings (#1263)
* fix(bundled-defaults): auto-generate import list, emit inline strings
Root-cause fix for bundle drift (15 commands + 7 workflows previously
missing from binary distributions) and a prerequisite for packaging
@archon/workflows as a Node-loadable SDK.
The hand-maintained `bundled-defaults.ts` import list is replaced by
`scripts/generate-bundled-defaults.ts`, which walks
`.archon/{commands,workflows}/defaults/` and emits a generated source
file with inline string literals. `bundled-defaults.ts` becomes a thin
facade that re-exports the generated records and keeps the
`isBinaryBuild()` helper.
Inline strings (via JSON.stringify) replace Bun's
`import X from '...' with { type: 'text' }` attributes. The binary build
still embeds the data at compile time, but the module now loads under
Node too — removing SDK blocker #2.
- Generator: `scripts/generate-bundled-defaults.ts` (+ `--check` mode for CI)
- `package.json`: `generate:bundled`, `check:bundled`; wired into `validate`
- `build-binaries.sh`: regenerates defaults before compile
- Test: `bundle completeness` now derives expected set from on-disk files
- All 56 defaults (36 commands + 20 workflows) now in the bundle
* fix(bundled-defaults): address PR review feedback
Review: https://github.com/coleam00/Archon/pull/1263#issuecomment-4262719090
Generator:
- Guard against .yaml/.yml name collisions (previously silent overwrite)
- Add early access() check with actionable error when run from wrong cwd
- Type top-level catch as unknown; print only message for Error instances
- Drop redundant /* eslint-disable */ emission (global ignore covers it)
- Fix misleading CI-mechanism claim in header comment
- Collapse dead `if (!ext) continue` guard into a single typed pass
Scripts get real type-checking + linting:
- New scripts/tsconfig.json extending root config
- type-check now includes scripts/ via `tsc --noEmit -p scripts/tsconfig.json`
- Drop `scripts/**` from eslint ignores; add to projectService file scope
Tests:
- Inline listNames helper (Rule of Three)
- Drop redundant toBeDefined/typeof assertions; the Record<string, string>
type plus length > 50 already cover them
- Add content-fidelity round-trip assertion (defense against generator
content bugs, not just key-set drift)
Facade comment: drop dead reference to .claude/rules/dx-quirks.md.
CI: wire `bun run check:bundled` into .github/workflows/test.yml so the
header's CI-verification claim is truthful.
Docs: CLAUDE.md step count four→five; add contributor bullet about
`bun run generate:bundled` in the Defaults section and CONTRIBUTING.md.
* chore(e2e): bump Codex model to gpt-5.2
gpt-5.1-codex-mini is deprecated and unavailable on ChatGPT-account Codex
auth. Plain gpt-5.2 works. Verified end-to-end:
- e2e-codex-smoke: structured output returns {category:'math'}
- e2e-mixed-providers: claude+codex both return expected tokens
|
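The inline-string emission can be sketched as below. The function name and output shape are illustrative, but the core idea is the one the commit describes: JSON.stringify turns each file body into a valid string literal, so the generated module loads under both Bun and Node, with no Bun-only `with { type: 'text' }` import attributes.

```typescript
// Emit a generated module with all default files inlined as string
// literals. JSON.stringify handles escaping (newlines, quotes, unicode).
function emitBundledDefaults(files: Record<string, string>): string {
  const entries = Object.entries(files)
    .map(([name, body]) => `  ${JSON.stringify(name)}: ${JSON.stringify(body)},`)
    .join("\n");
  return [
    "// AUTO-GENERATED by scripts/generate-bundled-defaults.ts; do not edit.",
    "export const bundledDefaults: Record<string, string> = {",
    entries,
    "};",
    "",
  ].join("\n");
}
```

Because the generator walks the on-disk defaults directory, the import list can never drift from the files again; a `--check` mode in CI then only needs to diff the emitted source against the committed one.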
||
|
|
d535c832e3
|
feat(telemetry): anonymous PostHog workflow-invocation tracking (#1262)
* feat(telemetry): add anonymous PostHog workflow-invocation tracking
Emits one `workflow_invoked` event per run with workflow
name/description, platform, and Archon version. Uses a stable random UUID
persisted to `$ARCHON_HOME/telemetry-id` for distinct-install counting,
with `$process_person_profile: false` to stay in PostHog's anonymous
tier. Opt out with `ARCHON_TELEMETRY_DISABLED=1` or `DO_NOT_TRACK=1`.
Self-host via `POSTHOG_API_KEY` / `POSTHOG_HOST`.
Closes #1261
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* test(telemetry): stop leaking test events to production PostHog
The `telemetry-id preservation` test exercised the real capture path with
the embedded production key, so every `bun run validate` published a
tombstone `workflow_name: "w"` event. Redirect POSTHOG_HOST to loopback
so the flush fails silently; bump test timeout to accommodate the
retry-then-give-up window.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(telemetry): silence posthog-node stderr leak on network failure
The PostHog SDK's internal logFlushError() writes 'Error while flushing
PostHog' directly to stderr via console.error on any network or HTTP
error, bypassing logger config. For a fire-and-forget telemetry path this
leaked stack traces to users' terminals whenever PostHog was unreachable
(offline, firewalled, DNS broken, rate-limited).
Pass a silentFetch wrapper to the PostHog client that masks failures as
fake 200 responses. The SDK never sees an error, so it never logs.
Original failure is still recorded at debug level for diagnostics.
Side benefit: shutdown is now fast on network failure (no retry loop), so
offline CLI commands no longer hang ~10s on exit.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* test(telemetry): make id-preservation test deterministic
Replace the fire-and-forget capture + setTimeout + POSTHOG_HOST-loopback
dance with a direct synchronous call to getOrCreateTelemetryId().
Export the function with an @internal marker so tests can exercise the id
path without spinning up the PostHog client. No network, no timer, no
flake.
Addresses CodeRabbit feedback on #1262.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> |
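The silentFetch wrapper reads roughly like this. The commit confirms a custom fetch is passed to the PostHog client; the wrapper body, the fake `{ status: 1 }` response payload, and the debug callback are illustrative assumptions.

```typescript
// Mask network failures as fake 200s so the SDK's internal
// logFlushError() never fires and nothing reaches stderr.
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>;

function makeSilentFetch(
  realFetch: FetchLike,
  logDebug: (err: unknown) => void = () => {},
): FetchLike {
  return async (url, init) => {
    try {
      return await realFetch(url, init);
    } catch (err) {
      logDebug(err); // original failure stays visible at debug level
      // Fake success: the SDK sees no error, so no stderr log and no
      // retry loop on shutdown.
      return new Response(JSON.stringify({ status: 1 }), { status: 200 });
    }
  };
}
```

The same trick yields the fast-shutdown side benefit the commit mentions: with no error surfaced, the SDK's retry-then-give-up window never opens.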
||
|
|
367de7a625 |
test(ci): inject deliberate failure to verify CI red X
Injects exit 1 into e2e-deterministic bash-echo node to prove the engine fix (failWorkflowRun on anyFailed) propagates to a non-zero CLI exit code and a red X in GitHub Actions. Will be reverted in the next commit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
7721259bdc
|
fix(core): surface auth errors instead of silently dropping them (#1089)
* fix: surface auth errors instead of silently dropping them (#1076)
When the Claude OAuth refresh token is expired, the SDK yields a result
chunk with is_error=true and no session_id. Both handleStreamMode and
handleBatchMode guarded the result branch with `&& msg.sessionId`,
silently dropping the error. Users saw no response at all.
Changes:
- Remove sessionId guard from result branches in orchestrator-agent.ts
- Add isError early-exit that sends error message to user
- Add 4 OAuth patterns to AUTH_PATTERNS in claude.ts and codex.ts
- Add OAuth refresh-token handler to error-formatter.ts
- Add tests for new error-formatter branches
Fixes #1076
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add structured logging to isError path and remove overly broad auth pattern
- Add getLog().warn({ conversationId, errorSubtype }, 'ai_result_error')
  in both handleStreamMode and handleBatchMode isError branches so auth
  failures are visible server-side instead of silently swallowed
- Remove 'access token' from AUTH_PATTERNS in claude.ts and codex.ts; the
  real OAuth refresh error is already covered by 'refresh token' and
  'could not be refreshed', eliminating false-positive auth
  classification risk
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: route isError results through classifyAndFormatError with provider-specific messages
The isError path in stream/batch mode used a hardcoded generic message,
bypassing the classifyAndFormatError infrastructure. Now constructs a
synthetic Error from errorSubtype and routes through the formatter.
Error formatter updated with provider-specific auth detection:
- Claude: OAuth token refresh, sign-in expired → guidance to run /login
- Codex: 401 retry exhaustion → guidance to run codex login
- General: tightened patterns (removed broad 'auth error' substring match)
Also persists the session ID before early-returning on isError.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
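The guard removal can be sketched as follows. The chunk shape, handler name, and error-subtype string are simplified assumptions rather than the orchestrator's actual types; only the control flow (persist session first, then early-exit on isError regardless of sessionId) comes from the commit.

```typescript
// Before: `if (msg.type === "result" && msg.sessionId)` dropped error
// results, because auth failures carry is_error=true and NO session_id.
interface ResultChunk {
  type: "result";
  isError?: boolean;
  sessionId?: string;
  errorSubtype?: string;
}

function handleResultChunk(
  msg: ResultChunk,
  sendToUser: (text: string) => void,
  persistSession: (id: string) => void,
): "error" | "ok" {
  // Persist the session ID (when present) before any early return.
  if (msg.sessionId) persistSession(msg.sessionId);
  if (msg.isError) {
    // Early exit: the error now reaches the user instead of vanishing.
    sendToUser(`Agent error: ${msg.errorSubtype ?? "unknown"}`);
    return "error";
  }
  return "ok";
}
```

In the real fix the message is not hardcoded like this sketch; it is built by routing a synthetic Error through classifyAndFormatError for provider-specific guidance.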
||
|
|
818854474f
|
fix(workflows): stop warning about model/provider on loop nodes (#1090)
* fix(workflows): stop warning about model/provider on loop nodes (#1082)
The loader incorrectly classified loop nodes as "non-AI nodes" and warned
that model/provider fields were ignored, even though the DAG executor has
supported these fields on loop nodes since commit
|
||
|
|
a5e5d5ceeb |
fix: address review findings for grammY Telegram adapter
- Fix misleading 'unde***' log when ctx.from is undefined; use 'unknown'
  to match the Slack/Discord adapter pattern
- Log post-startup bot runtime errors before reject() (no-op after
  onStart fires but errors are now visible in logs)
- Add debug log when message is dropped due to no handler registered
- Add stop() unit test to guard against grammY API rename regressions
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
da1f8b7d97 |
fix: replace Telegraf with grammY to fix Bun TypeError crash (#1042)
Telegraf v4's internal `redactToken()` assigns to readonly
`error.message` properties, which crashes under Bun's strict ESM mode.
Telegraf is EOL.
Changes:
- Replace `telegraf` dependency with `grammy` ^1.36.0
- Migrate adapter from Telegraf API to grammY API (Bot, bot.api, bot.start)
- Use grammY's `onStart` callback pattern for async polling launch
- Preserve 409 retry logic and all existing behavior
- Update test mocks from telegraf types to grammy types
Fixes #1042
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
2732288f07
|
Merge pull request #1065 from coleam00/archon/task-fix-issue-1055
feat(core): inject workflow run context into orchestrator prompt |
||
|
|
b100cd4b48
|
Merge pull request #1064 from coleam00/archon/task-fix-issue-1054
fix(web): interleave tool calls with text during SSE streaming |
||
|
|
5acf5640c8
|
Merge pull request #1063 from coleam00/archon/task-fix-issue-1035
fix: archon setup --spawn fails on Windows with spaces in repo path |
||
|
|
68ecb75f0f
|
Merge pull request #1052 from coleam00/archon/task-fix-github-issue-1775831868291
fix(cli): send workflow dispatch/result messages for Web UI cards |
||
|
|
51b8652d43 |
fix: complete defensive chaining and add missing test coverage for PR #1052
- Fix half-applied optional chaining in WorkflowProgressCard
  refetchInterval (query.state.data?.run.status → ?.run?.status)
  preventing TypeError in polling
- Add dispatch-failure test verifying executeWorkflow still runs when
  dispatch sendMessage fails
- Add paused-workflow test proving paused guard fires before summary check
- Strengthen dispatch metadata assertion to verify workerConversationId
  format
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
882fc58f7c
|
fix: stop server startup from auto-failing in-flight workflow runs (#1216) (#1231)
* fix: stop server startup from auto-failing in-flight workflow runs (#1216)
`failOrphanedRuns()` at server startup unconditionally flipped every
`running` workflow row to `failed`, including runs actively executing in
another process (CLI / adapters). The dag-executor's between-layer status
check then bailed out of the run, exit code 1 — even though every node
had completed successfully. Same class of bug the CLI already learned
(see comment at packages/cli/src/cli.ts:256-258).
Per the new CLAUDE.md principle "No Autonomous Lifecycle Mutation Across
Process Boundaries", we don't replace the call with a timer-based
heuristic. Instead we remove it and surface running workflows to the user
with one-click actions.
Backend
- `packages/server/src/index.ts` — remove the `failOrphanedRuns()` call
  at startup. Replace with explanatory comment referencing the CLI
  precedent and the CLAUDE.md principle. The function in
  `packages/core/src/db/workflows.ts:911` is preserved for use by the
  explicit `archon workflow cleanup` command.
UI
- `packages/web/src/components/layout/TopNav.tsx` — replace the binary
  pulse dot on the Dashboard nav with a numeric count badge sourced from
  `/api/dashboard/runs` `counts.running`. Hidden when count is 0. Same
  10s polling interval as before. No animation — a steady factual count
  is honest; a pulse would imply system judgment.
- `packages/web/src/components/dashboard/ConfirmRunActionDialog.tsx`
  (new) — shadcn AlertDialog wrapper for destructive workflow-run
  actions, mirroring the codebase-delete pattern in
  `sidebar/ProjectSelector.tsx`. Caller passes the existing button as
  `trigger` slot; dialog handles open/close via Radix.
- `packages/web/src/components/dashboard/WorkflowRunCard.tsx` — replace
  4 `window.confirm()` callsites (Reject, Abandon, Cancel, Delete) with
  ConfirmRunActionDialog. Each gets a context-appropriate description.
- `packages/web/src/components/dashboard/WorkflowHistoryTable.tsx` —
  replace 1 `window.confirm()` (Delete) with the same dialog.
CHANGELOG entries under [Unreleased]: Fixed for #1216, two Changed
entries for the nav badge and dialog upgrade.
No new tests: the web package has no React component testing
infrastructure (existing `bun test` covers `src/lib/` and `src/stores/`
only). Type-check + lint + manual UI verification + the backend
reproducer are the verification levels.
Closes #1216.
* review: address PR #1231 nits — stale doc + 3 code polish
PR review surfaced one real correctness issue in docs and three small
code polish items. None block merge; addressing for cleanliness.
- packages/docs-web/src/content/docs/guides/authoring-workflows.md:486 —
  removed the "auto-marked as failed on next startup" paragraph that
  described the now-deleted behavior. Replaced with a "Crashed servers /
  orphaned runs" note pointing users at `archon workflow cleanup` and the
  dashboard Cancel/Abandon buttons; explains the auto-resume mechanism
  still works once the row reaches a terminal status.
- ConfirmRunActionDialog: narrow `onConfirm` from
  `() => void | Promise<void>` to `() => void`. All five callsites are
  synchronous wrappers around React Query mutations whose error handling
  lives at the page level (`runAction` in DashboardPage). The union
  widened the API for no current caller. Documented in the JSDoc what to
  do if an awaiting caller appears later.
- TopNav: dropped the redundant `String(runningCount)` cast in the
  aria-label — template literal coerces. Also rewrote the comment above
  the `listDashboardRuns` query: the previous version implied `limit=1`
  constrained `counts.running`; in fact `counts` is a server-side
  aggregate independent of `limit`, and `limit=1` only minimises the
  `runs` array we discard.
* review: correct remediation docs — cleanup ≠ abandon
CodeRabbit caught a factual error I introduced in the doc update:
`archon workflow cleanup` calls `deleteOldWorkflowRuns(days)` which
DELETEs old terminal rows (`completed`/`failed`/`cancelled` older than N
days) for disk hygiene. It does NOT transition stuck `running` rows. The
correct remediation for a stuck `running` row is either the dashboard's
per-row Cancel/Abandon button (already documented) or
`archon workflow abandon <run-id>` from the CLI (existing subcommand,
see packages/cli/src/cli.ts:366-374).
Fixed three locations:
- packages/docs-web/.../guides/authoring-workflows.md — replaced the
  vague "clean up explicitly" with concrete Web UI / CLI instructions and
  an explicit "Not to be confused with `archon workflow cleanup`" callout
  to close off the ambiguity CodeRabbit flagged.
- packages/server/src/index.ts — comment updated to point at the correct
  remediation (`archon workflow abandon`) and clarify that
  `archon workflow cleanup` is unrelated disk-hygiene.
- CHANGELOG.md — same correction in the [Unreleased] Fixed entry. |
||
|
|
5c8c39e5c9
|
fix(test): update stale mocks in cleanup-service 'continues processing' test (#1230) (#1232)
After PR #1034 changed worktree existence checks from execFileAsync to
fs/promises.access, the mockExecFileAsync rejections had no effect.
removeEnvironment needs getById + getCodebase mocks to proceed past the
early-return guard; otherwise envs route to report.skipped instead of
report.removed.
Replace the two stale mockExecFileAsync rejection calls with proper
mockGetById and mockGetCodebase return values for both test environments.
Fixes #1230 |
||
|
|
f61d576a4d
|
feat(isolation): auto-init submodules in worktrees (#1189)
Worktrees created via `git worktree add` do not initialize submodules —
monorepo workflows that need submodule content find empty directories.
Auto-detect `.gitmodules` and run
`git submodule update --init --recursive` after worktree creation;
classify failures through the isolation error pipeline.
Behavior:
- `.gitmodules` absent → skip silently (zero-cost probe, no effect on
  non-submodule repos)
- `.gitmodules` present → run submodule init by default (opt out via
  `worktree.initSubmodules: false`)
- submodule init or `.gitmodules` read failure → throw with classified
  error including opt-out guidance
- Only `ENOENT` on `.gitmodules` is treated as "no submodules"; other
  access errors (EACCES, EIO) surface as failures to prevent silent
  empty-dir worktrees
Changes:
- `packages/isolation/src/providers/worktree.ts` — `initSubmodules()`
  method + call site in `createWorktree()`
- `packages/isolation/src/errors.ts` — collapsed `errorPatterns` +
  `knownPatterns` into single `ERROR_PATTERNS` source of truth with
  `known: boolean` per entry; added submodule pattern with opt-out
  guidance
- `packages/isolation/src/types.ts` +
  `packages/core/src/config/config-types.ts` — new
  `initSubmodules?: boolean` config option
- `packages/docs-web/src/content/docs/reference/configuration.md` —
  documented the new option and submodule behavior
- Tests: default-on, explicit opt-in, explicit opt-out, skip-when-absent,
  fail-fast on EACCES, fail-fast on git failure, fail-fast on timeout
Credit to @halindrome for the original implementation and root-cause
mapping across #1183, #1187, #1188, #1192.
Follow-up: #1192 (codebase identity rearchitect) would retire the
cross-clone guard code in `resolver.ts` and `worktree.ts` that #1198,
#1206 added. Separate PR.
Closes #1187 |
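The ENOENT-only probe from the behavior list above can be sketched like this; the helper name is illustrative, and the real `initSubmodules()` goes on to run the git command when the probe returns true.

```typescript
import { access } from "node:fs/promises";
import { join } from "node:path";

// Only ENOENT means "no submodules". Any other access error (EACCES,
// EIO) is rethrown, so a permissions problem cannot masquerade as a repo
// without submodules and leave a silently empty worktree.
async function hasSubmodules(worktreePath: string): Promise<boolean> {
  try {
    await access(join(worktreePath, ".gitmodules"));
    return true;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "ENOENT") return false;
    throw err;
  }
}
```

Narrowing the "skip" condition to exactly one errno code is the key design choice: it keeps the probe zero-cost for non-submodule repos while making every other failure mode loud.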
||
|
|
73d9240eb3
|
fix(isolation): complete reports false success when worktree remains on disk (fixes #964) (#1034)
* fix(isolation): complete reports false success when worktree remains on disk (fixes #964)
Three changes to prevent ghost worktrees:
1. isolationCompleteCommand now checks result.worktreeRemoved — if the
   worktree was not actually removed (partial failure), it reports
   'Partial' with warnings and counts as failed, not completed.
   Previously only skippedReason was checked; a destroy that returned
   successfully but with worktreeRemoved=false would still print
   'Completed'.
2. WorktreeProvider.destroy() now runs 'git worktree prune' after removal
   to clean up stale worktree references that git may keep even after the
   directory is removed.
3. WorktreeProvider.destroy() adds post-removal verification: after git
   worktree remove, it checks 'git worktree list --porcelain' to confirm
   the worktree is actually unregistered. If still registered,
   worktreeRemoved is set back to false with a descriptive warning.
* fix: address CodeRabbit review — ghost worktree prune, partial cleanup callers, accurate messages
* test: add regression test for Partial branch in isolation complete
Exercises the !result.worktreeRemoved path (without skippedReason) that
was flagged as uncovered by CodeRabbit review. |
||
|
|
28b258286f
|
Extra backticks for markdown block to fix formatting (#1218)
of nested code blocks. |
||
|
|
81859d6842
|
fix(providers): replace Claude SDK embed with explicit binary-path resolver (#1217)
* feat(providers): replace Claude SDK embed with explicit binary-path resolver
Drop `@anthropic-ai/claude-agent-sdk/embed` and resolve Claude Code via
CLAUDE_BIN_PATH env → assistants.claude.claudeBinaryPath config → throw
with install instructions. The embed's silent failure modes on macOS
(#1210) and Windows (#1087) become actionable errors with a documented
recovery path. Dev mode (bun run) remains auto-resolved via node_modules.
The setup wizard auto-detects Claude Code by probing the native installer
path (~/.local/bin/claude), the npm global cli.js, and PATH, then writes
CLAUDE_BIN_PATH to ~/.archon/.env. The Dockerfile pre-sets CLAUDE_BIN_PATH
so extenders using the compiled binary keep working. The release workflow
gets negative and positive resolver smoke tests.
Docs, CHANGELOG, README, .env.example, CLAUDE.md, test-release and archon
skills all updated to reflect the curl-first install story.
Retires #1210, #1087, #1091 (never merged, now obsolete). Implements #1176.
* fix(providers): only pass --no-env-file when spawning Claude via Bun/Node
`--no-env-file` is a Bun flag that prevents Bun from auto-loading `.env`
from the subprocess cwd. It is only meaningful when the Claude Code
executable is a `cli.js` file — in which case the SDK spawns it via
`bun`/`node` and the flag reaches the runtime.
When `CLAUDE_BIN_PATH` points at a native compiled Claude binary (e.g.
`~/.local/bin/claude` from the curl installer, which is Anthropic's
recommended default), the SDK executes the binary directly. Passing
`--no-env-file` then goes straight to the native binary, which rejects it
with `error: unknown option '--no-env-file'`, and the subprocess exits
with code 1.
Emit `executableArgs` only when the target is a `.js` file (dev mode or
an explicit cli.js path). Caught by end-to-end smoke testing against the
curl-installed native Claude binary.
* docs: record env-leak validation result in provider comment
Verified end-to-end with sentinel `.env` and `.env.local` files in a
workflow CWD that the native Claude binary (curl installer) does not
auto-load `.env` files. With Archon's full spawn pathway and the parent
env stripped, the subprocess saw both sentinels as UNSET. The first-layer
protection in `@archon/paths` (#1067) handles the inheritance leak;
`--no-env-file` only matters for the Bun-spawned cli.js path, where it is
still emitted.
* chore(providers): cleanup pass — exports, docs, troubleshooting
Final-sweep cleanup tied to the binary-resolver PR:
- Mirror Codex's package surface for the new Claude resolver: add a
  `./claude/binary-resolver` subpath export and re-export
  `resolveClaudeBinaryPath` + `claudeFileExists` from the package index.
  Renames the previously single `fileExists` re-export to
  `codexFileExists` for symmetry; nothing outside the providers package
  was importing it.
- Add a "Claude Code not found" entry to the troubleshooting reference
  doc with platform-specific install snippets and pointers to the AI
  Assistants binary-path section.
- Reframe the example claudeBinaryPath in reference/configuration.md away
  from cli.js-only language; it accepts either the native binary or
  cli.js.
* test+refactor(providers, cli): address PR review feedback
Two test gaps and one doc nit from the PR review (#1217):
- Extract the `--no-env-file` decision into a pure exported helper
  `shouldPassNoEnvFile(cliPath)` so the native-binary branch is unit
  testable without mocking `BUNDLED_IS_BINARY` or running the full
  sendQuery pathway. Six new tests cover undefined, cli.js, native binary
  (Linux + Windows), Homebrew symlink, and suffix-only matching. Also
  adds a `claude.subprocess_env_file_flag` debug log so the
  security-adjacent decision is auditable.
- Extract the three install-location probes in setup.ts into exported
  wrappers (`probeFileExists`, `probeNpmRoot`, `probeWhichClaude`) and
  export `detectClaudeExecutablePath` itself, so the probe order can be
  spied on. Six new tests cover each tier winning, fall-through ordering,
  npm-tier skip when not installed, and the
  which-resolved-but-stale-path edge case.
- CLAUDE.md `claudeBinaryPath` placeholder updated to reflect that the
  field accepts either the native binary or cli.js (the example value was
  previously `/absolute/path/to/cli.js`, slightly misleading now that the
  curl-installer native binary is the default).
Skipped from the review by deliberate scope decision:
- `resolveClaudeBinaryPath` async-with-no-await: matches Codex's resolver
  signature exactly. Changing only Claude breaks symmetry; if pursued, do
  both providers in a separate cleanup PR.
- `isAbsolute()` validation in parseClaudeConfig: Codex doesn't do it
  either. The resolver throws on non-existence already.
- Atomic `.env` writes in setup wizard: pre-existing pattern this PR
  touched only adjacently. File as a separate issue if needed.
- classifyError branch in dag-executor for setup errors: scope creep.
- `.env.example` "missing #" claim: false positive (verified all
  CLAUDE_BIN_PATH lines have proper comment prefixes).
* fix(test): use path.join in Windows-compatible probe-order test
The "tier 2 wins (npm cli.js)" test hardcoded forward-slash path
comparisons, but `path.join` produces backslashes on Windows. This caused
the Windows CI leg of the test suite to fail while macOS and Linux
passed. Use `path.join` for both the mock return value and the
expectation so the separator matches whatever the platform produces. |
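The extracted decision helper plausibly reduces to a suffix check; a sketch under that assumption. The `.js`-suffix rule and case/suffix handling follow the commit's test list (cli.js, native binary, Homebrew symlink, suffix-only matching), while treating `undefined` as "no flag" is a guess.

```typescript
// --no-env-file only reaches a runtime when the SDK spawns cli.js via
// bun/node; a native compiled Claude binary rejects the flag with
// "error: unknown option". So: pass the flag only for .js targets.
// (Returning false for undefined is an assumption, not confirmed.)
function shouldPassNoEnvFile(cliPath: string | undefined): boolean {
  return typeof cliPath === "string" && cliPath.toLowerCase().endsWith(".js");
}
```

A pure helper like this is trivially unit-testable without mocking build-mode globals or running the full spawn pathway, which is exactly why the commit extracts it.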
||
|
|
33d31c44f1
|
fix: lock workflow runs by working_path (#1036, #1188 part 2) (#1212)
* fix: lock workflow runs by working_path (#1036, #1188 part 2)
Both bugs reduce to the same primitive: there's no enforced lock on
working_path, so two dispatches that resolve to the same filesystem
location can race. The DB row is the lock token; pending/running/paused
are "lock held"; terminal statuses release.
Changes:
- getActiveWorkflowRunByPath includes `pending` (with 5-min stale-orphan
age window), accepts excludeId + selfStartedAt, and orders by
(started_at ASC, id ASC) for a deterministic older-wins tiebreaker.
Eliminates the both-abort race where two near-simultaneous dispatches
with similar timestamps could mutually abort each other.
- Move the executor's guard call site to AFTER workflowRun is finalized
(preCreated, resumed, or freshly created). This guarantees we always
have self-ID + started_at to pass to the lock query.
- On guard fire after row creation: mark self as 'cancelled' so we don't
leave a zombie pending row that would then become its own lock holder.
- New error message includes workflow name, duration, short run id, and
three concrete next-action commands (status / cancel / different
branch). Replaces the vague "Workflow already running".
- Resume orphan fix: when executor activates a resumable run, mark the
orchestrator's pre-created row as 'cancelled'. Without this, every
resume leaks a pending row that would block the user's own
back-to-back resume until the 5-min stale window.
- New formatDuration helper for the error message (8 unit tests).
Tests:
- 5 new tests in db/workflows.test.ts: pending in active set, age window,
excludeId exclusion, tiebreaker SQL shape, ordering.
- 5 new tests in executor.test.ts: self-id passed to query, self-cancel
on guard fire, new message format, resume orphan cancellation,
resume proceeds even if orphan cancel fails.
- Updated 2 executor-preamble tests for new structural behavior
(row-then-guard, new message format).
- 8 new tests for formatDuration.
Deferred (kept scope tight):
- Worktree-layer advisory lockfile (residual #1188.2 microsecond race
where both dispatches reach provider.create — bounded by git's own
atomicity for `worktree add`).
- Startup cleanup of pre-existing stale pending rows (5-min age window
makes them harmless).
- DB partial UNIQUE constraint migration (code-only is sufficient).
Fixes #1036
Fixes #1188 (part 2)
* fix: SQLite Date binding + UTC timestamp parse for path lock guard
Two issues found during E2E smoke testing:
1. bun:sqlite rejects Date objects as bindings ("Binding expected
string, TypedArray, boolean, number, bigint or null"). Serialize
selfStartedAt to ISO string before passing — PostgreSQL accepts
ISO strings for TIMESTAMPTZ comparison too.
2. SQLite returns datetimes as plain strings without timezone suffix
("YYYY-MM-DD HH:MM:SS"), and JS new Date() parses such strings as
local time. The blocking message was showing "running 3h" for
workflows started seconds ago in a UTC+3 timezone.
Added parseDbTimestamp helper that:
- Returns Date.getTime() unchanged for Date inputs (PG path)
- Treats SQLite-style strings as UTC by appending Z
Used at both call sites: the lock query (selfStartedAt) and the
blocking message duration.
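The helper's described behavior can be sketched as follows (the regex and space-to-T normalization are illustrative choices, not necessarily the shipped code):

```typescript
// Sketch of parseDbTimestamp: Date objects (the PostgreSQL path) pass
// through unchanged; SQLite-style "YYYY-MM-DD HH:MM:SS" strings carry no
// timezone marker and are treated as UTC by appending "Z"; strings with an
// explicit Z or +/-HH:MM offset are parsed as-is.
function parseDbTimestamp(value: Date | string): number {
  if (value instanceof Date) return value.getTime();
  const normalized = value.replace(" ", "T");
  const hasZone = /(?:Z|[+-]\d{2}:\d{2})$/.test(normalized);
  return new Date(hasZone ? normalized : normalized + "Z").getTime();
}
```

Without the appended "Z", `new Date("2026-01-01 00:00:00")` parses as local time, which is exactly the "running 3h" symptom in a UTC+3 timezone.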
Tests:
- 4 new tests in duration.test.ts for parseDbTimestamp covering
Date input, SQLite UTC interpretation, explicit Z, and explicit
+/-HH:MM offsets.
- Updated workflows.test.ts assertion for ISO serialization.
E2E smoke verified end-to-end:
- Sanity (single dispatch) succeeds.
- Two concurrent --no-worktree dispatches: one wins, one blocked
with actionable message showing correct "Xs" duration.
- Resume + back-to-back resume both succeed (orphan correctly
cancelled when resume activates).
* fix: address review — resume timestamp, lock-leak paths, status copy
CodeRabbit review on #1212 surfaced three real correctness gaps:
CRITICAL — resumeWorkflowRun preserved historical started_at, letting
a resumed row sort ahead of a currently-active holder in the lock
query's older-wins tiebreaker. Two active workflows could end up on
the same working_path. Fix: refresh started_at to NOW() in
resumeWorkflowRun. Original creation time is recoverable from
workflow_events history if needed for analytics.
MAJOR — lock-leak failure paths:
- If resumeWorkflowRun() throws, the orchestrator's pre-created row
was left as 'pending' until the 5-min stale window. Fix: cancel
preCreatedRun in the resume catch.
- If getActiveWorkflowRunByPath() throws, workflowRun (possibly
already promoted to 'running' via resume) was left active with no
auto-cleanup. Fix: cancel workflowRun in the guard catch.
MINOR — the blocking message always said "running" but the lock
query returns running, paused, AND fresh-pending rows. Telling a
user to "wait for it to finish" on a paused run (waiting on user
approval) would block them indefinitely. Fix: status-aware copy:
- paused: "paused waiting for user input" + approve/reject actions
- pending: "starting" verb
- running: keep current
Tests:
- New: resume refreshes started_at (asserts SQL contains
`started_at = NOW()`)
- New: cancels preCreatedRun when resumeWorkflowRun throws
- New: cancels workflowRun when guard query throws
- New: paused message uses approve/reject actions, NOT "wait"
- New: pending message uses "starting" verb
- New: running message uses default copy
- Updated: existing tests for new error string ("already active"
reflects status-aware semantics, not just "running")
Note: the user-facing error string changed from "already running on
this path" to "already active on this path (status)". Internal use
only — surfaced via getResult().error, not directly to users.
* fix: SQLite tiebreaker dialect bug + paired self struct + UX polish
CodeRabbit second review found one critical issue and several polish
items not addressed in
5a4541b391
fix: route canonical path failures through blocked classification (#1211)
Follow-up to #1206 review: the early getCanonicalRepoPath() wrap in resolve() threw directly, escaping the classification flow that createNewEnvironment uses. Permission errors, malformed worktree pointers, ENOENT, etc. surfaced as unclassified crashes instead of becoming an actionable `blocked` result.
Mirror createNewEnvironment's contract:
- isKnownIsolationError → return { status: 'blocked', reason: 'creation_failed', userMessage: classifyIsolationError(err) + suffix }
- unknown errors → throw (programming bugs stay visible as crashes, not silent isolation failures)
Adds two tests in resolver.test.ts:
- EACCES classifies to "Permission denied" blocked message
- Unknown error propagates as throw
Addresses CodeRabbit review comment on #1206.
fd3f043125
fix: extend worktree ownership guard to resolver adoption paths (#1206)
* fix: extend worktree ownership guard to resolver adoption paths (#1183, #1188)
PR #1198 guarded WorktreeProvider.findExisting(), but IsolationResolver has three earlier adoption paths that bypass the provider layer:
- findReusable (DB lookup by workflow identity)
- findLinkedIssueEnv (cross-reference via linked issues)
- tryBranchAdoption (PR branch discovery)
Two clones of the same remote share codebase_id (identity is derived from owner/repo). Without these guards, clone B silently adopts clone A's worktree via any of the three paths.
Changes:
- Extract verifyWorktreeOwnership from WorktreeProvider (private) to @archon/git/src/worktree.ts as an exported function, sitting next to getCanonicalRepoPath which parses the same .git file format
- Call the shared function from all three resolver paths; throw on cross-clone mismatch (DB rows are preserved — they legitimately belong to the other clone)
- Compute canonicalRepoPath once at the top of resolve()
- Six new tests in resolver.test.ts covering each guarded path's cross-checkout and same-clone behaviors
Fixes #1183
Fixes #1188 (part 1 — cross-checkout; part 2 parallel collision deferred to follow-up alongside #1036)
* fix: address PR review — polish, observability, secondary gap, docs
Addresses the multi-agent review on #1206:
Code fixes:
- worktree.adoption_refused_cross_checkout log event renamed to match CLAUDE.md {domain}.{action}_{state} convention
- verifyWorktreeOwnership now preserves err.code and err via { cause } when wrapping fs errors, so classifyIsolationError is robust to Node message format changes
- Structured fields (codebaseId, canonicalRepoPath) added to all cross-clone rejection logs for incident debugging
- Wrap getCanonicalRepoPath at top of resolve() with classified error instead of letting it propagate as an unclassified crash
- Extract assertWorktreeOwnership helper on IsolationResolver — centralizes warn-then-rethrow contract, removes duplication
- Dedupe toWorktreePath(existing.working_path) calls in resolver paths
- Add code comment on findLinkedIssueEnv explaining why throw-on-first is intentional (user decision — surfaces anomaly instead of masking)
Secondary gap closed:
- WorktreeProvider.findExisting PR-branch adoption path (findWorktreeByBranch) now also verifies ownership — same class of bug as the main path, just via a different lookup
Tests:
- 8 new unit tests for verifyWorktreeOwnership in @archon/git (matching pointer, different clone, EISDIR/ENOENT errno preservation, submodule pointer, corrupted .git, trailing-slash normalization, cause chain)
- tryBranchAdoption cross-clone test now asserts store.create was never called (symmetry with paths 1+2 asserting updateStatus)
- New test for cross-clone rejection in the PR-branch-adoption secondary path in worktree.test.ts
Docs:
- CHANGELOG.md Unreleased entry for the cross-clone fix series
- troubleshooting.md "Worktree Belongs to a Different Clone" section documenting all four new error patterns with resolution steps and pointer to #1192 for the architectural fix
* fix(git): use raw .git pointer in cross-clone error message
verifyWorktreeOwnership previously called path.resolve() on the gitdir path before embedding it in the error message. On Windows, resolve() prepends a drive letter to a POSIX-style path (e.g., /other/clone → C:\other\clone), which:
1. Misled users by showing a path that doesn't match what's actually in their .git file
2. Broke a Windows-only test asserting the error contains the literal /other/clone path
Compare on resolved paths (correct — normalizes trailing slashes and relative components for the equality check) but display the raw match in the error message (recognizable to the user).
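The ownership check above can be sketched as a pure function. A linked worktree's .git is a plain file containing "gitdir: <path>" that points into the parent clone's .git/worktrees/<name>; the function name, signature, and return shape here are hypothetical (the real verifyWorktreeOwnership lives in @archon/git/src/worktree.ts and reads the file itself):

```typescript
import path from "node:path";

// Hypothetical cross-clone check: compare on resolved paths (normalizes
// trailing slashes), but keep the raw gitdir pointer for the error message
// so it matches what the user actually sees in their .git file.
function checkWorktreePointer(
  pointerFileContent: string,
  canonicalRepoPath: string,
): { ok: boolean; rawGitdir: string } {
  const match = pointerFileContent.match(/^gitdir:\s*(.+)$/m);
  if (!match) throw new Error("not a linked worktree: missing gitdir pointer");
  const rawGitdir = match[1].trim();
  // <clone>/.git/worktrees/<name> → up three levels to the owning clone
  const owningClone = path.resolve(rawGitdir, "..", "..", "..");
  return { ok: owningClone === path.resolve(canonicalRepoPath), rawGitdir };
}
```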
af9ed84157
fix: prevent worktree isolation bypass via prompt and git-level adoption (#1198)
* fix: prevent worktree isolation bypass via prompt and git-level adoption (#1193, #1188)
Three fixes for workflows operating on wrong branches:
- archon-implement prompt: replace ambiguous branch table with decision tree that trusts the worktree isolation system, uses $BASE_BRANCH explicitly, and instructs AI to never switch branches
- WorktreeProvider.findExisting: verify worktree's parent repo matches the request before adopting, preventing cross-clone adoption
- WorktreeProvider.createNewBranch: reset stale orphan branches to the intended start-point instead of silently inheriting old commits
Fixes #1193
Relates to #1188
* fix: address PR review — strict worktree verification, align sibling prompts
Address CodeRabbit + self-review findings on #1198:
Code fixes:
- findExisting now throws on cross-checkout or unverifiable state instead of returning null, avoiding a confusing cascade through createNewBranch
- verifyWorktreeOwnership handles .git errors precisely: ENOENT/EACCES/EIO throw a fail-fast error; EISDIR (full checkout at path) throws a clear "not a worktree" error; unmatched gitdir (submodule, malformed) throws
- Path comparison uses resolve() to normalize trailing slashes
- Added classifyIsolationError patterns so new errors produce actionable user messages
Test fixes:
- mockClear readFile/rm in afterEach
- New tests: cross-checkout throws, EISDIR throws, EACCES throws, submodule pointer throws, trailing-slash normalization, branch -f reset failure propagates without retry
- Updated existing tests that relied on permissive adoption to provide valid matching gitdir
Prompt fixes (sweep of all default commands):
- archon-implement.md: clarify "never switch branches" applies to worktree context; non-worktree branch creation still allowed
- archon-fix-issue.md + archon-implement-issue.md: aligned decision tree with archon-implement pattern; use $BASE_BRANCH instead of MAIN/MASTER
- archon-plan-setup.md: converted table to ordered decision tree with IN WORKTREE? first; removed ambiguous "already on correct feature branch" row
d6e24f5075
feat: Phase 2 — community-friendly provider registry system (#1195)
* feat: replace hardcoded provider factory with typed registry system
Replace the built-in-only factory switch with a typed ProviderRegistration
registry where entries carry metadata (displayName, capabilities,
isModelCompatible) alongside the factory function. This enables community
providers to register without modifying core code.
- Add ProviderRegistration and ProviderInfo types to contract layer
- Create registry.ts with register/get/list/clear API, delete factory.ts
- Bootstrap registerBuiltinProviders() at server and CLI entrypoints
- Widen provider unions from 'claude' | 'codex' to string across schemas,
config types, deps, executors, and API validation
- Replace hardcoded model-validation with registry-driven isModelCompatible
and inferProviderFromModel (built-in only inference)
- Add GET /api/providers endpoint returning registry metadata
- Dynamic provider dropdowns in Web UI (BuilderToolbar, NodeInspector,
WorkflowBuilder, SettingsPage) via useProviders hook
- Dynamic provider selection in CLI setup command
- Registry test suite covering full lifecycle
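A minimal sketch of the typed registry this sub-commit describes. The field names (displayName, capabilities, isModelCompatible) and fail-fast lookup follow the message; everything else, including the capability shape and factory signature, is illustrative rather than Archon's actual API:

```typescript
// Hypothetical provider registry: entries carry metadata alongside the
// factory, so community providers can register without touching core code.
interface ProviderRegistration {
  id: string;
  displayName: string;
  builtIn: boolean;
  capabilities: { envInjection: boolean };
  isModelCompatible: (model: string) => boolean;
  createProvider: () => unknown; // factory, replacing the old switch
}

const registry = new Map<string, ProviderRegistration>();

function registerProvider(reg: ProviderRegistration): void {
  registry.set(reg.id, reg);
}

function getProvider(id: string): ProviderRegistration {
  const reg = registry.get(id);
  // fail fast on unknown providers instead of silently defaulting
  if (!reg) throw new Error(`Unknown provider: ${id}`);
  return reg;
}

function listProviders(): ProviderRegistration[] {
  return [...registry.values()];
}

function clearRegistry(): void {
  registry.clear(); // test-only helper
}
```

A GET /api/providers endpoint can then project listProviders() into the metadata the Web UI dropdowns consume.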
* feat: generalize assistant config and tighten registry validation
- Add ProviderDefaults/ProviderDefaultsMap generic types to contract layer
- Add index signatures to ClaudeProviderDefaults/CodexProviderDefaults
- Introduce AssistantDefaults/AssistantDefaultsConfig intersection types
that combine ProviderDefaultsMap with typed built-in entries
- Replace hardcoded claude/codex config merging with generic
mergeAssistantDefaults() that iterates all provider entries
- Replace hardcoded toSafeConfig projection with generic
toSafeAssistantDefaults() that strips server-internal fields
- Validate provider strings at all config-entry surfaces: env override,
global config, repo config all throw on unknown providers
- Validate provider on PATCH /api/config/assistants (400 on unknown)
- Move validator.ts from hardcoded Codex checks to capability-driven
warnings using registry getProviderCapabilities()
- Remove resolveProvider() default to 'claude' — returns undefined when
no provider is set, skipping capability warnings for unresolved nodes
- Widen config API schemas to generic Record<string, ProviderDefaults>
- Rewrite SettingsPage to iterate providers dynamically with built-in
specific UI for Claude/Codex and generic JSON view for community
- Extract bootstrap to provider-bootstrap modules in CLI and server
- Remove all as Record<...> casts from dag-executor, executor,
orchestrator — clean indexing via ProviderDefaultsMap intersection
* fix: remove remaining hardcoded provider assumptions and regenerate types
- Replace hardcoded 'claude' defaults in CLI setup with registry lookup
(getRegisteredProviders().find(p => p.builtIn)?.id)
- Replace hardcoded 'claude' default in clone.ts folder detection with
registry-driven fallback
- Update config YAML comment from "claude or codex" to "registered provider"
- Make bootstrap test assertions use toContain instead of exact toEqual
so they don't break when community providers are registered
- Widen validator.test.ts helper from 'claude' | 'codex' to string
- Remove unnecessary type casts in NodeInspector, WorkflowBuilder,
SettingsPage now that generated types use string
- Regenerate api.generated.d.ts from updated OpenAPI spec — all provider
fields are now string instead of 'claude' | 'codex' union
* fix: address PR review findings — consistency, tests, docs
Critical fixes:
- isModelCompatible now throws on unknown providers (fail-fast parity
with getProviderCapabilities) instead of silently returning true
- Schema provider fields use z.string().trim().min(1) to reject
whitespace-only values
- validator.ts resolveProvider accepts defaultProvider param so
capability warnings fire for config-inherited providers
- PATCH /api/config/assistants validates assistants keys against
registry (rejects unknown provider IDs in the map)
YAGNI cleanup:
- Delete provider-bootstrap.ts wrappers in CLI and server — call
registerBuiltinProviders() directly
- Remove no-op .map(provider => provider) in SettingsPage
Test coverage:
- Add GET /api/providers endpoint tests (shape, projection, capabilities)
- Add config-loader throw-path tests for unknown providers in env var,
global config, and repo config
- Add isModelCompatible throw test for unknown providers
Docs:
- CLAUDE.md: factory.ts → registry.ts in directory tree, add
GET /api/providers to API endpoints section
- .env.example: update DEFAULT_AI_ASSISTANT comment
- docs-web configuration reference: update provider constraint docs
UI:
- Settings default-assistant dropdown uses allProviderEntries fallback
(no longer silently empty on API failure)
- clearRegistry marked @internal in JSDoc
* fix: use registry defaults in getDefaults/registerProject, document type design
- getDefaults() initializes assistant defaults from registered providers
instead of hardcoding { claude: {}, codex: {} }
- getDefaults() uses first registered built-in as default assistant
instead of hardcoding 'claude'
- handleRegisterProject uses config.assistant instead of hardcoded 'claude'
for new codebase ai_assistant_type
- Document AssistantDefaults/AssistantDefaultsConfig intersection types:
built-in keys are typed for parseClaudeConfig/parseCodexConfig type
safety; community providers use the generic [string] index
- Document WorkflowConfig.assistants intersection type with same rationale
* docs: update stale provider references to reflect registry system
- architecture.md: DB schema comment now says 'registered provider'
- first-workflow.md: provider field accepts any registered provider
- quick-reference.md: provider type changed from enum to string
- authoring-workflows.md: provider type changed from enum to string
- title-generator.ts: @param doc updated from 'claude or codex' to
generic provider identifier
* docs: fix remaining stale provider references in quick-reference and authoring guide
- quick-reference.md: per-node provider type changed from enum to string
- quick-reference.md: model mismatch guidance updated for registry pattern
- authoring-workflows.md: provider comment says 'any registered provider'
b5c5f81c8a
refactor: extract provider metadata seam for Phase 2 registry readiness (#1185)
* refactor: extract provider metadata seam for Phase 2 registry readiness
- Add static capability constants (capabilities.ts) for Claude and Codex
- Export getProviderCapabilities() from @archon/providers for capability queries without provider instantiation
- Add inferProviderFromModel() to model-validation.ts, replacing three copy-pasted inline inference blocks in executor.ts and dag-executor.ts
- Replace throwaway provider instantiation in dag-executor with static capability lookup (getProviderCapabilities)
- Add orchestrator warning when env vars are configured but provider doesn't support envInjection
* refactor: address LOW findings from code review
- Remove CLAUDE_CAPABILITIES/CODEX_CAPABILITIES from public index (YAGNI — callers should use getProviderCapabilities(), not raw constants)
- Remove dead _deps parameter from resolveNodeProviderAndModel and its two call-sites (no longer needed after static capability lookup refactor)
- Update factory.ts module JSDoc to mention both exported functions
- Add edge-case tests for getProviderCapabilities: empty string and case-sensitive throws (parity with existing getAgentProvider tests)
- Add test for inferProviderFromModel with empty string (returns default, documenting the falsy-string shortcut)
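The shape of inferProviderFromModel can be sketched as below. Only the falsy-string shortcut is documented by the tests mentioned above; the model-name prefixes are invented for illustration and are not Archon's real mapping:

```typescript
// Hypothetical inference: map a model name to a built-in provider id,
// falling back to the default for an empty/undefined model (the
// falsy-string shortcut documented in the tests).
function inferProviderFromModel(model: string | undefined, defaultProvider = "claude"): string {
  if (!model) return defaultProvider;
  if (model.startsWith("claude-")) return "claude"; // assumed prefix
  if (model.startsWith("gpt-")) return "codex"; // assumed prefix
  return defaultProvider;
}
```

Centralizing this removes the three copy-pasted inline inference blocks the commit mentions.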
bf20063e5a
feat: propagate managed execution env to all workflow surfaces (#1161)
* Implement managed execution env propagation
* Address managed env review feedback
a8ac3f057b
security: prevent target repo .env from leaking into subprocesses (#1135)
Remove the entire env-leak scanning/consent infrastructure: scanner, allow_env_keys DB column usage, allow_target_repo_keys config, PATCH consent route, --allow-env-keys CLI flag, and UI consent toggle.
The env-leak gate was the wrong primitive. Target repo .env protection is already structural:
- stripCwdEnv() at boot removes Bun-auto-loaded CWD .env keys
- Archon loads its own env sources afterward (~/.archon/.env)
- process.env is clean before any subprocess spawns
- Managed env injection (config.yaml env: + DB vars) is unchanged
No scanning, no consent, no blocking. Any repo can be registered and used. Subprocesses receive the already-clean process.env.
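The structural protection can be sketched as below. The real stripCwdEnv reads the CWD .env itself; this version takes the file content as a parameter to stay pure, and the parsing is deliberately naive (no quoting or multiline support) since only the boot-time ordering is the point:

```typescript
// Sketch of stripCwdEnv: before Archon loads its own env sources
// (~/.archon/.env), delete any process.env keys that Bun auto-loaded from
// the target repo's CWD .env, so subprocesses never inherit them.
function stripCwdEnv(env: Record<string, string | undefined>, dotenvContent: string): void {
  for (const line of dotenvContent.split("\n")) {
    // match KEY=... lines (optionally "export KEY=..."), skipping comments
    const match = line.match(/^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=/);
    const key = match?.[1];
    if (key) delete env[key];
  }
}
```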