Archon

mirror of https://github.com/coleam00/Archon synced 2026-04-21 13:37:41 +00:00

Author	SHA1	Message	Date
Alex Siri	7ea321419f	fix: initialize options.hooks before merging YAML node hooks (#1177) When a workflow node defines hooks (PreToolUse/PostToolUse) in YAML but no hooks exist yet on the options object, applyNodeConfig crashes with "undefined is not an object" because it tries to assign properties on the undefined options.hooks. Initialize options.hooks to {} before the merge loop. Reproduces with: archon workflow run archon-architect (which uses per-node hooks extensively). Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 14:52:56 +03:00
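The crash and its fix can be illustrated with a minimal sketch. The type shapes and the `mergeNodeHooks` name are hypothetical, since the commit message does not show the body of `applyNodeConfig`; only the "initialize `options.hooks` to `{}` before the merge loop" step comes from the message.

```typescript
// Hypothetical shapes: the real Archon types are not shown in the commit message.
type HookMatcher = { matcher?: string; command: string };
type HookMap = Record<string, HookMatcher[]>;

interface NodeOptions {
  hooks?: HookMap;
}

// Merge hooks declared on a YAML node into the options object.
function mergeNodeHooks(options: NodeOptions, nodeHooks: HookMap | undefined): void {
  if (!nodeHooks) return;
  // The fix: create the target object before assigning properties on it.
  // Without this line, `options.hooks[event] = ...` throws when hooks is undefined.
  const hooks = (options.hooks ??= {});
  for (const [event, matchers] of Object.entries(nodeHooks)) {
    // Append node-level matchers after any that already exist for the event.
    hooks[event] = [...(hooks[event] ?? []), ...matchers];
  }
}
```

Appending rather than overwriting is an assumption made for the sketch; the message only says the hooks are merged.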
Rasmus Widing	ba4b9b47e6	docs(worktree): fix stale rename example + document copyFiles properly (#1328 ) Three related fixes around the `worktree.copyFiles` primitive: 1. Remove the `.env.example -> .env` rename example from reference/configuration.md and getting-started/overview.md. The `->` parser was removed in #739 (2026-03-19) because it caused the stale-credentials production bug in #228 — but the docs kept advertising it. A user writing `.env.example -> .env` today gets `parseCopyFileEntry` returning `{source: '.env.example -> .env', destination: '.env.example -> .env'}`, stat() fails with ENOENT, and the copy silently no-ops at debug level. 2. Replace the single-line "Default behavior: .archon/ is always copied" note with a proper "Worktree file copying" subsection that explains: - Why this exists (git worktree add = tracked files only; gitignored workflow inputs need this hook) - The `.archon/` default (no config needed for the common case) - Common entries: .env, .vscode/, .claude/, plans/, reports/, data fixtures - Semantics: source=destination, ENOENT silently skipped, per-entry error isolation, path-traversal rejected - Interaction with `worktree.path` (both layouts get the same treatment) 3. Update the overview example to drop the `.env.example + .env` pair (which implied rename semantics) in favor of `.env + plans/`, and call out that `.archon/` is auto-copied so users don't list it. No code changes. `bun run format:check` and `bun run lint` green.	2026-04-21 12:15:37 +03:00
Lior Franko	08de8ee5c6	fix(web,server): show real platform connection status in Settings (#1061 ) The Settings page's Platform Connections section hardcoded every platform except Web to 'Not configured', so users couldn't tell whether their Slack/ Telegram/Discord/GitHub/Gitea/GitLab adapters had actually started. - Server: /api/health now returns an activePlatforms array populated live as each adapter's start() resolves. Passed into registerApiRoutes so the reference stays mutable — Telegram starts after the HTTP listener is already accepting requests, so a snapshot would miss it. - Web: SettingsPage.PlatformConnectionsSection now reads activePlatforms from /api/health and looks each platform up in a Set. Also adds Gitea and GitLab to the list (they already ship as adapters). Closes #1031 Co-authored-by: Lior Franko <liorfr@dreamgroup.com>	2026-04-21 11:47:32 +03:00
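Why the live reference matters can be sketched as follows. Only the `activePlatforms` name and the Set lookup come from the commit message; `createHealthHandler`, `connectionStatus`, and the response shape are assumptions made for illustration.

```typescript
// Hypothetical sketch: names other than activePlatforms are illustrative.
type HealthResponse = { status: "ok"; activePlatforms: string[] };

// Server side: keep a reference to the shared array and read it per request,
// so adapters that start after route registration still show up.
function createHealthHandler(activePlatforms: readonly string[]): () => HealthResponse {
  return () => ({ status: "ok", activePlatforms: [...activePlatforms] });
}

// Web side: look each known platform up in a Set built from the response.
function connectionStatus(known: string[], health: HealthResponse): Record<string, boolean> {
  const active = new Set(health.activePlatforms);
  const status: Record<string, boolean> = {};
  for (const name of known) {
    status[name] = active.has(name);
  }
  return status;
}
```

Had the handler copied the array at registration time, a platform pushed later (Telegram in the commit's example) would never appear in the response.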
Rasmus Widing	5ed38dc765	feat(isolation,workflows): worktree location + per-workflow isolation policy (#1310) * feat(isolation): per-project worktree.path + collapse to two layouts Adds an opt-in `worktree.path` to .archon/config.yaml so a repo can co-locate worktrees with its own checkout (`<repoRoot>/<path>/<branch>`) instead of the default `~/.archon/workspaces/<owner>/<repo>/worktrees/<branch>`. Requested in joelsb's #1117. Primitive changes (clean up the graveyard rather than add parallel code paths): - Collapse worktree layouts from three to two. The old "legacy global" layout (`~/.archon/worktrees/<owner>/<repo>/<branch>`) is gone: every repo resolves to the workspace-scoped layout (`~/.archon/workspaces/<owner>/<repo>/worktrees/<branch>`), whether it was archon-cloned or locally registered. `extractOwnerRepo()` on the repo path is the stable identity fallback. Ends the divergence where workspace-cloned and local repos had visibly different worktree trees. - `getWorktreeBase()` in @archon/git now returns `{ base, layout }` and accepts an optional `{ repoLocal }` override. The layout value replaces the old `isProjectScopedWorktreeBase()` classification at the call sites (`isProjectScopedWorktreeBase` stays exported as deprecated back-compat). - `WorktreeCreateConfig.path` carries the validated override from repo config. 
`resolveRepoLocalOverride()` fails loudly on absolute paths, `..` escapes, and resolve-escape edge cases (Fail Fast — no silent default fallback when the config is syntactically wrong). - `WorktreeProvider.create()` now loads repo config exactly once and threads it through `getWorktreePath()` + `createWorktree()`. Replaces the prior swallow-then-retry pattern flagged on #1117. `generateEnvId()` is gone — envId is assigned directly from the resolved path (the invariant was already documented on `destroy(envId)`). Tests (packages/git + packages/isolation): - Update the pre-existing `getWorktreeBase` / `isProjectScopedWorktreeBase` suite for the new two-layout return shape and precedence. - Add 8 tests for `worktree.path`: default fallthrough, empty/whitespace ignored, override wins for workspace-scoped repos, rejects absolute, rejects `../` escapes (three variants), accepts nested relative paths. Docs: add `worktree.path` to the repo config reference with explicit precedence and the `.gitignore` responsibility note. Co-authored-by: Joel Bastos <joelsb2001@gmail.com> * feat(workflows): per-workflow worktree.enabled policy Introduces a declarative top-level `worktree:` block on a workflow so authors can pin isolation behavior regardless of invocation surface. Solves the case where read-only workflows (e.g. `repo-triage`) should always run in the live checkout, without every CLI/web/scheduled-trigger caller having to remember to set the right flag. Schema (packages/workflows/src/schemas/workflow.ts + loader.ts): - New optional `worktree.enabled: boolean` on `workflowBaseSchema`. Loader parses with the same warn-and-ignore discipline used for `interactive` and `modelReasoningEffort` — invalid shapes log and drop rather than killing workflow discovery. 
Policy reconciliation (packages/cli/src/commands/workflow.ts): - Three hard-error cases when YAML policy contradicts invocation flags: • `enabled: false` + `--branch` (worktree required by flag, forbidden by policy) • `enabled: false` + `--from` (start-point only meaningful with worktree) • `enabled: true` + `--no-worktree` (policy requires worktree, flag forbids it) - `enabled: false` + `--no-worktree` is redundant, accepted silently. - `--resume` ignores the pinned policy (it reuses the existing run's worktree even when policy would disable — avoids disturbing a paused run). Orchestrator wiring (packages/core/src/orchestrator/orchestrator-agent.ts): - `dispatchOrchestratorWorkflow` short-circuits `validateAndResolveIsolation` when `workflow.worktree?.enabled === false` and runs directly in `codebase.default_cwd`. Web chat/slack/telegram callers have no flag equivalent to `--no-worktree`, so the YAML field is their only control. - Logged as `workflow.worktree_disabled_by_policy` for operator visibility. First consumer (.archon/workflows/repo-triage.yaml): - `worktree: { enabled: false }` — triage reads issues/PRs and writes gh labels; no code mutations, no reason to spin up a worktree per run. Tests: - Loader: parses `worktree.enabled: true\|false`, omits block when absent. - CLI: four new integration tests for the reconciliation matrix (skip when policy false, three hard-error cases, redundant `--no-worktree` accepted, `--no-worktree` + `enabled: true` rejected). Docs: authoring-workflows.md gets the new top-level field in the schema example with a comment explaining the precedence and the `enabled: true\|false` semantics. * fix(isolation): use path.sep for repo-containment check on Windows resolveRepoLocalOverride was hardcoding '/' as the separator in the startsWith check, so on Windows (where `resolve()` returns backslash paths like `D:\Users\dev\Projects\myapp`) every otherwise-valid relative `worktree.path` was rejected with "resolves outside the repo root". 
Fixed by importing `path.sep` and using it in the sentinel. Fixes the 3 Windows CI failures in `worktree.path repo-local override`. --------- Co-authored-by: Joel Bastos <joelsb2001@gmail.com>	2026-04-20 21:54:10 +03:00
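The Windows containment bug described in this commit can be sketched as below. This is a hypothetical re-implementation, not Archon's source: the function body, the error wording for absolute paths, and the injectable `p` parameter (added so both platforms can be exercised on one machine) are assumptions. Only the use of `path.sep` in the `startsWith` sentinel and the "resolves outside the repo root" message come from the commit text.

```typescript
import * as path from "node:path";

// Hypothetical sketch of a repo-containment check for `worktree.path`.
// `p` defaults to the host platform's path module; pass path.win32 or
// path.posix to simulate another platform.
function resolveRepoLocalOverride(
  repoRoot: string,
  override: string,
  p: path.PlatformPath = path,
): string {
  if (p.isAbsolute(override)) {
    throw new Error(`worktree.path must be relative: ${override}`);
  }
  const root = p.resolve(repoRoot);
  const resolved = p.resolve(root, override);
  // Use the platform separator in the sentinel. A hardcoded "/" never matches
  // backslash paths such as D:\Users\dev\Projects\myapp\worktrees.
  if (!resolved.startsWith(root + p.sep)) {
    throw new Error(`worktree.path resolves outside the repo root: ${override}`);
  }
  return resolved;
}
```

Comparing against `root + p.sep` rather than `root` alone also keeps a sibling directory such as `myapp-old` from passing the prefix check.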
Rasmus Widing	7be4d0a35e	feat(paths,workflows): unify ~/.archon/{workflows,commands,scripts} + drop globalSearchPath (closes #1136 ) (#1315 ) * feat(paths,workflows): unify ~/.archon/{workflows,commands,scripts} + drop globalSearchPath Collapses the awkward `~/.archon/.archon/workflows/` convention to a direct `~/.archon/workflows/` child (matching `workspaces/`, `archon.db`, etc.), adds home-scoped commands and scripts with the same loading story, and kills the opt-in `globalSearchPath` parameter so every call site gets home-scope for free. Closes #1136 (supersedes @jonasvanderhaegen's tactical fix — the bug was the primitive itself: an easy-to-forget parameter that five of six call sites on dev dropped). Primitive changes: - Home paths are direct children of `~/.archon/`. New helpers in `@archon/paths`: `getHomeWorkflowsPath()`, `getHomeCommandsPath()`, `getHomeScriptsPath()`, and `getLegacyHomeWorkflowsPath()` (detection-only for migration). - `discoverWorkflowsWithConfig(cwd, loadConfig)` reads home-scope internally. The old `{ globalSearchPath }` option is removed. Chat command handler, Web UI workflow picker, orchestrator resolve path — all inherit home-scope for free without maintainer patches at every new site. - `discoverScriptsForCwd(cwd)` merges home + repo scripts (repo wins on name collision). dag-executor and validator use it; the hardcoded `resolve(cwd, '.archon', 'scripts')` single-scope path is gone. - Command resolution is now walked-by-basename in each scope. `loadCommand` and `resolveCommand` walk 1 subfolder deep and match by `.md` basename, so `.archon/commands/triage/review.md` resolves as `review` — closes the latent bug where subfolder commands were listed but unresolvable. - All three (`workflows/`, `commands/`, `scripts/`) enforce a 1-level subfolder cap (matches the existing `defaults/` convention). Deeper nesting is silently skipped. - `WorkflowSource` gains `'global'` alongside `'bundled'` and `'project'`. 
Web UI node palette shows a dedicated "Global (~/.archon/commands/)" section; badges updated. Migration (clean cut — no fallback read): - First use after upgrade: if `~/.archon/.archon/workflows/` exists, Archon logs a one-time WARN per process with the exact `mv` command: `mv ~/.archon/.archon/workflows ~/.archon/workflows && rmdir ~/.archon/.archon` The legacy path is NOT read — users migrate manually. Rollback caveat noted in CHANGELOG. Tests: - `@archon/paths/archon-paths.test.ts`: new helper tests (default HOME, ARCHON_HOME override, Docker), plus regression guards for the double-`.archon/` path. - `@archon/workflows/loader.test.ts`: home-scoped workflows, precedence, subfolder 1-depth cap, legacy-path deprecation warning fires exactly once per process. - `@archon/workflows/validator.test.ts`: home-scoped commands + subfolder resolution. - `@archon/workflows/script-discovery.test.ts`: depth cap + merge semantics (repo wins, home-missing tolerance). - Existing CLI + orchestrator tests updated to drop `globalSearchPath` assertions. E2E smoke (verified locally, before cleanup): - `.archon/workflows/e2e-home-scope.yaml` + scratch repo at /tmp - Home-scoped workflow discovered from an unrelated git repo - Home-scoped script (`~/.archon/scripts/.ts`) executes inside a script node - 1-level subfolder workflow (`~/.archon/workflows/triage/.yaml`) listed - Legacy path warning fires with actionable `mv` command; workflows there are NOT loaded Docs: `CLAUDE.md`, `docs-web/guides/global-workflows.md` (full rewrite for three-type scope + subfolder convention + migration), `docs-web/reference/ configuration.md` (directory tree), `docs-web/reference/cli.md`, `docs-web/guides/authoring-workflows.md`. 
Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com> * test(script-discovery): normalize path separators in mocks for Windows The 4 new tests in `scanScriptDir depth cap` and `discoverScriptsForCwd — merge repo + home with repo winning` compared incoming mock paths with hardcoded forward-slash strings (`if (path === '/scripts/triage')`). On Windows, `path.join('/scripts', 'triage')` produces `\scripts\triage`, so those branches never matched, readdir returned `[]`, and the tests failed. Added a `norm()` helper at module scope and wrapped the incoming `path` argument in every `mockImplementation` before comparing. Stored paths go through `normalizeSep()` in production code, so the existing equality assertions on `script.path` remain OS-independent. Fixes Windows CI job `test (windows-latest)` on PR #1315. * address review feedback: home-scope error handling, depth cap, and tests Critical fixes: - api.ts: add `maxDepth: 1` to all 3 findMarkdownFilesRecursive calls in GET /api/commands (bundled/home/project). Without this the UI palette surfaced commands from deep subfolders that the executor (capped at 1) could not resolve — silent "command not found" at runtime. - validator.ts: wrap home-scope findMarkdownFilesRecursive and resolveCommandInDir calls in try/catch so EACCES/EPERM on ~/.archon/commands/ doesn't crash the validator with a raw filesystem error. ENOENT still returns [] via the underlying helper. Error handling fixes: - workflow-discovery.ts: maybeWarnLegacyHomePath now sets the "warned-once" flag eagerly before `await access()`, so concurrent discovery calls (server startup with parallel codebase resolution) can't double-warn. Non-ENOENT probe errors (EACCES/EPERM) now log at WARN instead of DEBUG so permission issues on the legacy dir are visible in default operation. 
- dag-executor.ts: wrap discoverScriptsForCwd in its own try/catch so an EACCES on ~/.archon/scripts/ routes through safeSendMessage / logNodeError with a dedicated "failed to discover scripts" message instead of being mis-attributed by the outer catch's "permission denied (check cwd permissions)" branch. Tests: - load-command-prompt.test.ts (new): 6 tests covering the executor's command resolution hot path — home-scope resolves when repo misses, repo shadows home, 1-level subfolder resolvable by basename, 2-level rejected, not-found, empty-file. Runs in its own bun test batch. - archon-paths.test.ts: add getHomeScriptsPath describe block to match the existing getHomeCommandsPath / getHomeWorkflowsPath coverage. Comment clarity: - workflow-discovery.ts: MAX_DISCOVERY_DEPTH comment now leads with the actual value (1) before describing what 0 would mean. - script-discovery.ts: copy the "routing ambiguity" rationale from MAX_DISCOVERY_DEPTH to MAX_SCRIPT_DISCOVERY_DEPTH. Cleanup: - Remove .archon/workflows/e2e-home-scope.yaml — one-off smoke test that would ship permanently in every project's workflow list. Equivalent coverage exists in loader.test.ts. Addresses all blocking and important feedback from the multi-agent review on PR #1315. --------- Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>	2026-04-20 21:45:32 +03:00
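Two of the rules from this commit, "repo wins on name collision" and "1-level subfolder cap with resolution by basename", can be sketched as pure functions. The names `mergeScripts` and `resolveCommandByBasename` and the `ScriptEntry` shape are hypothetical; the real `discoverScriptsForCwd` and `resolveCommand` read the filesystem and are not shown in the commit message.

```typescript
// Hypothetical shape for a discovered script.
interface ScriptEntry {
  name: string;
  path: string;
}

// Merge home-scope and repo-scope scripts; the repo entry wins on a name collision.
function mergeScripts(home: ScriptEntry[], repo: ScriptEntry[]): ScriptEntry[] {
  const byName = new Map<string, ScriptEntry>();
  for (const script of home) byName.set(script.name, script);
  for (const script of repo) byName.set(script.name, script); // overwrites home
  return Array.from(byName.values());
}

// Resolve a command name against relative paths inside one commands directory.
// Paths use "/" separators; depth counts the subfolders above the file.
function resolveCommandByBasename(
  name: string,
  relativePaths: string[],
  maxDepth = 1,
): string | undefined {
  return relativePaths.find((rel) => {
    const parts = rel.split("/");
    const file = parts[parts.length - 1];
    const depth = parts.length - 1;
    return depth <= maxDepth && file === `${name}.md`;
  });
}
```

With these rules, `triage/review.md` resolves as `review`, while a file two folders deep is skipped, matching the cap the executor and the UI palette are described as sharing.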
Rasmus Widing	cc78071ff6	fix(isolation): raise worktree git-operation timeout to 5m (#1306) All 15 worktree git-subprocess timeouts in WorktreeProvider were hardcoded at 30000ms. Repos with heavy post-checkout hooks (lint, dependency install, submodule init) routinely exceed that budget and fail worktree creation. Consolidate them onto a single GIT_OPERATION_TIMEOUT_MS constant at 5 min. Generous enough to cover reported cases while still catching genuine hangs (credential prompts in non-TTY, stalled fetches). Chosen over the config-key approach in #1029 to avoid adding permanent .archon/config.yaml surface for a problem a raised default solves cleanly. If 5 min turns out to also be too tight for real-world use, we'll revisit. Closes #1119 Supersedes #1029 Co-authored-by: Shay Elmualem <12733941+norbinsh@users.noreply.github.com>	2026-04-20 21:45:24 +03:00
ACJLabsDev	235a8ce202	Add Star History Chart to README.md (#1229)	2026-04-20 19:43:52 +03:00
Kagura	39a05b762f	fix(db): throw on corrupt commands JSON instead of silent empty fallback (#1033) * fix(db): throw on corrupt commands JSON instead of silent empty fallback (#967) getCodebaseCommands() silently returned {} when the commands column contained corrupt JSON. Callers had no way to distinguish 'no commands' from 'unreadable data', violating fail-fast principles. Now throws a descriptive error with the codebase ID and a recovery hint. The error is still logged for observability before throwing. Adds two test cases: corrupt JSON throws, valid JSON string parses. * fix: include parse error in log for better diagnostics	2026-04-20 16:19:50 +03:00
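The fail-fast behaviour can be sketched like this. `parseCodebaseCommands` is a hypothetical stand-in for the parsing step inside `getCodebaseCommands()`; the wording of the error and recovery hint, and the treatment of a null column as "no commands", are assumptions not confirmed by the commit message.

```typescript
// Hypothetical stand-in for the JSON-parsing step of getCodebaseCommands().
function parseCodebaseCommands(
  codebaseId: string,
  raw: string | null,
): Record<string, unknown> {
  // Assumption: a missing column value means "no commands", which is not an error.
  if (raw === null) return {};
  try {
    return JSON.parse(raw) as Record<string, unknown>;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    // Log first for observability, then throw so callers can tell
    // "unreadable data" apart from "no commands".
    console.error(`commands JSON parse failed for codebase ${codebaseId}: ${reason}`);
    throw new Error(
      `Corrupt commands JSON for codebase ${codebaseId}: ${reason}. ` +
        `Hint (illustrative): re-register the codebase or repair the commands column.`,
    );
  }
}
```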
Cole Medin	c5e11ea8f5	docs(claude-md): surface Pi as peer provider alongside Claude and Codex CLAUDE.md is the primary entry point for agents working in this repo, but it only mentioned Pi once, buried in a DAG-node capability parenthetical. Add Pi to the directory tree, Package Split blurb, and AI Agent Providers list so Pi is discoverable without relying on the docs site or git log. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-20 07:59:41 -05:00
Cole Medin	cb44b96f7b	feat(providers/pi): interactive flag binds UIContext for extensions (#1299 ) * feat(providers/pi): interactive flag binds UIContext for extensions Adds `interactive: true` opt-in to Pi provider (in `.archon/config.yaml` under `assistants.pi`) that binds a minimal `ExtensionUIContext` stub to each session. Without this, Pi's `ExtensionRunner.hasUI()` reports false, causing extensions like `@plannotator/pi-extension` to silently auto-approve every plan instead of opening their browser review UI. Semantics: clamped to `enableExtensions: true` — no extensions loaded means nothing would consume `hasUI`, so `interactive` alone is silently dropped. Stub forwards `notify()` to Archon's event stream; interactive dialogs (select/confirm/input/editor/custom) resolve to undefined/false; TUI-only setters (widgets/headers/footers/themes) no-op. Theme access throws with a clear diagnostic — Pi's theme singleton is coupled to its own `Symbol.for()` registry which Archon doesn't own. Trust boundary: only binds when the operator has explicitly enabled both flags. Extensions gated on `ctx.hasUI` (plannotator and similar) get a functional UI context; extensions that reach for TUI features still fail loudly rather than rendering garbage. Includes smoke-test workflow documenting the integration surface. End-to-end plannotator UI rendering requires plan-mode activation (Pi `--plan` CLI flag or `/plannotator` TUI slash command) which is out of reach for programmatic Archon sessions — manual test only. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(providers/pi): end-to-end interactive extension UI Three fixes that together get plannotator's browser review UI to actually render from an Archon workflow and reach the reviewer's browser. 1. Call resourceLoader.reload() when enableExtensions is true. createAgentSession's internal reload is gated on `!resourceLoader`, so caller-supplied loaders must reload themselves. 
Without this, getExtensions() returns the empty default, no ExtensionRunner is built, and session.extensionRunner.setFlagValue() silently no-ops. 2. Set PLANNOTATOR_REMOTE=1 in interactive mode. plannotator-browser.ts only calls ctx.ui.notify(url) when openBrowser() returns { isRemote: true }; otherwise it spawns xdg-open/start on the Archon server host — invisible to the user and untestable from bash asserts. From the workflow runner's POV every Archon execution IS remote; flipping the heuristic routes the URL through notify(), which the ExtensionUIContext stub forwards into the event stream. Respect explicit operator overrides. 3. notify() emits as assistant chunks, not system chunks. The DAG executor's system-chunk filter only forwards warnings/MCP prefixes, and only assistant chunks accumulate into $nodeId.output. Emitting as assistant makes the URL available both in the user's stream and in downstream bash/script nodes via output substitution. Plus: extensionFlags config pass-through (equivalent to `pi --plan` on the CLI) applied via ExtensionRunner.setFlagValue() BEFORE bindExtensions fires session_start, so extensions reading flags in their startup handler actually see them. Also bind extensions with an empty binding when enableExtensions is on but interactive is off, so session_start still fires for flag-driven but UI-less extensions. Smoke test (.archon/workflows/e2e-plannotator-smoke.yaml) uses openai-codex/gpt-5.4-mini (ChatGPT Plus OAuth compatible) and bumps idle_timeout to 600000ms so plannotator's server survives while a human approves in the browser. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * refactor(providers/pi): keep Archon extension-agnostic Remove the plannotator-specific PLANNOTATOR_REMOTE=1 env var write from the Pi provider. Archon's provider layer shouldn't know about any specific extension's internals. 
Document the env var in the plannotator smoke test instead — operators who use plannotator set it via their shell or per-codebase env config. Workflow smoke test updated with: - Instructions for setting PLANNOTATOR_REMOTE=1 externally - Simpler assertion (URL emission only) — validated in a real reject-revise-approve run: reviewer annotated, clicked Send Feedback, Pi received the feedback as a tool result, revised the plan (added aria-label and WCAG contrast per the annotation), resubmitted, and reviewer approved. Plannotator's tool result signals approval but doesn't return the plan text, so the bash assertion now only checks that the review URL reached the stream (not that plan content flowed into \$nodeId.output — it can't). - Known-limitation note documenting the tool-result shape so downstream workflow authors know to Write the plan separately if they need it. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * chore(providers/pi): keep e2e-plannotator-smoke workflow local-only The smoke test is plannotator-specific (calls plannotator_submit_plan, expects PLAN.md on disk, requires PLANNOTATOR_REMOTE=1) and is better kept out of the PR while the extension-agnostic infra lands. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * style(providers/pi): trim verbose inline comments Collapse multi-paragraph SDK explanations to 1-2 line "why" notes across provider.ts, types.ts, ui-context-stub.ts, and event-bridge.ts. No behavior change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(providers/pi): wire assistants.pi.env + theme-proxy identity Two end-to-end fixes discovered while exercising the combined plannotator + @pi-agents/loop smoke flow: - PiProviderDefaults gains an optional `env` map; parsePiConfig picks it up and the provider applies it to process.env at session start (shell env wins, no override). 
Needed so extensions like plannotator can read PLANNOTATOR_REMOTE=1 from config.yaml without requiring a shell export before `archon workflow run`. - ui-context-stub theme proxy returns identity decorators instead of throwing on unknown methods. Styled strings flow into no-op setStatus/setWidget sinks anyway, so the throw was blocking plannotator_submit_plan after HTTP approval with no benefit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(providers/pi): flush notify() chunks immediately in batch mode Batch-mode adapters (CLI) accumulate assistant chunks and only flush on node completion. That broke plannotator's review-URL flow: Pi's notify() emitted the URL as an assistant chunk, but the user needed the URL to POST /api/approve — which is what unblocks the node in the first place. Adds an optional `flush` flag on assistant MessageChunks. notify() sets it, and the DAG executor drains pending batched content before surfacing the flushed chunk so ordering is preserved. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs: mention Pi alongside Claude and Codex in README + top-level docs The AI assistants docs page already covers Pi in depth, but the README architecture diagram + docs table, overview "Further Reading" section, and local-deployment .env comment still listed only Claude/Codex. Left feature-specific mentions alone where Pi genuinely lacks support (e.g. structured output — Claude + Codex only). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs: note Pi structured output (best-effort) in matrix + workflow docs Pi gained structured output support via prompt augmentation + JSON extraction (see packages/providers/src/community/pi/capabilities.ts). Unlike Claude/Codex, which use SDK-enforced JSON mode, Pi appends the schema to the prompt and parses JSON out of the result text (bare or fenced). 
Updates four stale references that still said Claude/Codex-only: - ai-assistants.md capabilities matrix - authoring-workflows.md (YAML example + field table) - workflow-dag.md skill reference - CLAUDE.md DAG-format node description Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(providers/pi): default extensions + interactive to on Extensions (community packages like @plannotator/pi-extension and user-authored ones) are a core reason users pick Pi. Defaulting enableExtensions and interactive to false previously silenced installed extensions with no signal, leading to "did my extension even load?" confusion. Opt out in .archon/config.yaml when you want the prior behavior: assistants: pi: enableExtensions: false # skip extension discovery entirely # interactive: false # load extensions, but no UI bridge Docs gain a new "Extensions (on by default)" section in getting-started/ai-assistants.md that documents the three config surfaces (extensionFlags, env, workflow-level interactive) and uses plannotator as a concrete walk-through example. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-20 07:37:40 -05:00
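The "bare or fenced" JSON extraction mentioned for Pi's best-effort structured output might look roughly like the sketch below. It is an illustration only: `extractJson` is a hypothetical name, and the actual logic in `packages/providers/src/community/pi/capabilities.ts` is not shown in the commit message.

```typescript
// Hypothetical sketch: pull JSON out of model output that is either bare JSON
// or wrapped in a fenced block (optionally tagged "json").
function extractJson(text: string): unknown {
  // `{3} matches three backticks without writing them literally.
  const fenced = /`{3}(?:json)?\s*([\s\S]*?)`{3}/.exec(text);
  const candidate = (fenced?.[1] ?? text).trim();
  // JSON.parse throws on invalid input; a caller would treat that as
  // "no structured output" for this best-effort path.
  return JSON.parse(candidate);
}
```

Because nothing enforces the schema on the model side, a caller would still need to validate the parsed value against the schema it appended to the prompt.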
Cocoon-Break	45682bd2c8	fix(providers/claude): use || instead of ?? in hasExplicitTokens to handle empty-string env vars (#1028) Closes #1027	2026-04-20 14:15:27 +03:00
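The difference between the two operators on empty strings can be shown with a small sketch. The environment variable names `EXAMPLE_API_KEY` and `EXAMPLE_OAUTH_TOKEN` are placeholders, not the names `hasExplicitTokens` actually reads, and both function bodies are assumptions about the general shape of the check.

```typescript
type Env = Record<string, string | undefined>;

// Before (sketch): `??` only falls through on null/undefined, so an empty
// first variable short-circuits the chain and hides the second one.
function hasExplicitTokensNullish(env: Env): boolean {
  return Boolean(env["EXAMPLE_API_KEY"] ?? env["EXAMPLE_OAUTH_TOKEN"]);
}

// After (sketch): `||` also falls through on "", so an empty-string variable
// is treated the same as an unset one.
function hasExplicitTokensOr(env: Env): boolean {
  return Boolean(env["EXAMPLE_API_KEY"] || env["EXAMPLE_OAUTH_TOKEN"]);
}
```

With `EXAMPLE_API_KEY=""` and a real value in `EXAMPLE_OAUTH_TOKEN`, the `??` version reports no explicit tokens while the `||` version finds the second variable.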
Rasmus Widing	52eebf995a	chore(gitignore): ignore .claude/scheduled_tasks.lock Machine-local runtime state from the Claude Code scheduler (pid + sessionId + acquisition timestamp). Should not be shared across machines.	2026-04-20 13:39:44 +03:00
Rasmus Widing	28908f0c75	feat(paths/cli/setup): unify env load + write on three-path model (#1302 , #1303 ) (#1304 ) * feat(paths/cli/setup): unify env load + write on three-path model (#1302, #1303) Key env handling on directory ownership rather than filename. `.archon/` (at `~/` or `<cwd>/`) is archon-owned; anything else is the user's. - `<repo>/.env` — stripped at boot (guard kept), never loaded, never written - `<repo>/.archon/.env` — loaded at repo scope (wins over home), writable via `archon setup --scope project` - `~/.archon/.env` — loaded at home scope, writable via `--scope home` (default) Read side (#1302): - New `@archon/paths/env-loader` with `loadArchonEnv(cwd)` shared by CLI and server entry points. Loads both archon-owned files with `override: true`; repo scope wins. - Replaced `[dotenv@17.3.1] injecting env (0) from .env` (always lied about stripped keys) with `[archon] stripped N keys from <cwd> (...)` and `[archon] loaded N keys from <path>` lines, emitted only when N > 0. `quiet: true` passed to dotenv to silence its own output. - `stripCwdEnv` unchanged in semantics — still the only source that deletes keys from `process.env`; now logs what it did. Write side (#1303): - `archon setup` never writes to `<repo>/.env`. Writing there was incoherent because `stripCwdEnv` deletes those keys on every run. - New `--scope home\|project` (default home) targets exactly one archon-owned file. New `--force` overrides the merge; backup still written. - Merge-only by default: existing non-empty values win, user-added custom keys survive, `<path>.archon-backup-<ISO-ts>` written before every rewrite. Fixes silent PostgreSQL→SQLite downgrade and silent token loss in Add mode. - One-time migration note emitted when `<cwd>/.env` exists at setup start. Tests: new `env-loader.test.ts` (6), extended `strip-cwd-env.test.ts` (+4 for the log line), extended `setup.test.ts` (+10 for scope/merge/backup/force/ repo-untouched), extended `cli.test.ts` (+5 for flag parsing). 
Docs: configuration.md, cli.md, security.md, cli-internals.md, setup skill — all updated to the three-path model. * fix(cli/setup): address PR review — scope/path/secret-handling edge cases - cli: resolve --scope project to git repo root so running setup from a subdir writes to <repo-root>/.archon/.env (what loadArchonEnv reads at boot), not <subdir>/.archon/.env. Fail fast with a useful message when --scope project is used outside a git repo. - setup: resolveScopedEnvPath() now delegates to @archon/paths helpers (getArchonEnvPath / getRepoArchonEnvPath) so Docker's /.archon home, ARCHON_HOME overrides, and the "undefined" literal guard all behave identically between the loader and the writer. - setup: wrap the writeScopedEnv call in try/catch so an fs exception (permission denied, read-only FS, backup copy failure) stops the clack spinner cleanly and emits an actionable error instead of a raw stack trace after the user has completed the entire wizard. - setup: checkExistingConfig(envPath?) — scope-aware existing-config read. Add/Update/Fresh now reflects the actual write target, not an unconditional ~/.archon/.env. - setup: serializeEnv escapes \r (was only \n) so values with bare CR or CRLF round-trip through dotenv.parse without corruption. Regression test added. - setup: merge path treats whitespace-only existing values (' ') as empty, so a copy-paste stray space doesn't silently defeat the wizard update for that key forever. Regression test added. - setup: 0o600 mode on the written env file AND on backup copies — writeFileSync+copyFileSync default to 0o666 & ~umask, which can leave secrets group/world-readable on a permissive umask. - docs/cli.md + setup skill: appendix sections that still described the pre-#1303 two-file symlink model now reflect the three-path model. 
* fix(paths/env-loader): Windows-safe assertion for home-scope load line The test asserted the log line contained `from ~/`, which is opportunistic tilde-shortening that only happens when the tmpdir lives under `homedir()`. On Windows CI the tmpdir is on `D:\\` while homedir is `C:\\Users\\...`, so the path renders absolute and the `~/` never appears. Match on the count and the archon-home tmpdir segment instead — robust on both Unix tilde-short paths and Windows absolute paths.	2026-04-20 12:49:14 +03:00
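The load precedence and the merge-only write rule from this commit can be sketched as two pure functions. `resolveArchonEnv` and `mergeEnvForWrite` are hypothetical names; the real `loadArchonEnv` and `archon setup` writer also handle file I/O, backups, and file modes, none of which is modelled here.

```typescript
type EnvMap = Record<string, string>;

// Read side (sketch): both archon-owned files are loaded; repo scope wins.
function resolveArchonEnv(homeEnv: EnvMap, repoEnv: EnvMap): EnvMap {
  return { ...homeEnv, ...repoEnv };
}

// Write side (sketch): merge-only by default.
// - existing non-empty values win over wizard values
// - whitespace-only existing values count as empty
// - user-added custom keys survive
function mergeEnvForWrite(existing: EnvMap, wizard: EnvMap): EnvMap {
  const merged: EnvMap = { ...existing };
  for (const [key, value] of Object.entries(wizard)) {
    const current = existing[key];
    if (current === undefined || current.trim() === "") {
      merged[key] = value;
    }
  }
  return merged;
}
```

Treating `' '` as empty is what lets the wizard repair a key that was accidentally saved as a stray space, per the review-feedback fix described above.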
Rasmus Widing	8ae4a56193	feat(workflows): add repo-triage — periodic maintenance via inline Haiku sub-agents (#1293 ) * feat(workflows): add repo-triage — 6-node periodic maintenance workflow Adds .archon/workflows/repo-triage.yaml: a self-contained periodic maintenance workflow that uses inline sub-agents (Claude SDK agents: field introduced in #1276) for map-reduce across open issues and PRs. Six DAG nodes, three-layer topology: - Layer 1 (parallel): triage-issues, link-prs, closed-pr-dedup-check, stale-nudge - Layer 2: closed-dedup-check (reads triage-issues state) - Layer 3: digest (synthesises all prior nodes + writes markdown) Capabilities per node: - triage-issues: delegates labeling to on-disk triage-agent; inline brief-gen Haiku for duplicate detection; 3-day auto-close clock for unanswered duplicate warnings - link-prs: conservative PR ↔ issue cross-refs via inline pr-issue- matcher Haiku, Sonnet re-verifies fully-addresses claims before suggesting Closes #X; auto-nudges on low-quality PR template fill with first-run grandfather guard (snapshot-only, no nudge spam) - closed-dedup-check: cross-matches open issues against recently- closed ones via inline closed-brief-gen Haiku; same 3-day clock - closed-pr-dedup-check: flags open PRs duplicating recently-closed PRs via inline pr-brief-gen Haiku; comment-only, never closes PRs - stale-nudge: 60-day inactivity pings (configurable); no auto-close - digest: synthesises per-node outputs + reads state files to emit $ARTIFACTS_DIR/digest.md with clickable GitHub comment links Env-gated rollout knobs: - DRY_RUN=1 (read-only; prints [DRY] lines, no gh/state mutations) - SKIP_PR_LINK=1, SKIP_CLOSED_DEDUP=1, SKIP_CLOSED_PR_DEDUP=1, SKIP_STALE_NUDGE=1 - STALE_DAYS=N (stale-nudge window; default 60) Cross-run state under .archon/state/ (gitignored): - triage-state.json briefs + pendingDedupComments - closed-dedup-state.json closedBriefs + closedMatchComments - closed-pr-dedup-state.json openBriefs + closedBriefs + 
matches - pr-state.json linkedPrs + commentIds + templateAdherence - stale-nudge-state.json nudged (with updatedAtAtNudge for re-nudge) Every bot comment: - @-tags the target human (reporter for issues, author for PRs) - Tracks comment ID in state for traceability - Is idempotent — re-runs skip existing comments Intended use: invoke periodically (`archon workflow run repo-triage --no-worktree`) once a scheduler lands; live state persists across runs so previously-flagged items reconcile correctly. .gitignore: adds .archon/state/ for cross-run memory files. * feat(workflows/repo-triage): post digest to Slack when SLACK_WEBHOOK is set Extends the digest node with an optional Slack-post step after the canonical digest.md artifact is written. Uses Slack incoming webhook (no bot token required beyond the incoming-webhook scope). Behavior: - SLACK_WEBHOOK unset → skipped silently with a one-line note - DRY_RUN=1 → prints full payload, does not curl - Otherwise → POSTs a compact (<3500 char) mrkdwn-formatted summary containing headline numbers, this-run comment index (clickable GitHub URLs), pending items, and a path reference to digest.md - curl failure or non-ok Slack response is logged but does not fail the node — digest.md on disk remains authoritative - Intermediate Slack text written to $ARTIFACTS_DIR/digest-slack.txt for traceability; payload JSON assembled via jq and written to $ARTIFACTS_DIR/slack-payload.json before curl posts it Slack mrkdwn conversion rules baked into the prompt (no tables, link shape <url\|text>, single-asterisk bold) so Sonnet emits a variant that renders cleanly in Slack rather than being sent raw. The webhook URL is read from the operator's environment (Archon auto-loads ~/.archon/.env on CLI startup — put SLACK_WEBHOOK=... there). 
* fix(workflows/repo-triage): address PR #1293 review feedback Critical (3): - `gh issue close --reason "not planned"` (space, not underscore) — the CLI expects lowercase with a space; `not_planned` fails at runtime. Fixed in both auto-close paths (triage-issues step 8, closed-dedup- check step 7). - link-prs step 7 state save was sparse `{ sha, processedAt, related, fullyAddresses }`, overwriting `commentIds` / `templateNudgedAt` / `templateAdherence`. Changed to explicit merge that spreads existing entry first so per-run captured fields survive. - Corrupt-JSON state files previously treated as first-run default (silent `pendingDedupComments` reset → 3-day clock restarts forever). All five state-load sites now abort loudly on JSON.parse throw; ENOENT/empty continue to default-shape. Important (7): - Sub-agents (`brief-gen`, `closed-brief-gen`, `pr-brief-gen`, `pr-issue-matcher`) emit `ERROR: <reason>` on gh failures rather than partial/fabricated JSON. Orchestrator detects the sentinel, logs the failed ID + first 200 chars of raw response, tracks in a failed-list, and aborts the cluster/match pass if ≥50% of items failed (avoids acting on bad data). - `pr-brief-gen` now sets `diffTruncated: true` when the 30k-char diff cap hits; link-prs verify pass downgrades any `fully-addresses` claim to `related` when either side's brief was truncated. - 3-day auto-close validates `postedAt` parses as ISO-8601 before the elapsed-time comparison; corrupt timestamps are logged and skipped, never acted on. - `gh issue close` failure path no longer drops state — sets `closeAttemptFailed: true` on the entry for next-run retry. Only drops on exit 0. - `closed-pr-dedup-check` idempotency check (`gh pr view --json comments`) now aborts the post on fetch failure rather than falling through — prevents double-posts on gh hiccups. 
- `triage-agent` label pass has preflight `test -f` check for `.claude/agents/triage-agent.md`; skips the pass with a clear log if the file is missing rather than firing Task calls that fail obscurely. - `brief-gen` template-adherence wording flipped from "Ignore … as 'filled'" (ambiguous, read as affirmative) to explicit "A section counts as MISSING when …", matching the `pr-issue-matcher` phrasing. Minor: - `stale-nudge` idempotency check uses substring "has been quiet for" instead of a prefix check that never matched (posted body starts with @<author>). - `closed-dedup-check` distinguishes "upstream crashed" (missing/corrupt triage-state.json, or `lastRunAt == null`) from "legitimately quiet day" (state present, briefs empty) — different log lines. - Slack curl adds `-w "\nHTTP_STATUS:%{http_code}"` + `2>&1` so TLS / 4xx / 5xx errors are visible in captured output. - `stateReason` values from `gh issue view --json stateReason` are UPPERCASE (`COMPLETED`, `NOT_PLANNED`); documented and instruct sub-agent to normalize to lowercase for consistency. Docs: - CLAUDE.md repo-level `.archon/` tree now lists `state/`. - archon-directories.md tree adds `state/` + `scripts/` (both were missing) with purpose descriptions. Deferred (worth doing as a follow-up, not blocking): - DRY/SKIP preamble duplication (~30-50 lines across 5 nodes). - Explicit `BASELINE_IS_EMPTY` capture in link-prs (current derived check works but is a load-bearing model instruction). - Digest `WARNING` prefix block when upstream nodes are missing outputs — today's "(output unavailable)" sub-line is functional. - Pre-existing README workflow-count (17 → 20) and table gaps — not caused by this PR.	2026-04-20 11:34:38 +03:00
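The review fix for the 3-day auto-close clock (validate `postedAt` before comparing elapsed time, skip corrupt timestamps) can be sketched as follows. The workflow itself expresses this as node instructions rather than TypeScript, so the function name, the return values, and the regular expression are illustrative assumptions only.

```typescript
const THREE_DAYS_MS = 3 * 24 * 60 * 60 * 1000;

// Assumed shape check: a full ISO-8601 date-time with an explicit offset or Z.
const ISO_8601_SKETCH = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;

type ClockDecision = "close" | "wait" | "skip-corrupt";

// Hypothetical helper: decide what to do with a pending duplicate warning.
function evaluateAutoCloseClock(postedAt: unknown, now: Date): ClockDecision {
  if (typeof postedAt !== "string" || !ISO_8601_SKETCH.test(postedAt)) {
    return "skip-corrupt"; // log and skip; never act on an unparseable timestamp
  }
  const postedMs = Date.parse(postedAt);
  if (Number.isNaN(postedMs)) {
    return "skip-corrupt";
  }
  return now.getTime() - postedMs >= THREE_DAYS_MS ? "close" : "wait";
}

const nowSketch = new Date("2026-04-20T00:00:00Z");
const decisions = [
  evaluateAutoCloseClock("2026-04-16T00:00:00Z", nowSketch),
  evaluateAutoCloseClock("2026-04-19T00:00:00Z", nowSketch),
  evaluateAutoCloseClock("yesterday", nowSketch),
];
```

The point of the separate "skip-corrupt" outcome is that a bad timestamp is neither treated as expired nor silently reset, which matches the "logged and skipped, never acted on" wording in the commit.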
Fly Lee	eb730c0b82	fix(docs): prevent theme reset to dark after user switches to auto/light (#1079) Starlight removes the `starlight-theme` localStorage key when the user selects "auto" mode. The old init script checked that key, so every navigation or refresh re-forced dark theme. Use a separate `archon-theme-init` sentinel that persists across theme changes. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 10:01:27 +03:00
Anjishnu Sengupta	c495175d94	Fix formatting in README.md (#1059)	2026-04-20 09:59:26 +03:00
Cole Medin	ec5e5a5cf9	feat(providers/pi): opt-in extension discovery via config flag (#1298) Adds `assistants.pi.enableExtensions` (default false) to `.archon/config.yaml`. When true, Pi's `noExtensions` guard is lifted so the session loads tools and lifecycle hooks from `~/.pi/agent/extensions/`, packages installed via `pi install npm:<pkg>`, and the workflow's cwd `.pi/` directory, opening up the community extension ecosystem at https://shittycodingagent.ai/packages. Default stays suppressed to preserve the "Archon is source of truth" trust boundary: enabling this loads arbitrary JS under the Archon server's OS permissions, including whatever extension code the target repo happens to ship. Operators opt in explicitly, per-host. Skills, prompt templates, themes, and context files remain suppressed even when extensions are enabled; only the extensions gate opens. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-19 14:35:52 -05:00
Cole Medin	fb73a500d7	feat(providers/pi): best-effort structured output via prompt engineering (#1297 ) Pi's SDK has no native JSON-schema mode (unlike Claude's outputFormat / Codex's outputSchema). Previously Pi declared structuredOutput: false and any workflow using output_format silently degraded — the node ran, the transcript was treated as free text, and downstream $nodeId.output.field refs resolved to empty strings. 8 bundled/repo workflows across 10 nodes were affected (archon-create-issue, archon-fix-github-issue, archon-smart-pr-review, archon-workflow-builder, archon-validate-pr, etc.). This PR closes the gap via prompt engineering + post-parse: 1. When requestOptions.outputFormat is present, the provider appends a "respond with ONLY a JSON object matching this schema" instruction plus JSON.stringify(schema) to the prompt before calling session.prompt(). 2. bridgeSession accepts an optional jsonSchema param. When set, it buffers every assistant text_delta and — on the terminal result chunk — parses the buffer via tryParseStructuredOutput (trims whitespace, strips ```json / ``` fences, JSON.parse). On success, attaches structuredOutput to the result chunk (matching Claude's shape). On failure, emits a warn event and leaves structuredOutput undefined so the executor's existing dag.structured_output_missing path handles it. 3. Flipped PI_CAPABILITIES.structuredOutput to true. Unlike Claude/Codex this is best-effort, not SDK-enforced — reliable on GPT-5, Claude, Gemini 2.x, recent Qwen Coder, DeepSeek V3, less reliable on smaller or older models that ignore JSON-only instructions. 
Tests added (14 total): - tryParseStructuredOutput: clean JSON, fenced, bare fences, arrays, whitespace, empty, prose-wrapped (fails), malformed, inner backticks - augmentPromptForJsonSchema via provider integration: schema appended, prompt unchanged when absent - End-to-end: clean JSON → structuredOutput parsed; fenced JSON parses; prose-wrapped → no structuredOutput + no crash; no outputFormat → never sets structuredOutput even if assistant happens to emit JSON Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-19 10:16:02 -05:00
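The post-parse step described in this entry (trim whitespace, strip Markdown code fences, then `JSON.parse`) might look roughly like the sketch below. This is an illustration under assumptions, not the provider's `tryParseStructuredOutput`: the name, the `undefined`-on-failure convention, and the exact fence handling are guesses from the commit text.

```typescript
// Build the fence marker programmatically so this example contains no literal fence.
const FENCE = "`".repeat(3);

// Hypothetical sketch: returns the parsed value, or undefined when parsing fails.
function tryParseStructuredOutputSketch(raw: string): unknown {
  let text = raw.trim();
  // Strip a wrapping fence, with or without a "json" language tag.
  if (text.length >= 6 && text.startsWith(FENCE) && text.endsWith(FENCE)) {
    text = text.slice(3, -3);
    if (text.startsWith("json")) {
      text = text.slice(4);
    }
    text = text.trim();
  }
  if (text === "") {
    return undefined;
  }
  try {
    return JSON.parse(text);
  } catch {
    // Prose-wrapped or malformed output: leave structured output unset.
    return undefined;
  }
}

const cleanResult = tryParseStructuredOutputSketch('  {"a": 1}  ');
const fencedResult = tryParseStructuredOutputSketch(FENCE + 'json\n{"a": 1}\n' + FENCE);
const proseResult = tryParseStructuredOutputSketch('Here you go: {"a": 1}');
```

Returning `undefined` rather than throwing mirrors the described behavior of leaving `structuredOutput` unset so the executor's existing missing-output path can handle the failure.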
Cole Medin	83c119af78	fix(providers/pi): wire env injection + harden silent-failure paths (#1296 ) Four defensive fixes to the Pi community provider to match the Claude/Codex contract and eliminate silent error swallowing. 1. envInjection now actually wired (capability was declared but unused) Pi's SDK has no top-level `env` option on createAgentSession, so per-project env vars were being dropped. Routes requestOptions.env through a BashSpawnHook that merges caller env over the inherited baseline (caller wins, matching Claude/Codex semantics). When env is present with no allow/deny, resolvePiTools now explicitly returns Pi's 4 default tools so the pre-constructed default bashTool is replaced with an env-aware one. 2. AsyncQueue no longer leaks on consumer abort. Added close() that drains pending waiters with { done: true } so iterate() exits instead of hanging forever when the producer's finally fires before the next push. bridgeSession calls queue.close() in its finally block. 3. buildResultChunk no longer reports silent success when agent_end fires with no assistant message. Now returns { isError: true, errorSubtype: 'missing_assistant_message' } and logs a warn event so broken Pi sessions don't masquerade as clean completions. 4. session-resolver no longer swallows arbitrary errors from SessionManager.list(). Narrowed the catch to ENOENT/ENOTDIR (the only "session dir doesn't exist yet" signals); permission errors, parse failures, and other unexpected errors now propagate. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-19 09:20:32 -05:00
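The `AsyncQueue` fix in item 2 of this entry (a `close()` that resolves pending waiters with `{ done: true }` so the consumer stops waiting) can be sketched as below. `AsyncQueueSketch` is a hypothetical stand-in; the real class's fields and method signatures are not shown in the commit message and are assumed here.

```typescript
type Waiter<T> = (result: IteratorResult<T>) => void;

// Hypothetical minimal queue bridging callback-style pushes to promise-based pulls.
class AsyncQueueSketch<T> {
  private buffer: T[] = [];
  private waiters: Waiter<T>[] = [];
  private closed = false;

  push(item: T): void {
    if (this.closed) return; // assumption: pushes after close are dropped
    const waiter = this.waiters.shift();
    if (waiter) {
      waiter({ value: item, done: false });
    } else {
      this.buffer.push(item);
    }
  }

  // Drain every pending waiter so no consumer hangs after the producer is done.
  close(): void {
    this.closed = true;
    for (const waiter of this.waiters.splice(0)) {
      waiter({ value: undefined, done: true });
    }
  }

  next(): Promise<IteratorResult<T>> {
    if (this.buffer.length > 0) {
      return Promise.resolve({ value: this.buffer.shift() as T, done: false });
    }
    if (this.closed) {
      return Promise.resolve({ value: undefined, done: true });
    }
    return new Promise((resolve) => this.waiters.push(resolve));
  }
}
```

Without the drain in `close()`, a consumer that called `next()` just before the producer finished would hold a promise that never settles, which is the hang the commit describes.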
Rasmus Widing	60eeb00e42	feat(workflows): inline sub-agent definitions on DAG nodes (#1276) * feat(workflows): inline sub-agent definitions on DAG nodes Add `agents:` node field letting workflow YAML define Claude Agent SDK sub-agents inline, keyed by kebab-case ID. The main agent can spawn them via the Task tool, which is useful for map-reduce patterns where a cheap model briefs items and a stronger model reduces. Authors no longer need standalone `.claude/agents/.md` files for workflow-scoped helpers; the definitions live with the workflow. Claude only. Codex and community providers without the capability emit a capability warning and ignore the field. Merges with the internal `dag-node-skills` wrapper when `skills:` is also set; user-defined agents win on ID collision. * fix(workflows): address PR #1276 review feedback Critical: - Re-export agentDefinitionSchema + AgentDefinition from schemas/index.ts (matches the "schemas/index.ts re-exports all" convention). Important: - Surface user-override of internal 'dag-node-skills' wrapper: warn-level provider log + platform message to the user when agents: redefines the reserved ID alongside skills:. User-wins behavior preserved (by design) but silent capability removal is now observable. - Add validator test coverage for the agents-capability warning (codex node with agents: → warning; claude node → no warning; no-agents field → no warning). 
- Strengthen NodeConfig.agents duplicate-type comment explaining the intentional circular-dep avoidance and pointing at the Zod schema as authoritative source. Actual extraction is follow-up work. Simplifications: - Drop redundant typeof check in validator (schema already enforces). - Drop unreachable Object.keys(...).length > 0 check in dag-executor. - Drop rot-prone "(out of v1 scope)" parenthetical. - Drop WHAT-only comment on AGENT_ID_REGEX. - Tighten AGENT_ID_REGEX to reject trailing/double hyphens (/^[a-z0-9]+(-[a-z0-9]+)$/). Tests: - parseWorkflow strips agents on script: and loop: nodes (parallel to the existing bash: coverage). - provider emits warn log on dag-node-skills collision; no warn on non-colliding inline agents. Docs: - Renumber authoring-workflows Summary section (12b → 13; bump 13-19). - Add Pi capability-table row for inline agents (❌, Claude-only). - Add when-to-use guidance (agents: vs .claude/agents/.md) in the new "Inline sub-agents" section. - Cross-link skills.md Related → inline-sub-agents. - CHANGELOG [Unreleased] Added entry for #1276.	2026-04-19 09:16:01 +03:00
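The tightened agent-ID check (kebab-case, no trailing or double hyphens) can be illustrated with a short sketch. Note an assumption: the pattern as printed in this log, `/^[a-z0-9]+(-[a-z0-9]+)$/`, would accept exactly one hyphenated segment, so the sketch assumes a `*` quantifier after the group was lost when the page was rendered. The constant and function names are hypothetical.

```typescript
// Assumed pattern: one or more lowercase alphanumeric segments joined by single hyphens.
const AGENT_ID_PATTERN_SKETCH = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// Hypothetical helper wrapping the check.
function isValidAgentIdSketch(id: string): boolean {
  return AGENT_ID_PATTERN_SKETCH.test(id);
}

const idChecks = {
  single: isValidAgentIdSketch("triage"),
  twoSegments: isValidAgentIdSketch("brief-gen"),
  threeSegments: isValidAgentIdSketch("pr-issue-matcher"),
  trailingHyphen: isValidAgentIdSketch("brief-"),
  doubleHyphen: isValidAgentIdSketch("brief--gen"),
  upperCase: isValidAgentIdSketch("Brief-Gen"),
};
```

Under this assumed pattern, IDs such as `pr-issue-matcher` (used by the repo-triage workflow elsewhere in this log) are accepted, while trailing and doubled hyphens are rejected.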
Cole Medin	4c6ddd994f	fix(workflows): fail loudly on SDK isError results (#1208) (#1291) Previously, `dag-executor` only failed nodes/iterations when the SDK returned an `error_max_budget_usd` result. Every other `isError: true` subtype, including `error_during_execution`, was silently `break`ed out of the stream with whatever partial output had accumulated, letting failed runs masquerade as successful ones with empty output. This is the most likely explanation for the "5-second crash" symptom in #1208: iterations finish instantly with empty text, the loop keeps going, and only the `claude.result_is_error` log tips the user off. Changes: - Capture the SDK's `errors: string[]` detail on result messages (previously discarded) and surface it through `MessageChunk.errors`. - Log `errors` and `stopReason` alongside `errorSubtype` in `claude.result_is_error` so users can see what actually failed. - Throw from both the general node path and the loop iteration path on any `isError: true` result, including the subtype and SDK errors detail in the thrown message. Note: this does not implement auto-retry. See PR comments on #1121 and the analysis on #1208: a retry-with-fresh-session approach for loop iterations is not obviously correct until we see what `error_during_execution` actually carries in the reporter's env. This change is the observability + fail-loud step that has to come first so that signal is no longer silent. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-18 15:02:35 -05:00
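The fail-loud behavior described in this entry (throw on any `isError: true` result and include the subtype and the SDK's `errors` detail in the message) can be sketched as follows. The interface and function are hypothetical simplifications; only the field names `isError`, `errorSubtype`, and `errors` come from the commit text, and the message wording is invented for illustration.

```typescript
// Hypothetical, reduced view of a terminal result chunk.
interface ResultChunkSketch {
  isError?: boolean;
  errorSubtype?: string;
  errors?: string[];
}

// Hypothetical helper: throw instead of breaking out of the stream with partial output.
function throwOnErrorResultSketch(nodeId: string, chunk: ResultChunkSketch): void {
  if (!chunk.isError) {
    return;
  }
  const subtype = chunk.errorSubtype ?? "unknown";
  const detail = chunk.errors && chunk.errors.length > 0 ? ` (${chunk.errors.join("; ")})` : "";
  throw new Error(`Node '${nodeId}' failed: SDK returned ${subtype}${detail}`);
}

let capturedMessage = "";
try {
  throwOnErrorResultSketch("build", {
    isError: true,
    errorSubtype: "error_during_execution",
    errors: ["boom"],
  });
} catch (err) {
  capturedMessage = (err as Error).message;
}
```

Calling the same helper from both the general node path and the loop-iteration path would keep the two failure modes consistent, which is the design the commit message describes.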
DIY Smart Code	d89bc767d2	fix(setup): align PORT default on 3090 across .env.example, wizard, and JSDoc (#1152 ) (#1271 ) The server's getPort() fallback changed from 3000 to 3090 in the Hono migration (#318), but .env.example, the setup wizard's generated .env, and the JSDoc describing the fallback were not updated — leaving three different sources of truth for "the default PORT." When the wizard writes PORT=3000 to ~/.archon/.env (which the Hono server loads with override: true, while Vite only reads repo-local .env), the two processes can land on different ports silently. That mismatch is the real mechanism behind the failure described in #1152. - .env.example: comment out PORT, document 3090 as the default - packages/cli/src/commands/setup.ts: wizard no longer writes PORT=3000 into the generated .env; fix the "Additional Options" note - packages/cli/src/commands/setup.test.ts: assert no bare PORT= line and the commented default is present - packages/core/src/utils/port-allocation.ts: fix stale JSDoc "default 3000" -> "default 3090" - deploy/.env.example: keep Docker default at 3000 (compose/Caddy target that) but annotate it so users don't copy it for local dev Single source of truth for the local-dev default is now basePort in port-allocation.ts.	2026-04-17 14:15:37 +02:00
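The single-source-of-truth idea in this entry (one `basePort` default of 3090, with `PORT` only overriding it when explicitly set) can be sketched as below. This is not the code in `port-allocation.ts`: the constant name, the function, and the choice to fall back to the default on an invalid `PORT` value are assumptions made for illustration.

```typescript
// Assumed stand-in for the basePort default named in the commit.
const BASE_PORT_SKETCH = 3090;

// Hypothetical resolver: use PORT when it is a valid TCP port, else the shared default.
function resolvePortSketch(env: Record<string, string | undefined>): number {
  const raw = env.PORT ? env.PORT.trim() : "";
  if (raw === "") {
    return BASE_PORT_SKETCH;
  }
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed < 1 || parsed > 65535) {
    return BASE_PORT_SKETCH; // sketch-only choice; the real code may reject instead
  }
  return parsed;
}

const defaultPort = resolvePortSketch({});
const overriddenPort = resolvePortSketch({ PORT: "3000" });
```

The mismatch the commit fixes arises when one process sees an explicit `PORT=3000` and another falls back to the default; leaving `PORT` unset in the generated `.env` means both resolve to the same fallback.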
Rasmus Widing	c864d8e427	refactor(providers/pi): drop rot-prone file:line refs from code comments (#1275 ) Applies the CLAUDE.md comment rule ("don't embed paths/callers that rot as the codebase evolves") flagged by the PR #1271 review to the Pi provider's inline comments. Three spots in the merged Pi code embed `packages/.../provider.ts:N-M` line ranges pointing at the Claude and Codex providers. These ranges will drift the moment those files change — the Claude auth-merge pattern's line numbers are already off-by-a-few in some local branches. Keep the conceptual cross-reference ("mirrors Claude's process-env + request-env merge pattern", "matches the Codex provider's fallback pattern for the same condition") — that's the load-bearing part of the comment — drop the fragile line numbers and file paths. Same treatment for the upstream Pi auth-storage.ts:424-485 reference, which points at a specific line range in a moving dependency. No behavior change; comment-only refactor.	2026-04-17 14:08:43 +02:00
Rasmus Widing	4e56991b72	feat(providers): add Pi community provider (@mariozechner/pi-coding-agent) (#1270 ) * feat(providers): add Pi community provider (@mariozechner/pi-coding-agent) Introduces Pi as the first community provider under the Phase 2 registry, registered with builtIn: false. Wraps Pi's full coding-agent harness the same way ClaudeProvider wraps @anthropic-ai/claude-agent-sdk and CodexProvider wraps @openai/codex-sdk. - PiProvider implements IAgentProvider; fresh AgentSession per sendQuery call - AsyncQueue bridges Pi's callback-based session.subscribe() to Archon's AsyncGenerator<MessageChunk> contract - Server-safe: AuthStorage.inMemory + SessionManager.inMemory + SettingsManager.inMemory + DefaultResourceLoader with all no* flags — no filesystem access, no cross-request state - API key seeded per-call from options.env → process.env fallback - Model refs: '<pi-provider-id>/<model-id>' (e.g. google/gemini-2.5-pro, openrouter/qwen/qwen3-coder) with syntactic compatibility check - registerPiProvider() wired at CLI, server, and config-loader entrypoints, kept separate from registerBuiltinProviders() since builtIn: false is load-bearing for the community-provider validation story - All 12 capability flags declared false in v1 — dag-executor warnings fire honestly for any unmapped nodeConfig field - 58 new tests covering event mapping, async-queue semantics, model-ref parsing, defensive config parsing, registry integration Supported Pi providers (v1): anthropic, openai, google, groq, mistral, cerebras, xai, openrouter, huggingface. Extend PI_PROVIDER_ENV_VARS as needed. Out of scope (v1): session resume, MCP, hooks, skills mapping, thinking level mapping, structured output, OAuth flows, model catalog validation. These remain false on PI_CAPABILITIES until intentionally wired. * feat(providers/pi): read ~/.pi/agent/auth.json for OAuth + api_key passthrough Replaces the v1 env-var-only auth flow with AuthStorage.create(), which reads ~/.pi/agent/auth.json. 
This transparently picks up credentials the user has populated via `pi` → `/login` (OAuth subscriptions: Claude Pro/Max, ChatGPT Plus, GitHub Copilot, Gemini CLI, Antigravity) or by editing the file directly. Env-var behavior preserved: when ANTHROPIC_API_KEY / GEMINI_API_KEY / etc. is set (in process.env or per-request options.env), the adapter calls setRuntimeApiKey which is priority #1 in Pi's resolution chain. Auth.json entries are priority #2-#3. Pi's internal env-var fallback remains priority #4 as a safety net. Archon does not implement OAuth flows itself — it only rides on creds the user created via the Pi CLI. OAuth refresh still happens inside Pi (auth-storage.ts:369-413) under a file lock; concurrent refreshes between the Pi CLI and Archon are race-safe by Pi's own design. - Fail-fast error now mentions both the env-var path and `pi /login` - 2 new tests: OAuth cred from auth.json; env var wins over auth.json - 12 existing tests still pass (env-var-only path unchanged) CI compatibility: no auth.json in CI, no change — env-var (secrets) flows through Pi's getEnvApiKey fallback identically to v1. * test(e2e): add Pi provider smoke test workflow Mirrors e2e-claude-smoke.yaml: single prompt node + bash assert. Targets `anthropic/claude-haiku-4-5` via `provider: pi`; works in CI (ANTHROPIC_API_KEY secret) and locally (user's `pi /login` OAuth). Verified locally with an Anthropic OAuth subscription — full run takes ~4s from session_started to assert PASS, exercising the async-queue bridge and agent_end → result-chunk assembly under real Pi event timing. Not yet wired into .github/workflows/e2e-smoke.yml — separate PR once this lands, to keep the Pi provider PR minimal. * feat(providers/pi): v2 — thinkingLevel, tool restrictions, systemPrompt Extends the Pi adapter with three node-level translations, flipping the corresponding capability flags from false → true so the dag-executor no longer emits warnings for these fields on Pi nodes. 1. 
effort / thinking → Pi thinkingLevel (options-translator.ts) - Archon EffortLevel enum: low\|medium\|high\|max (from packages/workflows/src/schemas/dag-node.ts). `max` maps to Pi's `xhigh` since Archon's enum lacks it. - Pi-native strings (minimal, xhigh, off) also accepted for programmatic callers bypassing the schema. - `off` on either field → no thinkingLevel (Pi's implicit off). - Claude-shape object `thinking: {type:'enabled', budget_tokens:N}` yields a system warning and is not applied. 2. allowed_tools / denied_tools → filtered Pi built-in tools - Supports all 7 Pi tools: read, bash, edit, write, grep, find, ls. - Case-insensitive normalization. - Empty `allowed_tools: []` means no tools (LLM-only), matching e2e-claude-smoke's idiom. - Unknown names (Claude-specific like `WebFetch`) collected and surfaced as a system warning; ignored tools don't fail the run. 3. systemPrompt (AgentRequestOptions + nodeConfig.systemPrompt) - Threaded through `DefaultResourceLoader({systemPrompt})`; Pi's default prompt is replaced entirely. Request-level wins over node-level. Capability flag changes: - thinkingControl: false → true - effortControl: false → true - toolRestrictions: false → true Package delta: - +1 direct dep: @sinclair/typebox (Pi types reference it; adding as direct dep resolves the TS portable-type error). - +1 test file: options-translator.test.ts (19 tests, 100% coverage). - provider.test.ts extended with 11 new tests covering all three paths. - registry.test.ts updated: capability assertion reflects new flags. Live-verified: `bun run cli workflow run e2e-pi-smoke --no-worktree` succeeds in 1.2s with thinkingLevel=low, toolCount=0. Smoke YAML updated to use `effort: low` (schema-valid) + `allowed_tools: []` (LLM-only). * test(e2e): add comprehensive Pi smoke covering every CI-compatible node type Exercises every node type Archon supports under `provider: pi`, except `approval:` (pauses for human input, incompatible with CI): 1. prompt — inline AI prompt 2. 
command — named command file (uses e2e-echo-command.md) 3. loop — bounded iterative AI prompt (max_iterations: 2) 4. bash — shell script with JSON output 5. script — bun runtime (echo-args.js) 6. script — uv / Python runtime (echo-py.py) Plus DAG features on top of Pi: - depends_on + $nodeId.output substitution - when: conditional with JSON dot-access - trigger_rule: all_success merge - final assert node validates every upstream output is non-empty Complements the minimal e2e-pi-smoke.yaml — that stays as the fast-path smoke for connectivity checks; this one is the broader surface coverage. Verified locally end-to-end against Anthropic OAuth (pi /login): PASS, all 9 non-final nodes produce output, assert succeeds. * feat(providers/pi): resolve Archon `skills:` names to Pi skill paths Flips capabilities.skills: false → true by translating Archon's name-based `skills:` nodeConfig (e.g. `skills: [agent-browser]`) to absolute directory paths Pi's DefaultResourceLoader can consume via additionalSkillPaths. Search order for each skill name (first match wins): 1. <cwd>/.agents/skills/<name>/ — project-local, agentskills.io 2. <cwd>/.claude/skills/<name>/ — project-local, Claude convention 3. ~/.agents/skills/<name>/ — user-global, agentskills.io 4. ~/.claude/skills/<name>/ — user-global, Claude convention A directory resolves only if it contains a SKILL.md. Unresolved names are collected and surfaced as a system-chunk warning (e.g. "Pi could not resolve skill names: foo, bar. Searched .agents/skills and .claude/skills (project + user-global)."), matching the semantic of "requested but not found" without aborting the run. Pi's buildSystemPrompt auto-appends the agentskills.io XML block for each loaded skill, so the model sees them — no separate prompt injection needed (Pi differs from Claude here; Claude wraps in an AgentDefinition with a preloaded prompt, Pi uses XML block in system prompt). 
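The four-step skill search order listed above (project `.agents/`, project `.claude/`, then the same two under the home directory, first match wins, unresolved names collected for a warning) can be sketched as below. The function name and the injected `hasSkillFile` predicate are hypothetical; the real resolver presumably checks the filesystem for `SKILL.md` directly, and paths are joined with `/` here purely for brevity.

```typescript
interface SkillResolutionSketch {
  resolved: string[];
  missing: string[];
}

// Hypothetical resolver. `hasSkillFile(dir)` should report whether `dir` contains a SKILL.md.
function resolveSkillPathsSketch(
  names: string[],
  cwd: string,
  home: string,
  hasSkillFile: (dir: string) => boolean,
): SkillResolutionSketch {
  const resolved: string[] = [];
  const missing: string[] = [];
  // De-duplicate requested names while keeping their order.
  for (const name of Array.from(new Set(names))) {
    const candidates = [
      `${cwd}/.agents/skills/${name}`,
      `${cwd}/.claude/skills/${name}`,
      `${home}/.agents/skills/${name}`,
      `${home}/.claude/skills/${name}`,
    ];
    const hit = candidates.find((dir) => hasSkillFile(dir));
    if (hit) {
      resolved.push(hit);
    } else {
      missing.push(name);
    }
  }
  return { resolved, missing };
}

// Fake filesystem for the example: the skill exists both project-local and user-global.
const fakeSkillDirs = new Set([
  "/repo/.claude/skills/archon-dev",
  "/home/u/.agents/skills/archon-dev",
]);
const skillResult = resolveSkillPathsSketch(
  ["archon-dev", "foo", "archon-dev"],
  "/repo",
  "/home/u",
  (dir) => fakeSkillDirs.has(dir),
);
```

In this example the project-local `.claude/` copy wins over the user-global `.agents/` copy, and `foo` ends up in `missing`, which corresponds to the names reported in the "could not resolve skill names" warning.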
Ancestor directory traversal above cwd is deliberately skipped in this pass — matches the Pi provider's cwd-bound scope and avoids ambiguity about which repo's skills win when Archon runs from a subdirectory. Bun's os.homedir() bypasses the HOME env var; the resolver uses `process.env.HOME ?? homedir()` so tests can stage a synthetic home dir. Tests: - 11 new tests in options-translator.test.ts cover project/user, .agents/ vs .claude/, project-wins-over-user, SKILL.md presence check, dedup, missing-name collection. - 2 new integration tests in provider.test.ts cover the missing-skill warning path and the "no skills configured → no additionalSkillPaths" path. - registry.test.ts updated to assert skills: true in capabilities. Live-verified locally: `.claude/skills/archon-dev/SKILL.md` resolves, pi.session_started log shows `skillCount: 1, missingSkillCount: 0`, smoke workflow passes in 1.2s. * feat(providers/pi): session resume via Pi session store Flips capabilities.sessionResume: false → true. Pi now persists sessions under ~/.pi/agent/sessions/<encoded-cwd>/<uuid>.jsonl by default — same pattern Claude and Codex use for their respective stores, same blast radius as those providers. Flow: - No resumeSessionId → SessionManager.create(cwd) (fresh, persisted) - resumeSessionId + match in SessionManager.list(cwd) → open(path) - resumeSessionId + no match → fresh session + system warning ("⚠️ Could not resume Pi session. Starting fresh conversation.") Matches Codex's resume_thread_failed fallback at packages/providers/src/codex/provider.ts:553-558. The sessionId flows back to Archon via the terminal `result` chunk — bridgeSession annotates it with session.sessionId unconditionally so Archon's orchestrator can persist it and pass it as resumeSessionId on the next turn. Same mechanism used for Claude/Codex. Cross-cwd resume (e.g. worktree switch) is deliberately not supported in this pass: list(cwd) scans only the current cwd's session dir. 
A workflow that changes cwd mid-run lands on a fresh session, which matches Pi's mental model. Bridge sessionId annotation uses session.sessionId, which Pi always populates (UUID) — so no special-case for inMemory sessions is needed. Factored the resolver into session-resolver.ts (5 unit tests): - no id → create - id + match → open - id + no match → create with resumeFailed: true - list() throws → resumeFailed: true (graceful) - empty-string id → treated as "no resume requested" Integration tests in provider.test.ts add 3 cases: - resume-not-found yields warning + calls create - resume-match calls open with the file path, no warning - result chunk always carries sessionId Verified live end-to-end against Anthropic OAuth: - first call → sessionId 019d...; model replies "noted" - second call with that sessionId → "resumed: true" in logs; model correctly recalls prior turn ("Crimson.") - bogus sessionId → "⚠️ Could not resume..." warning + fresh UUID * refactor(providers,core): generalize community-provider registration Addresses the community-pattern regression flagged in the PR #1270 review: a second community provider should require editing only its own directory, not seven files across providers/ + core/ + cli/ + server/. Three changes: 1. Drop typed `pi` slot from AssistantDefaultsConfig + AssistantDefaults. Community providers live behind the generic `[string]` index that `ProviderDefaultsMap` was explicitly designed to provide. The typed claude/codex slots stay — they give IDE autocomplete for built-in config access without `as` casts, which was the whole reason the intersection exists. Community providers parse their own config via Record<string, unknown> anyway, so the typed slot added no real parser safety. 2. Loop-based getDefaults + mergeAssistantDefaults. No more hardcoded `pi: {}` spreads. getDefaults() seeds from `getRegisteredProviders()`; mergeAssistantDefaults clones every slot present in `base`. 
Adding a new provider requires zero edits to this function. 3. New `registerCommunityProviders()` aggregator in registry.ts. Entrypoints (CLI, server, config-loader) call ONE function after `registerBuiltinProviders()` rather than one call per community provider. Adding a new community provider is now a single-line edit to registerCommunityProviders(). This makes Pi (and future community providers) actually behave like Phase 2 (#1195) advertised: drop the implementation under packages/providers/src/community/<id>/, export a `register<Id>Provider`, add one line to the aggregator. Tests: - New `registerCommunityProviders` suite (2 tests: registers pi, idempotent). - config-loader.test updated: assert built-in slots explicitly rather than exhaustive map shape. No functional change for Pi end-users. Purely structural. * fix(providers/pi,core): correctness + hygiene fixes from PR #1270 review Addresses six of the review's important findings, all within the same PR branch: 1. envInjection: false → true The provider reads requestOptions.env on every call (for API-key passthrough). Declaring the capability false caused a spurious dag-executor warning for every Pi user who configured codebase env vars — which is the MAIN auth path. Flipping to true removes the false positive. 2. toSafeAssistantDefaults: denylist → allowlist The old shape deleted `additionalDirectories`, `settingSources`, `codexBinaryPath` before sending defaults to the web UI. Any future sensitive provider field (OAuth token, absolute path, internal metadata) would silently leak via the `[key: string]: unknown` index signature. New SAFE_ASSISTANT_FIELDS map lists exactly what to expose per provider; unknown providers get an empty allowlist so the web UI sees "provider exists" but no config details. 3. AsyncQueue single-consumer invariant The type was documented single-consumer but unenforced. A second `for await` would silently race with the first over buffer + waiters. 
Added a synchronous guard in Symbol.asyncIterator that throws on second call — copy-paste mistakes now fail fast with a clear message instead of dropping items. 4. session.dispose() / session.abort() silent catches Both catch blocks now log at debug via a module-scoped logger so SDK regressions surface without polluting normal output. 5. Type scripted events as AgentSessionEvent in provider.test.ts Was `Record<string, unknown>` — Pi field renames would silently keep tests passing. Now typed against Pi's actual event union. 6. Leaked /tmp/pi-research/... path in provider.ts comment Local-machine path that crept in during research. Replaced with the upstream GitHub URL (matches convention at provider.ts:110). Plus review-flagged simplifications: - Extract lookupPiModel wrapper — isolates the `as unknown as` cast behind one searchable name. - Hoist QueueItem → BridgeQueueItem at module scope (export'd for test visibility; not used externally yet but enables unit testing the mapping in isolation if needed later). - getRegisteredProviderNames: remove side-effecting registration calls. `loadConfig()` already bootstraps the registry before any caller can observe this helper — the hidden coupling was misleading. Plus missing-coverage tests from the review (pr-test-analyzer): - session.prompt() rejection → error surfaces to consumer - pre-aborted signal → session.abort() called - mid-stream abort → session.abort() called - modelFallbackMessage → system chunk yielded - AsyncQueue second-consumer → throws synchronously No behavioral changes for end users beyond the envInjection warning fix. * docs: Pi provider + community-provider contributor guide Addresses the PR #1270 review's docs-impact findings: the original Pi PR had no user-facing or contributor-facing documentation, and architecture.md still referenced the pre-Phase-2 factory.ts pattern (factory.ts was deleted in #1195). 1. 
packages/docs-web/src/content/docs/reference/architecture.md - Replace stale factory.ts references with the registry pattern. - Update inline IAgentProvider block: add getCapabilities, add options parameter. - Rewrite MessageChunk block as the actual discriminated union (was a placeholder with optional fields that didn't match the current type). - "Adding a New AI Agent Provider" checklist now distinguishes built-in (register in registerBuiltinProviders) from community (separate guide). Links to the new contributor guide. 2. packages/docs-web/src/content/docs/contributing/adding-a-community-provider.md (new) - Step-by-step guide using Pi as the reference implementation. - Covers: directory layout, capability discipline (start false, flip one at a time), provider class skeleton, registration via aggregator, test isolation (Bun mock.module pollution), what NOT to do (no edits to AssistantDefaultsConfig, no direct registerProvider from entrypoints, no overclaiming capabilities). 3. packages/docs-web/src/content/docs/getting-started/ai-assistants.md - New "Pi (Community Provider)" section: install, OAuth + API-key table per Pi backend, model ref format, workflow examples, capability matrix showing what Pi supports (session resume, tool restrictions, effort/thinking, skills, system prompt, envInjection) and what it doesn't (MCP, hooks, structured output, cost control, fallback model, sandbox). 4. .env.example - New Pi section with commented env vars for each supported backend (ANTHROPIC_API_KEY through HUGGINGFACE_API_KEY), each paired with its Pi provider id. OAuth flow (pi /login → auth.json) is explicitly called out — Archon reads that file too. 5. CHANGELOG.md - Unreleased entry for Pi, registerCommunityProviders aggregator, and the new contributor guide.	2026-04-17 13:52:03 +02:00
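The resume flow listed for session-resolver.ts in the commit above (no id → create, id + match → open, id + no match or failing list() → create with resumeFailed: true) can be sketched roughly as follows. This is a minimal illustration only: `SessionStore`, `SessionEntry`, `Resolution`, and `resolveSession` are hypothetical names, and the store is assumed to be synchronous for brevity. It is not Pi's SessionManager API or the actual Archon source.

```typescript
// Hypothetical, simplified stand-in for a session store (assumption, not Pi's API).
interface SessionEntry {
  id: string;
  path: string;
}

interface SessionStore {
  list(cwd: string): SessionEntry[];
}

type Resolution =
  | { action: 'create'; resumeFailed: boolean }
  | { action: 'open'; path: string };

function resolveSession(
  store: SessionStore,
  cwd: string,
  resumeSessionId?: string
): Resolution {
  // Undefined and empty-string ids both mean "no resume requested".
  if (!resumeSessionId) {
    return { action: 'create', resumeFailed: false };
  }
  try {
    // Only the current cwd's sessions are scanned, so cross-cwd resume never matches.
    const entries = store.list(cwd);
    for (let i = 0; i < entries.length; i++) {
      if (entries[i].id === resumeSessionId) {
        return { action: 'open', path: entries[i].path };
      }
    }
  } catch {
    // A failing list() degrades to a fresh session instead of propagating the error.
  }
  return { action: 'create', resumeFailed: true };
}
```

A caller would presumably emit the "Could not resume" system warning whenever `resumeFailed` is true; that wiring is left out of the sketch.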
DIY Smart Code	922edbbac0	Merge pull request #1272 from coleam00/fix/issue-1260-docker-bind-mount-dirs fix(docker): create /.archon subdirs in entrypoint for bind mounts (#1260)	2026-04-17 12:50:43 +02:00
Leex	a7337d6977	fix(docker): create /.archon subdirs in entrypoint for bind mounts (#1260) Named volumes inherit /.archon/workspaces and /.archon/worktrees from the image layer on first run, but bind mounts do not. Without these directories, the Claude subprocess is spawned with a non-existent cwd and fails silently, causing the 60s first-event timeout. Adding mkdir -p in the entrypoint is idempotent for named volumes and fixes bind-mount setups (e.g. ARCHON_DATA pointing to a host path on macOS/Linux).	2026-04-17 12:40:13 +02:00
Rasmus Widing	301a139e5a	fix(core/test): split connection.test.ts from DB-test batch to avoid mock pollution (#1269) messages.test.ts uses mock.module('./connection', ...) at module-load time. Per CLAUDE.md:131 (Bun issue oven-sh/bun#7823), mock.module() is process-global and irreversible. When Bun pre-loads all test files in a batch, the mock shadows the real connection module before connection.test.ts runs, causing getDatabaseType() to always return the mocked value regardless of DATABASE_URL. Move connection.test.ts into its own `bun test` invocation immediately after postgres.test.ts (which runs alone) and before the big DB/utils/config/state batch that contains messages.test.ts. This follows the same isolation pattern already used for command-handler, clone, postgres, and path-validation tests.	2026-04-17 09:33:52 +02:00
Cole Medin	bed36ca4ad	fix(workflows): add word boundary to context variable substitution regex (#1256) * fix(workflows): add word boundary to context variable substitution regex (#1112) Variable substitution for $CONTEXT, $EXTERNAL_CONTEXT, and $ISSUE_CONTEXT was matching as a prefix of longer identifiers like $CONTEXT_FILE, silently corrupting bash node scripts. Added negative lookahead (?![A-Za-z0-9_]) to CONTEXT_VAR_PATTERN_STR so only exact variable names are substituted. Changes: - Add negative lookahead to CONTEXT_VAR_PATTERN_STR regex in executor-shared.ts - Add regression test for prefix-match boundary case Fixes #1112 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test(workflows): add missing boundary cases for context variable substitution Add three new test cases that complete coverage of the word-boundary fix from #1112: $ISSUE_CONTEXT with suffix variants, $ISSUE_CONTEXT with multiple suffixes, and contextSubstituted=false for suffix-only prompts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 18:32:06 -05:00
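The negative-lookahead fix described in the commit above can be illustrated with a small sketch. The pattern below is reconstructed from the commit message alone; the real CONTEXT_VAR_PATTERN_STR in executor-shared.ts may be written differently, and `substituteContext` is a hypothetical helper that assumes all three variables receive the same value.

```typescript
// Assumed pattern: the three variable names followed by the lookahead quoted in the commit.
// Without (?![A-Za-z0-9_]), "$CONTEXT" would also match the prefix of "$CONTEXT_FILE".
const CONTEXT_VAR_PATTERN =
  /\$(?:ISSUE_CONTEXT|EXTERNAL_CONTEXT|CONTEXT)(?![A-Za-z0-9_])/g;

// Hypothetical helper: replaces only exact variable names.
function substituteContext(text: string, value: string): string {
  // A replacer function avoids "$&"-style expansion if the value contains "$".
  return text.replace(CONTEXT_VAR_PATTERN, () => value);
}

const script = 'cat "$CONTEXT_FILE"; echo "$CONTEXT"';
const substituted = substituteContext(script, 'issue body');
```

Here `$CONTEXT_FILE` is left untouched because the character after `$CONTEXT` is `_`, which the lookahead rejects, while the standalone `$CONTEXT` is replaced.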
Cole Medin	df828594d7	fix(test): normalize on-disk content to LF in bundled-defaults test Companion to `75427c7c`. The bundle-completeness test compared BUNDLED_* strings (now LF-normalized by the generator) against raw readFileSync output, which is CRLF on Windows checkouts. Apply the same normalization to the on-disk side so the defense-in-depth check stays meaningful on every platform. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-16 17:59:41 -05:00
Cole Medin	75427c7cdd	fix(ci): normalize line endings in bundled-defaults generator On Windows, `git checkout` converts source files to CRLF via the `* text=auto` policy. The generator inlined raw file content as JSON strings, so the Windows regeneration produced `\r\n` escapes while the committed artifact (written on Linux) used `\n`. `bun run check:bundled` then flagged the file as stale and failed the Windows CI job. Fix by normalizing CRLF → LF both when reading source defaults and when comparing against the existing generated file. No-op on Linux. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-16 17:55:24 -05:00
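The CRLF → LF normalization described in the two commits above amounts to applying the same transform to both sides of a comparison. A minimal sketch, assuming hypothetical helper names (`normalizeLineEndings`, `isStale`) rather than the generator's actual functions:

```typescript
// Convert Windows line endings to LF; a no-op for content that is already LF-only.
function normalizeLineEndings(content: string): string {
  return content.replace(/\r\n/g, '\n');
}

// Hypothetical staleness check: compare the committed artifact with a fresh
// generation after normalizing both, so a CRLF checkout is not flagged as stale.
function isStale(existingGenerated: string, freshlyGenerated: string): boolean {
  return (
    normalizeLineEndings(existingGenerated) !==
    normalizeLineEndings(freshlyGenerated)
  );
}
```

Normalizing both inputs, rather than only the freshly read one, keeps the check symmetric regardless of which platform wrote the committed file.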
DIY Smart Code	b7b445bd31	Merge pull request #1110 from LocNguyenSGU/fix/issue-1108-settings-add-project-url-support fix: accept GitHub URLs in settings add project	2026-04-16 23:55:51 +02:00
Leex	9dd57b2f3c	fix(web): unify Add Project URL/path classification across UI entry points Settings → Projects Add Project only submitted { path }, so GitHub URLs entered there failed even though the API and the Sidebar Add Project already accepted them. Closes #1108. Changes: - Add packages/web/src/lib/codebase-input.ts: shared getCodebaseInput() helper returning a discriminated { path } | { url } union (re-exported from api.ts for convenience). - Use the helper from all three Add Project entry points: Sidebar, Settings, and ChatPage. Removes three divergent inline heuristics. - SettingsPage: rename addPath → addValue (state now holds either URL or local path) and update placeholder text. - Tests: cover https://, git@ shorthand, ssh://, git://, whitespace, unix/relative/home/Windows/UNC paths. - Docs: document the unified Add Project entry point in adapters/web.md. Heuristic flips from "assume URL unless explicitly local" to "assume local unless explicitly remote" — only inputs starting with https?://, ssh://, git@, or git:// are sent as { url }; everything else is sent as { path }. The server already resolves tilde/relative paths. Co-authored-by: Nguyen Huu Loc <lockbkbang@gmail.com>	2026-04-16 23:43:19 +02:00
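The "assume local unless explicitly remote" heuristic from the commit above could look roughly like this. It is a reimplementation from the commit description, not the contents of codebase-input.ts; trimming the input and matching prefixes case-sensitively are assumptions.

```typescript
type CodebaseInput = { path: string } | { url: string };

// Only these prefixes are treated as remote; everything else is a local path.
const REMOTE_PREFIX = /^(?:https?:\/\/|ssh:\/\/|git@|git:\/\/)/;

function getCodebaseInput(raw: string): CodebaseInput {
  const value = raw.trim(); // assumption: surrounding whitespace is ignored
  return REMOTE_PREFIX.test(value) ? { url: value } : { path: value };
}
```

Under this sketch, `getCodebaseInput('git@github.com:coleam00/Archon.git')` yields a `{ url }` value, while `~/code/archon`, `./repo`, and `C:\repos\archon` all yield `{ path }` and are left for the server to resolve.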
Rasmus Widing	86e4c8d605	fix(bundled-defaults): auto-generate import list, emit inline strings (#1263) * fix(bundled-defaults): auto-generate import list, emit inline strings Root-cause fix for bundle drift (15 commands + 7 workflows previously missing from binary distributions) and a prerequisite for packaging @archon/workflows as a Node-loadable SDK. The hand-maintained `bundled-defaults.ts` import list is replaced by `scripts/generate-bundled-defaults.ts`, which walks `.archon/{commands,workflows}/defaults/` and emits a generated source file with inline string literals. `bundled-defaults.ts` becomes a thin facade that re-exports the generated records and keeps the `isBinaryBuild()` helper. Inline strings (via JSON.stringify) replace Bun's `import X from '...' with { type: 'text' }` attributes. The binary build still embeds the data at compile time, but the module now loads under Node too — removing SDK blocker #2. - Generator: `scripts/generate-bundled-defaults.ts` (+ `--check` mode for CI) - `package.json`: `generate:bundled`, `check:bundled`; wired into `validate` - `build-binaries.sh`: regenerates defaults before compile - Test: `bundle completeness` now derives expected set from on-disk files - All 56 defaults (36 commands + 20 workflows) now in the bundle * fix(bundled-defaults): address PR review feedback Review: https://github.com/coleam00/Archon/pull/1263#issuecomment-4262719090 Generator: - Guard against .yaml/.yml name collisions (previously silent overwrite) - Add early access() check with actionable error when run from wrong cwd - Type top-level catch as unknown; print only message for Error instances - Drop redundant /* eslint-disable */ emission (global ignore covers it) - Fix misleading CI-mechanism claim in header comment - Collapse dead `if (!ext) continue` guard into a single typed pass Scripts get real type-checking + linting: - New scripts/tsconfig.json extending root config - type-check now includes scripts/ via `tsc --noEmit -p 
scripts/tsconfig.json` - Drop `scripts/` from eslint ignores; add to projectService file scope Tests: - Inline listNames helper (Rule of Three) - Drop redundant toBeDefined/typeof assertions; the Record<string, string> type plus length > 50 already cover them - Add content-fidelity round-trip assertion (defense against generator content bugs, not just key-set drift) Facade comment: drop dead reference to .claude/rules/dx-quirks.md. CI: wire `bun run check:bundled` into .github/workflows/test.yml so the header's CI-verification claim is truthful. Docs: CLAUDE.md step count four→five; add contributor bullet about `bun run generate:bundled` in the Defaults section and CONTRIBUTING.md. chore(e2e): bump Codex model to gpt-5.2 gpt-5.1-codex-mini is deprecated and unavailable on ChatGPT-account Codex auth. Plain gpt-5.2 works. Verified end-to-end: - e2e-codex-smoke: structured output returns {category:'math'} - e2e-mixed-providers: claude+codex both return expected tokens	2026-04-16 21:27:51 +02:00
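The generator approach described in the commit above (inline strings via JSON.stringify, plus a guard against .yaml/.yml name collisions) can be sketched without touching the filesystem. `DefaultFile` and `emitRecord` are hypothetical names, and the emitted layout is an assumption; the actual scripts/generate-bundled-defaults.ts may differ.

```typescript
// Hypothetical input shape: a file name and its already-read content.
interface DefaultFile {
  fileName: string;
  content: string;
}

// Emit TypeScript source for a Record<string, string> keyed by base name.
function emitRecord(exportName: string, files: DefaultFile[]): string {
  const seen: Record<string, string> = {};
  const entries: string[] = [];
  // Sort for deterministic output so a --check style comparison is stable.
  const sorted = files
    .slice()
    .sort((a, b) => (a.fileName < b.fileName ? -1 : a.fileName > b.fileName ? 1 : 0));
  for (const file of sorted) {
    const name = file.fileName.replace(/\.(?:md|ya?ml)$/, '');
    // Fail loudly instead of silently overwriting, e.g. foo.yaml vs foo.yml.
    if (Object.prototype.hasOwnProperty.call(seen, name)) {
      throw new Error(`Name collision: ${seen[name]} and ${file.fileName}`);
    }
    seen[name] = file.fileName;
    // JSON.stringify yields a valid, fully escaped string literal.
    entries.push(`  ${JSON.stringify(name)}: ${JSON.stringify(file.content)},`);
  }
  return `export const ${exportName}: Record<string, string> = {\n${entries.join('\n')}\n};\n`;
}
```

Because the output is plain string literals, the generated module carries no Bun-specific import attributes, which is the property the commit cites for loading under Node.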
Cole Medin	d535c832e3	feat(telemetry): anonymous PostHog workflow-invocation tracking (#1262 ) * feat(telemetry): add anonymous PostHog workflow-invocation tracking Emits one `workflow_invoked` event per run with workflow name/description, platform, and Archon version. Uses a stable random UUID persisted to `$ARCHON_HOME/telemetry-id` for distinct-install counting, with `$process_person_profile: false` to stay in PostHog's anonymous tier. Opt out with `ARCHON_TELEMETRY_DISABLED=1` or `DO_NOT_TRACK=1`. Self-host via `POSTHOG_API_KEY` / `POSTHOG_HOST`. Closes #1261 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(telemetry): stop leaking test events to production PostHog The `telemetry-id preservation` test exercised the real capture path with the embedded production key, so every `bun run validate` published a tombstone `workflow_name: "w"` event. Redirect POSTHOG_HOST to loopback so the flush fails silently; bump test timeout to accommodate the retry-then-give-up window. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(telemetry): silence posthog-node stderr leak on network failure The PostHog SDK's internal logFlushError() writes 'Error while flushing PostHog' directly to stderr via console.error on any network or HTTP error, bypassing logger config. For a fire-and-forget telemetry path this leaked stack traces to users' terminals whenever PostHog was unreachable (offline, firewalled, DNS broken, rate-limited). Pass a silentFetch wrapper to the PostHog client that masks failures as fake 200 responses. The SDK never sees an error, so it never logs. Original failure is still recorded at debug level for diagnostics. Side benefit: shutdown is now fast on network failure (no retry loop), so offline CLI commands no longer hang ~10s on exit. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(telemetry): make id-preservation test deterministic Replace the fire-and-forget capture + setTimeout + POSTHOG_HOST-loopback dance with a direct synchronous call to getOrCreateTelemetryId(). Export the function with an @internal marker so tests can exercise the id path without spinning up the PostHog client. No network, no timer, no flake. Addresses CodeRabbit feedback on #1262. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-16 13:45:55 -05:00
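The opt-out and event shape described in the telemetry commit above can be sketched as a pure function. The env-var names, the `workflow_invoked` event name, `workflow_name`, and `$process_person_profile: false` come from the commit message; the remaining property keys, the `WorkflowInfo` and `CaptureEvent` types, and treating only the value `1` as an opt-out are assumptions for illustration.

```typescript
interface WorkflowInfo {
  name: string;
  description: string;
  platform: string;
  archonVersion: string;
}

interface CaptureEvent {
  distinctId: string;
  event: string;
  properties: Record<string, string | boolean>;
}

// Assumption: only the literal value "1" disables telemetry.
function isTelemetryDisabled(env: Record<string, string | undefined>): boolean {
  return env.ARCHON_TELEMETRY_DISABLED === '1' || env.DO_NOT_TRACK === '1';
}

// Returns null when the user has opted out, otherwise one event per run.
function buildWorkflowInvokedEvent(
  telemetryId: string,
  info: WorkflowInfo,
  env: Record<string, string | undefined>
): CaptureEvent | null {
  if (isTelemetryDisabled(env)) {
    return null;
  }
  return {
    distinctId: telemetryId,
    event: 'workflow_invoked',
    properties: {
      workflow_name: info.name,
      workflow_description: info.description, // hypothetical key
      platform: info.platform, // hypothetical key
      archon_version: info.archonVersion, // hypothetical key
      $process_person_profile: false, // keeps the event anonymous in PostHog
    },
  };
}
```

Passing the environment in as a parameter, instead of reading it globally, keeps the sketch testable without network access or a PostHog client.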
Cole Medin	f1c5dcb231	Merge pull request #1255 from coleam00/feat/e2e-smoke-tests feat(ci): add E2E smoke test workflows for Claude and Codex	2026-04-16 11:48:15 -05:00
Cole Medin	47be699e00	chore(ci): remove test branch trigger before merge Removes feat/e2e-smoke-tests from E2E workflow triggers. CI failure detection verified: red X on run 24522356737 (deliberate bash exit 1), green on run 24522484762 (reverted), and credit-exhaustion failure also correctly produced exit 1. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 11:46:23 -05:00
Cole Medin	2682430543	test(ci): temporarily re-add branch trigger to verify green CI Will remove feat/e2e-smoke-tests trigger in the final cleanup commit before merging to dev. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 11:43:54 -05:00
Cole Medin	7d38716f1f	fix(ci): revert deliberate failure, remove test branch trigger Reverts the injected exit 1 in bash-echo (CI red X confirmed in run 24522356737). Removes feat/e2e-smoke-tests from branch triggers — ready to merge to dev. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 11:43:33 -05:00
Cole Medin	367de7a625	test(ci): inject deliberate failure to verify CI red X Injects exit 1 into e2e-deterministic bash-echo node to prove the engine fix (failWorkflowRun on anyFailed) propagates to a non-zero CLI exit code and a red X in GitHub Actions. Will be reverted in the next commit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 11:40:55 -05:00
Cole Medin	18681701b3	fix(ci): remove command node from Claude smoke test Command nodes consistently produce zero output and hit the 30s idle timeout in CI, even with allowed_tools: []. This appears to be a bug in how command: nodes interact with the Claude CLI subprocess — the process never emits output. This adds 30s of wasted time to every run. The simple prompt node already verifies Claude connectivity. Command file discovery/loading is a deterministic operation that doesn't need an AI call to validate in a smoke test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 11:05:48 -05:00
Cole Medin	1c600f2b62	fix(ci): add allowed_tools: [] to command node to prevent 30s hang The command-test node was missing allowed_tools: [], causing the Claude CLI to load full tool access. Without tools restricted, the subprocess hangs after responding. The simple prompt node with allowed_tools: [] completes in 4s — this should match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 11:03:04 -05:00
Cole Medin	bf9091159c	refactor(ci): strip E2E smoke tests to bare minimum for speed Claude CLI is extremely slow with structured output (~4 min) and tool use (~2 min) in CI, making the previous multi-workflow approach take 10+ min. Radical simplification: - Remove e2e-all-nodes (redundant with deterministic + claude-smoke) - Remove e2e-skills-mcp (advanced features too slow for per-commit smoke) - Remove structured output and tool use from Claude smoke test (too slow) - Strip Claude smoke to: 1 prompt + 1 command + 1 bash verify node - Keep mixed providers (simplified: 1 Claude + 1 Codex + bash verify) - All timeouts reduced to 30s, all job timeouts to 5 min - Remove MCP test fixtures and e2e-test-skill (no longer needed) Expected: Claude job ~15s of AI time, Codex ~5s, mixed ~10s Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 10:50:11 -05:00
Cole Medin	4c259e7a0a	fix(ci): increase Claude E2E job timeout from 10 to 20 minutes Claude CLI is slow with structured output and tool use in CI (~4 min for structured output, ~2 min for tool use). With 3 sequential workflow runs (claude-smoke, all-nodes, skills-mcp), 10 minutes is insufficient. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 10:46:48 -05:00
Cole Medin	d666b3c7ca	fix(ci): resolve 5 E2E smoke test failures from first CI run - Rename echo-args.py → echo-py.py to avoid duplicate script name conflict with echo-args.js (script discovery uses base name, not extension) - Add CODEX_API_KEY env var to codex and mixed CI jobs (Codex CLI requires this, not OPENAI_API_KEY, for headless auth) - Sequentialize all Claude AI nodes via depends_on chains to prevent concurrent CLI subprocess idle timeouts in CI - Increase idle_timeout from 60s to 120s on all AI nodes for CI headroom - Override MCP test node to model: sonnet (Haiku doesn't support MCP tool search) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 10:34:57 -05:00
Cole Medin	7d9090678e	feat(ci): add E2E smoke test workflows for Claude and Codex providers Adds real workflow execution to CI, verifying the full engine works end-to-end with both providers. Organized into 4 tiers: deterministic (0 API calls), Claude, Codex, and mixed-provider tests. New workflows: - e2e-deterministic: bash, script (bun/uv), conditions, trigger rules - e2e-skills-mcp: skills injection, MCP server, effort, systemPrompt - Enhanced existing e2e-claude-smoke, e2e-codex-smoke, e2e-mixed-providers - Fixed e2e-all-nodes (was broken due to script node syntax) Supporting files: - e2e-echo-command.md (test command file) - echo-args.py (Python script for uv runtime test) - e2e-test-skill/SKILL.md (minimal skill for injection test) - e2e-filesystem.json (MCP config for filesystem server test) GitHub Actions: .github/workflows/e2e-smoke.yml - Runs on push to main/dev only (no PR trigger to avoid API cost abuse) - Uses haiku (Claude) and gpt-5.1-codex-mini (Codex) for cost efficiency Closes #1254 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 10:12:06 -05:00
Cole Medin	7721259bdc	fix(core): surface auth errors instead of silently dropping them (#1089 ) * fix: surface auth errors instead of silently dropping them (#1076) When Claude OAuth refresh token is expired, the SDK yields a result chunk with is_error=true and no session_id. Both handleStreamMode and handleBatchMode guarded the result branch with `&& msg.sessionId`, silently dropping the error. Users saw no response at all. Changes: - Remove sessionId guard from result branches in orchestrator-agent.ts - Add isError early-exit that sends error message to user - Add 4 OAuth patterns to AUTH_PATTERNS in claude.ts and codex.ts - Add OAuth refresh-token handler to error-formatter.ts - Add tests for new error-formatter branches Fixes #1076 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add structured logging to isError path and remove overly broad auth pattern - Add getLog().warn({ conversationId, errorSubtype }, 'ai_result_error') in both handleStreamMode and handleBatchMode isError branches so auth failures are visible server-side instead of silently swallowed - Remove 'access token' from AUTH_PATTERNS in claude.ts and codex.ts; the real OAuth refresh error is already covered by 'refresh token' and 'could not be refreshed', eliminating false-positive auth classification risk Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: route isError results through classifyAndFormatError with provider-specific messages The isError path in stream/batch mode used a hardcoded generic message, bypassing the classifyAndFormatError infrastructure. Now constructs a synthetic Error from errorSubtype and routes through the formatter. 
Error formatter updated with provider-specific auth detection: - Claude: OAuth token refresh, sign-in expired → guidance to run /login - Codex: 401 retry exhaustion → guidance to run codex login - General: tightened patterns (removed broad 'auth error' substring match) Also persists session ID before early-returning on isError. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 09:36:40 -05:00
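The behavioral change in the commit above (no longer requiring a session id before handling a result chunk, and routing error results through a formatter) can be sketched as below. `ResultChunk`, `Outcome`, and `handleResult` are hypothetical simplifications; the real orchestrator-agent.ts and classifyAndFormatError signatures are not shown in the log and may differ.

```typescript
// Hypothetical, reduced shape of a terminal result chunk.
interface ResultChunk {
  type: 'result';
  sessionId?: string;
  isError?: boolean;
  errorSubtype?: string;
}

interface Outcome {
  persistedSessionId?: string;
  userMessage?: string;
}

function handleResult(
  chunk: ResultChunk,
  formatError: (error: Error) => string
): Outcome {
  const outcome: Outcome = {};
  // Persist the session id first, if one exists, so an error does not lose it.
  if (chunk.sessionId) {
    outcome.persistedSessionId = chunk.sessionId;
  }
  // Error results are handled even when no session id is present; guarding the
  // whole branch on sessionId is what previously dropped them silently.
  if (chunk.isError) {
    const synthetic = new Error(chunk.errorSubtype ?? 'unknown error');
    outcome.userMessage = formatError(synthetic);
  }
  return outcome;
}
```

With a formatter that maps refresh-token failures to login guidance, an expired-OAuth result with no session id would now produce a user-visible message instead of an empty response.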
Cole Medin	818854474f	fix(workflows): stop warning about model/provider on loop nodes (#1090 ) * fix(workflows): stop warning about model/provider on loop nodes (#1082) The loader incorrectly classified loop nodes as "non-AI nodes" and warned that model/provider fields were ignored, even though the DAG executor has supported these fields on loop nodes since commit `594d5daa`. Changes: - Add LOOP_NODE_AI_FIELDS constant excluding model/provider from the warn list - Update loader to use LOOP_NODE_AI_FIELDS for loop node field checking - Fix BASH_NODE_AI_FIELDS comment that incorrectly referenced loop nodes - Add tests for loop node model/provider acceptance and unsupported field warnings Fixes #1082 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(workflows): update stale comment and add LOOP_NODE_AI_FIELDS unit tests - Update section comment from "bash/loop nodes" to "non-AI nodes" since loop nodes do support model/provider (the fix in this PR) - Export LOOP_NODE_AI_FIELDS from schemas/index.ts alongside BASH/SCRIPT variants - Add dedicated describe block in schemas.test.ts verifying that model and provider are excluded and all other BASH_NODE_AI_FIELDS are still present Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * simplify: merge nodeType and aiFields into a single if/else chain in parseDagNode Eliminates the separate isNonAiNode predicate and nested ternary for aiFields selection by combining both into one explicit if/else block — each branch sets nodeType and aiFields together, removing the need to re-check node type twice. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 09:19:18 -05:00
Cole Medin	64bdd30ef4	Merge pull request #1066 from coleam00/archon/task-fix-issue-1042 fix: replace Telegraf with grammY to fix Bun TypeError crash	2026-04-16 09:15:56 -05:00
Cole Medin	a5e5d5ceeb	fix: address review findings for grammY Telegram adapter - Fix misleading 'unde***' log when ctx.from is undefined; use 'unknown' to match the Slack/Discord adapter pattern - Log post-startup bot runtime errors before reject() (no-op after onStart fires but errors are now visible in logs) - Add debug log when message is dropped due to no handler registered - Add stop() unit test to guard against grammY API rename regressions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 09:15:47 -05:00
Cole Medin	da1f8b7d97	fix: replace Telegraf with grammY to fix Bun TypeError crash (#1042) Telegraf v4's internal `redactToken()` assigns to readonly `error.message` properties, which crashes under Bun's strict ESM mode. Telegraf is EOL. Changes: - Replace `telegraf` dependency with `grammy` ^1.36.0 - Migrate adapter from Telegraf API to grammY API (Bot, bot.api, bot.start) - Use grammY's `onStart` callback pattern for async polling launch - Preserve 409 retry logic and all existing behavior - Update test mocks from telegraf types to grammy types Fixes #1042 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 09:15:46 -05:00


1284 commits