feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
# =============================================================================
|
|
|
|
|
# Archon - Remote Agentic Coding Platform
|
|
|
|
|
# Multi-stage build: deps → web build → production image
|
|
|
|
|
# =============================================================================
|
|
|
|
|
|
|
|
|
|
# ---------------------------------------------------------------------------
|
|
|
|
|
# Stage 1: Install dependencies
|
|
|
|
|
# ---------------------------------------------------------------------------
|
fix(docker): fix Docker build failures and add CI guard (#1022)
* fix(docker): update Bun base image from 1.2 to 1.3
The lockfile was generated with Bun 1.3.x locally but the Docker image
used oven/bun:1.2-slim. Bun 1.3 changed the lockfile format, causing
--frozen-lockfile to fail during docker build.
* fix(docker): pin Bun to exact version 1.3.9 matching lockfile
Floating tag 1.3-slim resolved to 1.3.11 which has a different lockfile
format than 1.3.9 used to generate bun.lock. Pin to exact patch version
to prevent --frozen-lockfile failures.
* fix(docker): add missing docs-web workspace package.json
The docs-web package was added as a workspace member but its
package.json was never added to the Dockerfile COPY steps. This caused
bun install --frozen-lockfile to fail because the workspace layout
in Docker didn't match the lockfile.
* fix(docker): use hoisted linker for Vite/Rollup compatibility
Bun's default "isolated" linker stores packages in node_modules/.bun/
with symlinks that Vite's Rollup bundler cannot resolve during
production builds (e.g., remark-gfm → mdast-util-gfm chain).
Using --linker=hoisted gives the classic flat node_modules layout
that Rollup expects. Local dev is unaffected (Vite dev server handles
the isolated layout fine).
* ci: pin Bun version to 1.3.9 and add Docker build check
- Align CI Bun version (was 1.3.11) with Dockerfile and local dev
(1.3.9) to prevent lockfile format mismatches between environments
- Add docker-build job to test.yml that builds the Docker image on
every PR — catches Dockerfile regressions (missing workspace
packages, linker issues, build failures) before they reach deploy
* fix(ci): add permissions for GHA cache and tighten Bun engine
- Add actions: write permission to docker-build job so GHA layer cache
writes succeed on PRs from forks
- Tighten package.json engines.bun from >=1.0.0 to >=1.3.9 to document
the minimum version that matches the lockfile format
* fix(ci): add smoke test, align Bun version across all workflows
Review fixes:
- Add load: true + health endpoint smoke test to docker-build CI job
so we verify the image actually starts, not just compiles
- Align Bun 1.3.9 in deploy-docs.yml and release.yml (were still 1.3.11)
- Document why docs-web source is intentionally omitted from Docker
* chore: float Docker to bun:1.3 and align CI to 1.3.11
- Dockerfile: oven/bun:1.3-slim (auto-tracks latest 1.3.x patches)
- CI workflows: bun-version 1.3.11 (current latest, reproducible)
- engines.bun: >=1.3.9 (minimum for local devs)
Lockfile format is stable across 1.3.x patches, so this is safe.
* fix(docker,ci): pin Docker to 1.3.11, loosen engines, harden smoke test
- Dockerfile: pin oven/bun:1.3.11-slim (was floating 1.3-slim) so Docker
builds are reproducible and match CI exactly.
- package.json: loosen engines to ^1.3.0 so end users on any 1.3.x can
run the CLI; CI/Docker remain pinned to the canonical latest.
- CI smoke test: replace 'sleep 5' with curl --retry-connrefused, and
move container cleanup to an 'if: always()' step so a failed health
check no longer leaks the named container.
---------
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
2026-04-07 07:37:47 +00:00
|
|
|
FROM oven/bun:1.3.11-slim AS deps
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
|
|
|
|
|
WORKDIR /app
|
|
|
|
|
|
|
|
|
|
# Copy root package files and lockfile
|
|
|
|
|
COPY package.json bun.lock ./
|
|
|
|
|
|
|
|
|
|
# Copy ALL workspace package.json files (monorepo lockfile depends on all of them)
|
|
|
|
|
COPY packages/adapters/package.json ./packages/adapters/
|
|
|
|
|
COPY packages/cli/package.json ./packages/cli/
|
|
|
|
|
COPY packages/core/package.json ./packages/core/
|
fix(docker): fix Docker build failures and add CI guard (#1022)
* fix(docker): update Bun base image from 1.2 to 1.3
The lockfile was generated with Bun 1.3.x locally but the Docker image
used oven/bun:1.2-slim. Bun 1.3 changed the lockfile format, causing
--frozen-lockfile to fail during docker build.
* fix(docker): pin Bun to exact version 1.3.9 matching lockfile
Floating tag 1.3-slim resolved to 1.3.11 which has a different lockfile
format than 1.3.9 used to generate bun.lock. Pin to exact patch version
to prevent --frozen-lockfile failures.
* fix(docker): add missing docs-web workspace package.json
The docs-web package was added as a workspace member but its
package.json was never added to the Dockerfile COPY steps. This caused
bun install --frozen-lockfile to fail because the workspace layout
in Docker didn't match the lockfile.
* fix(docker): use hoisted linker for Vite/Rollup compatibility
Bun's default "isolated" linker stores packages in node_modules/.bun/
with symlinks that Vite's Rollup bundler cannot resolve during
production builds (e.g., remark-gfm → mdast-util-gfm chain).
Using --linker=hoisted gives the classic flat node_modules layout
that Rollup expects. Local dev is unaffected (Vite dev server handles
the isolated layout fine).
* ci: pin Bun version to 1.3.9 and add Docker build check
- Align CI Bun version (was 1.3.11) with Dockerfile and local dev
(1.3.9) to prevent lockfile format mismatches between environments
- Add docker-build job to test.yml that builds the Docker image on
every PR — catches Dockerfile regressions (missing workspace
packages, linker issues, build failures) before they reach deploy
* fix(ci): add permissions for GHA cache and tighten Bun engine
- Add actions: write permission to docker-build job so GHA layer cache
writes succeed on PRs from forks
- Tighten package.json engines.bun from >=1.0.0 to >=1.3.9 to document
the minimum version that matches the lockfile format
* fix(ci): add smoke test, align Bun version across all workflows
Review fixes:
- Add load: true + health endpoint smoke test to docker-build CI job
so we verify the image actually starts, not just compiles
- Align Bun 1.3.9 in deploy-docs.yml and release.yml (were still 1.3.11)
- Document why docs-web source is intentionally omitted from Docker
* chore: float Docker to bun:1.3 and align CI to 1.3.11
- Dockerfile: oven/bun:1.3-slim (auto-tracks latest 1.3.x patches)
- CI workflows: bun-version 1.3.11 (current latest, reproducible)
- engines.bun: >=1.3.9 (minimum for local devs)
Lockfile format is stable across 1.3.x patches, so this is safe.
* fix(docker,ci): pin Docker to 1.3.11, loosen engines, harden smoke test
- Dockerfile: pin oven/bun:1.3.11-slim (was floating 1.3-slim) so Docker
builds are reproducible and match CI exactly.
- package.json: loosen engines to ^1.3.0 so end users on any 1.3.x can
run the CLI; CI/Docker remain pinned to the canonical latest.
- CI smoke test: replace 'sleep 5' with curl --retry-connrefused, and
move container cleanup to an 'if: always()' step so a failed health
check no longer leaks the named container.
---------
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
2026-04-07 07:37:47 +00:00
|
|
|
# docs-web source is NOT copied — it's a static site deployed separately
|
|
|
|
|
# (see .github/workflows/deploy-docs.yml). package.json is included only
|
|
|
|
|
# so Bun's workspace lockfile resolves correctly.
|
|
|
|
|
COPY packages/docs-web/package.json ./packages/docs-web/
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
COPY packages/git/package.json ./packages/git/
|
|
|
|
|
COPY packages/isolation/package.json ./packages/isolation/
|
|
|
|
|
COPY packages/paths/package.json ./packages/paths/
|
refactor: extract providers from @archon/core into @archon/providers (#1137)
* refactor: extract providers from @archon/core into @archon/providers
Move Claude and Codex provider implementations, factory, and SDK
dependencies into a new @archon/providers package. This establishes a
clean boundary: providers own SDK translation, core owns business logic.
Key changes:
- New @archon/providers package with zero-dep contract layer (types.ts)
- @archon/workflows imports from @archon/providers/types — no mirror types
- dag-executor delegates option building to providers via nodeConfig
- IAgentProvider gains getCapabilities() for provider-agnostic warnings
- @archon/core no longer depends on SDK packages directly
- UnknownProviderError standardizes error shape across all surfaces
Zero user-facing changes — same providers, same config, same behavior.
* refactor: remove config type duplication and backward-compat re-exports
Address review findings:
- Move ClaudeProviderDefaults and CodexProviderDefaults to the
@archon/providers/types contract layer as the single source of truth.
@archon/core/config/config-types.ts now imports from there.
- Remove provider re-exports from @archon/core (index.ts and types/).
Consumers should import from @archon/providers directly.
- Update @archon/server to depend on @archon/providers for MessageChunk.
* refactor: move structured output validation into providers
Each provider now normalizes its own structured output semantics:
- Claude already yields structuredOutput from the SDK's native field
- Codex now parses inline agent_message text as JSON when outputFormat
is set, populating structuredOutput on the result chunk
This eliminates the last provider === 'codex' branch from dag-executor,
making it fully provider-agnostic. The dag-executor checks structuredOutput
uniformly regardless of provider.
Also removes the ClaudeCodexProviderDefaults deprecated alias — all
consumers now use ClaudeProviderDefaults directly.
* fix: address PR review — restore warnings, fix loop options, cleanup
Critical fixes:
- Restore MCP missing env vars user-facing warning (was silently dropped)
- Restore Haiku + MCP tool search warning
- Fix buildLoopNodeOptions to pass workflow-level nodeConfig (effort,
thinking, betas, sandbox were silently lost for loop nodes)
- Add TODO(#1135) comments documenting env-leak gate gap
Cleanup:
- Remove backward-compat type aliases from deps.ts (keep WorkflowTokenUsage)
- Remove 26 unnecessary eslint-disable comments from test files
- Trim internal helpers from providers barrel (withFirstMessageTimeout,
getProcessUid, loadMcpConfig, buildSDKHooksFromYAML)
- Add @archon/providers dep to CLI package.json
- Fix 8 stale documentation paths pointing to deleted core/src/providers/
- Add E2E smoke test workflows for both Claude and Codex providers
* fix: forward provider system warnings to users in dag-executor
The dag-executor only forwarded system chunks starting with
"MCP server connection failed:" — all other provider warnings
(missing env vars, Haiku+MCP, structured output issues) were
logged but never reached the user.
Now forwards all system chunks starting with ⚠️ (the prefix
providers use for user-actionable warnings).
* fix: add providers package to Dockerfile and fix CI module resolution
- Add packages/providers/ to all three Dockerfile stages (deps,
production package.json copy, production source copy)
- Replace wildcard export map (./*) with explicit subpath entries
to fix module resolution in CI (bun workspace linking)
* chore: update bun.lock for providers package exports
2026-04-13 06:21:36 +00:00
|
|
|
COPY packages/providers/package.json ./packages/providers/
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
COPY packages/server/package.json ./packages/server/
|
|
|
|
|
COPY packages/web/package.json ./packages/web/
|
|
|
|
|
COPY packages/workflows/package.json ./packages/workflows/
|
|
|
|
|
|
|
|
|
|
# Install ALL dependencies (including devDependencies needed for web build)
|
fix(docker): fix Docker build failures and add CI guard (#1022)
* fix(docker): update Bun base image from 1.2 to 1.3
The lockfile was generated with Bun 1.3.x locally but the Docker image
used oven/bun:1.2-slim. Bun 1.3 changed the lockfile format, causing
--frozen-lockfile to fail during docker build.
* fix(docker): pin Bun to exact version 1.3.9 matching lockfile
Floating tag 1.3-slim resolved to 1.3.11 which has a different lockfile
format than 1.3.9 used to generate bun.lock. Pin to exact patch version
to prevent --frozen-lockfile failures.
* fix(docker): add missing docs-web workspace package.json
The docs-web package was added as a workspace member but its
package.json was never added to the Dockerfile COPY steps. This caused
bun install --frozen-lockfile to fail because the workspace layout
in Docker didn't match the lockfile.
* fix(docker): use hoisted linker for Vite/Rollup compatibility
Bun's default "isolated" linker stores packages in node_modules/.bun/
with symlinks that Vite's Rollup bundler cannot resolve during
production builds (e.g., remark-gfm → mdast-util-gfm chain).
Using --linker=hoisted gives the classic flat node_modules layout
that Rollup expects. Local dev is unaffected (Vite dev server handles
the isolated layout fine).
* ci: pin Bun version to 1.3.9 and add Docker build check
- Align CI Bun version (was 1.3.11) with Dockerfile and local dev
(1.3.9) to prevent lockfile format mismatches between environments
- Add docker-build job to test.yml that builds the Docker image on
every PR — catches Dockerfile regressions (missing workspace
packages, linker issues, build failures) before they reach deploy
* fix(ci): add permissions for GHA cache and tighten Bun engine
- Add actions: write permission to docker-build job so GHA layer cache
writes succeed on PRs from forks
- Tighten package.json engines.bun from >=1.0.0 to >=1.3.9 to document
the minimum version that matches the lockfile format
* fix(ci): add smoke test, align Bun version across all workflows
Review fixes:
- Add load: true + health endpoint smoke test to docker-build CI job
so we verify the image actually starts, not just compiles
- Align Bun 1.3.9 in deploy-docs.yml and release.yml (were still 1.3.11)
- Document why docs-web source is intentionally omitted from Docker
* chore: float Docker to bun:1.3 and align CI to 1.3.11
- Dockerfile: oven/bun:1.3-slim (auto-tracks latest 1.3.x patches)
- CI workflows: bun-version 1.3.11 (current latest, reproducible)
- engines.bun: >=1.3.9 (minimum for local devs)
Lockfile format is stable across 1.3.x patches, so this is safe.
* fix(docker,ci): pin Docker to 1.3.11, loosen engines, harden smoke test
- Dockerfile: pin oven/bun:1.3.11-slim (was floating 1.3-slim) so Docker
builds are reproducible and match CI exactly.
- package.json: loosen engines to ^1.3.0 so end users on any 1.3.x can
run the CLI; CI/Docker remain pinned to the canonical latest.
- CI smoke test: replace 'sleep 5' with curl --retry-connrefused, and
move container cleanup to an 'if: always()' step so a failed health
check no longer leaks the named container.
---------
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
2026-04-07 07:37:47 +00:00
|
|
|
# --linker=hoisted: Bun's default "isolated" linker stores packages in
|
|
|
|
|
# node_modules/.bun/ with symlinks that Vite/Rollup cannot resolve during
|
|
|
|
|
# production builds. Hoisted layout gives classic flat node_modules.
|
|
|
|
|
RUN bun install --frozen-lockfile --linker=hoisted
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
|
|
|
|
|
# ---------------------------------------------------------------------------
|
|
|
|
|
# Stage 2: Build web UI (Vite + React)
|
|
|
|
|
# ---------------------------------------------------------------------------
|
|
|
|
|
FROM deps AS web-build
|
|
|
|
|
|
|
|
|
|
# Copy full source (needed for workspace resolution and web build)
|
|
|
|
|
COPY . .
|
|
|
|
|
|
|
|
|
|
# Build the web frontend — output goes to packages/web/dist/
|
|
|
|
|
RUN bun run build:web && \
|
|
|
|
|
test -f packages/web/dist/index.html || \
|
|
|
|
|
(echo "ERROR: Web build produced no index.html" >&2 && exit 1)
|
|
|
|
|
|
|
|
|
|
# ---------------------------------------------------------------------------
|
|
|
|
|
# Stage 3: Production image
|
|
|
|
|
# ---------------------------------------------------------------------------
|
fix(docker): fix Docker build failures and add CI guard (#1022)
* fix(docker): update Bun base image from 1.2 to 1.3
The lockfile was generated with Bun 1.3.x locally but the Docker image
used oven/bun:1.2-slim. Bun 1.3 changed the lockfile format, causing
--frozen-lockfile to fail during docker build.
* fix(docker): pin Bun to exact version 1.3.9 matching lockfile
Floating tag 1.3-slim resolved to 1.3.11 which has a different lockfile
format than 1.3.9 used to generate bun.lock. Pin to exact patch version
to prevent --frozen-lockfile failures.
* fix(docker): add missing docs-web workspace package.json
The docs-web package was added as a workspace member but its
package.json was never added to the Dockerfile COPY steps. This caused
bun install --frozen-lockfile to fail because the workspace layout
in Docker didn't match the lockfile.
* fix(docker): use hoisted linker for Vite/Rollup compatibility
Bun's default "isolated" linker stores packages in node_modules/.bun/
with symlinks that Vite's Rollup bundler cannot resolve during
production builds (e.g., remark-gfm → mdast-util-gfm chain).
Using --linker=hoisted gives the classic flat node_modules layout
that Rollup expects. Local dev is unaffected (Vite dev server handles
the isolated layout fine).
* ci: pin Bun version to 1.3.9 and add Docker build check
- Align CI Bun version (was 1.3.11) with Dockerfile and local dev
(1.3.9) to prevent lockfile format mismatches between environments
- Add docker-build job to test.yml that builds the Docker image on
every PR — catches Dockerfile regressions (missing workspace
packages, linker issues, build failures) before they reach deploy
* fix(ci): add permissions for GHA cache and tighten Bun engine
- Add actions: write permission to docker-build job so GHA layer cache
writes succeed on PRs from forks
- Tighten package.json engines.bun from >=1.0.0 to >=1.3.9 to document
the minimum version that matches the lockfile format
* fix(ci): add smoke test, align Bun version across all workflows
Review fixes:
- Add load: true + health endpoint smoke test to docker-build CI job
so we verify the image actually starts, not just compiles
- Align Bun 1.3.9 in deploy-docs.yml and release.yml (were still 1.3.11)
- Document why docs-web source is intentionally omitted from Docker
* chore: float Docker to bun:1.3 and align CI to 1.3.11
- Dockerfile: oven/bun:1.3-slim (auto-tracks latest 1.3.x patches)
- CI workflows: bun-version 1.3.11 (current latest, reproducible)
- engines.bun: >=1.3.9 (minimum for local devs)
Lockfile format is stable across 1.3.x patches, so this is safe.
* fix(docker,ci): pin Docker to 1.3.11, loosen engines, harden smoke test
- Dockerfile: pin oven/bun:1.3.11-slim (was floating 1.3-slim) so Docker
builds are reproducible and match CI exactly.
- package.json: loosen engines to ^1.3.0 so end users on any 1.3.x can
run the CLI; CI/Docker remain pinned to the canonical latest.
- CI smoke test: replace 'sleep 5' with curl --retry-connrefused, and
move container cleanup to an 'if: always()' step so a failed health
check no longer leaks the named container.
---------
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
2026-04-07 07:37:47 +00:00
|
|
|
FROM oven/bun:1.3.11-slim AS production
|
feat: implement telegram + claude mvp with generic architecture
- Add generic IPlatformAdapter and IAssistantClient interfaces for extensibility
- Implement TelegramAdapter with streaming/batch modes
- Implement ClaudeClient with session persistence and resume capability
- Create TestAdapter for autonomous validation via HTTP endpoints
- Add PostgreSQL database with 3-table schema (conversations, codebases, sessions)
- Implement slash command system (/clone, /status, /getcwd, /setcwd, /reset, /help)
- Add Docker containerization with docker-compose (with-db profile for local PostgreSQL)
- Fix Claude Agent SDK spawn error (install bash, pass PATH environment variable)
- Fix workspace volume mount to use /workspace in container
- Add comprehensive documentation and health check endpoints
Architecture highlights:
- Platform-agnostic design allows adding Slack, GitHub, etc. via IPlatformAdapter
- AI-agnostic design allows adding Codex, etc. via IAssistantClient
- Orchestrator uses dependency injection with interface types
- Session persistence survives container restarts
- Working directory + codebase context determine Claude behavior
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 01:35:50 +00:00
|
|
|
|
2025-12-17 19:45:41 +00:00
|
|
|
# OCI Labels for GHCR
|
2026-04-04 15:47:22 +00:00
|
|
|
LABEL org.opencontainers.image.source="https://github.com/coleam00/Archon"
|
2025-12-17 19:45:41 +00:00
|
|
|
LABEL org.opencontainers.image.description="Control AI coding assistants remotely from Telegram, Slack, Discord, and GitHub"
|
|
|
|
|
LABEL org.opencontainers.image.licenses="MIT"
|
|
|
|
|
|
feat: implement telegram + claude mvp with generic architecture
- Add generic IPlatformAdapter and IAssistantClient interfaces for extensibility
- Implement TelegramAdapter with streaming/batch modes
- Implement ClaudeClient with session persistence and resume capability
- Create TestAdapter for autonomous validation via HTTP endpoints
- Add PostgreSQL database with 3-table schema (conversations, codebases, sessions)
- Implement slash command system (/clone, /status, /getcwd, /setcwd, /reset, /help)
- Add Docker containerization with docker-compose (with-db profile for local PostgreSQL)
- Fix Claude Agent SDK spawn error (install bash, pass PATH environment variable)
- Fix workspace volume mount to use /workspace in container
- Add comprehensive documentation and health check endpoints
Architecture highlights:
- Platform-agnostic design allows adding Slack, GitHub, etc. via IPlatformAdapter
- AI-agnostic design allows adding Codex, etc. via IAssistantClient
- Orchestrator uses dependency injection with interface types
- Session persistence survives container restarts
- Working directory + codebase context determine Claude behavior
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 01:35:50 +00:00
|
|
|
# Prevent interactive prompts during installation
|
|
|
|
|
ENV DEBIAN_FRONTEND=noninteractive
|
|
|
|
|
|
|
|
|
|
WORKDIR /app
|
|
|
|
|
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
# Install system dependencies + gosu for privilege dropping in entrypoint
|
feat: implement telegram + claude mvp with generic architecture
- Add generic IPlatformAdapter and IAssistantClient interfaces for extensibility
- Implement TelegramAdapter with streaming/batch modes
- Implement ClaudeClient with session persistence and resume capability
- Create TestAdapter for autonomous validation via HTTP endpoints
- Add PostgreSQL database with 3-table schema (conversations, codebases, sessions)
- Implement slash command system (/clone, /status, /getcwd, /setcwd, /reset, /help)
- Add Docker containerization with docker-compose (with-db profile for local PostgreSQL)
- Fix Claude Agent SDK spawn error (install bash, pass PATH environment variable)
- Fix workspace volume mount to use /workspace in container
- Add comprehensive documentation and health check endpoints
Architecture highlights:
- Platform-agnostic design allows adding Slack, GitHub, etc. via IPlatformAdapter
- AI-agnostic design allows adding Codex, etc. via IAssistantClient
- Orchestrator uses dependency injection with interface types
- Session persistence survives container restarts
- Working directory + codebase context determine Claude behavior
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 01:35:50 +00:00
|
|
|
RUN apt-get update && apt-get install -y \
|
|
|
|
|
curl \
|
|
|
|
|
git \
|
|
|
|
|
bash \
|
|
|
|
|
ca-certificates \
|
|
|
|
|
gnupg \
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
gosu \
|
feat: implement telegram + claude mvp with generic architecture
- Add generic IPlatformAdapter and IAssistantClient interfaces for extensibility
- Implement TelegramAdapter with streaming/batch modes
- Implement ClaudeClient with session persistence and resume capability
- Create TestAdapter for autonomous validation via HTTP endpoints
- Add PostgreSQL database with 3-table schema (conversations, codebases, sessions)
- Implement slash command system (/clone, /status, /getcwd, /setcwd, /reset, /help)
- Add Docker containerization with docker-compose (with-db profile for local PostgreSQL)
- Fix Claude Agent SDK spawn error (install bash, pass PATH environment variable)
- Fix workspace volume mount to use /workspace in container
- Add comprehensive documentation and health check endpoints
Architecture highlights:
- Platform-agnostic design allows adding Slack, GitHub, etc. via IPlatformAdapter
- AI-agnostic design allows adding Codex, etc. via IAssistantClient
- Orchestrator uses dependency injection with interface types
- Session persistence survives container restarts
- Working directory + codebase context determine Claude behavior
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 01:35:50 +00:00
|
|
|
postgresql-client \
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
# Chromium for agent-browser E2E testing (drives browser via CDP)
|
|
|
|
|
chromium \
|
feat: implement telegram + claude mvp with generic architecture
- Add generic IPlatformAdapter and IAssistantClient interfaces for extensibility
- Implement TelegramAdapter with streaming/batch modes
- Implement ClaudeClient with session persistence and resume capability
- Create TestAdapter for autonomous validation via HTTP endpoints
- Add PostgreSQL database with 3-table schema (conversations, codebases, sessions)
- Implement slash command system (/clone, /status, /getcwd, /setcwd, /reset, /help)
- Add Docker containerization with docker-compose (with-db profile for local PostgreSQL)
- Fix Claude Agent SDK spawn error (install bash, pass PATH environment variable)
- Fix workspace volume mount to use /workspace in container
- Add comprehensive documentation and health check endpoints
Architecture highlights:
- Platform-agnostic design allows adding Slack, GitHub, etc. via IPlatformAdapter
- AI-agnostic design allows adding Codex, etc. via IAssistantClient
- Orchestrator uses dependency injection with interface types
- Session persistence survives container restarts
- Working directory + codebase context determine Claude behavior
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 01:35:50 +00:00
|
|
|
&& rm -rf /var/lib/apt/lists/*
|
|
|
|
|
|
|
|
|
|
# Install GitHub CLI
|
|
|
|
|
RUN curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg \
|
|
|
|
|
&& chmod go+r /usr/share/keyrings/githubcli-archive-keyring.gpg \
|
|
|
|
|
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | tee /etc/apt/sources.list.d/github-cli.list > /dev/null \
|
|
|
|
|
&& apt-get update \
|
|
|
|
|
&& apt-get install -y gh \
|
|
|
|
|
&& rm -rf /var/lib/apt/lists/*
|
|
|
|
|
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
# Install agent-browser CLI (Vercel Labs) for E2E testing workflows
|
|
|
|
|
# - Uses npm (not bun) because postinstall script downloads the native Rust binary
|
|
|
|
|
# - After install, symlink the Rust binary directly and purge nodejs/npm (~60MB saved)
|
|
|
|
|
# - The npm entry point is a Node.js wrapper; the native binary works standalone
|
|
|
|
|
# - agent-browser auto-detects Docker (via /.dockerenv) and adds --no-sandbox to Chromium
|
|
|
|
|
RUN apt-get update && apt-get install -y --no-install-recommends nodejs npm \
|
|
|
|
|
&& npm install -g agent-browser@0.22.1 \
|
|
|
|
|
&& NATIVE_BIN=$(find /usr/local/lib/node_modules/agent-browser -name 'agent-browser-*' -type f -executable 2>/dev/null | head -1) \
|
|
|
|
|
&& if [ -n "$NATIVE_BIN" ]; then \
|
|
|
|
|
cp "$NATIVE_BIN" /usr/local/bin/agent-browser-native \
|
|
|
|
|
&& chmod +x /usr/local/bin/agent-browser-native \
|
|
|
|
|
&& ln -sf /usr/local/bin/agent-browser-native /usr/local/bin/agent-browser; \
|
|
|
|
|
else \
|
|
|
|
|
echo "ERROR: agent-browser native binary not found after npm install" >&2 && exit 1; \
|
|
|
|
|
fi \
|
|
|
|
|
&& npm cache clean --force \
|
|
|
|
|
&& rm -rf /usr/local/lib/node_modules/agent-browser \
|
|
|
|
|
&& apt-get purge -y nodejs npm \
|
|
|
|
|
&& apt-get autoremove -y \
|
|
|
|
|
&& rm -rf /var/lib/apt/lists/*
|
|
|
|
|
|
|
|
|
|
# Point agent-browser to system Chromium (avoids ~400MB Chrome for Testing download)
|
|
|
|
|
ENV AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
|
|
|
|
|
|
fix(providers): replace Claude SDK embed with explicit binary-path resolver (#1217)
* feat(providers): replace Claude SDK embed with explicit binary-path resolver
Drop `@anthropic-ai/claude-agent-sdk/embed` and resolve Claude Code via
CLAUDE_BIN_PATH env → assistants.claude.claudeBinaryPath config → throw
with install instructions. The embed's silent failure modes on macOS
(#1210) and Windows (#1087) become actionable errors with a documented
recovery path.
Dev mode (bun run) remains auto-resolved via node_modules. The setup
wizard auto-detects Claude Code by probing the native installer path
(~/.local/bin/claude), npm global cli.js, and PATH, then writes
CLAUDE_BIN_PATH to ~/.archon/.env. Dockerfile pre-sets CLAUDE_BIN_PATH
so extenders using the compiled binary keep working. Release workflow
gets negative and positive resolver smoke tests.
Docs, CHANGELOG, README, .env.example, CLAUDE.md, test-release and
archon skills all updated to reflect the curl-first install story.
Retires #1210, #1087, #1091 (never merged, now obsolete).
Implements #1176.
* fix(providers): only pass --no-env-file when spawning Claude via Bun/Node
`--no-env-file` is a Bun flag that prevents Bun from auto-loading
`.env` from the subprocess cwd. It is only meaningful when the Claude
Code executable is a `cli.js` file — in which case the SDK spawns it
via `bun`/`node` and the flag reaches the runtime.
When `CLAUDE_BIN_PATH` points at a native compiled Claude binary (e.g.
`~/.local/bin/claude` from the curl installer, which is Anthropic's
recommended default), the SDK executes the binary directly. Passing
`--no-env-file` then goes straight to the native binary, which
rejects it with `error: unknown option '--no-env-file'` and the
subprocess exits code 1.
Emit `executableArgs` only when the target is a `.js` file (dev mode
or explicit cli.js path). Caught by end-to-end smoke testing against
the curl-installed native Claude binary.
* docs: record env-leak validation result in provider comment
Verified end-to-end with sentinel `.env` and `.env.local` files in a
workflow CWD that the native Claude binary (curl installer) does not
auto-load `.env` files. With Archon's full spawn pathway and parent
env stripped, the subprocess saw both sentinels as UNSET. The
first-layer protection in `@archon/paths` (#1067) handles the
inheritance leak; `--no-env-file` only matters for the Bun-spawned
cli.js path, where it is still emitted.
* chore(providers): cleanup pass — exports, docs, troubleshooting
Final-sweep cleanup tied to the binary-resolver PR:
- Mirror Codex's package surface for the new Claude resolver: add
`./claude/binary-resolver` subpath export and re-export
`resolveClaudeBinaryPath` + `claudeFileExists` from the package
index. Renames the previously single `fileExists` re-export to
`codexFileExists` for symmetry; nothing outside the providers
package was importing it.
- Add a "Claude Code not found" entry to the troubleshooting reference
doc with platform-specific install snippets and pointers to the
AI Assistants binary-path section.
- Reframe the example claudeBinaryPath in reference/configuration.md
away from cli.js-only language; it accepts either the native binary
or cli.js.
* test+refactor(providers, cli): address PR review feedback
Two test gaps and one doc nit from the PR review (#1217):
- Extract the `--no-env-file` decision into a pure exported helper
`shouldPassNoEnvFile(cliPath)` so the native-binary branch is unit
testable without mocking `BUNDLED_IS_BINARY` or running the full
sendQuery pathway. Six new tests cover undefined, cli.js, native
binary (Linux + Windows), Homebrew symlink, and suffix-only matching.
Also adds a `claude.subprocess_env_file_flag` debug log so the
security-adjacent decision is auditable.
- Extract the three install-location probes in setup.ts into exported
wrappers (`probeFileExists`, `probeNpmRoot`, `probeWhichClaude`) and
export `detectClaudeExecutablePath` itself, so the probe order can be
spied on. Six new tests cover each tier winning, fall-through
ordering, npm-tier skip when not installed, and the
which-resolved-but-stale-path edge case.
- CLAUDE.md `claudeBinaryPath` placeholder updated to reflect that the
field accepts either the native binary or cli.js (the example value
was previously `/absolute/path/to/cli.js`, slightly misleading now
that the curl-installer native binary is the default).
Skipped from the review by deliberate scope decision:
- `resolveClaudeBinaryPath` async-with-no-await: matches Codex's
resolver signature exactly. Changing only Claude breaks symmetry;
if pursued, do both providers in a separate cleanup PR.
- `isAbsolute()` validation in parseClaudeConfig: Codex doesn't do it
either. Resolver throws on non-existence already.
- Atomic `.env` writes in setup wizard: pre-existing pattern this PR
touched only adjacently. File as separate issue if needed.
- classifyError branch in dag-executor for setup errors: scope creep.
- `.env.example` "missing #" claim: false positive (verified all
CLAUDE_BIN_PATH lines have proper comment prefixes).
* fix(test): use path.join in Windows-compatible probe-order test
The "tier 2 wins (npm cli.js)" test hardcoded forward-slash path
comparisons, but `path.join` produces backslashes on Windows. Caused
the Windows CI leg of the test suite to fail while macOS and Linux
passed. Use `path.join` for both the mock return value and the
expectation so the separator matches whatever the platform produces.
2026-04-14 14:56:37 +00:00
|
|
|
# Pre-configure the Claude Code SDK cli.js path for any consumer that runs
|
|
|
|
|
# a compiled Archon binary inside (or extending) this image. In source mode
|
|
|
|
|
# (the default `bun run start` ENTRYPOINT), BUNDLED_IS_BINARY is false and
|
|
|
|
|
# this variable is ignored — the SDK resolves cli.js via node_modules. Kept
|
|
|
|
|
# here so extenders don't need to rediscover the path.
|
|
|
|
|
# Path matches the hoisted layout produced by `bun install --linker=hoisted`.
|
|
|
|
|
ENV CLAUDE_BIN_PATH=/app/node_modules/@anthropic-ai/claude-agent-sdk/cli.js
|
|
|
|
|
|
2025-11-11 02:57:13 +00:00
|
|
|
# Create non-root user for running Claude Code
|
|
|
|
|
# Claude Code refuses to run with --dangerously-skip-permissions as root for security
|
|
|
|
|
RUN useradd -m -u 1001 -s /bin/bash appuser \
|
2025-12-17 19:45:41 +00:00
|
|
|
&& chown -R appuser:appuser /app
|
|
|
|
|
|
|
|
|
|
# Create Archon directories
|
|
|
|
|
RUN mkdir -p /.archon/workspaces /.archon/worktrees \
|
|
|
|
|
&& chown -R appuser:appuser /.archon
|
2025-11-11 02:57:13 +00:00
|
|
|
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
# Copy root package files and lockfile
|
2025-12-16 13:34:58 +00:00
|
|
|
COPY package.json bun.lock ./
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
|
|
|
|
|
# Copy ALL workspace package.json files
|
|
|
|
|
COPY packages/adapters/package.json ./packages/adapters/
|
2026-01-21 22:07:18 +00:00
|
|
|
COPY packages/cli/package.json ./packages/cli/
|
|
|
|
|
COPY packages/core/package.json ./packages/core/
|
fix(docker): fix Docker build failures and add CI guard (#1022)
* fix(docker): update Bun base image from 1.2 to 1.3
The lockfile was generated with Bun 1.3.x locally but the Docker image
used oven/bun:1.2-slim. Bun 1.3 changed the lockfile format, causing
--frozen-lockfile to fail during docker build.
* fix(docker): pin Bun to exact version 1.3.9 matching lockfile
Floating tag 1.3-slim resolved to 1.3.11 which has a different lockfile
format than 1.3.9 used to generate bun.lock. Pin to exact patch version
to prevent --frozen-lockfile failures.
* fix(docker): add missing docs-web workspace package.json
The docs-web package was added as a workspace member but its
package.json was never added to the Dockerfile COPY steps. This caused
bun install --frozen-lockfile to fail because the workspace layout
in Docker didn't match the lockfile.
* fix(docker): use hoisted linker for Vite/Rollup compatibility
Bun's default "isolated" linker stores packages in node_modules/.bun/
with symlinks that Vite's Rollup bundler cannot resolve during
production builds (e.g., remark-gfm → mdast-util-gfm chain).
Using --linker=hoisted gives the classic flat node_modules layout
that Rollup expects. Local dev is unaffected (Vite dev server handles
the isolated layout fine).
* ci: pin Bun version to 1.3.9 and add Docker build check
- Align CI Bun version (was 1.3.11) with Dockerfile and local dev
(1.3.9) to prevent lockfile format mismatches between environments
- Add docker-build job to test.yml that builds the Docker image on
every PR — catches Dockerfile regressions (missing workspace
packages, linker issues, build failures) before they reach deploy
* fix(ci): add permissions for GHA cache and tighten Bun engine
- Add actions: write permission to docker-build job so GHA layer cache
writes succeed on PRs from forks
- Tighten package.json engines.bun from >=1.0.0 to >=1.3.9 to document
the minimum version that matches the lockfile format
* fix(ci): add smoke test, align Bun version across all workflows
Review fixes:
- Add load: true + health endpoint smoke test to docker-build CI job
so we verify the image actually starts, not just compiles
- Align Bun 1.3.9 in deploy-docs.yml and release.yml (were still 1.3.11)
- Document why docs-web source is intentionally omitted from Docker
* chore: float Docker to bun:1.3 and align CI to 1.3.11
- Dockerfile: oven/bun:1.3-slim (auto-tracks latest 1.3.x patches)
- CI workflows: bun-version 1.3.11 (current latest, reproducible)
- engines.bun: >=1.3.9 (minimum for local devs)
Lockfile format is stable across 1.3.x patches, so this is safe.
* fix(docker,ci): pin Docker to 1.3.11, loosen engines, harden smoke test
- Dockerfile: pin oven/bun:1.3.11-slim (was floating 1.3-slim) so Docker
builds are reproducible and match CI exactly.
- package.json: loosen engines to ^1.3.0 so end users on any 1.3.x can
run the CLI; CI/Docker remain pinned to the canonical latest.
- CI smoke test: replace 'sleep 5' with curl --retry-connrefused, and
move container cleanup to an 'if: always()' step so a failed health
check no longer leaks the named container.
---------
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
2026-04-07 07:37:47 +00:00
|
|
|
# docs-web source is NOT copied — it's a static site deployed separately
|
|
|
|
|
# (see .github/workflows/deploy-docs.yml). package.json is included only
|
|
|
|
|
# so Bun's workspace lockfile resolves correctly.
|
|
|
|
|
COPY packages/docs-web/package.json ./packages/docs-web/
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
COPY packages/git/package.json ./packages/git/
|
|
|
|
|
COPY packages/isolation/package.json ./packages/isolation/
|
|
|
|
|
COPY packages/paths/package.json ./packages/paths/
|
refactor: extract providers from @archon/core into @archon/providers (#1137)
* refactor: extract providers from @archon/core into @archon/providers
Move Claude and Codex provider implementations, factory, and SDK
dependencies into a new @archon/providers package. This establishes a
clean boundary: providers own SDK translation, core owns business logic.
Key changes:
- New @archon/providers package with zero-dep contract layer (types.ts)
- @archon/workflows imports from @archon/providers/types — no mirror types
- dag-executor delegates option building to providers via nodeConfig
- IAgentProvider gains getCapabilities() for provider-agnostic warnings
- @archon/core no longer depends on SDK packages directly
- UnknownProviderError standardizes error shape across all surfaces
Zero user-facing changes — same providers, same config, same behavior.
* refactor: remove config type duplication and backward-compat re-exports
Address review findings:
- Move ClaudeProviderDefaults and CodexProviderDefaults to the
@archon/providers/types contract layer as the single source of truth.
@archon/core/config/config-types.ts now imports from there.
- Remove provider re-exports from @archon/core (index.ts and types/).
Consumers should import from @archon/providers directly.
- Update @archon/server to depend on @archon/providers for MessageChunk.
* refactor: move structured output validation into providers
Each provider now normalizes its own structured output semantics:
- Claude already yields structuredOutput from the SDK's native field
- Codex now parses inline agent_message text as JSON when outputFormat
is set, populating structuredOutput on the result chunk
This eliminates the last provider === 'codex' branch from dag-executor,
making it fully provider-agnostic. The dag-executor checks structuredOutput
uniformly regardless of provider.
Also removes the ClaudeCodexProviderDefaults deprecated alias — all
consumers now use ClaudeProviderDefaults directly.
* fix: address PR review — restore warnings, fix loop options, cleanup
Critical fixes:
- Restore MCP missing env vars user-facing warning (was silently dropped)
- Restore Haiku + MCP tool search warning
- Fix buildLoopNodeOptions to pass workflow-level nodeConfig (effort,
thinking, betas, sandbox were silently lost for loop nodes)
- Add TODO(#1135) comments documenting env-leak gate gap
Cleanup:
- Remove backward-compat type aliases from deps.ts (keep WorkflowTokenUsage)
- Remove 26 unnecessary eslint-disable comments from test files
- Trim internal helpers from providers barrel (withFirstMessageTimeout,
getProcessUid, loadMcpConfig, buildSDKHooksFromYAML)
- Add @archon/providers dep to CLI package.json
- Fix 8 stale documentation paths pointing to deleted core/src/providers/
- Add E2E smoke test workflows for both Claude and Codex providers
* fix: forward provider system warnings to users in dag-executor
The dag-executor only forwarded system chunks starting with
"MCP server connection failed:" — all other provider warnings
(missing env vars, Haiku+MCP, structured output issues) were
logged but never reached the user.
Now forwards all system chunks starting with ⚠️ (the prefix
providers use for user-actionable warnings).
* fix: add providers package to Dockerfile and fix CI module resolution
- Add packages/providers/ to all three Dockerfile stages (deps,
production package.json copy, production source copy)
- Replace wildcard export map (./*) with explicit subpath entries
to fix module resolution in CI (bun workspace linking)
* chore: update bun.lock for providers package exports
2026-04-13 06:21:36 +00:00
|
|
|
COPY packages/providers/package.json ./packages/providers/
|
2026-01-21 22:07:18 +00:00
|
|
|
COPY packages/server/package.json ./packages/server/
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
COPY packages/web/package.json ./packages/web/
|
|
|
|
|
COPY packages/workflows/package.json ./packages/workflows/
|
|
|
|
|
|
|
|
|
|
# Install production dependencies only (--ignore-scripts skips husky prepare hook)
|
fix(docker): fix Docker build failures and add CI guard (#1022)
* fix(docker): update Bun base image from 1.2 to 1.3
The lockfile was generated with Bun 1.3.x locally but the Docker image
used oven/bun:1.2-slim. Bun 1.3 changed the lockfile format, causing
--frozen-lockfile to fail during docker build.
* fix(docker): pin Bun to exact version 1.3.9 matching lockfile
Floating tag 1.3-slim resolved to 1.3.11 which has a different lockfile
format than 1.3.9 used to generate bun.lock. Pin to exact patch version
to prevent --frozen-lockfile failures.
* fix(docker): add missing docs-web workspace package.json
The docs-web package was added as a workspace member but its
package.json was never added to the Dockerfile COPY steps. This caused
bun install --frozen-lockfile to fail because the workspace layout
in Docker didn't match the lockfile.
* fix(docker): use hoisted linker for Vite/Rollup compatibility
Bun's default "isolated" linker stores packages in node_modules/.bun/
with symlinks that Vite's Rollup bundler cannot resolve during
production builds (e.g., remark-gfm → mdast-util-gfm chain).
Using --linker=hoisted gives the classic flat node_modules layout
that Rollup expects. Local dev is unaffected (Vite dev server handles
the isolated layout fine).
* ci: pin Bun version to 1.3.9 and add Docker build check
- Align CI Bun version (was 1.3.11) with Dockerfile and local dev
(1.3.9) to prevent lockfile format mismatches between environments
- Add docker-build job to test.yml that builds the Docker image on
every PR — catches Dockerfile regressions (missing workspace
packages, linker issues, build failures) before they reach deploy
* fix(ci): add permissions for GHA cache and tighten Bun engine
- Add actions: write permission to docker-build job so GHA layer cache
writes succeed on PRs from forks
- Tighten package.json engines.bun from >=1.0.0 to >=1.3.9 to document
the minimum version that matches the lockfile format
* fix(ci): add smoke test, align Bun version across all workflows
Review fixes:
- Add load: true + health endpoint smoke test to docker-build CI job
so we verify the image actually starts, not just compiles
- Align Bun 1.3.9 in deploy-docs.yml and release.yml (were still 1.3.11)
- Document why docs-web source is intentionally omitted from Docker
* chore: float Docker to bun:1.3 and align CI to 1.3.11
- Dockerfile: oven/bun:1.3-slim (auto-tracks latest 1.3.x patches)
- CI workflows: bun-version 1.3.11 (current latest, reproducible)
- engines.bun: >=1.3.9 (minimum for local devs)
Lockfile format is stable across 1.3.x patches, so this is safe.
* fix(docker,ci): pin Docker to 1.3.11, loosen engines, harden smoke test
- Dockerfile: pin oven/bun:1.3.11-slim (was floating 1.3-slim) so Docker
builds are reproducible and match CI exactly.
- package.json: loosen engines to ^1.3.0 so end users on any 1.3.x can
run the CLI; CI/Docker remain pinned to the canonical latest.
- CI smoke test: replace 'sleep 5' with curl --retry-connrefused, and
move container cleanup to an 'if: always()' step so a failed health
check no longer leaks the named container.
---------
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
2026-04-07 07:37:47 +00:00
|
|
|
RUN bun install --frozen-lockfile --production --ignore-scripts --linker=hoisted
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
|
|
|
|
|
# Copy application source (Bun runs TypeScript directly, no compile step needed)
|
|
|
|
|
COPY packages/adapters/ ./packages/adapters/
|
|
|
|
|
COPY packages/cli/ ./packages/cli/
|
|
|
|
|
COPY packages/core/ ./packages/core/
|
|
|
|
|
COPY packages/git/ ./packages/git/
|
|
|
|
|
COPY packages/isolation/ ./packages/isolation/
|
|
|
|
|
COPY packages/paths/ ./packages/paths/
|
refactor: extract providers from @archon/core into @archon/providers (#1137)
* refactor: extract providers from @archon/core into @archon/providers
Move Claude and Codex provider implementations, factory, and SDK
dependencies into a new @archon/providers package. This establishes a
clean boundary: providers own SDK translation, core owns business logic.
Key changes:
- New @archon/providers package with zero-dep contract layer (types.ts)
- @archon/workflows imports from @archon/providers/types — no mirror types
- dag-executor delegates option building to providers via nodeConfig
- IAgentProvider gains getCapabilities() for provider-agnostic warnings
- @archon/core no longer depends on SDK packages directly
- UnknownProviderError standardizes error shape across all surfaces
Zero user-facing changes — same providers, same config, same behavior.
* refactor: remove config type duplication and backward-compat re-exports
Address review findings:
- Move ClaudeProviderDefaults and CodexProviderDefaults to the
@archon/providers/types contract layer as the single source of truth.
@archon/core/config/config-types.ts now imports from there.
- Remove provider re-exports from @archon/core (index.ts and types/).
Consumers should import from @archon/providers directly.
- Update @archon/server to depend on @archon/providers for MessageChunk.
* refactor: move structured output validation into providers
Each provider now normalizes its own structured output semantics:
- Claude already yields structuredOutput from the SDK's native field
- Codex now parses inline agent_message text as JSON when outputFormat
is set, populating structuredOutput on the result chunk
This eliminates the last provider === 'codex' branch from dag-executor,
making it fully provider-agnostic. The dag-executor checks structuredOutput
uniformly regardless of provider.
Also removes the ClaudeCodexProviderDefaults deprecated alias — all
consumers now use ClaudeProviderDefaults directly.
* fix: address PR review — restore warnings, fix loop options, cleanup
Critical fixes:
- Restore MCP missing env vars user-facing warning (was silently dropped)
- Restore Haiku + MCP tool search warning
- Fix buildLoopNodeOptions to pass workflow-level nodeConfig (effort,
thinking, betas, sandbox were silently lost for loop nodes)
- Add TODO(#1135) comments documenting env-leak gate gap
Cleanup:
- Remove backward-compat type aliases from deps.ts (keep WorkflowTokenUsage)
- Remove 26 unnecessary eslint-disable comments from test files
- Trim internal helpers from providers barrel (withFirstMessageTimeout,
getProcessUid, loadMcpConfig, buildSDKHooksFromYAML)
- Add @archon/providers dep to CLI package.json
- Fix 8 stale documentation paths pointing to deleted core/src/providers/
- Add E2E smoke test workflows for both Claude and Codex providers
* fix: forward provider system warnings to users in dag-executor
The dag-executor only forwarded system chunks starting with
"MCP server connection failed:" — all other provider warnings
(missing env vars, Haiku+MCP, structured output issues) were
logged but never reached the user.
Now forwards all system chunks starting with ⚠️ (the prefix
providers use for user-actionable warnings).
* fix: add providers package to Dockerfile and fix CI module resolution
- Add packages/providers/ to all three Dockerfile stages (deps,
production package.json copy, production source copy)
- Replace wildcard export map (./*) with explicit subpath entries
to fix module resolution in CI (bun workspace linking)
* chore: update bun.lock for providers package exports
2026-04-13 06:21:36 +00:00
|
|
|
COPY packages/providers/ ./packages/providers/
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
COPY packages/server/ ./packages/server/
|
|
|
|
|
COPY packages/workflows/ ./packages/workflows/
|
|
|
|
|
|
|
|
|
|
# Copy pre-built web UI from build stage
|
|
|
|
|
COPY --from=web-build /app/packages/web/dist/ ./packages/web/dist/
|
|
|
|
|
|
|
|
|
|
# Copy config, migrations, and bundled defaults
|
|
|
|
|
COPY .archon/ ./.archon/
|
|
|
|
|
COPY migrations/ ./migrations/
|
|
|
|
|
COPY tsconfig*.json ./
|
2025-11-13 23:29:17 +00:00
|
|
|
|
2025-11-11 02:57:13 +00:00
|
|
|
# Fix permissions for appuser
|
|
|
|
|
RUN chown -R appuser:appuser /app
|
|
|
|
|
|
2025-11-12 05:06:29 +00:00
|
|
|
# Create .codex directory for Codex authentication
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
RUN mkdir -p /home/appuser/.codex && chown appuser:appuser /home/appuser/.codex
|
|
|
|
|
|
|
|
|
|
# Configure git to trust Archon directories (as appuser)
|
|
|
|
|
RUN gosu appuser git config --global --add safe.directory '/.archon/workspaces' && \
|
|
|
|
|
gosu appuser git config --global --add safe.directory '/.archon/workspaces/*' && \
|
|
|
|
|
gosu appuser git config --global --add safe.directory '/.archon/worktrees' && \
|
|
|
|
|
gosu appuser git config --global --add safe.directory '/.archon/worktrees/*'
|
2025-11-12 05:06:29 +00:00
|
|
|
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
# Copy entrypoint script (fixes volume permissions, drops to appuser)
|
2026-03-29 07:44:17 +00:00
|
|
|
# sed strips Windows CRLF in case .gitattributes eol=lf was bypassed
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
COPY docker-entrypoint.sh /usr/local/bin/
|
2026-03-29 07:44:17 +00:00
|
|
|
RUN sed -i 's/\r$//' /usr/local/bin/docker-entrypoint.sh \
|
|
|
|
|
&& chmod +x /usr/local/bin/docker-entrypoint.sh
|
2025-11-11 02:57:13 +00:00
|
|
|
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
# Default port (matches .env.example PORT=3000)
|
feat: implement telegram + claude mvp with generic architecture
- Add generic IPlatformAdapter and IAssistantClient interfaces for extensibility
- Implement TelegramAdapter with streaming/batch modes
- Implement ClaudeClient with session persistence and resume capability
- Create TestAdapter for autonomous validation via HTTP endpoints
- Add PostgreSQL database with 3-table schema (conversations, codebases, sessions)
- Implement slash command system (/clone, /status, /getcwd, /setcwd, /reset, /help)
- Add Docker containerization with docker-compose (with-db profile for local PostgreSQL)
- Fix Claude Agent SDK spawn error (install bash, pass PATH environment variable)
- Fix workspace volume mount to use /workspace in container
- Add comprehensive documentation and health check endpoints
Architecture highlights:
- Platform-agnostic design allows adding Slack, GitHub, etc. via IPlatformAdapter
- AI-agnostic design allows adding Codex, etc. via IAssistantClient
- Orchestrator uses dependency injection with interface types
- Session persistence survives container restarts
- Working directory + codebase context determine Claude behavior
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 01:35:50 +00:00
|
|
|
EXPOSE 3000
|
|
|
|
|
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
|
|
|
ENTRYPOINT ["docker-entrypoint.sh"]
|