feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
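The fallback warning described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name `sqliteFallbackWarning`, the `Env` type, and the warning wording are hypothetical, and the connection string is the one quoted in the compose file's header comment.

```typescript
// Hypothetical helper: returns a warning when the app runs in Docker
// without DATABASE_URL, or undefined when no warning is needed.
type Env = Record<string, string | undefined>;

// Connection string for the bundled postgres service (with-db profile),
// as quoted in the compose file header.
const LOCAL_POSTGRES_URL =
  "postgresql://postgres:postgres@postgres:5432/remote_coding_agent";

function sqliteFallbackWarning(env: Env): string | undefined {
  const inDocker = env.ARCHON_DOCKER === "true";
  const hasDatabaseUrl = (env.DATABASE_URL ?? "").trim() !== "";
  if (!inDocker || hasDatabaseUrl) return undefined;
  return (
    "DATABASE_URL is not set, so the app is using SQLite. " +
    "If you started the with-db profile, add this to .env: " +
    `DATABASE_URL=${LOCAL_POSTGRES_URL}`
  );
}
```

Taking the environment as a plain object rather than reading it globally keeps the check easy to exercise in tests.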
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
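As a rough illustration of two of the helpers named above, the sketch below shows one way `escapeHtml` and `isSafeRedirect` could behave. The bodies are assumptions, since the actual auth-service implementations are not shown in this message; only the function names come from it.

```typescript
// Hypothetical sketches; the real auth-service helpers may differ.

// Escape the five characters that are significant in HTML text and attributes.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Accept only same-site absolute paths such as "/dashboard".
function isSafeRedirect(target: string): boolean {
  if (!target.startsWith("/")) return false; // absolute URLs and relative paths
  if (target.startsWith("//")) return false; // protocol-relative URL
  if (target.includes("\\")) return false; // some browsers treat "\" like "/"
  if (/[\u0000-\u001f\u007f]/.test(target)) return false; // control characters
  return true;
}
```

Escaping `&` first matters: doing it last would double-escape the entities produced by the other replacements.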
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
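The discriminated-union change to `resolveRepoArg` can be pictured with a sketch like the one below. Only the names `ResolveRepoArgResult` and `resolveRepoArg` and the `ok: true/false` discriminator come from this message; the `Codebase` shape, the 1-based index convention, and the error wording are assumptions for illustration.

```typescript
// Hypothetical shapes; the real types in command-handler.ts may differ.
interface Codebase {
  id: string;
  name: string;
}

type ResolveRepoArgResult =
  | { ok: true; codebase: Codebase }
  | { ok: false; message: string };

// Resolve a "/repo <arg>" argument given as a 1-based index or a name.
function resolveRepoArg(arg: string, codebases: Codebase[]): ResolveRepoArgResult {
  if (/^\d+$/.test(arg)) {
    // "0" maps to index -1 and N > count maps past the end; both are undefined.
    const byIndex = codebases[Number(arg) - 1];
    if (byIndex === undefined) {
      return { ok: false, message: `Index ${arg} is out of range (1-${codebases.length})` };
    }
    return { ok: true, codebase: byIndex };
  }
  const byName = codebases.find((c) => c.name === arg);
  if (byName === undefined) {
    return { ok: false, message: `No repository named "${arg}"` };
  }
  return { ok: true, codebase: byName };
}
```

Checking `result.ok` narrows the type, so callers can read `result.codebase` only on success and `result.message` only on failure, which is sturdier than testing for a property with `'success' in result`.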
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
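The completion-signal handling described above might look roughly like this. The function names `detectCompletionSignal` and `stripCompletionTags` come from the change list, but the signatures and bodies are assumptions; this sketch detects only the tagged form `<promise>SIGNAL</promise>`, and the real helpers may accept other forms. `escapeRegExp` is a hypothetical helper added for the example.

```typescript
// Hypothetical sketches; the real helpers in executor-shared may differ.

// Escape regex metacharacters so the signal is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True when the output contains <promise>SIGNAL</promise> for the given signal.
function detectCompletionSignal(output: string, signal: string): boolean {
  const pattern = new RegExp(`<promise>\\s*${escapeRegExp(signal)}\\s*</promise>`);
  return pattern.test(output);
}

// Remove every <promise>...</promise> tag before passing output downstream.
function stripCompletionTags(output: string): string {
  return output.replace(/<promise>[\s\S]*?<\/promise>/g, "").trim();
}
```

A loop node would call something like `detectCompletionSignal` after each iteration and stop once it returns true, then hand the stripped text to downstream nodes.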
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
# =============================================================================
# Archon - Docker Compose
# =============================================================================
#
# Usage:
#   docker compose up -d  # App with SQLite (default)
#   docker compose --profile with-db up -d  # App + local PostgreSQL
#   docker compose --profile cloud up -d  # App + Caddy HTTPS reverse proxy
#   docker compose --profile with-db --profile cloud up -d  # All three
#
# Database:
#   SQLite is the default (zero config). For PostgreSQL, either:
#   - Use --profile with-db for a local container, and set in .env:
#     DATABASE_URL=postgresql://postgres:postgres@postgres:5432/remote_coding_agent
#   - Or point DATABASE_URL to an external database (Supabase, Neon, etc.)
#
# Data:
#   Set ARCHON_DATA in .env to control where Archon stores data on the host:
#     ARCHON_DATA=/opt/archon-data  # Any absolute path on the host
#   Default: Docker-managed volume (archon_data)
#
# Cloud (HTTPS):
#   1. Set DOMAIN=archon.example.com in .env
#   2. Point DNS A record to your server
#   3. Add --profile cloud; Caddy handles TLS automatically via Let's Encrypt
#
feat: implement telegram + claude mvp with generic architecture
- Add generic IPlatformAdapter and IAssistantClient interfaces for extensibility
- Implement TelegramAdapter with streaming/batch modes
- Implement ClaudeClient with session persistence and resume capability
- Create TestAdapter for autonomous validation via HTTP endpoints
- Add PostgreSQL database with 3-table schema (conversations, codebases, sessions)
- Implement slash command system (/clone, /status, /getcwd, /setcwd, /reset, /help)
- Add Docker containerization with docker-compose (with-db profile for local PostgreSQL)
- Fix Claude Agent SDK spawn error (install bash, pass PATH environment variable)
- Fix workspace volume mount to use /workspace in container
- Add comprehensive documentation and health check endpoints
Architecture highlights:
- Platform-agnostic design allows adding Slack, GitHub, etc. via IPlatformAdapter
- AI-agnostic design allows adding Codex, etc. via IAssistantClient
- Orchestrator uses dependency injection with interface types
- Session persistence survives container restarts
- Working directory + codebase context determine Claude behavior
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 01:35:50 +00:00
services:
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
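A rough sketch of what the two `auth.ts` helpers named above might do. Only the function names come from the commit message. The comma-separated allowlist format, the case-insensitive comparison, and the "empty list means open access" default are all assumptions.

```typescript
// Assumed behaviour, not taken from the adapter source.
// Parses a comma-separated allowlist (for example from one of the GITEA_* variables).
function parseAllowedUsers(raw: string | undefined): string[] {
  if (!raw) return [];
  return raw
    .split(',')
    .map((user) => user.trim().toLowerCase())
    .filter((user) => user.length > 0);
}

// Checks a webhook sender against the parsed allowlist.
function isGiteaUserAuthorized(username: string | undefined, allowed: string[]): boolean {
  if (allowed.length === 0) return true; // assumption: no allowlist configured means open access
  if (!username) return false;
  return allowed.includes(username.trim().toLowerCase());
}

const allowedUsers = parseAllowedUsers('Alice, bob ,,');
```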
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
  # -------------------------------------------------------------------------
  # App (always runs)
  # -------------------------------------------------------------------------
feat: implement telegram + claude mvp with generic architecture
- Add generic IPlatformAdapter and IAssistantClient interfaces for extensibility
- Implement TelegramAdapter with streaming/batch modes
- Implement ClaudeClient with session persistence and resume capability
- Create TestAdapter for autonomous validation via HTTP endpoints
- Add PostgreSQL database with 3-table schema (conversations, codebases, sessions)
- Implement slash command system (/clone, /status, /getcwd, /setcwd, /reset, /help)
- Add Docker containerization with docker-compose (with-db profile for local PostgreSQL)
- Fix Claude Agent SDK spawn error (install bash, pass PATH environment variable)
- Fix workspace volume mount to use /workspace in container
- Add comprehensive documentation and health check endpoints
Architecture highlights:
- Platform-agnostic design allows adding Slack, GitHub, etc. via IPlatformAdapter
- AI-agnostic design allows adding Codex, etc. via IAssistantClient
- Orchestrator uses dependency injection with interface types
- Session persistence survives container restarts
- Working directory + codebase context determine Claude behavior
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
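The dependency-injection idea in the architecture highlights can be illustrated with a small sketch. Only the interface names `IPlatformAdapter` and `IAssistantClient` come from the commit message. The method names, the `Orchestrator` shape, and the two stand-in classes are hypothetical.

```typescript
// Method names are illustrative; the real interfaces may look different.
interface IPlatformAdapter {
  sendMessage(conversationId: string, text: string): Promise<void>;
}

interface IAssistantClient {
  sendQuery(prompt: string, cwd: string): Promise<string>;
}

// The orchestrator depends only on the interfaces, so any platform or assistant can be injected.
class Orchestrator {
  private readonly platform: IPlatformAdapter;
  private readonly assistant: IAssistantClient;

  constructor(platform: IPlatformAdapter, assistant: IAssistantClient) {
    this.platform = platform;
    this.assistant = assistant;
  }

  async handle(conversationId: string, prompt: string, cwd: string): Promise<void> {
    const reply = await this.assistant.sendQuery(prompt, cwd);
    await this.platform.sendMessage(conversationId, reply);
  }
}

// Stand-ins comparable in spirit to a test adapter: they record and echo.
class InMemoryAdapter implements IPlatformAdapter {
  sent: string[] = [];
  async sendMessage(_conversationId: string, text: string): Promise<void> {
    this.sent.push(text);
  }
}

class EchoClient implements IAssistantClient {
  async sendQuery(prompt: string, _cwd: string): Promise<string> {
    return `echo: ${prompt}`;
  }
}
```

Swapping `InMemoryAdapter` for a Telegram or Slack adapter, or `EchoClient` for a Claude or Codex client, would not require any change to `Orchestrator` in this sketch.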
2025-11-11 01:35:50 +00:00
  app:
2025-11-28 08:21:31 +00:00
    build: .
2026-04-07 13:03:13 +00:00
    image: archon
2025-11-28 08:21:31 +00:00
    env_file: .env
    environment:
2025-12-17 19:45:41 +00:00
      ARCHON_DOCKER: "true"
2025-11-28 08:21:31 +00:00
    ports:
2025-12-01 09:45:54 +00:00
      - "${PORT:-3000}:${PORT:-3000}"
2025-11-28 08:21:31 +00:00
    volumes:
feat(docker): complete Docker deployment setup (#756)
2026-03-26 13:02:04 +00:00
      - ${ARCHON_DATA:-archon_data}:/.archon
    networks:
      - archon-network
2025-11-28 08:21:31 +00:00
    restart: unless-stopped
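The `${PORT:-3000}` and `${ARCHON_DATA:-archon_data}` entries in the app service rely on Compose's default-value interpolation, where the fallback applies if the variable is unset or empty. The sketch below emulates that rule to show what the two entries resolve to. The helper names are hypothetical and this is not how Compose itself is implemented.

```typescript
type Env = Record<string, string | undefined>;

// Mirrors ${NAME:-fallback}: the fallback is used when NAME is unset or empty.
function resolveWithDefault(env: Env, name: string, fallback: string): string {
  const value = env[name];
  return value === undefined || value === '' ? fallback : value;
}

// Resolves the two interpolated entries of the app service.
function appServiceBindings(env: Env): { port: string; volume: string } {
  const port = resolveWithDefault(env, 'PORT', '3000');
  const source = resolveWithDefault(env, 'ARCHON_DATA', 'archon_data');
  return { port: `${port}:${port}`, volume: `${source}:/.archon` };
}

const defaults = appServiceBindings({});
const custom = appServiceBindings({ PORT: '8080', ARCHON_DATA: '/opt/archon-data' });
```

With `ARCHON_DATA` unset, the volume source is the name `archon_data`, which matches the commit note that data defaults to a Docker-managed volume. Setting it to a host path such as `/opt/archon-data` turns the same line into a bind mount.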
|
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
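As a rough illustration of the LLM-decided completion signal described above, the sketch below checks iteration output for a <promise>SIGNAL</promise> tag and strips it before display. The function names are taken from the commit message, but the signatures and bodies are assumptions, not the project's implementation.

```typescript
// Illustrative only: signatures and bodies are assumed, not taken from the repo.

// Escape regex metacharacters so the signal text is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True when the output contains <promise>SIGNAL</promise> for the given signal.
function detectCompletionSignal(output: string, signal: string): boolean {
  const pattern = new RegExp(`<promise>\\s*${escapeRegExp(signal)}\\s*</promise>`);
  return pattern.test(output);
}

// Remove every <promise>...</promise> tag so it is not shown to the user.
function stripCompletionTags(output: string): string {
  return output.replace(/<promise>[\s\S]*?<\/promise>/g, "").trim();
}
```

Requiring the wrapping tag means a bare mention of the signal word in ordinary prose does not end the loop in this sketch.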
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:${PORT:-3000}/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 15s
2025-11-28 08:21:31 +00:00
    dns:
      - 8.8.8.8
      - 8.8.4.4
    sysctls:
      - net.ipv6.conf.all.disable_ipv6=1
2026-03-26 13:02:04 +00:00
  # -------------------------------------------------------------------------
  # PostgreSQL (optional: --profile with-db)
  # Set DATABASE_URL in .env to connect the app to this container.
  # -------------------------------------------------------------------------
feat: implement telegram + claude mvp with generic architecture
- Add generic IPlatformAdapter and IAssistantClient interfaces for extensibility
- Implement TelegramAdapter with streaming/batch modes
- Implement ClaudeClient with session persistence and resume capability
- Create TestAdapter for autonomous validation via HTTP endpoints
- Add PostgreSQL database with 3-table schema (conversations, codebases, sessions)
- Implement slash command system (/clone, /status, /getcwd, /setcwd, /reset, /help)
- Add Docker containerization with docker-compose (with-db profile for local PostgreSQL)
- Fix Claude Agent SDK spawn error (install bash, pass PATH environment variable)
- Fix workspace volume mount to use /workspace in container
- Add comprehensive documentation and health check endpoints
Architecture highlights:
- Platform-agnostic design allows adding Slack, GitHub, etc. via IPlatformAdapter
- AI-agnostic design allows adding Codex, etc. via IAssistantClient
- Orchestrator uses dependency injection with interface types
- Session persistence survives container restarts
- Working directory + codebase context determine Claude behavior
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 01:35:50 +00:00
  postgres:
feat(docker): complete Docker deployment setup (#756)
* fix: overhaul Docker setup for working builds and server deployments
Multi-stage Dockerfile: deps → web build → production image. Fixes
missing workspace packages (was 3/9, now all 9), adds Vite web UI
build, removes broken single-file bundle, uses --production install.
Merges docker-compose.yml and docker-compose.cloud.yml into a single
file with composable profiles (with-db, cloud). Fixes health check
path (/api/health), postgres volume (/data), adds Caddyfile.example.
* docs: add comprehensive Docker guide and update cloud-deployment.md
New docs/docker.md covers quick start, composable profiles, config,
cloud deployment with HTTPS, pre-built image usage, building, and
troubleshooting. Updates cloud-deployment.md to use the new single
compose file with profiles and fixes stale health endpoint paths.
* docs: restructure docker.md — prerequisites before commands
Moves .env and Caddyfile setup to a Prerequisites section at the top,
before any docker compose commands. Adds troubleshooting entry for the
"not a directory" Caddyfile mount error.
* fix: pass env_file to Caddy container for DOMAIN variable
Caddy needs {$DOMAIN} from .env but the container had no env_file.
Without it, {$DOMAIN} is empty and Caddy parses the site block as
a global options block, causing "unrecognized global option" error.
* docs: rewrite docker.md with server quickstart and fix auth guidance
Restructures around a step-by-step Quick Start that walks through the
full server deployment (Docker install → .env → Caddyfile → DNS → run).
Removes CLAUDE_USE_GLOBAL_AUTH references — Docker has no local claude
CLI, so users must provide CLAUDE_CODE_OAUTH_TOKEN or CLAUDE_API_KEY.
* feat: warn when Docker app falls back to SQLite with postgres running
When ARCHON_DOCKER=true and DATABASE_URL is not set, logs a warning
with the exact connection string to add to .env. Prevents users from
running --profile with-db and unknowingly using SQLite instead.
* feat: configurable data directory via ARCHON_DATA env var
Users can set ARCHON_DATA=/opt/archon-data in .env to control where
Archon stores workspaces, worktrees, artifacts, and logs on the host.
Defaults to a Docker-managed volume when not set.
* fix: fix volume permission errors with entrypoint script
Docker volume mounts create /.archon/ as root, but the app runs as
appuser (UID 1001). New docker-entrypoint.sh runs as root to fix
permissions, then drops to appuser via gosu. Works both when running
as root (default) and as non-root (--user flag, Kubernetes).
* fix: configure git credentials from GH_TOKEN in Docker entrypoint
Git inside the container can't authenticate for HTTPS clones without
credentials. The entrypoint now configures git url.insteadOf to inject
GH_TOKEN into GitHub HTTPS URLs automatically.
* security: use credential helper for GH_TOKEN instead of url.insteadOf
The url.insteadOf approach stored the raw token in ~/.gitconfig as a
key name, visible to any process. Credential helper keeps the token
in the environment only. Also fixes: chown -Rh (no symlink follow),
signal propagation (exec bun directly as PID 1), error diagnostics,
and deduplicates root/non-root branches via RUNNER variable.
* security: scope SSE flush_interval to /api/stream/*, harden headers
flush_interval -1 was global, disabling buffering for all endpoints.
Now scoped to @sse path matcher. Also adds HSTS, changes X-Frame-Options
to DENY, and trims the comment header.
* security: use env-var for postgres password, bind port to localhost
Hardcoded postgres:postgres with port exposed to 0.0.0.0 is a risk
on servers with permissive firewalls. Now uses POSTGRES_PASSWORD env
var with fallback, and binds to 127.0.0.1 only.
* fix: caddy depends_on app with service_healthy condition
Without the health condition, Caddy starts proxying before the app
is ready, returning 502s on first boot.
* fix: remove hardcoded container_name from caddy service
Hardcoded name prevents running multiple instances on the same host.
Other services already use Compose default naming.
* security: exclude .claude/ from Docker image
Skills, commands, rules, and prompt engineering details are not needed
at runtime and expose internal architecture in the production image.
* fix: assert web build produces index.html in Dockerfile
A silent Vite failure could produce an empty dist/ — the container
would start with a healthy backend but a broken UI serving 404s.
* chore: remove redundant WORKDIR in Dockerfile Stage 2
WORKDIR /app is inherited from Stage 1 (deps). Re-declaring it adds
a no-op layer and implies something changed.
* feat: add cloud-init config for automated server setup
New deploy/cloud-init.yml for VPS providers — paste into User Data
field to auto-install Docker, clone repo, build image, and configure
firewall. User only needs to edit .env and run docker compose up.
* feat: add optional Caddy basic auth for cloud deployments
Single env var (CADDY_BASIC_AUTH) expands to the full basicauth directive
or nothing when unset — no app changes needed. Webhooks and health check
are excluded. Documented in .env.example, deploy config, and docker.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(docker): add agent-browser + Chromium for E2E testing workflows
Enables E2E validation workflows (archon-validate-pr, validate-ui,
replicate-issue) to run inside Docker containers out of the box.
- Install system Chromium via apt-get (~200MB vs ~500MB Chrome for Testing)
- Install agent-browser@0.22.1 via npm (postinstall downloads Rust binary)
- Purge nodejs/npm after install to keep image lean
- Set AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
- agent-browser auto-detects Docker and adds --no-sandbox
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
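The HMAC-SHA256 signed cookie mentioned above can be sketched as follows. signCookie and verifyCookie are named in a later commit message, but the "value.signature" layout, the base64url encoding, and the parameters used here are assumptions for illustration; the real auth-service is plain Node.js and may differ.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical format: "<value>.<base64url HMAC-SHA256 of value>".
function signCookie(value: string, secret: string): string {
  const mac = createHmac("sha256", secret).update(value).digest("base64url");
  return `${value}.${mac}`;
}

// Returns the original value when the signature matches, otherwise null.
function verifyCookie(cookie: string, secret: string): string | null {
  const dot = cookie.lastIndexOf(".");
  if (dot <= 0) return null;
  const value = cookie.slice(0, dot);
  const given = Buffer.from(cookie.slice(dot + 1));
  const expected = Buffer.from(
    createHmac("sha256", secret).update(value).digest("base64url"),
  );
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (given.length !== expected.length) return null;
  return timingSafeEqual(given, expected) ? value : null;
}
```

The constant-time comparison avoids leaking how many leading characters of a forged signature were correct.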
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
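The two helpers named in the review fixes could look roughly like this. The function names isSafeRedirect and escapeHtml are taken from the log; the exact rules below (same-origin absolute paths only, five escaped characters) are assumptions and may differ from the real auth-service/server.js.

```typescript
// Accept only same-origin absolute paths such as "/dashboard".
function isSafeRedirect(target: string): boolean {
  if (!target.startsWith("/")) return false; // rejects "https://evil.example" and relative paths
  if (target.startsWith("//")) return false; // scheme-relative URL points at another host
  if (target.includes("\\")) return false; // some browsers treat "\" like "/"
  return !/[\x00-\x1f]/.test(target); // no control characters
}

// Escape the characters that are significant in HTML text and attribute values.
function escapeHtml(input: string): string {
  return input
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```

The ampersand is replaced first so that entities produced by the later replacements are not escaped a second time.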
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
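To illustrate the switch from a 'success' in check to a discriminated union, here is a sketch of ResolveRepoArgResult with an ok discriminator. The Repo shape, the error strings, and the index-or-name lookup are hypothetical; only the type name, the helper name, and the ok: true/false discriminator come from the log.

```typescript
// Hypothetical repo record for illustration.
interface Repo {
  name: string;
}

type ResolveRepoArgResult =
  | { ok: true; repo: Repo }
  | { ok: false; error: string };

// Assumed behavior: the argument is either a 1-based index or a repo name.
function resolveRepoArg(arg: string, repos: Repo[]): ResolveRepoArgResult {
  if (/^\d+$/.test(arg)) {
    const repo = repos[Number(arg) - 1];
    if (!repo) return { ok: false, error: `Index ${arg} is out of range (1-${repos.length})` };
    return { ok: true, repo };
  }
  const match = repos.find((r) => r.name === arg);
  return match ? { ok: true, repo: match } : { ok: false, error: `Unknown repo: ${arg}` };
}
```

Callers narrow on result.ok, so the compiler only allows result.repo on the success branch and result.error on the failure branch.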
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
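A rough sketch of the completion-signal check for loop nodes, assuming the signal must appear inside promise tags. detectCompletionSignal and stripCompletionTags are named in the log; the regular expressions and the escapeRegExp helper here are assumptions, and the real executor-shared code may also accept other forms.

```typescript
// Hypothetical helper: escape regex metacharacters in the configured signal.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True when the output contains <promise>SIGNAL</promise> for the given signal.
function detectCompletionSignal(output: string, signal: string): boolean {
  const pattern = new RegExp(`<promise>\\s*${escapeRegExp(signal)}\\s*</promise>`);
  return pattern.test(output);
}

// Remove every <promise>...</promise> block before showing output to the user.
function stripCompletionTags(output: string): string {
  return output.replace(/<promise>[\s\S]*?<\/promise>/g, "").trim();
}
```

Requiring the tags means a model that merely mentions the signal word in passing does not end the loop early.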
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
    image: postgres:17-alpine
feat: implement telegram + claude mvp with generic architecture
- Add generic IPlatformAdapter and IAssistantClient interfaces for extensibility
- Implement TelegramAdapter with streaming/batch modes
- Implement ClaudeClient with session persistence and resume capability
- Create TestAdapter for autonomous validation via HTTP endpoints
- Add PostgreSQL database with 3-table schema (conversations, codebases, sessions)
- Implement slash command system (/clone, /status, /getcwd, /setcwd, /reset, /help)
- Add Docker containerization with docker-compose (with-db profile for local PostgreSQL)
- Fix Claude Agent SDK spawn error (install bash, pass PATH environment variable)
- Fix workspace volume mount to use /workspace in container
- Add comprehensive documentation and health check endpoints
Architecture highlights:
- Platform-agnostic design allows adding Slack, GitHub, etc. via IPlatformAdapter
- AI-agnostic design allows adding Codex, etc. via IAssistantClient
- Orchestrator uses dependency injection with interface types
- Session persistence survives container restarts
- Working directory + codebase context determine Claude behavior
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
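The dependency-injection design in the architecture highlights can be sketched as follows. Only the interface names IPlatformAdapter and IAssistantClient appear in the log; every method name, the Orchestrator class shape, and the in-memory stand-ins are hypothetical, and the calls are kept synchronous for brevity.

```typescript
// Hypothetical method names; the log only names the two interfaces.
interface IPlatformAdapter {
  sendMessage(conversationId: string, text: string): void;
}

interface IAssistantClient {
  // Returns the reply plus a session id that can be stored and resumed later.
  sendQuery(prompt: string, cwd: string, sessionId?: string): { reply: string; sessionId: string };
}

// The orchestrator depends only on the interfaces, never on Telegram or Claude directly.
class Orchestrator {
  private platform: IPlatformAdapter;
  private assistant: IAssistantClient;
  private sessions = new Map<string, string>();

  constructor(platform: IPlatformAdapter, assistant: IAssistantClient) {
    this.platform = platform;
    this.assistant = assistant;
  }

  handleMessage(conversationId: string, text: string, cwd: string): void {
    const previous = this.sessions.get(conversationId);
    const { reply, sessionId } = this.assistant.sendQuery(text, cwd, previous);
    this.sessions.set(conversationId, sessionId); // resume this session next time
    this.platform.sendMessage(conversationId, reply);
  }
}

// In-memory stand-ins, in the spirit of the TestAdapter mentioned above.
class RecordingAdapter implements IPlatformAdapter {
  sent: string[] = [];
  sendMessage(_conversationId: string, text: string): void {
    this.sent.push(text);
  }
}

class EchoClient implements IAssistantClient {
  sendQuery(prompt: string, _cwd: string, sessionId?: string) {
    return { reply: `echo: ${prompt}`, sessionId: sessionId ?? "session-1" };
  }
}
```

Swapping in another platform or assistant then means writing one more class that satisfies the interface, with no change to the orchestrator.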
2025-11-11 01:35:50 +00:00
    profiles: ["with-db"]
    environment:
      POSTGRES_DB: remote_coding_agent
      POSTGRES_USER: postgres
feat(docker): complete Docker deployment setup (#756)
2026-03-26 13:02:04 +00:00
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-postgres}
feat: implement telegram + claude mvp with generic architecture
2025-11-11 01:35:50 +00:00
    volumes:
feat(docker): complete Docker deployment setup (#756)
Closes #787
* fix(docker): symlink agent-browser native binary before purging nodejs
The npm entry point (bin/agent-browser.js) is a Node.js wrapper that
launches the Rust binary. After purging nodejs/npm to save ~60MB, the
wrapper can't execute. Fix by copying the native Rust binary directly
to /usr/local/bin and symlinking agent-browser to it.
* feat(docker): add cookie-based form auth sidecar for Caddy
- Add auth-service/ Node.js sidecar (/verify, /login GET/POST, /logout)
- Use bcryptjs for password hashing, HMAC-SHA256 signed HttpOnly cookies
- Add auth-service to docker-compose.yml under ["auth"] profile (expose: not ports:)
- Restructure Caddyfile.example with handle blocks for Option A (form auth), Option B (basic auth), None
- Add AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET env vars to .env.example and deploy/.env.example
- Add Form-Based Authentication section to docs/docker.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
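A minimal sketch of HMAC-SHA256 cookie signing as described in the sidecar commit above. The cookie layout (`<payload>.<signature>`, both base64url) is an assumption; the commit names `signCookie`/`verifyCookie` elsewhere but does not show their signatures, so the ones here are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed format: "<base64url payload>.<base64url HMAC-SHA256 of that payload>".
function signCookie(payload: string, secret: string): string {
  const body = Buffer.from(payload, "utf8").toString("base64url");
  const mac = createHmac("sha256", secret).update(body).digest("base64url");
  return `${body}.${mac}`;
}

// Returns the original payload if the signature matches, otherwise null.
function verifyCookie(cookie: string, secret: string): string | null {
  const dot = cookie.indexOf(".");
  if (dot <= 0) return null;
  const body = cookie.slice(0, dot);
  const given = Buffer.from(cookie.slice(dot + 1), "utf8");
  const expected = Buffer.from(
    createHmac("sha256", secret).update(body).digest("base64url"),
    "utf8",
  );
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (given.length !== expected.length) return null;
  if (!timingSafeEqual(given, expected)) return null;
  return Buffer.from(body, "base64url").toString("utf8");
}
```

Comparing the encoded signatures with `timingSafeEqual` avoids leaking how many leading characters matched. A real deployment would also set `HttpOnly` on the cookie, as the commit notes, and would likely embed an expiry in the payload.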
* fix: address review findings for auth-service (HIGH/MEDIUM)
Fixes applied:
- HIGH: validate AUTH_PASSWORD_HASH is a valid bcrypt hash at startup
(bcrypt.getRounds() guard — prevents silent lockout on placeholder hash)
- HIGH: add request method/URL context to unhandled error log + non-empty 500 body
- HIGH: add server.on('error') handler for port bind failures (EADDRINUSE/EACCES)
- HIGH: document AUTH_PORT/AUTH_SERVICE_PORT indirection in server.js comment
- HIGH: add auth-service/test.js with isSafeRedirect and cookie sign/verify tests
- MEDIUM: add escapeHtml() helper; apply to loginPage error param (latent XSS)
- MEDIUM: add 4 KB body size limit in readBody (prevents memory exhaustion)
- MEDIUM: export helpers + require.main guard (enables clean import-level testing)
- MEDIUM: fix docs/docker.md Step 4 instruction — clarify which handle block to comment out
Tests added:
- auth-service/test.js: 12 assertions for isSafeRedirect (safe paths + open redirect vectors)
- auth-service/test.js: 5 assertions for signCookie/verifyCookie round-trip and edge cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
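Two of the helpers named in the review fixes above, sketched under assumptions: the commit gives the names `isSafeRedirect` and `escapeHtml` but not their rules, so the checks below are one plausible set, not the auth-service's actual logic.

```typescript
// Accept only same-site absolute paths; reject common open-redirect vectors.
function isSafeRedirect(target: string): boolean {
  if (!target.startsWith("/")) return false; // absolute URLs, relative paths, empty string
  if (target.startsWith("//")) return false; // protocol-relative URL, e.g. //evil.example
  if (target.includes("\\")) return false; // some browsers treat "\" like "/"
  if (/[\u0000-\u001f\u007f]/.test(target)) return false; // control characters
  return true;
}

// Escape the five characters that matter when interpolating text into HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```

For example, `isSafeRedirect("/dashboard")` is `true` while `isSafeRedirect("//evil.example")` is `false`, and `escapeHtml` applied to the login page's error parameter keeps a value such as `<script>` from being rendered as markup.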
* fix: escape $ in AUTH_PASSWORD_HASH example to prevent Docker Compose variable substitution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(core): break up god function in command-handler (#742)
* refactor(core): break up god function in command-handler
Extract handleWorktreeCommand, handleWorkflowCommand, handleRepoCommand,
and handleRepoRemoveCommand from the 1300-line handleCommand switch
statement. Add resolveRepoArg helper to eliminate duplication between
repo and repo-remove cases. handleCommand now contains ~200 lines of
routing logic only.
* fix: address review findings from PR #742
command-handler.ts:
- Replace fragile 'success' in discriminator with proper ResolveRepoArgResult
discriminated union (ok: true/false) and fix misleading JSDoc
- Add missing error handling to worktree orphans, workflow cancel, workflow reload
- Fix isolation_env_id UUID used as filesystem path in worktree create/list/orphans
(look up working_path from DB instead)
- Add cmd. domain prefix to all log events per CLAUDE.md convention
- Add identifier/isolationEnvId context to repo_switch_failed and worktree_remove_failed logs
- Capture isCurrentCodebase before mutation in handleRepoRemoveCommand
- Hoist duplicated workflowCwd computation in handleWorkflowCommand
- Remove stale (Phase 3D) comment marker
docs:
- Remove all /command-invoke references from CLAUDE.md, README.md,
docs/architecture.md, and .claude/rules/orchestrator.md
- Update command list to match actual handleCommand cases
- Replace outdated routing examples with current AI router pattern
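The discriminated-union change above can be illustrated with a short sketch. Only the name `ResolveRepoArgResult` and the `ok: true/false` discriminator come from the commit; the `Repo` shape, the other fields, and the 1-based numbering are assumptions.

```typescript
// Hypothetical repo record; the real codebase type is not shown in the commit.
interface Repo {
  name: string;
}

// `ok` is the discriminator: narrowing on it exposes either `repo` or `error`.
type ResolveRepoArgResult =
  | { ok: true; repo: Repo }
  | { ok: false; error: string };

// Resolve "/repo <N>" (assumed 1-based index) or "/repo <name>".
function resolveRepoArg(arg: string, repos: Repo[]): ResolveRepoArgResult {
  if (/^\d+$/.test(arg)) {
    const repo = repos[Number(arg) - 1];
    if (!repo) {
      return { ok: false, error: `Repo number ${arg} is out of range (${repos.length} available)` };
    }
    return { ok: true, repo };
  }
  const match = repos.find((r) => r.name === arg);
  return match
    ? { ok: true, repo: match }
    : { ok: false, error: `No repo named "${arg}"` };
}
```

Compared with an `'success' in result` check, `if (result.ok)` lets the compiler narrow the type, so reading `result.repo` on the failure branch is a type error instead of a runtime surprise.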
* refactor: remove MAX_WORKTREES_PER_CODEBASE limit
Worktree count is no longer restricted. Remove the constant, the
limit field from WorktreeStatusBreakdown, the limit_reached block
reason, formatWorktreeLimitMessage, and all associated tests.
* fix: address review findings — error handling, log prefixes, tests, docs
- Wrap workflow list discoverWorkflowsWithConfig in try/catch (was the
only unprotected async call among workflow subcommands)
- Cast error to Error before logging in workflow cancel/status catch blocks
- Add cmd. domain prefix to all command-handler log events (12 events)
- Update worktree create test to use UUID isolation_env_id with DB lookup
- Add resolveRepoArg boundary tests (/repo 0, /repo N > count)
- Add worktree cleanup subcommand tests (merged, stale, invalid type)
- Add updateConversation assertion to repo-remove session test
- Fix stale docs: architecture.md command handler section, .claude → .archon
paths, remove /command-invoke from commands-reference, fix github.md example
* feat(workflows)!: replace standalone loop with DAG loop node (#785)
* feat(workflows): add loop node type to DAG workflows
Add LoopNode as a fourth DAG node type alongside command, prompt, and
bash. Loop nodes run an AI prompt repeatedly until a completion signal
is detected (LLM-decided via <promise>SIGNAL</promise>) or a
deterministic bash condition succeeds (until_bash exit 0).
This enables Ralph-style autonomous iteration as a composable node
within DAG workflows — upstream nodes can produce plans/task lists
that feed into the loop, and downstream nodes can act on the loop's
output via $nodeId.output substitution.
Changes:
- Add LoopNodeConfig, LoopNode interface, isLoopNode type guard
- Add loop branch in parseDagNode with full validation
- Extract detectCompletionSignal/stripCompletionTags to executor-shared
- Add executeLoopNode function in dag-executor with iteration logic
- Add nodeId field to loop iteration event interfaces
- Add 17 new tests (9 loader + 8 executor)
- Add archon-test-loop-dag and archon-ralph-dag default workflows
The standalone loop: workflow type is preserved but deprecated.
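The completion-signal mechanism described above might be sketched as follows. `detectCompletionSignal` and `stripCompletionTags` are named in the commit, but their signatures and bodies here are assumptions; `runUntilSignal`, its `maxIterations` cap, and the synchronous `step` callback are purely illustrative stand-ins for the real executor, and this version only recognizes the tagged `<promise>SIGNAL</promise>` form.

```typescript
// Escape regex metacharacters so the signal is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True when the output contains <promise>SIGNAL</promise> for the given signal.
function detectCompletionSignal(output: string, signal: string): boolean {
  const tagged = new RegExp(`<promise>\\s*${escapeRegExp(signal)}\\s*</promise>`);
  return tagged.test(output);
}

// Remove any <promise>...</promise> tags before showing output to the user.
function stripCompletionTags(output: string): string {
  return output.replace(/<promise>[\s\S]*?<\/promise>/g, "").trim();
}

// Hypothetical driver: call `step` until the signal appears or the cap is hit.
function runUntilSignal(
  step: (iteration: number) => string,
  signal: string,
  maxIterations: number,
): { completed: boolean; iterations: number; output: string } {
  let output = "";
  for (let i = 1; i <= maxIterations; i++) {
    output = step(i);
    if (detectCompletionSignal(output, signal)) {
      return { completed: true, iterations: i, output: stripCompletionTags(output) };
    }
  }
  return { completed: false, iterations: maxIterations, output };
}
```

Requiring the tags means a sentence that merely mentions the signal word does not end the loop early.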
* refactor(workflows): rewrite archon-ralph-dag prompt to match command quality bar
Expand the loop prompt from ~75 lines to ~430 lines with:
- 7 numbered phases with checkpoints (matching archon-implement.md pattern)
- Environment setup: dependency install, CLAUDE.md reading, git state check
- Explicit DO/DON'T implementation rules
- Per-failure-type validation handling (type-check, lint, tests, format)
- Acceptance criteria verification before commit
- Exact commit message template with heredoc format
- Edge case handling (validation loops, blocked stories, dirty state, large stories)
- File format specs for prd.json schema and progress.txt structure
- Critical fix: "context is stale — re-read from disk" for fresh_context loops
Also improved bash setup node (dep install, structured output delimiters,
story counts) and report node (git log/diff stats, PR status check).
* feat(workflows)!: remove standalone loop workflow type
BREAKING: Standalone `loop:` workflows are no longer supported.
Loop iteration is now exclusively a DAG node type (LoopNode).
Existing loop workflows should be migrated to DAG workflows
with loop nodes — see archon-ralph-dag.yaml for the pattern.
Removed:
- LoopConfig type and LoopWorkflow from WorkflowDefinition union
- executeLoopWorkflow function (~600 lines) from executor.ts
- Loop dispatch in executeWorkflow
- Top-level loop: parsing in loader (now returns clear error message)
- archon-ralph-fresh.yaml, archon-ralph-stateful.yaml, archon-test-loop.yaml
- LoopEditor.tsx and loop mode from WorkflowBuilder UI
- ~900 lines of standalone loop tests
Kept (for DAG loop nodes):
- LoopNodeConfig, LoopNode, isLoopNode
- executeLoopNode in dag-executor.ts
- Loop iteration events in store/event-emitter
- isLoop tracking in web UI workflow store (fires for DAG loop nodes)
* fix: address all review findings for loop-dag-node PR
- Fix missing isDagWorkflow import in command-handler.ts (shipping bug)
- Wrap substituteWorkflowVariables and getAssistantClient in try-catch
with structured error output in executeLoopNode
- Add onTimeout callback for idle timeout (log + user notification + abort)
- Add cancellation user notification before returning failed state
- Differentiate until_bash ENOENT/system errors from expected non-zero exit
- Use logDir for per-iteration AI output logging (logAssistant, logTool,
logStepComplete, tool_called/tool_completed events, sendStructuredEvent)
- Reject retry: on loop nodes at load time (executor doesn't apply it)
- Remove dead isLoop field from WorkflowStartedEvent
- Fix stale error message "DAG/loop dispatch" -> "DAG dispatch"
- Fix stale commitWorkflowArtifacts doc referencing "loop-based"
- Fix archon-ralph-dag.yaml referencing deleted workflows
- Update CLAUDE.md: "Two execution modes", add loop node to DAG description
- Extract parseIdleTimeout helper (3 copies -> 1 in loader.ts)
- Use isLoopNode() type guard in validateDagStructure
- Simplify buildLoopNodeOptions with conditional spread
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
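A rough sketch of what the extracted `auth.ts` helpers could look like. The function names come from the commit; everything else is an assumption, including the comma-separated input format, case-insensitive matching, and treating an empty allowlist as "allow everyone". The real adapter may behave differently on each point.

```typescript
// Parse a comma-separated allowlist (assumed format) into normalized usernames.
function parseAllowedUsers(raw: string | undefined): string[] {
  if (!raw) return [];
  return raw
    .split(",")
    .map((user) => user.trim().toLowerCase())
    .filter((user) => user.length > 0);
}

// Assumed policy: an empty allowlist means no restriction is configured.
function isGiteaUserAuthorized(username: string | undefined, allowed: string[]): boolean {
  if (allowed.length === 0) return true;
  if (!username) return false;
  return allowed.includes(username.trim().toLowerCase());
}
```

Keeping these as pure functions in the adapter package means they can be unit-tested without a webhook payload or a running server.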
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
  - postgres_data:/var/lib/postgresql/data
2025-12-05 13:12:12 +00:00
  - ./migrations/000_combined.sql:/docker-entrypoint-initdb.d/000_combined.sql:ro
  - ./migrations:/migrations:ro
feat: implement telegram + claude mvp with generic architecture
- Add generic IPlatformAdapter and IAssistantClient interfaces for extensibility
- Implement TelegramAdapter with streaming/batch modes
- Implement ClaudeClient with session persistence and resume capability
- Create TestAdapter for autonomous validation via HTTP endpoints
- Add PostgreSQL database with 3-table schema (conversations, codebases, sessions)
- Implement slash command system (/clone, /status, /getcwd, /setcwd, /reset, /help)
- Add Docker containerization with docker-compose (with-db profile for local PostgreSQL)
- Fix Claude Agent SDK spawn error (install bash, pass PATH environment variable)
- Fix workspace volume mount to use /workspace in container
- Add comprehensive documentation and health check endpoints
Architecture highlights:
- Platform-agnostic design allows adding Slack, GitHub, etc. via IPlatformAdapter
- AI-agnostic design allows adding Codex, etc. via IAssistantClient
- Orchestrator uses dependency injection with interface types
- Session persistence survives container restarts
- Working directory + codebase context determine Claude behavior
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 01:35:50 +00:00
ports:
feat(docker): complete Docker deployment setup (#756)
2026-03-26 13:02:04 +00:00
  - "127.0.0.1:${POSTGRES_PORT:-5432}:5432"
networks:
  - archon-network
feat: implement telegram + claude mvp with generic architecture
2025-11-11 01:35:50 +00:00
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U postgres"]
  interval: 5s
  timeout: 5s
  retries: 5
feat(docker): complete Docker deployment setup (#756)
2026-03-26 13:02:04 +00:00
  # -------------------------------------------------------------------------
  # Caddy reverse proxy with automatic HTTPS (optional: --profile cloud)
  # Requires DOMAIN set in .env. See Caddyfile for configuration.
  # -------------------------------------------------------------------------
  caddy:
    image: caddy:2-alpine
    profiles: ["cloud"]
    restart: unless-stopped
    env_file: .env
    ports:
      - "80:80"
      - "443:443"
      - "443:443/udp"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - caddy_data:/data
      - caddy_config:/config
    networks:
      - archon-network
    depends_on:
      app:
        condition: service_healthy

  # -------------------------------------------------------------------------
  # Auth service — form-based login for Caddy forward_auth (optional: --profile auth)
  # Use alongside --profile cloud: docker compose --profile cloud --profile auth up -d
  # Requires AUTH_USERNAME, AUTH_PASSWORD_HASH, COOKIE_SECRET in .env.
  # See docs/docker.md for setup instructions.
  # -------------------------------------------------------------------------
  auth-service:
    build: ./auth-service
    profiles: ["auth"]
    restart: unless-stopped
    env_file: .env
    environment:
      AUTH_PORT: "${AUTH_SERVICE_PORT:-9000}"
    expose:
      - "${AUTH_SERVICE_PORT:-9000}"
    networks:
      - archon-network
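The service definitions above rely on Compose variable defaults such as ${AUTH_SERVICE_PORT:-9000} and ${POSTGRES_PORT:-5432}. The TypeScript sketch below illustrates how the ":-" form resolves: the default applies when the variable is unset or empty. The function name interpolate is hypothetical, and the sketch deliberately covers only the ${NAME} and ${NAME:-default} forms, not the rest of the Compose interpolation syntax.

```typescript
// Simplified illustration of Compose-style interpolation.
// Handles ${NAME} and ${NAME:-default} only; other forms are left untouched.
function interpolate(
  template: string,
  env: Record<string, string | undefined>,
): string {
  return template.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}/g,
    (_match: string, name: string, fallback: string | undefined) => {
      const value = env[name];
      // With ":-", an empty value is treated the same as an unset one.
      if (value !== undefined && value !== "") return value;
      // No default given: substitute an empty string.
      return fallback ?? "";
    },
  );
}
```

For example, with AUTH_SERVICE_PORT absent from the environment, both the AUTH_PORT value and the exposed port would resolve to 9000, which is why the two settings stay in step when the variable is overridden in .env.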
feat: implement telegram + claude mvp with generic architecture
2025-11-11 01:35:50 +00:00
volumes:
feat(docker): complete Docker deployment setup (#756)
- Restore loop?: never on StepWorkflow for type safety
- Add tests: AI error mid-iteration, plain signal detection, false positive
- Fix stale test assertion for standalone loop rejection message
* feat: refactor Gitea adapter to community forge structure + tea CLI
Moves the Gitea platform adapter from the old location
(packages/server/src/adapters/gitea.ts) to the proper community
forge adapter structure:
packages/adapters/src/community/forge/gitea/
├── adapter.ts # Main adapter class
├── auth.ts # parseAllowedUsers, isGiteaUserAuthorized
├── types.ts # WebhookEvent interface
├── index.ts # Barrel export
└── adapter.test.ts # 43 passing tests
Key changes:
- Fix imports: createLogger, getArchonWorkspacesPath,
getCommandFolderSearchPaths now from @archon/paths
- Fix imports: cloneRepository, syncRepository, addSafeDirectory,
toRepoPath, toBranchName, isWorktreePath now from @archon/git
- Remove execAsync / child_process / promisify — use @archon/git
functions for all git operations
- auth.ts extracted from @archon/core into adapter package (mirrors
GitHub adapter's auth.ts pattern)
- types.ts extracted: WebhookEvent interface now standalone
- Replace gh CLI hints with tea CLI in context strings:
'tea issue view N' and 'tea pr view N'
- Register GiteaAdapter in packages/server/src/index.ts via
@archon/adapters/community/forge/gitea import
- Document GITEA_* env vars in .env.example
Tests: 43 pass, 0 fail
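For orientation, the two helpers named for auth.ts could look roughly like the sketch below. The signatures, the comma-separated input format, the case-insensitive comparison, and the "empty list means open access" rule are all assumptions made for illustration; the commit message does not specify them:

```typescript
// Assumed input: a comma-separated list of usernames, e.g. read from a GITEA_* env var.
function parseAllowedUsers(raw: string | undefined): string[] {
  if (!raw) return [];
  return raw
    .split(",")
    .map((user) => user.trim().toLowerCase())
    .filter((user) => user.length > 0);
}

// Assumed policy: an empty allow-list permits everyone; otherwise the name must be listed.
function isGiteaUserAuthorized(username: string | undefined, allowed: string[]): boolean {
  if (allowed.length === 0) return true;
  if (!username) return false;
  return allowed.includes(username.trim().toLowerCase());
}
```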
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Archon <archon@dynamous.ai>
Co-authored-by: Thomas <info@smartcode.diy>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fitzy <fitzy@cyberfitz.org>
Co-authored-by: John Fitzpatrick <john@cyberfitz.org>
2026-03-26 13:02:04 +00:00
  archon_data:
feat: implement telegram + claude mvp with generic architecture
- Add generic IPlatformAdapter and IAssistantClient interfaces for extensibility
- Implement TelegramAdapter with streaming/batch modes
- Implement ClaudeClient with session persistence and resume capability
- Create TestAdapter for autonomous validation via HTTP endpoints
- Add PostgreSQL database with 3-table schema (conversations, codebases, sessions)
- Implement slash command system (/clone, /status, /getcwd, /setcwd, /reset, /help)
- Add Docker containerization with docker-compose (with-db profile for local PostgreSQL)
- Fix Claude Agent SDK spawn error (install bash, pass PATH environment variable)
- Fix workspace volume mount to use /workspace in container
- Add comprehensive documentation and health check endpoints
Architecture highlights:
- Platform-agnostic design allows adding Slack, GitHub, etc. via IPlatformAdapter
- AI-agnostic design allows adding Codex, etc. via IAssistantClient
- Orchestrator uses dependency injection with interface types
- Session persistence survives container restarts
- Working directory + codebase context determine Claude behavior
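The dependency-injection point in the highlights above can be sketched as follows. The interface names come from this commit; the method names (`sendMessage`, `sendQuery`), the `Orchestrator` shape, and the two in-memory stand-ins are hypothetical and only illustrate the pattern:

```typescript
// Hypothetical minimal shapes; the real interfaces likely carry more methods.
interface IPlatformAdapter {
  sendMessage(conversationId: string, text: string): Promise<void>;
}

interface IAssistantClient {
  sendQuery(prompt: string, cwd: string): Promise<string>;
}

// The orchestrator depends only on the interfaces, never on Telegram or Claude directly.
class Orchestrator {
  private readonly platform: IPlatformAdapter;
  private readonly assistant: IAssistantClient;

  constructor(platform: IPlatformAdapter, assistant: IAssistantClient) {
    this.platform = platform;
    this.assistant = assistant;
  }

  async handle(conversationId: string, prompt: string, cwd: string): Promise<void> {
    const reply = await this.assistant.sendQuery(prompt, cwd);
    await this.platform.sendMessage(conversationId, reply);
  }
}

// In-memory stand-ins, in the spirit of the TestAdapter mentioned above.
class RecordingAdapter implements IPlatformAdapter {
  sent: string[] = [];
  async sendMessage(_conversationId: string, text: string): Promise<void> {
    this.sent.push(text);
  }
}

class EchoAssistant implements IAssistantClient {
  async sendQuery(prompt: string, cwd: string): Promise<string> {
    return `[${cwd}] ${prompt}`;
  }
}
```

Under these assumptions, adding another platform or assistant means writing one new class per interface and leaving the orchestrator untouched.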
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 01:35:50 +00:00
  postgres_data:
feat(docker): complete Docker deployment setup (#756)
2026-03-26 13:02:04 +00:00
  caddy_data:
  caddy_config:

networks:
  archon-network:
    driver: bridge