Archon/Dockerfile
Rasmus Widing 81859d6842
fix(providers): replace Claude SDK embed with explicit binary-path resolver (#1217)
* feat(providers): replace Claude SDK embed with explicit binary-path resolver

Drop `@anthropic-ai/claude-agent-sdk/embed` and resolve Claude Code via
CLAUDE_BIN_PATH env → assistants.claude.claudeBinaryPath config → throw
with install instructions. The embed's silent failure modes on macOS
(#1210) and Windows (#1087) become actionable errors with a documented
recovery path.

Dev mode (bun run) remains auto-resolved via node_modules. The setup
wizard auto-detects Claude Code by probing the native installer path
(~/.local/bin/claude), npm global cli.js, and PATH, then writes
CLAUDE_BIN_PATH to ~/.archon/.env. Dockerfile pre-sets CLAUDE_BIN_PATH
so extenders using the compiled binary keep working. Release workflow
gets negative and positive resolver smoke tests.

Docs, CHANGELOG, README, .env.example, CLAUDE.md, test-release and
archon skills all updated to reflect the curl-first install story.

Retires #1210, #1087, #1091 (never merged, now obsolete).
Implements #1176.

* fix(providers): only pass --no-env-file when spawning Claude via Bun/Node

`--no-env-file` is a Bun flag that prevents Bun from auto-loading
`.env` from the subprocess cwd. It is only meaningful when the Claude
Code executable is a `cli.js` file — in which case the SDK spawns it
via `bun`/`node` and the flag reaches the runtime.

When `CLAUDE_BIN_PATH` points at a native compiled Claude binary (e.g.
`~/.local/bin/claude` from the curl installer, which is Anthropic's
recommended default), the SDK executes the binary directly. Passing
`--no-env-file` then goes straight to the native binary, which
rejects it with `error: unknown option '--no-env-file'` and the
subprocess exits code 1.

Emit `executableArgs` only when the target is a `.js` file (dev mode
or explicit cli.js path). Caught by end-to-end smoke testing against
the curl-installed native Claude binary.

* docs: record env-leak validation result in provider comment

Verified end-to-end with sentinel `.env` and `.env.local` files in a
workflow CWD that the native Claude binary (curl installer) does not
auto-load `.env` files. With Archon's full spawn pathway and parent
env stripped, the subprocess saw both sentinels as UNSET. The
first-layer protection in `@archon/paths` (#1067) handles the
inheritance leak; `--no-env-file` only matters for the Bun-spawned
cli.js path, where it is still emitted.

* chore(providers): cleanup pass — exports, docs, troubleshooting

Final-sweep cleanup tied to the binary-resolver PR:

- Mirror Codex's package surface for the new Claude resolver: add
  `./claude/binary-resolver` subpath export and re-export
  `resolveClaudeBinaryPath` + `claudeFileExists` from the package
  index. Renames the previously single `fileExists` re-export to
  `codexFileExists` for symmetry; nothing outside the providers
  package was importing it.
- Add a "Claude Code not found" entry to the troubleshooting reference
  doc with platform-specific install snippets and pointers to the
  AI Assistants binary-path section.
- Reframe the example claudeBinaryPath in reference/configuration.md
  away from cli.js-only language; it accepts either the native binary
  or cli.js.

* test+refactor(providers, cli): address PR review feedback

Two test gaps and one doc nit from the PR review (#1217):

- Extract the `--no-env-file` decision into a pure exported helper
  `shouldPassNoEnvFile(cliPath)` so the native-binary branch is unit
  testable without mocking `BUNDLED_IS_BINARY` or running the full
  sendQuery pathway. Six new tests cover undefined, cli.js, native
  binary (Linux + Windows), Homebrew symlink, and suffix-only matching.
  Also adds a `claude.subprocess_env_file_flag` debug log so the
  security-adjacent decision is auditable.

- Extract the three install-location probes in setup.ts into exported
  wrappers (`probeFileExists`, `probeNpmRoot`, `probeWhichClaude`) and
  export `detectClaudeExecutablePath` itself, so the probe order can be
  spied on. Six new tests cover each tier winning, fall-through
  ordering, npm-tier skip when not installed, and the
  which-resolved-but-stale-path edge case.

- CLAUDE.md `claudeBinaryPath` placeholder updated to reflect that the
  field accepts either the native binary or cli.js (the example value
  was previously `/absolute/path/to/cli.js`, slightly misleading now
  that the curl-installer native binary is the default).

Skipped from the review by deliberate scope decision:

- `resolveClaudeBinaryPath` async-with-no-await: matches Codex's
  resolver signature exactly. Changing only Claude breaks symmetry;
  if pursued, do both providers in a separate cleanup PR.
- `isAbsolute()` validation in parseClaudeConfig: Codex doesn't do it
  either. Resolver throws on non-existence already.
- Atomic `.env` writes in setup wizard: pre-existing pattern this PR
  touched only adjacently. File as separate issue if needed.
- classifyError branch in dag-executor for setup errors: scope creep.
- `.env.example` "missing #" claim: false positive (verified all
  CLAUDE_BIN_PATH lines have proper comment prefixes).

* fix(test): use path.join in Windows-compatible probe-order test

The "tier 2 wins (npm cli.js)" test hardcoded forward-slash path
comparisons, but `path.join` produces backslashes on Windows. Caused
the Windows CI leg of the test suite to fail while macOS and Linux
passed. Use `path.join` for both the mock return value and the
expectation so the separator matches whatever the platform produces.
2026-04-14 17:56:37 +03:00

190 lines
8.5 KiB
Docker

# =============================================================================
# Archon - Remote Agentic Coding Platform
# Multi-stage build: deps → web build → production image
# =============================================================================
# ---------------------------------------------------------------------------
# Stage 1: Install dependencies
# ---------------------------------------------------------------------------
FROM oven/bun:1.3.11-slim AS deps
WORKDIR /app
# Copy root package files and lockfile
COPY package.json bun.lock ./
# Copy ALL workspace package.json files (monorepo lockfile depends on all of them)
COPY packages/adapters/package.json ./packages/adapters/
COPY packages/cli/package.json ./packages/cli/
COPY packages/core/package.json ./packages/core/
# docs-web source is NOT copied — it's a static site deployed separately
# (see .github/workflows/deploy-docs.yml). package.json is included only
# so Bun's workspace lockfile resolves correctly.
COPY packages/docs-web/package.json ./packages/docs-web/
COPY packages/git/package.json ./packages/git/
COPY packages/isolation/package.json ./packages/isolation/
COPY packages/paths/package.json ./packages/paths/
COPY packages/providers/package.json ./packages/providers/
COPY packages/server/package.json ./packages/server/
COPY packages/web/package.json ./packages/web/
COPY packages/workflows/package.json ./packages/workflows/
# Install ALL dependencies (including devDependencies needed for web build)
# --linker=hoisted: Bun's default "isolated" linker stores packages in
# node_modules/.bun/ with symlinks that Vite/Rollup cannot resolve during
# production builds. Hoisted layout gives classic flat node_modules.
RUN bun install --frozen-lockfile --linker=hoisted
# ---------------------------------------------------------------------------
# Stage 2: Build web UI (Vite + React)
# ---------------------------------------------------------------------------
FROM deps AS web-build
# Copy full source (needed for workspace resolution and web build)
COPY . .
# Build the web frontend — output goes to packages/web/dist/
RUN bun run build:web && \
test -f packages/web/dist/index.html || \
(echo "ERROR: Web build produced no index.html" >&2 && exit 1)
# ---------------------------------------------------------------------------
# Stage 3: Production image
# ---------------------------------------------------------------------------
FROM oven/bun:1.3.11-slim AS production
# OCI Labels for GHCR
LABEL org.opencontainers.image.source="https://github.com/coleam00/Archon"
LABEL org.opencontainers.image.description="Control AI coding assistants remotely from Telegram, Slack, Discord, and GitHub"
LABEL org.opencontainers.image.licenses="MIT"
# Prevent interactive prompts during installation
ENV DEBIAN_FRONTEND=noninteractive
WORKDIR /app
# Install system dependencies + gosu for privilege dropping in entrypoint
RUN apt-get update && apt-get install -y \
curl \
git \
bash \
ca-certificates \
gnupg \
gosu \
postgresql-client \
# Chromium for agent-browser E2E testing (drives browser via CDP)
chromium \
&& rm -rf /var/lib/apt/lists/*
# Install GitHub CLI
RUN curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg \
&& chmod go+r /usr/share/keyrings/githubcli-archive-keyring.gpg \
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | tee /etc/apt/sources.list.d/github-cli.list > /dev/null \
&& apt-get update \
&& apt-get install -y gh \
&& rm -rf /var/lib/apt/lists/*
# Install agent-browser CLI (Vercel Labs) for E2E testing workflows
# - Uses npm (not bun) because postinstall script downloads the native Rust binary
# - After install, symlink the Rust binary directly and purge nodejs/npm (~60MB saved)
# - The npm entry point is a Node.js wrapper; the native binary works standalone
# - agent-browser auto-detects Docker (via /.dockerenv) and adds --no-sandbox to Chromium
RUN apt-get update && apt-get install -y --no-install-recommends nodejs npm \
&& npm install -g agent-browser@0.22.1 \
&& NATIVE_BIN=$(find /usr/local/lib/node_modules/agent-browser -name 'agent-browser-*' -type f -executable 2>/dev/null | head -1) \
&& if [ -n "$NATIVE_BIN" ]; then \
cp "$NATIVE_BIN" /usr/local/bin/agent-browser-native \
&& chmod +x /usr/local/bin/agent-browser-native \
&& ln -sf /usr/local/bin/agent-browser-native /usr/local/bin/agent-browser; \
else \
echo "ERROR: agent-browser native binary not found after npm install" >&2 && exit 1; \
fi \
&& npm cache clean --force \
&& rm -rf /usr/local/lib/node_modules/agent-browser \
&& apt-get purge -y nodejs npm \
&& apt-get autoremove -y \
&& rm -rf /var/lib/apt/lists/*
# Point agent-browser to system Chromium (avoids ~400MB Chrome for Testing download)
ENV AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
# Pre-configure the Claude Code SDK cli.js path for any consumer that runs
# a compiled Archon binary inside (or extending) this image. In source mode
# (the default `bun run start` ENTRYPOINT), BUNDLED_IS_BINARY is false and
# this variable is ignored — the SDK resolves cli.js via node_modules. Kept
# here so extenders don't need to rediscover the path.
# Path matches the hoisted layout produced by `bun install --linker=hoisted`.
ENV CLAUDE_BIN_PATH=/app/node_modules/@anthropic-ai/claude-agent-sdk/cli.js
# Create non-root user for running Claude Code
# Claude Code refuses to run with --dangerously-skip-permissions as root for security
RUN useradd -m -u 1001 -s /bin/bash appuser \
&& chown -R appuser:appuser /app
# Create Archon directories
RUN mkdir -p /.archon/workspaces /.archon/worktrees \
&& chown -R appuser:appuser /.archon
# Copy root package files and lockfile
COPY package.json bun.lock ./
# Copy ALL workspace package.json files
COPY packages/adapters/package.json ./packages/adapters/
COPY packages/cli/package.json ./packages/cli/
COPY packages/core/package.json ./packages/core/
# docs-web source is NOT copied — it's a static site deployed separately
# (see .github/workflows/deploy-docs.yml). package.json is included only
# so Bun's workspace lockfile resolves correctly.
COPY packages/docs-web/package.json ./packages/docs-web/
COPY packages/git/package.json ./packages/git/
COPY packages/isolation/package.json ./packages/isolation/
COPY packages/paths/package.json ./packages/paths/
COPY packages/providers/package.json ./packages/providers/
COPY packages/server/package.json ./packages/server/
COPY packages/web/package.json ./packages/web/
COPY packages/workflows/package.json ./packages/workflows/
# Install production dependencies only (--ignore-scripts skips husky prepare hook)
RUN bun install --frozen-lockfile --production --ignore-scripts --linker=hoisted
# Copy application source (Bun runs TypeScript directly, no compile step needed)
COPY packages/adapters/ ./packages/adapters/
COPY packages/cli/ ./packages/cli/
COPY packages/core/ ./packages/core/
COPY packages/git/ ./packages/git/
COPY packages/isolation/ ./packages/isolation/
COPY packages/paths/ ./packages/paths/
COPY packages/providers/ ./packages/providers/
COPY packages/server/ ./packages/server/
COPY packages/workflows/ ./packages/workflows/
# Copy pre-built web UI from build stage
COPY --from=web-build /app/packages/web/dist/ ./packages/web/dist/
# Copy config, migrations, and bundled defaults
COPY .archon/ ./.archon/
COPY migrations/ ./migrations/
COPY tsconfig*.json ./
# Fix permissions for appuser
RUN chown -R appuser:appuser /app
# Create .codex directory for Codex authentication
RUN mkdir -p /home/appuser/.codex && chown appuser:appuser /home/appuser/.codex
# Configure git to trust Archon directories (as appuser)
RUN gosu appuser git config --global --add safe.directory '/.archon/workspaces' && \
gosu appuser git config --global --add safe.directory '/.archon/workspaces/*' && \
gosu appuser git config --global --add safe.directory '/.archon/worktrees' && \
gosu appuser git config --global --add safe.directory '/.archon/worktrees/*'
# Copy entrypoint script (fixes volume permissions, drops to appuser)
# sed strips Windows CRLF in case .gitattributes eol=lf was bypassed
COPY docker-entrypoint.sh /usr/local/bin/
RUN sed -i 's/\r$//' /usr/local/bin/docker-entrypoint.sh \
&& chmod +x /usr/local/bin/docker-entrypoint.sh
# Default port (matches .env.example PORT=3000)
EXPOSE 3000
ENTRYPOINT ["docker-entrypoint.sh"]