The first open-source harness builder for AI coding. Make AI coding deterministic and repeatable.
Rasmus Widing 33d31c44f1
fix: lock workflow runs by working_path (#1036, #1188 part 2) (#1212)
* fix: lock workflow runs by working_path (#1036, #1188 part 2)

Both bugs reduce to the same primitive: there's no enforced lock on
working_path, so two dispatches that resolve to the same filesystem
location can race. The DB row is the lock token; pending/running/paused
are "lock held"; terminal statuses release.

Changes:

- getActiveWorkflowRunByPath includes `pending` (with 5-min stale-orphan
  age window), accepts excludeId + selfStartedAt, and orders by
  (started_at ASC, id ASC) for a deterministic older-wins tiebreaker.
  Eliminates the both-abort race where two near-simultaneous dispatches
  with similar timestamps could mutually abort each other.

- Move the executor's guard call site to AFTER workflowRun is finalized
  (preCreated, resumed, or freshly created). This guarantees we always
  have self-ID + started_at to pass to the lock query.

- On guard fire after row creation: mark self as 'cancelled' so we don't
  leave a zombie pending row that would then become its own lock holder.

- New error message includes workflow name, duration, short run id, and
  three concrete next-action commands (status / cancel / different
  branch). Replaces the vague "Workflow already running".

- Resume orphan fix: when executor activates a resumable run, mark the
  orchestrator's pre-created row as 'cancelled'. Without this, every
  resume leaks a pending row that would block the user's own
  back-to-back resume until the 5-min stale window.

- New formatDuration helper for the error message (8 unit tests).
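
The lock rule in the changes above can be modeled in memory. This is a rough sketch, not the project's SQL: `RunRow`, `findLockHolder`, and the terminal status names other than 'cancelled' are hypothetical, and breaking timestamp ties on `id` is an assumption drawn from the stated (started_at ASC, id ASC) ordering.

```typescript
// In-memory model of the working_path lock; the real check is a DB query.
type RunStatus = "pending" | "running" | "paused" | "completed" | "failed" | "cancelled";

interface RunRow {
  id: string;
  workingPath: string;
  status: RunStatus;
  startedAt: number; // epoch milliseconds
}

const STALE_PENDING_MS = 5 * 60 * 1000; // 5-min stale-orphan window

// Returns the row that blocks `self`, or undefined if `self` may proceed.
function findLockHolder(rows: RunRow[], self: RunRow, now: number): RunRow | undefined {
  return rows
    // Same path, never self.
    .filter((r) => r.workingPath === self.workingPath && r.id !== self.id)
    // "Lock held" statuses; pending rows only count while fresh.
    .filter(
      (r) =>
        r.status === "running" ||
        r.status === "paused" ||
        (r.status === "pending" && now - r.startedAt <= STALE_PENDING_MS)
    )
    // Older wins: only rows that sort strictly before self can block it.
    .filter(
      (r) =>
        r.startedAt < self.startedAt ||
        (r.startedAt === self.startedAt && r.id < self.id)
    )
    .sort((a, b) => a.startedAt - b.startedAt || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0))[0];
}
```

With two dispatches that share a timestamp, only the one with the larger id sees a holder, so exactly one of them aborts.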

Tests:

- 5 new tests in db/workflows.test.ts: pending in active set, age window,
  excludeId exclusion, tiebreaker SQL shape, ordering.
- 5 new tests in executor.test.ts: self-id passed to query, self-cancel
  on guard fire, new message format, resume orphan cancellation,
  resume proceeds even if orphan cancel fails.
- Updated 2 executor-preamble tests for new structural behavior
  (row-then-guard, new message format).
- 8 new tests for formatDuration.

Deferred (kept scope tight):
- Worktree-layer advisory lockfile (residual #1188.2 microsecond race
  where both dispatches reach provider.create — bounded by git's own
  atomicity for `worktree add`).
- Startup cleanup of pre-existing stale pending rows (5-min age window
  makes them harmless).
- DB partial UNIQUE constraint migration (code-only is sufficient).

Fixes #1036
Fixes #1188 (part 2)

* fix: SQLite Date binding + UTC timestamp parse for path lock guard

Two issues found during E2E smoke testing:

1. bun:sqlite rejects Date objects as bindings ("Binding expected
   string, TypedArray, boolean, number, bigint or null"). Serialize
   selfStartedAt to ISO string before passing — PostgreSQL accepts
   ISO strings for TIMESTAMPTZ comparison too.

2. SQLite returns datetimes as plain strings without timezone suffix
   ("YYYY-MM-DD HH:MM:SS"), and JS new Date() parses such strings as
   local time. The blocking message was showing "running 3h" for
   workflows started seconds ago in a UTC+3 timezone.

   Added parseDbTimestamp helper that:
   - Returns Date.getTime() unchanged for Date inputs (PG path)
   - Treats SQLite-style strings as UTC by appending Z

   Used at both call sites: the lock query (selfStartedAt) and the
   blocking message duration.
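
A minimal sketch of what such a helper might look like, assuming it returns epoch milliseconds. The real signature and edge-case handling may differ, and swapping the space separator for "T" before parsing is an addition here to keep `Date` parsing on the ISO format.

```typescript
// Sketch only: not the project's actual parseDbTimestamp.
function parseDbTimestamp(value: Date | string): number {
  // PostgreSQL path: the driver already returns a Date.
  if (value instanceof Date) return value.getTime();

  // Use the ISO "T" separator so parsing does not depend on engine leniency.
  const normalized = value.replace(" ", "T");

  // An explicit zone ("Z" or a +/-HH:MM offset) is respected as written.
  const hasZone = /(Z|[+-]\d{2}:\d{2})$/i.test(normalized);

  // SQLite's bare "YYYY-MM-DD HH:MM:SS" is UTC, so append "Z".
  return new Date(hasZone ? normalized : normalized + "Z").getTime();
}
```

Parsing the bare string without the "Z" would interpret it as local time, which is what produced the "running 3h" reading in a UTC+3 timezone.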

Tests:
- 4 new tests in duration.test.ts for parseDbTimestamp covering
  Date input, SQLite UTC interpretation, explicit Z, and explicit
  +/-HH:MM offsets.
- Updated workflows.test.ts assertion for ISO serialization.

E2E smoke verified end-to-end:
- Sanity (single dispatch) succeeds.
- Two concurrent --no-worktree dispatches: one wins, one blocked
  with actionable message showing correct "Xs" duration.
- Resume + back-to-back resume both succeed (orphan correctly
  cancelled when resume activates).

* fix: address review — resume timestamp, lock-leak paths, status copy

CodeRabbit review on #1212 surfaced three real correctness gaps:

CRITICAL — resumeWorkflowRun preserved historical started_at, letting
a resumed row sort ahead of a currently-active holder in the lock
query's older-wins tiebreaker. Two active workflows could end up on
the same working_path. Fix: refresh started_at to NOW() in
resumeWorkflowRun. Original creation time is recoverable from
workflow_events history if needed for analytics.

MAJOR — lock-leak failure paths:
- If resumeWorkflowRun() throws, the orchestrator's pre-created row
  was left as 'pending' until the 5-min stale window. Fix: cancel
  preCreatedRun in the resume catch.
- If getActiveWorkflowRunByPath() throws, workflowRun (possibly
  already promoted to 'running' via resume) was left active with no
  auto-cleanup. Fix: cancel workflowRun in the guard catch.

MINOR — the blocking message always said "running" but the lock
query returns running, paused, AND fresh-pending rows. Telling a
user to "wait for it to finish" on a paused run (waiting on user
approval) would block them indefinitely. Fix: status-aware copy:
- paused: "paused waiting for user input" + approve/reject actions
- pending: "starting" verb
- running: keep current
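
One way the status-aware copy could be structured, as a sketch. `describeHolder` is a hypothetical name, and the wording beyond the phrases quoted above ("paused waiting for user input", "starting") is assumed, not the project's actual strings.

```typescript
// Statuses that count as "lock held" in the path lock query.
type ActiveStatus = "pending" | "running" | "paused";

// Hypothetical helper: picks a verb and next action per holder status.
function describeHolder(status: ActiveStatus, duration: string): string {
  switch (status) {
    case "paused":
      // A paused run never finishes on its own, so do not tell the user to wait.
      return `paused waiting for user input (${duration}); approve or reject it to continue`;
    case "pending":
      return `starting (${duration})`;
    case "running":
      return `running ${duration}; wait for it to finish or cancel it`;
  }
}
```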

Tests:
- New: resume refreshes started_at (asserts SQL contains
  `started_at = NOW()`)
- New: cancels preCreatedRun when resumeWorkflowRun throws
- New: cancels workflowRun when guard query throws
- New: paused message uses approve/reject actions, NOT "wait"
- New: pending message uses "starting" verb
- New: running message uses default copy
- Updated: existing tests for new error string ("already active"
  reflects status-aware semantics, not just "running")

Note: the error string changed from "already running on this path"
to "already active on this path (status)". It is surfaced via
getResult().error and is not printed to users directly.

* fix: SQLite tiebreaker dialect bug + paired self struct + UX polish

CodeRabbit second review found one critical issue and several polish
items not addressed in 008013da.

CRITICAL — SQLite tiebreaker silently broken under default deployment.
SQLite stores started_at as TEXT "YYYY-MM-DD HH:MM:SS" (space sep).
Our ISO param is "YYYY-MM-DDTHH:MM:SS.mmmZ" (T sep). SQLite compares
text lexically: char 11 is space (0x20) in column vs T (0x54) in param,
so EVERY column value lex-sorts before EVERY ISO param. Result:
`started_at < $param` is always TRUE regardless of actual time. In
true concurrent dispatches, both sides see each other as "older" and
both abort — defeating the older-wins guarantee under SQLite, which
is the default deployment.

Fix: dialect-aware comparison in getActiveWorkflowRunByPath:
  - PostgreSQL: `started_at < $3::timestamptz` (TIMESTAMPTZ + cast)
  - SQLite: `datetime(started_at) < datetime($3)` (forces chronological
    via SQLite's date/time functions)

Documented with reproducer tests in adapters/sqlite.test.ts: lexical
returns wrong answer for "2026-04-14 12:00:00" < "2026-04-14T10:00:00Z";
datetime() returns correct answer.
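
The same trap can be reproduced outside SQLite with plain string comparison. The sketch below mirrors the reproducer values above in TypeScript; it is not run against SQLite, and `toMillis` is a hypothetical stand-in for what `datetime()` achieves.

```typescript
const stored = "2026-04-14 12:00:00"; // SQLite TEXT column value (UTC)
const param = "2026-04-14T10:00:00Z"; // ISO parameter, two hours earlier

// Lexical comparison: " " (0x20) sorts before "T" (0x54), so this is true.
const lexicalSaysOlder = stored < param;

// Chronological comparison: parse both as UTC instants first.
const toMillis = (s: string): number => {
  const t = s.replace(" ", "T");
  return Date.parse(t.endsWith("Z") ? t : t + "Z");
};
const actuallyOlder = toMillis(stored) < toMillis(param);

console.log(lexicalSaysOlder, actuallyOlder); // true false
```

The text comparison reports the 12:00 row as older than a 10:00 parameter, which is the wrong answer the dialect-aware query avoids.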

Type design — collapse paired params into struct.
`excludeId` and `selfStartedAt` had to travel together (tiebreaker
references both) but were two independent optionals — future callers
could pass one without the other and silently degrade. Replaced with
a single `self?: { id: string; startedAt: Date }` to make the
paired-or-nothing invariant structural.
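
A small sketch of why the struct helps. `SelfRef` and `lockQueryParams` are hypothetical names used for illustration; only the `self?: { id: string; startedAt: Date }` shape comes from the change itself.

```typescript
// The paired values travel as one optional object.
interface SelfRef {
  id: string;
  startedAt: Date;
}

// Builds positional query parameters. With `self` present, both the
// exclude id and the tiebreaker timestamp are always bound together.
function lockQueryParams(path: string, self?: SelfRef): string[] {
  return self
    ? // Serialize to an ISO string, since bun:sqlite rejects Date bindings.
      [path, self.id, self.startedAt.toISOString()]
    : [path];
}
```

A caller can no longer pass an id without a timestamp: `lockQueryParams("/repo", { id: "run-1" })` fails to type-check.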

formatDuration(0) consistency.
Old: `if (ms <= 0) return '0s'` — special-cased 0ms despite the
"sub-second rounds up to 1s" comment. Fixed to `ms < 0` so 0ms
returns '1s' (a run that just started in the same DB second should
display as active, not literal zero).
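
A sketch of a helper with the boundary behavior described above. Only the `ms < 0` guard and the rounding of 0ms and sub-second values up to '1s' come from this change; the multi-unit output format ("1m 5s", "3h 0m") is an assumption.

```typescript
// Sketch only: the real formatDuration's output format may differ.
function formatDuration(ms: number): string {
  // Negative durations (clock skew) clamp to zero.
  if (ms < 0) return "0s";

  // 0ms and anything under a second display as 1s, so a run that
  // started in the same DB second still reads as active.
  const seconds = Math.max(1, Math.floor(ms / 1000));
  if (seconds < 60) return `${seconds}s`;

  const minutes = Math.floor(seconds / 60);
  if (minutes < 60) return `${minutes}m ${seconds % 60}s`;

  const hours = Math.floor(minutes / 60);
  return `${hours}h ${minutes % 60}m`;
}
```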

Comment fix: "We acquired the lock via createWorkflowRun" was
misleading — createWorkflowRun creates a row; the lock is determined
later by the query.

Log context: added cwd to workflow.guard_self_cancel_failed and
pendingRunId to db_active_workflow_check_failed so operators can
correlate leaked rows.

Doc fixes:
- /workflow abandon doc said "marks as failed" — actually 'cancelled'
- database.md "Prevents concurrent workflow execution" → accurate
  description of path-based lock with stale-pending tolerance

Test additions:
- 3 SQLite-direct tests in adapters/sqlite.test.ts proving the
  lexical-vs-chronological bug and the datetime() fix
- Guard self-cancel update throw still surfaces failure to user

Signature change rippled through:
- IWorkflowStore.getActiveWorkflowRunByPath now takes (path, self?)
- All internal callers updated
2026-04-14 15:19:38 +03:00
.archon fix: prevent worktree isolation bypass via prompt and git-level adoption (#1198) 2026-04-14 09:44:12 +03:00
.claude docs: consolidate Claude guidance into CLAUDE.md 2026-04-12 20:21:16 +03:00
.github fix(ci): remove 'run' from bun --filter command in release workflow 2026-04-10 15:19:34 +03:00
.husky chore: Add pre-commit hook to prevent formatting drift (#226) (#229) 2026-01-14 13:54:20 +02:00
assets docs: add Archon logo and polish README header 2026-04-07 13:01:25 -05:00
auth-service feat(web): make workflow builder Node Library panel resizable (#837) 2026-03-27 00:14:37 +02:00
deploy deploy: harden cloud-init with archon user, swap, and fixes (#981) 2026-04-08 12:38:27 +03:00
homebrew chore: update Homebrew formula for v0.3.6 2026-04-12 09:19:27 +00:00
migrations fix(env): detect and refuse target-repo .env with sensitive keys (#1036) 2026-04-08 09:43:47 +03:00
packages fix: lock workflow runs by working_path (#1036, #1188 part 2) (#1212) 2026-04-14 15:19:38 +03:00
scripts fix: sync all workspace versions from root and automate in release skill 2026-04-10 15:31:43 +03:00
.dockerignore feat(config): add user-extensible Docker customization templates 2026-04-06 15:26:43 +03:00
.env.example feat: Phase 2 — community-friendly provider registry system (#1195) 2026-04-13 21:27:11 +03:00
.gitattributes fix(docker): CRLF entrypoint + Windows Docker Desktop docs 2026-04-06 15:21:03 +03:00
.gitignore feat(config): add user-extensible Docker customization templates 2026-04-06 15:26:43 +03:00
.lintstagedrc.json feat: Runtime loading of default commands/workflows (#324) 2026-01-21 23:08:23 +02:00
.prettierignore refactor: move docs site to packages/docs-web as workspace member 2026-04-06 11:09:32 +03:00
.prettierrc Unit test fixing for Windows and updated validate-2 command 2026-01-03 16:28:35 -06:00
bun.lock refactor: extract providers from @archon/core into @archon/providers (#1137) 2026-04-13 09:21:36 +03:00
bunfig.toml feat: Phase 1 - Monorepo structure with @archon/core and @archon/server packages (#311) 2026-01-20 19:08:38 +02:00
Caddyfile.example feat(docker): complete Docker deployment setup (#756) 2026-03-26 15:02:04 +02:00
CHANGELOG.md fix: extend worktree ownership guard to resolver adoption paths (#1206) 2026-04-14 12:10:19 +03:00
CLAUDE.md feat: Phase 2 — community-friendly provider registry system (#1195) 2026-04-13 21:27:11 +03:00
CONTRIBUTING.md feat: prepare for open-source migration to coleam00/Archon 2026-04-04 10:47:22 -05:00
docker-compose.override.example.yml fix(config): address review findings for Docker customization templates 2026-04-06 15:26:43 +03:00
docker-compose.yml chore: fix remaining references and update README for open-source launch 2026-04-07 08:03:13 -05:00
docker-entrypoint.sh feat(docker): complete Docker deployment setup (#756) 2026-03-26 15:02:04 +02:00
Dockerfile refactor: extract providers from @archon/core into @archon/providers (#1137) 2026-04-13 09:21:36 +03:00
Dockerfile.user.example chore: fix remaining references and update README for open-source launch 2026-04-07 08:03:13 -05:00
eslint.config.mjs refactor: extract providers from @archon/core into @archon/providers (#1137) 2026-04-13 09:21:36 +03:00
LICENSE feat: prepare for open-source migration to coleam00/Archon 2026-04-04 10:47:22 -05:00
package-lock.json Fix: Status command displays isolation_env_id (#88) (#89) 2025-12-17 12:12:31 +02:00
package.json Release 0.3.6 2026-04-12 12:16:49 +03:00
README.md feat: add archon serve command for one-command web UI install (#1011) 2026-04-10 13:33:47 +03:00
SECURITY.md feat: prepare for open-source migration to coleam00/Archon 2026-04-04 10:47:22 -05:00
tsconfig.json feat: Phase 5 - CLI binary distribution (#325) 2026-01-21 23:51:51 +02:00

Archon

The first open-source harness builder for AI coding. Make AI coding deterministic and repeatable.


Archon is a workflow engine for AI coding agents. Define your development processes as YAML workflows - planning, implementation, validation, code review, PR creation - and run them reliably across all your projects.

What Dockerfiles did for infrastructure and GitHub Actions did for CI/CD, Archon does for AI coding workflows. Think n8n, but for software development.

Why Archon?

When you ask an AI agent to "fix this bug", what happens depends on the model's mood. It might skip planning. It might forget to run tests. It might write a PR description that ignores your template. Every run is different.

Archon fixes this. Encode your development process as a workflow. The workflow defines the phases, validation gates, and artifacts. The AI fills in the intelligence at each step, but the structure is deterministic and owned by you.

  • Repeatable - Same workflow, same sequence, every time. Plan, implement, validate, review, PR.
  • Isolated - Every workflow run gets its own git worktree. Run 5 fixes in parallel with no conflicts.
  • Fire and forget - Kick off a workflow, go do other work. Come back to a finished PR with review comments.
  • Composable - Mix deterministic nodes (bash scripts, tests, git ops) with AI nodes (planning, code generation, review). The AI only runs where it adds value.
  • Portable - Define workflows once in .archon/workflows/, commit them to your repo. They work the same from CLI, Web UI, Slack, Telegram, or GitHub.

What It Looks Like

Here's an example of an Archon workflow that plans, implements in a loop until tests pass, gets your approval, then creates the PR:

# .archon/workflows/build-feature.yaml
nodes:
  - id: plan
    prompt: "Explore the codebase and create an implementation plan"

  - id: implement
    depends_on: [plan]
    loop:                                      # AI loop - iterate until done
      prompt: "Read the plan. Implement the next task. Run validation."
      until: ALL_TASKS_COMPLETE
      fresh_context: true                      # Fresh session each iteration

  - id: run-tests
    depends_on: [implement]
    bash: "bun run validate"                   # Deterministic - no AI

  - id: review
    depends_on: [run-tests]
    prompt: "Review all changes against the plan. Fix any issues."

  - id: approve
    depends_on: [review]
    loop:                                      # Human approval gate
      prompt: "Present the changes for review. Address any feedback."
      until: APPROVED
      interactive: true                        # Pauses and waits for human input

  - id: create-pr
    depends_on: [approve]
    prompt: "Push changes and create a pull request"
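
The depends_on edges in the workflow above imply an execution order. As a rough illustration only (Archon's executor may schedule nodes differently, and `WorkflowNode` and `executionOrder` are hypothetical names), that order can be derived with a simple topological sort:

```typescript
// Minimal shape of a node; only the fields needed for ordering.
interface WorkflowNode {
  id: string;
  depends_on?: string[];
}

// Repeatedly emits every node whose dependencies have all been emitted.
function executionOrder(nodes: WorkflowNode[]): string[] {
  const remaining = new Map<string, Set<string>>();
  nodes.forEach((n) => remaining.set(n.id, new Set(n.depends_on ?? [])));

  const order: string[] = [];
  while (remaining.size > 0) {
    const ready: string[] = [];
    remaining.forEach((deps, id) => {
      if (deps.size === 0) ready.push(id);
    });
    // Nothing is runnable: a cycle or a dependency on an unknown node.
    if (ready.length === 0) throw new Error("cycle or unknown dependency");

    ready.forEach((id) => {
      remaining.delete(id);
      order.push(id);
    });
    // Mark the emitted nodes as satisfied for everything still waiting.
    remaining.forEach((deps) => ready.forEach((id) => deps.delete(id)));
  }
  return order;
}
```

For the six nodes above this yields plan, implement, run-tests, review, approve, create-pr.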

Tell your coding agent what you want, and Archon handles the rest:

You: Use archon to add dark mode to the settings page

Agent: I'll run the archon-idea-to-pr workflow for this.
       → Creating isolated worktree on branch archon/task-dark-mode...
       → Planning...
       → Implementing (task 1/4)...
       → Implementing (task 2/4)...
       → Tests failing - iterating...
       → Tests passing after 2 iterations
       → Code review complete - 0 issues
       → PR ready: https://github.com/you/project/pull/47

Previous Version

Looking for the original Python-based Archon (task management + RAG)? It's fully preserved on the archive/v1-task-management-rag branch.

Getting Started

Most users should start with the Full Setup - it walks you through credentials, installs the Archon skill into your projects, and gives you the web dashboard.

Already have Claude Code and just want the CLI? Jump to the Quick Install.

Full Setup (5 minutes)

Clone the repo and use the guided setup wizard. This configures credentials, platform integrations, and copies the Archon skill into your target projects.

Prerequisites - Bun, Claude Code, and the GitHub CLI

Bun - bun.sh

# macOS/Linux
curl -fsSL https://bun.sh/install | bash

# Windows (PowerShell)
irm bun.sh/install.ps1 | iex

GitHub CLI - cli.github.com

# macOS
brew install gh

# Windows (via winget)
winget install GitHub.cli

# Linux (Debian/Ubuntu)
sudo apt install gh

Claude Code - claude.ai/code

# macOS/Linux/WSL
curl -fsSL https://claude.ai/install.sh | bash

# Windows (PowerShell)
irm https://claude.ai/install.ps1 | iex

git clone https://github.com/coleam00/Archon
cd Archon
bun install
claude

Then say: "Set up Archon"

The setup wizard walks you through everything: CLI installation, authentication, platform selection, and copying the Archon skill into your target repo.

Quick Install (30 seconds)

Already have Claude Code set up? Install the standalone CLI binary and skip the wizard.

macOS / Linux

curl -fsSL https://archon.diy/install | bash

Windows (PowerShell)

irm https://archon.diy/install.ps1 | iex

Homebrew

brew install coleam00/archon/archon

Start Using Archon

Once you've completed either setup path, go to your project and start working:

cd /path/to/your/project
claude

Use archon to fix issue #42
What archon workflows do I have? When would I use each one?

The coding agent handles workflow selection, branch naming, and worktree isolation for you. Projects are registered automatically the first time they're used.

Important: Always run Claude Code from your target repo, not from the Archon repo. The setup wizard copies the Archon skill into your project so it works from there.

Web UI

Archon includes a web dashboard for chatting with your coding agent, running workflows, and monitoring activity. Binary installs: run archon serve to download and start the web UI in one step. From source: ask your coding agent to run the frontend from the Archon repo, or run bun run dev from the repo root yourself.

Register a project by clicking + next to "Project" in the chat sidebar - enter a GitHub URL or local path. Then start a conversation, invoke workflows, and watch progress in real time.

Key pages:

  • Chat - Conversation interface with real-time streaming and tool call visualization
  • Dashboard - Mission Control for monitoring running workflows, with filterable history by project, status, and date
  • Workflow Builder - Visual drag-and-drop editor for creating DAG workflows with loop nodes
  • Workflow Execution - Step-by-step progress view for any running or completed workflow

Monitoring hub: The sidebar shows conversations from all platforms - not just the web. Workflows kicked off from the CLI, messages from Slack or Telegram, GitHub issue interactions - everything appears in one place.

See the Web UI Guide for full documentation.

What Can You Automate?

Archon ships with workflows for common development tasks:

| Workflow | What it does |
| --- | --- |
| archon-assist | General Q&A, debugging, exploration - full Claude Code agent with all tools |
| archon-fix-github-issue | Classify issue → investigate/plan → implement → validate → PR → smart review → self-fix |
| archon-idea-to-pr | Feature idea → plan → implement → validate → PR → 5 parallel reviews → self-fix |
| archon-plan-to-pr | Execute existing plan → implement → validate → PR → review → self-fix |
| archon-issue-review-full | Comprehensive fix + full multi-agent review pipeline for GitHub issues |
| archon-smart-pr-review | Classify PR complexity → run targeted review agents → synthesize findings |
| archon-comprehensive-pr-review | Multi-agent PR review (5 parallel reviewers) with automatic fixes |
| archon-create-issue | Classify problem → gather context → investigate → create GitHub issue |
| archon-validate-pr | Thorough PR validation testing both main and feature branches |
| archon-resolve-conflicts | Detect merge conflicts → analyze both sides → resolve → validate → commit |
| archon-feature-development | Implement feature from plan → validate → create PR |
| archon-architect | Architectural sweep, complexity reduction, codebase health improvement |
| archon-refactor-safely | Safe refactoring with type-check hooks and behavior verification |
| archon-ralph-dag | PRD implementation loop - iterate through stories until done |
| archon-remotion-generate | Generate or modify Remotion video compositions with AI |
| archon-test-loop-dag | Loop node test workflow - iterative counter until completion |
| archon-piv-loop | Guided Plan-Implement-Validate loop with human review between iterations |

Archon ships 17 default workflows. Run archon workflow list to see them, or describe what you want and the router picks the right one.

Or define your own. Default workflows are great starting points - copy one from .archon/workflows/defaults/ and customize it. Workflows are YAML files in .archon/workflows/, commands are markdown files in .archon/commands/. Same-named files in your repo override the bundled defaults. Commit them - your whole team runs the same process.

See Authoring Workflows and Authoring Commands.

Add a Platform

The Web UI and CLI work out of the box. Optionally connect a chat platform for remote access:

| Platform | Setup time | Guide |
| --- | --- | --- |
| Telegram | 5 min | Telegram Guide |
| Slack | 15 min | Slack Guide |
| GitHub Webhooks | 15 min | GitHub Guide |
| Discord | 5 min | Discord Guide |

Architecture

┌─────────────────────────────────────────────────────────┐
│  Platform Adapters (Web UI, CLI, Telegram, Slack,       │
│                    Discord, GitHub)                      │
└──────────────────────────┬──────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────┐
│                     Orchestrator                        │
│          (Message Routing & Context Management)         │
└─────────────┬───────────────────────────┬───────────────┘
              │                           │
      ┌───────┴────────┐          ┌───────┴────────┐
      │                │          │                │
      ▼                ▼          ▼                ▼
┌───────────┐  ┌────────────┐  ┌──────────────────────────┐
│  Command  │  │  Workflow  │  │    AI Assistant Clients  │
│  Handler  │  │  Executor  │  │      (Claude / Codex)    │
│  (Slash)  │  │  (YAML)    │  │                          │
└───────────┘  └────────────┘  └──────────────────────────┘
      │              │                      │
      └──────────────┴──────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────┐
│              SQLite / PostgreSQL (7 Tables)             │
│   Codebases • Conversations • Sessions • Workflow Runs  │
│    Isolation Environments • Messages • Workflow Events  │
└─────────────────────────────────────────────────────────┘

Documentation

Full documentation is available at archon.diy.

| Topic | Description |
| --- | --- |
| Getting Started | Setup guide (Web UI or CLI) |
| The Book of Archon | 10-chapter narrative tutorial |
| CLI Reference | Full CLI reference |
| Authoring Workflows | Create custom YAML workflows |
| Authoring Commands | Create reusable AI commands |
| Configuration | All config options, env vars, YAML settings |
| AI Assistants | Claude and Codex setup details |
| Deployment | Docker, VPS, production setup |
| Architecture | System design and internals |
| Troubleshooting | Common issues and fixes |

Contributing

Contributions welcome! See the open issues for things to work on.

Please read CONTRIBUTING.md before submitting a pull request.

License

MIT