mirror of https://github.com/coleam00/Archon synced 2026-04-21 13:37:41 +00:00

feat: DAG workflow engine with parallel execution and conditional branching (#450 )

* feat: add DAG workflow engine with parallel execution and conditional branching

Adds a third workflow execution mode (`nodes:`) alongside `steps:` and `loop:`.
DAG workflows support explicit dependency edges, parallel layer execution via
Promise.allSettled, conditional branching with `when:` expressions, join semantics
via `trigger_rule`, structured JSON output via Claude SDK `outputFormat`, and
upstream output capture via `$node_id.output` substitution.

- New: `DagNode`, `DagWorkflow`, `TriggerRule`, `NodeOutput`, `NodeState` types
- New: `condition-evaluator.ts` — pure `evaluateCondition` for `when:` expressions
- New: `dag-executor.ts` — topological sort, Promise.allSettled parallel layers,
  output capture, trigger rule evaluation, per-node provider/model resolution
- Updated: `loader.ts` — detect `nodes:` key, validate node graph, Kahn cycle detection
- Updated: `executor.ts` — route DAG workflows to dag-executor before loop dispatch
- Updated: `logger.ts` / `event-emitter.ts` — node_start/complete/skip/error events
- Updated: `workflow-bridge.ts` — SSE events for dag_node state changes
- Updated: `AssistantRequestOptions` — added `outputFormat` for Claude structured output
- Updated: `claude.ts` — thread `outputFormat` into SDK Options
- Tests: 37 new tests (condition-evaluator + dag-executor topological sort, trigger
  rules, loader cycle detection, invalid DAG rejection, valid DAG parsing)

* docs: document DAG workflow mode (nodes:) added in phase 1

Add full documentation for the new `nodes:` execution mode:
- docs/authoring-workflows.md: add third workflow type section,
  full DAG schema reference (node fields, trigger_rule, when:
  conditions, output_format, $nodeId.output substitution), a
  DAG example workflow, and update the variable table and summary
- CLAUDE.md: add nodes:/DAG bullet points to the Workflows section
- README.md: add nodes: example alongside steps: and loop:, update
  key design patterns to mention DAG mode

* fix: address DAG workflow engine review findings

Critical bugs:
- DB workflow status never updated after DAG completion (completeWorkflowRun/failWorkflowRun now called)
- resolveNodeProviderAndModel throws silently swallowed by Promise.allSettled — now caught and returned as failed node outputs
- substituteNodeOutputRefs JSON parse failure was silent — now logged as warn

Important fixes:
- Surface unparseable when: conditions to user via safeSendMessage (fail-open preserved)
- Missing upstream nodes treated as failed in checkTriggerRule instead of silently filtered out
- Config load failure in loadCommandPrompt upgraded from warn to error
- Circular import executor ↔ dag-executor broken via new command-validation.ts module
- Remove defensive "should never happen" else branch in executeNodeInternal (DagNode discriminated union guarantees it)

Type improvements:
- DagNode → CommandNode | PromptNode discriminated union (command/prompt mutually exclusive at type level)
- NodeOutput → discriminated union (error: string required on failed, absent on others)
- TRIGGER_RULES constant and isTriggerRule() added to types.ts, deduplicating loader.ts local definitions
- isDagWorkflow simplified to Array.isArray(workflow.nodes)
- output_format array guard in parseDagNode now rejects arrays and null values

Code quality:
- Replace all void workflowEventDb.createWorkflowEvent() with .catch() error logging
- Fix o.error ?? 'unknown' to o.state === 'failed' ? o.error : 'unknown' (type-safe)
- Export substituteNodeOutputRefs for unit testing

Tests (7 new):
- condition-evaluator: number and boolean JSON field coercion
- dag-executor: none_failed_min_one_success with all-skipped deps, nodes+loop conflict,
  invalid trigger_rule rejection, substituteNodeOutputRefs (3 cases), all-nodes-skipped mechanism

* docs: fix two inaccuracies in DAG workflow documentation

- README: "Nodes without depends_on run in parallel" was misleading —
  root nodes run concurrently with each other in the same layer, but a
  single root node doesn't run "in parallel" with anything. Reworded to
  "are in the first layer and run concurrently with each other".

- authoring-workflows.md: Variable Substitution section intro said
  "Loop prompts and DAG node prompts/commands support these variables"
  but step-based workflows also support the same variables via
  substituteWorkflowVariables in executor.ts. Updated to say all
  workflow types.

* fix: address PR #450 review findings in DAG workflow engine

Correctness:
- Remove throw from !anyCompleted path to prevent double workflow_failed
  emission; add safeSendMessage and return instead
- Guard lastSequentialSessionId assignment against undefined overwrite

Type safety:
- Narrow workflowProvider from string to 'claude' | 'codex' in
  resolveNodeProviderAndModel and executeDagWorkflow signatures
- Remove unsafe 'as claude | codex' cast
- Add compile-time assertion that NodeOutput covers all NodeState values

Silent failure surfacing:
- Pre-execution node failure now notifies user via safeSendMessage
- Unexpected Promise.allSettled rejection notifies user and logs layerIdx
- completeWorkflowRun DB failure notifies user of potential inconsistency
- Codex node with output_format now warns user (not just server log)
- Make resolveNodeProviderAndModel async to support the above

Dead code:
- Remove unused 'export type { MergedConfig }' re-export

Comments:
- Update safeSendMessage/substituteWorkflowVariables/loadCommandPrompt
  TODOs to reflect Rule of Three is now met
- Fix executeNodeInternal docstring to mention context:'fresh' nodes
- Fix evaluateCondition @param: "settled" not "completed" upstreams
- Fix NodeOutput doc: "JSON-encoded string from the SDK"

Tests (7 new):
- substituteNodeOutputRefs: unknown node ref resolves to empty string
- checkTriggerRule: absent upstream synthesised as failed (x2)
- buildTopologicalLayers: two independent chains share layers correctly
- evaluateCondition: valid expression returns parsed: true

2026-02-18 15:13:22 +02:00

26 KiB

Raw Blame History

Authoring Workflows for Archon

This guide explains how to create workflows that orchestrate multiple commands into automated pipelines. Read Authoring Commands first - workflows are built from commands.

What is a Workflow?

A workflow is a YAML file that defines a sequence of commands to execute. Workflows enable:

Multi-step automation: Chain multiple AI agents together
Artifact passing: Output from step 1 becomes input for step 2
Autonomous loops: Iterate until a condition is met

name: fix-github-issue
description: Investigate and fix a GitHub issue end-to-end
steps:
  - command: investigate-issue
  - command: implement-issue
    clearContext: true

File Location

Workflows live in .archon/workflows/ relative to the working directory:

.archon/
├── workflows/
│   ├── my-workflow.yaml
│   └── review/
│       └── full-review.yaml    # Subdirectories work
└── commands/
    └── [commands used by workflows]

Archon discovers workflows recursively - subdirectories are fine. If a workflow file fails to load (syntax error, validation failure), it's skipped and the error is reported via /workflow list.

CLI vs Server: The CLI reads workflow files from wherever you run it (sees uncommitted changes). The server reads from the workspace clone at ~/.archon/workspaces/owner/repo/, which only syncs from the remote before worktree creation. If you edit a workflow locally but don't push, the server won't see it.

Three Workflow Types

1. Step-Based Workflows

Execute commands in sequence:

name: feature-development
description: Plan, implement, and create PR for a feature

steps:
  - command: create-plan
  - command: implement-plan
    clearContext: true
  - command: create-pr
    clearContext: true

2. Loop-Based Workflows

Iterate until completion signal:

name: autonomous-implementation
description: Keep iterating until all tests pass

loop:
  until: COMPLETE
  max_iterations: 10
  fresh_context: false

prompt: |
  Read the plan and implement the next incomplete item.
  Run tests after each change.

  When ALL items pass validation, output:
  <promise>COMPLETE</promise>

3. DAG-Based Workflows (nodes:)

Execute nodes in dependency order with parallel layers and conditional branching:

name: classify-and-fix
description: Classify issue type, then run the appropriate fix path

nodes:
  - id: classify
    command: classify-issue
    output_format:
      type: object
      properties:
        type:
          type: string
          enum: [BUG, FEATURE]
      required: [type]

  - id: investigate
    command: investigate-bug
    depends_on: [classify]
    when: "$classify.output.type == 'BUG'"

  - id: plan
    command: plan-feature
    depends_on: [classify]
    when: "$classify.output.type == 'FEATURE'"

  - id: implement
    command: implement-changes
    depends_on: [investigate, plan]
    trigger_rule: none_failed_min_one_success

Nodes without depends_on run immediately. Nodes in the same topological layer run concurrently via Promise.allSettled. Skipped nodes (failed when: condition or trigger_rule) propagate their skipped state to dependants.

Step-Based Workflow Schema

# Required
name: workflow-name              # Unique identifier (kebab-case)
description: |                   # Multi-line description
  What this workflow does.
  When to use it.
  What it produces.

# Optional
provider: claude                 # 'claude' or 'codex' (default: from config)
model: sonnet                    # Model override (default: from config)
modelReasoningEffort: medium     # Codex only: 'minimal' | 'low' | 'medium' | 'high' | 'xhigh'
webSearchMode: live              # Codex only: 'disabled' | 'cached' | 'live'
additionalDirectories:           # Codex only: Additional directories to include
  - /absolute/path/to/other/repo

# Required for step-based
steps:
  - command: step-one            # References .archon/commands/step-one.md

  - command: step-two
    clearContext: true           # Start fresh AI session (default: false)

  - parallel:                    # Run multiple commands concurrently
      - command: review-code
      - command: review-comments
      - command: review-tests

Step Options

Field	Type	Default	Description
`command`	string	required	Command name (without `.md`)
`clearContext`	boolean	`false`	Start fresh session for this step

When to Use `clearContext: true`

Use fresh context when:

The previous step produced an artifact the next step should read
You want to avoid context pollution
The next step has a completely different focus

steps:
  - command: investigate-issue    # Explores codebase, writes artifact
  - command: implement-issue      # Reads artifact, implements fix
    clearContext: true            # Fresh start - works from artifact only

Loop-Based Workflow Schema

name: autonomous-loop
description: |
  Iterate until completion signal detected.
  Good for: PRD implementation, test-fix cycles, iterative refinement.  

# Optional (same as step-based workflows)
provider: claude                 # 'claude' or 'codex' (default: from config)
model: sonnet                    # Model override (default: from config)
modelReasoningEffort: medium     # Codex only
webSearchMode: live              # Codex only
additionalDirectories:           # Codex only
  - /absolute/path/to/other/repo

# Required for loop-based
loop:
  until: COMPLETE                # Signal to detect in AI output
  max_iterations: 10             # Safety limit (fails if exceeded)
  fresh_context: false           # true = fresh session each iteration

# Required for loop-based
prompt: |
  Your instructions here.

  Variables available:
  - $WORKFLOW_ID - unique run identifier
  - $USER_MESSAGE - original trigger
  - $ARGUMENTS - same as $USER_MESSAGE
  - $BASE_BRANCH - base branch (config or auto-detected)
  - $CONTEXT - GitHub issue/PR context (if available)

  When done, output: <promise>COMPLETE</promise>

Loop Options

Field	Type	Default	Description
`until`	string	required	Completion signal to detect
`max_iterations`	number	required	Safety limit
`fresh_context`	boolean	`false`	Fresh session each iteration

Completion Signal Detection

The AI signals completion by outputting:

<promise>COMPLETE</promise>

Or (simpler but less reliable):

COMPLETE

The <promise> tags are recommended - they're case-insensitive and harder to accidentally trigger.

When to Use `fresh_context`

Setting	Use When	Tradeoff
`false`	Short loops (<5 iterations), need memory	Context grows each iteration
`true`	Long loops, stateless work	Must track state in files

Stateful example (memory preserved):

loop:
  fresh_context: false  # AI remembers previous iterations

Stateless example (progress in files):

loop:
  fresh_context: true   # AI starts fresh, reads progress from disk

prompt: |
  Read progress from .archon/progress.json
  Implement the next incomplete item.
  Update progress file.
  When all complete: <promise>COMPLETE</promise>

DAG-Based Workflow Schema

# Required
name: workflow-name
description: |
  What this workflow does.  

# Optional (same as step/loop workflows)
provider: claude
model: sonnet
modelReasoningEffort: medium     # Codex only
webSearchMode: live              # Codex only

# Required for DAG-based
nodes:
  - id: classify                 # Unique node ID (used for dependency refs and $id.output)
    command: classify-issue      # Loads from .archon/commands/classify-issue.md
    output_format:               # Optional: enforce structured JSON output (Claude only)
      type: object
      properties:
        type:
          type: string
          enum: [BUG, FEATURE]
      required: [type]

  - id: investigate
    command: investigate-bug
    depends_on: [classify]       # Wait for classify to complete
    when: "$classify.output.type == 'BUG'"  # Skip if condition is false

  - id: plan
    command: plan-feature
    depends_on: [classify]
    when: "$classify.output.type == 'FEATURE'"

  - id: implement
    command: implement-changes
    depends_on: [investigate, plan]
    trigger_rule: none_failed_min_one_success  # Run if at least one dep succeeded

  - id: inline-node
    prompt: "Summarize the changes made in $implement.output"  # Inline prompt (no command file)
    depends_on: [implement]
    context: fresh               # Force fresh session for this node
    provider: claude             # Per-node provider override
    model: haiku                 # Per-node model override

Node Fields

Field	Type	Default	Description
`id`	string	required	Unique node identifier. Used in `depends_on`, `when:`, and `$id.output` substitution
`command`	string	—	Command name to load from `.archon/commands/`. Mutually exclusive with `prompt`
`prompt`	string	—	Inline prompt string. Mutually exclusive with `command`
`depends_on`	string[]	`[]`	Node IDs that must complete before this node runs
`when`	string	—	Condition expression. Node is skipped if false
`trigger_rule`	string	`all_success`	Join semantics when multiple upstreams exist
`output_format`	object	—	JSON Schema for structured output. Claude only — Codex nodes ignore this
`context`	`'fresh'`	—	Force a fresh AI session for this node
`provider`	`'claude'` \| `'codex'`	inherited	Per-node provider override
`model`	string	inherited	Per-node model override

`trigger_rule` Values

Value	Behavior
`all_success`	Run only if all upstream deps completed successfully (default)
`one_success`	Run if at least one upstream dep completed successfully
`none_failed_min_one_success`	Run if no deps failed AND at least one succeeded (skipped deps are ok)
`all_done`	Run when all deps are in a terminal state (completed, failed, or skipped)

`when:` Condition Syntax

Conditions use string equality against upstream node outputs:

when: "$nodeId.output == 'VALUE'"
when: "$nodeId.output != 'VALUE'"
when: "$nodeId.output.field == 'VALUE'"    # JSON dot notation for output_format nodes

Uses $nodeId.output to reference the full output string of a completed node
Use $nodeId.output.field to access a JSON field (for output_format nodes)
Invalid expressions default to true (fail open — node runs rather than silently skipping)
Skipped nodes propagate their skipped state to dependants

`$node_id.output` Substitution

In node prompts and commands, reference the output of any upstream node:

nodes:
  - id: classify
    command: classify-issue

  - id: fix
    command: implement-fix
    depends_on: [classify]
    # The command file can use $classify.output or $classify.output.field

Variable substitution order:

Standard variables ($WORKFLOW_ID, $USER_MESSAGE, $ARTIFACTS_DIR, etc.)
Node output references ($nodeId.output, $nodeId.output.field)

`output_format` for Structured JSON

Use output_format to enforce JSON output from a Claude node. This uses the Claude Agent SDK's outputFormat option with a JSON Schema:

nodes:
  - id: classify
    command: classify-issue
    output_format:
      type: object
      properties:
        type:
          type: string
          enum: [BUG, FEATURE]
        severity:
          type: string
          enum: [low, medium, high]
      required: [type]

Only supported for Claude nodes. Codex nodes log a warning and ignore output_format
The output is captured as a JSON string and available via $classify.output (full JSON) or $classify.output.type (field access)
Use output_format when downstream nodes need to branch on specific values via when:

Parallel Execution

Run multiple commands concurrently within a step:

steps:
  - command: setup-scope          # Creates shared context

  - parallel:                     # These run at the same time
      - command: review-code
      - command: review-comments
      - command: review-security

  - command: synthesize-reviews   # Combines all review artifacts
    clearContext: true

Parallel Execution Rules

Each parallel command gets a fresh session - no context sharing
All commands must complete before workflow continues
All failures are reported - not just the first one
Shared state via artifacts - commands read/write to known paths

Pattern: Coordinator + Parallel Agents

name: comprehensive-review
steps:
  # Step 1: Coordinator creates scope artifact
  - command: create-review-scope

  # Step 2: Parallel agents read scope, write findings
  - parallel:
      - command: code-review-agent
      - command: comment-quality-agent
      - command: test-coverage-agent

  # Step 3: Synthesizer reads all findings, posts summary
  - command: synthesize-review
    clearContext: true

The coordinator writes to .archon/artifacts/reviews/pr-{n}/scope.md. Each agent reads scope, writes to {category}-findings.md. The synthesizer reads all findings and produces final output.

The Artifact Chain

Workflows work because artifacts pass data between steps:

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│ Step 1          │     │ Step 2          │     │ Step 3          │
│ investigate     │     │ implement       │     │ create-pr       │
│                 │     │                 │     │                 │
│ Reads: input    │     │ Reads: artifact │     │ Reads: git diff │
│ Writes: artifact│────▶│ Writes: code    │────▶│ Writes: PR      │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │
         ▼                       ▼
  .archon/artifacts/      src/feature.ts
  issues/issue-123.md     src/feature.test.ts

Designing Artifact Flow

When creating a workflow, plan the artifact chain:

Step	Reads	Writes
`investigate-issue`	GitHub issue via `gh`	`.archon/artifacts/issues/issue-{n}.md`
`implement-issue`	Artifact from step 1	Code files, tests
`create-pr`	Git diff	GitHub PR

Each command must know:

Where to find its input
Where to write its output
What format to use

Model Configuration

Workflows can configure AI models and provider-specific options at the workflow level.

Configuration Priority

Model and options are resolved in this order:

Workflow-level - Explicit settings in the workflow YAML
Config defaults - assistants.* in .archon/config.yaml
SDK defaults - Built-in defaults from Claude/Codex SDKs

Provider and Model

name: my-workflow
provider: claude     # 'claude' or 'codex' (default: from config)
model: sonnet        # Model override (default: from config assistants.claude.model)

Claude models:

sonnet - Fast, balanced (recommended)
opus - Powerful, expensive
haiku - Fast, lightweight
claude-* - Full model IDs (e.g., claude-3-5-sonnet-20241022)
inherit - Use model from previous session

Codex models:

Any OpenAI model ID (e.g., gpt-5.3-codex, o5-pro)
Cannot use Claude model aliases

Codex-Specific Options

name: my-workflow
provider: codex
model: gpt-5.3-codex
modelReasoningEffort: medium    # 'minimal' | 'low' | 'medium' | 'high' | 'xhigh'
webSearchMode: live             # 'disabled' | 'cached' | 'live'
additionalDirectories:
  - /absolute/path/to/other/repo
  - /path/to/shared/library

Model reasoning effort:

minimal, low - Fast, cheaper
medium - Balanced (default)
high, xhigh - More thorough, expensive

Web search mode:

disabled - No web access (default)
cached - Use cached search results
live - Real-time web search

Additional directories:

Codex can access files outside the codebase
Useful for shared libraries, documentation repos
Must be absolute paths

Model Validation

Workflows are validated at load time:

Provider/model compatibility checked
Invalid combinations fail with clear error messages
Validation errors shown in /workflow list

Example validation error:

Model "sonnet" is not compatible with provider "codex"

Example: Config Defaults + Workflow Override

.archon/config.yaml:

assistants:
  claude:
    model: haiku  # Fast model for most tasks
  codex:
    model: gpt-5.3-codex
    modelReasoningEffort: low
    webSearchMode: disabled

Workflow with override:

name: complex-analysis
description: Deep code analysis requiring powerful model
provider: claude
model: opus  # Override config default (haiku) for this workflow
steps:
  - command: analyze-architecture
  - command: generate-report

The workflow uses opus instead of the config default haiku, but other settings inherit from config.

Workflow Description Best Practices

Write descriptions that help with routing and user understanding:

description: |
  Investigate and fix a GitHub issue end-to-end.

  **Use when**: User provides a GitHub issue number or URL
  **NOT for**: Feature requests, refactoring, documentation

  **Produces**:
  - Investigation artifact
  - Code changes
  - Pull request linked to issue

  **Steps**:
  1. Investigate root cause
  2. Implement fix with tests
  3. Create PR

Good descriptions include:

What the workflow does
When to use it (and when NOT to)
What it produces
High-level steps

Variable Substitution

All workflow types (steps, loop, nodes) support these variables in prompts and commands:

Variable	Description
`$WORKFLOW_ID`	Unique ID for this workflow run
`$USER_MESSAGE`	Original message that triggered workflow
`$ARGUMENTS`	Same as `$USER_MESSAGE`
`$ARTIFACTS_DIR`	Pre-created artifacts directory for this workflow run
`$BASE_BRANCH`	Base branch from config or auto-detected from repo
`$CONTEXT`	GitHub issue/PR context (if available)
`$EXTERNAL_CONTEXT`	Same as `$CONTEXT`
`$ISSUE_CONTEXT`	Same as `$CONTEXT`
`$nodeId.output`	Output of a completed upstream DAG node (DAG workflows only)
`$nodeId.output.field`	JSON field from a structured upstream node output (DAG workflows only)

Example:

prompt: |
  Workflow: $WORKFLOW_ID
  Original request: $USER_MESSAGE

  GitHub context:
  $CONTEXT

  [Instructions...]

Example Workflows

Simple Two-Step

name: quick-fix
description: |
  Fast bug fix without full investigation.
  Use when: Simple, obvious bugs.
  NOT for: Complex issues needing root cause analysis.  

steps:
  - command: analyze-and-fix
  - command: create-pr
    clearContext: true

Investigation Pipeline

name: fix-github-issue
description: |
  Full investigation and fix for GitHub issues.

  Use when: User provides issue number/URL
  Produces: Investigation artifact, code fix, PR  

steps:
  - command: investigate-issue    # Creates .archon/artifacts/issues/issue-{n}.md
  - command: implement-issue      # Reads artifact, implements fix
    clearContext: true

Parallel Review

name: comprehensive-pr-review
description: |
  Multi-agent PR review covering code, comments, tests, and security.

  Use when: Reviewing PRs before merge
  Produces: Review findings, synthesized summary  

steps:
  - command: create-review-scope

  - parallel:
      - command: code-review-agent
      - command: comment-quality-agent
      - command: test-coverage-agent
      - command: security-review-agent

  - command: synthesize-reviews
    clearContext: true

Autonomous Loop

name: implement-prd
description: |
  Autonomously implement a PRD, iterating until all stories pass.

  Use when: Full PRD implementation
  Requires: PRD file at .archon/prd.md  

loop:
  until: COMPLETE
  max_iterations: 15
  fresh_context: true       # Progress tracked in files

prompt: |
  # PRD Implementation Loop

  Workflow: $WORKFLOW_ID

  ## Instructions

  1. Read PRD from `.archon/prd.md`
  2. Read progress from `.archon/progress.json`
  3. Find the next incomplete story
  4. Implement it with tests
  5. Run validation: `bun run validate`
  6. Update progress file
  7. If ALL stories complete and validated:
     Output: <promise>COMPLETE</promise>

  ## Progress File Format

  ```json
  {
    "stories": [
      {"id": 1, "status": "complete", "validated": true},
      {"id": 2, "status": "in_progress", "validated": false}
    ]
  }

Important

Implement ONE story per iteration
Always run validation after changes
Update progress file before ending iteration


### DAG: Classify and Route

```yaml
name: classify-and-fix
description: |
  Classify issue type and run the appropriate path in parallel.

  Use when: User reports a bug or requests a feature
  Produces: Code fix (bug path) or feature plan (feature path), then PR

nodes:
  - id: classify
    command: classify-issue
    output_format:
      type: object
      properties:
        type:
          type: string
          enum: [BUG, FEATURE]
      required: [type]

  - id: investigate
    command: investigate-bug
    depends_on: [classify]
    when: "$classify.output.type == 'BUG'"

  - id: plan
    command: plan-feature
    depends_on: [classify]
    when: "$classify.output.type == 'FEATURE'"

  - id: implement
    command: implement-changes
    depends_on: [investigate, plan]
    trigger_rule: none_failed_min_one_success

  - id: create-pr
    command: create-pr
    depends_on: [implement]
    context: fresh

Test-Fix Loop

name: fix-until-green
description: |
  Keep fixing until all tests pass.
  Use when: Tests are failing and need automated fixing.  

loop:
  until: ALL_TESTS_PASS
  max_iterations: 5
  fresh_context: false      # Remember what we've tried

prompt: |
  # Fix Until Green

  ## Instructions

  1. Run tests: `bun test`
  2. If all pass: <promise>ALL_TESTS_PASS</promise>
  3. If failures:
     - Analyze the failure
     - Fix the code (not the test, unless test is wrong)
     - Run tests again

  ## Rules

  - Don't skip or delete failing tests
  - Don't modify test expectations unless they're wrong
  - Each iteration should fix at least one failure

Common Patterns

Pattern: Gated Execution

Run different paths based on conditions:

name: smart-fix
description: Route to appropriate fix strategy based on issue complexity

steps:
  - command: analyze-complexity   # Writes complexity assessment
  - command: route-to-strategy    # Reads assessment, invokes appropriate workflow
    clearContext: true

The route-to-strategy command reads the complexity artifact and can invoke sub-workflows.

Pattern: Checkpoint and Resume

For long workflows, save checkpoints:

name: large-migration
description: Multi-file migration with checkpoint recovery

steps:
  - command: create-migration-plan    # Writes plan artifact
  - command: migrate-batch-1          # Checkpoints after each batch
    clearContext: true
  - command: migrate-batch-2
    clearContext: true
  - command: validate-migration
    clearContext: true

Each batch command saves progress to an artifact, allowing recovery if the workflow fails mid-way.

Pattern: Human-in-the-Loop

Pause for human approval:

name: careful-refactor
description: Refactor with human approval at each stage

steps:
  - command: propose-refactor         # Creates proposal artifact
  # Workflow pauses here - human reviews proposal
  # Human triggers next workflow to continue:

Then a separate workflow to continue:

name: execute-refactor
steps:
  - command: execute-approved-refactor
  - command: create-pr
    clearContext: true

Debugging Workflows

Check Workflow Discovery

bun run cli workflow list

Run with Verbose Output

bun run cli workflow run {name} "test input"

Watch the streaming output to see each step.

Check Artifacts

After a workflow runs, check the artifacts:

ls -la .archon/artifacts/
cat .archon/artifacts/issues/issue-*.md

Check Logs

Workflow execution logs to:

.archon/logs/{workflow-id}.jsonl

Each line is a JSON event (step start, AI response, tool call, etc.).

Workflow Validation

Before deploying a workflow:

Test each command individually

bun run cli workflow run {workflow} "test input"

Verify artifact flow
- Does step 1 produce what step 2 expects?
- Are paths correct?
- Is the format complete?
Test edge cases
- What if the input is invalid?
- What if a step fails?
- What if an artifact is missing?
Check iteration limits (for loops)
- Is max_iterations reasonable?
- What happens when limit is hit?

Summary

Workflows orchestrate commands - YAML files that define execution order
Three types: Step-based (sequential), loop-based (iterative), and DAG-based (dependency graph)
Artifacts are the glue - Commands communicate via files, not memory
clearContext: true - Fresh session for a step, works from artifacts
Parallel execution - Step parallel: blocks and DAG nodes in the same layer both run concurrently
Loops need signals - Use <promise>COMPLETE</promise> to exit
DAG branching - when: conditions and trigger_rule control which nodes run
output_format - Enforce structured JSON output from Claude nodes for reliable branching
Test thoroughly - Each command, the artifact flow, and edge cases

26 KiB Raw Blame History