Archon

mirror of https://github.com/coleam00/Archon synced 2026-04-21 21:47:53 +00:00

Author	SHA1	Message	Date
Rasmus Widing	779f9af63e	Fix: Add stale workflow cleanup and defense-in-depth error handling (#237) * Fix: Add stale workflow cleanup and defense-in-depth error handling Problem: Workflows could get stuck in "running" state indefinitely when the async generator disconnected but the AI subprocess continued working. This blocked new workflow invocations with "Workflow already running" errors. Root cause: No cleanup mechanism existed for workflows that failed to complete due to disconnection between the executor and the Claude SDK. Solution (defense-in-depth): 1. Activity-based staleness detection: Workflows inactive for 15+ minutes are auto-failed when a new workflow is triggered on the same conversation 2. Top-level error handling: All errors in workflow execution are caught and the workflow is properly marked as failed (prevents stuck state) 3. Manual cancel command: /workflow cancel lets users force-fail stuck workflows immediately Changes: - Add last_activity_at column via migration for staleness tracking - Add updateWorkflowActivity() to track activity during execution - Add staleness check before blocking concurrent workflows - Wrap workflow execution in try-catch to ensure failure is recorded - Add /workflow cancel subcommand to command handler - Update test to match new error handling behavior Fixes #232 * docs: Add /workflow cancel command to documentation * Improve error handling and add comprehensive tests for stale workflow cleanup Error handling improvements: - Add workflow ID and error context to updateWorkflowActivity logs - Add stack trace, error name, and cause to top-level catch block - Separate DB failure recording from file logging for clearer error messages - Add try-catch around staleness cleanup with user-facing error message - Check sendCriticalMessage return value and log when user not notified Test coverage additions: - Add staleness detection tests (stale vs non-stale, fallback to started_at) - Add /workflow cancel command tests - Add updateWorkflowActivity function tests (including non-throwing behavior) All 845 tests pass, type-check clean, lint clean.	2026-01-15 21:31:38 +02:00
Rasmus Widing	f0dec0cbb9	fix: Update router prompt and apply Prettier formatting - Fix router to not hardcode "review-pr" workflow name, instead directing AI to check the available workflow list for PR review workflows - Apply Prettier auto-formatting to multiple files	2026-01-14 11:59:50 +02:00
Rasmus Widing	8191db836b	feat: Add parallel block execution for workflows (#217) * feat: Add parallel block execution for workflows - Add SingleStep, ParallelBlock, and WorkflowStep types - Extend workflow parser to handle parallel: blocks in YAML - Implement executeParallelBlock() for concurrent step execution - Refactor executeStep into executeStepInternal for reusability - Update main execution loop to handle parallel blocks - Add 8 comprehensive tests for parallel block parsing - Add logging functions for parallel block events - Maintain backward compatibility with existing workflows Each parallel step runs as an independent Claude Code agent with its own fresh session, all working on the same worktree. Steps inside parallel blocks execute concurrently using Promise.all(), enabling 2-5x faster execution for parallel-safe workflows like code reviews. Resolves #205 * fix: Address PR #217 review feedback for parallel block execution Fixes based on comprehensive code review: Type Errors Fixed: - Added type guards (isSingleStep) before accessing .command on WorkflowStep - Removed unused executeStep function (dead code) - Removed unnecessary type assertions in type guards Parser Bugs Fixed: - Fixed nested parallel detection to check raw input before parsing - Fixed invalid command rejection to fail entire parallel block if any step invalid Error Handling Restored: - Added logWorkflowError() calls for parallel and sequential failure paths - Added logParallelBlockStart/Complete calls for workflow logging Test Improvements: - Fixed step notification format from "Step N" to "Step N/M" - Added 5 new tests for parallel block execution covering: - Basic parallel execution - Parallel failure handling - Sequential + parallel mix workflows - Notification format verification - Fresh session isolation for parallel steps Type Design Improvements: - Made type guards mutually exclusive (isSingleStep checks !('parallel' in step)) - Removed unnecessary type assertions after 'in' checks All 828 tests pass, type-check passes, no lint errors. * fix: Address PR #217 review feedback for parallel block execution Critical fixes: - Restore try-catch around updateWorkflowRun to prevent transient DB errors from crashing workflows (regression fix) - Update documentation to reflect wait-for-all behavior (not fail-fast) Important fixes: - Report ALL parallel failures in error message, not just the first one - Add error handling for DB operations in failure path to ensure user notification even when DB is unavailable - Replace `as any` casts with proper type guards in loader tests - Make StepDefinition a type alias for SingleStep (removes duplication) Test improvements: - Add test for workflows with multiple parallel blocks - Add test for all-parallel-steps-fail scenario - Add explicit backward compatibility test for step-only workflows Loader improvements: - Aggregate validation errors for better debugging - all errors are now collected and logged together instead of one at a time * fix: Add missing try-catch for sequential step failure and session test - Wrap failWorkflowRun in try-catch for sequential step failures (matches parallel block failure path for consistency) - Add test verifying session reset after parallel block completes (next sequential step correctly gets fresh session)	2026-01-13 13:22:00 +02:00
Rasmus Widing	34dce11185	Fix: Workflow executor missing GitHub issue context (#211) (#212) * Investigate issue #211: Workflow executor missing GitHub context Root cause: issueContext built in GitHub adapter and used for routing, but not passed to workflow executor. Context lost during workflow simplification refactor (commit `0352067`). Fix requires threading issueContext through orchestrator -> workflow routing -> executor -> variable substitution. Pattern already exists in command system (orchestrator.ts:473-476). * Investigate issue #211: Workflow executor missing GitHub context * Fix: Workflow executor missing GitHub issue context (#211) When workflows were triggered on GitHub issues/PRs, the issue context (title, body, labels) was built but never passed to the workflow executor. This caused AI to ask clarifying questions instead of executing workflows with the provided context. Changes: - Added issueContext parameter throughout workflow execution chain - Threaded context from orchestrator → routing → executor → steps - Added variable substitution support ($CONTEXT, $EXTERNAL_CONTEXT, $ISSUE_CONTEXT) - Appended context to prompts following existing command system pattern - Stored context in WorkflowRun metadata for session persistence Fixes #211 * Fix: Add missing issueContext parameter to executeLoopWorkflow Self-code-review caught critical bug where executeLoopWorkflow function used issueContext variable without receiving it as a parameter. This would cause compilation failure and runtime error for any loop-based workflow triggered from GitHub. Changes: - Added issueContext parameter to executeLoopWorkflow function signature - Passed issueContext argument at call site in executeWorkflow This completes the context threading for ALL workflow execution paths (both step-based and loop-based workflows). * Archive implementation report for issue #211 * Fix PR review findings: test coverage, silent failures, and double-context - Fix CI: Update workflows.test.ts to expect 5 parameters (metadata) - Fix silent failure: Clear $CONTEXT variables when issueContext is undefined to avoid sending literal "$CONTEXT" to AI - Fix double-context: Only append context if not already substituted via $CONTEXT variables (prevents duplicate context in prompts) - Add comprehensive tests for issueContext handling: - Step workflow with context passing and $CONTEXT substitution - Loop workflow with context passing and $ISSUE_CONTEXT substitution - Metadata storage verification - Edge case: clearing variables when no context provided - Add JSDoc documentation for issueContext parameters - Introduce SubstitutionResult type for cleaner tracking of context usage * Simplify workflow context substitution with helper function - Extract CONTEXT_VAR_PATTERN as module-level constant (single compilation) - Add buildPromptWithContext() helper to eliminate duplication - Simplify substituteWorkflowVariables() with chained replacements - Reduce code by 5 lines while improving readability * Address PR review findings: regex safety, error handling, tests, and docs Important fixes: - Fix regex lastIndex hazard by creating fresh regex instances for each operation - Add user warning when loop workflow metadata tracking fails (database issues) - Add JSON.stringify validation in createWorkflowRun to catch serialization errors Test improvements: - Add test for context with special regex characters ($, .*, [a-z]+, etc.) - Add test for multiple context variables in same prompt Documentation: - Add @param tags to substituteWorkflowVariables() JSDoc - Expand buildPromptWithContext() JSDoc with all 5 parameters documented - Enhance context variable clearing log with structured data	2026-01-13 12:20:26 +02:00
Rasmus Widing	1bf8af87b1	Fix: Detect and block concurrent workflow execution (#192) (#196) * Investigate issue #192: Detect and block concurrent workflow execution * Fix: Detect and block concurrent workflow execution (#192) When multiple workflow triggers are posted on the same issue, each one was starting a separate workflow execution, leading to duplicate work, wasted API tokens, and potential duplicate PRs. Changes: - Add concurrency check before createWorkflowRun() in executeWorkflow() - Use existing getActiveWorkflowRun() to query for active workflows - Send rejection message when workflow already running - Update test mocks to properly handle getActiveWorkflowRun() queries Fixes #192 * Archive investigation for issue #192	2026-01-13 10:58:18 +02:00
Rasmus Widing	592ded2733	Fix: Use code formatting for workflow/command names (#156) (#161) * Investigate issue #156: Add code formatting for workflow/command names Created comprehensive investigation artifact analyzing the enhancement request to use backticks for workflow and command names in bot messages. Assessment: - Priority: MEDIUM (improves clarity, doesn't block functionality) - Complexity: LOW (simple string formatting in 2 files + tests) - Confidence: HIGH (all locations identified, pattern established) Changes required: - src/workflows/executor.ts: 4 message templates - src/handlers/command-handler.ts: workflow list formatting - src/workflows/executor.test.ts: test expectations Artifact: .archon/artifacts/issues/issue-156.md * Fix: Use code formatting for workflow/command names (#156) Workflow and command names in bot messages were shown in plain text or bold formatting, making them hard to distinguish from prose. This reduces readability and is inconsistent with how commands are shown elsewhere (e.g., /help uses backticks). Changes: - Wrap workflow names in backticks in workflow start/complete/failure messages - Wrap command names in backticks in step notifications - Wrap workflow/command names in backticks in /workflows list - Update all test expectations to match new formatting Fixes #156 * Archive investigation for issue #156	2026-01-12 15:26:28 +02:00
Rasmus Widing	b3af7bd098	Add emoji status indicators to workflow messages (#155) (#160) * Investigate issue #155: Add emoji status indicators to workflow messages * Add emoji status indicators to workflow messages (#155) Workflow status messages now include visual emoji indicators for instant recognition: 🚀 start, ⏳ progress, ✅ success, ❌ failure. This improves UX across all platforms without changing logic. Changes: - Add 🚀 emoji to workflow start messages - Add ⏳ emoji to step progress messages - Add ✅ emoji to workflow completion messages - Add ❌ emoji to workflow failure messages (both step and database errors) - Update all tests to expect emoji prefixes Fixes #155 * Archive investigation for issue #155	2026-01-12 15:23:42 +02:00
Rasmus Widing	e47795aa6b	Fix: Skip step notification for single-step workflows (#154) (#159) * Investigate issue #154: Skip step notification for single-step workflows * Fix: Skip step notification for single-step workflows (#154) Single-step workflows like 'assist' or 'review-pr' were showing "Step 1/1" notifications which add no information since the workflow start message already indicates what's running. This creates unnecessary noise for users. Changes: - Add conditional check in executor.ts to only send step notifications for multi-step workflows - Update test to verify single-step workflows skip "Step 1/1" notification - Multi-step workflows continue to show progress with step notifications Fixes #154 * Archive investigation for issue #154	2026-01-12 15:21:40 +02:00
Rasmus Widing	84995d6abc	Fix: Remove redundant workflow completion message on GitHub (#158) (#162) * Investigate issue #158: Remove redundant workflow completion message for GitHub * Fix: Remove redundant workflow completion message on GitHub (#158) After workflows post their artifacts to GitHub issues, a separate "Workflow complete" comment was creating redundant notifications. Since GitHub uses batch mode, the artifact itself signals completion, making the extra comment unnecessary noise. Changes: - Add platform check in workflow executor to suppress completion message for GitHub - Keep completion message for streaming platforms (Telegram, Slack, Discord) - Add tests for platform-specific completion message behavior - Error messages remain unchanged (still sent for all platforms) Fixes #158 * Archive investigation for issue #158 * Address PR review feedback - Fix inaccurate comment: Changed from "streaming platforms" to "non-GitHub platforms" since Slack/Discord default to batch mode - Add structured context to suppression log (workflowName, workflowId, conversationId) for better debugging - Add explicit tests for Slack and Discord platforms * Simplify platform completion tests with it.each() Consolidate 4 near-identical test cases into single parameterized test: - telegram, slack, discord: should send completion message - github: should suppress completion message Reduces test code from 97 lines to 35 lines while maintaining coverage.	2026-01-12 15:21:23 +02:00
Rasmus Widing	e2b53e7a65	feat: Add Ralph-style autonomous iteration loops to workflow engine (#168) * Fix outdated command loading documentation * feat: Add Ralph-style autonomous iteration loops to workflow engine Enable workflows to iterate autonomously until a completion signal is detected (e.g., `<promise>COMPLETE</promise>`) or max iterations reached. Changes: - Add LoopConfig type with until signal, max_iterations, fresh_context - Extend WorkflowDefinition to support loop + prompt (mutually exclusive with steps) - Add executeLoopWorkflow function with completion signal detection - Update loader to parse and validate loop configuration - Add ralph.yaml example workflow demonstrating PRD implementation pattern - Add 22 new tests covering loop execution and parsing Loop workflows allow developers to run long-running tasks (like PRD implementation) without manual phase transitions, following the pattern popularized by Geoffrey Huntley. * Add test-loop workflow for Ralph loop testing * feat: Update worktree config to copy .archon files - Include all .archon files in worktree copy (not just .archon/ralph) - Update ralph.yaml with dynamic path detection for feature directories - Add PR creation step at completion - Use {prd-dir} variable for flexible path handling * feat: Add ralph-prd command for generating PRD files Creates structured PRD files for Ralph autonomous loops: - Outputs to .archon/ralph/{feature-slug}/ directory - Generates prd.md (full context) and prd.json (story tracking) - Feature-based naming to avoid conflicts between projects - Guides user through requirements gathering phases * feat: Add ralph-fresh workflow with fresh_context: true Fresh context mode for Ralph loops where each iteration: - Starts with a clean slate (no memory of previous iterations) - Re-reads progress.txt, prd.json, prd.md to understand current state - Relies on progress.txt "Codebase Patterns" section for learnings - Better for long loops and avoiding context confusion * refactor: Rename ralph workflows with explicit descriptions - Rename ralph.yaml → ralph-stateful.yaml (persistent memory mode) - Update ralph-fresh.yaml description for clarity - Both workflows now require explicit invocation - Clear INVOKE WITH / NOT FOR / HOW IT WORKS / TRADE-OFFS sections - Neither is "default" - user must choose explicitly * chore: Set max_iterations to 10 on ralph workflows * fix: Address PR review feedback for loop workflow - Wrap database metadata update in try-catch to prevent misleading errors - Add dropped message tracking and user warning in loop workflow - Make plain signal detection more restrictive (end of output or own line) - Add context (line number, preview) to YAML parse errors - Make max iterations error message actionable with suggestions - Remove unnecessary type assertions after discriminated union refactor * chore: Remove unrelated plan files from branch	2026-01-12 11:13:21 +02:00
Rasmus Widing	471ac592a9	Improve error handling in workflow engine (#150) * Investigate error handling improvements (#128, #126, #129) Add combined investigation artifact for batched error handling issues: - #128: loadCommandPrompt error specificity - #126: AI client error classification for user hints - #129: Move isValidCommandName to parse-time validation Artifact includes detailed implementation plan with code changes. * Improve error handling in workflow engine (#128, #126, #129) The workflow engine caught and handled errors but lost important context. Users saw generic messages like "Command prompt not found" without knowing the specific cause (security rejection vs empty file vs network timeout). Changes: - Add LoadCommandResult discriminated union for specific error reasons - Return reason: 'invalid_name' | 'empty_file' | 'not_found' from loadCommandPrompt - Add user-friendly hints for AI client errors based on error classification - Move command name validation to parse time in loader (fail fast) - Export isValidCommandName for use in loader - Add tests for parse-time command validation Fixes #128, fixes #126, fixes #129 * Archive investigation artifact for #128, #126, #129 * Address PR review feedback for error handling improvements - Add database error handling in executeWorkflow with try-catch blocks - Fix loadRepoConfig to log non-ENOENT errors (YAML syntax, permission denied) - Extend LoadCommandResult type with permission_denied and read_error reasons - Update loadCommandPrompt to return specific errors for EACCES vs other errors - Add JSDoc documenting content non-empty invariant - Re-export LoadCommandResult from types/index.ts for consistency - Add test for 403 permission error hint - Add unit tests for isValidCommandName function	2026-01-07 22:10:40 +02:00
Cole Medin	61af6fba65	Linting and unit test fixes	2026-01-03 18:59:07 -06:00
Rasmus Widing	a4da1649b6	Improve workflow router to always invoke a workflow (#135) * Improve workflow router to always invoke a workflow - Add $ARGUMENTS substitution to workflow executor so commands receive user message - Create assist workflow as catch-all fallback for questions, debugging, one-off tasks - Create review-pr workflow wrapper for code reviews - Update router prompt to require workflow selection (no text-only responses) - Enhance workflow descriptions to serve as routing instructions - Add tests for $ARGUMENTS substitution and multi-line descriptions * Fix re-triggering loop: remove @archon from command output The investigate-issue command was outputting "@archon implement issue #X" which triggered the bot to process its own output as a new mention.	2026-01-03 14:54:35 +02:00
Rasmus Widing	68bccfcabc	Wrap platform.sendMessage calls in try-catch in executor (#132) * Wrap platform.sendMessage calls in try-catch in workflow executor Add safeSendMessage() helper that catches and logs errors without rethrowing, ensuring platform message failures don't crash workflows or leave database state inconsistent. Fixes #127 * Improve error handling: classify errors, retry critical messages, track dropped messages - Classify errors as TRANSIENT, FATAL, or UNKNOWN to handle them appropriately - Only suppress transient/unknown errors; fatal auth errors are rethrown - Add sendCriticalMessage() with retry logic for failure/completion notifications - Track dropped messages in streaming mode and warn user - Enhance logging with workflowId, stepIndex, platformType, stack trace - Add 8 new tests covering error classification, retry, and edge cases * Simplify error handling in workflow executor - Extract error patterns to module-level constants (FATAL_PATTERNS, TRANSIENT_PATTERNS) - Add matchesPattern helper for cleaner pattern matching - Consolidate error logging into logSendError helper function - Extract delay helper for cleaner exponential backoff - Remove redundant timestamp fields (logging systems add these automatically) - Update test to match simplified logging structure No functionality changes - all error handling behavior preserved.	2026-01-02 15:40:23 +02:00
Rasmus Widing	37cf615207	Fix workflow engine type safety and routing - Change step→command in types and YAML - Add StepResult discriminated union for proper error handling - Remove global workflow registry (pass as parameters) - Rewrite router with /invoke-workflow pattern and restrictive prompt - Add path validation to prevent directory traversal - Move .archon/steps/ to .archon/commands/ - Add error handling to db/workflows.ts - Update tests for new patterns	2026-01-02 12:09:02 +02:00
Rasmus Widing	024f33708d	Expand test suite for workflow engine - Add logger.test.ts with 15 tests for JSONL logging - Add db/workflows.test.ts with 15 tests for database operations - Add edge case tests to loader.test.ts, router.test.ts, executor.test.ts - Fix test pollution by mocking at connection level instead of module level - All 641 tests pass	2026-01-02 11:31:04 +02:00
Rasmus Widing	759cb303a9	Add workflow engine for multi-step AI orchestration Implement a prompt orchestrator that chains prompts together for sequential AI execution with artifacts passed between steps: - Add workflow YAML parser for .archon/workflows/ discovery - Create step executor with context management (clearContext flag) - Implement router response parser for WORKFLOW: name detection - Add JSONL event logging for observability - Create /workflow list and /workflow reload commands - Add database table for workflow run tracking Workflows enable automated multi-step development tasks like plan -> implement -> create-pr with each step receiving context from previous steps.	2026-01-02 11:31:04 +02:00
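
Note on commit 34dce11185: the context-variable handling it describes (substitute `$CONTEXT`, `$EXTERNAL_CONTEXT`, and `$ISSUE_CONTEXT`; clear them when no context exists; append the context only if it was not already substituted; avoid the regex `lastIndex` hazard) might look roughly like the sketch below. This is not the repository's code. `substituteContextSketch`, `buildPromptSketch`, the shape of `SubstitutionResultSketch`, and the blank-line separator are all assumptions made for illustration.

```typescript
// Hypothetical sketch; names and shapes are illustrative, not taken from the repo.
// A non-global pattern carries no lastIndex state, so .test() is safe to reuse.
const CONTEXT_VAR = /\$(?:EXTERNAL_CONTEXT|ISSUE_CONTEXT|CONTEXT)\b/;

interface SubstitutionResultSketch {
  prompt: string;
  contextSubstituted: boolean;
}

function substituteContextSketch(
  template: string,
  issueContext?: string
): SubstitutionResultSketch {
  const hasVariable = CONTEXT_VAR.test(template);
  // Build a fresh global regex for every call so no lastIndex is shared.
  // The replacer function keeps "$&" or "$1" inside the context literal.
  const prompt = template.replace(
    new RegExp(CONTEXT_VAR.source, "g"),
    () => issueContext ?? ""
  );
  return {
    prompt,
    contextSubstituted: hasVariable && issueContext !== undefined,
  };
}

function buildPromptSketch(template: string, issueContext?: string): string {
  const { prompt, contextSubstituted } = substituteContextSketch(
    template,
    issueContext
  );
  // Append only when context exists and was not already placed via a variable.
  if (issueContext !== undefined && !contextSubstituted) {
    return `${prompt}\n\n${issueContext}`;
  }
  return prompt;
}
```

Passing a replacer function rather than a replacement string is the design choice that matters here: issue bodies can contain `$&` or `$1`, which `String.prototype.replace` would otherwise interpret as special replacement patterns.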

17 commits
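
Note on commit 779f9af63e: the activity-based staleness check it describes (a run counts as stale after 15+ minutes without activity, falling back to `started_at` when no activity was recorded) could be sketched as follows. Only `last_activity_at` and `started_at` appear in the log above; the `WorkflowRunRow` shape, the `isStaleSketch` helper, and the injectable `now` parameter are assumptions for illustration.

```typescript
// Hypothetical row shape; only last_activity_at and started_at are named in the log.
interface WorkflowRunRow {
  id: string;
  started_at: Date;
  last_activity_at: Date | null;
}

// 15 minutes, per the threshold stated in the commit message.
const STALE_AFTER_MS = 15 * 60 * 1000;

function isStaleSketch(run: WorkflowRunRow, now: Date = new Date()): boolean {
  // Fall back to started_at when no activity timestamp was ever written.
  const lastSeen = run.last_activity_at ?? run.started_at;
  return now.getTime() - lastSeen.getTime() >= STALE_AFTER_MS;
}
```

Taking `now` as a parameter keeps the check deterministic in tests, which fits the stale vs non-stale and fallback cases the commit lists.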