Signed-off-by: Oleg Ivaniv <me@olegivaniv.com> Co-authored-by: Albert Alises <albert.alises@gmail.com> Co-authored-by: Jaakko Husso <jaakko@n8n.io> Co-authored-by: Dimitri Lavrenük <20122620+dlavrenuek@users.noreply.github.com> Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com> Co-authored-by: Tuukka Kantola <Tuukkaa@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Mutasem Aldmour <4711238+mutdmour@users.noreply.github.com> Co-authored-by: Raúl Gómez Morales <raul00gm@gmail.com> Co-authored-by: Elias Meire <elias@meire.dev> Co-authored-by: Dimitri Lavrenük <dimitri.lavrenuek@n8n.io> Co-authored-by: Tomi Turtiainen <10324676+tomi@users.noreply.github.com> Co-authored-by: Mutasem Aldmour <mutasem@n8n.io>
20 KiB
Architecture
Overview
Instance AI is an autonomous agent embedded in every n8n instance. It provides a natural language interface to workflows, executions, credentials, and nodes — with the goal that most users never need to interact with workflows directly.
The system follows the deep agent architecture — an orchestrator with explicit planning, dynamic sub-agent delegation, observational memory, and structured prompts. The LLM controls the execution loop; the architecture provides the primitives.
The system is LLM-agnostic and designed to work with any capable language model.
System Diagram
graph TB
subgraph Frontend ["Frontend (Vue 3)"]
UI[Chat UI] --> Store[Pinia Store]
Store --> SSE[SSE Event Client]
Store --> API[Stream API Client]
end
subgraph Backend ["Backend (Express)"]
API -->|POST /instance-ai/chat/:threadId| Controller
SSE -->|GET /instance-ai/events/:threadId| EventEndpoint[SSE Endpoint]
Controller --> Service[InstanceAiService]
EventEndpoint --> EventBus[Event Bus]
end
subgraph Orchestrator ["Orchestrator Agent"]
Service --> Factory[Agent Factory]
Factory --> OrcAgent[Orchestrator]
OrcAgent --> PlanTool[Plan Tool]
OrcAgent --> DelegateTool[Delegate Tool]
OrcAgent --> DirectTools[Domain Tools]
OrcAgent --> MCPTools[MCP Tools]
OrcAgent --> Memory[Memory System]
end
subgraph SubAgents ["Dynamic Sub-Agents"]
DelegateTool -->|spawns| SubAgent1[Sub-Agent A]
DelegateTool -->|spawns| SubAgent2[Sub-Agent B]
SubAgent1 --> ToolSubset1[Tool Subset]
SubAgent2 --> ToolSubset2[Tool Subset]
end
subgraph EventSystem ["Event System"]
OrcAgent -->|publishes| EventBus
SubAgent1 -->|publishes| EventBus
SubAgent2 -->|publishes| EventBus
EventBus --> ThreadStorage[Thread Event Storage]
end
subgraph Filesystem ["Filesystem Access"]
Service -->|auto-detect| FSProvider{Provider}
FSProvider -->|bare metal| LocalFS[LocalFilesystemProvider]
FSProvider -->|container/cloud| Gateway[LocalGateway]
Gateway -->|SSE + HTTP POST| Daemon["@n8n/fs-proxy daemon"]
end
subgraph n8n ["n8n Services"]
Service --> Adapter[AdapterService]
Adapter --> WorkflowService
Adapter --> ExecutionService
Adapter --> CredentialsService
Adapter --> NodeLoader[LoadNodesAndCredentials]
end
subgraph Storage ["Storage"]
Memory --> PostgreSQL
Memory --> SQLite[LibSQL / SQLite]
ThreadStorage --> PostgreSQL
ThreadStorage --> SQLite
end
subgraph Sandbox ["Sandbox (Optional)"]
Service -->|per-thread| WorkspaceManager[Workspace Manager]
WorkspaceManager --> DaytonaSandbox[Daytona Container]
WorkspaceManager --> LocalSandbox[Local Sandbox]
DaytonaSandbox --> SandboxFS[Filesystem + execute_command]
LocalSandbox --> SandboxFS
end
subgraph MCP ["MCP Servers"]
MCPTools --> ExternalServer1[External MCP Server]
MCPTools --> ExternalServer2[External MCP Server]
end
Deep Agent Architecture
The system implements the four pillars of the deep agent pattern:
1. Explicit Planning
The orchestrator uses a plan tool to externalize its execution strategy.
Between phases of the autonomous loop, the orchestrator reviews and updates the
plan. This serves as a context engineering mechanism — writing the plan forces
structured reasoning, and reading it back prevents goal drift over long loops.
Plans are stored in thread-scoped storage (see ADR-017).
2. Dynamic Sub-Agent Composition
The orchestrator composes sub-agents on the fly via the delegate tool. Instead
of a fixed taxonomy (Builder, Debugger, Evaluator), the orchestrator specifies:
- Role — free-form description ("workflow builder", "credential validator")
- Instructions — task-specific system prompt
- Tools — subset of registered tools the sub-agent needs
Sub-agents are stateless (ADR-011), get clean context windows, and publish events directly to the event bus (ADR-014). They cannot spawn their own sub-agents.
3. Observational Memory
Mastra's observational memory compresses old messages into dense observations via background Observer and Reflector agents. Tool-heavy workloads (workflow definitions, execution results) get 5–40x compression. This prevents context degradation over 50+ step autonomous loops (see ADR-016).
4. Structured System Prompt
The orchestrator's system prompt covers delegation patterns, planning discipline, loop behavior, and tool usage guidelines. Sub-agents get focused, task-specific prompts written by the orchestrator.
Agent Hierarchy
graph TD
O[Orchestrator Agent] -->|delegate| S1[Sub-Agent: role A]
O -->|build-workflow-with-agent| S2[Builder Agent]
O -->|plan| S3[Planned Tasks]
O -->|direct| T1[list-workflows]
O -->|direct| T2[run-workflow]
O -->|direct| T3[get-execution]
O -->|direct| T4[plan]
S3 -->|kind: build-workflow| S4[Builder Agent]
S3 -->|kind: manage-data-tables| S5[Data Table Agent]
S3 -->|kind: research| S6[Research Agent]
S3 -->|kind: delegate| S7[Custom Sub-Agent]
S1 -->|tools| T5[get-execution]
S1 -->|tools| T6[get-workflow]
S2 -->|tools| T7[search-nodes]
S2 -->|tools| T8[build-workflow]
style O fill:#f9f,stroke:#333
style S1 fill:#bbf,stroke:#333
style S2 fill:#bbf,stroke:#333
style S3 fill:#ffa,stroke:#333
style S4 fill:#bbf,stroke:#333
style S5 fill:#bbf,stroke:#333
style S6 fill:#bbf,stroke:#333
style S7 fill:#bbf,stroke:#333
Orchestrator handles directly:
- Read-only queries (list-workflows, get-execution, list-credentials)
- Execution triggers (run-workflow)
- Planning (plan tool — always direct)
- Verification and credential application (verify-built-workflow, apply-workflow-credentials)
Single-task delegation (delegate, build-workflow-with-agent):
- Complex multi-step operations (building workflows, debugging failures)
- Tasks that benefit from clean context (no accumulated noise)
- Builder agent runs as a background task — returns immediately
Multi-task plans (plan tool):
- Dependency-aware task graphs with parallel execution
- Each task dispatched to a preconfigured agent (builder, data-table, research, or delegate)
- User approves the plan before execution starts
The orchestrator decides what to delegate based on complexity — simple reads stay direct, complex operations go to focused sub-agents.
Package Responsibilities
@n8n/instance-ai (Core)
The agent package — framework-agnostic business logic.
- Agent factory (
agent/) — creates orchestrator instances with tools, memory, MCP, and tool search - Sub-agent factory (
agent/) — creates stateless sub-agents with mandatory protocol and tool subsets - Orchestration tools (
tools/orchestration/) —plan,delegate,build-workflow-with-agent,update-tasks,cancel-background-task,correct-background-task,verify-built-workflow,report-verification-verdict,apply-workflow-credentials,browser-credential-setup - Domain tools (
tools/) — native tools across workflows, executions, credentials, nodes, data tables, workspace, web research, filesystem, templates, and best practices - Runtime (
runtime/) — stream execution engine, resumable streams with HITL suspension, background task manager, run state registry - Planned tasks (
planned-tasks/) — task graph coordination, dependency resolution, scheduled execution - Workflow loop (
workflow-loop/) — deterministic build→verify→debug state machine for workflow builder agents - Workflow builder (
workflow-builder/) — TypeScript SDK code parsing, validation, patching, and prompt sections - Workspace (
workspace/) — sandbox provisioning (Daytona / local), filesystem abstraction, snapshot management - Memory (
memory/) — working memory template, title generation, memory configuration - Compaction (
compaction/) — LLM-based message history summarization for long conversations - Storage (
storage/) — iteration logs, task storage, planned task storage, workflow loop storage, agent tree snapshots - MCP client (
mcp/) — manages connections to external MCP servers, schema sanitization for Anthropic compatibility - Domain access (
domain-access/) — domain gating and access tracking for external URL approval - Stream mapping (
stream/) — Mastra chunk → canonical event translation, HITL consumption - Event bus interface (
event-bus/) — publishing agent events to the thread channel - Tracing (
tracing/) — LangSmith integration for step-level observability - System prompt (
agent/) — dynamic context-aware prompt based on instance configuration - Types (
types.ts) — all shared interfaces, service contracts, and data models
This package has no dependency on n8n internals. It defines service interfaces
(InstanceAiWorkflowService, etc.) that the backend adapter implements.
packages/cli/src/modules/instance-ai/ (Backend)
The n8n integration layer.
- Module — lifecycle management, DI registration, settings exposure. Only runs on
maininstance type. - Controller — REST endpoints for messages, SSE events, confirmations, threads, credits, and gateway
- Service — orchestrates agent creation, config parsing, storage setup, planned task scheduling, background task management
- Adapter — bridges n8n services to agent interfaces, enforces RBAC permissions
- Memory service — thread lifecycle, message persistence, expiration
- Settings service — admin settings (model, MCP, sandbox), user preferences
- Event bus — in-process EventEmitter (single instance) or Redis Pub/Sub (queue mode), with thread storage for event persistence and replay (max 500 events or 2 MB per thread)
- Filesystem —
LocalFilesystemProvider(bare metal) andLocalGateway(remote daemon via SSE protocol). Auto-detected based on runtime environment (seedocs/filesystem-access.md) - Entities — TypeORM entities for thread, message, memory, snapshots, iteration logs
- Repositories — data access layer (7 TypeORM repositories)
packages/@n8n/api-types (Shared Types)
The contract between frontend and backend.
- Event schemas —
InstanceAiEventdiscriminated union,InstanceAiEventTypeenum - Agent types —
InstanceAiAgentStatus,InstanceAiAgentKind,InstanceAiAgentNode - Task types —
TaskItem,TaskListfor progress tracking - Confirmation types — approval, text input, questions, plan review payloads
- DTOs — request/response shapes for REST API
- Push types — gateway state changes, credit metering events
- Reducer —
AgentRunState,InstanceAiMessagefor frontend state machine
packages/frontend/.../instanceAi/ (Frontend)
The chat interface.
- Store — thread management, message state, agent tree rendering, SSE connection lifecycle
- Reducer — event reducer that processes SSE events into agent tree state
- SSE client — subscribes to event stream, handles reconnect with replay
- API client — REST client for messages, confirmations, threads, memory, settings
- Agent tree — renders orchestrator + sub-agent events as a collapsible tree
- Components — input, workflow preview, tool call steps, task checklist, credential setup modal, domain access approval, debug/memory panels
Key Design Decisions
1. Clean Interface Boundary
The @n8n/instance-ai package defines service interfaces, not implementations.
The backend adapter implements these against real n8n services. This means:
- The agent core is testable in isolation
- The agent core can be reused outside n8n (e.g., CLI, tests)
- Swapping the agent framework doesn't affect n8n integration
2. Agent Created Per Request
A new orchestrator instance is created for each sendMessage call. This is
intentional:
- MCP server configuration can change between requests
- User context (permissions) is request-scoped
- Memory is handled externally (storage-backed), not in-agent
- Sub-agents are created dynamically within the request lifecycle
3. Pub/Sub Streaming
The event bus decouples agent execution from event delivery:
- All agents (orchestrator + sub-agents) publish to a per-thread channel
- Frontend subscribes via SSE with
Last-Event-IDfor reconnect/replay - All events carry
runId(correlates to triggering message) andagentId - SSE events use monotonically increasing per-thread
idvalues for replay - SSE supports both
Last-Event-IDheader and?lastEventIdquery parameter - Events are persisted to thread storage regardless of transport
- No need to pipe sub-agent streams through orchestrator tool execution
- One active run per thread (additional
POST /chatis rejected while active) - Cancellation via
POST /instance-ai/chat/:threadId/cancel(idempotent)
4. Module System Integration
Instance AI uses n8n's module system (@BackendModule). This means:
- It can be disabled via
N8N_DISABLED_MODULES=instance-ai - It only runs on
maininstance type (not workers) - It exposes settings to the frontend via the module
settings()method - It has proper shutdown lifecycle for MCP connection cleanup
Runtime & Streaming
The agent runtime is built on Mastra's streaming primitives with added resumability, HITL suspension, and background task management.
Stream Execution
streamAgentRun() → agent.stream() → executeResumableStream()
├─ for each chunk: mapMastraChunkToEvent() → eventBus.publish()
├─ on suspension: wait for confirmation → agent.resumeStream() → loop
└─ return StreamRunResult {status, mastraRunId, text}
The executeResumableStream() loop consumes Mastra chunks, translates them to
canonical InstanceAiEvent schema, publishes to the event bus, and handles HITL
suspension/resume cycles. Two control modes:
- Manual — returns suspension to caller (used by the orchestrator's main run)
- Auto — waits for confirmation and resumes automatically (used by background sub-agents)
Background Task Manager
Long-running tasks (workflow builds, data table operations, research) run as background tasks with concurrency limits (default: 5 per thread). Features:
- Correction queueing — users can steer running tasks mid-flight via
correct-background-task - Cancellation — three surfaces converge: stop button, "stop that" message,
or
cancelRun(global stop) - Message enrichment — running task context is injected into the orchestrator's messages so it can reference task IDs
Run State Registry
In-memory registry of active, suspended, and pending runs per thread. Manages:
- Active run tracking (one per thread)
- Suspended run state (awaiting HITL confirmation)
- Pending confirmation resolution
- Timeout sweeping for stale suspensions
Planned Tasks & Workflow Loop
Planned Task System
The plan tool creates dependency-aware task graphs for multi-step work. Each
task has a kind that determines its executor:
| Kind | Executor | Tools |
|---|---|---|
build-workflow |
Builder agent | search-nodes, build-workflow, get-node-type-definition, etc. |
manage-data-tables |
Data table agent | All *-data-table* tools |
research |
Research agent | web-search, fetch-url |
delegate |
Custom sub-agent | Orchestrator-specified subset |
Tasks run detached as background agents. Dependencies are respected — a task
only starts when all its deps have succeeded. The plan is shown to the user
for approval before execution begins.
Workflow Loop State Machine
The workflow builder follows a deterministic state machine for the build→verify→debug cycle:
build → submit → verify → (success | needs_patch | needs_rebuild | failed_terminal)
↓ ↓ ↓
finalize patch+submit rebuild+submit
↓ ↓
verify verify
The report-verification-verdict tool feeds results into this state machine,
which returns guidance for the next action. Same failure signature twice triggers
a terminal state to prevent infinite loops.
Tool Search & Deferred Tools
To keep the orchestrator's context lean, tools are stratified into two tiers:
- Core tools (always-loaded):
plan,delegate,ask-user,web-search,fetch-url— these are directly available to the LLM - Deferred tools (behind ToolSearchProcessor): all other domain tools —
discovered on-demand via
search_toolsand activated viaload_tool
This follows Anthropic's guidance on tool search for agents with large tool sets.
The processor is configurable via disableDeferredTools flag.
MCP Integration
External MCP servers are connected via McpClientManager. Their tools are:
- Schema-sanitized for Anthropic compatibility (ZodNull → optional, discriminated unions → flattened objects, array types → recursive element fix)
- Name-checked against reserved domain tool names (prevents malicious
shadowing of tools like
run-workflow) - Separated from domain tools in the orchestrator's tool set
- Cached by config hash across agent instances
Browser MCP tools (Chrome DevTools) are excluded from the orchestrator to avoid
context bloat from screenshots. They're available to browser-credential-setup
sub-agents.
Tracing & Observability
LangSmith integration provides step-level observability:
- Agent runs — root trace spans with metadata (agent_id, thread_id, model)
- LLM steps — per-step traces with messages, reasoning, tool calls, usage, finish reason
- Sub-agent traces — child spans under parent agent runs
- Working memory traces — spans for memory preparation phase
- Synthetic tool traces — internal tools (e.g.,
updateWorkingMemory) tracked separately from LLM-invoked tools
Message Compaction
For conversations that exceed the context window, generateCompactionSummary()
creates an LLM-generated summary of the conversation history. The summary uses
a structured format (Goal, Important facts, Current state, Open issues, Next
step) and is included as a <conversation-summary> block in subsequent requests.
Domain Access Gating
The DomainAccessTracker manages per-domain approval for external URL access.
When the agent calls fetch-url, the domain is checked against the tracker.
Unapproved domains trigger a HITL confirmation with domainAccess payload,
allowing the user to approve or deny access to specific hosts.
Security Model
- Permission scoping — all operations go through n8n's RBAC permission system via the adapter (
userHasScopes()) - Credential safety — tool outputs never include decrypted secrets; credential setup uses the n8n frontend UI where secrets are handled securely
- HITL confirmation — destructive operations (delete, publish, restore) require user approval via the suspension protocol
- Domain access gating — external URL fetches require per-domain user approval
- Memory isolation — working memory is user-scoped; messages, observations, plans, and event history are thread-scoped. Cross-user isolation is enforced.
- Sub-agent containment — sub-agents cannot spawn their own sub-agents,
can only use native domain tools from the registered pool (no MCP tools), and
have low
maxSteps. A mandatory protocol prevents cascading delegation. - MCP tool isolation — MCP tools are name-checked against reserved domain tool names to prevent malicious shadowing. Schema sanitization prevents schema-based attacks.
- Sandbox isolation — when enabled, code execution runs in isolated Daytona
containers (not on the host). File writes are path-traversal protected (must
stay within workspace root). Shell paths are quoted to prevent injection.
See
docs/sandboxing.mdfor details. - Filesystem safety — read-only interface, 512KB file size cap, binary
detection, default directory exclusions (node_modules, .git, dist), symlink
escape protection when basePath is set, 30s timeout per gateway request.
See
docs/filesystem-access.mdfor the full security model. - Web research safety — SSRF protection blocks private IPs, loopback, and non-HTTP(S) schemes. Post-redirect SSRF check prevents open-redirect attacks. Fetched content is treated as untrusted.
- Module gating — disabled by default unless
N8N_INSTANCE_AI_MODELis set