mirror of
https://github.com/lobehub/lobehub
synced 2026-04-21 09:37:28 +00:00
* ♻️ refactor(acp): move agent provider to agencyConfig + restore creation entry

- Move AgentProviderConfig from chatConfig to agencyConfig.heterogeneousProvider
- Rename type from 'acp' to 'claudecode' for clarity
- Restore Claude Code agent creation entry in sidebar + menu
- Prioritize heterogeneousProvider check over gateway mode in execution flow
- Remove ACP settings from AgentChat form (provider is set at creation time)
- Add getAgencyConfigById selector for cleaner access
- Use existing agent workingDirectory instead of duplicating in provider config

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

✨ feat(acp): defer terminal events + extract model/usage per turn

Three improvements to ACP stream handling:

1. Defer agent_runtime_end/error: Previously the adapter emitted terminal events from result.type directly into the Gateway handler. The handler immediately fires fetchAndReplaceMessages which reads stale DB state (before we persist final content/tools). Fix: intercept terminal events in the executor's event loop and forward them only AFTER content + metadata has been written to DB.
2. Extract model/usage per assistant event: Claude Code sets model name and token usage on every assistant event. Adapter now emits a 'step_complete' event with phase='turn_metadata' carrying these. Executor accumulates input/output/cache tokens across turns and persists them onto the assistant message (model + metadata.totalTokens).
3. Missing final text fix: The accumulated assistant text was being written AFTER agent_runtime_end triggered fetchAndReplaceMessages, so the UI rendered stale (empty) content. Deferred terminals solve this.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

🐛 fix(acp): eliminate orphan-tool warning flicker during streaming

Root cause: LobeHub's conversation-flow parser (collectToolMessages) filters tool messages by matching `tool_call_id` against `assistant.tools[].id`.
The previous flow created tool messages FIRST, then updated assistant.tools[], which opened a brief window where the UI saw tool messages that had no matching entry in the parent's tools array — rendering them as "orphan" with a scary "请删除" ("please delete") warning to the user.

Fix: Reorder persistNewToolCalls into three phases:

1. Pre-register tool entries in assistant.tools[] (id only, no result_msg_id)
2. Create the tool messages in DB (tool_call_id matches pre-registered ids)
3. Back-fill result_msg_id and re-write assistant.tools[]

Between phase 1 and phase 3 the UI always sees consistent state: every tool message in DB has a matching entry in the parent's tools array.

Verified: orphan count stays at 0 across all sampled timepoints during streaming (vs 1+ before fix).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

🐛 fix(acp): dedupe tool_use + capture tool_result + persist result_msg_id

Three critical fixes to ACP tool-call handling, discovered via live testing:

1. **tool_use dedupe** — Claude Code stream-json previously produced 15+ duplicate tool messages per tool_call_id. The adapter now tracks emitted ids so each tool_use → exactly one tool message.
2. **tool_result content capture** — tool_result blocks live in `type: 'user'` events in Claude Code's stream-json, not in assistant events. The adapter now handles the 'user' event type and emits a new `tool_result` HeterogeneousAgentEvent which the executor consumes to call messageService.updateToolMessage() with the actual result content. Previously all tool messages had empty content.
3. **result_msg_id on assistant.tools[]** — LobeHub's parse() step links tool messages to their parent assistant turn via tools[].result_msg_id. Without it, the UI renders orphan-message warnings. The executor now captures the tool message id returned by messageService.createMessage and writes it back into the assistant.tools[] JSONB.
Also adds vitest config + 9 unit tests for the adapter covering lifecycle, content mapping, and tool_result handling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

✨ feat(acp): integrate external AI agents via ACP protocol

Adds support for connecting external AI agents (Claude Code and future agents like Codex, Kimi CLI) into LobeHub Desktop via a new heterogeneous agent layer that adapts agent-specific protocols to the unified Gateway event stream.

Architecture:

- New @lobechat/heterogeneous-agents package: pluggable adapters that convert agent-specific outputs to AgentStreamEvent
- AcpCtr (Electron main): agent-agnostic process manager with CLI presets registry, broadcasts raw stdout lines to renderer
- acpExecutor (renderer): subscribes to broadcasts, runs events through adapter, feeds into existing createGatewayEventHandler
- Tool call persistence: creates role='tool' messages via messageService before emitting tool_start/tool_end to the handler

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ♻️ refactor: rename acpExecutor to heterogeneousAgentExecutor

- Rename file acpExecutor.ts → heterogeneousAgentExecutor.ts
- Rename ACPExecutorParams → HeterogeneousAgentExecutorParams
- Rename executeACPAgent → executeHeterogeneousAgent
- Change operation type from execAgentRuntime to execHeterogeneousAgent
- Change operation label to "Heterogeneous Agent Execution"
- Change error type from ACPError to HeterogeneousAgentError
- Rename acpData/acpContext variables to heteroData/heteroContext

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ♻️ refactor: rename AcpCtr and acp service to heterogeneousAgent

Desktop side:

- AcpCtr.ts → HeterogeneousAgentCtr.ts
- groupName 'acp' → 'heterogeneousAgent'
- IPC channels: acpRawLine → heteroAgentRawLine, etc.
Renderer side:

- services/electron/acp.ts → heterogeneousAgent.ts
- ACPService → HeterogeneousAgentService
- acpService → heterogeneousAgentService
- Update all IPC channel references in executor

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🔧 chore: switch CC permission mode to bypassPermissions

Use bypassPermissions to allow Bash and other tool execution. Previously acceptEdits only allowed file edits, causing Bash tool calls to fail during CC execution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🐛 fix: don't fallback activeAgentId to empty string in AgentIdSync

Empty string '' causes chat store to have a truthy but invalid activeAgentId, breaking message routing. Pass undefined instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🐛 fix: use AI_RUNTIME_OPERATION_TYPES for loading and cancel states

stopGenerateMessage and cancelOperation were hardcoding ['execAgentRuntime', 'execServerAgentRuntime'], missing execHeterogeneousAgent. This caused:

- CC execution couldn't be cancelled via stop button
- isAborting flag wasn't set for heterogeneous agent operations

Now uses AI_RUNTIME_OPERATION_TYPES constant everywhere to ensure all AI runtime operation types are handled consistently.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ✨ feat: split multi-step CC execution into separate assistant messages

Claude Code's multi-turn execution (thinking → tool → final text) was accumulating everything onto a single assistant message, causing the final text response to appear inside the tool call message.
Changes:

- ClaudeCodeAdapter: detect message.id changes and emit stream_end + stream_start with newStep flag at step boundaries
- heterogeneousAgentExecutor: on newStep stream_start, persist previous step's content, create a new assistant message, reset accumulators, and forward the new message ID to the gateway handler

This ensures each LLM turn gets its own assistant message, matching how Gateway mode handles multi-step agent execution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🐛 fix: fix multi-step CC execution and add DB persistence tests

Adapter fixes:

- Fix false step boundary on first assistant after init (ghost empty message)

Executor fixes:

- Fix parentId chain: new-step assistant points to last tool message
- Fix content contamination: sync snapshot of content accumulators on step boundary
- Fix type errors (import path, ChatToolPayload casts, sessionId guard)

Tests:

- Add ClaudeCodeAdapter unit tests (multi-step, usage, flush, edge cases)
- Add ClaudeCodeAdapter E2E test (full multi-step session simulation)
- Add registry tests
- Add executor DB persistence tests covering:
  - Tool 3-phase write (pre-register → create → backfill)
  - Tool result content + error persistence
  - Multi-step parentId chain (assistant → tool → assistant)
  - Final content/reasoning/model/usage writes
  - Sync snapshot preventing cross-step contamination
  - Error handling with partial content persistence
  - Full multi-step E2E (Read → Write → text)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🔧 chore: add orphan tool regression tests and debug trace

- Add orphan tool regression tests for multi-turn tool execution
- Add __HETERO_AGENT_TRACE debug instrumentation for event flow capture

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ✨ feat: support image attachments in CC via stream-json stdin

- Main process downloads files by ID from cloud (GET {domain}/f/{fileId})
- Local disk cache at lobehub-storage/heteroAgent/files/ (by fileId)
- When fileIds present, switches to --input-format stream-json + stdin pipe
- Constructs user message with text + image content blocks (base64)
- Pass fileIds through executor → service → IPC → controller

Closes LOBE-7254

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ♻️ refactor: pass imageList instead of fileIds for CC vision support

- Use imageList (with url) instead of fileIds — Main downloads from URL directly
- Cache by image id at lobehub-storage/heteroAgent/files/
- Only images (not arbitrary files) are sent to CC via stream-json stdin

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🐛 fix: read imageList from persisted DB message instead of chatUploadFileList

chatUploadFileList is cleared after sendMessageInServer, so tempImages was empty by the time the executor ran. Now reads imageList from the persisted user message in heteroData.messages instead.

Also removes debug console.log/console.error statements.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* update i18n

* 🐛 fix: prevent orphan tool UI by deferring handler events during step transition

Root cause: when a CC step boundary occurs, the adapter produces [stream_end, stream_start(newStep), stream_chunk(tools_calling)] in one batch. The executor deferred stream_start via persistQueue but forwarded stream_chunk synchronously — handler received tools_calling BEFORE stream_start, dispatching tools to the OLD assistant message → UI showed orphan tool warning.

Fix: add pendingStepTransition flag that defers ALL handler-bound events through persistQueue until stream_start is forwarded, guaranteeing correct event ordering.
Also adds:

- Minimal regression test in gatewayEventHandler confirming correct ordering
- Multi-tool per turn regression test from real LOBE-7240 trace
- Data-driven regression replaying 133 real CC events from regression.json

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ✨ feat: add lab toggle for heterogeneous agent (Claude Code)

- Add enableHeterogeneousAgent to UserLabSchema + defaults (off by default)
- Add selector + settings UI toggle (desktop only)
- Gate "Claude Code Agent" sidebar menu item behind the lab setting
- Remove regression.json (no longer needed)
- Add i18n keys for the lab feature

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🐛 fix: gate heterogeneous agent execution behind isDesktop check

Without this, web users with an agent that has heterogeneousProvider config would hit the CC execution path and fail (no Electron IPC).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ♻️ refactor: rename tool identifier from acp-agent to claude-code

Also update operation label to "External agent running".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ✨ feat: add CLI agent detectors for system tools settings

Detect agentic coding CLIs installed on the system:

- Claude Code, Codex, Gemini CLI, Qwen Code, Kimi CLI, Aider
- Uses validated detection (which + --version keyword matching)
- New "CLI Agents" category in System Tools settings
- i18n for en-US and zh-CN

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🐛 fix: fix token usage over-counting in CC execution

Two bugs fixed:

1. Adapter: same message.id emitted duplicate step_complete(turn_metadata) for each content block (thinking/text/tool_use) — all carry identical usage. Now deduped by message.id, only emits once per turn.
2. Executor: CC result event contains authoritative session-wide usage totals but was ignored.
   Now adapter emits step_complete(result_usage) from the result event, executor uses it to override accumulated values.

Fixes LOBE-7261

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🔧 chore: gitignore cc-stream.json and .heterogeneous-tracing/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🔧 chore: untrack .heerogeneous-tracing/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ✨ feat: wire CC session resume for multi-turn conversations

Reads `ccSessionId` from topic metadata and passes it as `resumeSessionId` into the heterogeneous-agent executor, which forwards it into the Electron main-process controller. `sendPrompt` then appends `--resume <id>` so the next turn continues the same Claude Code session instead of starting fresh.

After each run, the CC init-event session_id (captured by the adapter) is persisted back onto the topic so the chain survives page reloads.

Also stops killing the session in `finally` — it needs to stay alive for subsequent turns; cleanup happens on topic deletion or app quit.

* 🐛 fix: record cache token breakdown in CC execution metadata

The prior token-usage fix only wrote totals — `inputCachedTokens`, `inputWriteCacheTokens` and `inputCacheMissTokens` were dropped, so the pricing card rendered zero cached/write-cache tokens even though CC had reported them. Map the accumulated Anthropic-shape usage to the same breakdown the anthropic usage converter emits, so CC turns display consistently with Gateway turns.

Refs LOBE-7261

* ♻️ refactor: write CC usage under metadata.usage instead of flat fields

Flat `inputCachedTokens / totalInputTokens / ...` on `MessageMetadata` are the legacy shape; new code should put usage under `metadata.usage`. Move the CC executor to the nested shape so it matches the convention the rest of the runtime is migrating to.
Refs LOBE-7261

* ♻️ refactor(types): mark flat usage fields on MessageMetadata as deprecated

Stop extending `ModelUsage` and redeclare each token field inline with a `@deprecated` JSDoc pointing to `metadata.usage` (nested). Existing readers still type-check, but IDEs now surface the deprecation so writers migrate to the nested shape.

* ♻️ refactor(types): mark flat performance fields on MessageMetadata as deprecated

Stop extending `ModelPerformance` and redeclare `duration` / `latency` / `tps` / `ttft` inline with `@deprecated`, pointing at `metadata.performance`. Mirrors the same treatment just done for the token usage fields.

* ✨ feat: CC agent gets claude avatar + lands on chat page directly

Skip the shared createAgent hook's /profile redirect for the Claude Code variant — its config is fixed so the profile editor would be noise — and preseed the Claude avatar from @lobehub/icons-static-avatar so new CC agents aren't blank.

* 🐛 fix(conversation-flow): read usage/performance from nested metadata

`splitMetadata` only scraped the legacy flat token/perf fields, so messages written under the new canonical shape (`metadata.usage`, `metadata.performance`) never populated `UIChatMessage.usage` and the Extras panel rendered blank.

- Prefer nested `metadata.usage` / `metadata.performance` when present; keep flat scraping as fallback for pre-migration rows.
- Add `usage` / `performance` to FlatListBuilder's filter sets so the nested blobs don't leak into `otherMetadata`.
- Drop the stale `usage! || metadata` fallback in the Assistant / CouncilMember Extra renders — with splitMetadata fixed, `item.usage` is always populated when usage data exists, and passing raw metadata as ModelUsage is wrong now that the flat fields are gone.
* 🐛 fix: skip stores.reset on initial dataSyncConfig hydration

`useDataSyncConfig`'s SWR onSuccess called `refreshUserData` (which runs `stores.reset()`) whenever the freshly-fetched config didn't deep-equal the hard-coded initial `{ storageMode: 'cloud' }` — which happens on every first load. The reset would wipe `chat.activeAgentId` just after `AgentIdSync` set it from the URL, and because `AgentIdSync`'s sync effects are keyed on `params.aid` (which hasn't changed), they never re-fire to restore it. Result: topic SWR saw `activeAgentId === ''`, treated the container as invalid, and left the sidebar stuck on the loading skeleton.

Gate the reset on `isInitRemoteServerConfig` so it only runs when the user actually switches sync modes, not on the first hydration.

* ✨ feat(claude-code): wire Inspector layer for CC tool calls

Mirrors local-system: each CC tool now has an inspector rendered above the tool-call output instead of an opaque default row.

- `Inspector.tsx` — registry that passes the CC tool name itself as the shared factories' `translationKey`. react-i18next's missing-key fallback surfaces the literal name (Bash / Edit / Glob / Grep / Read / Write), so we don't add CC-specific entries to the plugin locale.
- `ReadInspector.tsx` / `WriteInspector.tsx` — thin adapters that map Anthropic-native args (`file_path` / `offset` / `limit`) onto the shared inspectors' shape (`path` / `startLine` / `endLine`), so shared stays pure. Bash / Edit / Glob / Grep reuse shared factories directly.
- Register `ClaudeCodeInspectors` under `claude-code` in the builtin-tools inspector dispatch.

Also drops the redundant `Render/Bash/index.tsx` wrapper and pipes the shared `RunCommandRender` straight into the registry.

* ♻️ refactor: use agentSelectors.isCurrentAgentHeterogeneous

Two callsites (ConversationArea / useActionsBarConfig) were reaching into `currentAgentConfig(...)?.agencyConfig?.heterogeneousProvider` inline.
Switch them to the existing `isCurrentAgentHeterogeneous` selector so the predicate lives in one place.

* update

* ♻️ refactor: drop no-op useCallback wrapper in AgentChat form

`handleFinish` just called `updateConfig(values)` with no extra logic; the zustand action is already a stable reference so the wrapper added no memoization value. Leftover from the ACP refactor (930ba41fe3) where the handler once did more work — hand the action straight to `onFinish`.

* update

* ⏪ revert: roll back conversation-flow nested-shape reads

Unwind the `splitMetadata` nested-preference + `FlatListBuilder` filter additions from 306fd6561f. The nested `metadata.usage` / `metadata.performance` promotion now happens in `parse.ts` (and a `?? metadata?.usage` fallback at the UI callsites), so conversation-flow's transformer layer goes back to its original flat-field-only behavior.

* update

* 🐛 fix(cc): wire Stop to cancel the external Claude Code process

Previously hitting Stop only flipped the `execHeterogeneousAgent` operation to `cancelled` in the store — the spawned `claude -p` process kept running and kept streaming/persisting output for the user. The op's abort signal had no listeners and no `onCancelHandler` was registered.

- On session start, register an `onCancelHandler` that calls `heterogeneousAgentService.cancelSession(sessionId)` (SIGINT to the CLI).
- Read the op's `abortController.signal` and short-circuit `onRawLine` so late events the CLI emits between SIGINT and exit don't leak into DB writes.
- Skip the error-event forward in `onError` / the outer catch when the abort came from the user, so the UI doesn't surface a misleading error toast on top of the already-cancelled operation.

Verified end-to-end: prompt that runs a long sequence of Reads → click Stop → `claude -p` process is gone within 2s, op status = cancelled, no error message written to the conversation.
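
The Stop wiring described in the commit above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the executor's actual code: `createCancellableRun`, `sendSignal`, and the `persisted` array are hypothetical stand-ins for the operation's cancel handler, `heterogeneousAgentService.cancelSession`, and the DB writes, and a plain boolean stands in for the operation's abort signal.

```typescript
// Hypothetical sketch: once the user cancels, send SIGINT exactly once and
// drop any late lines the CLI still emits before it exits.
type RawLine = { type: string };

interface CancellableRun {
  cancel: () => void;
  onRawLine: (line: RawLine) => void;
  persisted: RawLine[];
}

function createCancellableRun(sendSignal: (signal: string) => void): CancellableRun {
  let aborted = false; // stand-in for abortController.signal.aborted
  const persisted: RawLine[] = []; // stand-in for DB writes

  return {
    cancel: () => {
      if (aborted) return; // a second Stop click is a no-op
      aborted = true;
      sendSignal('SIGINT');
    },
    onRawLine: (line) => {
      if (aborted) return; // short-circuit late events after cancel
      persisted.push(line);
    },
    persisted,
  };
}
```

Keeping the abort check at the very top of `onRawLine` means nothing downstream (parsing, adaptation, persistence) needs to know about cancellation.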
* ✨ feat(sidebar): mark heterogeneous agents with an "External" tag

Pipes the agent's `agencyConfig.heterogeneousProvider.type` through the sidebar data flow and renders a `<Tag>` next to the title for any agent driven by an external CLI runtime (Claude Code today, more later). Mirrors the group-member External pattern so future provider types just need a label swap — the field is a string, not a boolean.

- `SidebarAgentItem.heterogeneousType?: string | null` on the shared type
- `HomeRepository.getSidebarAgentList` selects `agents.agencyConfig` and derives the field via `cleanObject`
- `AgentItem` shows `<Tag>{t('group.profile.external')}</Tag>` when the field is present

Verified client-side by injecting `heterogeneousType: 'claudecode'` into a sidebar item at runtime — the "外部" ("External") tag renders next to the title in the zh-CN locale.

* ♻️ refactor(i18n): dedicated key for the sidebar external-agent tag

Instead of reusing `group.profile.external` (which is about group members that are user-linked rather than virtual), add `agentSidebar.externalTag` specifically for the heterogeneous-runtime tag. Keeps the two concepts separate so we can swap this one to "Claude Code" / provider-specific labels later without touching the group UI copy.

Remember to run `pnpm i18n` before the PR so the remaining locales pick up the new key.

* 🐛 fix: clear remaining CI type errors

Three small fixes so `tsgo --noEmit` exits clean:

- `AgentIdSync`: `useChatStoreUpdater` is typed off the chat-store key, whose `activeAgentId` is `string` (initial ''). Coerce the optional URL param to `''` so the store key type matches; `createStoreUpdater` still skips the setState when the value is undefined-ish.
- `heterogeneousAgentExecutor.test.ts`: `scope: 'session'` isn't a valid `MessageMapScope` (the union dropped that variant); switch the fixture to `'main'`, which is the correct scope for agent main conversations.
- Same test file: `Array.at(-1)` is `T | undefined`; non-null assert since the preceding calls guarantee the slot is populated.

* 🐛 fix: loosen createStoreUpdater signature to accept nullable values

Upstream `createStoreUpdater` types `value` as exactly `T[Key]`, so any call site feeding an optional source (URL param, selector that may return undefined) fails type-check — even though the runtime already guards `typeof value !== 'undefined'` and no-ops in that case.

Wrap it once in `store/utils/createStoreUpdater.ts` with a `T[Key] | null | undefined` value type so callers can pass `params.aid` directly, instead of the lossy `?? ''` fallback the previous commit used (which would have written an empty-string sentinel into the chat store). Swap the import in `AgentIdSync.tsx`.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
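
The three-phase tool write described in the orphan-tool fix earlier in this log can be illustrated with an in-memory model. This is a sketch under assumptions, not the executor's implementation: the `Store` shape, `countOrphans`, the `onPhase` hook, and the generated message ids are hypothetical, and the real code writes through `messageService` rather than to arrays.

```typescript
// Hypothetical in-memory model of the three-phase write ordering.
interface ToolEntry {
  id: string;
  result_msg_id?: string;
}
interface ToolMessage {
  id: string;
  tool_call_id: string;
}
interface Store {
  assistantTools: ToolEntry[];
  toolMessages: ToolMessage[];
}

// A tool message is an orphan when its tool_call_id has no entry in the parent's tools array.
function countOrphans(store: Store): number {
  const known = new Set(store.assistantTools.map((tool) => tool.id));
  return store.toolMessages.filter((message) => !known.has(message.tool_call_id)).length;
}

function persistNewToolCalls(
  store: Store,
  toolCallIds: string[],
  onPhase: (store: Store) => void = () => {},
): void {
  // Phase 1: pre-register ids on the assistant message (no result_msg_id yet).
  for (const id of toolCallIds) store.assistantTools.push({ id });
  onPhase(store);

  // Phase 2: create the tool messages; every tool_call_id already has a parent entry.
  const created: ToolMessage[] = [];
  for (const toolCallId of toolCallIds) {
    const message: ToolMessage = {
      id: `tool_msg_${store.toolMessages.length + 1}`,
      tool_call_id: toolCallId,
    };
    store.toolMessages.push(message);
    created.push(message);
    onPhase(store);
  }

  // Phase 3: back-fill result_msg_id on the pre-registered entries.
  for (const message of created) {
    const entry = store.assistantTools.find((tool) => tool.id === message.tool_call_id);
    if (entry) entry.result_msg_id = message.id;
  }
  onPhase(store);
}
```

Passing an `onPhase` callback that records `countOrphans(store)` gives a simple way to check that no intermediate state exposes a tool message without a parent entry.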
421 lines
12 KiB
TypeScript
import type { ChildProcess } from 'node:child_process';
import { spawn } from 'node:child_process';
import { randomUUID } from 'node:crypto';
import { mkdir, readFile, writeFile } from 'node:fs/promises';
import { join } from 'node:path';
import type { Readable, Writable } from 'node:stream';

import { app as electronApp, BrowserWindow } from 'electron';

import { createLogger } from '@/utils/logger';

import { ControllerModule, IpcMethod } from './index';

const logger = createLogger('controllers:HeterogeneousAgentCtr');

/** Directory under appStoragePath for caching downloaded files */
const FILE_CACHE_DIR = 'heteroAgent/files';

// ─── CLI presets per agent type ───
// Mirrors @lobechat/heterogeneous-agents/registry but runs in main process
// (can't import from the workspace package in Electron main directly)

interface CLIPreset {
  baseArgs: string[];
  promptMode: 'positional' | 'stdin';
  resumeArgs?: (sessionId: string) => string[];
}

const CLI_PRESETS: Record<string, CLIPreset> = {
  'claude-code': {
    baseArgs: [
      '-p',
      '--output-format',
      'stream-json',
      '--verbose',
      '--permission-mode',
      'bypassPermissions',
    ],
    promptMode: 'positional',
    resumeArgs: (sid) => ['--resume', sid],
  },
  // Future presets:
  // 'codex': { baseArgs: [...], promptMode: 'positional' },
  // 'kimi-cli': { baseArgs: [...], promptMode: 'positional' },
};

// ─── IPC types ───

interface StartSessionParams {
  /** Agent type key (e.g., 'claude-code'). Defaults to 'claude-code'. */
  agentType?: string;
  /** Additional CLI arguments */
  args?: string[];
  /** Command to execute */
  command: string;
  /** Working directory */
  cwd?: string;
  /** Environment variables */
  env?: Record<string, string>;
  /** Session ID to resume (for multi-turn) */
  resumeSessionId?: string;
}

interface StartSessionResult {
  sessionId: string;
}

interface ImageAttachment {
  id: string;
  url: string;
}

interface SendPromptParams {
  /** Image attachments to include in the prompt (downloaded from url, cached by id) */
  imageList?: ImageAttachment[];
  prompt: string;
  sessionId: string;
}

interface CancelSessionParams {
  sessionId: string;
}

interface StopSessionParams {
  sessionId: string;
}

interface GetSessionInfoParams {
  sessionId: string;
}

interface SessionInfo {
  agentSessionId?: string;
}

// ─── Internal session tracking ───

interface AgentSession {
  agentSessionId?: string;
  agentType: string;
  args: string[];
  command: string;
  cwd?: string;
  env?: Record<string, string>;
  process?: ChildProcess;
  sessionId: string;
}

/**
 * External Agent Controller — manages external agent CLI processes via Electron IPC.
 *
 * Agent-agnostic: uses CLI presets from a registry to support Claude Code,
 * Codex, Kimi CLI, etc. Only handles process lifecycle and raw stdout line
 * broadcasting. All event parsing and DB persistence happens on the Renderer side.
 *
 * Lifecycle: startSession → sendPrompt → (heteroAgentRawLine broadcasts) → stopSession
 */
export default class HeterogeneousAgentCtr extends ControllerModule {
  static override readonly groupName = 'heterogeneousAgent';

  private sessions = new Map<string, AgentSession>();

  // ─── Broadcast ───

  private broadcast<T>(channel: string, data: T) {
    for (const win of BrowserWindow.getAllWindows()) {
      if (!win.isDestroyed()) {
        win.webContents.send(channel, data);
      }
    }
  }

  // ─── File cache ───

  private get fileCacheDir(): string {
    return join(this.app.appStoragePath, FILE_CACHE_DIR);
  }

  /**
   * Download an image by URL, with local disk cache keyed by id.
   */
  private async resolveImage(
    image: ImageAttachment,
  ): Promise<{ buffer: Buffer; mimeType: string }> {
    const cacheDir = this.fileCacheDir;
    const metaPath = join(cacheDir, `${image.id}.meta`);
    const dataPath = join(cacheDir, image.id);

    // Check cache first
    try {
      const metaRaw = await readFile(metaPath, 'utf8');
      const meta = JSON.parse(metaRaw);
      const buffer = await readFile(dataPath);
      logger.debug('Image cache hit:', image.id);
      return { buffer, mimeType: meta.mimeType || 'image/png' };
    } catch {
      // Cache miss — download
    }

    logger.info('Downloading image:', image.id);

    const res = await fetch(image.url);
    if (!res.ok)
      throw new Error(`Failed to download image ${image.id}: ${res.status} ${res.statusText}`);

    const arrayBuffer = await res.arrayBuffer();
    const buffer = Buffer.from(arrayBuffer);
    const mimeType = res.headers.get('content-type') || 'image/png';

    // Write to cache
    await mkdir(cacheDir, { recursive: true });
    await writeFile(dataPath, buffer);
    await writeFile(metaPath, JSON.stringify({ id: image.id, mimeType }));
    logger.debug('Image cached:', image.id, `${buffer.length} bytes`);

    return { buffer, mimeType };
  }

  /**
   * Build a stream-json user message with text + image content blocks.
   */
  private async buildStreamJsonInput(
    prompt: string,
    imageList: ImageAttachment[],
  ): Promise<string> {
    const content: any[] = [{ text: prompt, type: 'text' }];

    for (const image of imageList) {
      try {
        const { buffer, mimeType } = await this.resolveImage(image);
        content.push({
          source: {
            data: buffer.toString('base64'),
            media_type: mimeType,
            type: 'base64',
          },
          type: 'image',
        });
      } catch (err) {
        logger.error(`Failed to resolve image ${image.id}:`, err);
      }
    }

    return JSON.stringify({
      message: { content, role: 'user' },
      type: 'user',
    });
  }

  // ─── IPC methods ───

  /**
   * Create a session (stores config, process spawned on sendPrompt).
   */
  @IpcMethod()
  async startSession(params: StartSessionParams): Promise<StartSessionResult> {
    const sessionId = randomUUID();
    const agentType = params.agentType || 'claude-code';

    this.sessions.set(sessionId, {
      // If resuming, pre-set the agent session ID so sendPrompt adds --resume
      agentSessionId: params.resumeSessionId,
      agentType,
      args: params.args || [],
      command: params.command,
      cwd: params.cwd,
      env: params.env,
      sessionId,
    });

    logger.info('Session created:', { agentType, sessionId });
    return { sessionId };
  }

  /**
   * Send a prompt to an agent session.
   *
   * Spawns the CLI process with preset flags. Broadcasts each stdout line
   * as an `heteroAgentRawLine` event — Renderer side parses and adapts.
   */
  @IpcMethod()
  async sendPrompt(params: SendPromptParams): Promise<void> {
    const session = this.sessions.get(params.sessionId);
    if (!session) throw new Error(`Session not found: ${params.sessionId}`);

    const preset = CLI_PRESETS[session.agentType];
    if (!preset) throw new Error(`Unknown agent type: ${session.agentType}`);

    const hasImages = params.imageList && params.imageList.length > 0;

    // If images are attached, prepare the stream-json input BEFORE spawning
    // so any download errors are caught early.
    let stdinPayload: string | undefined;
    if (hasImages) {
      stdinPayload = await this.buildStreamJsonInput(params.prompt, params.imageList!);
    }

    return new Promise<void>((resolve, reject) => {
      // Build CLI args: base preset + resume + user args
      const cliArgs = [
        ...preset.baseArgs,
        ...(session.agentSessionId && preset.resumeArgs
          ? preset.resumeArgs(session.agentSessionId)
          : []),
        ...session.args,
      ];

      if (hasImages) {
        // With files: use stdin stream-json mode
        cliArgs.push('--input-format', 'stream-json');
      } else {
        // Without files: use positional prompt (simple mode)
        if (preset.promptMode === 'positional') {
          cliArgs.push(params.prompt);
        }
      }

      logger.info('Spawning agent:', session.command, cliArgs.join(' '));

      const proc = spawn(session.command, cliArgs, {
        cwd: session.cwd,
        env: { ...process.env, ...session.env },
        stdio: [hasImages ? 'pipe' : 'ignore', 'pipe', 'pipe'],
      });

      // If using stdin mode, write the stream-json message and close stdin
      if (hasImages && stdinPayload && proc.stdin) {
        const stdin = proc.stdin as Writable;
        stdin.write(stdinPayload + '\n', () => {
          stdin.end();
        });
      }

      session.process = proc;
      let buffer = '';

      // Stream stdout lines as raw events to Renderer
      const stdout = proc.stdout as Readable;
      stdout.on('data', (chunk: Buffer) => {
        buffer += chunk.toString('utf8');
        const lines = buffer.split('\n');
        buffer = lines.pop() || '';

        for (const line of lines) {
          const trimmed = line.trim();
          if (!trimmed) continue;

          try {
            const parsed = JSON.parse(trimmed);

            // Extract agent session ID from init event (for multi-turn)
            if (parsed.type === 'system' && parsed.subtype === 'init' && parsed.session_id) {
              session.agentSessionId = parsed.session_id;
            }

            // Broadcast raw parsed JSON — Renderer handles all adaptation
            this.broadcast('heteroAgentRawLine', {
              line: parsed,
              sessionId: session.sessionId,
            });
          } catch {
            // Not valid JSON, skip
          }
        }
      });

      // Capture stderr
      const stderrChunks: string[] = [];
      const stderr = proc.stderr as Readable;
      stderr.on('data', (chunk: Buffer) => {
        stderrChunks.push(chunk.toString('utf8'));
      });

      proc.on('error', (err) => {
        logger.error('Agent process error:', err);
        this.broadcast('heteroAgentSessionError', {
          error: err.message,
          sessionId: session.sessionId,
        });
        reject(err);
      });

      proc.on('exit', (code) => {
        logger.info('Agent process exited:', { code, sessionId: session.sessionId });
        session.process = undefined;

        if (code === 0) {
          this.broadcast('heteroAgentSessionComplete', { sessionId: session.sessionId });
          resolve();
        } else {
          const stderrOutput = stderrChunks.join('').trim();
          const errorMsg = stderrOutput || `Agent exited with code ${code}`;
          this.broadcast('heteroAgentSessionError', {
            error: errorMsg,
            sessionId: session.sessionId,
          });
          reject(new Error(errorMsg));
        }
      });
    });
  }

  /**
   * Get session info (agent's internal session ID for multi-turn resume).
   */
  @IpcMethod()
  async getSessionInfo(params: GetSessionInfoParams): Promise<SessionInfo> {
    const session = this.sessions.get(params.sessionId);
    return { agentSessionId: session?.agentSessionId };
  }

  /**
   * Cancel an ongoing session.
   */
  @IpcMethod()
  async cancelSession(params: CancelSessionParams): Promise<void> {
    const session = this.sessions.get(params.sessionId);
    if (session?.process) {
      session.process.kill('SIGINT');
    }
  }

  /**
   * Stop and clean up a session.
   */
  @IpcMethod()
  async stopSession(params: StopSessionParams): Promise<void> {
    const session = this.sessions.get(params.sessionId);
    if (!session) return;

    // `killed` only records that a signal was sent, not that the process exited,
    // so check exitCode/signalCode to decide whether the process is still running.
    const proc = session.process;
    if (proc && proc.exitCode === null && proc.signalCode === null) {
      proc.kill('SIGTERM');
      setTimeout(() => {
        // Escalate to SIGKILL if the process has not exited after the grace period
        if (proc.exitCode === null && proc.signalCode === null) {
          proc.kill('SIGKILL');
        }
      }, 3000);
    }

    this.sessions.delete(params.sessionId);
  }

  @IpcMethod()
  async respondPermission(): Promise<void> {
    // No-op for CLI mode (permissions handled by --permission-mode flag)
  }

  /**
   * Cleanup on app quit.
   */
  afterAppReady() {
    electronApp.on('before-quit', () => {
      for (const [, session] of this.sessions) {
        if (session.process && !session.process.killed) {
          session.process.kill('SIGTERM');
        }
      }
      this.sessions.clear();
    });
  }
}
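
The stdout handling in `sendPrompt` treats the CLI output as newline-delimited JSON: chunks are appended to a buffer, complete lines are parsed, and the trailing partial line is carried over to the next chunk. The same logic can be sketched as a standalone helper, which makes it easy to test without spawning a process. The names `createJsonLineSplitter`, `push`, and `flush` are hypothetical, and the `flush` step is an assumption added for illustration; the controller above does not flush a trailing partial line when the process exits.

```typescript
// Hypothetical helper mirroring the buffer/split/parse loop in sendPrompt.
interface JsonLineSplitter {
  flush: () => unknown[];
  push: (chunk: string) => unknown[];
}

function createJsonLineSplitter(): JsonLineSplitter {
  let buffer = '';

  // Parse each non-empty line as JSON; lines that are not valid JSON are skipped.
  const parseLines = (lines: string[]): unknown[] => {
    const parsed: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed) continue;
      try {
        parsed.push(JSON.parse(trimmed));
      } catch {
        // Not valid JSON, skip
      }
    }
    return parsed;
  };

  return {
    // Assumption: parse whatever is left once the stream has ended.
    flush: () => {
      const rest = buffer;
      buffer = '';
      return parseLines([rest]);
    },
    // Append a chunk and return the events from every line completed so far.
    push: (chunk) => {
      buffer += chunk;
      const lines = buffer.split('\n');
      buffer = lines.pop() ?? '';
      return parseLines(lines);
    },
  };
}
```

In the controller, each value returned by `push` would correspond to one `heteroAgentRawLine` broadcast.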