---
title: Chat API Client-Server Interaction Logic
description: >-
  Explore the client-server interaction logic of the LobeChat
  Chat API, including event sequences and core components.
tags:
- Chat API
- Client-Server Interaction
- Model Runtime
- Agent Runtime
- MCP
---
# Chat API Client-Server Interaction Logic
This document explains the implementation logic of the
LobeChat Chat API in client-server interactions,
including the event sequences and core components involved.
## Interaction Sequence Diagram
```mermaid
sequenceDiagram
  participant Client as Frontend Client
  participant AgentLoop as Agent Runtime Loop
  participant ChatService as ChatService
  participant ChatAPI as Backend Chat API
  participant ModelRuntime as Model Runtime
  participant ModelProvider as Model Provider API
  participant ToolExecution as Tool Execution Layer

  Client->>AgentLoop: sendMessage()
  Note over AgentLoop: Create GeneralChatAgent + AgentRuntime

  loop Agent Plan-Execute Loop
    AgentLoop->>AgentLoop: Agent decides next instruction
    alt call_llm instruction
      AgentLoop->>ChatService: getChatCompletion
      Note over ChatService: Context engineering
      ChatService->>ChatAPI: POST /webapi/chat/[provider]
      ChatAPI->>ModelRuntime: Initialize ModelRuntime
      ModelRuntime->>ModelProvider: Chat completion request
      ModelProvider-->>ChatService: Stream back SSE response
      ChatService-->>Client: onMessageHandle callback
    else call_tool instruction
      AgentLoop->>ToolExecution: Execute tool
      Note over ToolExecution: Builtin / MCP / Plugin
      ToolExecution-->>AgentLoop: Return tool result
    else request_human_* instruction
      AgentLoop-->>Client: Request user intervention
      Client->>AgentLoop: User feedback
    else finish instruction
      AgentLoop-->>Client: onFinish callback
    end
  end

  Note over Client,ModelProvider: Preset task scenario (bypasses Agent loop)
  Client->>ChatService: fetchPresetTaskResult
  ChatService->>ChatAPI: Send preset task request
  ChatAPI-->>ChatService: Return task result
  ChatService-->>Client: Return result via callback
```
## Main Process Steps
### 1. Client Initiates Request
After the user sends a message, `sendMessage()`
(`src/store/chat/slices/aiChat/actions/conversationLifecycle.ts`)
creates the user message and assistant message placeholder,
then calls `internal_execAgentRuntime()`.
### 2. Agent Runtime Drives the Loop
Agent Runtime is the **core execution engine**
of the entire chat flow. Every chat interaction
(from simple Q\&A to complex multi-step tool calling)
is driven by the `AgentRuntime.step()` loop.
**Initialization**
(`src/store/chat/slices/aiChat/actions/streamingExecutor.ts`):
1. Resolve agent config (model, provider, plugin list, etc.)
2. Create the tool registry via `createAgentToolsEngine()`
3. Create `GeneralChatAgent` (the "brain" that decides
what to do next) and `AgentRuntime` (the "engine"
that executes instructions)
4. Inject custom executors via `createAgentExecutors()`
**Execution Loop**:
```ts
while (state.status !== 'done' && state.status !== 'error') {
  result = await runtime.step(state, nextContext);
  // GeneralChatAgent decides: call_llm → call_tool → call_llm → finish
}
```
At each step, `GeneralChatAgent` returns an
`AgentInstruction` based on current state,
and `AgentRuntime` executes it via the corresponding
executor:
- `call_llm`: Call the LLM (see steps 3-5 below)
- `call_tool`: Execute tool calls (see step 6 below)
- `finish`: End the loop
- `compress_context`: Context compression
- `request_human_approve` / `request_human_prompt` /
`request_human_select`: Request user intervention
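The plan-execute loop above can be sketched with a toy discriminated union. Note that `AgentInstruction`, `AgentState`, and the `decide` function below are illustrative shapes for this sketch, not LobeChat's actual types:

```typescript
// Illustrative instruction/state shapes for the plan-execute loop.
type AgentInstruction =
  | { type: 'call_llm'; messages: string[] }
  | { type: 'call_tool'; tool: string; args: unknown }
  | { type: 'finish'; result: string };

interface AgentState {
  status: 'running' | 'done' | 'error';
  output?: string;
}

// A toy "brain": answer once with the LLM, then finish.
function decide(state: AgentState, turn: number): AgentInstruction {
  return turn === 0
    ? { type: 'call_llm', messages: ['hello'] }
    : { type: 'finish', result: state.output ?? '' };
}

// A toy "engine": executes whichever instruction the brain returns.
function runLoop(): string {
  const state: AgentState = { status: 'running' };
  let turn = 0;

  while (state.status !== 'done' && state.status !== 'error') {
    const instruction = decide(state, turn++);
    switch (instruction.type) {
      case 'call_llm':
        state.output = 'llm response'; // stand-in for a real model call
        break;
      case 'call_tool':
        break; // tool execution would run here and feed results back
      case 'finish':
        state.status = 'done';
        break;
    }
  }

  return state.output ?? '';
}
```

The real `AgentRuntime.step()` additionally handles context compression and human-intervention instructions, but the brain/engine split is the same.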
### 3. Frontend Processes LLM Request
When the Agent issues a `call_llm` instruction,
the executor calls ChatService:
- `src/services/chat/index.ts` preprocesses messages,
tools, and parameters
- Modules under `src/services/chat/mecha/` perform
context engineering, including agent config resolution,
model parameter resolution, MCP context injection, etc.
- Calls `getChatCompletion` to prepare request parameters
- Uses `fetchSSE` from the `@lobechat/fetch-sse` package
to send the request to the backend API
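At the wire level, an SSE response arrives as `data:` lines. A minimal parser for just that framing (a simplified sketch, ignoring the `event:`/`id:`/`retry:` fields; `@lobechat/fetch-sse` does considerably more, including typed events and abort handling) looks like:

```typescript
// Extract the payload of each "data: ..." line from an SSE text body.
function parseSSE(raw: string): string[] {
  return raw
    .split('\n')
    .filter((line) => line.startsWith('data: '))
    .map((line) => line.slice('data: '.length));
}
```

For example, `parseSSE('data: hello\n\ndata: world\n')` yields `['hello', 'world']`.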
### 4. Backend Processes Request
- `src/app/(backend)/webapi/chat/[provider]/route.ts`
receives the request
- Calls `initModelRuntimeFromDB` to read user's provider
config from the database and initialize ModelRuntime
- A tRPC route `src/server/routers/lambda/aiChat.ts`
also exists for server-side message sending
and structured output scenarios
### 5. Model Call and Response Processing
- `ModelRuntime`
(`packages/model-runtime/src/core/ModelRuntime.ts`)
calls the respective model provider's API
and returns a streaming response
- Frontend processes the streaming response via
`fetchSSE` and
[fetchEventSource](https://github.com/Azure/fetch-event-source)
- Handles different types of events
(text, tool calls, reasoning, etc.)
- Passes results back to client through callback functions
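The event handling described above amounts to dispatching on a chunk's type and accumulating state. The chunk shape below is a hypothetical simplification of the real event types, for illustration only:

```typescript
// Hypothetical stream-chunk union; the real event set is richer.
type StreamChunk =
  | { type: 'text'; text: string }
  | { type: 'reasoning'; text: string }
  | { type: 'tool_calls'; payload: unknown };

interface StreamResult {
  text: string;
  reasoning: string;
}

// Accumulate chunks the way an onMessageHandle-style callback might.
function accumulate(chunks: StreamChunk[]): StreamResult {
  const result: StreamResult = { text: '', reasoning: '' };
  for (const chunk of chunks) {
    if (chunk.type === 'text') result.text += chunk.text;
    else if (chunk.type === 'reasoning') result.reasoning += chunk.text;
    // tool_calls chunks hand control back to the Agent loop instead
  }
  return result;
}
```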
### 6. Tool Calling Scenario
When the AI model returns a `tool_calls` field in its
response, the Agent issues a `call_tool` instruction.
LobeChat supports three types of tools:
**Builtin Tools**: Tools built into the application,
executed directly via local executors.
- Frontend executes them directly on the client
via the `invokeBuiltinTool` method
- Includes built-in features like search,
DALL-E image generation, etc.
**MCP Tools**: External tools connected via
[Model Context Protocol](https://modelcontextprotocol.io/).
- Frontend calls `MCPService` (`src/services/mcp.ts`)
via the `invokeMCPTypePlugin` method
- Supports three connection modes:
stdio, HTTP (streamable-http/SSE), and cloud (Klavis)
- MCP tool registration and discovery are managed
through the MCP server configuration
**Plugin Tools**: Legacy plugin system,
invoked via API gateway.
This system is expected to be gradually deprecated
in favor of the MCP tool system.
- Frontend calls them via the
`invokeDefaultTypePlugin` method
- Retrieves plugin settings and manifest,
creates authentication headers,
and sends requests to the plugin gateway
After tool execution completes, results are written to
the message and returned to the Agent loop.
The Agent then calls the LLM again to generate
the final response based on tool results.
The tool dispatch logic is in
`src/store/chat/slices/plugin/actions/pluginTypes.ts`.
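Conceptually, that dispatch is a switch over the three tool categories. The method names below are the ones mentioned above; the dispatcher itself is a sketch, not the actual implementation:

```typescript
// The three tool categories described above.
type ToolType = 'builtin' | 'mcp' | 'default';

// Sketch: map a tool's category to the invocation method used for it.
function dispatchTool(type: ToolType): string {
  switch (type) {
    case 'builtin':
      return 'invokeBuiltinTool'; // executed locally on the client
    case 'mcp':
      return 'invokeMCPTypePlugin'; // routed through MCPService
    case 'default':
      return 'invokeDefaultTypePlugin'; // legacy plugin gateway
    default:
      throw new Error(`unknown tool type: ${type satisfies never}`);
  }
}
```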
### 7. Preset Task Processing
Preset tasks are predefined functions, typically triggered
when a user performs a specific action; they bypass the
Agent Runtime loop and call the LLM directly.
These tasks use the `fetchPresetTaskResult` method,
which follows a flow similar to normal chat
but uses specially designed prompt chains.
**Execution Timing**:
1. **Agent Information Auto-generation**: Triggered when
users create or edit an agent
- Agent avatar generation (via `autoPickEmoji` method)
- Agent description generation
(via `autocompleteAgentDescription` method)
- Agent tag generation
(via `autocompleteAgentTags` method)
- Agent title generation
(via `autocompleteAgentTitle` method)
2. **Message Translation**: Triggered when users manually
click the translate button
(via `translateMessage` method)
3. **Web Search**: When search is enabled but the model
doesn't support tool calling, search functionality
is implemented via `fetchPresetTaskResult`
**Code Examples**:
Agent avatar auto-generation implementation:
```ts
// src/features/AgentSetting/store/action.ts
autoPickEmoji: async () => {
  const { config, meta, dispatchMeta } = get();
  const systemRole = config.systemRole;

  chatService.fetchPresetTaskResult({
    onFinish: async (emoji) => {
      dispatchMeta({ type: 'update', value: { avatar: emoji } });
    },
    onLoadingChange: (loading) => {
      get().updateLoadingState('avatar', loading);
    },
    params: merge(
      get().internal_getSystemAgentForMeta(),
      chainPickEmoji([meta.title, meta.description, systemRole].filter(Boolean).join(',')),
    ),
    trace: get().getCurrentTracePayload({ traceName: TraceNameMap.EmojiPicker }),
  });
};
```
Translation feature implementation:
```ts
// src/store/chat/slices/translate/action.ts
translateMessage: async (id, targetLang) => {
  // ...omitted code...

  // Detect language
  chatService.fetchPresetTaskResult({
    onFinish: async (data) => {
      if (data && supportLocales.includes(data)) from = data;

      await updateMessageTranslate(id, { content, from, to: targetLang });
    },
    params: merge(translationSetting, chainLangDetect(message.content)),
    trace: get().getCurrentTracePayload({ traceName: TraceNameMap.LanguageDetect }),
  });

  // Perform translation
  chatService.fetchPresetTaskResult({
    onMessageHandle: (chunk) => {
      if (chunk.type === 'text') {
        content = chunk.text;
        internal_dispatchMessage({
          id,
          type: 'updateMessageTranslate',
          value: { content, from, to: targetLang },
        });
      }
    },
    onFinish: async () => {
      await updateMessageTranslate(id, { content, from, to: targetLang });

      internal_toggleChatLoading(false, id, n('translateMessage(end)', { id }) as string);
    },
    params: merge(translationSetting, chainTranslate(message.content, targetLang)),
    trace: get().getCurrentTracePayload({ traceName: TraceNameMap.Translation }),
  });
};
```
### 8. Completion
When the Agent issues a `finish` instruction,
the loop ends, and the `onFinish` callback is called
with the complete response result.
## Client-Side vs Server-Side Execution
Where the Agent Runtime loop runs depends on the scenario:
- **Client-side loop** (browser): Regular 1:1 chat,
continue generation, group orchestration decisions.
The loop runs in the browser, entry point is
`internal_execAgentRuntime()`
(`src/store/chat/slices/aiChat/actions/streamingExecutor.ts`)
- **Server-side loop** (queue/local):
Group chat supervisor agent, sub-agent tasks,
API/Cron triggers. The loop runs on the server,
streaming events to the client via SSE, entry point is
`AgentRuntimeService.executeStep()`
(`src/server/services/agentRuntime/AgentRuntimeService.ts`),
tRPC route at `src/server/routers/lambda/aiAgent.ts`
## Model Runtime
Model Runtime (`packages/model-runtime/`) is the core
abstraction layer in LobeChat for interacting with
LLM model providers, adapting different provider APIs
into a unified interface.
**Core Responsibilities**:
- **Unified Abstraction Layer**: Hides differences
between AI provider APIs through the `LobeRuntimeAI`
interface
(`packages/model-runtime/src/core/BaseAI.ts`)
- **Model Initialization**: Initializes the corresponding
runtime instance through the provider mapping table
(`packages/model-runtime/src/runtimeMap.ts`)
- **Capability Encapsulation**: `chat` (streaming chat),
`models` (model listing), `embeddings` (text embeddings),
`createImage` (image generation),
`textToSpeech` (speech synthesis),
`generateObject` (structured output)
**Core Interface**:
```ts
// packages/model-runtime/src/core/BaseAI.ts
export interface LobeRuntimeAI {
  baseURL?: string;

  chat?(payload: ChatStreamPayload, options?: ChatMethodOptions): Promise<Response>;

  generateObject?(
    payload: GenerateObjectPayload,
    options?: GenerateObjectOptions,
  ): Promise<any>;

  embeddings?(payload: EmbeddingsPayload, options?: EmbeddingsOptions): Promise<Embeddings[]>;

  models?(): Promise<any>;

  createImage?: (payload: CreateImagePayload) => Promise<CreateImageResponse>;

  textToSpeech?: (
    payload: TextToSpeechPayload,
    options?: TextToSpeechOptions,
  ) => Promise<ArrayBuffer>;
}
```
**Adapter Architecture**: Two factory functions,
`openaiCompatibleFactory` and `anthropicCompatibleFactory`,
allow most providers to be integrated with minimal
configuration. Over 40 model providers are currently
supported (OpenAI, Anthropic, Google, Azure, Bedrock,
Ollama, etc.), with implementations
in `packages/model-runtime/src/providers/`.
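The idea behind such a factory can be sketched as a closure over provider configuration that returns a runtime-constructing function. Everything below (`RuntimeLike`, `compatibleFactory`, the `transform` option) is an illustrative stand-in; the real factories have much richer signatures:

```typescript
// A pared-down stand-in for the LobeRuntimeAI surface.
interface RuntimeLike {
  baseURL?: string;
  chat(prompt: string): Promise<string>;
}

// Sketch of the compatible-factory pattern: provider-level config is fixed
// once, then each user's credentials produce a concrete runtime instance.
const compatibleFactory =
  (options: { baseURL: string; transform?: (s: string) => string }) =>
  (apiKey: string): RuntimeLike => ({
    baseURL: options.baseURL,
    chat: async (prompt) =>
      // A real runtime would issue an HTTP request here; this just echoes.
      (options.transform ?? ((s: string) => s))(`[${apiKey}] ${prompt}`),
  });
```

The design benefit is that a new OpenAI-compatible provider only needs to supply its base URL and any response quirks, rather than reimplementing the whole interface.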
## Agent Runtime
Agent Runtime (`packages/agent-runtime/`) is LobeChat's
agent orchestration engine. As described above, it is the
core execution engine that drives the entire chat flow.
**Core Components**:
- **`AgentRuntime`**
(`packages/agent-runtime/src/core/runtime.ts`):
The "engine" that executes the Agent instruction loop,
supporting `call_llm`, `call_tool`, `finish`,
`compress_context`, `request_human_*`, etc.
- **`GeneralChatAgent`**
(`packages/agent-runtime/src/agents/GeneralChatAgent.ts`):
The "brain" that decides which instruction to execute
next based on current state
- **`GroupOrchestrationRuntime`**
(`packages/agent-runtime/src/groupOrchestration/`):
Multi-agent orchestration supporting speak / broadcast /
delegate / executeTask collaboration modes
- **`UsageCounter`**: Token usage and cost tracking
- **`InterventionChecker`**:
Security blacklist for managing agent behavior boundaries
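A usage counter of this kind reduces to accumulating per-step token counts. The class below is a toy sketch in the spirit of `UsageCounter`; its names and fields are illustrative, not the package's actual API:

```typescript
// Toy token accounting across loop steps (illustrative, not the real API).
class SimpleUsageCounter {
  private promptTokens = 0;
  private completionTokens = 0;

  // Record one LLM call's usage.
  record(usage: { promptTokens: number; completionTokens: number }) {
    this.promptTokens += usage.promptTokens;
    this.completionTokens += usage.completionTokens;
  }

  get totalTokens() {
    return this.promptTokens + this.completionTokens;
  }
}
```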