LocalAI/core/http/endpoints
Richard Palethorpe 3db60b57e6
fix(realtime): consume ChatDeltas when C++ autoparser clears Response (#9538)
The llama.cpp C++-side chat autoparser clears Reply.Message and delivers
parsed content/reasoning/tool-calls via Reply.chat_deltas. chat.go handles
this (non-SSE path uses ToolCallsFromChatDeltas/ContentFromChatDeltas/
ReasoningFromChatDeltas), but realtime.go only read pred.Response, so any
model routed through the autoparser (Qwen2.5/3 and friends) produced a
silent reply: backend emitted N tokens, the session surface saw zero.

Mirror the non-SSE chat path in realtime's triggerResponse: when deltas
carry tool calls or content, use them directly; otherwise fall back to
the existing raw-text parsing.
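
The fallback order described above can be sketched as follows. This is a
minimal illustration, not the code from realtime.go: the types ChatDelta,
ToolCall and Prediction, and the helpers contentFromDeltas,
toolCallsFromDeltas and resolveReply, are hypothetical stand-ins whose
shapes are assumed, and parseRaw stands for whatever raw-text parsing the
session already performs.

```go
package main

import "strings"

// ToolCall is a hypothetical, simplified tool-call record.
type ToolCall struct {
	Name      string
	Arguments string
}

// ChatDelta is a hypothetical stand-in for one chunk parsed by the
// backend-side autoparser.
type ChatDelta struct {
	Content   string
	Reasoning string
	ToolCalls []ToolCall
}

// Prediction is a hypothetical stand-in for a backend result. Response may
// be empty when the autoparser has moved everything into ChatDeltas.
type Prediction struct {
	Response   string
	ChatDeltas []ChatDelta
}

// contentFromDeltas concatenates the content fragments of all deltas.
func contentFromDeltas(deltas []ChatDelta) string {
	var b strings.Builder
	for _, d := range deltas {
		b.WriteString(d.Content)
	}
	return b.String()
}

// toolCallsFromDeltas collects the tool calls of all deltas in order.
func toolCallsFromDeltas(deltas []ChatDelta) []ToolCall {
	var calls []ToolCall
	for _, d := range deltas {
		calls = append(calls, d.ToolCalls...)
	}
	return calls
}

// resolveReply prefers the parsed deltas when they carry tool calls or
// content, and otherwise falls back to parsing the raw response text.
func resolveReply(pred Prediction, parseRaw func(string) (string, []ToolCall)) (string, []ToolCall) {
	calls := toolCallsFromDeltas(pred.ChatDeltas)
	content := contentFromDeltas(pred.ChatDeltas)
	if len(calls) > 0 || content != "" {
		return content, calls
	}
	return parseRaw(pred.Response)
}
```

With this ordering, a prediction whose Response is empty but whose deltas
carry text still yields a reply, while a prediction without deltas goes
through the raw-text path as before.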

Assisted-by: claude-opus-4-7-1M [Claude Code]

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-04-24 14:41:38 +02:00
anthropic fix: use SetFunctionCallNameString when forcing a specific tool (3 sites) (#9526) 2026-04-24 09:06:42 +02:00
elevenlabs feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084) 2026-04-04 15:14:35 +02:00
explorer feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
jina feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084) 2026-04-04 15:14:35 +02:00
localai feat: add biometrics UI (#9524) 2026-04-24 08:50:34 +02:00
mcp feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
ollama feat(api): add ollama compatibility (#9284) 2026-04-09 14:15:14 +02:00
openai fix(realtime): consume ChatDeltas when C++ autoparser clears Response (#9538) 2026-04-24 14:41:38 +02:00
openresponses fix: use SetFunctionCallNameString when forcing a specific tool (3 sites) (#9526) 2026-04-24 09:06:42 +02:00