LocalAI/core/http
Richard Palethorpe 3db60b57e6
fix(realtime): consume ChatDeltas when C++ autoparser clears Response (#9538)
The llama.cpp C++-side chat autoparser clears Reply.Message and delivers
parsed content/reasoning/tool-calls via Reply.chat_deltas. chat.go handles
this (non-SSE path uses ToolCallsFromChatDeltas/ContentFromChatDeltas/
ReasoningFromChatDeltas), but realtime.go only read pred.Response, so any
model routed through the autoparser (Qwen2.5/3 and friends) produced a
silent reply: backend emitted N tokens, the session surface saw zero.

Mirror the non-SSE chat path in realtime's triggerResponse: when deltas
carry tool calls or content, use them directly; otherwise fall back to
the existing raw-text parsing.
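
The fallback order described above can be sketched as follows. This is a minimal illustration, not the code in realtime.go: the types ChatDelta, ToolCall, Prediction and Result, and the function resolveReply, are hypothetical stand-ins, and concatenating delta content in order is an assumption about how the deltas are meant to be combined.

```go
package main

import (
	"fmt"
	"strings"
)

// ToolCall and ChatDelta are hypothetical stand-ins for the parsed pieces
// the backend delivers; the real LocalAI types may differ.
type ToolCall struct {
	Name      string
	Arguments string
}

type ChatDelta struct {
	Content   string
	Reasoning string
	ToolCalls []ToolCall
}

// Prediction mimics a backend reply: Response is the raw text, which the
// autoparser may leave empty, and Deltas holds the pre-parsed pieces.
type Prediction struct {
	Response string
	Deltas   []ChatDelta
}

// Result is what the session would surface to the client.
type Result struct {
	Content   string
	Reasoning string
	ToolCalls []ToolCall
}

// resolveReply prefers the parsed deltas when they carry tool calls or
// content, and otherwise falls back to parsing the raw response text.
// parseRaw stands in for the existing raw-text parsing path.
func resolveReply(pred Prediction, parseRaw func(string) Result) Result {
	var content, reasoning strings.Builder
	var calls []ToolCall
	for _, d := range pred.Deltas {
		content.WriteString(d.Content)
		reasoning.WriteString(d.Reasoning)
		calls = append(calls, d.ToolCalls...)
	}
	if len(calls) > 0 || content.Len() > 0 {
		return Result{
			Content:   content.String(),
			Reasoning: reasoning.String(),
			ToolCalls: calls,
		}
	}
	return parseRaw(pred.Response)
}

func main() {
	// Trivial raw-text parser used only for this demonstration.
	raw := func(s string) Result { return Result{Content: s} }

	// Autoparser case: Response is empty, content arrives via deltas.
	parsed := Prediction{Deltas: []ChatDelta{{Content: "Hel"}, {Content: "lo"}}}
	fmt.Println(resolveReply(parsed, raw).Content) // prints "Hello"

	// Legacy case: no deltas, so the raw response text is used.
	legacy := Prediction{Response: "plain text"}
	fmt.Println(resolveReply(legacy, raw).Content) // prints "plain text"
}
```

Checking the deltas first means an empty Response no longer yields an empty reply, while backends that do not populate deltas keep their previous behaviour.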

Assisted-by: claude-opus-4-7-1M [Claude Code]

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-04-24 14:41:38 +02:00
auth feat: voice recognition (#9500) 2026-04-23 12:07:14 +02:00
endpoints fix(realtime): consume ChatDeltas when C++ autoparser clears Response (#9538) 2026-04-24 14:41:38 +02:00
middleware fix(openresponses): parse OpenAI-spec nested tool_choice + use correct setter (#9509) 2026-04-23 18:30:05 +02:00
react-ui feat: add biometrics UI (#9524) 2026-04-24 08:50:34 +02:00
routes feat: add biometrics UI (#9524) 2026-04-24 08:50:34 +02:00
static feat(realtime): WebRTC support (#8790) 2026-03-13 21:37:15 +01:00
views feat(realtime): WebRTC support (#8790) 2026-03-13 21:37:15 +01:00
app.go feat(api): add ollama compatibility (#9284) 2026-04-09 14:15:14 +02:00
app_test.go fix(streaming): skip chat deltas for role-init elements to prevent first token duplication (#9299) 2026-04-10 08:45:47 +02:00
explorer.go chore(refactor): move logging to common package based on slog (#7668) 2025-12-21 19:33:13 +01:00
http_suite_test.go feat(api): add support for open responses specification (#8063) 2026-01-17 22:11:47 +01:00
openresponses_test.go feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
render.go feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00