LocalAI/core/http/endpoints/openai
Richard Palethorpe 3db60b57e6
fix(realtime): consume ChatDeltas when C++ autoparser clears Response (#9538)
The llama.cpp C++-side chat autoparser clears Reply.Message and delivers
parsed content/reasoning/tool-calls via Reply.chat_deltas. chat.go handles
this (non-SSE path uses ToolCallsFromChatDeltas/ContentFromChatDeltas/
ReasoningFromChatDeltas), but realtime.go only read pred.Response, so any
model routed through the autoparser (Qwen2.5/3 and friends) produced a
silent reply: backend emitted N tokens, the session surface saw zero.

Mirror the non-SSE chat path in realtime's triggerResponse: when deltas
carry tool calls or content, use them directly; otherwise fall back to
the existing raw-text parsing.
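
The selection rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in realtime.go: the types ChatDelta, ToolCallDelta, ToolCall, Prediction and Result, and the functions contentFromDeltas, reasoningFromDeltas, toolCallsFromDeltas and resolveResponse are all hypothetical stand-ins. The real helpers named in this commit (ToolCallsFromChatDeltas, ContentFromChatDeltas, ReasoningFromChatDeltas) may have different signatures and merge semantics.

```go
package main

import "strings"

// ChatDelta is a hypothetical stand-in for one entry of Reply.chat_deltas.
type ChatDelta struct {
	Content   string
	Reasoning string
	ToolCalls []ToolCallDelta
}

// ToolCallDelta is a hypothetical fragment of a tool call; fragments that
// share an Index are assumed to belong to the same call.
type ToolCallDelta struct {
	Index     int
	Name      string
	Arguments string
}

// ToolCall is a fully assembled tool call.
type ToolCall struct {
	Name      string
	Arguments string
}

// Prediction mimics the backend result: Response may be empty when the
// C++ autoparser has moved everything into ChatDeltas.
type Prediction struct {
	Response   string
	ChatDeltas []ChatDelta
}

// Result is what the session would surface to the client.
type Result struct {
	Content    string
	Reasoning  string
	ToolCalls  []ToolCall
	FromDeltas bool
}

// contentFromDeltas concatenates the content fragments in order.
func contentFromDeltas(deltas []ChatDelta) string {
	var b strings.Builder
	for _, d := range deltas {
		b.WriteString(d.Content)
	}
	return b.String()
}

// reasoningFromDeltas concatenates the reasoning fragments in order.
func reasoningFromDeltas(deltas []ChatDelta) string {
	var b strings.Builder
	for _, d := range deltas {
		b.WriteString(d.Reasoning)
	}
	return b.String()
}

// toolCallsFromDeltas merges fragments by Index, keeping first-seen order.
func toolCallsFromDeltas(deltas []ChatDelta) []ToolCall {
	var calls []ToolCall
	pos := map[int]int{} // tool call Index -> position in calls
	for _, d := range deltas {
		for _, tc := range d.ToolCalls {
			i, ok := pos[tc.Index]
			if !ok {
				i = len(calls)
				pos[tc.Index] = i
				calls = append(calls, ToolCall{})
			}
			if tc.Name != "" {
				calls[i].Name = tc.Name
			}
			calls[i].Arguments += tc.Arguments
		}
	}
	return calls
}

// resolveResponse prefers parsed deltas when they carry tool calls or
// content, and otherwise falls back to parsing the raw response text.
func resolveResponse(pred Prediction, parseRaw func(string) Result) Result {
	calls := toolCallsFromDeltas(pred.ChatDeltas)
	content := contentFromDeltas(pred.ChatDeltas)
	if len(calls) > 0 || content != "" {
		return Result{
			Content:    content,
			Reasoning:  reasoningFromDeltas(pred.ChatDeltas),
			ToolCalls:  calls,
			FromDeltas: true,
		}
	}
	return parseRaw(pred.Response)
}
```

In this sketch, reasoning on its own does not select the delta path: only tool calls or content do, which matches the condition stated above. The raw-text parser is passed in as a function, so the fallback stays whatever the existing code already does.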

Assisted-by: claude-opus-4-7-1M [Claude Code]

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-04-24 14:41:38 +02:00
types feat(realtime): WebRTC support (#8790) 2026-03-13 21:37:15 +01:00
chat.go fix(streaming): dedupe content, recover reasoning, unique tool_call IDs in deferred flush (#9470) 2026-04-21 21:59:33 +02:00
chat_emit.go fix(streaming): dedupe content, recover reasoning, unique tool_call IDs in deferred flush (#9470) 2026-04-21 21:59:33 +02:00
chat_emit_test.go fix(streaming): dedupe content, recover reasoning, unique tool_call IDs in deferred flush (#9470) 2026-04-21 21:59:33 +02:00
chat_test.go chore: refactor endpoints to use same inferencing path, add automatic retrial mechanism in case of errors (#9029) 2026-03-16 21:31:02 +01:00
completion.go feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084) 2026-04-04 15:14:35 +02:00
constants.go fix(api): SSE streaming format to comply with specification (#7182) 2025-11-09 22:00:27 +01:00
edit.go feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084) 2026-04-04 15:14:35 +02:00
embeddings.go feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084) 2026-04-04 15:14:35 +02:00
image.go feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084) 2026-04-04 15:14:35 +02:00
image_test.go Fix image upload processing and img2img pipeline in diffusers backend (#8879) 2026-03-11 08:05:50 +01:00
inference.go fix: thinking models with tools returning empty content (reasoning-only retry loop) (#9290) 2026-04-09 18:30:31 +02:00
inference_test.go fix: thinking models with tools returning empty content (reasoning-only retry loop) (#9290) 2026-04-09 18:30:31 +02:00
inpainting.go feat(UI): image generation improvements (#7804) 2025-12-31 21:59:46 +01:00
inpainting_test.go feat(realtime): WebRTC support (#8790) 2026-03-13 21:37:15 +01:00
list.go feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084) 2026-04-04 15:14:35 +02:00
openai_suite_test.go Fix image upload processing and img2img pipeline in diffusers backend (#8879) 2026-03-11 08:05:50 +01:00
realtime.go fix(realtime): consume ChatDeltas when C++ autoparser clears Response (#9538) 2026-04-24 14:41:38 +02:00
realtime_model.go fix: use SetFunctionCallNameString when forcing a specific tool (3 sites) (#9526) 2026-04-24 09:06:42 +02:00
realtime_transport.go feat(realtime): WebRTC support (#8790) 2026-03-13 21:37:15 +01:00
realtime_transport_webrtc.go feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
realtime_transport_ws.go feat(realtime): WebRTC support (#8790) 2026-03-13 21:37:15 +01:00
realtime_webrtc.go feat(realtime): WebRTC support (#8790) 2026-03-13 21:37:15 +01:00
transcription.go feat: wire transcription for llama.cpp, add streaming support (#9353) 2026-04-14 16:13:40 +02:00