LocalAI

mirror of https://github.com/mudler/LocalAI synced 2026-05-24 09:28:23 +00:00

Ettore Di Giacinto 9748a1cbc6 fix(streaming): skip chat deltas for role-init elements to prevent first token duplication (#9299)	2026-04-10 08:45:47 +02:00

When TASK_RESPONSE_TYPE_OAI_CHAT is used, the first streaming token produces a JSON array with two elements: a role-init chunk and the actual content chunk. The grpc-server loop called attach_chat_deltas for both elements with the same raw_result pointer, stamping the first token's ChatDelta.Content on both replies. The Go side accumulated both, emitting the first content token twice to SSE clients.

Fix: in the array iteration loops in PredictStream, detect role-init elements (delta has "role" key) and skip attach_chat_deltas for them. Only content/reasoning elements get chat deltas attached. Reasoning models are unaffected because their first token goes into reasoning_content, not content.
application	feat: add node reconciler, allow to schedule to group of nodes, min/max autoscaler (#9186)	2026-03-31 08:28:56 +02:00
backend	feat(sam.cpp): add sam.cpp detection backend (#9288)	2026-04-09 21:49:11 +02:00
cli	feat(api): add ollama compatibility (#9284)	2026-04-09 14:15:14 +02:00
clients	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
config	feat(sam.cpp): add sam.cpp detection backend (#9288)	2026-04-09 21:49:11 +02:00
dependencies_manager	feat(ui): move to React for frontend (#8772)	2026-03-05 21:47:12 +01:00
explorer	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
gallery	fix: try to add whisperx and faster-whisper for more variants (#9278)	2026-04-08 21:23:38 +02:00
http	fix(streaming): skip chat deltas for role-init elements to prevent first token duplication (#9299)	2026-04-10 08:45:47 +02:00
p2p	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
schema	feat(sam.cpp): add sam.cpp detection backend (#9288)	2026-04-09 21:49:11 +02:00
services	feat: track files being staged (#9275)	2026-04-08 14:33:58 +02:00
startup	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
templates	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
trace	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00