LocalAI

mirror of https://github.com/mudler/LocalAI synced 2026-04-21 13:27:21 +00:00

Ettore Di Giacinto 53deeb1107 fix(reasoning): suppress partial tag tokens during autoparser warm-up	2026-04-04 20:45:57 +00:00

The C++ PEG parser needs a few tokens to identify the reasoning format (e.g. "<|channel>thought\n" for Gemma 4). During this warm-up, the gRPC layer was sending raw partial tag tokens to Go, which leaked into the reasoning field.

- Clear reply.message in gRPC when autoparser is active but has no diffs yet, matching llama.cpp server behavior of only emitting classified output
- Prefer C++ autoparser chat deltas for reasoning/content in all streaming paths, falling back to Go-side extraction for backends without autoparser (e.g. vLLM)
- Override non-streaming no-tools result with chat delta content when available
- Guard PrependThinkingTokenIfNeeded against partial tag prefixes during streaming accumulation
- Reorder default thinking tokens so <|channel>thought is checked before <|think|> (Gemma 4 templates contain both)

audio	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
concurrency	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
downloader	feat(distributed): Avoid resending models to backend nodes (#9193)	2026-03-31 16:28:13 +02:00
functions	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
grpc	chore(refactor): use interface (#9226)	2026-04-04 17:29:37 +02:00
huggingface-api	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
model	chore(refactor): use interface (#9226)	2026-04-04 17:29:37 +02:00
oci	feat(ui): allow to cancel ops (#7264)	2025-11-13 18:41:47 +01:00
reasoning	fix(reasoning): suppress partial tag tokens during autoparser warm-up	2026-04-04 20:45:57 +00:00
sanitize	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
signals	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
sound	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
store	chore: fix go.mod module (#2635)	2024-06-23 08:24:36 +00:00
system	fix: gate CUDA directory checks on GPU vendor to prevent false CUDA detection (#8942)	2026-03-12 07:53:39 +01:00
utils	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
vram	feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084)	2026-04-04 15:14:35 +02:00
xio	feat(ui): allow to cancel ops (#7264)	2025-11-13 18:41:47 +01:00
xsync	chore: fix go.mod module (#2635)	2024-06-23 08:24:36 +00:00
xsysinfo	feat(gpu): add jetson/tegra detection	2026-03-31 15:45:07 +00:00
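
The latest commit message describes guarding against partial tag prefixes while streamed text accumulates, and checking <|channel>thought before <|think|>. The Go sketch below illustrates that idea only and is not the code in the reasoning package: the names thinkingTags, IsPartialTagPrefix, and DetectTag are hypothetical, and the tag list is an assumption built from the tokens the commit message mentions.

```go
package main

import (
	"fmt"
	"strings"
)

// thinkingTags is a hypothetical tag list (an assumption, not LocalAI's
// actual defaults). Order matters: "<|channel>thought" is checked before
// "<|think|>", as the commit message describes for Gemma 4 templates.
var thinkingTags = []string{"<|channel>thought", "<|think|>"}

// IsPartialTagPrefix reports whether the accumulated text is a non-empty,
// strictly shorter prefix of any known tag. While this is true, a streaming
// caller could hold the text back instead of emitting it as reasoning.
func IsPartialTagPrefix(acc string, tags []string) bool {
	acc = strings.TrimSpace(acc)
	if acc == "" {
		return false
	}
	for _, t := range tags {
		if len(acc) < len(t) && strings.HasPrefix(t, acc) {
			return true
		}
	}
	return false
}

// DetectTag returns the first tag, in list order, that the accumulated
// text starts with, ignoring leading whitespace.
func DetectTag(acc string, tags []string) (string, bool) {
	acc = strings.TrimLeft(acc, " \t\r\n")
	for _, t := range tags {
		if strings.HasPrefix(acc, t) {
			return t, true
		}
	}
	return "", false
}

func main() {
	// Simulate tokens arriving one chunk at a time during warm-up.
	acc := ""
	for _, tok := range []string{"<|", "channel", ">thought", "\n", "plan"} {
		acc += tok
		tag, found := DetectTag(acc, thinkingTags)
		fmt.Printf("acc=%q partial=%v tag=%q found=%v\n",
			acc, IsPartialTagPrefix(acc, thinkingTags), tag, found)
	}
}
```

In this sketch a caller would emit nothing while IsPartialTagPrefix is true, and start classifying output as reasoning once DetectTag finds a complete tag. Text that merely begins with "<" is also held back until the next token disambiguates it, which is a trade-off of the prefix approach.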