LocalAI

mirror of https://github.com/mudler/LocalAI synced 2026-05-24 09:28:23 +00:00

LocalAI [bot] c500461c69 feat(config): default prompt_cache_all to true (#9951)	2026-05-22 22:06:22 +02:00

Upstream llama.cpp defaults `cache_prompt = true` (common/common.h), but `parse_options` in the grpc-server backend unconditionally forwards the proto `PromptCacheAll` field, so any model that didn't set `prompt_cache_all: true` in its YAML was getting `cache_prompt=false`, silently overriding llama.cpp's own default. With `kv_unified` and `cache_idle_slots` already on by default, this was the last piece preventing the per-request prompt cache from being usable out of the box.

Make `PromptCacheAll` tristate (`*bool`), default it to `true` in `SetDefaults`, and dereference at the proto boundary. Users can still opt out with an explicit `prompt_cache_all: false`. Same pattern as `MMap`, `MMlock`, `Reranking`, etc.

Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
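The tristate pattern described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `ModelConfig`, `PredictOptions`, and `ToProto` are hypothetical stand-ins for the real config struct, the generated proto message, and the conversion code, and only the field name `PromptCacheAll`, the YAML key `prompt_cache_all`, and the `SetDefaults` name come from the message itself.

```go
package main

import "fmt"

// ModelConfig is a hypothetical stand-in for the YAML-backed model config.
type ModelConfig struct {
	// nil means "not set in YAML"; a non-nil pointer carries the
	// user's explicit choice, including an explicit false.
	PromptCacheAll *bool `yaml:"prompt_cache_all"`
}

// SetDefaults fills in only the fields the user left unset,
// so an explicit `prompt_cache_all: false` is preserved.
func (c *ModelConfig) SetDefaults() {
	if c.PromptCacheAll == nil {
		enabled := true
		c.PromptCacheAll = &enabled
	}
}

// PredictOptions is a hypothetical stand-in for the proto message,
// which can only carry a plain bool.
type PredictOptions struct {
	PromptCacheAll bool
}

// ToProto dereferences the pointer at the proto boundary.
// A nil pointer (SetDefaults not called) falls back to false here.
func ToProto(c ModelConfig) PredictOptions {
	opts := PredictOptions{}
	if c.PromptCacheAll != nil {
		opts.PromptCacheAll = *c.PromptCacheAll
	}
	return opts
}

func main() {
	// Case 1: key absent from YAML, so the default applies.
	unset := ModelConfig{}
	unset.SetDefaults()
	fmt.Println("unset:", ToProto(unset).PromptCacheAll)

	// Case 2: user explicitly opted out.
	off := false
	optOut := ModelConfig{PromptCacheAll: &off}
	optOut.SetDefaults()
	fmt.Println("explicit false:", ToProto(optOut).PromptCacheAll)
}
```

A plain `bool` field cannot distinguish "absent" from "false", which is why the pointer is needed before a `true` default can be applied safely.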
audio_transform.go	feat(whisper): honor client cancellation via ggml abort_callback (#9710)	2026-05-08 01:44:47 +02:00
backend_suite_test.go	feat: extract output with regexes from LLMs (#3491)	2024-09-13 13:27:36 +02:00
detection.go	feat(whisper): honor client cancellation via ggml abort_callback (#9710)	2026-05-08 01:44:47 +02:00
diarization.go	feat(whisper): honor client cancellation via ggml abort_callback (#9710)	2026-05-08 01:44:47 +02:00
diarization_test.go	feat(api): add /v1/audio/diarization endpoint with sherpa-onnx + vibevoice.cpp (#9654)	2026-05-05 15:10:13 +02:00
embeddings.go	feat(ui): Per model backend logs and various fixes (#9028)	2026-03-18 08:31:26 +01:00
face_analyze.go	feat(whisper): honor client cancellation via ggml abort_callback (#9710)	2026-05-08 01:44:47 +02:00
face_embed.go	feat(whisper): honor client cancellation via ggml abort_callback (#9710)	2026-05-08 01:44:47 +02:00
face_verify.go	feat(whisper): honor client cancellation via ggml abort_callback (#9710)	2026-05-08 01:44:47 +02:00
image.go	feat(ui): Per model backend logs and various fixes (#9028)	2026-03-18 08:31:26 +01:00
llm.go	feat(gallery): verify backend OCI images with keyless cosign (#9823)	2026-05-18 08:02:20 +02:00
llm_probe_test.go	Respect explicit reasoning config during GGUF thinking probe (#9463)	2026-04-21 21:53:10 +02:00
llm_test.go	feat(autoparser): prefer chat deltas from backends when emitted (#9224)	2026-04-04 12:12:08 +02:00
options.go	feat(config): default prompt_cache_all to true (#9951)	2026-05-22 22:06:22 +02:00
options_internal_test.go	feat(vllm): expose AsyncEngineArgs via generic engine_args YAML map (#9563)	2026-04-29 00:49:28 +02:00
rerank.go	feat(whisper): honor client cancellation via ggml abort_callback (#9710)	2026-05-08 01:44:47 +02:00
soundgeneration.go	feat(whisper): honor client cancellation via ggml abort_callback (#9710)	2026-05-08 01:44:47 +02:00
stores.go	feat: add biometrics UI (#9524)	2026-04-24 08:50:34 +02:00
token_metrics.go	feat(whisper): honor client cancellation via ggml abort_callback (#9710)	2026-05-08 01:44:47 +02:00
tokenize.go	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
transcript.go	feat(whisper): honor client cancellation via ggml abort_callback (#9710)	2026-05-08 01:44:47 +02:00
tts.go	feat(whisper): honor client cancellation via ggml abort_callback (#9710)	2026-05-08 01:44:47 +02:00
vad.go	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
video.go	feat(ui): Per model backend logs and various fixes (#9028)	2026-03-18 08:31:26 +01:00
voice_analyze.go	feat(whisper): honor client cancellation via ggml abort_callback (#9710)	2026-05-08 01:44:47 +02:00
voice_embed.go	feat(whisper): honor client cancellation via ggml abort_callback (#9710)	2026-05-08 01:44:47 +02:00
voice_verify.go	feat(whisper): honor client cancellation via ggml abort_callback (#9710)	2026-05-08 01:44:47 +02:00