LocalAI/core/backend
Ettore Di Giacinto 02cc8cbcaa
feat(llama.cpp): consolidate options and respect tokenizer template when enabled (#7120)
* feat(llama.cpp): expose env vars as options for consistency

This allows everything to be configured in the model's YAML file rather
than through global configuration

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(llama.cpp): respect usetokenizertemplate and use llama.cpp templating system to process messages

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Detect whether a template exists when use tokenizer template is enabled

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Better recognition of chat

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixes to support tool calls while using templates from tokenizer

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop template guessing, fix passing tools to tokenizer

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Extract grammar and other options from chat template, add schema struct

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Automatically set use_jinja

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Cleanups, identify GGUF models for chat by default

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-07 21:23:50 +01:00
backend_suite_test.go feat: extract output with regexes from LLMs (#3491) 2024-09-13 13:27:36 +02:00
detection.go feat(backends): add system backend, refactor (#6059) 2025-08-14 19:38:26 +02:00
embeddings.go feat(backends): add system backend, refactor (#6059) 2025-08-14 19:38:26 +02:00
image.go feat(backends): add system backend, refactor (#6059) 2025-08-14 19:38:26 +02:00
llm.go feat(llama.cpp): consolidate options and respect tokenizer template when enabled (#7120) 2025-11-07 21:23:50 +01:00
llm_test.go feat(backends): add system backend, refactor (#6059) 2025-08-14 19:38:26 +02:00
options.go fix(llama.cpp): correctly set grammar triggers (#6432) 2025-10-10 19:50:17 +02:00
rerank.go feat(backends): add system backend, refactor (#6059) 2025-08-14 19:38:26 +02:00
soundgeneration.go feat: Add Agentic MCP support with a new chat/completion endpoint (#6381) 2025-10-05 17:51:41 +02:00
stores.go feat: refactor build process, drop embedded backends (#5875) 2025-07-22 16:31:04 +02:00
token_metrics.go feat(backends): add system backend, refactor (#6059) 2025-08-14 19:38:26 +02:00
tokenize.go feat(backends): add system backend, refactor (#6059) 2025-08-14 19:38:26 +02:00
transcript.go feat(whisper): Add diarization (tinydiarize) (#6184) 2025-09-10 19:09:28 +02:00
tts.go feat: Add Agentic MCP support with a new chat/completion endpoint (#6381) 2025-10-05 17:51:41 +02:00
vad.go feat(backends): add system backend, refactor (#6059) 2025-08-14 19:38:26 +02:00
video.go feat(diffusers): add support for wan2.2 (#6153) 2025-08-28 10:26:42 +02:00