LocalAI/pkg
Ettore Di Giacinto 02cc8cbcaa
feat(llama.cpp): consolidate options and respect tokenizer template when enabled (#7120)
* feat(llama.cpp): expose env vars as options for consistency

This allows to configure everything in the YAML file of the model rather
than have global configurations

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(llama.cpp): respect usetokenizertemplate and use llama.cpp templating system to process messages

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Detect template exists if use tokenizer template is enabled

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Better recognization of chat

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixes to support tool calls while using templates from tokenizer

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop template guessing, fix passing tools to tokenizer

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Extract grammar and other options from chat template, add schema struct

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Automatically set use_jinja

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Cleanups, identify by default gguf models for chat

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-07 21:23:50 +01:00
..
audio feat: Realtime API support reboot (#5392) 2025-05-25 22:25:05 +02:00
concurrency chore: update jobresult_test.go (#4124) 2024-11-12 08:52:18 +01:00
downloader feat: support HF_ENDPOINT env for the HuggingFace endpoint (#6220) 2025-09-11 21:04:57 +02:00
functions feat(llama.cpp): consolidate options and respect tokenizer template when enabled (#7120) 2025-11-07 21:23:50 +01:00
grpc feat(rfdetr): add object detection API (#5923) 2025-07-27 22:02:51 +02:00
langchain feat(llama.cpp): do not specify backends to autoload and add llama.cpp variants (#2232) 2024-05-04 17:56:12 +02:00
model fix: guard from potential deadlock with requests in flight (#6484) 2025-10-16 21:28:19 +02:00
oci feat(cli): add command to create custom OCI images from directories (#5844) 2025-07-14 08:21:29 +02:00
signals chore: update cogito and simplify MCP logics (#6413) 2025-10-09 12:36:45 +02:00
sound feat: Realtime API support reboot (#5392) 2025-05-25 22:25:05 +02:00
store chore: fix go.mod module (#2635) 2024-06-23 08:24:36 +00:00
system fix: runtime capability detection for backends (#6149) 2025-09-11 10:46:19 +02:00
utils feat(rfdetr): add object detection API (#5923) 2025-07-27 22:02:51 +02:00
xsync chore: fix go.mod module (#2635) 2024-06-23 08:24:36 +00:00
xsysinfo feat: improve RAM estimation by using values from summary (#5525) 2025-06-05 19:16:26 +02:00