LocalAI/backend/cpp/llama-cpp
Ettore Di Giacinto 02cc8cbcaa
feat(llama.cpp): consolidate options and respect tokenizer template when enabled (#7120)
* feat(llama.cpp): expose env vars as options for consistency

This allows to configure everything in the YAML file of the model rather
than have global configurations

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(llama.cpp): respect usetokenizertemplate and use llama.cpp templating system to process messages

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Detect template exists if use tokenizer template is enabled

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Better recognization of chat

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixes to support tool calls while using templates from tokenizer

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop template guessing, fix passing tools to tokenizer

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Extract grammar and other options from chat template, add schema struct

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Automatically set use_jinja

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Cleanups, identify by default gguf models for chat

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-07 21:23:50 +01:00
..
patches feat: do not bundle llama-cpp anymore (#5790) 2025-07-18 13:24:12 +02:00
CMakeLists.txt feat: do not bundle llama-cpp anymore (#5790) 2025-07-18 13:24:12 +02:00
grpc-server.cpp feat(llama.cpp): consolidate options and respect tokenizer template when enabled (#7120) 2025-11-07 21:23:50 +01:00
Makefile chore: ⬆️ Update ggml-org/llama.cpp to 7f09a680af6e0ef612de81018e1d19c19b8651e8 (#7156) 2025-11-07 08:38:56 +01:00
package.sh feat: do not bundle llama-cpp anymore (#5790) 2025-07-18 13:24:12 +02:00
prepare.sh feat: do not bundle llama-cpp anymore (#5790) 2025-07-18 13:24:12 +02:00
run.sh fix(llama-cpp/darwin): make sure to bundle libutf8 libs (#6060) 2025-08-14 17:56:35 +02:00