Commit graph

320 commits

Author SHA1 Message Date
LocalAI [bot]
f76958d761
chore: ⬆️ Update ggml-org/llama.cpp to 0440bfd1605333726ea0fb7a836942660bf2f9a6 (#8216)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-26 00:50:35 +01:00
LocalAI [bot]
05a332cd5f
chore: ⬆️ Update ggml-org/llama.cpp to bb02f74c612064947e51d23269a1cf810b67c9a7 (#8196)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-24 21:19:43 +00:00
LocalAI [bot]
4019094111
chore: ⬆️ Update ggml-org/llama.cpp to 557515be1e93ed8939dd8a7c7d08765fdbe8be31 (#8183)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-24 08:57:08 +01:00
Ettore Di Giacinto
c0b21a921b
feat: detect thinking support from backend automatically if not explicitly set (#8167)
detect thinking support from backend automatically if not explicitly set

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-23 00:38:28 +01:00
LocalAI [bot]
b10045adc2
chore: ⬆️ Update ggml-org/llama.cpp to a5eaa1d6a3732bc0f460b02b61c95680bba5a012 (#8165)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-01-22 23:32:05 +00:00
LocalAI [bot]
c12b310028
chore: ⬆️ Update ggml-org/llama.cpp to c301172f660a1fe0b42023da990bf7385d69adb4 (#8151)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-22 00:51:22 +01:00
LocalAI [bot]
5687df4535
chore: ⬆️ Update ggml-org/llama.cpp to ad8d85bd94cc86e89d23407bdebf98f2e6510c61 (#8145)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-21 15:41:36 +00:00
Ettore Di Giacinto
f6daaa7c35
chore(deps): Bump llama.cpp to '1c7cf94b22a9dc6b1d32422f72a627787a4783a3' (#8136)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-21 00:12:13 +01:00
LocalAI [bot]
d3525b7509
chore: ⬆️ Update ggml-org/llama.cpp to 959ecf7f234dc0bc0cd6829b25cb0ee1481aa78a (#8122)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-19 22:50:47 +01:00
LocalAI [bot]
ab8ed24358
chore: ⬆️ Update ggml-org/llama.cpp to 287a33017b32600bfc0e81feeb0ad6e81e0dd484 (#8100)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-18 22:40:14 +01:00
LocalAI [bot]
1cd33047b4
chore: ⬆️ Update ggml-org/llama.cpp to 2fbde785bc106ae1c4102b0e82b9b41d9c466579 (#8087)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-17 21:10:18 +00:00
LocalAI [bot]
d4fd0c0609
chore: ⬆️ Update ggml-org/llama.cpp to 388ce822415f24c60fcf164a321455f1e008cafb (#8073)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-16 21:22:33 +00:00
LocalAI [bot]
cb8616c7d1
chore: ⬆️ Update ggml-org/llama.cpp to 785a71008573e2d84728fb0ba9e851d72d3f8fab (#8053)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-15 22:53:17 +01:00
LocalAI [bot]
49d6305509
chore: ⬆️ Update ggml-org/llama.cpp to d98b548120eecf98f0f6eaa1ba7e29b3afda9f2e (#8040)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-15 08:39:46 +01:00
LocalAI [bot]
d6e698876b
chore: ⬆️ Update ggml-org/llama.cpp to e4832e3ae4d58ac0ecbdbf4ae055424d6e628c9f (#8015)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-14 08:09:37 +01:00
LocalAI [bot]
7e35ec6c4f
chore: ⬆️ Update ggml-org/llama.cpp to bcf7546160982f56bc290d2e538544bbc0772f63 (#7991)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-12 21:14:33 +00:00
LocalAI [bot]
bc180c2638
chore: ⬆️ Update ggml-org/llama.cpp to 0c3b7a9efebc73d206421c99b7eb6b6716231322 (#7978)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-11 22:06:30 +01:00
LocalAI [bot]
5bfc3eebf8
chore: ⬆️ Update ggml-org/llama.cpp to b1377188784f9aea26b8abde56d4aee8c733eec7 (#7965)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-10 22:24:26 +01:00
LocalAI [bot]
fdc2c0737c
chore: ⬆️ Update ggml-org/llama.cpp to 593da7fa49503b68f9f01700be9f508f1e528992 (#7946)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-09 21:13:04 +00:00
Ettore Di Giacinto
f4b0a304d7
chore(llama.cpp): propagate errors during model load (#7937)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-09 07:52:49 +01:00
Ettore Di Giacinto
d16ec7aa9e
chore(deps): Bump llama.cpp to '480160d47297df43b43746294963476fc0a6e10f' (#7933)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-09 07:52:32 +01:00
LocalAI [bot]
c03e532a18
chore: ⬆️ Update ggml-org/llama.cpp to ae9f8df77882716b1702df2bed8919499e64cc28 (#7915)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-07 23:24:01 +01:00
Copilot
fd53978a7b
feat: package GPU libraries inside backend containers for unified base image (#7891)
* Initial plan

* Add GPU library packaging for isolated backend environments

- Create scripts/build/package-gpu-libs.sh for packaging CUDA, ROCm, SYCL, and Vulkan libraries
- Update llama-cpp, whisper, stablediffusion-ggml package.sh to include GPU libraries
- Update Dockerfile.python to package GPU libraries into Python backends
- Update libbackend.sh to set LD_LIBRARY_PATH for GPU library loading

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Address code review feedback: fix variable consistency and quoting

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Fix code review issues: improve glob handling and remove redundant variable

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Simplify main Dockerfile and workflow to use unified base image

- Remove GPU-specific driver installation from Dockerfile (CUDA, ROCm, Vulkan, Intel)
- Simplify image.yml workflow to build single unified base image for linux/amd64 and linux/arm64
- GPU libraries are now packaged in individual backend containers

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-01-07 15:48:51 +01:00
LocalAI [bot]
fb9879949c
chore: ⬆️ Update ggml-org/llama.cpp to ccbc84a5374bab7a01f68b129411772ddd8e7c79 (#7894)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-06 22:18:35 +01:00
Ettore Di Giacinto
26c4f80d1b
chore(llama.cpp/flags): simplify conditionals (#7887)
If ggml handle conditionals correctly we don't need to handle it here.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-06 15:02:20 +01:00
coffeerunhobby
5add7b47f5
fix: BMI2 crash on AVX-only CPUs (Intel Ivy Bridge/Sandy Bridge) (#7864)
* Fix BMI2 crash on AVX-only CPUs (Intel Ivy Bridge/Sandy Bridge)

Signed-off-by: coffeerunhobby <coffeerunhobby@users.noreply.github.com>

* Address feedback from review

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: coffeerunhobby <coffeerunhobby@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: coffeerunhobby <coffeerunhobby@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-01-06 00:13:48 +00:00
LocalAI [bot]
4f7b6b0bff
chore: ⬆️ Update ggml-org/llama.cpp to e443fbcfa51a8a27b15f949397ab94b5e87b2450 (#7881)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-05 22:55:40 +01:00
LocalAI [bot]
9d3da0bed5
chore: ⬆️ Update ggml-org/llama.cpp to 4974bf53cf14073c7b66e1151348156aabd42cb8 (#7861)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-05 00:10:18 +01:00
LocalAI [bot]
a7e155240b
chore: ⬆️ Update ggml-org/llama.cpp to e57f52334b2e8436a94f7e332462dfc63a08f995 (#7848)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-04 10:27:45 +01:00
coffeerunhobby
666d110714
fix: Prevent BMI2 instruction crash on AVX-only CPUs (#7817)
* Fix: Prevent BMI2 instruction crash on AVX-only CPUs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: apply no-bmi flags on non-darwin

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: coffeerunhobby <coffeerunhobby@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-03 08:36:55 +01:00
LocalAI [bot]
641606ae93
chore: ⬆️ Update ggml-org/llama.cpp to 706e3f93a60109a40f1224eaf4af0d59caa7c3ae (#7836)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-02 21:26:37 +00:00
Ettore Di Giacinto
5f6c941399
fix(llama.cpp/mmproj): fix loading mmproj in nested sub-dirs different from model path (#7832)
fix(mmproj): fix loading mmproj in nested sub-dirs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-02 20:17:30 +01:00
LocalAI [bot]
949de04052
chore: ⬆️ Update ggml-org/llama.cpp to ced765be44ce173c374f295b3c6f4175f8fd109b (#7822)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-02 08:44:49 +01:00
LocalAI [bot]
bc3e8793ed
chore: ⬆️ Update ggml-org/llama.cpp to 13814eb370d2f0b70e1830cc577b6155b17aee47 (#7809)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-31 23:04:01 +01:00
LocalAI [bot]
218f3a126a
chore: ⬆️ Update ggml-org/llama.cpp to 0f89d2ecf14270f45f43c442e90ae433fd82dab1 (#7795)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-31 08:53:41 +01:00
LocalAI [bot]
bc8ec5cb39
chore: ⬆️ Update ggml-org/llama.cpp to c9a3b40d6578f2381a1373d10249403d58c3c5bd (#7778)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-30 08:27:16 +01:00
LocalAI [bot]
1a6fd0f7fc
chore: ⬆️ Update ggml-org/llama.cpp to 4ffc47cb2001e7d523f9ff525335bbe34b1a2858 (#7760)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-28 21:10:39 +00:00
LocalAI [bot]
c95c482f36
chore: ⬆️ Update ggml-org/llama.cpp to a4bf35889eda36d3597cd0f8f333f5b8a2fcaefc (#7751)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-27 21:09:12 +00:00
LocalAI [bot]
ddf0281785
chore: ⬆️ Update ggml-org/llama.cpp to 7ac8902133da6eb390c4d8368a7d252279123942 (#7740)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-26 21:44:34 +00:00
LocalAI [bot]
86c68c9623
chore: ⬆️ Update ggml-org/llama.cpp to 85c40c9b02941ebf1add1469af75f1796d513ef4 (#7731)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-25 21:10:28 +00:00
LocalAI [bot]
2fe6e278c8
chore: ⬆️ Update ggml-org/llama.cpp to c18428423018ed214c004e6ecaedb0cbdda06805 (#7718)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-25 10:00:40 +01:00
Ettore Di Giacinto
0a168830ea
chore(deps): Bump llama.cpp to '5b6c9bc0f3c8f55598b9999b65aff7ce4119bc15' and refactor usage of base params (#7706)
* chore(deps): Bump llama.cpp to '5b6c9bc0f3c8f55598b9999b65aff7ce4119bc15' and refactor usage of base params

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: update AGENTS.md

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-24 00:28:27 +01:00
Ettore Di Giacinto
fc6057a952
chore(deps): bump llama.cpp to '0e1ccf15c7b6d05c720551b537857ecf6194d420' (#7684)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-22 09:50:42 +01:00
LocalAI [bot]
38cde81ff4
chore: ⬆️ Update ggml-org/llama.cpp to 52ab19df633f3de5d4db171a16f2d9edd2342fec (#7665)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-20 21:09:15 +00:00
LocalAI [bot]
626057bcca
chore: ⬆️ Update ggml-org/llama.cpp to ce734a8a2f9fb6eb4f0383ab1370a1b0014ab787 (#7654)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-19 21:15:39 +00:00
LocalAI [bot]
f25ac00bca
chore: ⬆️ Update ggml-org/llama.cpp to f9ec8858edea4a0ecfea149d6815ebfb5ecc3bcd (#7642)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-18 21:17:14 +00:00
LocalAI [bot]
5515119a7e
chore: ⬆️ Update ggml-org/llama.cpp to d37fc935059211454e9ad2e2a44e8ed78fd6d1ce (#7629)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-18 09:07:09 +01:00
LocalAI [bot]
14bb65b57b
chore: ⬆️ Update ggml-org/llama.cpp to ef83fb8601229ff650d952985be47e82d644bfaa (#7611)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-17 08:32:42 +01:00
Ettore Di Giacinto
2387b266d8
chore(llama.cpp): Add Missing llama.cpp Options to gRPC Server (#7584)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-15 21:55:20 +01:00
LocalAI [bot]
0f5cc4c07b
chore: ⬆️ Update ggml-org/llama.cpp to 5c8a717128cc98aa9e5b1c44652f5cf458fd426e (#7573)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-14 22:21:54 +01:00
LocalAI [bot]
3e4e6777d8
chore: ⬆️ Update ggml-org/llama.cpp to 5266379bcae74214af397f36aa81b2a08b15d545 (#7563)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-14 11:41:10 +01:00
Simon Redman
5de539ab07
fix(7355): Update llama-cpp grpc for v3 interface (#7566)
* fix(7355): Update llama-cpp grpc for v3 interface

Signed-off-by: Simon Redman <simon@ergotech.com>

* feat(llama-gprc): Trim whitespace from servers list

Signed-off-by: Simon Redman <simon@ergotech.com>

* Trim trailing spaces in grpc-server.cpp

Signed-off-by: Simon Redman <simon@ergotech.com>

---------

Signed-off-by: Simon Redman <simon@ergotech.com>
2025-12-14 11:40:33 +01:00
Ettore Di Giacinto
0b130fb811
fix(llama.cpp): handle corner cases with tool array content (#7528)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-12 08:15:45 +01:00
LocalAI [bot]
0771a2d3ec
chore: ⬆️ Update ggml-org/llama.cpp to a81a569577cc38b32558958b048228150be63eae (#7529)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-11 21:55:44 +00:00
LocalAI [bot]
72621a1d1c
chore: ⬆️ Update ggml-org/llama.cpp to 4dff236a522bd0ed949331d6cb1ee2a1b3615c35 (#7508)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-11 08:15:38 +01:00
LocalAI [bot]
ef44ace73f
chore: ⬆️ Update ggml-org/llama.cpp to 086a63e3a5d2dbbb7183a74db453459e544eb55a (#7496)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-10 12:05:13 +01:00
Ettore Di Giacinto
74ee1463fe
chore(deps/llama-cpp): bump to '2fa51c19b028180b35d316e9ed06f5f0f7ada2c1' (#7484)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-09 15:41:37 +01:00
LocalAI [bot]
5610384d8a
chore: ⬆️ Update ggml-org/llama.cpp to db97837385edfbc772230debbd49e5efae843a71 (#7447)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-07 08:32:35 +01:00
LocalAI [bot]
edf7141b9b
chore: ⬆️ Update ggml-org/llama.cpp to 8160b38a5fa8a25490ca33ffdd200cda51405688 (#7438)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-06 13:35:24 +01:00
Ettore Di Giacinto
024aa6a55b
chore(deps): bump llama.cpp to 'bde188d60f58012ada0725c6dd5ba7c69fe4dd87' (#7434)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-05 00:17:35 +01:00
LocalAI [bot]
ca2e878aaf
chore: ⬆️ Update ggml-org/llama.cpp to e9f9483464e6f01d843d7f0293bd9c7bc6b2221c (#7421)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-04 11:54:01 +01:00
LocalAI [bot]
957eea3da3
chore: ⬆️ Update ggml-org/llama.cpp to 61bde8e21f4a1f9a98c9205831ca3e55457b4c78 (#7415)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-03 16:27:12 +01:00
LocalAI [bot]
665441ca94
chore: ⬆️ Update ggml-org/llama.cpp to ec18edfcba94dacb166e6523612fc0129cead67a (#7406)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-12-02 07:59:52 +01:00
Ettore Di Giacinto
e3bcba5c45
chore: ⬆️ Update ggml-org/llama.cpp to 7f8ef50cce40e3e7e4526a3696cb45658190e69a (#7402)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-01 07:50:40 +01:00
LocalAI [bot]
0824fd8efd
chore: ⬆️ Update ggml-org/llama.cpp to 8c32d9d96d9ae345a0150cae8572859e9aafea0b (#7395)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-30 09:06:18 +01:00
Ettore Di Giacinto
468ac608f3
chore(deps): bump llama.cpp to 'd82b7a7c1d73c0674698d9601b1bbb0200933f29' (#7392)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-29 08:58:07 +01:00
LocalAI [bot]
1a53fd2b9b
chore: ⬆️ Update ggml-org/llama.cpp to 4abef75f2cf2eee75eb5083b30a94cf981587394 (#7382)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-28 00:08:27 +01:00
LocalAI [bot]
b5f4f4ac6d
chore: ⬆️ Update ggml-org/llama.cpp to eec1e33a9ed71b79422e39cc489719cf4f8e0777 (#7363)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-27 09:17:25 +01:00
Ettore Di Giacinto
7a94d237c4
chore(deps): bump llama.cpp to '583cb83416467e8abf9b37349dcf1f6a0083745a (#7358)
chore(deps): bump llama.cpp to '583cb83416467e8abf9b37349dcf1f6a0083745a'

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-26 08:23:21 +01:00
LocalAI [bot]
f6d2a52cd5
chore: ⬆️ Update ggml-org/llama.cpp to 0c7220db56525d40177fcce3baa0d083448ec813 (#7337)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-24 09:11:38 +01:00
LocalAI [bot]
05a00b2399
chore: ⬆️ Update ggml-org/llama.cpp to 3f3a4fb9c3b907c68598363b204e6f58f4757c8c (#7336)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-22 21:53:40 +00:00
LocalAI [bot]
bdfe8431fa
chore: ⬆️ Update ggml-org/llama.cpp to 23bc779a6e58762ea892eca1801b2ea1b9050c00 (#7331)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-22 08:44:01 +01:00
Ettore Di Giacinto
e88db7d142
fix(llama.cpp): handle corner cases with tool content (#7324)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-21 09:21:49 +01:00
LocalAI [bot]
b7b8a0a748
chore: ⬆️ Update ggml-org/llama.cpp to dd0f3219419b24740864b5343958a97e1b3e4b26 (#7322)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-21 08:11:47 +01:00
LocalAI [bot]
bfa07df7cd
chore: ⬆️ Update ggml-org/llama.cpp to 7d77f07325985c03a91fa371d0a68ef88a91ec7f (#7314)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-20 07:58:42 +01:00
Ettore Di Giacinto
3152611184
chore(deps): bump llama.cpp to '10e9780154365b191fb43ca4830659ef12def80f (#7311)
chore(deps): bump llama.cpp to '10e9780154365b191fb43ca4830659ef12def80f'

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-19 14:42:11 +01:00
LocalAI [bot]
4278506876
chore: ⬆️ Update ggml-org/llama.cpp to cb623de3fc61011e5062522b4d05721a22f2e916 (#7301)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-18 07:43:57 +01:00
LocalAI [bot]
fb834805db
chore: ⬆️ Update ggml-org/llama.cpp to 80deff3648b93727422461c41c7279ef1dac7452 (#7287)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-17 07:51:08 +01:00
Ettore Di Giacinto
d7f9f3ac93
feat: add support to logitbias and logprobs (#7283)
* feat: add support to logprobs in results

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: add support to logitbias

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-16 13:27:36 +01:00
LocalAI [bot]
d1a0dd10e6
chore: ⬆️ Update ggml-org/llama.cpp to 662192e1dcd224bc25759aadd0190577524c6a66 (#7277)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-16 08:41:12 +01:00
LocalAI [bot]
a09d49da43
chore: ⬆️ Update ggml-org/llama.cpp to 9b17d74ab7d31cb7d15ee7eec1616c3d825a84c0 (#7273)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-15 00:05:39 +01:00
Ettore Di Giacinto
03e9f4b140
fix: handle tool errors (#7271)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-14 17:23:56 +01:00
Ettore Di Giacinto
7129409bf6
chore(deps): bump llama.cpp to c4abcb2457217198efdd67d02675f5fddb7071c2 (#7266)
* chore(deps): bump llama.cpp to '92bb442ad999a0d52df0af2730cd861012e8ac5c'

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* DEBUG

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Bump

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* test/debug

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Revert "DEBUG"

This reverts commit 2501ca3ff242076d623c13c86b3d6afcec426281.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-14 12:16:52 +01:00
Ettore Di Giacinto
3728552e94
feat: import models via URI (#7245)
* feat: initial hook to install elements directly

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP: ui changes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Move HF api client to pkg

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add simple importer for gguf files

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add opcache

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* wire importers to CLI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add omitempty to config fields

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add MLX importer

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small refactors to star to use HF for discovery

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Common preferences

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add support to bare HF repos

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(importer/llama.cpp): add support for mmproj files

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* add mmproj quants to common preferences

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix vlm usage in tokenizer mode with llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-12 20:48:56 +01:00
Mikhail Khludnev
04fe0b0da8
fix(reranker): llama-cpp sort score desc, crop top_n (#7211)
Signed-off-by: Mikhail Khludnev <mkhl@apache.org>
2025-11-12 09:13:01 +01:00
LocalAI [bot]
fae93e5ba2
chore: ⬆️ Update ggml-org/llama.cpp to 7d019cff744b73084b15ca81ba9916f3efab1223 (#7247)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-11 21:31:01 +00:00
LocalAI [bot]
5f4663252d
chore: ⬆️ Update ggml-org/llama.cpp to 13730c183b9e1a32c09bf132b5367697d6c55048 (#7232)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-11 00:03:01 +01:00
LocalAI [bot]
e42f0f7e79
chore: ⬆️ Update ggml-org/llama.cpp to b8595b16e69e3029e06be3b8f6635f9812b2bc3f (#7210)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-09 23:56:27 +01:00
Ettore Di Giacinto
679d43c2f5
feat: respect context and add request cancellation (#7187)
* feat: respect context

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* workaround fasthttp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(ui): allow to abort call

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Refactor

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: improving error

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Respect context also with MCP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Tie to both contexts

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Make detection more robust

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-09 18:19:19 +01:00
LocalAI [bot]
f678c6b0a9
chore: ⬆️ Update ggml-org/llama.cpp to 333f2595a3e0e4c0abf233f2f29ef1710acd134d (#7201)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-08 21:06:17 +00:00
LocalAI [bot]
8ac7e28c12
chore: ⬆️ Update ggml-org/llama.cpp to 65156105069fa86a4a81b6cb0e8cb583f6420677 (#7184)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-08 09:07:44 +01:00
Ettore Di Giacinto
02cc8cbcaa
feat(llama.cpp): consolidate options and respect tokenizer template when enabled (#7120)
* feat(llama.cpp): expose env vars as options for consistency

This allows to configure everything in the YAML file of the model rather
than have global configurations

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(llama.cpp): respect usetokenizertemplate and use llama.cpp templating system to process messages

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Detect template exists if use tokenizer template is enabled

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Better recognization of chat

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixes to support tool calls while using templates from tokenizer

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop template guessing, fix passing tools to tokenizer

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Extract grammar and other options from chat template, add schema struct

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Automatically set use_jinja

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Cleanups, identify by default gguf models for chat

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-07 21:23:50 +01:00
LocalAI [bot]
8f7c499f17
chore: ⬆️ Update ggml-org/llama.cpp to 7f09a680af6e0ef612de81018e1d19c19b8651e8 (#7156)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-07 08:38:56 +01:00
LocalAI [bot]
db9957b94e
chore: ⬆️ Update ggml-org/llama.cpp to a44d77126c911d105f7f800c17da21b2a5b112d1 (#7125)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-05 21:22:04 +00:00
LocalAI [bot]
98158881c2
chore: ⬆️ Update ggml-org/llama.cpp to ad51c0a720062a04349c779aae301ad65ca4c856 (#7098)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-04 21:19:58 +00:00
LocalAI [bot]
e2cb44ef37
chore: ⬆️ Update ggml-org/llama.cpp to c5023daf607c578d6344c628eb7da18ac3d92d32 (#7069)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-04 09:26:10 +01:00
LocalAI [bot]
2cad2c8591
chore: ⬆️ Update ggml-org/llama.cpp to cd5e3b57541ecc52421130742f4d89acbcf77cd4 (#7023)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-02 21:24:19 +00:00
Ettore Di Giacinto
424acd66ad
feat(llama.cpp): allow to set cache-ram and ctx_shift (#7009)
* feat(llama.cpp): allow to set cache-ram and ctx_shift

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-11-02 17:33:29 +01:00
LocalAI [bot]
f85e2dd1b8
chore: ⬆️ Update ggml-org/llama.cpp to 2f68ce7cfd20e9e7098514bf730e5389b7bba908 (#6998)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-11-02 09:44:37 +01:00
LocalAI [bot]
9ecfdc5938
chore: ⬆️ Update ggml-org/llama.cpp to 31c511a968348281e11d590446bb815048a1e912 (#6970)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-31 21:04:53 +00:00
LocalAI [bot]
0ddb2e8dcf
chore: ⬆️ Update ggml-org/llama.cpp to 4146d6a1a6228711a487a1e3e9ddd120f8d027d7 (#6945)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-31 14:51:03 +00:00
LocalAI [bot]
1e5b9135df
chore: ⬆️ Update ggml-org/llama.cpp to 16724b5b6836a2d4b8936a5824d2ff27c52b4517 (#6925)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-30 21:07:33 +00:00
LocalAI [bot]
dd21a0d2f9
chore: ⬆️ Update ggml-org/llama.cpp to 3464bdac37027c5e9661621fc75ffcef3c19c6ef (#6896)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-30 14:17:58 +01:00
LocalAI [bot]
fb825a2708
chore: ⬆️ Update ggml-org/llama.cpp to 851553ea6b24cb39fd5fd188b437d777cb411de8 (#6869)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-29 08:16:55 +01:00
LocalAI [bot]
e13cb8346d
chore: ⬆️ Update ggml-org/llama.cpp to 5a4ff43e7dd049e35942bc3d12361dab2f155544 (#6841)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-28 08:48:21 +01:00
LocalAI [bot]
8225697139
chore: ⬆️ Update ggml-org/llama.cpp to bbac6a26b2bd7f7c1f0831cb1e7b52734c66673b (#6783)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-27 08:45:14 +01:00
LocalAI [bot]
192589a17f
chore: ⬆️ Update ggml-org/llama.cpp to 5d195f17bc60eacc15cfb929f9403cf29ccdf419 (#6757)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-25 21:14:43 +00:00
LocalAI [bot]
ed4ac0b61e
chore: ⬆️ Update ggml-org/llama.cpp to 55945d2ef51b93821d4b6f4a9b994393344a90db (#6729)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-24 21:11:56 +00:00
LocalAI [bot]
b66bd2706f
chore: ⬆️ Update ggml-org/llama.cpp to 0bf47a1dbba4d36f2aff4e8c34b06210ba34e688 (#6703)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-23 21:10:51 +00:00
Chakib Benziane
32c0ab3a7f
fix: properly terminate llama.cpp kv_overrides array with empty key + updated doc (#6672)
* fix: properly terminate kv_overrides array with empty key

The llama model loading function expects KV overrides to be terminated
with an empty key (key[0] == 0). Previously, the kv_overrides vector was
not being properly terminated, causing an assertion failure.

This commit ensures that after parsing all KV override strings, we add a
final terminating entry with an empty key to satisfy the C-style array
termination requirement. This fixes the assertion error and allows the
model to load correctly with custom KV overrides.

Fixes #6643

- Also included a reference to the usage of the `overrides` option in
  the advanced-usage section.

Signed-off-by: blob42 <contact@blob42.xyz>

* doc: document the `overrides` option

---------

Signed-off-by: blob42 <contact@blob42.xyz>
2025-10-23 09:31:55 +02:00
LocalAI [bot]
24ce79a67c
chore: ⬆️ Update ggml-org/llama.cpp to a2e0088d9242bd9e57f8b852b05a6e47843b5a45 (#6676)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-22 21:05:27 +00:00
LocalAI [bot]
7a3d9ee5c1
chore: ⬆️ Update ggml-org/llama.cpp to 03792ad93609fc67e41041c6347d9aa14e5e0d74 (#6651)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-21 21:15:27 +00:00
LocalAI [bot]
4b30846d57
chore: ⬆️ Update ggml-org/llama.cpp to 84bf3c677857279037adf67cdcfd89eaa4ca9281 (#6621)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-21 09:22:03 +02:00
LocalAI [bot]
69adc46936
chore: ⬆️ Update ggml-org/llama.cpp to cec5edbcaec69bbf6d5851cabce4ac148be41701 (#6576)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-19 21:31:47 +00:00
LocalAI [bot]
f94b89c1b5
chore: ⬆️ Update ggml-org/llama.cpp to ee09828cb057460b369576410601a3a09279e23c (#6550)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-18 21:09:46 +00:00
LocalAI [bot]
cce185b345
chore: ⬆️ Update ggml-org/llama.cpp to 66b0dbcb2d462e7b70ba5a69ee8c3899ac2efb1c (#6520)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-17 21:14:57 +00:00
LocalAI [bot]
7bac49fb87
chore: ⬆️ Update ggml-org/llama.cpp to 1bb4f43380944e94c9a86e305789ba103f5e62bd (#6488)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-17 09:01:11 +02:00
LocalAI [bot]
9680a0b0fe
chore: ⬆️ Update ggml-org/llama.cpp to 466c1911ab736f0b7366127edee99f8ee5687417 (#6463)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-15 23:21:35 +02:00
LocalAI [bot]
7ed3666d2e
chore: ⬆️ Update ggml-org/llama.cpp to fa882fd2b1bcb663de23af06fdc391489d05b007 (#6454)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-14 21:08:17 +00:00
LocalAI [bot]
2e2e89e499
chore: ⬆️ Update ggml-org/llama.cpp to e60f241eacec42d3bd7c9edd37d236ebf35132a8 (#6452)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-14 09:06:39 +02:00
LocalAI [bot]
3a8fbb698e
chore: ⬆️ Update ggml-org/llama.cpp to a31cf36ad946a13b3a646bf0dadf2a481e89f944 (#6440)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-13 07:54:03 +02:00
LocalAI [bot]
c856d7dc73
chore: ⬆️ Update ggml-org/llama.cpp to 11f0af5504252e453d57406a935480c909e3ff37 (#6437)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-12 09:02:31 +02:00
LocalAI [bot]
fa6bbd9fa2
chore: ⬆️ Update ggml-org/llama.cpp to e60f01d941bc5b7fae62dd57fee4cec76ec0ea6e (#6434)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-11 09:30:48 +02:00
Ettore Di Giacinto
cd1e1124ea
fix(llama.cpp): correctly set grammar triggers (#6432)
* fix(llama.cpp): correctly set grammar triggers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Do not enable lazy by default

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-10-10 19:50:17 +02:00
Ettore Di Giacinto
791bc769c1
chore(deps): bump llama.cpp to '1deee0f8d494981c32597dca8b5f8696d399b0f2' (#6421)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-10-10 09:51:22 +02:00
LocalAI [bot]
336257cc3c
chore: ⬆️ Update ggml-org/llama.cpp to 9d0882840e6c3fb62965d03af0e22880ea90e012 (#6410)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-09 08:17:10 +02:00
LocalAI [bot]
5e1d809904
chore: ⬆️ Update ggml-org/llama.cpp to aeaf8a36f06b5810f5ae4bbefe26edb33925cf5e (#6408)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-08 08:01:08 +02:00
LocalAI [bot]
6f17c260a7
chore: ⬆️ Update ggml-org/llama.cpp to 3df2244df40c67dfd6ad548b40ccc507a066af2b (#6401)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-07 08:44:02 +02:00
LocalAI [bot]
d4d42740c8
chore: ⬆️ Update ggml-org/llama.cpp to ca71fb9b368e3db96e028f80c4c9df6b6b370edd (#6385)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-06 08:24:38 +02:00
LocalAI [bot]
6b2c8277c2
chore: ⬆️ Update ggml-org/llama.cpp to 86df2c9ae4f2f1ee63d2558a9dc797b98524639b (#6382)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-05 08:52:24 +02:00
LocalAI [bot]
6d5d3ebcf6
chore: ⬆️ Update ggml-org/llama.cpp to 128d522c04286e019666bd6ee4d18e3fbf8772e2 (#6379)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-04 19:00:50 +02:00
LocalAI [bot]
dd927c36f6
chore: ⬆️ Update ggml-org/llama.cpp to d64c8104f090b27b1f99e8da5995ffcfa6b726e2 (#6371)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-02 21:09:00 +00:00
LocalAI [bot]
052f42e926
chore: ⬆️ Update ggml-org/llama.cpp to 1fe4e38cc20af058ed320bd46cac934991190056 (#6368)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-10-02 16:29:57 +02:00
LocalAI [bot]
04fecd634a
chore: ⬆️ Update ggml-org/llama.cpp to b2ba81dbe07b6dbea9c96b13346c66973dede32c (#6366)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-09-30 21:13:23 +00:00
LocalAI [bot]
33c14198db
chore: ⬆️ Update ggml-org/llama.cpp to 5f7e166cbf7b9ca928c7fad990098ef32358ac75 (#6355)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-09-30 14:41:16 +02:00
LocalAI [bot]
dca685f784
chore: ⬆️ Update ggml-org/llama.cpp to bd0af02fc96c2057726f33c0f0daf7bb8f3e462a (#6352)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-09-28 21:08:50 +00:00
LocalAI [bot]
84ebf2a2c9
chore: ⬆️ Update ggml-org/llama.cpp to 4807e8f96a61b2adccebd5e57444c94d18de7264 (#6350)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-09-28 00:33:46 +02:00
Ettore Di Giacinto
ce5662ba90
chore(deps): bump llama.cpp to '72b24d96c6888c609d562779a23787304ae4609c' (#6349)
* chore(deps): bump llama.cpp to '72b24d96c6888c609d562779a23787304ae4609c'

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Disable OPENSSL (just introduced upstream)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-09-27 13:55:51 +02:00
Ettore Di Giacinto
9878f27813
chore(deps): bump llama.cpp to '835b2b915c52bcabcd688d025eacff9a07b65f52' (#6347)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-09-26 23:26:14 +02:00
jongames
f2b9452ec4
fix: reranking models limited to 512 tokens in llama.cpp backend (#6344)
Fix reranking models being limited to 512 tokens input in llama.cpp backend

Signed-off-by: JonGames <18472148+jongames@users.noreply.github.com>
2025-09-25 23:32:07 +00:00
LocalAI [bot]
238c68c57b
chore: ⬆️ Update ggml-org/llama.cpp to 4ae88d07d026e66b41e85afece74e88af54f4e66 (#6339)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-09-25 08:47:02 +02:00
LocalAI [bot]
737248256e
chore: ⬆️ Update ggml-org/llama.cpp to 1d0125bcf1cbd7195ad0faf826a20bc7cec7d3f4 (#6335)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-09-22 21:13:34 +00:00
LocalAI [bot]
6afcb932b7
chore: ⬆️ Update ggml-org/llama.cpp to da30ab5f8696cabb2d4620cdc0aa41a298c54fd6 (#6321)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-09-21 21:28:27 +00:00
LocalAI [bot]
e74ade9ebb
chore: ⬆️ Update ggml-org/llama.cpp to 7f766929ca8e8e01dcceb1c526ee584f7e5e1408 (#6319)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-09-20 21:05:25 +00:00
LocalAI [bot]
75eb98f8bd
chore: ⬆️ Update ggml-org/llama.cpp to f432d8d83e7407073634c5e4fd81a3d23a10827f (#6316)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-09-20 09:41:45 +02:00
LocalAI [bot]
ae3d8fb0c4
chore: ⬆️ Update ggml-org/llama.cpp to 3edd87cd055a45d885fa914d879d36d33ecfc3e1 (#6308)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-09-18 21:09:14 +00:00
LocalAI [bot]
902e47f0b0
chore: ⬆️ Update ggml-org/llama.cpp to 0320ac5264279d74f8ee91bafa6c90e9ab9bbb91 (#6306)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-09-18 09:27:18 +02:00
LocalAI [bot]
e4ac7b14a3
chore: ⬆️ Update ggml-org/llama.cpp to 8ff206097c2bf3ca1c7aa95f9d6db779fc7bdd68 (#6292)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-09-16 21:09:47 +00:00
LocalAI [bot]
e89b5cc0e3
chore: ⬆️ Update ggml-org/llama.cpp to b907255f4bd169b0dc7dca9553b4c54af5170865 (#6287)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-09-16 08:10:37 +02:00
LocalAI [bot]
2a18206033
chore: ⬆️ Update ggml-org/llama.cpp to 6c019cb04e86e2dacfe62ce7666c64e9717dde1f (#6265)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-09-14 21:19:41 +00:00