LocalAI

mirror of https://github.com/mudler/LocalAI synced 2026-05-24 09:28:23 +00:00

Author	SHA1	Message	Date
LocalAI [bot]	f6c9c20911	chore: ⬆️ Update ggml-org/llama.cpp to `2b2babd1243c67ca811c0a5852cedf92b1a20024` (#9747 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-05-10 21:17:38 +02:00
LocalAI [bot]	6cbf69dc29	chore: ⬆️ Update ggml-org/llama.cpp to `1e5ad35d560b90a8ac447d149c8f8447ae1fcaa0` (#9739 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-05-10 00:06:29 +02:00
LocalAI [bot]	a91e718473	chore: ⬆️ Update ggml-org/llama.cpp to `00d56b11c3477b99bc18562dc1d1834f0d961778` (#9733 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-05-09 12:05:11 +02:00
LocalAI [bot]	4542833cb4	chore: ⬆️ Update ggml-org/llama.cpp to `9f5f0e689c9e977e5f23a27e344aa36082f44738` (#9724 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-09 10:18:05 +02:00
LocalAI [bot]	3b84582567	chore: ⬆️ Update ggml-org/llama.cpp to `05ff59cb57860cc992fc6dcede32c696efea711c` (#9714 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-08 01:44:17 +02:00
LocalAI [bot]	151d6c9cf0	chore: ⬆️ Update ggml-org/llama.cpp to `2496f9c14965c39589f53eea31bdb6d762b1d360` (#9698 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-07 08:29:27 +02:00
LocalAI [bot]	c9141098b6	chore: ⬆️ Update ggml-org/llama.cpp to `bbeb89d76c41bc250f16e4a6fefcc9b530d6e3f3` (#9676 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-05 23:45:54 +02:00
LocalAI [bot]	b88ddce0f3	chore: ⬆️ Update ggml-org/llama.cpp to `eff06702b2a52e1020ea009ebd86cb9f5acabab5` (#9637 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-05 09:52:28 +02:00
Russell Sim	18e039f305	fix(ci): fix AMDGPU_TARGETS empty-string bypass in hipblas builds (#9626 ) * fix(ci): fix AMDGPU_TARGETS empty-string bypass in hipblas builds `399c1dec` wired amdgpu-targets through the backend_build workflow_call interface, intending the input's default value to cover matrix entries that don't specify targets. However, GitHub Actions only applies a workflow_call input default when the caller omits the input entirely. When backend.yml passes `amdgpu-targets: ${{ matrix.amdgpu-targets }}` and the matrix entry has no amdgpu-targets key, the expression evaluates to an empty string, which is treated as an explicit value, bypassing the default. The result is Docker receiving AMDGPU_TARGETS="" which in turn causes Make's ?= default to be skipped (since the variable is already set in the environment, even to empty), and cmake gets -DAMDGPU_TARGETS= with no targets, so the HIP backend compiles for an indeterminate target rather than the intended GPU list. Fix this at three levels: 1. backend.yml: use a || fallback in the expression so that an undefined matrix.amdgpu-targets never reaches the reusable workflow as an empty string. The target list is the canonical default and lives here. 2. backend_build.yml: remove the now-misleading default value from the input declaration. The default never fired due to the above bug, so keeping it implied a guarantee that didn't exist. 3. backend/cpp/llama-cpp/Makefile: add an explicit $(error ...) guard after the ?= assignment so that if AMDGPU_TARGETS is empty (whether from environment or any future CI wiring mistake) the build fails immediately with a clear message rather than silently producing a binary compiled for an unknown GPU target. Assisted-by: Claude Code:claude-sonnet-4-6 Signed-off-by: Russell Sim <rsl@simopolis.xyz> * fix(build): plumb AMDGPU_TARGETS through to Docker builds The docker-build-backend Makefile macro and Dockerfile.golang did not pass AMDGPU_TARGETS to the inner make invocation, so hipblas builds always used the backend Makefile's hardcoded default GPU targets regardless of what was specified via environment or CI inputs. Signed-off-by: Russell Sim <rsl@simopolis.xyz> --------- Signed-off-by: Russell Sim <rsl@simopolis.xyz>	2026-05-02 15:53:14 +02:00
LocalAI [bot]	9c4c3f9d8f	chore: ⬆️ Update ggml-org/llama.cpp to `beb42fffa45eded44804a1fd4916146222371581` (#9624 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-01 02:02:56 +02:00
Ettore Di Giacinto	c02a50f2ab	feat(llama-cpp): bump to d775992 and adapt to spec params refactor (#9618 ) Bumps backend/cpp/llama-cpp/Makefile LLAMA_VERSION from 665abc6 to d775992, picking up upstream PR ggml-org/llama.cpp#22397 which splits common_params_speculative into nested draft / ngram_simple / ngram_mod sub-structs. Renames every grpc-server.cpp reference to match: speculative.mparams_dft.path -> speculative.draft.mparams.path; speculative.{n_max,n_min} -> speculative.draft.{n_max,n_min}; speculative.{p_min,p_split} -> speculative.draft.{p_min,p_split}; speculative.{n_gpu_layers,n_ctx} -> speculative.draft.{n_gpu_layers,n_ctx}; speculative.ngram_size_n -> speculative.ngram_simple.size_n; speculative.ngram_size_m -> speculative.ngram_simple.size_m; speculative.ngram_min_hits -> speculative.ngram_simple.min_hits. The "speculative.n_max" JSON key sent to the upstream server stays unchanged: server-task.cpp still reads it and routes the value into draft.n_max internally. The turboquant fork (TheTom/llama-cpp-turboquant @ 11a241d) branched before #22397 and still exposes the flat layout. Since turboquant reuses the shared backend/cpp/llama-cpp/grpc-server.cpp, extend patch-grpc-server.sh with an idempotent sed block that reverts the ten field references back to the legacy flat names on the build copy only; the original under backend/cpp/llama-cpp/ stays compiling against vanilla upstream. Drop the block once the fork rebases. ik-llama-cpp has its own grpc-server.cpp with no speculative refs (0/2661 lines), so it is unaffected. Validated locally with `make docker-build-llama-cpp` (avx, avx2, avx512, fallback, grpc + rpc-server all built; image exported). Assisted-by: Claude:claude-opus-4-7 [Bash Read Edit] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-30 08:44:43 +02:00
LocalAI [bot]	8e50066fa2	chore: ⬆️ Update ggml-org/llama.cpp to `665abc609740d397d30c0d8ef4157dbf900bd1a3` (#9584 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-28 08:43:33 +02:00
LocalAI [bot]	05e94bd9e7	chore: ⬆️ Update ggml-org/llama.cpp to `f53577432541bb9edc1588c4ef45c66bf07e4468` (#9577 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-27 08:57:24 +02:00
LocalAI [bot]	d9cb0d6133	chore: ⬆️ Update ggml-org/llama.cpp to `dcad77cc3b0865153f486327064fb0320a57a476` (#9572 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-26 12:38:35 +02:00
LocalAI [bot]	47cc3dc8d7	chore: ⬆️ Update ggml-org/llama.cpp to `361fe72acb7b9bd79059cc177cbeda99b35b5db9` (#9548 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-25 08:58:27 +02:00
LocalAI [bot]	7c1934b183	chore: ⬆️ Update ggml-org/llama.cpp to `187a45637054881ecacf17f8e2f6f8f2ba7df1c7` (#9520 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-24 09:17:06 +02:00
Ettore Di Giacinto	ed648b3b4e	fix(llama-cpp): include server-chat.cpp in grpc-server translation unit (#9511 ) * fix(llama-cpp): include server-chat.cpp in grpc-server translation unit Upstream llama.cpp refactor (ggml-org/llama.cpp#20690) moved the OAI/Anthropic/Responses and transcription conversion helpers out of server-common.cpp into a new server-chat.cpp, and server-task.cpp and server-context.cpp now call those symbols (convert_transcriptions_to_chatcmpl, server_chat_convert_responses_to_chatcmpl, server_chat_convert_anthropic_to_oai, server_chat_msg_diff_to_json_oaicompat) via server-chat.h. grpc-server.cpp builds as a single translation unit by #include-ing the upstream .cpp files directly. Without including server-chat.cpp, the declarations are satisfied at compile time via server-chat.h but the link step fails with undefined references once LLAMA_VERSION crosses the refactor commit (134d6e54). Guard the include with __has_include so the same source stays buildable on older LLAMA_VERSION pins that predate the refactor (where prepare.sh won't copy server-chat.cpp into tools/grpc-server/). Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(llama-cpp): bump LLAMA_VERSION to 0d0764dfd Bump to ggml-org/llama.cpp@0d0764dfd2. Paired with the preceding grpc-server server-chat.cpp include so the refactor at 134d6e54 links cleanly. Supersedes PR #9494. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-23 14:59:39 +02:00
LocalAI [bot]	cd7b035716	chore: ⬆️ Update ggml-org/llama.cpp to `5a4cd6741fc33227cdacb329f355ab21f8481de2` (#9479 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-22 08:58:19 +02:00
LocalAI [bot]	8bb1e8f21f	chore: ⬆️ Update ggml-org/llama.cpp to `cf8b0dbda9ac0eac30ee33f87bc6702ead1c4664` (#9448 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-21 11:15:45 +02:00
LocalAI [bot]	babbbc6ec8	chore: ⬆️ Update ggml-org/llama.cpp to `4eac5b45095a4e8a1ff1cce4f6d030e0872fb4ad` (#9429 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-19 23:39:19 +02:00
LocalAI [bot]	6e49dba27c	chore: ⬆️ Update ggml-org/llama.cpp to `4f02d4733934179386cbc15b3454be26237940bb` (#9415 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-19 09:26:05 +02:00
Keith Mattix II	8839a71c87	fix(rocm): add gfx1151 support and expose AMDGPU_TARGETS build-arg (#9410 ) Add gfx1151 (AMD Strix Halo / Ryzen AI MAX) to the default AMDGPU_TARGETS list in the llama-cpp backend Makefile. ROCm 7.2.1 ships with gfx1151 Tensile libraries, so this architecture should be included in default builds. Also expose AMDGPU_TARGETS as an ARG/ENV in Dockerfile.llama-cpp so that users building for non-default GPU architectures can override the target list via --build-arg AMDGPU_TARGETS=<arch>. Previously, passing -DAMDGPU_TARGETS=<arch> through CMAKE_ARGS was silently overridden by the Makefile's own append of the default target list. Fixes #9374 Signed-off-by: Keith Mattix <keithmattix2@gmail.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-04-18 20:39:40 +02:00
Ettore Di Giacinto	c49feb546f	fix(llama-cpp): rename linked target common -> llama-common (#9408 ) Upstream llama.cpp (45cac7ca) renamed the CMake library target `common` to `llama-common`. Linking the old name caused `target_include_directories(... PUBLIC .)` from the common/ dir to not propagate, so `#include "common.h"` failed when building grpc-server.	2026-04-18 00:42:05 +02:00
LocalAI [bot]	7dbd9c056a	chore: ⬆️ Update ggml-org/llama.cpp to `4fbdabdc61c04d1262b581e1b8c0c3b119f688ff` (#9381 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-17 08:13:04 +02:00
Ettore Di Giacinto	5837b14888	chore: ⬆️ Update TheTom/llama-cpp-turboquant to `45f8a066ed5f5bb38c695cec532f6cef9f4efa9d` (#9385 ) chore: ⬆️ Update TheTom/llama-cpp-turboquant to `45f8a066ed5f5bb38c695cec532f6cef9f4efa9d` Drop 0002-ggml-rpc-bump-op-count-to-97.patch; the fork now has GGML_OP_COUNT == 97 and RPC_PROTO_PATCH_VERSION 2 upstream. Fetch all tags in backend/cpp/llama-cpp/Makefile so tag-only commits (the new turboquant pin is reachable only through the tag feature-turboquant-kv-cache-b8821-45f8a06) can be checked out.	2026-04-17 08:12:21 +02:00
LocalAI [bot]	96cd561d9d	chore: ⬆️ Update ggml-org/llama.cpp to `b3d758750a268bf93f084ccfa3060fb9a203192a` (#9370 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-16 01:12:39 +02:00
LocalAI [bot]	62862ca06b	chore: ⬆️ Update ggml-org/llama.cpp to `fae3a28070fe4026f87bd6a544aba1b2d1896566` (#9357 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-15 01:25:41 +02:00
Ettore Di Giacinto	87e6de1989	feat: wire transcription for llama.cpp, add streaming support (#9353 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-14 16:13:40 +02:00
LocalAI [bot]	906acba8db	chore: ⬆️ Update ggml-org/llama.cpp to `e97492369888f5311e4d1f3beb325a36bbed70e9` (#9347 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-14 08:54:25 +02:00
LocalAI [bot]	ea32b8953f	chore: ⬆️ Update ggml-org/llama.cpp to `1e9d771e2c2f1113a5ebdd0dc15bafe57dce64be` (#9330 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-13 09:42:18 +02:00
Ettore Di Giacinto	151ad271f2	feat(rocm): bump to 7.x (#9323 ) feat(rocm): bump to 7.2.1 Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-12 08:51:30 +02:00
LocalAI [bot]	6fbda277c5	chore: ⬆️ Update ggml-org/llama.cpp to `ff5ef8278615a2462b79b50abdf3cc95cfb31c6f` (#9319 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-11 23:15:23 +02:00
LocalAI [bot]	62a674ce12	chore: ⬆️ Update ggml-org/llama.cpp to `e62fa13c2497b2cd1958cb496e9489e86bbd5182` (#9312 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-11 08:39:10 +02:00
LocalAI [bot]	d4cd6c284f	chore: ⬆️ Update ggml-org/llama.cpp to `d132f22fc92f36848f7ccf2fc9987cd0b0120825` (#9302 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-10 08:46:45 +02:00
Ettore Di Giacinto	2b05420f95	chore(llama.cpp): bump to 'd12cc3d1ca6bba741cd77887ac9c9ee18c8415c7' (#9282 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-09 08:12:05 +02:00
LocalAI [bot]	0526e60f8d	chore: ⬆️ Update ggml-org/llama.cpp to `66c4f9ded01b29d9120255be1ed8d5835bcbb51d` (#9269 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-08 08:27:38 +02:00
LocalAI [bot]	bccaba1f66	chore: ⬆️ Update ggml-org/llama.cpp to `d0a6dfeb28a09831d904fc4d910ddb740da82834` (#9259 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-07 00:38:36 +02:00
LocalAI [bot]	0dda4fe6f0	chore: ⬆️ Update ggml-org/llama.cpp to `761797ffdf2ce3f118e82c663b1ad7d935fbd656` (#9243 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-06 10:52:38 +02:00
Ettore Di Giacinto	c5a840f6af	fix(reasoning): warm-up Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-04 20:25:24 +00:00
LocalAI [bot]	7962dd16f7	chore: ⬆️ Update ggml-org/llama.cpp to `d006858316d4650bb4da0c6923294ccd741caefd` (#9215 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-04 09:44:39 +02:00
LocalAI [bot]	c0a023d13d	chore: ⬆️ Update ggml-org/llama.cpp to `a1cfb645307edc61a89e41557f290f441043d3c2` (#9203 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-03 08:30:15 +02:00
LocalAI [bot]	26f1b94f4d	chore: ⬆️ Update ggml-org/llama.cpp to `95a6ebabb277c4cc18247e7bc2a5502133caca63` (#9199 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-02 08:53:16 +02:00
LocalAI [bot]	cc5f33ce95	chore: ⬆️ Update ggml-org/llama.cpp to `0fcb3760b2b9a3a496ef14621a7e4dad7a8df90f` (#9196 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-01 00:48:40 +02:00
LocalAI [bot]	b0b37a472f	chore: ⬆️ Update ggml-org/llama.cpp to `08f21453aec846867b39878500d725a05bd32683` (#9190 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-31 09:27:08 +02:00
LocalAI [bot]	3d738164b7	chore: ⬆️ Update ggml-org/llama.cpp to `7c203670f8d746382247ed369fea7fbf10df8ae0` (#9160 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-30 08:27:26 +02:00
LocalAI [bot]	4c870288d9	chore: ⬆️ Update ggml-org/llama.cpp to `59d840209a5195c2f6e2e81b5f8339a0637b59d9` (#9144 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-28 18:18:06 +01:00
LocalAI [bot]	b86fa63f70	chore: ⬆️ Update ggml-org/llama.cpp to `a970515bdb0b1d09519106847660b0d0c84d2472` (#9137 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-26 07:56:41 +01:00
LocalAI [bot]	9bc68b2721	chore: ⬆️ Update ggml-org/llama.cpp to `9f102a1407ed5d73b8c954f32edab50f8dfa3f58` (#9127 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-25 07:52:14 +01:00
LocalAI [bot]	2ad8c149e0	chore: ⬆️ Update ggml-org/llama.cpp to `1772701f99dd3fc13f5783b282c2361eda8ca47c` (#9123 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-24 00:35:40 +01:00
LocalAI [bot]	31fcb1425d	chore: ⬆️ Update ggml-org/llama.cpp to `49bfddeca18e62fa3d39114a23e9fcbdf8a22388` (#9102 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-23 01:11:18 +01:00

301 commits