LocalAI

mirror of https://github.com/mudler/LocalAI synced 2026-05-24 09:28:23 +00:00


Ettore Di Giacinto 9787bee48b	2026-04-24 20:09:36 +00:00

fix(buun-llama-cpp): shim cudaMemcpy{To,From}Symbol + WARP_SIZE on fwht128 shuffles

Two more hipblas-only build failures in buun's fattn.cu, fixed under the same patches/ infrastructure:

1. cudaMemcpyToSymbol / cudaMemcpyFromSymbol: buun's Q² calibration + TCQ codebook upload paths call the symbol variants of cudaMemcpy. ggml/src/ggml-cuda/vendors/hip.h aliases every other cudaMemcpy* name (cudaMemcpy, cudaMemcpyAsync, cudaMemcpy2DAsync, …) but the symbol pair was never added. 15+ "use of undeclared identifier" errors across fattn.cu lines 40, 54, 74-76, 94, 100-101, 371, 883, 905, 954, 976, 1449, 1463. Add the two missing aliases alongside the existing memcpy block.

2. __shfl_xor_sync fwht128 calls: same 3-arg omission pattern as the earlier argmax top-K fix. Lines 512 (ggml_cuda_fwht128 intra-warp butterfly) and 536 (fwht128_store_half neighbor fetch) drop the width argument that hip.h:33 requires. Add WARP_SIZE.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
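
The second fix in the commit message above can be illustrated with a small host-side sketch. This is not buun's kernel and not the contents of hip.h: `emulated_shfl_xor`, `SHFL_XOR_SYNC`, `fwht_warp`, and the 32-lane `WARP_SIZE` are hypothetical names and assumptions chosen for illustration. The sketch only shows why a four-parameter function-like macro needs the width argument at every call site, using a Walsh-Hadamard butterfly over one emulated warp.

```cpp
#include <array>
#include <cassert>

// Assumption: a 32-lane warp. Real GPU targets may use a different warp size.
constexpr int WARP_SIZE = 32;
using Warp = std::array<float, WARP_SIZE>;

// Host-side stand-in for an XOR shuffle: lane i reads the value held by
// lane (i ^ lane_mask). Only the full-warp case is modelled here.
inline Warp emulated_shfl_xor(const Warp& var, int lane_mask, int width) {
    assert(width == WARP_SIZE);
    Warp out{};
    for (int lane = 0; lane < WARP_SIZE; ++lane) {
        out[lane] = var[lane ^ lane_mask];
    }
    return out;
}

// Hypothetical four-parameter macro in the style the commit message describes.
// Calling it with three arguments is a preprocessing error, which is the
// failure mode fixed by passing WARP_SIZE explicitly.
#define SHFL_XOR_SYNC(mask, var, lane_mask, width) \
    emulated_shfl_xor(var, lane_mask, width)

// Unnormalized Walsh-Hadamard transform across one emulated warp.
inline Warp fwht_warp(Warp x) {
    for (int offset = 1; offset < WARP_SIZE; offset <<= 1) {
        // Fixed form: the width argument is supplied.
        const Warp other = SHFL_XOR_SYNC(0xffffffffu, x, offset, WARP_SIZE);
        for (int lane = 0; lane < WARP_SIZE; ++lane) {
            // Upper partner of each pair takes the difference, lower takes the sum.
            x[lane] = (lane & offset) ? other[lane] - x[lane]
                                      : x[lane] + other[lane];
        }
    }
    return x;
}
```

Passing the width explicitly keeps the call valid whether the underlying definition is a macro with a fixed parameter count or a function with a defaulted last parameter.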
patches	fix(buun-llama-cpp): shim cudaMemcpy{To,From}Symbol + WARP_SIZE on fwht128 shuffles	2026-04-24 20:09:36 +00:00
apply-patches.sh	feat(backend): add buun-llama-cpp fork (DFlash + TCQ KV-cache)	2026-04-24 12:52:53 +00:00
Makefile	feat(backend): add buun-llama-cpp fork (DFlash + TCQ KV-cache)	2026-04-24 12:52:53 +00:00
package.sh	feat(backend): add buun-llama-cpp fork (DFlash + TCQ KV-cache)	2026-04-24 12:52:53 +00:00
patch-grpc-server.sh	fix(buun-llama-cpp): drop logit_bias_eog arg from params_from_json_cmpl	2026-04-24 12:52:53 +00:00
run.sh	feat(backend): add buun-llama-cpp fork (DFlash + TCQ KV-cache)	2026-04-24 12:52:53 +00:00