LocalAI

mirror of https://github.com/mudler/LocalAI synced 2026-05-24 09:28:23 +00:00

Ettore Di Giacinto b4e30692a2 feat(backends): add sglang (#9359)	2026-04-16 22:40:56 +02:00

* feat(backends): add sglang

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(sglang): force AVX-512 CXXFLAGS and disable CI e2e job

sgl-kernel's shm.cpp uses __m512 AVX-512 intrinsics unconditionally; -march=native fails on CI runners without AVX-512 in /proc/cpuinfo. Force -march=sapphirerapids so the build always succeeds, matching sglang upstream's docker/xeon.Dockerfile recipe.

The resulting binary still requires an AVX-512 capable CPU at runtime, so disable tests-sglang-grpc in test-extra.yml for the same reason tests-vllm-grpc is disabled. Local runs with make test-extra-backend-sglang still work on hosts with the right SIMD baseline.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(sglang): patch CMakeLists.txt instead of CXXFLAGS for AVX-512

CXXFLAGS with -march=sapphirerapids was being overridden by add_compile_options(-march=native) in sglang's CPU CMakeLists.txt, since CMake appends those flags after CXXFLAGS. Sed-patch the CMakeLists.txt directly after cloning to replace -march=native.

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
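
The commit message above describes two ideas: replacing -march=native in sglang's CPU CMakeLists.txt, and the fact that the resulting binary still needs AVX-512 at runtime. The following is only a rough sketch of both, written in Python instead of the sed call the commit mentions. The function names patch_march and has_avx512 are hypothetical, the avx512f flag is used here as a stand-in for "AVX-512 capable", and none of this is taken from the repository's actual build scripts.

```python
import re

# Target architecture named in the commit message; treating it as a
# parameter here is an assumption made for illustration.
TARGET_MARCH = "sapphirerapids"


def patch_march(cmake_text: str, target: str = TARGET_MARCH) -> str:
    """Replace every -march=native with -march=<target> in CMake source text."""
    return re.sub(r"-march=native\b", f"-march={target}", cmake_text)


def has_avx512(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists avx512f."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Everything after the first colon is the space-separated flag list.
            _, _, flags = line.partition(":")
            if "avx512f" in flags.split():
                return True
    return False
```

On a Linux host, has_avx512(open("/proc/cpuinfo").read()) would give a quick indication of whether a binary built this way could run locally; patching the file text rather than setting CXXFLAGS mirrors the reasoning in the last fix, where flags added inside CMakeLists.txt took precedence.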
application	feat(ux): backend management enhancement (#9325)	2026-04-12 00:35:22 +02:00
backend	feat: wire transcription for llama.cpp, add streaming support (#9353)	2026-04-14 16:13:40 +02:00
cli	feat(ux): backend management enhancement (#9325)	2026-04-12 00:35:22 +02:00
clients	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
config	feat(vllm): parity with llama.cpp backend (#9328)	2026-04-13 11:00:29 +02:00
dependencies_manager	feat(ui): move to React for frontend (#8772)	2026-03-05 21:47:12 +01:00
explorer	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
gallery	feat: refactor shared helpers and enhance MLX backend functionality (#9335)	2026-04-13 18:44:03 +02:00
http	feat(backends): add sglang (#9359)	2026-04-16 22:40:56 +02:00
p2p	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
schema	feat: wire transcription for llama.cpp, add streaming support (#9353)	2026-04-14 16:13:40 +02:00
services	feat: wire transcription for llama.cpp, add streaming support (#9353)	2026-04-14 16:13:40 +02:00
startup	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
templates	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00
trace	feat: add distributed mode (#9124)	2026-03-30 00:47:27 +02:00