mirror of https://github.com/mudler/LocalAI synced 2026-05-24 09:28:23 +00:00

Ettore Di Giacinto e370318bd7 fix(vllm): seed pybind11 for fastsafetensors build under --no-build-isolation		2026-04-28 20:08:26 +00:00

fastsafetensors==0.3 (transitive dep of vllm) imports pybind11 in setup.py without declaring it in build-system.requires. With --no-build-isolation it has to already exist in the venv, otherwise the wheel build fails with ModuleNotFoundError on arm64 L4T CUDA 13 (and any other profile that picks up vllm 0.20.0).

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

backend.py	feat(vllm): parity with llama.cpp backend (#9328)	2026-04-13 11:00:29 +02:00
install.sh	feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang (#9553)	2026-04-25 12:26:29 +02:00
Makefile	feat(mlx): add mlx backend (#6049)	2025-08-22 08:42:29 +02:00
package.sh	feat(vllm): parity with llama.cpp backend (#9328)	2026-04-13 11:00:29 +02:00
README.md	refactor: move backends into the backends directory (#1279)	2023-11-13 22:40:16 +01:00
requirements-after.txt	feat(vllm): parity with llama.cpp backend (#9328)	2026-04-13 11:00:29 +02:00
requirements-cpu-after.txt	feat(vllm): parity with llama.cpp backend (#9328)	2026-04-13 11:00:29 +02:00
requirements-cpu.txt	feat(vllm): parity with llama.cpp backend (#9328)	2026-04-13 11:00:29 +02:00
requirements-cublas12-after.txt	fix(vllm): drop flash-attn wheel to avoid torch 2.10 ABI mismatch (#9557)	2026-04-25 15:38:13 +00:00
requirements-cublas12.txt	fix(vllm): drop flash-attn wheel to avoid torch 2.10 ABI mismatch (#9557)	2026-04-25 15:38:13 +00:00
requirements-cublas13-after.txt	feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang (#9553)	2026-04-25 12:26:29 +02:00
requirements-cublas13.txt	feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang (#9553)	2026-04-25 12:26:29 +02:00
requirements-hipblas-after.txt	feat(vllm): parity with llama.cpp backend (#9328)	2026-04-13 11:00:29 +02:00
requirements-hipblas.txt	feat(rocm): bump to 7.x (#9323)	2026-04-12 08:51:30 +02:00
requirements-install.txt	fix(vllm): seed pybind11 for fastsafetensors build under --no-build-isolation	2026-04-28 20:08:26 +00:00
requirements-intel-after.txt	feat(vllm): parity with llama.cpp backend (#9328)	2026-04-13 11:00:29 +02:00
requirements-intel.txt	feat(qwen-tts): add Qwen-tts backend (#8163)	2026-01-23 15:18:41 +01:00
requirements-l4t13-after.txt	feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang (#9553)	2026-04-25 12:26:29 +02:00
requirements-l4t13.txt	feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang (#9553)	2026-04-25 12:26:29 +02:00
requirements.txt	chore(deps): bump grpcio from 1.78.1 to 1.80.0 in /backend/python/vllm (#9177)	2026-03-31 10:10:17 +02:00
run.sh	feat: Add backend gallery (#5607)	2025-06-15 14:56:52 +02:00
test.py	feat(vllm): parity with llama.cpp backend (#9328)	2026-04-13 11:00:29 +02:00
test.sh	feat: Add backend gallery (#5607)	2025-06-15 14:56:52 +02:00

README.md

Creating a separate environment for the vllm project

make vllm
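
The most recent commit to this directory notes that fastsafetensors imports pybind11 in its setup.py without declaring it, so a build run with --no-build-isolation fails unless pybind11 is already in the venv. The sketch below is one possible pre-flight check for that situation, not part of this backend: the function name `missing_build_modules` is hypothetical, and it only tests whether top-level modules are importable in the current interpreter.

```python
import importlib.util


def missing_build_modules(required):
    """Return the top-level module names in `required` that cannot be found.

    Hypothetical helper for illustration; it does not install anything.
    """
    return [name for name in required if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # pybind11 is the module named in the commit message; extend the list
    # with other undeclared build-time imports if a build reports them.
    missing = missing_build_modules(["pybind11"])
    if missing:
        print("install before building without isolation:", ", ".join(missing))
    else:
        print("build-time modules present")
```

Run with the venv's interpreter, this reports whether pybind11 would be visible to a setup.py executed without build isolation. It is only a diagnostic; the commit itself addresses the problem by seeding pybind11 through requirements-install.txt.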