LocalAI/backend/python/vllm
Ettore Di Giacinto e370318bd7 fix(vllm): seed pybind11 for fastsafetensors build under --no-build-isolation
fastsafetensors==0.3 (a transitive dependency of vllm) imports pybind11
in setup.py without declaring it in build-system.requires. With
--no-build-isolation, pybind11 must already be installed in the venv;
otherwise the wheel build fails with ModuleNotFoundError on arm64 L4T
CUDA 13 (and on any other profile that picks up vllm 0.20.0).

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-28 20:08:26 +00:00
backend.py feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
install.sh feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang (#9553) 2026-04-25 12:26:29 +02:00
Makefile feat(mlx): add mlx backend (#6049) 2025-08-22 08:42:29 +02:00
package.sh feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
README.md refactor: move backends into the backends directory (#1279) 2023-11-13 22:40:16 +01:00
requirements-after.txt feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
requirements-cpu-after.txt feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
requirements-cpu.txt feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
requirements-cublas12-after.txt fix(vllm): drop flash-attn wheel to avoid torch 2.10 ABI mismatch (#9557) 2026-04-25 15:38:13 +00:00
requirements-cublas12.txt fix(vllm): drop flash-attn wheel to avoid torch 2.10 ABI mismatch (#9557) 2026-04-25 15:38:13 +00:00
requirements-cublas13-after.txt feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang (#9553) 2026-04-25 12:26:29 +02:00
requirements-cublas13.txt feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang (#9553) 2026-04-25 12:26:29 +02:00
requirements-hipblas-after.txt feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
requirements-hipblas.txt feat(rocm): bump to 7.x (#9323) 2026-04-12 08:51:30 +02:00
requirements-install.txt fix(vllm): seed pybind11 for fastsafetensors build under --no-build-isolation 2026-04-28 20:08:26 +00:00
requirements-intel-after.txt feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
requirements-intel.txt feat(qwen-tts): add Qwen-tts backend (#8163) 2026-01-23 15:18:41 +01:00
requirements-l4t13-after.txt feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang (#9553) 2026-04-25 12:26:29 +02:00
requirements-l4t13.txt feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang (#9553) 2026-04-25 12:26:29 +02:00
requirements.txt chore(deps): bump grpcio from 1.78.1 to 1.80.0 in /backend/python/vllm (#9177) 2026-03-31 10:10:17 +02:00
run.sh feat: Add backend gallery (#5607) 2025-06-15 14:56:52 +02:00
test.py feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
test.sh feat: Add backend gallery (#5607) 2025-06-15 14:56:52 +02:00

Creating a separate environment for the vllm project

make vllm
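
The latest commit message above notes that a build run with --no-build-isolation only succeeds if the modules imported by a package's setup.py (here pybind11) are already present in the venv. As a minimal sketch of that idea, the following pre-flight check reports which build-time modules are not importable in the current environment. The helper name `missing_build_deps` and the module list are illustrative assumptions and are not part of this backend's scripts.

```python
import importlib.util


def missing_build_deps(modules):
    """Return the names in `modules` that cannot be imported here.

    Hypothetical helper: it only checks top-level module names and does
    not install anything.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # pybind11 is the module named in the commit message; extend the list
    # if other build-time imports turn out to be undeclared (assumption).
    missing = missing_build_deps(["pybind11"])
    if missing:
        print("install before building without isolation:", ", ".join(missing))
    else:
        print("build-time imports available")
```

Running a check like this inside the venv before `make vllm` would surface a missing module earlier than the ModuleNotFoundError raised during the wheel build.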