LocalAI/backend/python/vllm
eureka928 b19d7f23ed
fix(vllm): support both vLLM API versions and add grammar passthrough
- Handle both StructuredOutputsParams (vLLM latest) and
  GuidedDecodingParams (vLLM <=0.8.x) with graceful fallback
- Use the correct SamplingParams field name for each version
  (structured_outputs vs guided_decoding)
- Use 'json' parameter (not 'json_schema') matching both APIs
- Re-add grammar (GBNF/BNF) passthrough — both vLLM APIs accept
  a 'grammar' parameter handled by xgrammar which supports GBNF
- Priority: JSONSchema > json_object > Grammar

Ref: #6857
Signed-off-by: eureka928 <meobius123@gmail.com>
2026-04-13 13:52:05 +02:00
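
The version fallback and priority order described in the commit message above can be sketched roughly as follows. This is an illustrative sketch only, not the code in backend.py: the helper names `pick_constraint` and `sampling_kwargs` and the request argument names (`json_schema`, `json_object`, `grammar`) are hypothetical. It assumes that both parameter classes are importable from `vllm.sampling_params` and that both accept `json`, `json_object`, and `grammar` keyword arguments, as the commit message states.

```python
from typing import Any, Dict, Optional

# Resolve whichever structured-output class the installed vLLM exposes.
# Newer releases: StructuredOutputsParams, passed as `structured_outputs`.
# Older releases (<=0.8.x per the commit message): GuidedDecodingParams,
# passed as `guided_decoding`.
try:
    from vllm.sampling_params import StructuredOutputsParams as _ParamsCls
    _FIELD: Optional[str] = "structured_outputs"
except ImportError:
    try:
        from vllm.sampling_params import GuidedDecodingParams as _ParamsCls
        _FIELD = "guided_decoding"
    except ImportError:
        # Neither class is available: degrade to unconstrained decoding.
        _ParamsCls = None
        _FIELD = None


def pick_constraint(
    json_schema: Optional[str] = None,
    json_object: bool = False,
    grammar: Optional[str] = None,
) -> Optional[Dict[str, Any]]:
    """Apply the priority JSONSchema > json_object > Grammar (hypothetical helper)."""
    if json_schema:
        # Both APIs take the schema under `json`, not `json_schema`.
        return {"json": json_schema}
    if json_object:
        return {"json_object": True}
    if grammar:
        return {"grammar": grammar}
    return None


def sampling_kwargs(**request: Any) -> Dict[str, Any]:
    """Build the extra SamplingParams keyword argument, or {} if none applies."""
    constraint = pick_constraint(**request)
    if constraint is None or _ParamsCls is None:
        return {}
    return {_FIELD: _ParamsCls(**constraint)}
```

Under these assumptions, a caller would merge the result into its sampling parameters, for example `SamplingParams(temperature=0.7, **sampling_kwargs(grammar=gbnf_text))`, so the same call site works with either vLLM version.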
backend.py fix(vllm): support both vLLM API versions and add grammar passthrough 2026-04-13 13:52:05 +02:00
install.sh feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
Makefile feat(mlx): add mlx backend (#6049) 2025-08-22 08:42:29 +02:00
package.sh feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
README.md refactor: move backends into the backends directory (#1279) 2023-11-13 22:40:16 +01:00
requirements-after.txt feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
requirements-cpu-after.txt feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
requirements-cpu.txt feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
requirements-cublas12-after.txt feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
requirements-cublas12.txt Revert "chore(deps): bump torch from 2.7.0 to 2.7.1+xpu in /backend/python/vllm in the pip group across 1 directory" (#8367) 2026-02-03 08:34:54 +01:00
requirements-hipblas-after.txt feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
requirements-hipblas.txt feat(rocm): bump to 7.x (#9323) 2026-04-12 08:51:30 +02:00
requirements-install.txt feat: migrate python backends from conda to uv (#2215) 2024-05-10 15:08:08 +02:00
requirements-intel-after.txt feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
requirements-intel.txt feat(qwen-tts): add Qwen-tts backend (#8163) 2026-01-23 15:18:41 +01:00
requirements.txt chore(deps): bump grpcio from 1.78.1 to 1.80.0 in /backend/python/vllm (#9177) 2026-03-31 10:10:17 +02:00
run.sh feat: Add backend gallery (#5607) 2025-06-15 14:56:52 +02:00
test.py feat(vllm): parity with llama.cpp backend (#9328) 2026-04-13 11:00:29 +02:00
test.sh feat: Add backend gallery (#5607) 2025-06-15 14:56:52 +02:00

Creating a separate environment for the vllm project

make vllm