LocalAI/backend/python/vllm
Ettore Di Giacinto 59108fbe32
feat: add distributed mode (#9124)
* feat: add distributed mode (experimental)
* fix data races, mutexes, transactions
* refactorings
* fixups
* fix events and tool stream in agent chat
* use ginkgo
* refactoring and consolidation
* refactoring and consolidation
* refactoring and consolidation
* refactoring and consolidation
* refactoring and consolidation
* refactoring and consolidation
* refactoring and consolidation
* refactoring and consolidation
* fix(cron): compute correctly time boundaries avoiding re-triggering
* enhancements, refactorings
* do not flood of healthy checks
* do not list obvious backends as text backends
* tests fixups
* refactoring and consolidation
* Drop redundant healthcheck
* enhancements, refactorings

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-30 00:47:27 +02:00
backend.py feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
install.sh chore: Update to Ubuntu24.04 (cont #7423) (#7769) 2026-01-06 15:26:42 +01:00
Makefile feat(mlx): add mlx backend (#6049) 2025-08-22 08:42:29 +02:00
README.md refactor: move backends into the backends directory (#1279) 2023-11-13 22:40:16 +01:00
requirements-after.txt fix(python): move vllm to after deps, drop diffusers main deps 2024-08-07 23:34:37 +02:00
requirements-cpu.txt Revert "chore(deps): bump torch from 2.7.0 to 2.7.1+xpu in /backend/python/vllm in the pip group across 1 directory" (#8367) 2026-02-03 08:34:54 +01:00
requirements-cublas12-after.txt fix(vllm): Update flash-attn to specific wheel URL 2025-11-21 18:06:46 +01:00
requirements-cublas12.txt Revert "chore(deps): bump torch from 2.7.0 to 2.7.1+xpu in /backend/python/vllm in the pip group across 1 directory" (#8367) 2026-02-03 08:34:54 +01:00
requirements-hipblas.txt chore: Update to Ubuntu24.04 (cont #7423) (#7769) 2026-01-06 15:26:42 +01:00
requirements-install.txt feat: migrate python backends from conda to uv (#2215) 2024-05-10 15:08:08 +02:00
requirements-intel.txt feat(qwen-tts): add Qwen-tts backend (#8163) 2026-01-23 15:18:41 +01:00
requirements.txt chore(deps): bump grpcio from 1.76.0 to 1.78.1 in /backend/python/vllm (#8635) 2026-02-25 08:17:32 +01:00
run.sh feat: Add backend gallery (#5607) 2025-06-15 14:56:52 +02:00
test.py fix: vllm missing logprobs (#5279) 2025-04-30 12:55:07 +00:00
test.sh feat: Add backend gallery (#5607) 2025-06-15 14:56:52 +02:00

Creating a separate environment for the vllm project

make vllm