LocalAI/backend/python/vllm-omni
LocalAI [bot] 4e154b59e5
fix(ci): unbreak rerankers (torch bump) and vllm-omni on aarch64 (#9688)
Two unrelated CI breakages bundled together since both are one-liners:

- rerankers: bump torch 2.4.1 -> 2.7.1 on cpu/cublas12. The unpinned
  transformers resolves to 5.x, whose moe.py registers a custom_op with
  string-typed `'torch.Tensor'` annotations that torch 2.4.1's
  infer_schema rejects, blocking the gRPC server from starting and
  failing all 5 backend tests with "Connection refused" on :50051.
  Matches the version used by the transformers backend.

- vllm-omni: strip fa3-fwd from the upstream requirements/cuda.txt
  before resolving on aarch64. fa3-fwd 0.0.3 ships only an
  x86_64 wheel and has no sdist, making the cuda profile unsatisfiable
  on Jetson/SBSA. fa3-fwd is a soft runtime dep — vllm-omni's
  attention backends fall back to FA2 then SDPA when it's missing.
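
The aarch64 filtering step described in the second bullet can be sketched roughly as follows. This is an illustrative Python sketch, not the actual install.sh change: the function name `strip_requirement` and the sample requirement lines are hypothetical, and the upstream requirements/cuda.txt may spell the fa3-fwd entry differently.

```python
import re


def strip_requirement(lines, package, machine):
    """Drop requirement lines for `package` when `machine` is an ARM64 host.

    Hypothetical helper: the name and signature are assumptions made for
    illustration, not part of LocalAI or vllm-omni.
    """
    if machine not in ("aarch64", "arm64"):
        # Other architectures keep the requirements untouched.
        return list(lines)
    # Match the package name at the start of a line, followed by a version
    # specifier, extras, an environment marker, a comment, or end of line.
    # A longer name such as "fa3-fwd-extra" does not match.
    pattern = re.compile(
        r"^\s*" + re.escape(package) + r"\s*([=<>!~\[;#]|$)",
        re.IGNORECASE,
    )
    return [line for line in lines if not pattern.match(line)]


# Sample input; these lines are made up for the example.
reqs = ["example-pkg>=1.0", "fa3-fwd==0.0.3", "fa3-fwd-extra", "numpy"]
filtered = strip_requirement(reqs, "fa3-fwd", "aarch64")
```

In practice the `machine` argument could come from `platform.machine()`. Filtering the file before dependency resolution, rather than pinning an alternative, relies on the package being optional at runtime, as the bullet above states.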

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-05-06 17:07:24 +02:00
| File | Last commit | Date |
|---|---|---|
| backend.py | feat(backends/python): use tempfile.gettempdir() instead of hardcoded /tmp (#9629) | 2026-05-01 10:56:24 +02:00 |
| install.sh | fix(ci): unbreak rerankers (torch bump) and vllm-omni on aarch64 (#9688) | 2026-05-06 17:07:24 +02:00 |
| Makefile | feat(vllm-omni): add new backend (#8188) | 2026-01-24 22:23:30 +01:00 |
| requirements-after.txt | feat(vllm-omni): add new backend (#8188) | 2026-01-24 22:23:30 +01:00 |
| requirements-cublas12-after.txt | feat(vllm-omni): add new backend (#8188) | 2026-01-24 22:23:30 +01:00 |
| requirements-cublas12.txt | feat(vllm-omni): add new backend (#8188) | 2026-01-24 22:23:30 +01:00 |
| requirements-cublas13.txt | feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang (#9553) | 2026-04-25 12:26:29 +02:00 |
| requirements-hipblas.txt | feat(rocm): bump to 7.x (#9323) | 2026-04-12 08:51:30 +02:00 |
| requirements-l4t13.txt | feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang (#9553) | 2026-04-25 12:26:29 +02:00 |
| requirements.txt | feat(vllm-omni): add new backend (#8188) | 2026-01-24 22:23:30 +01:00 |
| run.sh | feat(vllm-omni): add new backend (#8188) | 2026-01-24 22:23:30 +01:00 |
| test.py | feat(vllm-omni): add new backend (#8188) | 2026-01-24 22:23:30 +01:00 |
| test.sh | feat(vllm-omni): add new backend (#8188) | 2026-01-24 22:23:30 +01:00 |