LocalAI/.github
LocalAI [bot] 35f6db8c76
ci: split backend-jobs into single-arch and multi-arch matrices (#9746)
Symptom (run 25612992409): backend-merge-jobs failed with
"quay.io/go-skynet/local-ai-backends@sha256:fdbd93ca...: not found"
even though the per-arch build for -cpu-llama-cpp pushed that exact
digest 14h31m earlier.

Root cause: backend-merge-jobs was gated on the WHOLE backend-jobs
matrix (`needs: backend-jobs`). The multi-arch -cpu-llama-cpp legs
finished within 30 min, but a single-arch CUDA-12-llama-cpp slot in
the same matrix queued for ~8h (max-parallel: 8 throttle) and then
took ~6h to build cold. By the time it finished and unblocked the
merge, quay's GC had reaped the per-arch digests pushed by the fast
multi-arch legs the day before.

Fix: split the linux backend matrix in two.

  backend-jobs-multiarch  - entries with `platform-tag` set (paired
    per-arch legs that feed backend-merge-jobs).
  backend-jobs-singlearch - entries without `platform-tag` (heavy
    standalone builds: CUDA, ROCm, Intel oneAPI, vLLM, sglang, etc.).

backend-merge-jobs now `needs:` only backend-jobs-multiarch. The
multi-arch matrix completes in ~2-3h, well inside quay's GC window.
Heavy single-arch entries keep running independently with no merge
dependency.

scripts/changed-backends.js gains a splitByArch() helper that
partitions filtered entries by whether `platform-tag` is set and
emits four outputs: matrix-singlearch, matrix-multiarch,
has-backends-singlearch, and has-backends-multiarch (replacing the
previous combined matrix / has-backends pair). It is applied in both
the full-matrix and
filtered-matrix code paths. Smoke test: 199 single-arch + 72 multi-
arch + 35 darwin = 271 total entries; 36 merge-matrix entries
(one per multi-arch backend pair). Matches expectation.
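
The partitioning described above can be sketched roughly as follows.
This is an illustrative approximation, not the code from
scripts/changed-backends.js: only the `platform-tag` key and the four
output names come from this commit message. The `backend` and
`tag-suffix` fields, the sample values, and the `{ include: [...] }`
matrix shape are assumptions made for the example.

```javascript
// Hypothetical sketch of a splitByArch()-style helper. The real helper in
// scripts/changed-backends.js may differ in signature and details.
function splitByArch(entries) {
  const singlearch = [];
  const multiarch = [];
  for (const entry of entries) {
    // Treat an entry as multi-arch when `platform-tag` is set to a
    // non-empty value; everything else is a standalone single-arch build.
    const tag = entry["platform-tag"];
    if (tag !== undefined && tag !== null && tag !== "") {
      multiarch.push(entry);
    } else {
      singlearch.push(entry);
    }
  }
  return { singlearch, multiarch };
}

// Sample entries. Field names other than `platform-tag` are assumptions.
const sample = [
  { backend: "llama-cpp", "tag-suffix": "-cpu-llama-cpp", "platform-tag": "amd64" },
  { backend: "llama-cpp", "tag-suffix": "-cpu-llama-cpp", "platform-tag": "arm64" },
  { backend: "llama-cpp", "tag-suffix": "-gpu-nvidia-cuda-12-llama-cpp" },
];

const { singlearch, multiarch } = splitByArch(sample);

// Output names follow the commit message; the JSON shape is assumed.
const outputs = {
  "matrix-singlearch": JSON.stringify({ include: singlearch }),
  "matrix-multiarch": JSON.stringify({ include: multiarch }),
  "has-backends-singlearch": String(singlearch.length > 0),
  "has-backends-multiarch": String(multiarch.length > 0),
};

console.log(outputs);
```

Keeping the two lists separate lets the merge job depend only on the
matrix built from the multi-arch list, so a slow single-arch entry
never sits between a per-arch push and its merge.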

Local `make backends/<name>` is unaffected: the script's outputs
only feed CI workflow matrices.

Assisted-by: Claude:claude-opus-4-7

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-05-10 18:15:53 +02:00
actions ci: phase 1-3 of GHA free tier migration (path filter, multi-arch split prep, /mnt disk relief) (#9726) 2026-05-08 23:43:41 +02:00
ci fix: roll out bluemonday Sanitize more widely (#3794) 2024-10-12 09:45:47 +02:00
gallery-agent fix(ci): switch gallery-agent to sigs.k8s.io/yaml (#9397) 2026-04-17 10:10:42 +02:00
ISSUE_TEMPLATE docs/examples: enhancements (#1572) 2024-01-18 19:41:08 +01:00
workflows ci: split backend-jobs into single-arch and multi-arch matrices (#9746) 2026-05-10 18:15:53 +02:00
backend-matrix.yml ci: refactor llama-cpp variant Dockerfiles to consume prebuilt base-grpc images (PR 2/2) (#9738) 2026-05-10 00:03:52 +02:00
bump_deps.sh feat: do not bundle llama-cpp anymore (#5790) 2025-07-18 13:24:12 +02:00
bump_docs.sh fix: github bump_docs.sh regex to drop emoji and other text (#2180) 2024-04-29 03:55:29 +00:00
bump_vllm_wheel.sh feat(vllm): expose AsyncEngineArgs via generic engine_args YAML map (#9563) 2026-04-29 00:49:28 +02:00
check_and_update.py fix(ci): fixup checksum scanning pipeline (#3631) 2024-09-23 10:56:10 +02:00
checksum_checker.sh fix(ci): fixup correct path for check_and_update.py (#2777) 2024-07-11 23:05:43 +02:00
dependabot.yml feat: Add backend gallery (#5607) 2025-06-15 14:56:52 +02:00
FUNDING.yml Create FUNDING.yml (#725) 2023-07-09 13:39:00 +02:00
labeler.yml chore(ci): update labels 2025-02-13 09:58:19 +01:00
PULL_REQUEST_TEMPLATE.md feat(vllm): Allow to set quantization (#1094) 2023-09-22 15:52:38 +02:00
release.yml feat(p2p): Federation and AI swarms (#2723) 2024-07-08 22:04:06 +02:00
stale.yml feat: add PR template and stale configuration (#316) 2023-05-20 09:10:20 +02:00