LocalAI/core/services
Ettore Di Giacinto 75fba9e03f
fix(distributed): scope Upgrade All to nodes that have the backend installed (#9678)
In distributed mode the React UI's "Upgrade All" button fanned every
detected outdated backend out to every healthy backend node, including
nodes that never had that backend installed. On heterogeneous clusters
this surfaced as platform errors (e.g. mac-mini-m4 asked to upgrade
cpu-insightface-development, which has no darwin/arm64 variant) and left
forever-retrying pending_backend_ops rows.

DistributedBackendManager.UpgradeBackend now queries ListBackends()
first, builds the target node-ID set from SystemBackend.Nodes, and only
fans out to those nodes. Every per-node primitive
(adapter.InstallBackend, the pending-ops queue, BackendOpResult) is
unchanged. enqueueAndDrainBackendOp gains an optional targetNodeIDs
allowlist; Install/Delete keep their fan-to-everyone semantics by
passing nil. If no node reports the backend installed, UpgradeBackend
now returns a clear "not installed on any node" error instead of
producing a stuck queue.

Adds Ginkgo coverage for the smart fan-out: backend on a subset of
nodes goes only to those nodes; backend on no node returns the new
error and never sends a NATS install request.


Assisted-by: Claude:claude-opus-4-7 [Claude Code]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-05-06 00:28:41 +02:00
advisorylock feat(distributed): sync state with frontends, better backend management reporting (#9426) 2026-04-19 17:55:53 +02:00
agentpool feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
agents feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
dbutil feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
distributed feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
facerecognition feat(face-recognition): add insightface/onnx backend for 1:1 verify, 1:N identify, embedding, detection, analysis (#9480) 2026-04-22 21:55:41 +02:00
finetune feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
galleryop feat(distributed): per-node backend installation from the gallery 2026-04-26 22:05:18 +00:00
jobs feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
mcp feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
messaging feat(distributed): support multiple replicas of one model on the same node (#9583) 2026-04-27 21:20:05 +02:00
modeladmin feat: localai assistant chat modality (#9602) 2026-04-28 19:29:27 +02:00
monitoring feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
nodes fix(distributed): scope Upgrade All to nodes that have the backend installed (#9678) 2026-05-06 00:28:41 +02:00
quantization feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
skills feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
storage feat: track files being staged (#9275) 2026-04-08 14:33:58 +02:00
testutil feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
voicerecognition feat: voice recognition (#9500) 2026-04-23 12:07:14 +02:00