LocalAI/core/services
Ettore Di Giacinto 4b66c3ad45 fix(distributed): don't increment Attempts on in-flight install timeout
An in-flight timeout (worker still pulling the OCI image) is a delayed
attempt, not a failed one. Incrementing Attempts let slow installs that
were genuinely progressing (e.g. 30 GB CUDA images on Wi-Fi) trip the
reconciler's maxPendingBackendOpAttempts cap and dead-letter the queue
row while the worker was still legitimately working.

RecordPendingBackendOpInFlight now only updates LastError and NextRetryAt.
Also documents "running_on_worker" in the NodeOpStatus.Status enum
comment so Task 6 implementers see the full surface.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-05-22 20:32:59 +00:00
advisorylock feat(distributed): sync state with frontends, better backend management reporting (#9426) 2026-04-19 17:55:53 +02:00
agentpool fix(agentpool): close truncate-then-read race in agent_jobs.json persistence (#9811) 2026-05-13 23:58:43 +02:00
agents feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
dbutil feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
distributed feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
facerecognition feat(face-recognition): add insightface/onnx backend for 1:1 verify, 1:N identify, embedding, detection, analysis (#9480) 2026-04-22 21:55:41 +02:00
finetune chore: Security hardening (#9719) 2026-05-08 16:25:45 +02:00
galleryop feat(distributed): introduce galleryop.ErrWorkerStillInstalling sentinel 2026-05-22 20:08:45 +00:00
jobs feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
mcp feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
messaging fix(distributed): split NATS backend.upgrade off install + dedup loads (#9717) 2026-05-08 16:24:54 +02:00
modeladmin feat(gallery): Speed up load times and clean gallery entries (#9211) 2026-05-06 14:51:38 +02:00
monitoring feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
nodes fix(distributed): don't increment Attempts on in-flight install timeout 2026-05-22 20:32:59 +00:00
quantization feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
skills refactor(agents): bump skillserver, drop redundant Name from list_skills output (#9916) 2026-05-21 14:45:53 +02:00
storage feat: track files being staged (#9275) 2026-04-08 14:33:58 +02:00
testutil feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
voicerecognition feat: voice recognition (#9500) 2026-04-23 12:07:14 +02:00
worker feat(gallery): verify backend OCI images with keyless cosign (#9823) 2026-05-18 08:02:20 +02:00