LocalAI/core/cli
Richard Palethorpe c60ed75258 feat(middleware): Model routing, PII filtering, Cloud model proxies
Add a routing middleware stack and a cloud-proxy backend.

* cloud-proxy: a Go gRPC backend that forwards OpenAI- and
  Anthropic-shaped chat requests to upstream providers, with an
  optional translate mode (OpenAI request -> Anthropic /v1/messages
  -> OpenAI response) and full tool-calling support.

* routing: admission control, content-aware model routing
  (embedding cache + classifier + rerank + Arch-Router score),
  PII detection/redaction (regex + NER) with streaming filter and
  OpenAI/Anthropic adapters, and a per-user/per-key billing recorder
  backed by GORM or in-memory storage.

* middleware: UsageMiddleware records usage via the billing recorder,
  plus admission, route-model, usage-stamp and trace middlewares.

* observability: BackendTrace ring buffer stores full request bodies
  (capped), MITM proxy emits structured trace events, and router
  classifier decisions surface at /api/router/decide.

* gallery: Arch-Router-1.5B (Q4_K_M and Q8_0).

* UI: cloud-proxy model-editor fields, classifier system-prompt and
  score-normalization config, and a Traces page rendering request
  bodies.

Assisted-by: claude-code:claude-opus-4-7 [Read] [Edit] [Bash]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-05-24 09:42:31 +01:00
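
The observability bullet in the commit message above mentions a BackendTrace ring buffer that stores capped request bodies. The following is a minimal sketch of that general idea, not the code from this commit: `TraceEntry`, `TraceRing`, `NewTraceRing`, `Add`, `Snapshot`, and the `maxBody` limit are all hypothetical names chosen for illustration.

```go
package main

import "fmt"

// TraceEntry is a hypothetical trace record; the real BackendTrace
// fields are not shown in the commit message.
type TraceEntry struct {
	Model     string
	Body      string
	Truncated bool // true when the body was cut at the cap
}

// TraceRing is a fixed-capacity ring buffer: once full, each new
// entry overwrites the oldest one.
type TraceRing struct {
	entries []TraceEntry
	next    int // slot the next Add will write to
	count   int // number of valid entries, at most len(entries)
	maxBody int // assumed per-entry body cap, in bytes
}

// NewTraceRing allocates a ring with the given capacity and body cap.
// A capacity below 1 is raised to 1 to avoid a zero-length ring.
func NewTraceRing(capacity, maxBody int) *TraceRing {
	if capacity < 1 {
		capacity = 1
	}
	return &TraceRing{entries: make([]TraceEntry, capacity), maxBody: maxBody}
}

// Add stores a request body, truncating it to maxBody bytes.
// Truncation is byte-based, so it may split a multi-byte UTF-8 rune.
func (r *TraceRing) Add(model string, body []byte) {
	truncated := false
	if len(body) > r.maxBody {
		body = body[:r.maxBody]
		truncated = true
	}
	r.entries[r.next] = TraceEntry{Model: model, Body: string(body), Truncated: truncated}
	r.next = (r.next + 1) % len(r.entries)
	if r.count < len(r.entries) {
		r.count++
	}
}

// Snapshot returns a copy of the stored entries, oldest first.
func (r *TraceRing) Snapshot() []TraceEntry {
	out := make([]TraceEntry, 0, r.count)
	start := (r.next - r.count + len(r.entries)) % len(r.entries)
	for i := 0; i < r.count; i++ {
		out = append(out, r.entries[(start+i)%len(r.entries)])
	}
	return out
}

func main() {
	ring := NewTraceRing(2, 8)
	ring.Add("a", []byte("short"))
	ring.Add("b", []byte("this body is too long"))
	ring.Add("c", []byte("third"))
	for _, e := range ring.Snapshot() {
		fmt.Printf("%s %q truncated=%v\n", e.Model, e.Body, e.Truncated)
	}
}
```

With a capacity of 2, the third `Add` evicts the first entry, so memory use stays bounded by capacity times the body cap. A version shared between request handlers would also need a mutex around `Add` and `Snapshot`.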
context feat: Merge repeated log lines in the terminal (#9141) 2026-03-26 22:16:13 +01:00
worker feat(gallery): verify backend OCI images with keyless cosign (#9823) 2026-05-18 08:02:20 +02:00
workerregistry feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
agent.go feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
agent_test.go feat: add node reconciler, allow to schedule to group of nodes, min/max autoscaler (#9186) 2026-03-31 08:28:56 +02:00
agent_worker.go feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
backends.go feat(gallery): verify backend OCI images with keyless cosign (#9823) 2026-05-18 08:02:20 +02:00
cli.go feat: localai assistant chat modality (#9602) 2026-04-28 19:29:27 +02:00
cli_suite_test.go feat: add node reconciler, allow to schedule to group of nodes, min/max autoscaler (#9186) 2026-03-31 08:28:56 +02:00
completion.go feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
completion_test.go feat: add node reconciler, allow to schedule to group of nodes, min/max autoscaler (#9186) 2026-03-31 08:28:56 +02:00
deprecations.go chore: Standardize CLI flag naming to kebab-case (M12) (#8912) 2026-03-09 22:15:39 +01:00
explorer.go chore(refactor): move logging to common package based on slog (#7668) 2025-12-21 19:33:13 +01:00
federated.go chore: Standardize CLI flag naming to kebab-case (M12) (#8912) 2026-03-09 22:15:39 +01:00
mcp_server.go feat: localai assistant chat modality (#9602) 2026-04-28 19:29:27 +02:00
models.go feat(gallery): verify backend OCI images with keyless cosign (#9823) 2026-05-18 08:02:20 +02:00
run.go feat(middleware): Model routing, PII filtering, Cloud model proxies 2026-05-24 09:42:31 +01:00
run_safety.go chore: Security hardening (#9719) 2026-05-08 16:25:45 +02:00
run_safety_test.go chore: Security hardening (#9719) 2026-05-08 16:25:45 +02:00
soundgeneration.go feat(whisper): honor client cancellation via ggml abort_callback (#9710) 2026-05-08 01:44:47 +02:00
transcript.go feat(whisper): honor client cancellation via ggml abort_callback (#9710) 2026-05-08 01:44:47 +02:00
tts.go feat(whisper): honor client cancellation via ggml abort_callback (#9710) 2026-05-08 01:44:47 +02:00
util.go feat: improve CLI error messages with actionable guidance (#8880) 2026-04-21 11:53:26 +02:00
worker.go fix(distributed): split NATS backend.upgrade off install + dedup loads (#9717) 2026-05-08 16:24:54 +02:00