LocalAI/core/schema
Richard Palethorpe c60ed75258 feat(middleware): Model routing, PII filtering, Cloud model proxies
Add a routing middleware stack and a cloud-proxy backend.

* cloud-proxy: a Go gRPC backend that forwards OpenAI- and
  Anthropic-shaped chat requests to upstream providers, with an
  optional translate mode (OpenAI request -> Anthropic /v1/messages
  -> OpenAI response) and full tool-calling support.

* routing: admission control, content-aware model routing
  (embedding cache + classifier + rerank + Arch-Router score),
  PII detection/redaction (regex + NER) with streaming filter and
  OpenAI/Anthropic adapters, and a per-user/per-key billing recorder
  backed by GORM or in-memory storage.

* middleware: UsageMiddleware, which records usage via the billing
  recorder, plus admission, route-model, usage-stamp and trace
  middlewares.

* observability: BackendTrace ring buffer stores full request bodies
  (capped), MITM proxy emits structured trace events, and router
  classifier decisions surface at /api/router/decide.

* gallery: Arch-Router-1.5B (Q4_K_M and Q8_0).

* UI: cloud-proxy model-editor fields, classifier system-prompt and
  score-normalization config, and a Traces page rendering request
  bodies.
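
As an illustration of the capped ring buffer mentioned under
observability, here is a minimal sketch. It is not the BackendTrace
implementation: the names TraceRing, NewTraceRing, Add and Snapshot are
hypothetical, the cap is applied in bytes, the size is assumed to be
greater than zero, and no locking is shown.

```go
package main

import "fmt"

// TraceRing is a hypothetical fixed-size ring buffer of request bodies.
// It is a sketch only and is not safe for concurrent use.
type TraceRing struct {
	entries []string
	next    int  // slot the next Add will overwrite
	full    bool // true once the buffer has wrapped at least once
	maxBody int  // cap on stored body length, in bytes
}

// NewTraceRing allocates a ring with size slots (assumed > 0).
func NewTraceRing(size, maxBody int) *TraceRing {
	return &TraceRing{entries: make([]string, size), maxBody: maxBody}
}

// Add stores body, truncated to maxBody bytes, overwriting the oldest
// entry once the ring is full. Truncation is byte-based, so it may cut
// through a multi-byte UTF-8 sequence.
func (r *TraceRing) Add(body string) {
	if len(body) > r.maxBody {
		body = body[:r.maxBody]
	}
	r.entries[r.next] = body
	r.next = (r.next + 1) % len(r.entries)
	if r.next == 0 {
		r.full = true
	}
}

// Snapshot returns a copy of the stored bodies, oldest first.
func (r *TraceRing) Snapshot() []string {
	if !r.full {
		return append([]string(nil), r.entries[:r.next]...)
	}
	out := make([]string, 0, len(r.entries))
	out = append(out, r.entries[r.next:]...)
	out = append(out, r.entries[:r.next]...)
	return out
}

func main() {
	ring := NewTraceRing(2, 4)
	ring.Add("first-body")
	ring.Add("second-body")
	ring.Add("third-body")
	fmt.Println(ring.Snapshot()) // prints [seco thir]
}
```

With two slots and a four-byte cap, the third Add evicts the oldest
entry, so only the truncated second and third bodies remain.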

Assisted-by: claude-code:claude-opus-4-7 [Read] [Edit] [Bash]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-05-24 09:42:31 +01:00
agent_jobs.go feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084) 2026-04-04 15:14:35 +02:00
anthropic.go fix(anthropic): show null index when not present, default to 0 (#9225) 2026-04-04 15:13:17 +02:00
anthropic_test.go feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
audio_transform.go feat: add LocalVQE backend and audio transformations UI (#9640) 2026-05-04 22:07:11 +02:00
backend.go feat: Add backend gallery (#5607) 2025-06-15 14:56:52 +02:00
diarization.go feat(api): add /v1/audio/diarization endpoint with sherpa-onnx + vibevoice.cpp (#9654) 2026-05-05 15:10:13 +02:00
elevenlabs.go feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
finetune.go feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
gallery-model.schema.json [gallery] add JSON schema for gallery model specification (#7890) 2026-01-06 22:10:43 +01:00
jina.go fix(reranker): tests and top_n check fix #7212 (#7284) 2025-11-16 17:53:23 +01:00
localai.go feat(middleware): Model routing, PII filtering, Cloud model proxies 2026-05-24 09:42:31 +01:00
message.go feat(middleware): Model routing, PII filtering, Cloud model proxies 2026-05-24 09:42:31 +01:00
message_test.go feat(middleware): Model routing, PII filtering, Cloud model proxies 2026-05-24 09:42:31 +01:00
ollama.go fix(ollama): accept float-encoded integer options (fixes #9837) (#9849) 2026-05-16 18:38:19 +02:00
ollama_test.go fix(ollama): accept float-encoded integer options (fixes #9837) (#9849) 2026-05-16 18:38:19 +02:00
openai.go feat(middleware): Model routing, PII filtering, Cloud model proxies 2026-05-24 09:42:31 +01:00
openresponses.go feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
prediction.go feat(middleware): Model routing, PII filtering, Cloud model proxies 2026-05-24 09:42:31 +01:00
quantization.go feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
request.go feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
schema_suite_test.go feat(llama.cpp): consolidate options and respect tokenizer template when enabled (#7120) 2025-11-07 21:23:50 +01:00
tokenize.go feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084) 2026-04-04 15:14:35 +02:00
transcription.go feat: support word-level timestamps for faster-whisper (#9621) 2026-05-06 00:32:52 +02:00
transcription_format.go feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00