LocalAI/pkg
Richard Palethorpe 13734ae9fa
feat: Add Sherpa ONNX backend for ASR and TTS (#8523)
feat(backend): Add Sherpa ONNX backend and Omnilingual ASR

Adds a new Go backend wrapping sherpa-onnx via purego (no cgo). It takes
the same approach as opus/stablediffusion-ggml/whisper: a thin C shim
(csrc/shim.c + shim.h → libsherpa-shim.so) wraps the parts purego
can't reach directly, namely nested struct config writes, result-struct
field reads, and the streaming TTS callback trampoline. The Go side uses
opaque uintptr handles and purego.NewCallback for the TTS callback.

Supports:
- VAD via sherpa-onnx's Silero VAD
- Offline ASR: Whisper, Paraformer, SenseVoice, Omnilingual CTC
- Online/streaming ASR: zipformer transducer with endpoint detection
  (AudioTranscriptionStream emits delta events during decode)
- Offline TTS: VITS (LJS, etc.)
- Streaming TTS: sherpa-onnx's callback API → PCM chunks on a channel,
  prefixed by a streaming WAV header
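
The streaming TTS item above (callback → PCM chunks on a channel, prefixed by a streaming WAV header) can be illustrated with a small stdlib-only sketch. This is not the backend's code: the names streamingWAVHeader, floatToPCM16 and streamTTS are hypothetical, the callback is assumed to deliver float32 samples in [-1, 1], the 0xFFFFFFFF "unknown length" size fields are one common convention rather than a confirmed detail of this commit, and the 22050 Hz mono 16-bit format is an assumed example value.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// streamingWAVHeader builds a 44-byte PCM WAV header for a stream whose total
// length is not known up front. Both size fields are set to 0xFFFFFFFF
// (assumption: a common convention, not necessarily what the backend writes).
func streamingWAVHeader(sampleRate uint32, channels, bitsPerSample uint16) []byte {
	const unknownSize = 0xFFFFFFFF
	blockAlign := channels * bitsPerSample / 8
	byteRate := sampleRate * uint32(blockAlign)

	var buf bytes.Buffer
	buf.WriteString("RIFF")
	binary.Write(&buf, binary.LittleEndian, uint32(unknownSize)) // RIFF chunk size
	buf.WriteString("WAVE")
	buf.WriteString("fmt ")
	binary.Write(&buf, binary.LittleEndian, uint32(16)) // fmt chunk size
	binary.Write(&buf, binary.LittleEndian, uint16(1))  // audio format: PCM
	binary.Write(&buf, binary.LittleEndian, channels)
	binary.Write(&buf, binary.LittleEndian, sampleRate)
	binary.Write(&buf, binary.LittleEndian, byteRate)
	binary.Write(&buf, binary.LittleEndian, blockAlign)
	binary.Write(&buf, binary.LittleEndian, bitsPerSample)
	buf.WriteString("data")
	binary.Write(&buf, binary.LittleEndian, uint32(unknownSize)) // data chunk size
	return buf.Bytes()
}

// floatToPCM16 clamps samples to [-1, 1] and encodes them as
// little-endian signed 16-bit PCM.
func floatToPCM16(samples []float32) []byte {
	out := make([]byte, 2*len(samples))
	for i, s := range samples {
		if s > 1 {
			s = 1
		} else if s < -1 {
			s = -1
		}
		binary.LittleEndian.PutUint16(out[2*i:], uint16(int16(s*32767)))
	}
	return out
}

// streamTTS sends the header first, then one PCM chunk per callback
// invocation, and closes the channel when generate returns.
// generate stands in for the native synthesis call (hypothetical).
func streamTTS(sampleRate uint32, generate func(onChunk func([]float32))) <-chan []byte {
	out := make(chan []byte)
	go func() {
		defer close(out)
		out <- streamingWAVHeader(sampleRate, 1, 16)
		generate(func(samples []float32) {
			out <- floatToPCM16(samples)
		})
	}()
	return out
}

func main() {
	// A fake generator that invokes the callback twice with two samples each.
	fake := func(onChunk func([]float32)) {
		onChunk([]float32{0, 0.5})
		onChunk([]float32{-1, 1})
	}
	total := 0
	for chunk := range streamTTS(22050, fake) {
		total += len(chunk)
	}
	fmt.Println(total) // 44-byte header + 8 bytes of PCM = 52
}
```

Because the channel is unbuffered in this sketch, the generator blocks until the consumer has taken each chunk. That gives natural backpressure, though a real backend might prefer a buffered channel so that synthesis does not stall on a slow client.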

Gallery entries: omnilingual-0.3b-ctc-q8-sherpa (1600-language offline
ASR), streaming-zipformer-en-sherpa (low-latency streaming ASR),
silero-vad-sherpa, vits-ljs-sherpa.

E2E coverage: tests/e2e-backends for offline + streaming ASR,
tests/e2e for the full realtime pipeline (VAD + STT + TTS).

Assisted-by: claude-opus-4-7-1M [Claude Code]

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-04-24 14:40:06 +02:00
audio feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
concurrency feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
downloader feat: add biometrics UI (#9524) 2026-04-24 08:50:34 +02:00
functions feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
grpc feat: voice recognition (#9500) 2026-04-23 12:07:14 +02:00
huggingface-api fix(importer): emit all shards for multi-part GGUF models (#9513) 2026-04-23 15:00:02 +02:00
model feat: wire transcription for llama.cpp, add streaming support (#9353) 2026-04-14 16:13:40 +02:00
oci feat: backend versioning, upgrade detection and auto-upgrade (#9315) 2026-04-11 22:31:15 +02:00
reasoning fix(reasoning): suppress partial tag tokens during autoparser warm-up 2026-04-04 20:45:57 +00:00
sanitize feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
signals feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
sound feat: add distributed mode (#9124) 2026-03-30 00:47:27 +02:00
store chore: fix go.mod module (#2635) 2024-06-23 08:24:36 +00:00
system feat(importer): expand importer flow to almost all backends (#9466) 2026-04-22 22:42:37 +02:00
utils feat: Add Sherpa ONNX backend for ASR and TTS (#8523) 2026-04-24 14:40:06 +02:00
vram feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084) 2026-04-04 15:14:35 +02:00
xio feat(ui): allow to cancel ops (#7264) 2025-11-13 18:41:47 +01:00
xsync chore: fix go.mod module (#2635) 2024-06-23 08:24:36 +00:00
xsysinfo fix(gpu): better detection for MacOS and Thor (#9263) 2026-04-07 00:39:07 +02:00