LocalAI/pkg/utils
Richard Palethorpe 13734ae9fa
feat: Add Sherpa ONNX backend for ASR and TTS (#8523)
feat(backend): Add Sherpa ONNX backend and Omnilingual ASR

Adds a new Go backend wrapping sherpa-onnx via purego (no cgo). Same
approach as opus/stablediffusion-ggml/whisper — a thin C shim
(csrc/shim.c + shim.h → libsherpa-shim.so) wraps the bits purego
can't reach directly: nested struct config writes, result-struct field
reads, and the streaming TTS callback trampoline. The Go side uses
opaque uintptr handles and purego.NewCallback for the TTS callback.

Supports:
- VAD via sherpa-onnx's Silero VAD
- Offline ASR: Whisper, Paraformer, SenseVoice, Omnilingual CTC
- Online/streaming ASR: zipformer transducer with endpoint detection
  (AudioTranscriptionStream emits delta events during decode)
- Offline TTS: VITS (LJS, etc.)
- Streaming TTS: sherpa-onnx's callback API → PCM chunks on a channel,
  prefixed by a streaming WAV header

Gallery entries: omnilingual-0.3b-ctc-q8-sherpa (1600-language offline
ASR), streaming-zipformer-en-sherpa (low-latency streaming ASR),
silero-vad-sherpa, vits-ljs-sherpa.

E2E coverage: tests/e2e-backends for offline + streaming ASR,
tests/e2e for the full realtime pipeline (VAD + STT + TTS).

Assisted-by: claude-opus-4-7-1M [Claude Code]

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-04-24 14:40:06 +02:00
base64.go            feat: add biometrics UI (#9524)  2026-04-24 08:50:34 +02:00
base64_test.go       feat: add biometrics UI (#9524)  2026-04-24 08:50:34 +02:00
ffmpeg.go            feat: Add Sherpa ONNX backend for ASR and TTS (#8523)  2026-04-24 14:40:06 +02:00
ffmpeg_test.go       feat: Add Sherpa ONNX backend for ASR and TTS (#8523)  2026-04-24 14:40:06 +02:00
hash.go              feat: embedded model configurations, add popular model examples, refactoring (#1532)  2024-01-05 23:16:33 +01:00
json.go              fix: do not break on newlines on function returns (#864)  2023-08-04 21:46:36 +02:00
logging.go           chore(refactor): move logging to common package based on slog (#7668)  2025-12-21 19:33:13 +01:00
path.go              feat: elevenlabs sound-generation api (#3355)  2024-08-24 00:20:28 +00:00
strings.go           feat: add distributed mode (#9124)  2026-03-30 00:47:27 +02:00
untar.go             refactor: gallery inconsistencies (#2647)  2024-06-24 17:32:12 +02:00
urlfetch.go          security: validate URLs to prevent SSRF in content fetching endpoints (#8476)  2026-02-10 15:14:14 +01:00
urlfetch_test.go     security: validate URLs to prevent SSRF in content fetching endpoints (#8476)  2026-02-10 15:14:14 +01:00
utils_suite_test.go  refactor: consolidate usage of GetURI (#674)  2023-06-26 12:25:38 +02:00