onju-v2

mirror of https://github.com/justLV/onju-v2 synced 2026-04-21 23:57:26 +00:00

Author	SHA1	Message	Date
justLV	13f9d59245	Add Qwen3-TTS as local TTS backend with voice cloning. Adds mlx-audio-based Qwen3-TTS as an alternative to ElevenLabs, enabling fully offline voice synthesis with voice cloning from a short reference audio clip. Benchmarked at 0.52x RTF (sub-realtime) on Apple Silicon with the 1.7B-Base-4bit model.	2026-02-09 13:53:46 -08:00
justLV	0c9c75b3bf	Replace webrtcvad with Silero VAD (ONNX, no PyTorch). Switch from webrtcvad's binary is_speech to Silero VAD's calibrated float probability via direct ONNX session calls with numpy. The LSTM provides temporal smoothing natively, eliminating the sliding window hack. Frame size changes from 480 (30ms) to 512 (32ms) end-to-end to match Silero's requirements. Consolidate pipeline/requirements.txt into root requirements.txt, swap webrtcvad+setuptools for silero-vad+onnxruntime.	2026-02-07 17:00:02 -08:00
justLV	7162aa0f3b	Improve pipeline setup, logging, and test client compatibility. Move venv to repo root with combined requirements.txt, fix libopus/portaudio discovery on macOS, replace deprecated audioop with numpy u-law encoder, add colored pipeline logging with suppressed third-party noise, fix mic deadlock on non-speech rejection, fix localhost IP mismatch for test client, add VAD visualization bar, tune VAD for conversational speech, and move runtime data to gitignored data/ directory.	2026-02-07 16:22:53 -08:00
justLV	b3538493a6	Add modular async pipeline server and ESP32 mDNS fallback. Pipeline: async voice pipeline replacing monolithic threaded server. ASR, LLM, and TTS are independent pluggable services. ASR calls external parakeet-asr-server, LLM uses any OpenAI-compatible endpoint, TTS uses ElevenLabs with pluggable backend interface. Firmware: add mDNS hostname resolution as fallback when multicast discovery doesn't work. Resolves configured server_hostname via MDNS.queryHost() on boot, falls back to multicast if resolution fails. Also adds test_client.py that emulates an ESP32 device for testing without hardware (TCP server, Opus decode, mic streaming).	2026-02-07 15:04:12 -08:00
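
The frame-size change described in commit 0c9c75b3bf (480 samples at 30 ms to 512 samples at 32 ms) implies a 16 kHz sample rate. The sketch below only works through that arithmetic and shows one way to cut a sample buffer into fixed 512-sample frames. The names `frame_ms` and `split_frames` are hypothetical and not taken from the repository, and the 16 kHz rate is an inference from the numbers in the commit message rather than something the log states.

```python
# Assumption: 16 kHz, inferred from "480 (30ms)" and "512 (32ms)" in the commit message.
SAMPLE_RATE_HZ = 16_000


def frame_ms(frame_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration of one frame in milliseconds."""
    return 1000.0 * frame_samples / sample_rate_hz


def split_frames(samples, frame_samples=512):
    """Split a sample sequence into consecutive fixed-size frames.

    A trailing partial frame is dropped; a real pipeline might instead
    buffer it until more audio arrives.
    """
    usable = len(samples) - len(samples) % frame_samples
    return [samples[i:i + frame_samples] for i in range(0, usable, frame_samples)]


one_second = [0] * SAMPLE_RATE_HZ       # one second of silence
frames = split_frames(one_second)

print(frame_ms(480))   # old webrtcvad frame duration
print(frame_ms(512))   # new Silero VAD frame duration
print(len(frames))     # whole 512-sample frames in one second
```

Because 16000 is not a multiple of 512, one second of audio does not divide into whole frames, which is one reason a change like this has to be applied end-to-end rather than only at the VAD.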

4 commits
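
Commit 7162aa0f3b mentions replacing the deprecated `audioop` module with a numpy u-law encoder. The repository's actual encoder is not shown in this log, so the following is only a minimal sketch of a standard G.711 mu-law encoder for 16-bit PCM, written with numpy. The function name `lin2ulaw` and the constant names are assumptions chosen for illustration.

```python
import numpy as np

ULAW_BIAS = 0x84    # 132, standard G.711 mu-law bias
ULAW_CLIP = 32635   # largest magnitude that still fits after adding the bias


def lin2ulaw(pcm):
    """Encode 16-bit linear PCM samples as G.711 mu-law bytes (illustrative sketch)."""
    s = np.asarray(pcm, dtype=np.int16).astype(np.int32)
    sign = np.where(s < 0, 0x80, 0)
    mag = np.minimum(np.abs(s), ULAW_CLIP) + ULAW_BIAS

    # Segment number (0..7): count how many power-of-two thresholds
    # from 256 up to 16384 the biased magnitude reaches.
    exponent = np.zeros_like(mag)
    for k in range(7):
        exponent += (mag >= (256 << k))

    # Four mantissa bits taken from just below the segment's leading bit.
    mantissa = (mag >> (exponent + 3)) & 0x0F

    # mu-law bytes are transmitted bit-inverted.
    ulaw = ~(sign | (exponent << 4) | mantissa) & 0xFF
    return ulaw.astype(np.uint8).tobytes()


encoded = lin2ulaw(np.array([0, 32767, -32768], dtype=np.int16))
print(encoded.hex())
```

Counting thresholds with integer comparisons avoids floating-point `log2`, so the segment boundaries stay exact for every int16 input.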