- Rename TTS backend "qwen3" -> "local" across tts.py and README; the
code is a generic /v1/audio/speech client, not qwen-specific, and
config.yaml.example already used the local: key.
- Guard multicast_listener against non-UTF-8 and empty packets so a
single bad announcement packet can't cancel the pipeline via gather.
- Fix credentials.h.template comments to reference flash.sh (not the
old flash_firmware.sh name).
- Drop stray test.wav arg from serial_monitor.py usage example in
README; the script takes an optional serial port, not an audio file.
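The multicast_listener guard above can be sketched roughly as follows. This is a minimal illustration, not the actual listener code: the helper name `decode_announcement` is hypothetical, and it assumes announcements are short UTF-8 text payloads. The point is that a bad packet is dropped and no exception escapes the listener coroutine.

```python
from typing import Optional


def decode_announcement(packet: bytes) -> Optional[str]:
    """Return the announcement text, or None if the packet should be skipped.

    Hypothetical helper: the real listener may structure this differently.
    """
    # Empty datagrams carry no announcement; skip them.
    if not packet:
        return None
    try:
        text = packet.decode("utf-8")
    except UnicodeDecodeError:
        # Non-UTF-8 garbage: drop the packet instead of raising.
        return None
    text = text.strip()
    # Whitespace-only payloads are treated like empty packets.
    return text or None
```

A receive loop would call this on every datagram and simply `continue` when it returns None.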
Moves git_hash.h generation from the one-shot setup-git-hash.sh
post-commit hook into flash.sh, so fresh checkouts don't need a
bootstrap step. Only rewrites the file when the hash changes to
avoid triggering unnecessary recompiles.
Also wires GIT_HASH into m5_echo: startup log, multicast announce,
and the Device: line. Both sketches now append the hash to Device:.
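The "only rewrite when the hash changes" step lives in flash.sh, which is a shell script; the Python sketch below only illustrates the idea. The `#define GIT_HASH "..."` header layout and both function names are assumptions, not taken from the repository.

```python
from pathlib import Path


def git_hash_header(short_hash: str) -> str:
    """Render the header contents (assumed layout, for illustration)."""
    return f'#define GIT_HASH "{short_hash}"\n'


def write_if_changed(path: Path, content: str) -> bool:
    """Write content to path only if it differs from what is on disk.

    Returns True when the file was (re)written. Leaving an unchanged file
    alone keeps its mtime stable, so the build does not recompile.
    """
    if path.exists() and path.read_text() == content:
        return False
    path.write_text(content)
    return True
```

Calling `write_if_changed(Path("git_hash.h"), git_hash_header(h))` twice with the same hash touches the file only the first time.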
flash.sh: switch m5_echo target from generic pico32 @ 115200 baud to
esp32:esp32:m5stack_atom @ 1500000 baud — ~8x faster uploads and
correct partition scheme (3MB app vs 1.3MB).
m5_echo: derive hostname as m5-echo-XXYYZZ from the WiFi STA MAC,
matching onjuino's pattern, and print Device line after Opus init.
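For reference, the hostname derivation can be sketched on the host side like this. It assumes XXYYZZ means the last three octets of the STA MAC in uppercase hex; the firmware does this in C++, and the function name here is hypothetical.

```python
def hostname_from_mac(mac: str, prefix: str = "m5-echo") -> str:
    """Build a hostname like m5-echo-XXYYZZ from a MAC address string.

    Assumption: the suffix is the last three MAC octets, uppercase hex.
    """
    octets = mac.replace("-", ":").split(":")
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets in MAC address, got {mac!r}")
    # Validate each octet and normalise it to two uppercase hex digits.
    suffix = "".join(f"{int(o, 16):02X}" for o in octets[-3:])
    return f"{prefix}-{suffix}"
```

This makes it easy to predict a device's hostname from the MAC printed at boot.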
- Rewrite README with v2 features (OpenClaw, M5 Echo, Opus, pluggable backends),
fold ARCHITECTURE.md and PIPELINE.md content inline
- Remove dev-only test scripts (streaming TTS, UDP recv, qwen3 bench, etc.)
- Remove redundant m5_echo/flash.sh and terminal.py (root scripts handle both)
- Consolidate credentials to .template naming, remove .example
- Embed parakeet-mlx ASR server as optional dependency (pipeline/services/asr_server.py)
- Default LLM to Claude Haiku 4.5 via OpenRouter, local example uses Gemma 4 E4B
- Update pyproject.toml with metadata, bump to 2.0.0
- Clean up .gitignore
- Handle zero-length Opus frame (0x00 0x00) as end-of-speech marker:
exits opusDecodeTask cleanly, clears isPlaying, re-enables mic
- Zero I2S DMA buffer on opusDecodeTask exit (prevents stale samples
from being replayed out of the DMA buffer)
- Reject 0xAA audio commands when callActive is false (prevents
bridge from restarting playback after user double-tapped to end)
- Don't reset mic_timeout after playback if call was ended
- LED: white flash for tap/interrupt, red-orange for call end
- Pipeline: append end-of-speech marker to Opus TCP payload
- ARCHITECTURE.md: document end-of-speech marker protocol
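A rough sketch of the pipeline side of the end-of-speech marker follows. It assumes each Opus frame is sent with a 2-byte big-endian length prefix; the commit only states that a zero-length frame (0x00 0x00) ends the stream, so the prefix width and byte order for non-empty frames, and both function names, are assumptions.

```python
import struct
from typing import Iterable, Iterator

# Zero-length frame, as described in the commit message.
END_OF_SPEECH = b"\x00\x00"


def frame_opus_packets(packets: Iterable[bytes]) -> bytes:
    """Length-prefix each packet and append the end-of-speech marker."""
    out = bytearray()
    for packet in packets:
        if not packet:
            # An empty packet would be indistinguishable from the marker.
            raise ValueError("empty Opus packet would look like end-of-speech")
        out += struct.pack(">H", len(packet)) + packet
    out += END_OF_SPEECH
    return bytes(out)


def iter_frames(payload: bytes) -> Iterator[bytes]:
    """Yield frames until the zero-length marker (mirrors the decode side)."""
    offset = 0
    while offset + 2 <= len(payload):
        (length,) = struct.unpack_from(">H", payload, offset)
        offset += 2
        if length == 0:
            return  # end-of-speech: stop decoding cleanly
        yield payload[offset:offset + length]
        offset += length
```

The reader stops at the marker instead of waiting for the TCP connection to close, which is what lets the device re-enable the mic promptly.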
PTT devices (--device name=ip:ptt): skip VAD, buffer audio until packets
stop, skip LED commands, interrupt in-flight responses on new audio.
Auto-detected from multicast "PTT" announcement.
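Parsing the `--device name=ip:ptt` form could look roughly like the sketch below. The `DeviceSpec` type and `parse_device_spec` function are hypothetical, and the sketch assumes the `:ptt` suffix is optional and is the only supported flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceSpec:
    name: str
    ip: str
    ptt: bool


def parse_device_spec(spec: str) -> DeviceSpec:
    """Parse 'name=ip' or 'name=ip:ptt' into a DeviceSpec."""
    name, sep, rest = spec.partition("=")
    if not sep or not name:
        raise ValueError(f"expected name=ip[:ptt], got {spec!r}")
    ip, _, flag = rest.partition(":")
    if not ip:
        raise ValueError(f"missing IP address in {spec!r}")
    if flag not in ("", "ptt"):
        raise ValueError(f"unknown device flag {flag!r} in {spec!r}")
    return DeviceSpec(name=name, ip=ip, ptt=(flag == "ptt"))
```

A PTT device then takes the buffered, VAD-free path, while a plain `name=ip` entry keeps the existing behaviour.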
HTTP control server on :3002 for runtime device management:
POST/GET/DELETE /devices
Firmware: replace per-chunk DC offset with IIR filter to eliminate
zipper noise at chunk boundaries (m5_echo + onjuino).
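To show why an IIR filter avoids the chunk-boundary artifacts, here is a minimal one-pole DC blocker in Python. The firmware's actual filter and coefficient are not given in this log; the form `y[n] = x[n] - x[n-1] + r*y[n-1]` and `r = 0.995` are assumptions chosen for illustration.

```python
from typing import List, Sequence


class DCBlocker:
    """One-pole DC-blocking filter whose state persists across chunks."""

    def __init__(self, r: float = 0.995) -> None:
        self.r = r          # pole position; closer to 1.0 = lower cutoff
        self.prev_x = 0.0   # last input sample
        self.prev_y = 0.0   # last output sample

    def process(self, chunk: Sequence[float]) -> List[float]:
        out = []
        for x in chunk:
            y = x - self.prev_x + self.r * self.prev_y
            self.prev_x = x
            self.prev_y = y
            out.append(y)
        return out
```

Because the state carries over between calls, the output is the same whether the signal arrives as one buffer or as many small chunks. A per-chunk mean subtraction, by contrast, shifts the offset at every chunk edge.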
Protocol: TCP timeouts now honor the timeout parameter passed in;
failures are silent for non-critical commands (LED blink).
Pipeline: labeled error logging (ASR/LLM/TTS), env var resolution
warning, Gemini OpenAI-compatible endpoint support.
Test scripts: rewrite to use pipeline modules, delete redundant
test_opus_tts.py, add pyproject.toml (replaces requirements.txt).
- L command toggles LED on/off (persisted to NVS) to reduce power jitter
- +/- commands adjust volume live, including during playback
- tcpServer.setNoDelay(true) to reduce TCP latency
- flash.sh --no-monitor flag to skip serial monitor after upload
Write each sample as L+R stereo pair for ALL_RIGHT I2S format — previous
mono writes dropped every other sample causing aliasing/sibilance on speech.
Set I2S rate to 16kHz (was 8kHz with broken 2x assumption).
Also: save server IP to NVS for auto-reconnect on reboot, add +/- serial
volume commands (work during playback), lower default volume to 5.
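The stereo-pair fix can be illustrated with a small host-side sketch. It assumes 16-bit signed little-endian PCM and that each mono sample is duplicated into both slots; the real change is in the firmware's I2S write path, and the function name is hypothetical.

```python
import struct
from typing import Sequence


def mono_to_stereo_pcm16(samples: Sequence[int]) -> bytes:
    """Duplicate each 16-bit mono sample into an L+R pair.

    Every input sample fills one full stereo frame, so the I2S peripheral
    consumes one sample per frame instead of dropping every other one.
    """
    out = bytearray()
    for s in samples:
        out += struct.pack("<hh", s, s)
    return bytes(out)
```

The output is twice the size of the mono input, which is why the I2S rate can be set to the true 16kHz with no 2x compensation.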
New m5_echo/ firmware for the ATOM Echo (ESP32-PICO-D4) with push-to-talk:
- Auto-starts call on boot via PTT multicast announcement
- Button hold = record mic (PDM, mu-law), release = listen
- Persistent TCP connection survives PTT cycles (Opus task discards
frames during PTT instead of closing connection)
- Handles ESP32 I2S ALL_RIGHT stereo quirks (2x sample rate
compensation for both mic and speaker)
- Includes flash script, serial terminal, and integration test tools
Also adds PTT_MODE flag to onjuino for bridge compatibility (multicast
announcement, auto-start call, skip VAD mic timeouts).
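For readers unfamiliar with the mu-law encoding used on the mic path, the sketch below shows the common G.711-style 16-bit-to-8-bit encoder. It assumes the firmware uses standard G.711 mu-law; this log only says "mu-law", so treat the constants and function name as illustrative.

```python
BIAS = 0x84   # standard G.711 mu-law bias (132)
CLIP = 32635  # largest magnitude that survives adding the bias


def linear_to_ulaw(sample: int) -> int:
    """Encode one signed 16-bit PCM sample as an 8-bit mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit.
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # mu-law bytes are transmitted bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF
```

Each 16-bit sample becomes one byte, halving the mic bandwidth compared with raw PCM.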