Commit graph

11 commits

Author SHA1 Message Date
justLV
0bc3ae209f Pre-publish fixes: local TTS key, multicast crash guard, doc drift
- Rename TTS backend "qwen3" -> "local" across tts.py and README; the
  code is a generic /v1/audio/speech client, not qwen-specific, and
  config.yaml.example already used the local: key.
- Guard multicast_listener against non-UTF8 and empty packets so a
  single bad announcement packet can't cancel the pipeline via gather.
- Fix credentials.h.template comments to reference flash.sh (not the
  old flash_firmware.sh name).
- Drop stray test.wav arg from serial_monitor.py usage example in
  README; the script takes an optional serial port, not an audio file.
2026-04-12 19:09:58 -07:00
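
The multicast guard described in 0bc3ae209f could look roughly like the sketch below. It assumes announcements are short UTF-8 text payloads. `parse_announcement` is a hypothetical helper name, not necessarily what the pipeline uses. Bad packets are dropped by returning `None` instead of raising, so one malformed datagram cannot propagate an exception into the surrounding `gather`.

```python
from typing import Optional


def parse_announcement(packet: bytes) -> Optional[str]:
    """Return the decoded announcement text, or None if the packet is unusable.

    Hypothetical helper: the real listener may structure this differently.
    """
    if not packet:
        # Empty datagram: nothing to parse.
        return None
    try:
        text = packet.decode("utf-8").strip()
    except UnicodeDecodeError:
        # Non-UTF8 payload: ignore it rather than let the exception escape.
        return None
    # Whitespace-only payloads are treated like empty ones.
    return text or None
```

A listener loop would call this on every received datagram and simply `continue` when it returns `None`.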
justLV
04990145ae Embed git hash in m5_echo, generate git_hash.h in flash.sh
Moves git_hash.h generation from the one-shot setup-git-hash.sh
post-commit hook into flash.sh, so fresh checkouts don't need a
bootstrap step. Only rewrites the file when the hash changes to
avoid triggering unnecessary recompiles.

Also wires GIT_HASH into m5_echo: startup log, multicast announce,
and the Device: line. Both sketches now append the hash to Device:.
2026-04-12 14:18:21 -07:00
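
The "only rewrite when the hash changes" behaviour in 04990145ae is implemented in flash.sh, a shell script. The same idea is sketched here in Python for illustration. The header layout (a single `#define GIT_HASH` line) and the function name are assumptions.

```python
from pathlib import Path


def write_git_hash_header(path: Path, git_hash: str) -> bool:
    """Write the header only if its content would change.

    Returns True when the file was (re)written. Leaving an unchanged file
    untouched preserves its mtime, so the build sees nothing to recompile.
    The header format below is an assumption, not copied from the repo.
    """
    content = f'#define GIT_HASH "{git_hash}"\n'
    if path.exists() and path.read_text() == content:
        return False
    path.write_text(content)
    return True
```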
justLV
a2ab42929c Use m5stack_atom board for Atom Echo, MAC-based hostname
flash.sh: switch m5_echo target from generic pico32 @ 115200 baud to
esp32:esp32:m5stack_atom @ 1500000 baud — ~8x faster uploads and
correct partition scheme (3MB app vs 1.3MB).

m5_echo: derive hostname as m5-echo-XXYYZZ from the WiFi STA MAC,
matching onjuino's pattern, and print Device line after Opus init.
2026-04-10 16:50:51 -07:00
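
A sketch of the MAC-based hostname from a2ab42929c, assuming XXYYZZ stands for the last three bytes of the STA MAC in uppercase hex. The commit message does not state which bytes are used, and the firmware itself is an Arduino sketch, so this Python version is only an illustration.

```python
def hostname_from_mac(mac: str, prefix: str = "m5-echo") -> str:
    """Build a hostname like m5-echo-XXYYZZ from a colon- or dash-separated MAC.

    Assumption: the suffix is the last three octets, uppercased.
    """
    octets = mac.replace("-", ":").split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a MAC address: {mac!r}")
    for o in octets:
        int(o, 16)  # raises ValueError on non-hex input
    return f"{prefix}-{''.join(octets[3:]).upper()}"
```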
justLV
fcc2ef284b Fix Opus TCP read race: check available() after disconnect in frame read loop
2026-04-08 14:00:33 -07:00
justLV
260fbea9eb Fix onjuino interaction description (VAD, not double-tap), update m5_echo README terminology
2026-04-08 13:49:09 -07:00
justLV
398f89dca7 Prepare repo for v2 release: rewrite README, clean up dev scripts, embed ASR server
- Rewrite README with v2 features (OpenClaw, M5 Echo, Opus, pluggable backends),
  fold ARCHITECTURE.md and PIPELINE.md content inline
- Remove dev-only test scripts (streaming TTS, UDP recv, qwen3 bench, etc.)
- Remove redundant m5_echo/flash.sh and terminal.py (root scripts handle both)
- Consolidate credentials to .template naming, remove .example
- Embed parakeet-mlx ASR server as optional dependency (pipeline/services/asr_server.py)
- Default LLM to Claude Haiku 4.5 via OpenRouter, local example uses Gemma 4 E4B
- Update pyproject.toml with metadata, bump to 2.0.0
- Clean up .gitignore
2026-04-08 13:00:15 -07:00
justLV
e4d7bc7ca5 End-of-speech protocol, LED tweaks, call-end guard
- Handle zero-length Opus frame (0x00 0x00) as end-of-speech marker:
  exits opusDecodeTask cleanly, clears isPlaying, re-enables mic
- Zero I2S DMA buffer on opusDecodeTask exit (prevents stale DMA)
- Reject 0xAA audio commands when callActive is false (prevents
  bridge from restarting playback after user double-tapped to end)
- Don't reset mic_timeout after playback if call was ended
- LED: white flash for tap/interrupt, red-orange for call end
- Pipeline: append end-of-speech marker to Opus TCP payload
- ARCHITECTURE.md: document end-of-speech marker protocol
2026-04-07 16:41:59 -07:00
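
The end-of-speech marker in e4d7bc7ca5 can be illustrated with a small framing sketch. It assumes each Opus frame on the TCP stream is preceded by a 2-byte big-endian length, so that a zero length is the `0x00 0x00` marker. The byte order and the function names are assumptions, since the commit message only specifies the marker itself.

```python
import struct
from typing import Iterable, Iterator

END_OF_SPEECH = b"\x00\x00"  # zero-length frame


def build_payload(frames: Iterable[bytes]) -> bytes:
    """Length-prefix each frame, then append the end-of-speech marker."""
    out = bytearray()
    for frame in frames:
        if not frame:
            raise ValueError("empty frames are reserved for end-of-speech")
        out += struct.pack(">H", len(frame)) + frame
    out += END_OF_SPEECH
    return bytes(out)


def iter_frames(payload: bytes) -> Iterator[bytes]:
    """Yield frames until the zero-length marker (or the end of the data)."""
    offset = 0
    while offset + 2 <= len(payload):
        (length,) = struct.unpack_from(">H", payload, offset)
        offset += 2
        if length == 0:
            return  # end-of-speech: stop decoding cleanly
        if offset + length > len(payload):
            raise ValueError("truncated frame")
        yield payload[offset:offset + length]
        offset += length
```

On the device side, the point where `iter_frames` returns corresponds to the decode task exiting, clearing its playing flag, and re-enabling the mic.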
justLV
7bcb94833c Add PTT device support, IIR DC offset fix, control API, test script updates
PTT devices (--device name=ip:ptt): skip VAD, buffer audio until packets
stop, skip LED commands, interrupt in-flight responses on new audio.
Auto-detected from multicast "PTT" announcement.

HTTP control server on :3002 for runtime device management:
  POST/GET/DELETE /devices

Firmware: replace per-chunk DC offset with IIR filter to eliminate
zipper noise at chunk boundaries (m5_echo + onjuino).

Protocol: TCP timeouts use actual timeout param, failures are silent
for non-critical commands (LED blink).

Pipeline: labeled error logging (ASR/LLM/TTS), env var resolution
warning, Gemini OpenAI-compatible endpoint support.

Test scripts: rewritten to use pipeline modules, delete redundant
test_opus_tts.py, add pyproject.toml (replaces requirements.txt).
2026-04-06 14:22:20 -07:00
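
To show why an IIR filter avoids the chunk-boundary artifacts mentioned in 7bcb94833c, here is a sketch of a standard one-pole DC blocker, y[n] = x[n] - x[n-1] + r*y[n-1]. The filter form and the coefficient `r = 0.995` are assumptions, and the firmware's actual filter may differ. The key property is that state carries over between chunks, so the output does not depend on where the chunk boundaries fall.

```python
from typing import List, Sequence


class DCBlocker:
    """One-pole DC-blocking filter with state kept across chunks."""

    def __init__(self, r: float = 0.995) -> None:
        self.r = r          # pole position; closer to 1 = lower cutoff
        self.prev_x = 0.0   # last input sample of the previous call
        self.prev_y = 0.0   # last output sample of the previous call

    def process(self, chunk: Sequence[float]) -> List[float]:
        out = []
        for x in chunk:
            y = x - self.prev_x + self.r * self.prev_y
            self.prev_x = x
            self.prev_y = y
            out.append(y)
        return out
```

Subtracting a per-chunk mean, by contrast, applies a different offset to each chunk, which produces a small step at every boundary.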
justLV
3c133ef40e Add LED toggle, TCP_NODELAY, volume controls, --no-monitor flag
- L command toggles LED on/off (persisted to NVS) to reduce power jitter
- +/- commands adjust volume live, including during playback
- tcpServer.setNoDelay(true) to reduce TCP latency
- flash.sh --no-monitor flag to skip serial monitor after upload
2026-04-03 17:06:44 -07:00
justLV
4cd008d822 Fix I2S stereo interleaving, persist server IP, add volume controls
Write each sample as L+R stereo pair for ALL_RIGHT I2S format — previous
mono writes dropped every other sample causing aliasing/sibilance on speech.
Set I2S rate to 16kHz (was 8kHz with broken 2x assumption).

Also: save server IP to NVS for auto-reconnect on reboot, add +/- serial
volume commands (work during playback), lower default volume to 5.
2026-04-03 16:16:48 -07:00
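
The stereo interleaving fix in 4cd008d822 amounts to writing every mono sample twice, once per channel. The sketch below shows that transformation on 16-bit PCM bytes in Python. The real change lives in the firmware's I2S write path, and the function name here is hypothetical.

```python
def mono_to_stereo_pcm16(mono: bytes) -> bytes:
    """Duplicate each 16-bit sample into an L+R pair."""
    if len(mono) % 2:
        raise ValueError("PCM16 data must have an even number of bytes")
    out = bytearray()
    for i in range(0, len(mono), 2):
        sample = mono[i:i + 2]
        out += sample + sample  # same sample on left and right
    return bytes(out)
```

With this in place the I2S peripheral consumes one stereo frame per source sample, so it can run at the true 16kHz rate instead of relying on a 2x rate workaround.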
justLV
529981de54 Add M5Stack ATOM Echo PTT firmware and onjuino PTT mode flag
New m5_echo/ firmware for the ATOM Echo (ESP32-PICO-D4) with push-to-talk:
- Auto-starts call on boot via PTT multicast announcement
- Button hold = record mic (PDM, mu-law), release = listen
- Persistent TCP connection survives PTT cycles (Opus task discards
  frames during PTT instead of closing connection)
- Handles ESP32 I2S ALL_RIGHT stereo quirks (2x sample rate
  compensation for both mic and speaker)
- Includes flash script, serial terminal, and integration test tools

Also adds PTT_MODE flag to onjuino for bridge compatibility (multicast
announcement, auto-start call, skip VAD mic timeouts).
2026-04-03 15:36:42 -07:00