Rework the stall prompt to distinguish LOOKUP (say something specific,
three-to-seven words) from ACTION (content-free backchannel, two-to-five
words, no action verbs or promises) and restructure test_stall.py to
group cases by expected label for easier manual review.
- git mv test_client, test_mic, test_speaker into tests/
- Add tests/test_stall.py (benchmarks the Gemini stall classifier against
conversational/fetch/capture/act/follow-up queries)
- Add tests/test_stream.py (raw SSE chunk inspection against the agentic
gateway)
- Update config path resolution in the new tests to climb one level
- Update README Testing section with new tests/ paths