Voice Assistant Architecture
Overview: Voice Assistant Architecture
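A typical voice assistant latency budget, stage by stage: VAD (voice activity detection) 10 ms → STT (speech-to-text) 100-200 ms → LLM 200-500 ms → TTS (text-to-speech) 90-200 ms → network 50-100 ms, for 450-1010 ms end to end. Pipecat is a widely used open-source framework for building this kind of pipeline. The hardest UX problem is barge-in: detecting that the user has started speaking while TTS audio is still playing, so that playback can stop and the assistant can listen.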
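To make the budget arithmetic and the barge-in idea concrete, here is a minimal Python sketch. The stage ranges come from the budget above. Everything else is an assumption for illustration: the names `BUDGET_MS`, `total_latency`, and `should_barge_in` are hypothetical, the three-frame threshold is arbitrary, and none of this is Pipecat's API.

```python
# Per-stage latency ranges in milliseconds, taken from the budget above.
BUDGET_MS = {
    "vad": (10, 10),
    "stt": (100, 200),
    "llm": (200, 500),
    "tts": (90, 200),
    "network": (50, 100),
}


def total_latency(budget):
    """Return (best_case, worst_case) end-to-end latency in ms."""
    best = sum(low for low, _ in budget.values())
    worst = sum(high for _, high in budget.values())
    return best, worst


def should_barge_in(tts_playing, speech_flags, min_speech_frames=3):
    """Hypothetical barge-in rule (an assumption, not a library API).

    Interrupt only if TTS is currently playing and the VAD marked the most
    recent `min_speech_frames` audio frames as speech. Requiring several
    consecutive frames is one simple way to ignore brief noise.
    """
    if not tts_playing or len(speech_flags) < min_speech_frames:
        return False
    return all(speech_flags[-min_speech_frames:])


best, worst = total_latency(BUDGET_MS)
print(f"end-to-end: {best}-{worst} ms")  # end-to-end: 450-1010 ms
```

In a real pipeline, `speech_flags` would be the per-frame output of a VAD running on the microphone stream while TTS plays. A larger `min_speech_frames` reduces false interruptions but delays the moment playback stops, which is the trade-off that makes barge-in hard to tune.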