Latency-based routing
A voice app lives or dies on response time: silence on the line is the failure mode. Latency-based routing steers each conversational turn to the fastest healthy endpoint, based on live latency measurements.
- Routing decisions add ~95 ms at p50; LLM time dominates
- Latency-based and least-busy strategies for the hot path
- Streaming proxied transparently to your speech layer
- No hot-path buffering — tokens flow as they are generated
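
To make the two strategies concrete, here is a minimal sketch of how a router might choose an endpoint. It is illustrative only, not the product's implementation: the `Endpoint` class, the `pick_lowest_latency` and `pick_least_busy` functions, and the use of an exponentially weighted moving average (EWMA) as the "live" latency figure are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    """Hypothetical record of one model endpoint's live state."""
    name: str
    healthy: bool = True
    ewma_ms: Optional[float] = None  # None until the first sample arrives
    in_flight: int = 0               # requests currently being served

    def record(self, latency_ms: float, alpha: float = 0.3) -> None:
        """Fold a new latency sample into the moving average.

        alpha is an assumed smoothing factor: higher values react
        faster to recent samples.
        """
        if self.ewma_ms is None:
            self.ewma_ms = latency_ms
        else:
            self.ewma_ms = alpha * latency_ms + (1 - alpha) * self.ewma_ms


def _healthy(endpoints: List[Endpoint]) -> List[Endpoint]:
    candidates = [e for e in endpoints if e.healthy]
    if not candidates:
        raise RuntimeError("no healthy endpoints available")
    return candidates


def pick_lowest_latency(endpoints: List[Endpoint]) -> Endpoint:
    """Latency-based: choose the healthy endpoint with the lowest EWMA.

    Endpoints with no samples yet sort first so they get measured.
    """
    return min(
        _healthy(endpoints),
        key=lambda e: (e.ewma_ms is not None, e.ewma_ms or 0.0),
    )


def pick_least_busy(endpoints: List[Endpoint]) -> Endpoint:
    """Least-busy: choose the healthy endpoint with the fewest in-flight requests."""
    return min(_healthy(endpoints), key=lambda e: e.in_flight)
```

In this sketch the caller would run `record()` after each completed turn and adjust `in_flight` around each request, so the next turn is routed on fresh measurements. Unhealthy endpoints are skipped under both strategies, however fast they last measured.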