Our agent-runtime service handled WebSocket messages from multiple channels (Teams, Chatwoot, Dixa) and called the LLM synchronously within the same request lifecycle. Under concurrent load (more than 20 simultaneous sessions), this caused:
- WebSocket timeouts when LLM inference exceeded 30s
- Message loss on reconnection during inference
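
The timeout failure can be reproduced in a small simulation. The sketch below is illustrative only and is not the service's code: `call_llm`, `handle_message`, `simulate`, and all three constants are hypothetical, the durations are scaled down from the 30s limit, and the fixed backend capacity is an assumed mechanism for why inference slows under load. It covers only the timeout case, not message loss on reconnection.

```python
import asyncio

# All values are hypothetical and scaled down for a quick run.
WS_TIMEOUT_S = 0.5    # stands in for the 30 s WebSocket limit
LLM_LATENCY_S = 0.2   # assumed time for a single inference
LLM_CAPACITY = 5      # assumed number of inferences the backend runs in parallel


async def call_llm(prompt: str, slots: asyncio.Semaphore) -> str:
    """Stand-in for the LLM call: waits for a free slot, then sleeps."""
    async with slots:
        await asyncio.sleep(LLM_LATENCY_S)
    return f"reply to {prompt}"


async def handle_message(prompt: str, slots: asyncio.Semaphore) -> str:
    """Inline handling: the reply must be ready before the timeout expires."""
    try:
        return await asyncio.wait_for(call_llm(prompt, slots), timeout=WS_TIMEOUT_S)
    except asyncio.TimeoutError:
        return "timeout"


async def simulate(sessions: int) -> list:
    """Run one message per session, all arriving at the same moment."""
    slots = asyncio.Semaphore(LLM_CAPACITY)
    handlers = (handle_message(f"msg-{i}", slots) for i in range(sessions))
    return await asyncio.gather(*handlers)


light = asyncio.run(simulate(5))    # fits within the assumed capacity
heavy = asyncio.run(simulate(21))   # more than 20 simultaneous sessions
print("timeouts with 5 sessions:", light.count("timeout"))
print("timeouts with 21 sessions:", heavy.count("timeout"))
```

With five sessions every handler gets a slot immediately and replies in time. With 21 sessions the later handlers spend most of their timeout budget waiting for a slot, so they fail even though a single inference is well under the limit.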