tk6sudersen / adr-007-websocket-llm-queue.md
Last active June 4, 2026 19:25
connector-strategy-nestjs.ts

ADR-007 — Decoupling WebSocket Gateway from LLM Inference via Bull/Redis Queue

Context

Our agent-runtime service handled WebSocket messages from multiple channels (Teams, Chatwoot, Dixa) and called the LLM synchronously within the same request lifecycle. Under concurrent load (more than 20 simultaneous sessions), this caused:

  • WebSocket timeouts when LLM inference exceeded 30s
  • Message loss on reconnection during inference