At a high level, we need to understand two points:
- gRPC is built on HTTP/2, and HTTP/2 is designed to use a single long-lived TCP connection (sticky and persistent) over which all requests are multiplexed;
- To load-balance gRPC, we therefore need to shift from connection balancing to request balancing.
- Load Balancing gRPC services
- Official Doc: gRPC Load Balancing
- gRPC Load Balancing on Kubernetes without Tears (nice explanation on tradeoffs)
- Why load balancing gRPC is tricky? (very good article)
- Old-Official Doc: Load Balancing in gRPC
- Challenges of running gRPC services in production
- Microsoft Doc: Load balancing gRPC
- Load balancing gRPC service with Nginx
- gRPC load balancing — Service Meshes
- gRPC Load Balancing inside Kubernetes
- Proxyless gRPC load balancing in Kubernetes (using xDS API)
- gRPC Loadbalancing in GKE using Nginx Ingress Controller
- On gRPC Load Balancing
- Enabling gRPC and HTTP/2 support on AWS Load Balancer (ALB); unfortunately, it does not support some HTTP/2 features such as server push and multiplexing
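
As a rough illustration of the two points at the top of this section, here is a small, self-contained Python simulation (not real gRPC code). The backend names and both functions are made up for the example. It only models which backend each request lands on when the backend is chosen once per connection versus once per request.

```python
from collections import Counter
from itertools import cycle

# Hypothetical backend names, used only for this illustration.
BACKENDS = ["pod-a", "pod-b", "pod-c"]


def connection_balanced(num_clients, requests_per_client):
    """Connection-level balancing: a backend is picked once, when the connection opens."""
    picker = cycle(BACKENDS)
    hits = Counter()
    for _ in range(num_clients):
        backend = next(picker)  # chosen once per long-lived connection
        # Every request multiplexed on that connection goes to the same backend.
        hits[backend] += requests_per_client
    return hits


def request_balanced(num_clients, requests_per_client):
    """Request-level balancing: a backend is picked again for every request."""
    picker = cycle(BACKENDS)
    hits = Counter()
    for _ in range(num_clients):
        for _ in range(requests_per_client):
            hits[next(picker)] += 1
    return hits


if __name__ == "__main__":
    print("connection balancing:", dict(connection_balanced(1, 300)))
    print("request balancing:   ", dict(request_balanced(1, 300)))
```

With a single client sending 300 requests, `connection_balanced(1, 300)` puts all 300 on one backend, while `request_balanced(1, 300)` spreads them 100 each across the three. That imbalance is what the articles above set out to fix, whether through a proxy, a service mesh, or client-side balancing.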