shraddhaag/load_balancers.md

## load_balancers.md

      
    Load Balancers

What?


Devices responsible for distributing network traffic (Layer 4) or application traffic (Layer 7).

Layer 4: distributes requests using transport and network layer data, e.g. TCP, UDP, IP
Layer 7: distributes requests using application layer protocols, e.g. HTTP


abstracts the fleet of servers as one big server to the outside world; in other words, the point of contact (POC) is the load balancer instead of the server itself. This is also called "service virtualisation".

Why?


scalability: since the actual server is not the POC, we can modify the number of servers servicing traffic easily
availability/reliability: only sends traffic to servers that are healthy
performance: reduces load on a single server
predictability: traffic is spread by known rules, which gives confidence in and control over how the system behaves

How?


distributes traffic to servers based on an algorithm (e.g. round robin, least response time)
for Layer 7, it also distributes based on application-specific data (e.g. request headers, cookies, request params)
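
The selection strategies above can be sketched in a few lines. This is a minimal illustration, not how any particular load balancer implements them; the server names, the response-time figures, and the choice of the `Host` header as the Layer 7 routing key are all assumptions made for the example.

```python
import itertools

# Hypothetical backend names, used only for illustration.
SERVERS = ["app-1", "app-2", "app-3"]


def round_robin(servers):
    """Return an endless iterator that hands out servers in a fixed rotation."""
    return itertools.cycle(servers)


def least_response_time(avg_response_ms):
    """Pick the server with the lowest observed average response time."""
    return min(avg_response_ms, key=avg_response_ms.get)


def route_layer7(headers, pools_by_host, default_pool):
    """Layer 7 only: pick a server pool from application data (here, the Host header)."""
    return pools_by_host.get(headers.get("Host", ""), default_pool)


rotation = round_robin(SERVERS)
first_four = [next(rotation) for _ in range(4)]

# Assumed measurements in milliseconds.
fastest = least_response_time({"app-1": 120.0, "app-2": 45.5, "app-3": 80.0})

# Assumed mapping from Host header to a dedicated pool.
pool = route_layer7(
    {"Host": "api.example.com"},
    {"api.example.com": ["api-1", "api-2"]},
    SERVERS,
)
```

A Layer 4 balancer would only have the first two functions available, since it never inspects headers or cookies.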

Timeline

In the early days of the commercial internet, startups didn't have the capital to afford dedicated web servers and settled for PC-based servers, which could not handle the traffic they received. And so the hunt for a cost-effective solution began!

DNS round robin:

DNS can resolve a domain name to multiple IP addresses, returning them in a different order on each query.
^ basic solution where the domain name served as the virtual POC
scalability: handled well, with a limitation

as load increased, all that was needed was to add a new server and include its IP address in the DNS records, and done!
DNS responses have a maximum length, restricting the number of servers that can be added


availability: since DNS has no way of knowing server health, IP addresses of unhealthy servers are also returned
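
A toy model of the rotation makes the availability problem easy to see. This is only a sketch: `RoundRobinDNS` is a hypothetical class written for this note, not a real resolver API, and the domain and addresses come from the ranges reserved for documentation.

```python
from collections import deque


class RoundRobinDNS:
    """Toy resolver: returns every record for a name, rotated by one per query."""

    def __init__(self, records):
        self._records = {name: deque(ips) for name, ips in records.items()}

    def resolve(self, name):
        ips = self._records[name]
        answer = list(ips)
        ips.rotate(-1)  # the next query starts with the next address
        return answer


# example.com and 192.0.2.x are reserved for documentation use.
dns = RoundRobinDNS({"example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]})
first = dns.resolve("example.com")
second = dns.resolve("example.com")
```

Nothing in `resolve` looks at server health, so if `192.0.2.2` goes down it keeps appearing in answers until someone edits the records by hand.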


Software based:

built directly into the application software or the OS of the application server; everyone built their own solution
eg:

each server in a cluster also listens on a cluster IP; clients connect to the cluster IP
whichever server first acknowledges the request routes it to an actual server IP
this routing could follow various logic, e.g. picking whichever server has the fewest active sessions


scalability: scalable to a limit; as each server needs to be in contact with all others, inter-server network traffic grew sharply with every added server (n servers need n(n-1)/2 pairwise links)
availability: increased dramatically but was capped by the limited scalability
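
As a rough sketch of the two ideas above: the first function picks the server with the fewest active sessions, and the second counts the pairwise links needed when every server stays in contact with every other one. The IP addresses and session counts are made up for illustration.

```python
def pick_least_sessions(active_sessions):
    """Return the server IP that currently has the fewest active sessions."""
    return min(active_sessions, key=active_sessions.get)


def full_mesh_links(n_servers):
    """Number of server-to-server links when each server talks to all others."""
    return n_servers * (n_servers - 1) // 2


# Assumed snapshot of active sessions per server.
sessions = {"10.0.0.1": 12, "10.0.0.2": 3, "10.0.0.3": 7}
target = pick_least_sessions(sessions)
sessions[target] += 1  # the chosen server takes on the new session

# Links needed as the cluster doubles in size.
links = [full_mesh_links(n) for n in (2, 4, 8, 16)]
```

Doubling the cluster roughly quadruples the number of links, which is why this design stopped scaling after a modest number of servers.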


Network based:

application independent, hosted outside of the application servers
virtual server address (POC); incoming requests get forwarded to one of the actual servers
employed health monitors (although not as comprehensive as the software-based ones) to determine whether a particular server was fit to serve an incoming request
scalability, availability and predictability all improved dramatically
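
The combination of a virtual address and health monitoring can be sketched as follows. `VirtualServer` and its `health_check` callable are hypothetical names for this note; a real device would probe backends over the network on a schedule, while here health is just a lookup in a set.

```python
import itertools


class VirtualServer:
    """Single public address that forwards to healthy backends in rotation."""

    def __init__(self, backends, health_check):
        self._backends = list(backends)
        self._health_check = health_check  # callable: backend -> bool
        self._rotation = itertools.cycle(self._backends)

    def pick_backend(self):
        # Try each backend at most once per request, skipping unhealthy ones.
        for _ in range(len(self._backends)):
            backend = next(self._rotation)
            if self._health_check(backend):
                return backend
        raise RuntimeError("no healthy backend available")


# Assumed state: one of three backends is currently failing its health check.
down = {"10.0.0.2"}
vip = VirtualServer(
    ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    lambda backend: backend not in down,
)
picked = [vip.pick_backend() for _ in range(4)]
```

Unlike DNS round robin, the unhealthy server is skipped automatically and rejoins the rotation as soon as its check passes again.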


Changed the course of the high availability (HA) discussion from "uptime" to what "available" actually means (response time metrics).

References:


https://www.f5.com/services/resources/glossary/load-balancer
https://community.f5.com/t5/technical-articles/what-is-load-balancing/ta-p/282793


## retries.md

      
    Exponential Backoff and Circuit Breaker pattern

A neat way to deal with transient errors using a retry mechanism that is cheaper and smarter than naive immediate retries.
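
A minimal sketch of both patterns, assuming nothing beyond the standard library. All names (`CircuitBreaker`, `backoff_delays`, `retry`) and default values are illustrative choices for this note, not taken from the referenced articles; the `clock`, `sleep` and `jitter` parameters exist only so the behaviour can be exercised without real waiting.

```python
import random
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is not attempted."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # trip (or re-trip) the breaker
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


def backoff_delays(base=0.1, factor=2.0, cap=5.0, attempts=5):
    """Delays before each retry: base, base*factor, ... never above `cap`."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def retry(func, delays, sleep=time.sleep, jitter=random.random):
    """Call func; after each failure wait a jittered delay and try again."""
    for delay in delays:
        try:
            return func()
        except CircuitOpenError:
            raise  # do not keep hammering a dependency the breaker gave up on
        except Exception:
            sleep(delay * jitter())  # wait a random fraction of the delay
    return func()  # final attempt; any exception propagates to the caller
```

The two compose naturally: `retry(lambda: breaker.call(fetch), backoff_delays())`, where `fetch` stands for whatever remote call is being protected. Backoff spaces out retries for short-lived failures, while the breaker stops calling altogether once failures look persistent.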

References:


https://dzone.com/articles/understanding-retry-pattern-with-exponential-back
https://martinfowler.com/bliki/CircuitBreaker.html