pbailis/rapgenius-balancing.md

## rapgenius-balancing.md

      
    Raw
  

              rapgenius-balancing.md
            
          
    RapGenius has an interesting post about Heroku's randomized load balancing, complaining about how random placement degrades performance compared to prior omniscient approaches. RapGenius ran some simulations, including an experiments with a "Choice of Two" method:

Choice of two routing is the naive method from before, with the twist that when you assign a request to a random dyno, if that dyno is already busy then you reassign the request to a second random dyno, with no regard for whether the second dyno is busy

This differs subtly but substantially from the standard "Power of Two Choices" randomized load balancing:

each [request] is placed in the least loaded of d >= 2 [Dynos] chosen independently and uniformly at random

Take a look at the difference in queue lengths below, for 200 Dynos, 1000 timesteps, and a constant 200 requests per timestep (assuming requests are fulfilled in one timestep):
  Strategy                             Average       99th Percentile
  Random                               16.87119      58.746
  RapGenius "Choice of Two"            9.063825      23.613
  Standard "Power of Two Choices"      3.734270      5.402

Note the 3x difference on average between the two "two choice" strategies (and an almost 5x difference at the 99th percentile).
There are all sorts of fun variants on "Power of Two Choices" and their theoretical implications. I'm hardly an expert, but I recall that the standard "Two Choices" gets you pretty close to optimal, and adding more choices doesn't help nearly as much as the first one. Of course, querying two dynos takes time (although the requests are in parallel), so you can play games like "query 5, check the first two responses." There's some cool work by my friends at Berkeley on doing this for cluster schedulers (no paper yet).
Quick and Dirty Python below:
  '''
  Compares randomized load balancing, RapGenius's "choice of two" and
  traditional "power of two" randomized load balancing for a fixed
  arrival rate (determined by the "LOAD" parameter)
  '''
  
  from random import randint
  
  TRIALS=1000
  DYNOS=200
  LOAD=1
  PCTILE=.99
  
  def average(l):
      return float(sum(l))/len(l)
  
  def randint_wo_replacement(minrange, maxrange, skip):
      ret = randint(minrange, maxrange)
      if ret >= skip:
          ret = ((ret+1) % maxrange)
      return ret
  
  for strategy in ["random", "rapgenius", "pwrtwo"]:
      queues=[0]*DYNOS
      avgs=[]
      pctiles=[]
  
      for trial in range(0, TRIALS):
          for dyno in range(0, DYNOS):
              queues[dyno] = max(queues[dyno]-1, 0)
          for request in range(0, int(DYNOS*LOAD)):
              if strategy == "rapgenius":
                  first_check = randint(0, DYNOS-1)
                  if(queues[first_check] == 0):
                      queues[first_check] += 1
                  else:
                      queues[randint_wo_replacement(0, DYNOS-1, first_check)] += 1
              elif strategy == "pwrtwo":
                  first_check = randint(0, DYNOS-1)
                  second_check = randint_wo_replacement(0, DYNOS-1, first_check)
                  if(queues[first_check] < queues[second_check]):
                      queues[first_check] += 1
                  else:
                      queues[second_check] += 1
              elif strategy == "random":
                  queues[randint(0, DYNOS-1)] += 1
  
          avgs.append(average(queues))
          #get 99th pctile
          queues.sort()
          pctiles.append(queues[int(DYNOS*PCTILE-1)])
      print "%s: avg %f, %f percentile %f" % (strategy, average(avgs), PCTILE, average(pctiles))