tmilewski/gist:3933235

## gistfile1.md

      
[Transcript from http://www.youtube.com/watch?v=z73uiWKdJhw and IRC]

Why is the server unhappy?


- The Bundler API is 70%-80% of traffic, and it has to resolve the dependency graph and Marshal the response
  - Processes are spinning in Ruby at 380% CPU (of 400% total: 4 x 100%)
  - Bundler can only use vanilla Ruby with no extra gems, which is why we have to use Marshal
- Redis heap size
- Disk space
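
To make the Marshal point concrete, here is a minimal sketch of building a dependency payload for a list of gems. The `DEPENDENCIES` table, the gem names, and the field names are illustrative assumptions, not the actual rubygems.org data model.

```ruby
# Hypothetical in-memory stand-in for the dependency data.
# Gem names, versions, and field names are made up for illustration.
DEPENDENCIES = {
  "alpha" => [
    { name: "alpha", number: "1.0.0", platform: "ruby", dependencies: [] },
    { name: "alpha", number: "1.1.0", platform: "ruby",
      dependencies: [["beta", "~> 2.0"]] }
  ],
  "beta" => [
    { name: "beta", number: "2.0.1", platform: "ruby", dependencies: [] }
  ]
}.freeze

# Collects every known version of each requested gem and serializes the
# result with Marshal, which ships with Ruby and needs no extra gems.
def dependencies_payload(gem_names)
  rows = gem_names.flat_map { |name| DEPENDENCIES.fetch(name, []) }
  Marshal.dump(rows)
end

# A client can decode the payload with Marshal.load.
rows = Marshal.load(dependencies_payload(%w[alpha beta]))
rows.each { |row| puts "#{row[:name]} #{row[:number]}" }
```

Because every version of every requested gem is included, the payload grows as versions accumulate, which matches the response-size concern in these notes.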

Timing: the middle of the day in the US
Responses include every possible gem version, so as the number of gem versions grows, responses get bigger and bigger (quadratic?). Currently, responses are about 500K, but we could feasibly see 1M responses in the next few months.
Heroku API hosting

hone02 is working on a Heroku Sinatra app to handle Bundler API requests, so we can spin up more instances as needed
Source code at: https://github.com/rubygems/bundler-api
It resolves dependencies via a Postgres query, in about 150 ms
Polling

To start with, if we poll every minute and check for 304 Not Modified, then most minutes there won't be anything to do.
We're going to start with this since it's the smallest thing that works, then webhooks can come after.
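
A minimal sketch of that polling loop, assuming the upstream endpoint supports conditional requests via `ETag` / `If-None-Match`. The class name `IndexPoller`, the injectable `fetcher`, and the URL are hypothetical; the notes do not say which endpoint would be polled.

```ruby
require "net/http"
require "uri"

# Hypothetical poller: remembers the last ETag and sends it back on the
# next request so the server can answer 304 Not Modified.
class IndexPoller
  attr_reader :etag

  # fetcher is any callable taking (uri, headers) and returning a response
  # that answers #code, #body, and #[]; it defaults to a real HTTP GET.
  def initialize(url, fetcher: nil)
    @uri = URI(url)
    @etag = nil
    @fetcher = fetcher || method(:http_get)
  end

  # Returns the response body when something changed, or nil on a 304.
  def poll
    headers = {}
    headers["If-None-Match"] = @etag if @etag
    response = @fetcher.call(@uri, headers)
    return nil if response.code == "304"

    @etag = response["ETag"]
    response.body
  end

  private

  def http_get(uri, headers)
    Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
      http.get(uri.request_uri, headers)
    end
  end
end
```

Run once a minute (for example `loop { body = poller.poll; update(body) if body; sleep 60 }`, where `update` is whatever applies the new index), most iterations would return `nil` and do nothing.
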
Webhooks

- How reliable are they?
- What happens when things are out of sync?
- How do we replay changes if apps get out of sync?

One way to do this: the webhook fires to update the API synchronously, and enqueues the update into DJ (Delayed Job) if it takes longer than 5 seconds.
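
A rough sketch of the "synchronous first, queue on timeout" idea using Ruby's standard `Timeout` module. The `handle_webhook` name and the plain array standing in for the DJ queue are assumptions for illustration only.

```ruby
require "timeout"

# Stand-in for the DJ queue; a real app would enqueue a background job here.
RETRY_QUEUE = []

# Tries to apply the update inline. If the block runs longer than `limit`
# seconds, it is interrupted and the payload is queued for later instead.
def handle_webhook(payload, limit: 5, queue: RETRY_QUEUE)
  Timeout.timeout(limit) { yield payload }
  :applied
rescue Timeout::Error
  queue << payload
  :enqueued
end

status = handle_webhook({ name: "demo", version: "1.0.0" }) do |event|
  # Apply the update to the API's data store here.
  event
end
puts status
```

Since `Timeout` interrupts the block partway through, the update itself would need to be safe to retry (for example, wrapped in a transaction).
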
Yanks

We'll also need to add a webhook for yanks that essentially applies a reverse diff.
Alternatively, we could just reload the full index every hour, which would bring all of the yanks up to date.
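
One way to picture the "reverse diff": a push adds a version to the index and a yank removes it again. This is only a sketch; the event shape (`:type`, `:name`, `:version`) and the hash-of-sets index are assumptions, not the real schema.

```ruby
require "set"

# Applies a single push or yank event to an index of the form
# { "gem name" => Set of version strings }.
def apply_event(index, event)
  versions = (index[event[:name]] ||= Set.new)
  case event[:type]
  when :push then versions << event[:version]
  when :yank then versions.delete(event[:version])
  end
  # Drop gems that have no versions left after a yank.
  index.delete(event[:name]) if versions.empty?
  index
end

index = {}
apply_event(index, { type: :push, name: "demo", version: "1.0.0" })
apply_event(index, { type: :push, name: "demo", version: "1.1.0" })
apply_event(index, { type: :yank, name: "demo", version: "1.0.0" })
puts index["demo"].to_a.inspect
```

The hourly full reload would then act as a safety net for any yank event that was missed.
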
Request Speed

What's our expectation?

Better than now?
10 seconds is the current bundler timeout, so let's start with faster than that

Caching

Reads happen far more often than pushes. Versions pushed per day over the past week:

    # ActiveRecord query: count the versions created on each of the last 7 days
    Version.where("created_at > ?", 7.days.ago)
           .group_by { |version| version.created_at.to_date }
           .map { |date, versions| [date, versions.size] }

    [[Mon, 15 Oct 2012, 299],
     [Tue, 16 Oct 2012, 387],
     [Sat, 13 Oct 2012, 145],
     [Sun, 14 Oct 2012, 207],
     [Wed, 17 Oct 2012, 375],
     [Thu, 18 Oct 2012, 335],
     [Fri, 19 Oct 2012, 310],
     [Sat, 20 Oct 2012, 13]]

Almost every request is different, so a response cache would get very few hits.
Gem push log

Right now, we don't have a log of all pushes.

Possibly:

write out a file to S3 that has "this gem was pushed out at this date"
check that file as needed
fetch gemspecs that are newer
rebuild
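
The steps above could look roughly like this. The log format (one line per push: ISO 8601 timestamp, gem name, version) is an assumption, since no format was decided, and `parse_push_log` / `pushes_since` are hypothetical helper names.

```ruby
require "time"

PushEntry = Struct.new(:pushed_at, :name, :version)

# Parses a push log with one "timestamp name version" entry per line.
def parse_push_log(text)
  text.each_line.map(&:strip).reject(&:empty?).map do |line|
    stamp, name, version = line.split(" ", 3)
    PushEntry.new(Time.iso8601(stamp), name, version)
  end
end

# Returns only the entries pushed after the last successful sync; these are
# the gems whose gemspecs would need to be fetched before rebuilding.
def pushes_since(entries, last_sync)
  entries.select { |entry| entry.pushed_at > last_sync }
end

log = <<~LOG
  2012-10-19T10:00:00Z demo 1.0.0
  2012-10-19T12:30:00Z demo 1.1.0
LOG

newer = pushes_since(parse_push_log(log), Time.iso8601("2012-10-19T11:00:00Z"))
newer.each { |entry| puts "#{entry.name} #{entry.version}" }
```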

Bundler could download the daily log on the first run of the day, then grab the five-minute log throughout the day. This is probably a longer-term thing to work on.
Timing from push to API availability

5 minutes? 10 minutes?
10 minutes on a yank is not a big deal, for instance.
How to do the migration to Heroku app

We'll set up an endpoint (api-test) for us to test with.
Then, we can proxy some smallish amount of traffic over to the Heroku app for testing, to figure out how many dynos we need.
We'll have to get a *.federated.rubygems.org SSL certificate to proxy.
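
To illustrate sending "some smallish amount of traffic" to the new app, here is a hedged sketch of a deterministic percentage split. The real split would most likely live in the proxy configuration; `route_to_heroku?` and the idea of keying on a client identifier are assumptions for illustration.

```ruby
require "zlib"

# Returns true when this client falls inside the first `percent` of 100
# buckets. Hashing the key keeps each client on the same side of the split
# from one request to the next.
def route_to_heroku?(client_key, percent)
  Zlib.crc32(client_key) % 100 < percent
end

%w[client-a client-b client-c].each do |key|
  target = route_to_heroku?(key, 5) ? "heroku" : "current"
  puts "#{key} -> #{target}"
end
```

Raising `percent` only ever adds clients to the Heroku side, so the rollout can be widened gradually while watching dyno load.
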
Mirrors


Bluebox
Tokyo - gets all traffic from Japan and China via geoip

Very simple to run a caching mirror. We'd like to set one of these up in Europe as well.
Rackspace Migration

Stalled at the moment. It would have helped with the load issues, too, but moving the API off first will make the move easier because the load will be a lot lower.
Issues: not enough CPU, network issues connecting to Redis, all things we'll have to deal with when we move
Redis

If we separate the stats from dependency, we could .... [ed: got off on a tangent?]
Stats

There's a lot of data.
Create a retention policy?
1 year of data on S3, 90 days on the site (for instance)
CSV-ish format: date, gem, download count
Issue tabled for now.
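
Although the issue was tabled, the example retention policy can be sketched against the CSV-ish format. The 90-day and 1-year cutoffs come from the notes; the tier names, the helper names, and the reading that older rows are dropped entirely are assumptions.

```ruby
require "date"

SITE_DAYS = 90      # kept on the site
ARCHIVE_DAYS = 365  # kept on S3

# Decides where a row of a given date belongs under the example policy.
def retention_tier(row_date, today)
  age = (today - row_date).to_i
  if age <= SITE_DAYS
    :site
  elsif age <= ARCHIVE_DAYS
    :s3
  else
    :drop
  end
end

# Groups "date,gem,download count" lines by retention tier.
def partition_stats(csvish, today)
  csvish.each_line.map(&:strip).reject(&:empty?).group_by do |line|
    date, _gem, _count = line.split(",")
    retention_tier(Date.iso8601(date), today)
  end
end

stats = <<~CSV
  2012-10-19,demo,120
  2012-05-01,demo,80
  2011-01-01,demo,15
CSV

partition_stats(stats, Date.new(2012, 10, 20)).each do |tier, rows|
  puts "#{tier}: #{rows.size} row(s)"
end
```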