@tmilewski
Forked from wesgarrison/gist:3921560
Created October 22, 2012 18:30
October 19 2012 - rubygems.org - notes on server

[transcript from http://www.youtube.com/watch?v=z73uiWKdJhw and irc]

Why is the server unhappy?

  • The Bundler API is 70%-80% of traffic, and each request has to resolve the dependency graph and Marshal the response
  • x Ruby processes are spinning at 380% CPU (of 400% total [4 x 100%])
  • x Bundler can only use vanilla gems, so that's why we have to use Marshal
  • Redis heap size
  • Disk space

Timing - middle of the day US

Responses include every possible version of each gem, so as the number of gem versions grows, responses get bigger and bigger (quadratic?). Currently, responses are about 500K, but we could feasibly see 1M responses in the next few months.

Heroku API hosting

hone02 is working on a Sinatra app hosted on Heroku to handle Bundler API requests, so we can spin up more instances as needed.

Source code at: https://github.com/rubygems/bundler-api

It resolves dependencies via a Postgres query, which takes about 150ms.

Polling

To start with, if we poll every minute and check for 304 Not Modified, then most minutes there won't be anything to do.

We're going to start with this since it's the smallest thing that works, then webhooks can come after.
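
A minimal sketch of one polling step, assuming the upstream server supports conditional GETs via an ETag header (the notes only mention 304 Not Modified, so the header choice is an assumption). The names conditional_get and poll_once and the fetch: parameter are hypothetical, introduced only for illustration.

```ruby
require "net/http"
require "uri"

# Performs one conditional GET. `etag` is the value saved from the previous
# poll (nil on the first run). Returns the Net::HTTPResponse.
def conditional_get(uri, etag)
  request = Net::HTTP::Get.new(uri)
  request["If-None-Match"] = etag if etag
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
end

# Returns [changed, etag, body]. On 304 Not Modified there is nothing to do,
# so the old ETag is kept and the body is nil.
# `fetch` is injectable so the logic can be exercised without a network.
def poll_once(uri, etag, fetch: method(:conditional_get))
  response = fetch.call(uri, etag)
  return [false, etag, nil] if response.code == "304"

  [true, response["ETag"], response.body]
end
```

A caller would run poll_once once a minute, store the returned ETag, and only rebuild when the first element is true.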

Webhooks

  • How reliable are they?
  • What happens when things are out of sync?
  • How do we replay changes if apps become out of sync?

One way to do this: the webhook fires to update the API synchronously, and enqueues the update into DJ (Delayed Job) if it takes longer than 5 seconds.
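
A rough sketch of that "try inline, fall back to the queue" idea. It uses Ruby's Timeout module and a plain Array as a stand-in for the Delayed Job queue; handle_webhook and PENDING_UPDATES are hypothetical names, not part of any real app. An update interrupted by the timeout may have been partially applied, so this assumes updates are safe to retry.

```ruby
require "timeout"

# In-memory stand-in for a Delayed Job queue (assumption for illustration).
PENDING_UPDATES = []

# Runs the update inline. If it takes longer than `limit` seconds, the payload
# is pushed onto the queue for a background worker to retry.
def handle_webhook(payload, limit: 5, &update)
  Timeout.timeout(limit) { update.call(payload) }
  :applied
rescue Timeout::Error
  PENDING_UPDATES << payload
  :enqueued
end
```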

Yanks

Also, we'll need to add a webhook for yanks and essentially apply a reverse diff.

Perhaps we just reload the full index every hour, which would bring all of the yanks up to date.
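
To make the "reverse diff" idea concrete, here is a small sketch that treats the index as a Hash of gem name to a Set of version strings. That data structure and the apply_push / apply_yank names are assumptions for illustration; the real app stores this in Postgres.

```ruby
require "set"

# A push adds a version to the gem's entry, creating the entry if needed.
def apply_push(index, name, version)
  (index[name] ||= Set.new) << version
  index
end

# A yank is the reverse of a push: remove the version, and drop the gem
# entirely once no versions remain. Yanking something unknown is a no-op.
def apply_yank(index, name, version)
  versions = index[name]
  return index unless versions

  versions.delete(version)
  index.delete(name) if versions.empty?
  index
end
```

Because a yank of an unknown version does nothing, replaying the same yank twice is harmless, which matters if webhooks are delivered more than once.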

Request Speed

What's our expectation?

  • Better than now?
  • 10 seconds is the current bundler timeout, so let's start with faster than that

Caching

Reads happen often compared to pushes.

# Pushes per day over the last 7 days (run in a Rails console)
Version.where("created_at > ?", 7.days.ago)
       .group_by { |version| version.created_at.to_date }
       .map { |date, versions| [date, versions.size] }

[[Mon, 15 Oct 2012, 299], 
 [Tue, 16 Oct 2012, 387], 
 [Sat, 13 Oct 2012, 145], 
 [Sun, 14 Oct 2012, 207], 
 [Wed, 17 Oct 2012, 375], 
 [Thu, 18 Oct 2012, 335], 
 [Fri, 19 Oct 2012, 310], 
 [Sat, 20 Oct 2012, 13]]

Almost every request is different, so caching wouldn't really result in any cache hits.

Gem push log

Right now, we don't have a log of all pushes.
Possibly:

  • write out a file to S3 that has "this gem was pushed out at this date"
  • check that file as needed
  • fetch gemspecs that are newer
  • rebuild

Bundler could download the daily log on the first run of the day, then grab the five-minute log throughout the day. This is probably a longer-term thing to work on.
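
A sketch of the "check the file, fetch what is newer" steps. The log format here (one line per push: ISO 8601 timestamp, gem name, version) is purely hypothetical, since the notes do not define one, and the gem names in any sample data would be made up. Fetching the file from S3 is left out; the code only parses text and filters it.

```ruby
require "time"

# One push event. The line format is an assumed
# "<ISO 8601 timestamp> <gem name> <version>".
PushEntry = Struct.new(:pushed_at, :name, :version)

# Parses the whole log text into PushEntry records, skipping blank lines.
def parse_push_log(text)
  text.each_line.map(&:strip).reject(&:empty?).map do |line|
    stamp, name, version = line.split(" ", 3)
    PushEntry.new(Time.iso8601(stamp), name, version)
  end
end

# Returns only the pushes that happened after the last successful sync,
# i.e. the gemspecs that still need to be fetched before a rebuild.
def pushes_since(entries, last_sync)
  entries.select { |entry| entry.pushed_at > last_sync }
end
```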

Timing from push to API availability

5 minutes? 10 minutes?

10 minutes on a yank is not a big deal, for instance.

How to do the migration to Heroku app

We'll set up an endpoint (api-test) for us to test with. Then we can proxy a smallish amount of traffic over to the Heroku app to figure out how many dynos we need.
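
One way to picture proxying a small share of traffic is a deterministic percentage split. In practice this would more likely live in the front-end proxy configuration than in Ruby, so treat this only as a sketch; route_for and the choice of hashing a request key with CRC32 are assumptions, not something decided in the meeting.

```ruby
require "zlib"

# Maps a request key (for example, the client IP) to a bucket from 0 to 99
# and sends the lowest `percent_to_heroku` buckets to the Heroku app.
# The same key always lands in the same bucket, so a client is routed
# consistently while the percentage is unchanged.
def route_for(request_key, percent_to_heroku)
  bucket = Zlib.crc32(request_key) % 100
  bucket < percent_to_heroku ? :heroku : :existing
end
```

Raising percent_to_heroku gradually moves more clients over without reshuffling the ones already on the new app.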

We'll have to get a *.federated.rubygems.org SSL certificate to proxy.

Mirrors

  • Bluebox
  • Tokyo - gets all traffic from Japan and China via geoip

Very simple to run a caching mirror. We'd like to set one of these up in Europe as well.

Rackspace Migration

Stalled at the moment. The migration would have helped with the load issues too, but moving the API off first will make the move easier because the load will be a lot lower.

Issues: not enough CPU and network issues connecting to Redis, all things we'll have to deal with when we move.

Redis

If we separate the stats from dependency, we could .... [ed: got off on a tangent?]

Stats

There's a lot of data. Create a retention policy? For instance, 1 year of data on S3 and 90 days on the site, in a CSV-ish format: date, gem, download count.
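
A sketch of how such a policy could sort rows into tiers, using the example numbers above (90 days on the site, 1 year on S3). The function names and the :site / :s3 / :expired labels are hypothetical, and the parsing is a naive comma split that assumes no quoted fields.

```ruby
require "date"

# Decides where a row belongs based on its age in days.
def retention_tier(date, today)
  age = (today - date).to_i
  return :site if age <= 90
  return :s3 if age <= 365

  :expired
end

# Groups "date,gem,download count" rows by retention tier.
def partition_stats(csv_text, today)
  rows = csv_text.each_line.map(&:strip).reject(&:empty?).map { |line| line.split(",") }
  rows.group_by { |date, _gem, _count| retention_tier(Date.iso8601(date), today) }
end
```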

Issue tabled for now.
