Zero downtime deploys with unicorn + nginx + runit + rvm + chef

Below are the actual files we use in one of our latest production applications at Agora Games to achieve zero downtime deploys with unicorn. You've probably already read the GitHub blog post on Unicorn and would like to try zero downtime deploys for your application. I hope these files and notes help. I am happy to update these files or these notes if there are comments/questions. YMMV (of course).

Other application notes:

  • Our application uses MongoDB, so we don't have schema migrations to worry about as we would with MySQL or PostgreSQL. We still have to watch for other database issues, such as indexes being built in MongoDB.
  • We use capistrano for deployment.

Salient points for each file:

  • deploy.rb: deploy:restart task should send a USR2 to the application controlled via runit.
  • rails-application.conf: We run unicorn in production listening on a UNIX socket behind NGINX.
  • rails-application.rb: Tell chef to send a USR2 to the application controlled via runit.
  • sv-rails-application-run.erb: You need to daemonize the unicorn process with -D. The whole file is important :)
  • unicorn.rb: Choose a location for the unicorn PID file that is outside of the RAILS_ROOT directory. We also needed the before_exec block to address an issue with old releases being cleaned up by Capistrano.

Libraries for achieving zero downtime with database migrations:

  • Large Hadron Migrator gem - "a gem for online ActiveRecord and DataMapper migrations."
  • mysql_role_swap script - "a script written in Ruby to perform all of the tasks that we normally perform when promoting a slave database to master."

Articles and approaches for achieving zero downtime with database migrations:

Articles on continuous delivery and continuous deployment:

# Location: RAILS_ROOT/config/deploy.rb
# We noticed that the default asset precompilation happens after the current/ symlink is created. We
# changed asset precompilation to happen before the current/ symlink is moved so that we don't have a period
# where stylesheets, etc. for the running unicorn process are invalid.
before 'deploy:create_symlink', 'deploy:assets:precompile'
namespace :deploy do
  desc <<-DESC
    Send a USR2 to the unicorn process to restart for zero downtime deploys.
    runit expects 2 to tell it to send the USR2 signal to the process.
  DESC
  task :restart, :roles => :app, :except => { :no_release => true } do
    run "sv 2 #{application}"
  end
end
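
For readers unfamiliar with runit, the `sv 2` call above relies on the way `sv` maps short control commands to signals. The sketch below records that mapping as described in the sv(8) man page. The constant `SV_COMMAND_FOR_SIGNAL` and the helper `sv_command` are hypothetical names used for illustration only; they are not part of our deploy.rb.

```ruby
# Signals that runit's `sv` can deliver, keyed by signal name, with the
# control command that sv(8) documents for each one.
SV_COMMAND_FOR_SIGNAL = {
  "HUP"  => "hup",
  "ALRM" => "alarm",
  "INT"  => "interrupt",
  "QUIT" => "quit",
  "USR1" => "1",
  "USR2" => "2",
  "TERM" => "term",
  "KILL" => "kill",
}.freeze

# Hypothetical helper: build the sv command line that delivers `signal`
# to the runit service named `service`.
def sv_command(signal, service)
  command = SV_COMMAND_FOR_SIGNAL.fetch(signal) do
    raise ArgumentError, "sv has no control command for SIG#{signal}"
  end
  "sv #{command} #{service}"
end

puts sv_command("USR2", "rails-application") # prints "sv 2 rails-application"
```

With a helper like this, the Capistrano task body could read `run sv_command("USR2", application)`, which makes the intent of the bare `2` easier to see.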
# Location: cookbooks/nginx/files/default/rails-application.conf
# We run unicorn in production listening on a UNIX socket behind NGINX.
upstream unicorn {
  server unix:/var/www/rails-application/tmp/sockets/production.sock fail_timeout=0;
}
server {
  listen 80;
  server_name rails-application.domain.com;
  root /var/www/rails-application/current/public;
  # set far-future expiration headers on static content
  expires max;
  server_tokens off;
  # set up the rails servers as a virtual location for use later
  location @rails {
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_intercept_errors on;
    proxy_redirect off;
    proxy_pass http://unicorn;
    expires off;
  }
  location / {
    try_files $uri @rails;
  }
}
# Location: cookbooks/unicorn/recipes/rails-application.rb
# As with the deploy.rb for capistrano, runit expects 2 to tell it to send
# the USR2 signal to the process. Configure the runit service to use that
# for the restart command. Also, chef-client runs every 30 minutes and
# will run the restart command every run unless you tell it not to.
runit_service 'rails-application' do
  restart_command '2'
  run_restart false
end
# Location: cookbooks/unicorn/templates/default/sv-rails-application-run.erb
# Original author: @brentkirby - https://gist.github.com/1039720
#!/bin/bash
exec 2>&1
<% unicorn_command = @options[:unicorn_command] || 'unicorn_rails' -%>
#
# Since unicorn creates a new pid on restart/reload, it needs a little extra love to
# manage with runit. Instead of managing unicorn directly, we simply trap signal calls
# to the service and redirect them to unicorn directly.
#
# To make this work properly with RVM, you should create a wrapper for the app's gemset unicorn.
#
function is_unicorn_alive {
  set +e
  # Quote $1: with an empty, unquoted argument, [ -n ] evaluates to true
  if [ -n "$1" ] && kill -0 "$1" >/dev/null 2>&1; then
    echo "yes"
  fi
  set -e
}
echo "Service PID: $$"
CUR_PID_FILE=/var/www/rails-application/shared/pids/unicorn.pid
OLD_PID_FILE=$CUR_PID_FILE.oldbin
if [ -e $OLD_PID_FILE ]; then
  OLD_PID=$(cat $OLD_PID_FILE)
  echo "Waiting for existing master ($OLD_PID) to exit"
  while [ -n "$(is_unicorn_alive $OLD_PID)" ]; do
    /bin/echo -n '.'
    sleep 2
  done
fi
if [ -e $CUR_PID_FILE ]; then
  CUR_PID=$(cat $CUR_PID_FILE)
  if [ -n "$(is_unicorn_alive $CUR_PID)" ]; then
    echo "Unicorn master already running. PID: $CUR_PID"
    RUNNING=true
  fi
fi
if [ -z "$RUNNING" ]; then
  echo "Starting unicorn"
  export rvm_user_install_flag=1
  export rvm_trust_rvmrcs=1
  export rvm_trust_rvmrcs_flag=1
  source /var/lib/rails-application/.rvm/scripts/rvm
  cd /var/www/rails-application/current
  # You need to daemonize the unicorn process, http://unicorn.bogomips.org/unicorn_rails_1.html
  bundle exec <%= unicorn_command %> -c config/unicorn.rb -E <%= @options[:environment] || 'production' %> -D
  sleep 3
  CUR_PID=$(cat $CUR_PID_FILE)
fi
function restart {
  echo "Initialize new master with USR2"
  kill -USR2 $CUR_PID
  # Make runit restart to pick up new unicorn pid
  sleep 2
  echo "Restarting service to capture new pid"
  exit
}
function graceful_shutdown {
  echo "Initializing graceful shutdown"
  kill -QUIT $CUR_PID
}
function unicorn_interrupted {
  echo "Unicorn process interrupted. Possibly a runit thing?"
}
trap restart HUP QUIT USR2 INT
# SIGKILL cannot be trapped, so only TERM is handled here
trap graceful_shutdown TERM
trap unicorn_interrupted ALRM
echo "Waiting for current master to die. PID: ($CUR_PID)"
while [ -n "$(is_unicorn_alive $CUR_PID)" ]; do
  /bin/echo -n '.'
  sleep 2
done
echo "You've killed a unicorn!"
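
The `is_unicorn_alive` function in the run script is built on `kill -0`, which checks whether a process exists without delivering a signal. For anyone who wants the same check from Ruby (for example in a Capistrano task or a monitoring script), here is a minimal sketch. The method name `process_alive?` is hypothetical and not part of the files above.

```ruby
# Hypothetical Ruby counterpart of the run script's is_unicorn_alive check.
# Returns true when a process with the given pid exists, false otherwise.
def process_alive?(pid)
  pid = pid.to_i
  # Reject nil, empty and non-numeric input; pid 0 would target the whole
  # process group rather than a single process.
  return false if pid <= 0

  Process.kill(0, pid) # signal 0: existence check only, nothing is delivered
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true  # the process exists but is owned by another user
end

puts process_alive?(Process.pid) # prints "true": the current process exists
```

Treating `EPERM` as "alive" is a design choice: the process is there, we just are not allowed to signal it.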
# Location: RAILS_ROOT/config/unicorn.rb
rails_env = ENV['RAILS_ENV'] || 'development'
worker_processes (rails_env == 'production' ? 6 : 1)
preload_app true
check_client_connection true
timeout 30
case rails_env
when 'production', 'staging'
  # It is *very* important that you choose a location for the unicorn PID file that is
  # outside of the RAILS_ROOT directory. We use capistrano for deployment, where we
  # deploy via remote_cache. We noticed that when we had the unicorn PID file defined
  # in a directory under RAILS_ROOT (the default PID location is RAILS_ROOT/tmp/pids/unicorn.pid),
  # that the script was not able to reclaim the old unicorn PID file after the symlink
  # for current/ gets moved to the latest deploy by capistrano.
  pid '/var/www/rails-application/shared/pids/unicorn.pid'
  listen "/var/www/rails-application/tmp/sockets/#{rails_env}.sock", :backlog => 2048
else
  listen 3001
  listen "#{`pwd`.strip}/tmp/sockets/#{rails_env}.sock"
end
# via http://unicorn.bogomips.org/Sandbox.html
# See section on BUNDLE_GEMFILE for Capistrano users
# We need this since we automatically run deploy:clean to
# cleanup old releases.
before_exec do |server|
  ENV["BUNDLE_GEMFILE"] = "/var/www/rails-application/current/Gemfile"
end
before_fork do |server, worker|
  # When sent a USR2, Unicorn will suffix its pidfile with .oldbin and
  # immediately start loading up a new version of itself (loaded with a new
  # version of our app). When this new Unicorn is completely loaded
  # it will begin spawning workers. The first worker spawned will check to
  # see if an .oldbin pidfile exists. If so, this means we've just booted up
  # a new Unicorn and need to tell the old one that it can now die. To do so
  # we send it a QUIT.
  #
  # Using this method we get 0 downtime deploys.
  old_pid = '/var/www/rails-application/shared/pids/unicorn.pid.oldbin'
  if File.exist?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
      # someone else did our job for us
    end
  end
end
after_fork do |server, worker|
  # Unicorn master loads the app then forks off workers - because of the way
  # Unix forking works, we need to make sure we aren't using any of the parent's
  # sockets, e.g. db connection
  # defined?(ActiveRecord::Base) and ActiveRecord::Base.establish_connection
  # Redis and Memcached would go here but their connections are established
  # on demand, so the master never opens a socket
  # $redis = Redis.connect
end
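
The handoff described in the `before_fork` comments can be tried outside of Unicorn. The sketch below pulls the same logic into a standalone method so it can be exercised against any pid file. The name `quit_old_master` and its return values are assumptions made for illustration; Unicorn itself only needs the block shown in unicorn.rb.

```ruby
# Hypothetical standalone version of the before_fork handoff: if an
# .oldbin pid file exists, ask the process it names to shut down
# gracefully by sending it QUIT.
#
# Returns true when a QUIT was sent, false when there was nothing to do.
def quit_old_master(oldbin_path, current_pid = Process.pid)
  return false unless File.exist?(oldbin_path)

  old_pid = File.read(oldbin_path).to_i
  # Ignore empty or garbage pid files, and never signal ourselves.
  return false if old_pid <= 0 || old_pid == current_pid

  Process.kill("QUIT", old_pid)
  true
rescue Errno::ENOENT, Errno::ESRCH
  # The pid file or the process vanished in the meantime:
  # someone else did our job for us.
  false
end

puts quit_old_master("/nonexistent/unicorn.pid.oldbin") # prints "false"
```

Returning a boolean instead of raising keeps the caller simple: a worker only needs to know whether an old master was asked to exit, not why one was absent.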

We symlink the RAILS_ROOT/tmp/pids dir to the shared dir to keep all the pids. Then we don't have to worry about changing the pid location.

@czarneckid how do you stop the unicorn instance with your runit script?

If I send it a kill/term signal, it does gracefully stop the unicorn instance, but right after, L83 in your script picks up and starts a new unicorn master.

I've been trying/fiddling with the script to get it to actually STOP the unicorn master when called for (and start again when needed), but have been unsuccessful.

I got it working by rewriting the script. See: https://gist.github.com/JeanMertz/8996796. All runit actions are now supported out of the box (start, stop, restart, reload, 2 (USR2))

I've found that the working_directory option is also needed for Capistrano deploys that prune old revisions:

working_directory(path)
sets the working directory for Unicorn. This ensures SIGUSR2 will start a new instance of Unicorn in this directory. This may be a symlink, a common scenario for Capistrano users. Unlike all other Unicorn configuration directives, this binds immediately for error checking and cannot be undone by unsetting it in the configuration file and reloading.

Without this set, I've run into ActionView::MissingTemplate errors after a rolling deploy.
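
A minimal sketch of why this matters, assuming a Capistrano-style layout with a `releases/` directory and a `current` symlink. It only simulates the directory shuffle in a temporary directory and does not involve Unicorn; the method name `simulate_release_swap` and the release names `1` and `2` are hypothetical.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical simulation of a deploy that repoints current/ and then
# prunes the old release, as Capistrano's cleanup does.
def simulate_release_swap
  Dir.mktmpdir do |root|
    old_release = File.join(root, "releases", "1")
    new_release = File.join(root, "releases", "2")
    FileUtils.mkdir_p([old_release, new_release])

    current = File.join(root, "current")
    File.symlink(old_release, current)

    # A process that resolved the symlink at boot is pinned to this path.
    resolved_at_boot = File.realpath(current)

    # Deploy: repoint current/ at the new release, then prune the old one.
    File.unlink(current)
    File.symlink(new_release, current)
    FileUtils.rm_rf(old_release)

    {
      boot_dir_still_exists: File.directory?(resolved_at_boot),
      current_points_to: File.basename(File.realpath(current)),
    }
  end
end

p simulate_release_swap
```

After the swap, the directory resolved at boot no longer exists, while `current` resolves to release `2`. Pointing `working_directory` at the `current` symlink itself lets a master re-executed by USR2 start from the new release rather than from a pruned path.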
