
Start and Stop tasks for resque workers, with capistrano deploy hook (without God)

after "deploy:symlink", "deploy:restart_workers"

# Rake helper task.
def run_remote_rake(rake_cmd)
  rake_args = ENV['RAKE_ARGS'].to_s.split(',')
  cmd = "cd #{fetch(:latest_release)} && #{fetch(:rake, "rake")} RAILS_ENV=#{fetch(:rails_env, "production")} #{rake_cmd}"
  cmd += "['#{rake_args.join("','")}']" unless rake_args.empty?
  run cmd
  set :rakefile, nil if exists?(:rakefile)
end

namespace :deploy do
  desc "Restart Resque Workers"
  task :restart_workers, :roles => :db do
    run_remote_rake "resque:restart_workers"
  end
end

# Start a worker with proper env vars and output redirection
def run_worker(queue, count = 1)
  puts "Starting #{count} worker(s) with QUEUE: #{queue}"
  ops = {:pgroup => true, :err => [(Rails.root + "log/resque_err").to_s, "a"],
                          :out => [(Rails.root + "log/resque_stdout").to_s, "a"]}
  env_vars = {"QUEUE" => queue.to_s}
  count.times {
    ## Using Kernel.spawn and Process.detach because regular system() call would
    ## cause the processes to quit when capistrano finishes
    pid = spawn(env_vars, "rake resque:work", ops)
    Process.detach(pid)
  }
end

namespace :resque do
  task :setup => :environment

  desc "Restart running workers"
  task :restart_workers => :environment do
    Rake::Task['resque:stop_workers'].invoke
    Rake::Task['resque:start_workers'].invoke
  end

  desc "Quit running workers"
  task :stop_workers => :environment do
    pids = []
    Resque.workers.each do |worker|
      pids.concat(worker.worker_pids)
    end
    if pids.empty?
      puts "No workers to kill"
    else
      syscmd = "kill -s QUIT #{pids.join(' ')}"
      puts "Running syscmd: #{syscmd}"
      system(syscmd)
    end
  end

  desc "Start workers"
  task :start_workers => :environment do
    run_worker("*", 2)
    run_worker("high", 1)
  end
end
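
The spawn-and-detach pattern that run_worker relies on can be tried in isolation. The following is only a minimal sketch, not part of the gist: it launches a short-lived `sh` command instead of a Resque worker, and the helper name `spawn_detached` and the temporary log path are made up for illustration.

```ruby
require "tmpdir"

# Hypothetical helper (not from the gist): start `command` in its own
# process group, append its stdout and stderr to `log_path`, and detach it
# so the parent never has to wait for it.
def spawn_detached(env, command, log_path)
  options = { :pgroup => true,
              :out => [log_path, "a"],
              :err => [log_path, "a"] }
  pid = spawn(env, *command, options)
  # Process.detach returns a thread that reaps the child when it exits,
  # which avoids leaving a zombie process behind.
  Process.detach(pid)
end

log_path = File.join(Dir.mktmpdir, "worker_stdout")
reaper = spawn_detached({ "QUEUE" => "high" },
                        ["sh", "-c", "echo queue=$QUEUE"],
                        log_path)
reaper.join               # only for this demo; a deploy task would not wait
puts File.read(log_path)  # prints "queue=high"
```

The `:pgroup => true` option puts the child in a new process group, so signals sent to the parent's group do not reach it; that appears to be why the gist prefers spawn over a plain system() call.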


I'm using this code to start my workers, and it works great, but I get the following Capistrano output:

  * executing "cd /home/app/releases/20110512064921 && bundle exec rake RAILS_ENV=production resque:restart_workers"
    servers: [""]
    [] executing command
 ** out ::
 ** [out ::] Running syscmd: kill -s QUIT 2616 2617 30415
 ** [out ::] Starting 1 worker(s) with QUEUE: *
    command finished
[nil] # this is the output of the command
failed: "sh -c 'cd /home/app/releases/20110512064921 && bundle exec rake RAILS_ENV=production resque:restart_workers'" on

The exit code of the command is nil, so Capistrano thinks that the command failed, but I've checked and the workers were started correctly.

Do you have any idea what is happening?

Thanks in advance

Hi Paco,
What output do you get when you run "sh -c 'cd /home/app/releases/20110512064921 && bundle exec rake RAILS_ENV=production resque:restart_workers'" straight on the server?
There might be problems with the stderr and stdout redirection to log/resque_err and log/resque_stdout done in line 4 of the resque.rake script (the ops hash in run_worker).

This is the output; I don't notice anything wrong. I run only one worker, on the * queue.

(in /home/app/releases/20110425142917)
Running syscmd: kill -s QUIT 2635 3992 3993
Starting 1 worker(s) with QUEUE: *

Hi there,

I am using Resque 1.15.0, and this script hangs the Capistrano run when it comes to starting the workers. After waiting a couple of minutes, I have to Ctrl-C and roll back. Is there something I have to do with an older script like this? I am also running Ruby 1.8.7, so I am using posix/spawn to emulate the spawn function.

I wanted to add a word of caution. The worker_pids method will find any process that has the term 'resque' in it. If you are using the resque namespace and doing a restart, then this will find the capistrano threads and kill them so the start_workers task will never be executed.

This is the command the Worker class uses to find non-Solaris pids:

ps -A -o pid,command | grep "[r]esque" | grep -v "resque-web"
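
One plausible explanation for the [nil] exit status reported earlier in this thread, consistent with the warning above, is that the rake process itself received the QUIT signal. Ruby reports a nil exit status for a process that was terminated by a signal instead of exiting on its own. The sketch below demonstrates this with a throwaway `sleep` child standing in for the rake task; it is an illustration, not a diagnosis of that particular deploy.

```ruby
# Start a throwaway child process that stands in for the rake task.
pid = spawn("sleep", "30")

# Send it the same signal the stop_workers task sends to the workers.
Process.kill("QUIT", pid)
_, status = Process.wait2(pid)

puts "exitstatus: #{status.exitstatus.inspect}"  # nil when killed by a signal
puts "signaled?:  #{status.signaled?}"
puts "termsig:    #{status.termsig} (QUIT is #{Signal.list['QUIT']})"
```

A caller that only looks at the exit status cannot tell this apart from a failure, which would match Capistrano reporting the command as failed even though the workers came up.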

i ran into the problem of the task killing itself before completion and returning a non successful status code too. seeing the post above, i decided to manually determine the pid of the workers, instead of using the built in resque method.

the following line, while not the cleanest code, does the job correctly.

pids = []
`ps -A -o pid,command | grep "[r]esque" | grep -v "resque-web" | grep -v "restart_workers" | grep -v "stop_workers" | grep -v "start_workers"`.each_line do |l|
  pids << l.to_i
end
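
The same filtering can be done in Ruby instead of a chain of grep calls, which makes it easy to test against canned ps output. This is a sketch under assumptions: the helper name `worker_pids_from_ps`, the exclusion list constant, and the sample process lines are all made up for illustration.

```ruby
# Words that mark a ps line as something other than a worker.
# The list mirrors the grep -v filters in the pipeline above.
EXCLUDED = %w[resque-web restart_workers stop_workers start_workers]

# Hypothetical helper: extract worker pids from `ps -A -o pid,command` output.
def worker_pids_from_ps(ps_output, excluded = EXCLUDED)
  ps_output.each_line.select { |line|
    line.include?("resque") && excluded.none? { |word| line.include?(word) }
  }.map { |line| line.to_i }   # String#to_i skips leading spaces and reads the pid
end

# Illustrative sample only; real process titles may differ.
sample = <<~PS
    PID COMMAND
   2616 resque-1.15.0: Waiting for *
   2617 resque-1.15.0: Waiting for high
   3001 sh -c cd /home/app/current && bundle exec rake RAILS_ENV=production resque:restart_workers
   3002 ruby /usr/bin/rake RAILS_ENV=production resque:restart_workers
   3100 resque-web
   3200 /usr/sbin/sshd
PS

p worker_pids_from_ps(sample)  # => [2616, 2617]
```

In the rake task this would be called as worker_pids_from_ps(`ps -A -o pid,command`). It is still string matching on process titles, so it stays as brittle as the original approach; it only removes the deploy's own processes from the kill list.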

I did not know Resque used ps and grepped for the term "[r]esque". That seems quite brittle.

I haven't used this script in a while, and would probably use Foreman with a Procfile these days.

Simply, we could change one line of the :stop_workers task, from

Resque.workers.each do |worker|
  pids.concat(worker.worker_pids)
end

to

Resque.workers.each do |worker|
  pids << worker.to_s.split(':')[1]
end

This relies on the implementation of Resque::Worker#to_s rather than on the public API. It's bad, but it works.
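
For reference, the string being split has the form "hostname:pid:queues", with the queues separated by commas. A small sketch of parsing it, using plain strings so it runs without Resque; the `WorkerId` struct and `parse_worker_id` helper are hypothetical names, not part of Resque:

```ruby
# Hypothetical value object for the three parts of a worker id string.
WorkerId = Struct.new(:hostname, :pid, :queues)

# Parse a string of the form "hostname:pid:queue1,queue2".
def parse_worker_id(id)
  hostname, pid, queues = id.split(":", 3)
  # Integer() raises on anything that is not a plain number, so an
  # unexpected id format fails loudly instead of yielding a wrong pid.
  WorkerId.new(hostname, Integer(pid), queues.to_s.split(","))
end

worker = parse_worker_id("app01:2616:high,low")
puts worker.pid             # prints 2616
puts worker.queues.inspect  # prints ["high", "low"]
```

Unlike the ps-based lookup, this only yields the pid recorded for each registered worker, so the rake task's own process never ends up in the list.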

I ran into a case that could not be fixed by modifying the ps command:
I have two applications running on the same server, and both of them use Resque. The resque:restart_workers task kills the workers belonging to both applications, when I actually want to kill only the workers of one specific application.

Anyway, the best choice to solve this problem should be using something like 'god' or 'monit' to maintain the workers.

I ended up breaking a production server with this. Note:

Resque.workers.each do |worker|

Does not distinguish queues. Each time I deployed it would kill ALL queues and restart its own.

In the short term I solved it with:

Resque.workers.each do |worker|
  pids.concat(worker.worker_pids) if worker.queues.include?(@queue_name)
end

In the long term I am going to look into Foreman, god, monit, or whatever to monitor and restart workers.

I found this worked best for me in the :stop_workers task:

workers = Resque.workers
workers.select! { |w| w.queues.include? queue } if queue
pids = workers.map { |w| w.to_s.sub(/.+:(\d+):.+/, '\1') }

It's a combination of @kenniz's pid extraction technique (it is bad, but it's also used in parts of the resque code itself!), plus @kmcphillips's queue-specificity.

This slight mod to the regex accounts for processes with multiple (threaded) workers:

pids = workers.map { |w| w.to_s.sub(/.+:(\d+)[-:].+/, '\1') }
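
To check the regex without a running Resque, the queue filter and pid extraction can be exercised against stand-in objects. This is a sketch under assumptions: `FakeWorker` and `pids_for` are hypothetical names, and the "pid-thread" form of the third id is only what the `[-:]` in the regex suggests, not a format confirmed in this thread.

```ruby
# Stand-in for Resque::Worker, exposing only what the snippet relies on.
FakeWorker = Struct.new(:id, :queues) do
  def to_s
    id
  end
end

PID_PATTERN = /.+:(\d+)[-:].+/

# Return the pids (as integers) of the workers listening on `queue`,
# or of all workers when `queue` is nil.
def pids_for(workers, queue = nil)
  selected = queue ? workers.select { |w| w.queues.include?(queue) } : workers
  selected.map { |w| w.to_s.sub(PID_PATTERN, '\1').to_i }
end

workers = [
  FakeWorker.new("app01:2616:high", ["high"]),
  FakeWorker.new("app01:2617:*", ["*"]),
  FakeWorker.new("app01:30415-2:high,low", ["high", "low"])  # assumed threaded form
]

p pids_for(workers, "high")  # => [2616, 30415]
p pids_for(workers)          # => [2616, 2617, 30415]
```

The sketch uses the non-destructive select so the caller's list of workers is left untouched. Note that a worker on the "*" queue is not matched by a filter for "high", even though it would process jobs from that queue.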
