@tsmango
Created May 27, 2010 02:37
...
# Restart Passenger
deploy.task :restart, :roles => :app do
  ...
  # Restart the resque workers
  run "cd #{current_path} && rake queue:restart_workers RAILS_ENV=production"
end

namespace :queue do
  task :restart_workers => :environment do
    pids = Array.new
    Resque.workers.each do |worker|
      pids << worker.to_s.split(/:/).second
    end
    if pids.size > 0
      system("kill -QUIT #{pids.join(' ')}")
    end
    system("rm /var/run/god/resque-1.8.0*.pid")
  end
end
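
A note on the `worker.to_s.split(/:/).second` line in the task above: a Resque worker's string form is `host:pid:queues`, and `second` comes from ActiveSupport, so it only works once the Rails environment is loaded. The sketch below shows the same extraction in plain Ruby. The helper name `pid_from_worker_id` is made up for illustration and is not part of Resque or of this gist.

```ruby
# Hypothetical helper (not part of Resque): pull the pid out of a worker id.
# Assumes the id has the form "host:pid:queue1,queue2".
def pid_from_worker_id(worker_id)
  # Limit the split to 3 fields so the queue list stays in one piece.
  _host, pid, _queues = worker_id.to_s.split(":", 3)
  pid
end

puts pid_from_worker_id("app1.frf.ly:9647:high,medium,low")  # prints 9647
```

Note that this only yields the pid of the worker process itself, not the `sh -c` parent that god started.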
@kbighorse

I'm using the resque-1.8.0 gem. Are you editing the file $GEM_HOME/resque-1.8.0/tasks/resque.rake? If so, should this be pushed to the master branch, or is there a way to configure this locally? Also, I have 4 associated Resque processes:

$ ps -A -o pid,command | grep [r]esque
21798 sh -c QUEUE=* RAILS_ENV=production rake -f /var/www/app/current/Rakefile environment resque:work
21800 sh -c QUEUE=* RAILS_ENV=production rake -f /var/www/app/current/Rakefile environment resque:work
21801 resque-1.8.0: Waiting for *
21802 resque-1.8.0: Waiting for *

So worker.worker_pids will return all 4 pids. Do I really want to kill them all? Plus, I have 2 pid files in /var/run/god:

$ ls -lh /var/run/god
total 8.0K
-rw-r--r-- 1 root root 5 Jul 12 05:26 resque-0.pid
-rw-r--r-- 1 root root 5 Jul 12 05:26 resque-1.pid

I suspect I'm doing something weird here, please advise? Or maybe I have this right, but need to modify your solution?

Many thanks,

Kimball

@tsmango
Author

tsmango commented Jul 13, 2010

Kimball,

I'm also using 1.8.0. I just created a new resque.rake file and stuck it in my lib/tasks/ inside of my application - I'm not modifying the one in resque.

You should have 2 processes per worker - your output from ps looks correct.

Here is what I see on my system:

# cat /var/run/god/resque-1.8.0.pid 
9646

# ruby script/console production
Loading production environment (Rails 2.3.3)
>> pids = Array.new
=> []
?> 
>> Resque.workers.each do |worker|
?> pids.concat(worker.worker_pids)
>> end
=> [#<Worker app1.frf.ly:9647:high,medium,low>]
>> 
?> pids
=> ["9646", "9647"]

# ps aux | grep resque
root      9646  0.0  0.0   1788   496 ?        Ss   Jul08   0:00 sh -c cd /var/www/apps/firefly/current && rake environment RAILS_ENV=production resque:work QUEUE=high,medium,low
root      9647  0.0  2.4  45400 42864 ?        S    Jul08   0:21 resque-1.8.0: Waiting for high,medium,low

God's pid file will only show one process id. The process that god starts spawns a second process. As you can see from my example above, when I manually check worker_pids from within a console, I get back both of the process ids that I see when running ps aux | grep resque. You have two workers, so you have four processes.

I would double check that the process ids in the pid files in /var/run/god/resque-*.pid are both in worker_pids when you call it. If they are, you should be good. You can kill all four processes, and as soon as god realizes that the processes it knew about (the ones in those pid files) have died, they will get restarted and your application's environment will be reloaded.
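
If it helps, that check could be scripted along these lines. This is only a rough sketch: `untracked_god_pids` is a made-up name, and it assumes each pid file holds a single pid, as mine does.

```ruby
# Hypothetical helper: return the pids recorded in god's pid files that do
# NOT appear in the list of pids Resque reports.
# An empty result means every pid god knows about is accounted for.
def untracked_god_pids(pid_file_glob, worker_pids)
  # Read each pid file and strip the trailing newline.
  recorded = Dir.glob(pid_file_glob).map { |path| File.read(path).strip }
  recorded.reject(&:empty?) - worker_pids.map(&:to_s)
end

# From script/console it might be called like this (assumes Resque is loaded):
#   pids = Resque.workers.map { |w| w.worker_pids }.flatten.uniq
#   untracked_god_pids("/var/run/god/resque-*.pid", pids)
```

If the result is not empty, god is tracking a process that worker_pids does not see, and killing only the worker_pids would leave it behind.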

Hope this helps. Good luck!

- Tom

@kbighorse

Ah, yes, I think I understand the parent/child architecture now, which is what you're talking about, correct me if I'm wrong.

This did answer my last question. I have a follow-up on the first, which is actually sort of a Linux question. Is rm /var/run/god/resque-1.8.0.pid the same as sudo god terminate? I've been clumsily using that when I've noticed resque workers running stale code. In the 'yes' case I do need to modify your solution to call rm on all pid files, not just the one you have hard-coded.

FWIW, I had been calling kill QUIT/USR1 directly on the workers on the command line, and then sudo god terminate resque to get things right again, but I think yours is the correct solution. Unless I can just as well call sudo god terminate resque.

Thank you so much for your help! I suspect this thread will be helpful to many others out there in resqueland.

Kimball

@tsmango
Author

tsmango commented Jul 13, 2010

Good questions.

I believe that removing the /var/run/god/resque-1.8.0.pid file and running sudo god terminate resque have two different effects.

When you remove the /var/run/god/resque-1.8.0.pid file, god just fires up a new resque worker based on your config.

When you call sudo god terminate resque, I would imagine it just terminates resque based on the process id in your pid file in /var/run/god. I've never used god terminate, though, so I don't know whether your process is automatically restarted after running terminate. Regardless, if you use god to terminate resque, it would only know to kill the one process and would probably leave the second, related process running, which could cause issues (but don't hold me to that).

In the last line of my rake task, I do:

system("kill -QUIT #{pids.join(' ')} && rm /var/run/god/resque-1.8.0.pid")

What's important to note here is that the first part of that, the kill -QUIT #{pids.join(' ')} is the part that actually kills off the resque workers. The second part after the && is a second command to remove the pid file god knew about. By removing that pid file, it simply lets god know that it should fire up resque again. Removing the pid file doesn't actually quit anything, it just tells god to start it again.

So yes, since you're running multiple resque workers I would change that to read:

system("kill -QUIT #{pids.join(' ')} && rm /var/run/god/resque-*.pid")
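
If you would rather not shell out, roughly the same two steps can be done in plain Ruby. Treat this as a sketch, not what my task actually does: `restart_resque_workers` and its `signaler` parameter are made up for illustration, and unlike the `&&` version it removes the pid files even if a process was already gone.

```ruby
require "fileutils"

# Hypothetical sketch: send QUIT to each pid, then remove god's pid files so
# god starts fresh workers. `signaler` defaults to Process.kill and can be
# swapped for a lambda to do a dry run.
def restart_resque_workers(pids, pid_file_glob, signaler = Process.method(:kill))
  pids.each do |pid|
    begin
      signaler.call("QUIT", Integer(pid))
    rescue Errno::ESRCH
      # The process has already exited; nothing to signal.
    end
  end

  # Expand the glob ourselves, since FileUtils.rm_f does not expand wildcards.
  pid_files = Dir.glob(pid_file_glob)
  FileUtils.rm_f(pid_files)
  pid_files
end

# Dry run that only prints what would be signalled:
#   restart_resque_workers(pids, "/var/run/god/resque-*.pid",
#                          lambda { |sig, pid| puts "#{sig} -> #{pid}" })
```

Either way, the signal is what stops the workers, and removing the pid files is what prompts god to start them again.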

Good luck!

- Tom

@kbighorse

Awesome, that's what I was thinking. Many thanks again!

Kimball
