Cron Jobs Are Hard To Distribute
Watch out when scaling out instances with cron jobs on them. Cron jobs aren't designed for the cloud. If the machine image holding your cron job scales out to 20 instances, your cron job will be executed 20 times more often.
This is fine if the scope of your cron job is limited to the instance itself, but if the scope is larger, the above becomes a serious problem. And if you single out a machine to run those cron jobs, you run the risk of not having it executed if that machine goes down.
You can work around this using SQS or any distributed queue service, but that is quite bulky and time-consuming to set up, with no guarantee that the job will be executed on time.
The Apache Software Foundation has a neat tool for distributed lock services, called Zookeeper. Scalr based its distributed cron jobs on it, so that users can set up scripts to be executed periodically, like cron jobs, without running the risk of multiple executions or failure to execute.
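The core of the Zookeeper lock recipe can be sketched without a real cluster: each contender creates an ephemeral sequential znode, and whoever holds the lowest sequence number runs the job. Below is a minimal in-memory simulation in Ruby; FakeZookeeperLock is a stand-in invented here for illustration, not Zookeeper's actual API (a real client such as the zk gem would do this against live znodes):

```ruby
# In-memory simulation of Zookeeper's ephemeral-sequential lock recipe.
class FakeZookeeperLock
  def initialize
    @seq = 0
    @children = {} # znode name => client id
  end

  # Each contender creates an ephemeral sequential node.
  def create_sequential(client_id)
    name = format('lock-%010d', @seq += 1)
    @children[name] = client_id
    name
  end

  # The contender holding the lowest sequence number owns the lock.
  def owner?(name)
    @children.keys.min == name
  end

  # Ephemeral nodes vanish when the owning session ends.
  def release(name)
    @children.delete(name)
  end
end

zk = FakeZookeeperLock.new
a = zk.create_sequential('instance-a')
b = zk.create_sequential('instance-b')
zk.owner?(a) # => true  (lowest sequence number runs the job)
zk.owner?(b) # => false
zk.release(a)
zk.owner?(b) # => true  (next in line takes over if the owner dies)
```

Because the node is ephemeral, a crashed owner releases the lock automatically when its session expires, which is what makes this safer than pinning the job to a single machine.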
Cronjob lock with Redis
Uses a central Redis server to globally lock cronjobs across a distributed system. This can be useful if you have 30 webservers that you deploy crontabs to (such as mailing your customers), but you don't want 30 cronjobs spawned.
Of course you could also deploy your cronjobs to one box, but in volatile environments such as EC2 it can be helpful not to rely on one 'throwaway machine' for your scheduled tasks, and to have one deploy script for all your workers.
Another common problem that cronlock solves is overlap on a single server/cronjob. Developers often underestimate how long a job will run: the job waits on something, behaves differently under high load or volume, or enters an endless loop.
In these cases you don't want the job to be fired again at the next cron interval, doubling the problem; some intervals later there's a huge ps auxf full of overlapping cronjobs, high server load, and eventually a crash.
By setting locks, cronlock also prevents this overlap in longer-than-expected-running cronjobs.
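The locking idea behind this can be sketched in a few lines of Ruby. This is an illustrative simulation only: FakeRedis is a stand-in for a real Redis client, where with the redis gem you would issue a single SET key value NX EX ttl instead of the set_nx_ex method invented here:

```ruby
require 'securerandom'

# Stand-in for a Redis server; set_nx_ex mimics SET key value NX EX ttl.
class FakeRedis
  def initialize
    @store = {}
  end

  # Returns true only if the key is absent or its TTL has expired.
  def set_nx_ex(key, value, ttl)
    entry = @store[key]
    return false if entry && entry[:expires_at] > Time.now
    @store[key] = { value: value, expires_at: Time.now + ttl }
    true
  end

  def get(key)
    entry = @store[key]
    (entry && entry[:expires_at] > Time.now) ? entry[:value] : nil
  end

  # Release only if we still own the lock (compare before delete).
  def release(key, value)
    @store.delete(key) if get(key) == value
  end
end

# Run the block only if the global lock for job_name could be acquired.
def with_cron_lock(redis, job_name, ttl)
  token = SecureRandom.hex(8) # unique owner id for safe release
  return :skipped unless redis.set_nx_ex("cronlock:#{job_name}", token, ttl)
  begin
    yield
    :ran
  ensure
    redis.release("cronlock:#{job_name}", token)
  end
end
```

Whichever of the 30 servers fires first acquires the lock and runs the job; the other 29 see the key already set and skip. The TTL also covers the overlap case: a hung job's lock expires eventually instead of blocking forever.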
Autoscaling Cronjobs on AWS
Running cron jobs in an AWS Auto Scaling group is tricky. When you deploy the same code and configuration to all instances in the group, the cron job runs on all of them. You may not want that. This script detects the first instance in the group and allows only that instance to run the job. The IAM user used by this script needs to have permissions to…
Autoscaling on AWS
I had a similar problem: cron jobs that had to run every minute, but on a single host only.
I solved it with this hack, which runs the Amazon autoscaling tools to find out whether the box it runs on is the last one instantiated in the auto scaling group. This obviously assumes you use autoscaling, and that the hostname contains the instance ID.
#!/usr/bin/env ruby
MY_GROUP = 'Production'
AWS_AUTO_SCALING_HOME = ENV['AWS_AUTO_SCALING_HOME'] # path to the auto scaling CLI tools
@cmd_out = `#{ AWS_AUTO_SCALING_HOME }/bin/as-describe-auto-scaling-instances`
raise "Output empty, should not happen!" if @cmd_out.empty?
@lines = @cmd_out.split(/\r?\n/)
# Take the last InService/HEALTHY instance line belonging to our group.
@last = @lines.select { |l| l.match MY_GROUP }.reverse.
  detect { |l| l =~ /^INSTANCE\s+\S+\s+\S+\s+\S+\s+InService\s+HEALTHY/ }
raise "No suitable host in autoscaling group!" unless @last
@last_host = @last.match(/^INSTANCE\s+(\S+)/)[1]
@hostname = `hostname`.strip
if @hostname.index(@last_host)
  puts "It's me!"
  exit 0
else
  puts "Someone else will do it!"
  exit 1
end
I saved it as /usr/bin/lastonly, and then in cron jobs I do:
lastonly && do_my_stuff
Clearly it's not perfect, but it works for me, and it's simple!