jvenezia/sizing_your_rails_application_with_unicorn_on_heroku.md

Sizing your Rails application with Unicorn on Heroku

This tutorial assumes you are running a Rails application on Heroku with a Unicorn server. If you are not using all of these, you will still find general information that applies to your context.
This article will help you configure your Unicorn server correctly, based on real data.
I recommend using New Relic, which is really easy to set up with Rails and Heroku. It will help you monitor your application and tune its configuration.
Using Unicorn Web Server

It is highly recommended to use a Rails server that supports concurrent requests, such as Puma, Passenger, or Unicorn.
I chose Unicorn because it was recommended by Heroku. However, if your application is mainly used by slow clients, you may prefer Puma.
Add the Unicorn gem to your Gemfile.
gem 'unicorn'
Create the config/unicorn.rb file, which will contain the Unicorn configuration.
worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3)
timeout 15
preload_app true

before_fork do |server, worker|
  Signal.trap 'TERM' do
    puts 'Unicorn master intercepting TERM and sending myself QUIT instead'
    Process.kill 'QUIT', Process.pid
  end

  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.connection.disconnect!
end

after_fork do |server, worker|
  Signal.trap 'TERM' do
    puts 'Unicorn worker intercepting TERM and doing nothing. Wait for master to send QUIT'
  end

  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.establish_connection
end
The first line sets the number of Unicorn worker processes, and therefore how many concurrent requests your server can handle. Later in this article you will see how to choose this value precisely using New Relic.
The second line is the Unicorn timeout, in seconds. When it is reached, the worker process is killed. The next section explains how to choose this timeout.
Catch the timeout on your side

At this point, we have two levels of timeouts: the Heroku router's and Unicorn's.
When a request times out, it is very important to understand why, so timeouts need to be clearly visible in order to find and improve slow actions.
It helps to understand how Heroku's timeouts occur. The Heroku router drops a long-running request after 30 seconds and raises an H12 error, which is visible in the application's logs.
However, this timeout does not affect the Unicorn server, which continues processing the request until it completes. That is why Unicorn's timeout matters too.

It is good practice to set the Unicorn timeout lower than Heroku's to prevent this from happening. This way, Unicorn always reaches its timeout before Heroku does, so no process keeps working on a request that Heroku has already dropped.

The problem is that when Unicorn's timeout is reached, the process is killed, which leaves you with no visibility into what the app was doing. It also prevents this data from being reported to New Relic.
It is highly recommended to use a third level of timeout to gain more visibility. The rack-timeout gem does this by raising Rack::Timeout::Error when this third timeout is reached.
Add the gem to your Gemfile:
gem 'rack-timeout'
Create the config/initializers/timeout.rb file in which you will set up the timeout you need.
Rack::Timeout.timeout = 10  # seconds
Again, it is very important that this timeout fires before the other two, so make sure it is lower than both Heroku's timeout and Unicorn's timeout.
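The ordering of the three timeouts can be written down as a small sanity check. This is only a sketch: the constant and method names are illustrative and not part of Unicorn, rack-timeout, or Heroku, and the values are the ones used in this article.

```ruby
# Values taken from this article; all names here are illustrative.
HEROKU_ROUTER_TIMEOUT = 30 # seconds, enforced by the Heroku router (H12)
UNICORN_TIMEOUT       = 15 # seconds, `timeout` in config/unicorn.rb
RACK_TIMEOUT          = 10 # seconds, set in config/initializers/timeout.rb

# Returns true when the innermost timeout fires first and the router's fires last.
def timeouts_ordered?(rack:, unicorn:, router:)
  rack < unicorn && unicorn < router
end

puts timeouts_ordered?(rack: RACK_TIMEOUT,
                       unicorn: UNICORN_TIMEOUT,
                       router: HEROKU_ROUTER_TIMEOUT)
```

A check like this could live in a test so that a later configuration change does not silently invert the order.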

Now you can handle this Ruby exception like any other, and apologize to your users for the abnormal timeout with a friendly message in your application!
This Ruby exception will be logged and reported, so it will be easier to investigate your timeout issues.
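One possible way to show a friendly message is a small Rack middleware that rescues the timeout error. The sketch below is an assumption, not something the rack-timeout gem provides: the FriendlyTimeout class, the 503 status, and the message are all hypothetical, and the stand-in error class exists only so the example runs without the gem. Where to mount such a middleware relative to rack-timeout should be verified against the gem version you use.

```ruby
# Stand-in so this sketch runs without the rack-timeout gem loaded.
# In a real application the gem defines Rack::Timeout::Error itself.
unless defined?(Rack::Timeout::Error)
  module Rack
    class Timeout
      class Error < RuntimeError; end
    end
  end
end

# Hypothetical middleware: turns a timeout error into a readable response.
class FriendlyTimeout
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  rescue Rack::Timeout::Error
    body = "Sorry, this request took too long. Please try again."
    headers = { "Content-Type" => "text/plain",
                "Content-Length" => body.bytesize.to_s }
    [503, headers, [body]]
  end
end
```

In a Rails application, a rescue_from handler in a controller is another common place for this kind of handling.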
How many Unicorn processes can you run?

Now let's focus on the first line of the Unicorn configuration:
worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3)
There are three kinds of dynos on Heroku:

1X (512MB RAM)
2X (1024MB RAM)
PX (6GB RAM)

If one instance of your application uses at most 250MB of RAM, you can run at most 4 processes on a 2X dyno.
4 processes * 250MB = 1000MB
You can use New Relic to see the average memory usage per instance, in the "Instances" menu.
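The same arithmetic can be sketched as a small helper. The method name max_workers is hypothetical, and the figures are the dyno sizes and the 250MB per-process example from this article.

```ruby
# Hypothetical helper: how many worker processes fit in a dyno's RAM.
def max_workers(dyno_ram_mb, worker_ram_mb)
  raise ArgumentError, "worker_ram_mb must be positive" unless worker_ram_mb.positive?

  dyno_ram_mb / worker_ram_mb # integer division rounds down
end

puts max_workers(1024, 250) # 2X dyno → 4
puts max_workers(512, 250)  # 1X dyno → 2
```

The result is an upper bound; leaving some headroom for the Unicorn master process is a reasonable precaution.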

Rails memory leaks

Most Rails applications have memory leaks, and running several Unicorn processes makes it worse.
Each process consumes memory, and you have to keep your processes under the RAM limit of your dyno to avoid R14 errors from Heroku, which are likely to result in high response times, timeouts, and request queuing.
As seen in the previous example, one process consumes about 250MB, and we never want it to exceed this limit.
The unicorn-worker-killer gem prevents this by restarting a process when it consumes an abnormal amount of memory.
It can also restart a process after it has handled a maximum number of requests.
This gem restarts Unicorn processes, not Heroku dynos. Restarting a process does not affect any requests; the process is restarted only when it is not currently handling a request.
Add the gem to your Gemfile.
gem 'unicorn-worker-killer'
Add the configuration to the config.ru file of your Rails application.
# This file is used by Rack-based servers to start the application.

require 'unicorn/worker_killer'

# Max requests per worker (ENV values are strings, so convert them to integers)
max_requests_min = Integer(ENV['UNICORN_WORKER_KILLER_MAX_REQUESTS_MIN'] || 2500)
max_requests_max = Integer(ENV['UNICORN_WORKER_KILLER_MAX_REQUESTS_MAX'] || 3000)
use Unicorn::WorkerKiller::MaxRequests, max_requests_min, max_requests_max

# Max memory size (RSS) per worker, in bytes
oom_min = Integer(ENV['UNICORN_WORKER_KILLER_OOM_MIN'] || 230) * (1024**2)
oom_max = Integer(ENV['UNICORN_WORKER_KILLER_OOM_MAX'] || 250) * (1024**2)
use Unicorn::WorkerKiller::Oom, oom_min, oom_max

require ::File.expand_path('../config/environment', __FILE__)
use Rack::Deflater
run Curation::Application
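
One detail worth keeping in mind with a configuration like the one above: environment variables always arrive as strings, so they need an integer conversion before any arithmetic. The sketch below illustrates why. The env_int helper is hypothetical, and its third parameter exists only so the example can run against a plain Hash instead of the real ENV.

```ruby
# Multiplying a String repeats it instead of doing arithmetic.
puts "230" * 2 # → 230230

# Hypothetical helper: reads an integer setting, falling back to a default
# when the variable is unset.
def env_int(name, default, env = ENV)
  Integer(env.fetch(name, default))
end

fake_env = { "UNICORN_WORKER_KILLER_OOM_MIN" => "200" }
puts env_int("UNICORN_WORKER_KILLER_OOM_MIN", 230, fake_env) * (1024**2) # → 209715200
puts env_int("UNICORN_WORKER_KILLER_OOM_MAX", 250, fake_env) * (1024**2) # → 262144000
```

Integer() also raises on malformed values such as "abc", which surfaces a bad setting at boot instead of at runtime.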
Max requests

To get an idea of how many requests a process can handle before restarting, you need to know how long a process takes to reach the memory limit. You can see this in New Relic.

Then check the average number of requests per minute your application handles.

In this example, there is an average of 15 requests per minute, and a process takes 90 minutes to reach the limit. 15 requests per minute * 90 minutes = 1350 requests before reaching the limit.
Don't be afraid to choose a higher value. The most important thing is to set the right memory usage limit.
The actual limit is chosen by rand() between max_requests_min and max_requests_max for each worker, to prevent all workers from being killed at the same time.
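The estimate and the randomized limit can be sketched as follows. Both method names are hypothetical, and pick_limit only illustrates the idea of a per-worker random limit; it is not the gem's actual implementation.

```ruby
# Hypothetical helper: requests a worker handles before reaching the memory limit.
def requests_before_limit(requests_per_minute, minutes_to_limit)
  requests_per_minute * minutes_to_limit
end

# Illustration only: each worker gets its own limit somewhere in [min, max],
# so the workers do not all restart at the same moment.
def pick_limit(min, max)
  rand(min..max)
end

puts requests_before_limit(15, 90) # → 1350
puts pick_limit(2500, 3000)        # some value between 2500 and 3000
```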
Process memory usage limit

Simply choose a value below the maximum memory usage your processes can handle. This way, when a process reaches the limit, unicorn-worker-killer will restart it.
The actual limit is chosen by rand() between oom_min and oom_max for each worker, to prevent all workers from being killed at the same time.
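As a sketch of how the two bounds might be derived from a per-process budget, the helper below converts megabytes to bytes and subtracts a spread. The oom_bounds name and the 20MB spread are assumptions chosen to reproduce the 230 and 250 values used earlier in this article.

```ruby
MEGABYTE = 1024**2

# Hypothetical helper: returns [oom_min, oom_max] in bytes for a per-worker
# memory budget in megabytes. The spread is an assumed value.
def oom_bounds(worker_ram_mb, spread_mb: 20)
  raise ArgumentError, "spread must be smaller than the budget" if spread_mb >= worker_ram_mb

  [(worker_ram_mb - spread_mb) * MEGABYTE, worker_ram_mb * MEGABYTE]
end

oom_min, oom_max = oom_bounds(250)
puts oom_min / MEGABYTE # → 230
puts oom_max / MEGABYTE # → 250
```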
Conclusion

You now know how easy it is to properly size your Rails application.

Understand why timeouts are important and be sure you can log them.
Choose precisely how many processes your server can handle.
Be aware of memory leaks to prevent your application from running on swap!

Feel free to give me feedback on this article!
You may want to read

Most of the content of this article comes from Heroku's documentation:

Heroku Request Timeouts
Heroku, Rails and Unicorn