In a web application there are many time-consuming processes which, if we execute them synchronously within a controller, will block the user from receiving a response. They’ll experience this as painfully slow load times, and in today’s world that is completely and utterly unacceptable!
Here are some examples:
- Preparing a download, say a zip file, where file compression happens on-the-fly (e.g. Google Docs)
- POSTing a request to an external API (e.g. sending a message to Slack)
- Kicking off a continuous-integration build (e.g. Circle CI, etc.)
- Sending an email (e.g. action_mailer or Mailchimp)
These are all actions which should be “kicked off” by a controller, but we simply cannot force the user to wait around for their completion - we want to render a page for them to see while they wait for the main work to be completed.
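To make the idea concrete, here is a minimal plain-Ruby sketch with no Rails or Sidekiq involved. The names `jobs`, `results` and `handle_request`, and the 0.2-second sleep, are made up for illustration: the "controller" pushes work onto an in-process queue and responds straight away, while a worker thread does the slow part.

```ruby
# A toy illustration only: a real app would use a job backend such as Sidekiq.
jobs = Queue.new      # thread-safe in-process queue (hypothetical stand-in for a job queue)
results = Queue.new   # collects finished work so we can observe it

# The "worker": pulls jobs off the queue and runs them in the background.
worker = Thread.new do
  while (job = jobs.pop) != :stop
    sleep 0.2                       # stands in for slow work (zip, API call, email...)
    results << "finished #{job}"
  end
end

# The "controller action": enqueue the work and respond immediately.
handle_request = lambda do |name|
  jobs << name
  "202 Accepted: #{name} is being processed"
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
response = handle_request.call("zip-download")
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts response
puts "responded in #{(elapsed * 1000).round}ms"

finished = results.pop   # blocks until the background work is done
puts finished

jobs << :stop
worker.join
```

The response comes back long before the 0.2 seconds of "work" has finished, which is the behaviour we are after in the rest of this lesson.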
In this lesson, we’re going to create an application that creates and enqueues background jobs and provides a dashboard interface for inspecting them.
0.1 Check which version of Ruby is installed:
ruby -v
ruby 2.6.3p62 (2019-04-16 revision 67580)
Better yet, use which because it gives us the full path of our Ruby interpreter (and indicates whether we’re using rvm):
which ruby
~/.rvm/rubies/ruby-2.6.3/bin/ruby
0.2 What version of Rails is installed?
rails -v
Rails 5.2.3
Or:
which rails
~/.rvm/gems/ruby-2.6.3/bin/rails
0.3 Is Postgres running?
If you have the Postgres OS X app installed, simply look at the menu-bar application.
There are multiple ways to check the status of postgres from the command line, including pg_ctl, but we won’t cover that now.
0.4 Is yarn installed?
Because we’re using the webpacker gem, we’ll need yarn (the JavaScript dependency manager).
brew install yarn
Then install the JavaScript dependencies:
yarn install
Depending on your setup, you may also need to manually install some of the JavaScript dependencies:
yarn add bootstrap jquery popper.js
Let’s start by using rails new to create a new Rails project:
rails new \
--database postgresql \
--webpack \
-m https://raw.githubusercontent.com/lewagon/rails-templates/master/devise.rb \
background-jobs-demo
Notice that we’ll use the devise and webpacker gems, which suggests our application will have the concept of a User, and render a front-end. We’ll apply a template using -m as well, just to speed things up a bit.
Now, remember that we’re going to create a dashboard showing all queued background jobs… this is definitely not something we want to expose to any person off the street! Only admins should be able to view this list, so let’s begin by updating the default devise User model to add an admin field, thereby providing a mechanism for restricting access to just admin users.
We’ll use rails generate to create a migration:
rails generate migration AddAdminToUsers
Now, open the new migration file and add:
def change
  add_column :users, :admin, :boolean, null: false, default: false
end
With this migration we’re adding a boolean field called admin, which cannot be null, and whose default value is false. Basically, when we create a new User they will not be an admin, which makes sense.
Now let’s run the migration:
rake db:migrate
== 20190530000944 AddAdminToUsers: migrating ==================================
-- add_column(:users, :admin, :boolean, {:null=>false, :default=>false})
-> 0.0049s
== 20190530000944 AddAdminToUsers: migrated (0.0050s) =========================
Now, just for fun let’s open rails console and create a new User:
rails console
Running via Spring preloader in process 49386
Loading development environment (Rails 5.2.3)
[1] pry(main)> User.create! :email => 'admin@gmail.com', :password => 'password', :admin => true
(0.2ms) BEGIN
User Exists (1.9ms) SELECT 1 AS one FROM "users" WHERE "users"."email" = $1 LIMIT $2 [["email", "admin@gmail.com"], ["LIMIT", 1]]
User Create (0.9ms) INSERT INTO "users" ("email", "encrypted_password", "created_at", "updated_at", "admin") VALUES ($1, $2, $3, $4, $5) RETURNING "id" [["email", "admin@gmail.com"], ["encrypted_password", "$2a$11$zNeVCENAHLxCNC7SLYkhxuBdVsA4GarjNflZgXrrUmFO185BgNsmW"], ["created_at", "2019-05-30 00:11:25.211103"], ["updated_at", "2019-05-30 00:11:25.211103"], ["admin", true]]
(0.5ms) COMMIT
=> #<User id: 1, email: "admin@gmail.com", created_at: "2019-05-30 00:11:25", updated_at: "2019-05-30 00:11:25", admin: true>
The next step is to create our background job. We’ll use rails generate:
rails generate job fake
Running via Spring preloader in process 49679
invoke test_unit
create test/jobs/fake_job_test.rb
create app/jobs/fake_job.rb
Open FakeJob at app/jobs/fake_job.rb and add:
class FakeJob < ApplicationJob
  queue_as :default

  def perform
    puts "I'm starting the fake job"
    sleep 3
    puts "OK I'm done now"
  end
end
This job is a simple illustration of the kind of activities that are appropriate for background jobs. Here, we are simply printing "I'm starting the fake job", sleeping for 3 seconds, and then printing "OK I'm done now", with sleep 3 standing in for any kind of action that takes a long time to complete (e.g. an HTTP request).
Let’s test our new job in the rails console:
rails console
[1] pry(main)> FakeJob.perform_now
Performing FakeJob (Job ID: c15d93d2-be23-4da4-80c9-ba31c2d7c57b) from Async(default)
I'm starting the fake job
OK I'm done now
Performed FakeJob (Job ID: c15d93d2-be23-4da4-80c9-ba31c2d7c57b) from Async(default) in 3000.4ms
=> nil
A few things to note:
- Running our job in rails console with perform_now means it executes synchronously, so… basically the same as if we executed it inside a controller. The end-goal is to execute it asynchronously so it gets out of the user’s way.
- If you watch the execution, you’ll see 3 seconds elapse between "I'm starting the fake job" and "OK I'm done now".
- The job returns nil.
And though returning nil is rather boring and useless, a job can do anything! Remember that.
We have an admin User and a Job, but we need some infrastructure to execute those jobs in the background.
For this lesson we’ll use Sidekiq, but it is only one of several queue adapters that Rails’ ActiveJob supports (Resque and Delayed Job are others).
Sidekiq is a job queue built on Redis, an in-memory key-value store. Basically, Redis is like a simple database that keeps its data in memory (writing it to disk is optional). Its main advantage is speed, which makes it a perfect fit for implementing a job queue.
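To get a feel for what "a job queue on top of a key-value store" means, here is a rough plain-Ruby sketch. It is not Sidekiq's implementation: the `queue` array is a hypothetical in-memory stand-in for a Redis list, and the payload keys (`class`, `args`, `jid`) are only loosely modelled on what a job payload tends to contain.

```ruby
require "json"
require "securerandom"

# Hypothetical in-memory stand-in for a Redis list; Sidekiq talks to a real Redis server.
queue = []

# "Client" side: serialize the job as JSON and push it onto the queue.
enqueue = lambda do |job_class, *args|
  payload = { "class" => job_class, "args" => args, "jid" => SecureRandom.hex(12) }
  queue.unshift(JSON.generate(payload))   # push on the left...
  payload["jid"]
end

# "Worker" side: pop the oldest job and decode it (nil when the queue is empty).
fetch = lambda do
  raw = queue.pop                          # ...pop from the right (first in, first out)
  raw && JSON.parse(raw)
end

enqueue.call("FakeJob")
enqueue.call("UpdateUserJob", 1)

first = fetch.call
second = fetch.call
puts "#{first['class']}(#{first['args'].join(', ')})"
puts "#{second['class']}(#{second['args'].join(', ')})"
```

The point of the sketch is that a job is just a small piece of serialized data: the process that enqueues it and the process that runs it only need to agree on the format and share the store.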
We’ll use homebrew to install Redis:
brew update
brew install redis
This should work fine… unless you’ve set up homebrew incorrectly or are using homebrew from a user account that isn’t the one you installed it with, in which case you may need to change some file permissions. A note: DON’T use sudo to overcome these errors… for really important reasons, which I won’t cover here. Instead use chown to give the current user access to any affected directories (e.g. /usr/local/Homebrew).
Now, let’s run redis-server:
brew services start redis
==> Tapping homebrew/services
Cloning into '/usr/local/Homebrew/Library/Taps/homebrew/homebrew-services'...
remote: Enumerating objects: 12, done.
remote: Counting objects: 100% (12/12), done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 12 (delta 0), reused 7 (delta 0), pack-reused 0
Unpacking objects: 100% (12/12), done.
Tapped 1 command (41 files, 58.7KB).
==> Successfully started `redis` (label: homebrew.mxcl.redis)
Ok, now that redis-server is up-and-running, we have to configure our Rails application to connect to it, and plug into Sidekiq specifically.
So, let’s install Sidekiq. Open your Gemfile and add:
gem 'sidekiq'
gem 'sidekiq-failures', '~> 1.0'
Then run:
bundle install
Fetching gem metadata from https://rubygems.org/............
Resolving dependencies...
...
Fetching rack-protection 2.0.5
Installing rack-protection 2.0.5
...
Fetching sidekiq 5.2.7
Installing sidekiq 5.2.7
Fetching sidekiq-failures 1.0.0
Installing sidekiq-failures 1.0.0
...
Bundle complete! 22 Gemfile dependencies, 81 gems now installed.
Use `bundle info [gemname]` to see where a bundled gem is installed.
Awesome. Now, we have to create a binstub that wraps the sidekiq gem (Bundler can generate one with bundle binstubs sidekiq). A binstub is:
… wrapper scripts around executables (sometimes referred to as "binaries", although they don't have to be compiled) whose purpose is to prepare the environment before dispatching the call to the original executable.
sidekiq is an “executable”, in other words: an entirely separate application to our Rails web application. The binstub we’ve just created configures the execution environment of sidekiq so that it plugs into our Rails app. Got it?
By the way, both sides talk to redis-server: our Rails app pushes jobs onto the queue, and the sidekiq process pulls them off and runs them.
Next, we have to configure our application to use sidekiq as its default job queue. Open config/application.rb and add:
config.active_job.queue_adapter = :sidekiq
Though this next step is optional, it’s pretty neat. The sidekiq gem can serve a simple web dashboard showing us information about its queue. To configure this, open config/routes.rb and add:
require "sidekiq/web"
# [...]
authenticate :user, lambda { |u| u.admin } do
  mount Sidekiq::Web => '/sidekiq'
end
If you run rake routes you’ll see we’ve got the new route:
sidekiq_web /sidekiq Sidekiq::Web
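It doesn’t specify an HTTP verb because mount hands every request under /sidekiq, whatever its verb, to the Sidekiq::Web Rack application, which does its own routing.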
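As a rough illustration of what the lambda in that route is doing, here is a plain-Ruby sketch. The `user_class` Struct and the `can_see_dashboard` helper are hypothetical stand-ins; Devise’s real authenticate helper does more than this, so treat it as a simplification of the admin check only.

```ruby
# Hypothetical stand-in for the Devise User model.
user_class = Struct.new(:email, :admin)

# The same predicate we passed to `authenticate` in config/routes.rb.
admin_only = lambda { |u| u.admin }

# Simplified gate: no signed-in user, or a falsy predicate, means no dashboard.
can_see_dashboard = lambda do |user|
  !user.nil? && admin_only.call(user) ? true : false
end

admin   = user_class.new("admin@gmail.com", true)
regular = user_class.new("someone@example.com", false)

puts can_see_dashboard.call(admin)
puts can_see_dashboard.call(regular)
puts can_see_dashboard.call(nil)
```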
Finally, let’s configure Sidekiq within our Rails application. Create config/sidekiq.yml and add:
:concurrency: 3
:timeout: 60
:verbose: true
:queues:
- default
- mailers
A few notes on this configuration:
- With :concurrency: 3 we’re saying that sidekiq is allowed to process three background jobs simultaneously
- With :timeout: 60 we’re saying that, when it is asked to shut down, sidekiq should give running jobs up to 60 seconds to finish before terminating them
- With :verbose: true we’re saying that sidekiq should print more detailed log output (this will help us debug)
- And finally, we’re telling sidekiq to process two separate queues: default and mailers.
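If you’re curious what this file looks like once it is loaded, the sketch below parses the same YAML with Ruby’s standard library. This is only an illustration of the file format, not how sidekiq itself reads its configuration; the `config_text` variable is just the file contents pasted inline.

```ruby
require "yaml"

# The contents of config/sidekiq.yml, inlined for the example.
config_text = <<~YAML
  :concurrency: 3
  :timeout: 60
  :verbose: true
  :queues:
    - default
    - mailers
YAML

# Keys written as `:name:` are loaded as Ruby symbols, so Symbol must be permitted.
config = YAML.safe_load(config_text, permitted_classes: [Symbol])

puts config[:concurrency]
puts config[:queues].inspect
```

Notice that the leading colons turn the keys into symbols, which matches the option hash that sidekiq prints when it starts in verbose mode.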
Now that we’ve configured our application to use Sidekiq, let’s start it up.
In your terminal, open a new tab and execute:
sidekiq
2019-05-30T01:11:02.493Z 59109 TID-ovkeg76yd INFO: ==================================================================
2019-05-30T01:11:02.493Z 59109 TID-ovkeg76yd INFO: Please point sidekiq to a Rails 4/5 application or a Ruby file
2019-05-30T01:11:02.493Z 59109 TID-ovkeg76yd INFO: to load your worker classes with -r [DIR|FILE].
2019-05-30T01:11:02.493Z 59109 TID-ovkeg76yd INFO: ==================================================================
2019-05-30T01:11:02.493Z 59109 TID-ovkeg76yd INFO: sidekiq [options]
What’s this?! Turns out that if you run sidekiq outside of the Rails application root, it doesn’t work properly.
Remember that binstub we created earlier? Well, if we run sidekiq within our application root, it will run sidekiq with all the configuration we provided:
sidekiq
...
DEBUG: {:queues=>["default", "mailers"], :labels=>[], :concurrency=>3, :require=>".", :environment=>nil, :timeout=>60, :poll_interval_average=>nil, :average_scheduled_poll_interval=>5, :error_handlers=>[#<Sidekiq::ExceptionHandler::Logger:0x00007f917216b5d8>], :death_handlers=>[], :lifecycle_events=>{:startup=>[], :quiet=>[], :shutdown=>[], :heartbeat=>[]}, :dead_max_jobs=>10000, :dead_timeout_in_seconds=>15552000, :reloader=>#<Sidekiq::Rails::Reloader @app=BackgroundJobsDemo::Application>, :verbose=>true, :config_file=>"./config/sidekiq.yml", :strict=>true, :tag=>"background-jobs-demo", :identity=>"Charitys-MacBook-Pro.local:59363:6495772df58b"}
Shaboom shaboom.
Now, we should be able to enqueue jobs from anywhere inside our Rails application! Let’s start by creating one in the rails console:
[1] pry(main)> FakeJob.perform_later
Enqueued FakeJob (Job ID: 842d7ede-4075-4cab-b9bc-fecf953e810b) to Sidekiq(default)
=> #<FakeJob:0x00007f91e210f470
@arguments=[],
@executions=0,
@job_id="842d7ede-4075-4cab-b9bc-fecf953e810b",
@priority=nil,
@provider_job_id="5420faaaeffc1e400e87bbac",
@queue_name="default">
Meanwhile, if we check on sidekiq we’ll see that our background job has been enqueued and executed:
2019-05-30T01:16:05.343Z 59363 TID-ouxvy6kxb FakeJob JID-5420faaaeffc1e400e87bbac INFO: start
I'm starting the fake job
OK I'm done now
2019-05-30T01:16:08.389Z 59363 TID-ouxvy6kxb FakeJob JID-5420faaaeffc1e400e87bbac INFO: done: 3.046 sec
Clearbit is a service that gathers data about people from the public web. The Enrichment API in particular takes an email address and returns a detailed profile of all the associated public information.
A natural place to make such a request to Clearbit is whenever a new User signs up to our app.
There are two main steps:
- Create the Job which queries the Clearbit API
- Enqueue the Job from the User model
Using rails generate, let’s create the Job:
rails generate job UpdateUser
Running via Spring preloader in process 60709
invoke test_unit
create test/jobs/update_user_job_test.rb
create app/jobs/update_user_job.rb
Now, let’s open app/jobs/update_user_job.rb and add:
class UpdateUserJob < ApplicationJob
  queue_as :default

  def perform(user_id)
    user = User.find(user_id)
    puts "Calling Clearbit API for #{user.email}..."
    # TODO: perform a time consuming task like Clearbit's Enrichment API.
    sleep 2
    puts "Done! Enriched #{user.email} with Clearbit"
  end
end
We’ve defined #perform to accept one parameter: user_id. The method will use this id to retrieve a user from the database (we created one earlier), fake doing an API call to Clearbit, and then falsely state that we’ve “enriched” the user with Clearbit information.
Of course, we’ve done no such thing, but you absolutely could and that’s the point.
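Why pass user_id rather than the user itself? One way to think about it: a job’s arguments are serialized when it is enqueued and only read back when a worker picks it up, possibly much later. The plain-Ruby sketch below illustrates that idea; the `users` hash is a hypothetical stand-in for the database and the payload format is made up for the example.

```ruby
require "json"

# Hypothetical in-memory "users table" keyed by id.
users = { 1 => { email: "admin@gmail.com" } }

# Enqueue: only the id goes into the serialized payload.
payload = JSON.generate({ "class" => "UpdateUserJob", "args" => [1] })

# Meanwhile, the record changes before a worker picks the job up.
users[1][:email] = "new-admin@gmail.com"

# Perform: the worker decodes the payload and reloads the record by id,
# so it sees the current data rather than a stale copy.
job = JSON.parse(payload)
user = users.fetch(job["args"].first)
puts "Enriching #{user[:email]}"
```

Because the job looks the record up again when it runs, it works with whatever is in the database at that moment.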
Now, open app/models/user.rb and add:
class User < ApplicationRecord
  # [...]
  after_save :async_update # Run on create & update

  private

  def async_update
    UpdateUserJob.perform_later(self.id)
  end
end
Now, whenever we create or update a User model, #async_update is invoked, which enqueues our new UpdateUserJob. Let’s test it out in the rails console:
[1] pry(main)> user = User.find(1)
User Load (0.9ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 1]]
=> #<User id: 1, email: "admin@gmail.com", created_at: "2019-05-30 00:11:25", updated_at: "2019-05-30 00:11:25", admin: true>
[2] pry(main)> user.save
(0.2ms) BEGIN
Enqueued UpdateUserJob (Job ID: a2f04afb-df96-4419-af3a-0b82b0aa589c) to Sidekiq(default) with arguments: 1
(0.5ms) COMMIT
=> true
And if we check sidekiq we’ll see the job has been executed:
INFO: start
Calling Clearbit API for admin@gmail.com...
Done! Enriched admin@gmail.com with Clearbit
2019-05-30T01:38:21.436Z 59363 TID-ouxvy6l3n UpdateUserJob JID-001291260c2ce3ddfec6e812 INFO: done: 2.075 sec
Awesome! Amazing. Wonderful…
Deciding where to enqueue your background tasks is an important design choice.
For example, in this lesson we enqueued UpdateUserJob whenever a User model was created or updated, using the after_save callback on the model itself. To me, this makes sense because there are theoretically a number of places within our application where User objects are saved: for example, when a new user signs up (and a User instance is created for them), or when an existing User instance is updated.
In other words, this particular job is tied to the act of saving a User, which is very much the concern of our model layer, not our controller layer!
That said, if we wanted to enqueue this job in a controller for some reason (???), we could do so. We won’t actually go through the steps of creating this controller and view, but this is how it would be done:
# app/controllers/profiles_controller.rb
class ProfilesController < ApplicationController
  def update
    if current_user.update(user_params)
      UpdateUserJob.perform_later(current_user.id) # <- The job is queued
      flash[:notice] = "Your profile has been updated"
      redirect_to root_path
    else
      render :edit
    end
  end

  private

  def user_params
    # Some strong params of your choice
  end
end
Of course, if you implemented this exactly as-is, the UpdateUserJob task would be enqueued twice: once in the controller, and once in the User model when after_save is invoked. This makes no sense, so don’t do it.
Some better examples for a controller would be: compressing a large file on-the-fly, making an external API request, and so on.
Ruby Make, or rake, is an awesome tool, and writing rake tasks is, to me at least, rather enjoyable.
We can use rake for all kinds of tasks, but one of the more common themes is updating data en masse. For example, let’s say you’ve just done a database migration where you’ve added a new field to the User model that will store Clearbit information. Rather than waiting for each User to update itself (and thus enqueueing the UpdateUserJob task) we can proactively update our data with a rake task instead!
Let’s generate our rake task using rails generate:
rails generate task user update_all
Running via Spring preloader in process 64234
create lib/tasks/user.rake
Open lib/tasks/user.rake and add:
namespace :user do
  desc "Enriching all users with Clearbit (async)"
  task :update_all => :environment do
    users = User.all
    puts "Enqueuing update of #{users.size} users..."
    users.each do |user|
      UpdateUserJob.perform_later(user.id)
    end
    # rake task will return when all jobs are _enqueued_ (not done).
  end
end
This task will load all User instances and enqueue an UpdateUserJob for each of them. Now let’s run the task:
rake user:update_all
Enqueuing update of 1 users...
In sidekiq we should see:
2019-05-30T02:04:00.982Z 59363 TID-ouxvy6kof UpdateUserJob JID-e93889ccb32bca6f121df643 INFO: start
Calling Clearbit API for admin@gmail.com...
Done! Enriched admin@gmail.com with Clearbit
2019-05-30T02:04:03.010Z 59363 TID-ouxvy6kof UpdateUserJob JID-e93889ccb32bca6f121df643 INFO: done: 2.028 sec
There are a few benefits to having our rake tasks use Jobs in this way, the biggest one being concurrency.
If instead we’d made the calls to Clearbit’s API synchronously within the task, overall it would take much longer since we’d only be completing one request at a time. Using jobs and sidekiq’s concurrency feature, we can complete up to 3 API calls simultaneously (the :concurrency value we configured)! Of course, more powerful servers can handle a higher concurrency setting.
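The difference is easy to measure with a toy example. In the plain-Ruby sketch below, `fake_api_call` is a hypothetical stand-in for one Clearbit request (a 50ms sleep), and three threads sharing a queue play the role of sidekiq running with a concurrency of 3. It is not how sidekiq is implemented, just an illustration of why concurrent workers finish sooner.

```ruby
# Hypothetical stand-in for one slow API call.
fake_api_call = lambda { |id| sleep 0.05; id }
ids = (1..6).to_a

clock = lambda { Process.clock_gettime(Process::CLOCK_MONOTONIC) }

# One at a time, like a synchronous rake task.
t0 = clock.call
ids.each { |id| fake_api_call.call(id) }
sequential = clock.call - t0

# Three workers sharing one queue, like `:concurrency: 3`.
queue = Queue.new
ids.each { |id| queue << id }

t0 = clock.call
workers = 3.times.map do
  Thread.new do
    loop do
      id = begin
        queue.pop(true)        # non-blocking pop raises ThreadError when empty
      rescue ThreadError
        break                  # nothing left to do, so this worker stops
      end
      fake_api_call.call(id)
    end
  end
end
workers.each(&:join)
concurrent = clock.call - t0

puts format("sequential: %.2fs, concurrent: %.2fs", sequential, concurrent)
```

With six calls and three workers, the concurrent run should take roughly a third of the sequential time, since each worker only handles two calls.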
In the previous rake task, we updated all User instances, but what if we want to update just one instance? We’d need to provide our task with the id of that user… fortunately rake tasks can be executed with parameters. We want something like this:
rake user:update[1]
Open lib/tasks/user.rake and add, inside the namespace :user block:
  desc "Enriching a given user with Clearbit (sync)"
  task :update, [:user_id] => :environment do |t, args|
    user = User.find(args[:user_id])
    puts "Enriching #{user.email}..."
    UpdateUserJob.perform_now(user.id)
    # rake task will return when job is _done_
  end
Notice the difference here?
task :update, [:user_id] => :environment do |t, args|
We’ve configured the user:update task to accept one parameter, :user_id, and so in the definition of this task we can access it with args[:user_id], like so:
user = User.find(args[:user_id])
Also, notice that we don’t enqueue UpdateUserJob this time: we run it synchronously (via perform_now), since it’s just one API request that shouldn’t take more than a few seconds to complete. In other words, doing this job asynchronously doesn’t really confer much benefit, so we won’t bother giving it to sidekiq at all.
Now let’s run the task:
rake user:update[1]
Enriching admin@gmail.com...
Calling Clearbit API for admin@gmail.com...
Done! Enriched admin@gmail.com with Clearbit
Notice that the stdout output of our job appears in this terminal window, not in sidekiq. If this were enqueued as an asynchronous job, the output would appear there instead.
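To see the argument handling in isolation, here is a small sketch that defines and invokes a task directly through the rake library, without a Rails app. It assumes the rake gem is available; the task body and the `seen` array are made up for the example, and the :environment dependency is left out because there is no Rails environment to load.

```ruby
require "rake"

seen = []

# Same shape as the task above, minus the :environment dependency.
Rake::Task.define_task("user:update", [:user_id]) do |_task, args|
  seen << args[:user_id]
  puts "Would enrich user #{args[:user_id].inspect}"
end

# Roughly what happens when we run `rake user:update[1]` from the shell.
Rake::Task["user:update"].invoke("1")
```

Arguments given on the command line arrive as strings, which is fine here because User.find accepts an id in string form.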
Sending emails can be quite time-intensive, especially if you have to compose a custom email with information from your database (e.g. “Hello <username>, thanks for doing <some thing>”). And if you’re using an external mailing service like Mailchimp, this requires making HTTP requests to its API, which also takes time.
So, better to send emails in the background, and not block the user from interacting with your application.
When we configured sidekiq, remember that we created two distinct queues:
# ...
:queues:
- default
- mailers
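devise and action_mailer make this easy… we don’t even have to create a custom job! The deliver_later method enqueues the email as a background job, and in Rails 5.2 mailer jobs go onto the mailers queue by default, which is why we listed it. All we need to do is invoke UserMailer#welcome in our User model… so let’s open app/models/user.rb and add: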
# ...
after_create :send_welcome_email
# ...
private

def send_welcome_email
  # Pass the user via `with` so it is available as params[:user] in UserMailer#welcome
  UserMailer.with(user: self).welcome.deliver_later
end
Of course, UserMailer hasn’t yet been created. Let’s use rails generate to do this:
rails generate mailer UserMailer
Running via Spring preloader in process 79934
create app/mailers/user_mailer.rb
invoke erb
create app/views/user_mailer
invoke test_unit
create test/mailers/user_mailer_test.rb
create test/mailers/previews/user_mailer_preview.rb
action_mailer allows you to send emails from your application using mailer classes and views. Mailers work very similarly to controllers. They inherit from ActionMailer::Base and live in app/mailers, and they have associated views that appear in app/views.
Now, let’s open app/mailers/user_mailer.rb and add:
def welcome
  @user = params[:user]
  @url = 'http://example.com/login'
  mail(to: @user.email, subject: 'Welcome to My Awesome Site')
end
Now, whenever a User is created, UserMailer#welcome is enqueued as a background job and the email is sent off by sidekiq.
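By default, sidekiq picks up enqueued jobs and executes them as soon as it’s able. Jobs scheduled for later are checked for periodically, about every 5 seconds on average (the :average_scheduled_poll_interval=>5 in sidekiq’s verbose startup output).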
However, we may want to delay the execution of a job. The API for delaying jobs looks like this:
FakeJob.set(wait: 1.minute).perform_later
FakeJob.set(wait_until: Date.tomorrow.noon).perform_later
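Conceptually, a delayed job just sits in a separate "scheduled" list along with the time it should run, and a poller moves it onto the regular queue once that time has passed. The plain-Ruby sketch below illustrates the idea; the `scheduled` and `ready` arrays and both lambdas are hypothetical, and this is not sidekiq’s actual implementation.

```ruby
# Hypothetical scheduler: [run_at, job_name] pairs kept apart from the ready queue.
scheduled = []
ready = []

# Like `set(wait: ...)`: remember the job together with the time it becomes due.
schedule = lambda do |job_name, wait:, now:|
  scheduled << [now + wait, job_name]
end

# A poller moves every job whose time has come onto the ready queue, oldest first.
promote_due = lambda do |now|
  due, later = scheduled.partition { |run_at, _| run_at <= now }
  scheduled.replace(later)
  due.sort_by(&:first).each { |_, job_name| ready << job_name }
end

now = Time.at(0)
schedule.call("FakeJob", wait: 60, now: now)

promote_due.call(now + 30)
after_30s = ready.dup
puts after_30s.inspect   # nothing is due yet

promote_due.call(now + 60)
puts ready.inspect       # the job has been promoted
```

Passing `now` in explicitly keeps the example deterministic; a real poller would read the clock itself.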
Ok, now that we’ve succeeded in running sidekiq in our development environment, how do we get it working in production?
Briefly, let’s review all the moving parts:
- redis-server: our in-memory key-value store
- sidekiq: running in its own process and connected to redis-server
- rails server: configured to hand jobs to sidekiq (via redis-server)
Let’s take this a step at a time using Heroku as the production environment.
Using the heroku CLI application, we can set up redis-server with the rediscloud add-on:
heroku addons:create rediscloud
Next, open the configuration file located at config/initializers/redis.rb and add:
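# Default connection (local redis-server), used when REDISCLOUD_URL is not set
$redis = Redis.new

url = ENV["REDISCLOUD_URL"]

if url
  # Point both the sidekiq process and our Rails app at the hosted Redis instance
  Sidekiq.configure_server do |config|
    config.redis = { url: url }
  end

  Sidekiq.configure_client do |config|
    config.redis = { url: url }
  end

  $redis = Redis.new(:url => url)
end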
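This will configure sidekiq (both the server and the client side) and our global $redis connection to use the production rediscloud instance we just created. When REDISCLOUD_URL isn’t set, as in development, the defaults apply.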
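The "use the environment variable if present, otherwise fall back to a local default" pattern can be tried on its own. In the sketch below, `resolve_redis_url` is a hypothetical helper, the localhost URL is an assumed development default, and the production URL is an invented example value.

```ruby
require "uri"

# Fall back to a local Redis when REDISCLOUD_URL is not set.
# The default URL here is an assumption for illustration.
resolve_redis_url = lambda do |env|
  env.fetch("REDISCLOUD_URL", "redis://localhost:6379")
end

# Development: the variable is missing, so the default is used.
dev = URI.parse(resolve_redis_url.call({}))

# Production: the add-on would set the variable (this value is invented).
prod_env = { "REDISCLOUD_URL" => "redis://user:secret@redis.example.com:12345" }
prod = URI.parse(resolve_redis_url.call(prod_env))

puts "#{dev.host}:#{dev.port}"
puts "#{prod.host}:#{prod.port}"
```

Taking the environment as an argument (rather than reading ENV directly) makes the helper easy to exercise with different values.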
Next, we have to tell heroku to add an additional process: the sidekiq worker. Open your Procfile and add:
web: bundle exec puma -C config/puma.rb
worker: bundle exec sidekiq -C config/sidekiq.yml
Notice that we’re telling heroku to create two separate processes, one for the rails app (i.e. web) and one for the sidekiq worker (i.e. worker), and plugging our configuration files into the processes.
Now, commit the changes and push them to heroku.
Finally, scale up the sidekiq worker like so:
heroku ps:scale worker=1
heroku ps