@flavorjones
Created May 8, 2009 14:00

Parallelize Your RSpec Suite

We all have multi-core machines these days, but most RSpec suites still run in one sequential stream. Let's parallelize them!

The big hurdle here is managing multiple test databases. When multiple specs are running simultaneously, they each need to have exclusive access to the database, so that one spec's setup doesn't clobber the records of another spec's setup. We could create and manage multiple test databases within our RDBMS. But I'd prefer something a little more ... ephemeral, that won't hang around after we're done, or require any manual management.

Enter SQLite's in-memory database, which is a full SQLite instance, created entirely within the invoking process's own memory footprint.

(Note #1: the gist for this blog is at http://gist.github.com/108780)

(Note #2: The following strategy is relatively well-known, but I thought it might be useful for Pivots-and-friends to see exactly how one Pivotal project has used this tactic for a big speed win.)

Here's the relevant section of our config/database.yml:

test-in-memory:
  adapter: sqlite3
  database: ':memory:'

Next, we need a way to indicate to the running Rails process that it should use the in-memory database. We created an initializer file, config/initializers/in-memory-test.rb:

def in_memory_database?
  ENV["RAILS_ENV"] == "test" and 
    ENV["IN_MEMORY_DB"] and
    Rails::Configuration.new.database_configuration['test-in-memory']['database'] == ':memory:'
end

if in_memory_database?
  puts "connecting to in-memory database ..."
  ActiveRecord::Base.establish_connection(Rails::Configuration.new.database_configuration['test-in-memory'])
  puts "building in-memory database from db/schema.rb ..."
  load "#{Rails.root}/db/schema.rb" # use db agnostic schema by default
  #  ActiveRecord::Migrator.up('db/migrate') # use migrations
end

Note that in the above, we're initializing the in-memory database with db/schema.rb, so make sure that file is up-to-date. (Or, you could uncomment the line that runs your migrations.)
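
As an aside, the three conditions in in_memory_database? can be tried out without booting Rails. The following is only a sketch: it parses an inline copy of the database.yml entry with Ruby's standard YAML library, and it takes the environment and parsed config as arguments (an assumption made for illustration, since the initializer itself reads ENV and Rails::Configuration directly). DATABASE_YML is a hypothetical stand-in name.

```ruby
require "yaml"

# Hypothetical stand-in for config/database.yml; only the entry from the post.
DATABASE_YML = <<~YML
  test-in-memory:
    adapter: sqlite3
    database: ':memory:'
YML

# Same three conditions as the initializer's in_memory_database?, but with the
# environment and parsed config passed in so the check runs without Rails.
def in_memory_database?(env, config)
  entry = config["test-in-memory"] || {}
  env["RAILS_ENV"] == "test" &&
    !env["IN_MEMORY_DB"].nil? &&
    entry["database"] == ":memory:"
end

config = YAML.safe_load(DATABASE_YML)

puts in_memory_database?({ "RAILS_ENV" => "test", "IN_MEMORY_DB" => "1" }, config)
puts in_memory_database?({ "RAILS_ENV" => "test" }, config)
puts in_memory_database?({ "RAILS_ENV" => "development", "IN_MEMORY_DB" => "1" }, config)
```

Only the first call satisfies all three conditions; dropping IN_MEMORY_DB or changing RAILS_ENV leaves the normal test database in place.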

Let's give that a whirl:

$ IN_MEMORY_DB=1 RAILS_ENV=test ./script/console 
Loading test environment (Rails 2.3.2)
connecting to in-memory database ...
building in-memory database from db/schema.rb ...
-- create_table("users", {:force=>true})
   -> 0.0065s
-- add_index("users", ["deleted_at"], {:name=>"index_users_on_deleted_at"})
   -> 0.0004s
-- add_index("users", ["id", "deleted_at"], {:name=>"index_users_on_id_and_deleted_at"})
   -> 0.0003s

...

>>

Super, we can see that the database is being initialized out of our schema.rb, and we get our console prompt. We're ready to roll!

But, running this:

IN_MEMORY_DB=yes spec spec

will still only result in a single process, albeit one running off a database that's entirely in-memory. We want parallelization!

The final step is a script that will run your spec suite for you. You may need to edit this for your particular situation, but then again, maybe not.

#  spec/suite.rb

require "spec/spec_helper"

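if ENV['IN_MEMORY_DB']
  # IN_MEMORY_DB doubles as the process count; a non-numeric value such as "yes" converts to 0 and falls back to 1
  N_PROCESSES = [ENV['IN_MEMORY_DB'].to_i, 1].max
  # slice the sorted file list into rows of N_PROCESSES (in_groups_of pads the last row with nil)
  specs = (Dir["spec/**/*_spec.rb"]).sort.in_groups_of(N_PROCESSES)
  processes = []

  # on Ctrl-C, kill every child process before exiting
  interrupt_handler = lambda do
    STDERR.puts "caught keyboard interrupt, exiting gracefully ..."
    processes.each { |process| Process.kill "KILL", process }
    exit 1
  end

  Signal.trap 'SIGINT', interrupt_handler
  # child j requires element j-1 of every row, skipping the nil padding
  1.upto(N_PROCESSES) do |j|
    processes << Process.fork {
      specs.each do |array|
        if array[j-1]
          require array[j-1]
        end
      end
    }
  end
  # wait once per child so the parent outlives all of them
  1.upto(N_PROCESSES) { Process.wait }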

else
  (Dir["spec/**/*_spec.rb"]).each do |file|
    require file
  end
end

Then, you simply run IN_MEMORY_DB=2 spec spec/suite.rb to run two parallel processes. Increase the number on larger machines for better results!
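
One thing the script does not do is look at how each child finished: it calls Process.wait and ignores the result. A minimal, standalone sketch of collecting exit statuses from forked children might look like the following. The run_in_children helper and the lambda "jobs" are hypothetical stand-ins for "require this group of spec files", not part of the original script, and the sketch assumes a platform where fork is available.

```ruby
# Run each job in its own forked child and return the children's exit
# statuses in the order the children were started.
def run_in_children(jobs)
  pids = jobs.map do |job|
    Process.fork do
      # exit! ends the child immediately without running at_exit hooks,
      # so nothing inherited from the parent runs twice in this sketch.
      exit!(job.call ? 0 : 1)
    end
  end

  pids.map do |pid|
    _, status = Process.wait2(pid) # wait2 returns [pid, Process::Status]
    status.exitstatus
  end
end

# Stand-ins for groups of spec files; the second one "fails".
jobs = [-> { true }, -> { false }]
statuses = run_in_children(jobs)
puts statuses.inspect

# A wrapper script could then finish with something like:
#   exit(statuses.all?(&:zero?) ? 0 : 1)
```

Whether a real spec run reports failures through its exit status this way depends on the runner, so treat the last comment as a direction to explore rather than a drop-in change.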

There's room for improvement here, notably in the naive method used to allocate the spec files to processes, but even as simple as this method is, our spec suite runs in about half the time it used to, on a dual-core machine.
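
To make the allocation concrete, here is a sketch in plain Ruby. round_robin reproduces the assignment the script arrives at (sorting, grouping by N, then taking element j-1 of each group puts file i in process i % N). balanced is one possible improvement, a greedy "costliest file to the lightest bucket" pass. Both helper names, the file names, and the cost numbers are made up for illustration; what to use as a cost (file size, a recorded run time) is an assumption left open here.

```ruby
# Round-robin: file i goes to bucket i % n, matching what
# sort.in_groups_of(n) plus array[j-1] yields in the script.
def round_robin(files, n)
  buckets = Array.new(n) { [] }
  files.sort.each_with_index { |file, i| buckets[i % n] << file }
  buckets
end

# Greedy alternative: take the costliest file first and hand it to the
# bucket with the smallest running total. `costs` maps file name => cost.
def balanced(costs, n)
  buckets = Array.new(n) { { total: 0, files: [] } }
  costs.sort_by { |file, cost| [-cost, file] }.each do |file, cost|
    lightest = buckets.min_by { |bucket| bucket[:total] }
    lightest[:files] << file
    lightest[:total] += cost
  end
  buckets.map { |bucket| bucket[:files] }
end

# Hypothetical spec files with made-up costs, for illustration only.
costs = {
  "spec/a_spec.rb" => 9,
  "spec/b_spec.rb" => 1,
  "spec/c_spec.rb" => 8,
  "spec/d_spec.rb" => 2
}

p round_robin(costs.keys, 2)
p balanced(costs, 2)
```

With these made-up numbers, round-robin puts both expensive files (a and c) in the same process, while the greedy pass pairs each expensive file with a cheap one so the two totals come out equal.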
