If you're here from www.josh.works, here's the gist Jason sent back to me the next day, with his answers:
Jason's answers
He said:
From then on, you duplicate that class variable to an instance variable. The .dup should "protect" you from one test modifying data for another test, which is a concern with class variables.
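A minimal sketch of what that advice could look like in a Minitest file. `FakeEngine` is a hypothetical stand-in for the real `SalesEngine` (it only counts how often it gets loaded), and the test names are made up for illustration. The class variable is assigned once when the class body is read, and `setup` hands each test a copy.

```ruby
require "minitest/autorun"

# Hypothetical stand-in for SalesEngine; it counts how often from_csv runs.
class FakeEngine
  @load_count = 0

  class << self
    attr_reader :load_count

    def from_csv(paths)
      @load_count += 1
      new(paths.keys)
    end
  end

  attr_reader :tables

  def initialize(tables)
    @tables = tables
  end
end

class SharedEngineTest < Minitest::Test
  # Evaluated once, when the class body is read, not once per test.
  @@se = FakeEngine.from_csv(items: "./data/items.csv",
                             merchants: "./data/merchants.csv")

  def setup
    # Each test works on its own (shallow) copy of the shared engine.
    @se = @@se.dup
  end

  def test_each_test_gets_a_copy
    refute_same @@se, @se
  end

  def test_engine_was_loaded_only_once
    assert_equal 1, FakeEngine.load_count
  end
end
```

One caveat worth keeping in mind: `dup` makes a shallow copy, so objects held inside the engine (here the `tables` array, in the real project presumably the repositories) are still shared between the copies. A test that mutates those nested objects could still affect another test.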
Hey Jason!
I've got a sticky question for you. The answer might be a simple "that's not possible", but I'm not positive.
I'll give you the context, then the code, then outline all of the things I've tried.
tl;dr: Every test file I run re-initializes our sales_engine with all of its many thousands of lines of data. I want to figure out how to use a setup method or module to spin up a SINGLE instance of the sales engine, and run the rest of our tests against that.
Brett Schwartz and I are building out our Black Thursday project. He and I are about done with iteration 3.
It's actually been quite smooth sailing so far. We've got all our tests passing and the spec harness is quite happy. github repo
Before we started, the instructors talked a lot about making fixtures, or sample data to save us time on running our tests.
I made a bunch, but then ran into problems, as taking pieces of data from every file doesn't guarantee it's the RIGHT data. We had lots of method calls coming back empty, because it was doing math or logic on a file that originally had 5000 items, and now had just ten.
So, we canned using fixtures, and decided to run with the full data sets.
Our tests were running slow, originally, because for every test in every file, we were initializing a new sales engine repo.
Here's what each test looked like when they were super slow:
class MerchantRepositoryTest < Minitest::Test
  def setup
    @se = SalesEngine.from_csv({
      :items => "./data/items.csv",
      :merchants => "./data/merchants.csv",
    })
  end

  def test_merchant_repository_exists
    assert_instance_of MerchantRepository, @se.merchants
  end
  .
  .
  .
I thought initializing the repo before every test was a bad idea, so we went to this:
class InvoiceRepositoryTest < Minitest::Test
  @@se = SalesEngine.from_csv({
    :invoices => "./data/invoices.csv",
    :items => "./data/items.csv",
    :merchants => "./data/merchants.csv",
  })
  @@ir = @@se.invoices

  def setup
    @se = @@se
    @ir = @@ir
  end

  def test_it_exists
    assert_instance_of InvoiceRepository, @ir
  end
  .
  .
  .
Ruby complains that I've got class methods scattered about, so I don't think this is the right way to do it, but it saves us considerable time on testing. Tests went from ~20 seconds per file to 2 seconds per file. (The tests themselves have always completed in fractions of a second.)
Then, we loaded up more data. Now it takes ~4 seconds for the engine to initialize. Not the end of the world, except when I run rake unit_test: we've got 14 test files, so now it's 14 * 4 seconds, and takes a while. The spec harness still runs quite quickly, so it's not that our code is slow (though I know it has lots of room for improvement).
In my digging around on the internet, it sounds like RSpec can do this "before any test runs, set up the following..." approach. I don't know RSpec, though, and don't want to switch over to it this late in the project just for this small gain.
I eventually made one last small improvement to the tests, and pulled out the engine initialization to a module:
module TestSetup
  @@se = SalesEngine.from_csv({
    :invoices => "./data/invoices.csv",
    :items => "./data/items.csv",
    :merchants => "./data/merchants.csv",
    :transactions => "./data/transactions.csv",
    :invoice_items => "./data/invoice_items.csv"
  })
end
# test helper, included in every test:
require 'simplecov'
SimpleCov.start
gem 'minitest'
require 'minitest/autorun'
require 'pry'
require './lib/test_module' # <= loads the TestSetup module

# test file:
class MerchantRepositoryTest < Minitest::Test
  include TestSetup

  def setup
    @se = @@se
  end

  def test_merchant_repository_exists
    assert_instance_of MerchantRepository, @se.merchants
  end
  .
  .
  .
So, each test file loads the test_helper.rb file, which allows the test file to include TestSetup and access all the test set-up in a single place.
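One detail that bears on this setup: `require` evaluates a given file at most once per Ruby process, and returns `false` on later calls. The sketch below demonstrates that with a throwaway helper file (the file name, the `require_twice` method, and the `$helper_loads` counter are all made up for the demo). Whether it helps here depends on an assumption I can't confirm from the above: that the rake task runs all 14 test files inside a single Ruby process rather than launching `ruby` once per file.

```ruby
require "tmpdir"

# Writes a hypothetical helper file to a temp directory and requires it twice.
# Returns the two return values of `require`.
def require_twice
  Dir.mktmpdir do |dir|
    helper = File.join(dir, "demo_test_helper.rb")
    # The helper bumps a global counter every time it is actually evaluated.
    File.write(helper, "$helper_loads = $helper_loads.to_i + 1\n")
    [require(helper), require(helper)]
  end
end

first, second = require_twice
puts "first require:  #{first}"   # true, the file was evaluated
puts "second require: #{second}"  # false, already loaded, so skipped
puts "helper evaluated #{$helper_loads} time(s)"
```

If the test files do share one process, anything expensive at the top level of a required file would run only on the first `require`. If each file gets its own process, every run starts from scratch no matter how the helper is written.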
So, this seems like a win from DRY principles, but I still don't like that I'm starting this engine a dozen times when I run my tests.
Do you know how to get around this? I've tried... many things. But don't know enough ruby to make educated guesses.
I've tried:
- Memoization, to get @@se ||= [expensive_operation], and had no luck, doing this in the test helper file or in the module, with or without making it a class variable.
- adding/removing the setup methods from the tests, and working around them
- it seems like a dozen other things.
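For reference, here is a sketch of the memoization attempt in a form that defers the load until first use. `SlowLoader` is a hypothetical stand-in for `SalesEngine`, and `TestSetup.engine` and `PATHS` are names invented for illustration. Note the limit of this technique: the cached value lives in memory, so it only survives within one Ruby process.

```ruby
# Hypothetical loader standing in for SalesEngine.from_csv.
module SlowLoader
  @calls = 0

  class << self
    attr_reader :calls

    def from_csv(paths)
      @calls += 1
      sleep 0.05 # pretend this is the multi-second CSV parse
      { loaded: paths.keys }
    end
  end
end

module TestSetup
  PATHS = { items: "./data/items.csv", merchants: "./data/merchants.csv" }.freeze

  # The ||= lives inside a method, so nothing is loaded until the first call,
  # and the result is cached on the module for every later call.
  def self.engine
    @engine ||= SlowLoader.from_csv(PATHS)
  end
end

first  = TestSetup.engine
second = TestSetup.engine
puts SlowLoader.calls      # 1
puts first.equal?(second)  # true
```

A test's `setup` could then read `@se = TestSetup.engine`. If the timings don't improve with something like this, that would suggest the test files are being run in separate processes, in which case no in-memory cache can be shared between them.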
So, how would you handle this? Each test finishes in ~0.01 seconds, but the setup takes ~4 seconds to finish. It takes 54 seconds to run all the tests.
According to $ time rake unit_test:
real 1m1.484s
user 0m56.605s
sys 0m1.520s
I think that means of that 1m1.4 seconds, all but 1.5 of them were spent... loading data?
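On reading those numbers: `real` is wall-clock time, `user` is CPU time spent in the process's own code (the Ruby interpreter, which covers both CSV parsing and the tests), and `sys` is CPU time spent in the kernel. So the output shows the run was busy in Ruby code, but not which Ruby code. To separate loading from testing, Ruby's standard `Benchmark.realtime` can time each phase. A sketch, with `time_phases` and both lambdas as hypothetical stand-ins for the real engine load and a real test:

```ruby
require "benchmark"

# Times the load phase and the test phase separately.
# `loader` and `test` are any callables; both are stand-ins here.
def time_phases(loader, test)
  engine = nil
  load_seconds = Benchmark.realtime { engine = loader.call }
  test_seconds = Benchmark.realtime { test.call(engine) }
  { load: load_seconds, test: test_seconds }
end

fake_load = -> { sleep 0.2; :engine }          # pretend CSV load
fake_test = ->(engine) { engine == :engine }   # pretend assertion

timings = time_phases(fake_load, fake_test)
puts format("load: %.3fs, test: %.3fs", timings[:load], timings[:test])
```

Swapping the lambdas for the real `SalesEngine.from_csv` call and a representative test method would show how much of each file's runtime goes to setup.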
Anyway, I'd love to know if you think there's a reasonable solution to this problem!
Unless you say otherwise, I'm going to let it go. I've learned a lot about metrics and benchmarking, as I've gone down this rabbit hole, but I'm reduced to just flailing around in the dark.
- minitest/minitest#61 <= far and away the most detailed discussion about this exact issue. The minitest developer said "not needed", and said why, but I wasn't able to interpret his answer, OR apply it to my code. His answer: minitest/minitest#61 (comment)
- http://www.justinweiss.com/articles/4-simple-memoization-patterns-in-ruby-and-one-gem/
- https://prograils.com/posts/ruby-methods-differences-load-require-include-extend
- http://stackoverflow.com/questions/6833361/access-a-modules-class-variables-inside-a-class-in-ruby
- https://en.wikipedia.org/wiki/Lazy_initialization
- https://chriskottom.com/blog/2014/10/4-fantastic-ways-to-set-up-state-in-minitest/
- http://stackoverflow.com/questions/2461432/lazy-evaluation-in-ruby
- http://stackoverflow.com/questions/1574797/how-to-load-dbseed-data-into-test-database-automatically
- and many, many more...