@ToniRib
Created March 11, 2020 15:43
Non-Deterministic Specs

Fixing flaky specs

Need help? Watch this RubyConf video

CircleCI

Each CircleCI container now prints out all of the spec files that it ran along with the seed number. If you see a spec that you think might be failing due to an order dependency, you can run rspec <all spec files listed for that container pasted here> --bisect --seed <seed copied from that container pasted here> locally to get the minimum command to reproduce the failure.

If you run with SSH on CircleCI and need to install Vim on the box to be able to edit specs, follow the 2nd solution on this StackOverflow post to get it installed. You will likely need to put sudo in front of each of the commands.

You can get the SSH command from the bottom of each container if you have built with SSH. You MUST have a key set within your personal GitHub account in order to be able to use this feature (if you pushed the build). Once you have SSH'ed into the box, you can go into the main app directory to run any tests. You must run bundle exec rspec instead of just rspec to actually run tests.

JavaScript Console Messages

If you want to see the console messages generated from a console.log statement, find this section of code in the spec_helper.rb file and uncomment the line about driver messages:

config.after :each, type: :feature do
    # puts page.driver.console_messages
end

Don't forget to re-comment it before you commit; otherwise our specs will get very noisy!

Common Problems

These are some of the common things that make specs flaky. Sometimes they might only make the spec brittle, which means that small changes can cause it to break immediately. This is also bad because it will require more dev work later to update brittle specs.

Non-Deterministic ActiveRecord Queries/Expectations

If your ActiveRecord query does not specify an order, there is no guarantee what order the records will be returned in. If you don't care about the order, use the RSpec matchers match_array or contain_exactly, which only check the content of your collection, not the order in which it is returned.
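
As a rough plain-Ruby illustration of what an order-insensitive matcher checks, here is a minimal sketch. The same_elements? helper and the sample place names are hypothetical and are not part of RSpec; the two arrays stand in for one unordered query returning its rows in different orders on different runs.

```ruby
# Hypothetical helper: true when both collections hold the same elements,
# regardless of the order they were returned in.
def same_elements?(actual, expected)
  actual.sort == expected.sort
end

# Simulate a query with no ORDER BY returning rows in two different orders.
run_one = %w[Superdome Arena Stadium]
run_two = %w[Stadium Superdome Arena]

puts run_one == run_two                 # order-sensitive comparison: false
puts same_elements?(run_one, run_two)   # order-insensitive comparison: true
```

In a real spec the equivalent check would be something like expect(names).to contain_exactly("Arena", "Stadium", "Superdome"), where names is whatever collection the query returned.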

Similarly, even if you only create one or two AR objects in a test, you should never rely on .first and .last to provide you with the correct object. The methods will order by primary key if the table has one, but it isn't always safe to rely on you knowing the id of the objects since you are not explicitly specifying it. Assign those objects to variables and use the variables as a reference, or look up the objects by something more specific like name.

Being Too Specific When Checking Dates/Times

If you are comparing a timestamp to another timestamp exactly, you can see failures when they are off by microseconds (which you probably don't care about) or when the second has changed during the middle of the test (which you also probably don't care about). If you only care about the date, turn it into a string and compare the date instead of the full timestamp.

If you only care whether or not a timestamp changed, you can always use the change matcher to detect whether or not a change happened.
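
The sketch below shows the difference between an exact comparison, a tolerance-based comparison, and a date-only comparison in plain Ruby. The close_in_time? helper and the sample timestamps are hypothetical; in a spec, RSpec's be_within matcher plays the role of the tolerance check.

```ruby
require "date"

# Hypothetical helper: true when two timestamps are within `tolerance` seconds.
def close_in_time?(a, b, tolerance: 1.0)
  (a - b).abs <= tolerance
end

# Two timestamps 250 microseconds apart, as might happen between writing a
# record and reading it back.
written   = Time.utc(2020, 3, 11, 15, 43, 0, 500)
read_back = Time.utc(2020, 3, 11, 15, 43, 0, 750)

puts written == read_back                    # exact comparison: false
puts close_in_time?(written, read_back)      # within one second: true
puts written.to_date == read_back.to_date    # date-only comparison: true
```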

Instance Variables

Instance variables set in a before(:all) hook can cause failures because the object is created once and then shared across all examples in the spec file. Therefore, if you're mutating it in one test and then expecting it to be in a specific state in another test, there's no guarantee it is in that state, since the tests are run in a random order. These should be converted to either local variables or let bindings. Similarly, just never use class variables. Ever. They persist across ALL specs.
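
To make the order dependence concrete, here is a plain-Ruby simulation that does not use RSpec at all. The run_examples method and the two lambdas are hypothetical stand-ins for two examples that share one object created once per file.

```ruby
# Plain-Ruby simulation (no RSpec): two "examples" share one mutable object,
# the way examples share state that is set up once per file.
def run_examples(order)
  shared_names = ["Superdome"] # created once, shared by every example
  examples = {
    mutates: -> { shared_names << "Arena"; true },
    expects_clean: -> { shared_names == ["Superdome"] }
  }
  # Run the examples in the given order and record pass (true) or fail (false).
  order.map { |name| [name, examples.fetch(name).call] }.to_h
end

p run_examples(%i[expects_clean mutates]) # both pass in this order
p run_examples(%i[mutates expects_clean]) # expects_clean fails in this order
```

With a let binding, each example would get its own fresh array, so the result would no longer depend on the order.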

Using Faker

Let's say you're writing tests for a query that should return you one user but not another based on a search term for the user's first name. If your search term is Bill and you create one user with a first name using the Faker gem and another user specifically with the name Billy, and then always expect your test to return only the Billy user, what would happen if Faker actually generates the name Bill for your non-matching user? The answer is that you would get back BOTH users and your test would fail. Never rely on the Faker gem like this. Always specify BOTH names, or at least choose a search term/name that is not included in Faker's source code. But it's definitely safer to just take the first route.

Additionally, some Faker methods have a small number of options to choose from. If you're using one of those for creating data for a column with a uniqueness validation, there is a chance you can create a collision and have your test fail the uniqueness validation.
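
The following sketch imitates the problem without the Faker gem. NAME_POOL, random_first_name, search_first_names, and flaky_runs are all hypothetical stand-ins; the pool is deliberately tiny so that the collision shows up quickly.

```ruby
# Stand-in for a Faker-style generator with a small pool of first names
# (hypothetical pool; the real lists are much longer).
NAME_POOL = %w[Alice Bill Carla].freeze

def random_first_name(rng)
  NAME_POOL.sample(random: rng)
end

# Hypothetical search: every name containing the term, case-insensitively.
def search_first_names(names, term)
  names.select { |name| name.downcase.include?(term.downcase) }
end

# Count how many seeded "test runs" return more than the one expected user.
def flaky_runs(runs: 100)
  (1..runs).count do |seed|
    names = [random_first_name(Random.new(seed)), "Billy"]
    search_first_names(names, "Bill").size > 1
  end
end

puts "runs that matched both users: #{flaky_runs} of 100"
# Choosing both names explicitly gives the same single match on every run.
p search_first_names(%w[Carla Billy], "Bill")
```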

Time/Date-Sensitive

Some specs fail when daylight saving time changes, or after a specific time of day, or on specific dates. You can use the timecop gem to freeze the time or travel to specific dates/times so that the specs always run at a constant date or time of day. In addition, you can use the Rails helpers to specify relative times, such as Time.now - 1.day or 1.hour.ago. This is generally better than specifying exact dates.

You can check if you have time-sensitive specs by using Timecop in an around hook to freeze the specs at a different time of day. Additionally, if you're using variables for a date range (like start_time or end_time), try changing those to a different date range and running the specs to see if they still pass. It's not uncommon to see someone set a date range that encompasses the current date and relies on the fact that created_at values will be within that date range. However, when the current date inevitably moves past the given end date, the specs can start failing. Explicitly specifying created_at, data_synced_at, updated_at, or whatever the code uses to determine whether the records are within the range, as times relative to the date range when creating the objects, should help with this.
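
Here is a minimal plain-Ruby sketch of the date-range problem, with the current time passed in explicitly so both setups can be checked at different dates. The helper names and the March 2020 range are hypothetical; in_range? stands in for whatever the code under test does.

```ruby
DAY = 24 * 60 * 60 # seconds

# Hypothetical stand-in for the code under test: is the record inside the range?
def in_range?(created_at, start_time, end_time)
  created_at >= start_time && created_at <= end_time
end

# Brittle setup: a hard-coded range that happened to include "today" when the
# spec was written, with created_at left to default to the current time.
def brittle_spec_passes?(now)
  in_range?(now, Time.utc(2020, 3, 1), Time.utc(2020, 3, 31))
end

# Sturdier setup: the range and created_at are defined relative to each other.
def relative_spec_passes?(now)
  start_time = now - 7 * DAY
  end_time   = now + 7 * DAY
  created_at = start_time + DAY # explicitly inside the range
  in_range?(created_at, start_time, end_time)
end

# Run both setups on a date inside the hard-coded range and on one after it.
[Time.utc(2020, 3, 11), Time.utc(2020, 4, 2)].each do |now|
  puts "#{now.strftime('%F')}: brittle=#{brittle_spec_passes?(now)} relative=#{relative_spec_passes?(now)}"
end
```

The brittle setup passes on the first date and fails on the second, while the relative setup passes on both.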

Order-Dependent

Some specs are order dependent, usually due to one of the other issues listed here. Basically, your flaky spec relies on some specific state, and due to the random order of the specs running, when that spec runs it sometimes is in the correct state and sometimes not. You'll often see them pass on their own, but fail if you run the whole spec or a set of specs together. These can be really difficult to debug if you need to run 10+ minutes of specs in order to see the failure (for instance, you see it on CircleCI or only when you run all feature specs).

If you suspect they are order dependent, read up on rspec --bisect. Find a time that the specs failed on CircleCI (or locally) and get the seed number that RSpec used. Then use this seed number along with bisect and the spec files you were running. RSpec will run all of those specs and try to detect failures. If it finds failures, it will check to see if they are order-dependent. If a failure is determined to be order-dependent, RSpec will continue to run the tests in different chunks until it finally spits out the minimum command to reproduce the failure. Generally this will be a much smaller set of specs that can help you pinpoint what unexpected state is persisting across those specs.

OS Dependent

Some specs can fail due to operating system differences. These are pretty rare, but they do happen. For instance, locally we run specs on Mac but CircleCI uses Linux. Even though they both set up Postgres databases, we have seen failures due to collation differences between Mac and Linux. Basically, text sorts differently on these two operating systems, so you might see specs that check sorting pass locally and then consistently fail on CircleCI. Generally, these are actually a good time to just change the data you are checking in the spec since they are likely not actually a problem in production.

Mocking Unnecessarily

You should avoid using allow_any_instance_of (or expect_any_instance_of) when mocking in tests because this will mock ALL instances of a class, which can definitely have unexpected consequences. Additionally, try not to mock more than you need, especially when actually setting up the data can be achieved easily. For instance, instead of doing:

allow(company).to receive(:anywhere_place_id).and_return(place.anywhere_place_id)

you should just do:

company.anywhere_place_id = place.id
company.save!

It's generally not great practice to mock out ActiveRecord message chains either, like:

allow(mission_response).to receive_message_chain(:campaign, :company, :places, :active, :where, :first).and_return place

as these can also be flaky and dependent heavily on the implementation of the code they are testing.

Server Time vs. App Time

Date.today != Time.now. This is a good StackOverflow post on it. Basically, Date.today uses the server time zone, which for our Heroku apps and CircleCI is generally not the time zone that you want. Time.zone.now will use the application's timezone. You can also use .in_time_zone(company.time_zone) if you want something set up for a specific company.

This can actually be a problem in production code also, so watch out for it. In practice, using Time.zone.now (or Time.zone.today when you need a date) instead of Date.today is almost always recommended.
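
The mismatch can be reproduced in plain Ruby without Rails. In this sketch, date_in_offset is a hypothetical helper, and the fixed "-07:00" offset is an assumed application time zone used only for illustration.

```ruby
require "date"

# Hypothetical helper: the calendar date of an instant as seen at a UTC offset.
def date_in_offset(instant, utc_offset)
  instant.getlocal(utc_offset).to_date
end

# One instant: 02:00 UTC on March 11.
instant = Time.utc(2020, 3, 11, 2, 0, 0)

server_date = date_in_offset(instant, "+00:00") # server running in UTC
app_date    = date_in_offset(instant, "-07:00") # app zone seven hours behind

puts server_date              # 2020-03-11
puts app_date                 # 2020-03-10
puts server_date == app_date  # false
```

A spec that builds data with one of these dates and asserts with the other will pass for most of the day and fail for the hours in which the two zones disagree.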

Exact Number of Objects in Database

Even with Database Cleaner, you sometimes might get flakes when writing something like this:

expect(Place.count).to eq 1

subject.add_place(name: 'Superdome')

expect(Place.count).to eq 2

Also, that is tied very closely to the test setup. If someone later comes in and adds a place in a let binding or before block, your spec will start failing even if the add_place functionality hasn't changed, since it may now be starting with 2 places in the database and adding one to make 3.

Instead, you can use the change matcher to check the difference between the before and after states:

expect { subject.add_place(name: 'Superdome') }.to change { Place.count }.by 1

Setting ENV Vars

Always use an around hook to accomplish this. Since ENV vars are global, if you reset one somewhere in a test, it will stay that way for all the rest of the tests. By using an around hook, you can save off the original value, change it to what you need, run the test, then reset it to the original value.

The main GSC repo has this helper:

def mock_env_var!(key, value)
  around do |ex|
    original_value = ENV[key]
    ENV[key] = value
    ex.run
    ENV[key] = original_value
  end
end
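
The same save, set, run, restore pattern can be sketched outside RSpec as a block helper. The with_env_var name and the FLAKY_SPEC_DEMO variable are hypothetical and are not part of the repo helper above; the ensure clause is an added safeguard so the value is restored even if the block raises.

```ruby
# Hypothetical block-style variant: set ENV[key] for the duration of the block,
# then restore whatever was there before, even if the block raises.
def with_env_var(key, value)
  original_value = ENV[key]
  ENV[key] = value
  yield
ensure
  ENV[key] = original_value # assigning nil removes the variable again
end

with_env_var("FLAKY_SPEC_DEMO", "on") do
  puts ENV["FLAKY_SPEC_DEMO"] # set only inside the block
end
p ENV.key?("FLAKY_SPEC_DEMO") # false, assuming it was unset beforehand
```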

Specifying Exact IDs for ActiveRecord Objects

The IDs use a sequence and count up from 1 as the tests run, so specifying any exact ID can later collide with another ID and result in an error.
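
A simplified in-memory model shows how such a collision can arise. FakeTable is hypothetical and only imitates a table whose primary key is filled from a sequence; it assumes that inserting an explicit ID does not advance the sequence.

```ruby
# Simplified in-memory model of a table whose primary key comes from a sequence.
class FakeTable
  def initialize
    @rows = {}
    @sequence = 0
  end

  # Inserts a row; without an explicit id, the next sequence value is used.
  def insert(id: nil)
    id ||= (@sequence += 1)
    raise KeyError, "duplicate primary key #{id}" if @rows.key?(id)

    @rows[id] = true
    id
  end
end

table = FakeTable.new
table.insert(id: 1) # a spec hard-codes id 1; the sequence does not advance
begin
  table.insert      # the next record asks the sequence, which also yields 1
rescue KeyError => e
  puts e.message
end
```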

Objects Created From Associations

A lot of the FactoryBot helpers create additional objects if you don't specify the ids for related objects. For example, if you create a section without specifying a campaign_id, a new campaign object will be created. If your test is then expecting something related to campaigns, it's possible that sometimes a campaign will be created that happens to meet the conditions of your test, thus making it fail sometimes.

To avoid this, make sure you know when additional objects are being created and change your let/before statements to only create the objects you need. In the example provided, you could create the campaign first, then when you create sections and tasks for it, always pass those helper methods the corresponding campaign id so additional ones are not created.
