I wrote this because I wanted a fast way to sanity-check the test environment and make sure each of our various example group types continues to behave the way we expect.
By default it pulls 5 random spec files from each of our six spec suites (so usually 30 specs altogether), shuffles the combined list so the specs don't run in suite order, and runs them. You can set the SAMPLE_SIZE environment variable to change the number of specs pulled from each group; if you do, I'd recommend setting it higher, not lower.
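The task itself isn't reproduced here, but the core sampling logic might look something like this minimal Ruby sketch. The suite directories, the `sample_specs` helper, and the defaults are my own placeholders, not the real task:

```ruby
# Number of spec files to pull from each suite; override with SAMPLE_SIZE=10, etc.
SAMPLE_SIZE = Integer(ENV.fetch("SAMPLE_SIZE", 5))

# One directory per suite -- replace these with your app's actual spec layout.
SUITE_DIRS = %w[spec/models spec/controllers spec/helpers
                spec/lib spec/requests spec/views]

def sample_specs(suite_dirs, sample_size)
  # Pull `sample_size` random spec files from each suite, then shuffle the
  # combined list so the specs don't run grouped by suite.
  suite_dirs.flat_map { |dir| Dir.glob("#{dir}/**/*_spec.rb").sample(sample_size) }
            .shuffle
end

files = sample_specs(SUITE_DIRS, SAMPLE_SIZE)
# Inside a rake task you'd then hand the shuffled list to RSpec, e.g.:
#   sh "bundle exec rspec #{files.join(' ')}"
```

`Array#sample(n)` picks without replacement, so you never run the same spec file twice in one sample, and asking for more files than a suite has simply returns the whole suite.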
We all want a complete test suite that covers all the bases, exercising as much of the stack as possible while still running from start to finish in 2 minutes or less. Unfortunately, large production applications can't always get there, and the time cost of running the whole suite makes it hard to attempt the kind of sweeping changes you'd need to get there. A change to the app environment, say refactoring something in our middleware stack or removing a middleware we no longer need, could affect one spec suite differently from the others. At Typekit our test suite is relatively slow (10 minutes, not awful but not great), and I need faster feedback than that.
The purpose of this task is to surface any ripple effects of a global config or middleware change as quickly as possible, so you can spot patterns and react accordingly. It is not a substitute for a full test run. Pollsters query a random sample of people because talking to everyone isn't practical to do very often; the Census tries to count everybody because that's more accurate, and it doesn't need to happen very often. A sample spec run is more like a Gallup poll than the Census. Don't neglect to run your full suite.
Running a random list of specs has the added benefit of checking for unintended side effects between tests. If a spec that normally passes fails during a random run, that's a sign something is hinky, likely an order dependency or leaked state, and it needs more rigorous investigation. (This assumes, of course, that you're running random tests on a known-good branch; if you've made changes that might turn the build red, a failure just proves you broke something.)