@olivergeorge
Last active March 23, 2022 15:32

This is a bit of a thought exercise. I doubt it’s perfect, and I’m hoping for opinions and corrections, with the goal of a well-reasoned, practical approach.

Motivation...

One way to look at type declarations in a static language is as a test which picks up potentially incompatible code paths, e.g. data passed to a function is incompatible with the code that receives it.

In static languages the effort to write the test is reduced because it is declared inline with the code, and inference allows a few annotations to permeate. Having said that, we can achieve similar results in Clojure.

Quick scan of tools at hand...

  • compiler analysis
  • pre/post function assertions
  • code assertions
  • generative testing
  • spec assertions
  • function specs
  • function instrumentation
  • spec based generative testing

The compiler will emit warnings in some cases (for example, the ClojureScript compiler warns about undeclared vars and wrong-arity calls).

Coding in pre/post conditions and asserts has always been an option. It doesn’t help with writing tests to exercise the function, but it does pick up cases where the code is exposed to something it isn’t intended to handle. Assert errors aren’t very informative themselves, but they do show where it hurts.
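As a rough sketch of the pre/post option, assuming JVM Clojure (the function `percent` is made up for illustration):

```clojure
(ns example.prepost)

;; Hypothetical function, used only for illustration.
(defn percent
  "Returns part/total as a percentage."
  [part total]
  {:pre  [(number? part) (number? total) (pos? total)]
   :post [(number? %)]}
  (* 100 (/ part total)))

;; A valid call passes both the :pre and :post checks.
(percent 1 4)

;; A bad call throws an AssertionError. The message names the failed
;; form, here (pos? total), but not the offending value.
(def failure-message
  (try
    (percent 1 0)
    (catch AssertionError e
      (.getMessage e))))
```

The message tells you which condition failed, which is the "shows where it hurts" part, but you still have to work out what value was passed.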

The test.check library has been around for a while. It provides a way to generate random data for use in testing, and when a test fails it can attempt to find the simplest input which caused the failure (shrinking). There’s effort required to write tests which cover a full range of interesting inputs, and it can be fiddly to ensure good code coverage. Testing is computationally intensive due to the work required to generate data and the number of times the code is executed.
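A minimal sketch of a test.check property, assuming test.check 0.10.0 or later is on the classpath (for `gen/small-integer`). The property is deliberately false so that shrinking has something to do:

```clojure
(ns example.property
  (:require [clojure.test.check :as tc]
            [clojure.test.check.generators :as gen]
            [clojure.test.check.properties :as prop]))

;; Deliberately false property: "every vector of integers is already sorted".
(def already-sorted
  (prop/for-all [v (gen/vector gen/small-integer)]
    (= v (sort v))))

;; Runs up to 100 random trials. On failure the result map has
;; :result false plus a :shrunk entry whose :smallest value is a
;; minimal counterexample found by shrinking.
(def result (tc/quick-check 100 already-sorted))

(get-in result [:shrunk :smallest])
```

The first failing input is usually a long, noisy vector; the shrunk one is typically a short vector with two out-of-order elements, which is much easier to reason about.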

Clojure spec builds on these ideas. It provides a way to describe data, which can then be used to assert that data is valid, like pre/post assertions but with more informative errors (via instrumentation or s/assert). It can be used to generate data for testing; in fact, it builds on test.check. It provides clojure.spec.test/check (clojure.spec.test.alpha/check in released versions of Clojure) to exercise a function with generated arguments as part of testing. It can also replace functions, allowing you to stub out side-effecting code to isolate the code being tested. All of this is implemented with reuse in mind. Once we describe our data and functions with specs, we have a range of tools available to us.
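To make that concrete, here is a small sketch assuming Clojure 1.9 or later, where spec lives in `clojure.spec.alpha`. The `::user` spec and `greeting` function are invented for illustration:

```clojure
(ns example.spec-basics
  (:require [clojure.spec.alpha :as s]))

;; Describe the data once...
(s/def ::id pos-int?)
(s/def ::name string?)
(s/def ::user (s/keys :req [::id ::name]))

;; ...then reuse the description for validation and error reporting.
(s/valid? ::user {::id 1 ::name "Ada"})
(s/valid? ::user {::id -1 ::name "Ada"})

;; explain-str reports which value failed which predicate.
(def explanation (s/explain-str ::user {::id -1 ::name "Ada"}))

;; s/assert is a no-op until runtime checking is switched on.
(s/check-asserts true)
(s/assert ::user {::id 1 ::name "Ada"})

;; A function spec, used later by instrumentation and generative checks.
(defn greeting [user]
  (str "Hello, " (::name user)))

(s/fdef greeting
  :args (s/cat :user ::user)
  :ret string?)
```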

So with these tools how should we write our “type” tests?

Goals...

  • Ensure data passed between functions is compatible
  • Ensure functions return expected valid data

First implementation attempt...

  • Write spec describing inputs and outputs of our functions
  • Use clojure.spec.test/check to look for bugs

Without instrumentation, calls made with bad data are neither checked nor reported. Tests fail only if the generated data causes an exception or the function produces an invalid return value.
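A sketch of this first attempt, assuming Clojure 1.9+ with test.check on the classpath. `label` and `label-id` are hypothetical; `label-id` has a deliberate bug in that it passes an int where `label` is spec'd to take a string:

```clojure
(ns example.first-attempt
  (:require [clojure.spec.alpha :as s]
            [clojure.spec.test.alpha :as stest]))

(defn label [s]
  (str "<" s ">"))

(s/fdef label
  :args (s/cat :s string?)
  :ret string?)

;; Bug: passes an int to label, violating label's :args spec.
(defn label-id [id]
  (label id))

(s/fdef label-id
  :args (s/cat :id int?)
  :ret string?)

;; No instrumentation: only label-id's own :args and :ret are exercised.
;; str happily accepts an int, so the bad call to label goes unnoticed.
(def results
  (doall (stest/check `label-id
                      {:clojure.spec.test.check/opts {:num-tests 100}})))

(stest/summarize-results results)
```

The check passes even though the code violates a spec, because nothing is watching the call from `label-id` to `label`.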

Second attempt...

  • Write spec describing inputs and outputs of our functions
  • Turn on instrumentation
  • Use clojure.spec.test/check to look for bugs

Now our calls are checked and reported. We still have challenges getting coverage of all code paths, and potentially a lot of code is executed aside from the function being tested. Any side-effecting code complicates setting up tests and getting repeatable errors.
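A sketch of the second attempt, under the same assumptions (Clojure 1.9+, test.check available, direct linking off). The hypothetical `label` and `label-id` are the same deliberately buggy pair: `label-id` passes an int to a function spec'd to take a string.

```clojure
(ns example.second-attempt
  (:require [clojure.spec.alpha :as s]
            [clojure.spec.test.alpha :as stest]))

(defn label [s]
  (str "<" s ">"))

(s/fdef label
  :args (s/cat :s string?)
  :ret string?)

;; Bug: passes an int to label, violating label's :args spec.
(defn label-id [id]
  (label id))

(s/fdef label-id
  :args (s/cat :id int?)
  :ret string?)

;; Instrument the callee so that its :args spec is enforced on every call.
(stest/instrument `label)

;; The generative check now trips over the bad call inside label-id.
(def results
  (doall (stest/check `label-id
                      {:clojure.spec.test.check/opts {:num-tests 100}})))

;; Restore the original function once testing is done.
(stest/unstrument `label)

;; The :failure entry holds the exception raised by the instrumented call.
(def failure (:failure (first results)))
```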

Third attempt...

  • Write spec describing inputs and outputs of our functions
  • Turn on instrumentation and stub all side-effecting functions
  • Use clojure.spec.test/check to look for bugs

Now data passed to other functions is checked, but side-effecting code is not executed. Instead, a random return value is generated from the stubbed function's :ret spec in place of those calls. This avoids the complications associated with side-effecting code.
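A sketch of the third attempt using the `:stub` option of `instrument`, assuming Clojure 1.9+ with test.check on the classpath. `fetch-user!` stands in for some side-effecting call and is purely hypothetical; it throws if it is ever really invoked.

```clojure
(ns example.third-attempt
  (:require [clojure.spec.alpha :as s]
            [clojure.spec.test.alpha :as stest]))

(s/def ::name string?)

;; Hypothetical side-effecting function. It throws to prove that the
;; real implementation is never called while stubbed.
(defn fetch-user! [id]
  (throw (ex-info "network call: not available in tests" {:id id})))

(s/fdef fetch-user!
  :args (s/cat :id pos-int?)
  :ret (s/keys :req-un [::name]))

(defn user-greeting [id]
  (str "Hello, " (:name (fetch-user! id))))

(s/fdef user-greeting
  :args (s/cat :id pos-int?)
  :ret string?)

;; Stub: args are still checked against the :args spec, and the return
;; value is generated from the :ret spec instead of running the body.
(stest/instrument `fetch-user! {:stub #{`fetch-user!}})

(def results
  (doall (stest/check `user-greeting
                      {:clojure.spec.test.check/opts {:num-tests 50}})))

(stest/unstrument `fetch-user!)
```

The stubbed return values are only as realistic as the :ret spec, so a loose spec means the calling code is tested against loosely shaped data.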

Additional challenges...

  • Some data types are hard to generate (computationally intensive)
  • Some data types are hard to express in code
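One way to tackle the hard-to-generate case is a custom generator attached with `s/with-gen`. This is a sketch, assuming Clojure 1.9+ with test.check available; the `::sku` format is invented for illustration. Without the custom generator, spec would generate arbitrary strings and filter them through the regex, which almost never matches and eventually gives up.

```clojure
(ns example.generators
  (:require [clojure.spec.alpha :as s]
            [clojure.spec.gen.alpha :as gen]))

(def upper-case-letters (seq "ABCDEFGHIJKLMNOPQRSTUVWXYZ"))

;; Hypothetical format: two capital letters, a dash, four digits.
(s/def ::sku
  (s/with-gen
    (s/and string? #(re-matches #"[A-Z]{2}-\d{4}" %))
    ;; Build matching strings directly instead of filtering random ones.
    #(gen/fmap (fn [[a b n]] (format "%c%c-%04d" a b n))
               (gen/tuple (gen/elements upper-case-letters)
                          (gen/elements upper-case-letters)
                          (gen/choose 0 9999)))))

;; Every generated sample should satisfy the spec.
(def samples (gen/sample (s/gen ::sku) 20))
```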

Notes...

Calling side-effecting functions. Defining a spec and stubbing them out works. If it is third-party code, consider adding an interop namespace to isolate the code and provide a place to hook up specs.

Working with higher order functions. Passing immutable data around is easy, but passing functions is trickier. There are spec features for describing anonymous functions, but generating them is a bit limited (in my limited experience).
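A sketch of what that looks like with `s/fspec`, assuming Clojure 1.9+ with test.check on the classpath. The `::int-fn` spec is made up for illustration:

```clojure
(ns example.higher-order
  (:require [clojure.spec.alpha :as s]
            [clojure.spec.gen.alpha :as gen]))

;; Describes "a function from int to int".
(s/def ::int-fn
  (s/fspec :args (s/cat :n int?)
           :ret int?))

;; Validating a function means calling it with generated arguments and
;; checking each return value against :ret.
(def halving-ok? (s/valid? ::int-fn #(quot % 2)))
(def str-ok?     (s/valid? ::int-fn str))

;; Generating a function works, but the result just returns a freshly
;; generated :ret value on each call, unrelated to its argument.
(def generated-fn (gen/generate (s/gen ::int-fn)))
(generated-fn 1)
```

This is the limitation as I understand it: a generated function satisfies the spec but has no meaningful relationship between input and output.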

What would be cool...

A service which tracked what tests have already been run, and a way to only run generative tests for the bits which might have changed. This would be more efficient and opens up the idea of pushing testing cycles to other resources (not my laptop).

Ways to make generators smarter. The goal is to ensure each function is tested with a good range of data and gets good code coverage.

A way to check code coverage as part of generative testing.

Being able to stub branch statements like “if” so that both paths can be exercised without “getting lucky”. Some branches require very specific data to be generated.

Using specs in static analysis to pick up problems without needing to generate examples. Implies fancy inference. Requires someone willing to take pure type inference ideas and adapt them to an impure predicate based world - statistical or imperative type soundness? I’m guessing. Seems like there is a PhD in this but I am not an academic.

IDE features which use specs to guide the developer - warn when args violate function args spec, hover over symbol to see spec, suggest specs for functions...

Efficient data generation for ClojureScript. Complex specs crash my tests with "max call stack exceeded" errors (in my limited experience).

IDE affordances. Since specs are intentionally decoupled from function implementation, it’s harder to see the code and spec at the same time or to work on both. If you don’t have tests running then specs can easily fall out of date. No doubt discipline helps but...


realgenekim commented Jul 28, 2020

I love using https://github.com/gnl/ghostwheel by @gnl, and more recently, https://github.com/fulcrologic/guardrails by @awkay (because it's easier to set up).

I love being able to write specs as part of the defn, like this, which I find quite beautiful. And it will catch at run-time any spec failures in the arguments and return values. (In this case, I easily found an error where I was returning a symbol instead of a sequence. So great!)

(>defn find-dispatches
  " accumulate all events "
  [sexpr events]
  [any? vector? => (s/nilable seq?)]


awkay commented Jul 28, 2020

FYI, in regard to:

Using specs in static analysis to pick up problems without needing to generate examples. Implies fancy inference. Requires someone willing to take pure type inference ideas and adapt them to an impure predicate based world - statistical or imperative type soundness? I’m guessing. Seems like there is a PhD in this but I am not an academic.

I'm working on a version of Guardrails called Guardrails Pro (it will be an inexpensive commercial extension of Guardrails) that is doing exactly this.

@olivergeorge
Author

Sounds interesting!
