@holiman
Last active October 3, 2018 15:42

Current status and problems

Currently, tests are checked in to the tests repo, which contains several things:

  • Manually crafted tests, like bcForkBlockTestCopier.json in the src/BlockchaintestsFiller/ directory. The 'filler' directories contain various tests that are 'to be filled'.
  • Generalized state tests. These are also unfilled, and basically contain the prestate and the expect-section for various 'indexes'. Indexes are used to generalize the test. Example test 'Bazonk':
    • Run this test with gas 2000001, then 400000, on Constantinople and Homestead.
    • And for those (four) tests, we expect the postState to contain <dump of state>.
    The generalized statetests are then executed, and the example above would result in four filled tests.
  • Various odd-format tests, TransactionTests, VMTests etc.
  • Additionally, Generalized statetests are also converted into blockchaintests.
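
As a rough illustration of how one generalized test fans out into filled tests, here is a minimal Python sketch. The function name and the dictionary layout are made up for this example and do not reflect the real filler format; only the 'Bazonk' values come from the description above.

```python
from itertools import product


def expand_generalized_test(name, gas_values, forks):
    """Return one case per (fork, gas) combination.

    Hypothetical helper: the real filler format has more fields
    (prestate, expect-section, ...) which are left out here.
    """
    return [
        {"name": name, "fork": fork, "gas": gas}
        for fork, gas in product(forks, gas_values)
    ]


cases = expand_generalized_test(
    "Bazonk",
    gas_values=[2000001, 400000],
    forks=["Constantinople", "Homestead"],
)
# Two gas values on two forks give four cases to fill.
```

Each of these cases would then be filled once as a statetest and once more as a blockchaintest.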

So for the example Generalized StateTest 'Bazonk' above, we have

  • Four stateTests,
  • Four blockchaintests
  • The manual test

So there are five files for that one. This leads to several problems, which this PR illustrates quite well:

  • 4866 files in a single commit
  • Impossible to review
  • Very difficult to find the root of a failing test, since it's difficult to find the 'true origin' and intent behind the test.

Proposed solution

We should bootstrap a new test repository, which only contains the manually created tests that are the source for the actual tests that are run on clients. Let's call them

  • source (a file which describes a test: today called 'fillers')
  • artefact (a file which is executable in a client test-harness).

That would play better with GitHub, and allow meaningful reviews of PRs and test cases, as well as discussion about PRs. The repo should also contain the scripts that convert sources to artefacts.

Then we should set up build server(s) that do the following:

  • On every commit to the test repo, regenerate tests
    • Output 1: A list of tests that have changed
    • Output 2: A zipped archive of changed tests
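
A minimal sketch of what producing the two outputs could look like, assuming the artefacts from the previous and the current build are available as name-to-bytes mappings. The function names and that in-memory layout are assumptions for illustration, and the comparison here is plain byte equality.

```python
import io
import zipfile


def changed_tests(old, new):
    """Output 1: names of artefacts that are new or whose content differs."""
    return sorted(name for name, data in new.items() if old.get(name) != data)


def zip_tests(artefacts, names):
    """Output 2: a zip archive (as bytes) holding only the named artefacts."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in names:
            archive.writestr(name, artefacts[name])
    return buffer.getvalue()


# Hypothetical artefacts from the previous and the current build.
old = {"a.json": b"1", "b.json": b"2"}
new = {"a.json": b"1", "b.json": b"3", "c.json": b"4"}

changed = changed_tests(old, new)
archive_bytes = zip_tests(new, changed)
```

Tests that were deleted between builds are not reported by this sketch; a real build step would probably want to list those as well.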

The server should also provide zipped archives of

  • All statetests,
  • All blockchaintests,
  • etc...

When checking 'did this test change', the test generation/artefact creation process should not compare certain fields, namely the _info/filledwith data, which is only metadata that is updated even if the test data does not change.
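
One way to sketch this metadata-blind comparison in Python, assuming the artefacts are parsed JSON and the metadata lives under an `_info` key (the key placement and the helper names are assumptions, not taken from the actual tooling):

```python
# Assumption: metadata such as 'filledwith' sits under an '_info' key.
METADATA_KEYS = {"_info"}


def strip_metadata(value):
    """Recursively drop metadata keys from parsed JSON data."""
    if isinstance(value, dict):
        return {
            key: strip_metadata(item)
            for key, item in value.items()
            if key not in METADATA_KEYS
        }
    if isinstance(value, list):
        return [strip_metadata(item) for item in value]
    return value


def has_changed(old, new):
    """True only if something other than the metadata differs."""
    return strip_metadata(old) != strip_metadata(new)


# Hypothetical artefacts: same test data, refilled with a different tool version.
old = {"Bazonk": {"_info": {"filledwith": "tool-A"}, "post": {"hash": "0x01"}}}
refilled = {"Bazonk": {"_info": {"filledwith": "tool-B"}, "post": {"hash": "0x01"}}}
modified = {"Bazonk": {"_info": {"filledwith": "tool-B"}, "post": {"hash": "0x02"}}}

only_metadata_differs = has_changed(old, refilled)   # False
test_data_differs = has_changed(old, modified)       # True
```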

Benefits

  1. We don't have to rely on one person (Dmitry) to regenerate tests manually,
  2. No more commits with thousands of files,
  3. Better visibility into the generation process,
  4. Better information about changes to tests.

holiman commented Sep 28, 2018

The test server should also allow pinning of certain revisions, such as

  • pinning the exact version to use for filling the tests (the evm engine or block executor)
  • pinning the exact version of Solidity to use for generating code from solidity-source
  • pinning the exact version of LLL to use for generating code from LLL-source
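
A small sketch of how such pins might be checked before a fill run. The tool keys and version strings below are placeholders, and how the build server would actually discover the installed versions is left open.

```python
# Placeholder pins: the keys and version strings are made up for illustration.
PINS = {
    "filler": "1.2.3",    # the evm engine or block executor used for filling
    "solidity": "1.2.3",  # compiler for solidity-source
    "lll": "1.2.3",       # compiler for LLL-source
}


def pin_mismatches(pins, installed):
    """Return {tool: (pinned, installed)} for every tool that is off its pin."""
    return {
        tool: (wanted, installed.get(tool))
        for tool, wanted in pins.items()
        if installed.get(tool) != wanted
    }


# Hypothetical versions found on the build server; 'lll' is missing entirely.
installed = {"filler": "1.2.3", "solidity": "1.2.4"}
mismatches = pin_mismatches(PINS, installed)
```

A build could refuse to regenerate tests whenever `mismatches` is non-empty, so that artefacts are only ever produced with the pinned toolchain.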

@pipermerriam

If I understand correctly, tests are currently filled/generated using the aleth (C++) implementation. I'd be interested in pursuing a strategy which fills the tests using multiple implementations, ideally with something in CI which fails a pull request if there is a disagreement/mismatch in the generation of a test across the different implementations.
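
A rough sketch of such a CI check, assuming each implementation's fill run can be reduced to a mapping from test name to a post-state hash. The data layout, the function name and the second client name are assumptions for illustration.

```python
def find_disagreements(fills):
    """fills maps implementation name -> {test name: post-state hash}.

    Returns {test name: {implementation: hash}} for every test on which the
    implementations do not all agree. A missing result counts as a mismatch.
    """
    test_names = set().union(*(results.keys() for results in fills.values()))
    disagreements = {}
    for name in sorted(test_names):
        answers = {impl: results.get(name) for impl, results in fills.items()}
        if len(set(answers.values())) > 1:
            disagreements[name] = answers
    return disagreements


# Hypothetical fill results; 'client_b' stands in for a second implementation.
fills = {
    "aleth": {"Bazonk": "0xaa", "Foo": "0x01"},
    "client_b": {"Bazonk": "0xbb", "Foo": "0x01"},
}

disagreements = find_disagreements(fills)
# A CI job could use this as its exit status to fail the pull request.
exit_code = 1 if disagreements else 0
```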
