Some of the biggest concerns I have about correctness:
- Errors in assembly code. These are unfortunately hard to find in an automated way.
- Boundary conditions, especially in every field of every instruction, but also in memory accesses.
- Alignment. Alignment errors are extremely common, and I haven't convinced myself that CKB-VM can't be coerced into e.g. executing native assembly instructions on misaligned data.
- Out-of-bounds accesses. One of the worst.
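One way to probe instruction-field boundary conditions is to test the exact edges of each field's range. As a standalone sketch (the helper below is hypothetical, not CKB-VM's internal API), the sign extension of a 12-bit I-type immediate flips exactly at 0x800:

```rust
// Standalone sketch: sign-extend an N-bit immediate field the way an
// RV64 decoder must. `sign_extend` is a hypothetical helper name.
fn sign_extend(value: u64, bits: u32) -> i64 {
    let shift = 64 - bits;
    ((value << shift) as i64) >> shift
}

fn main() {
    // 12-bit I-type immediate: 0x7ff is the largest positive value...
    assert_eq!(sign_extend(0x7ff, 12), 2047);
    // ...and 0x800 is the boundary where the sign bit flips.
    assert_eq!(sign_extend(0x800, 12), -2048);
    assert_eq!(sign_extend(0xfff, 12), -1);
    println!("boundary checks passed");
}
```

Tests like these are cheap to enumerate mechanically for every field width the decoder uses.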
Here are some concrete, actionable tasks to validate CKB-VM, in a reasonable order of priority.
Both riscv-tests and riscv-compliance are pinned to 2019 revisions.
Important tests have probably been added since.
I also ran into problems building the test suite that newer revisions would have solved.
miri is an obvious one, but with so much assembly it's not going to get great coverage. Valgrind might turn up something interesting.
If miri can give useful coverage, it might be added to CI.
The in-tree fuzz project only has the simplest test case. More test cases would probably help the fuzzer dig deeper.
Fuzzers work best with a good starting corpus. Fuzzing under CI, at least, is not using any corpus, so it is probably not exercising the VM much.
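A cheap way to build a starting corpus is to emit files of valid instruction encodings rather than random bytes, so the fuzzer starts from inputs the decoder actually accepts. A minimal sketch (the encoder covers only the I-type format, and the seed-file name and layout are my assumptions, not anything libFuzzer requires):

```rust
use std::fs;

// Encode an RV32/RV64 I-type instruction: imm[11:0] | rs1 | funct3 | rd | opcode.
fn encode_i(imm: u32, rs1: u32, funct3: u32, rd: u32, opcode: u32) -> u32 {
    (imm & 0xfff) << 20
        | (rs1 & 0x1f) << 15
        | (funct3 & 0x7) << 12
        | (rd & 0x1f) << 7
        | (opcode & 0x7f)
}

fn main() -> std::io::Result<()> {
    // addi x1, x0, 1 -- a well-known encoding, useful as a sanity check.
    let addi = encode_i(1, 0, 0b000, 1, 0b0010011);
    assert_eq!(addi, 0x0010_0093);

    // Seed file: a handful of valid instructions, little-endian.
    // "corpus_seed_0" is an arbitrary name for illustration.
    let seed: Vec<u8> = [addi, encode_i(0xfff, 1, 0b000, 2, 0b0010011)]
        .iter()
        .flat_map(|i| i.to_le_bytes())
        .collect();
    fs::write("corpus_seed_0", &seed)?;
    Ok(())
}
```

Generating a few files per instruction format would give the fuzzer valid decodings to mutate instead of having to discover them.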
The in-tree fuzz project only uses LLVM's libFuzzer. AFL- and honggfuzz-based fuzzers may exercise different code paths.
To find unknown bugs we can compare CKB-VM's behavior to other RISC-V VMs. mohanson is exploring this in his rfuzzing project, but we can also add it to the in-tree fuzz project.
The simplest way to compare is just to check the final result: did the program succeed or fail? But there are other things we could imagine comparing:
- the complete memory space after execution
- register state after execution
- register state after every instruction step
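For the register-state comparisons, a small diff helper makes mismatches easy to report per step. This is a standalone sketch; wiring it up to pull actual state out of CKB-VM's machine (and out of the other VM) is the real work:

```rust
// A RISC-V register file snapshot: x0..x31.
type Regs = [u64; 32];

// Report every register where the two implementations disagree,
// as (register index, value in A, value in B).
fn diff_regs(a: &Regs, b: &Regs) -> Vec<(usize, u64, u64)> {
    a.iter()
        .zip(b.iter())
        .enumerate()
        .filter(|(_, (x, y))| x != y)
        .map(|(i, (x, y))| (i, *x, *y))
        .collect()
}

fn main() {
    let mut a: Regs = [0; 32];
    let mut b: Regs = [0; 32];
    a[10] = 42; // a0 differs between the two runs
    b[10] = 43;
    let diffs = diff_regs(&a, &b);
    assert_eq!(diffs, vec![(10, 42, 43)]);
    for (reg, x, y) in diffs {
        eprintln!("x{} mismatch: {:#x} vs {:#x}", reg, x, y);
    }
}
```

Comparing after every instruction step pinpoints the first divergent instruction, which is far easier to debug than a divergent final result.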
MOPs (macro-op fusions) are an optimization, and the VM needs to behave exactly the same with and without them.
Compare CKB-VM against spike on code pulled from crates.io or other sources.
I haven't thought too much about this, but I notice that the assembly for each interpreter op follows patterns, and I have seen patches from mohanson that fix errors in those patterns (like missing .p2align directives). We could try to write a custom analysis to enforce those patterns.
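As a sketch of what such an analysis could look like: scan the generated assembly and flag any op label that is not immediately preceded by a `.p2align` directive. The label prefix matched below is a guess at the shape of the interpreter's labels, not CKB-VM's actual naming:

```rust
// Flag op labels that are not immediately preceded by a .p2align directive.
// The ".CKB_VM_ASM_LABEL_OP_" prefix is a hypothetical label shape.
fn missing_p2align(asm: &str) -> Vec<String> {
    let mut prev_was_align = false;
    let mut missing = Vec::new();
    for line in asm.lines() {
        let line = line.trim();
        if line.starts_with(".CKB_VM_ASM_LABEL_OP_") && line.ends_with(':') {
            if !prev_was_align {
                missing.push(line.trim_end_matches(':').to_string());
            }
        }
        prev_was_align = line.starts_with(".p2align");
    }
    missing
}

fn main() {
    let asm = "\
.p2align 3
.CKB_VM_ASM_LABEL_OP_ADD:
    ret
.CKB_VM_ASM_LABEL_OP_SUB:
    ret
";
    // OP_ADD is aligned; OP_SUB is missing its directive.
    assert_eq!(missing_p2align(asm), vec![".CKB_VM_ASM_LABEL_OP_SUB"]);
}
```

A check like this could run in CI against the generated assembly, turning a pattern convention into an enforced invariant.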
The VM uses a lot of low level bit manipulation. A lot of that is for performance or C/asm interop, but probably not all of it.
It might be worthwhile to look for places that could benefit from stronger typing.
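For example, a register index that today travels as a bare integer could become a newtype whose constructor enforces the 0..=31 range, so an out-of-range index is impossible to construct rather than merely unlikely. A standalone sketch (these are hypothetical types, not CKB-VM's actual ones):

```rust
// Hypothetical newtype: a register index proven to be in 0..=31.
#[derive(Copy, Clone, Debug, PartialEq)]
struct RegIndex(u8);

impl RegIndex {
    fn new(i: u8) -> Option<RegIndex> {
        if i < 32 { Some(RegIndex(i)) } else { None }
    }
}

// The register file can index without a bounds check here,
// because the type system already performed it at construction.
fn read_reg(regs: &[u64; 32], idx: RegIndex) -> u64 {
    regs[idx.0 as usize]
}

fn main() {
    let regs = [7u64; 32];
    let a0 = RegIndex::new(10).expect("valid register");
    assert_eq!(read_reg(&regs, a0), 7);
    assert_eq!(RegIndex::new(32), None); // cannot construct an OOB index
}
```

The same idea applies to immediates, memory addresses, and page indices: validate once at the boundary, then let the type carry the proof.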
Porting to other architectures can often reveal incorrect assumptions, sometimes with an impact on the correctness of the original architecture.
These nodes would do expensive extra validation in order to catch unknown bugs live. For example:
- run both the asm and Rust interpreters, and verify they get the same result
- run both ckb-vm and a spike backend ...
- run ckb-vm itself inside a VM and look for suspicious memory accesses