This gist has some possible formats:
This is the format from the first spike.
This is the same as the other jsonl format, but with items as escaped strings instead of inline json objects.
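As a rough illustration of the difference between the two JSONL variants, here is a minimal sketch. The `add-item` key, the item fields, and the `read_item` helper are hypothetical and are not taken from the actual formats.

```python
import json

# A hypothetical item; the field names are illustrative only.
item = {"school": "Example Primary", "head-teacher": "A. Person"}

# Inline variant: the item is embedded directly as a JSON object.
inline_line = json.dumps({"add-item": item}, sort_keys=True)

# Escaped-string variant: the item is serialised first, then embedded
# as a JSON string, so its quotes appear escaped on the line.
escaped_line = json.dumps({"add-item": json.dumps(item, sort_keys=True)}, sort_keys=True)


def read_item(line):
    """Return the item from either variant as a dict."""
    value = json.loads(line)["add-item"]
    # In the escaped variant the value is a string that needs a second parse.
    return json.loads(value) if isinstance(value, str) else value
```

One possible reason to prefer the escaped-string variant is that it carries the item's exact serialised text, so a reader could hash that text without re-serialising the object.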
This is a bit of a bodge. Normally, a TSV file has a bunch of rows, each of which represents the same kind of data. For example, you might have a `school` column, a `diocese` column, and a `head-teacher` column, and each row represents a school and has values for each of these fields.
In our use case, however, we have different types of rows which provide different types of data:
- `add-item`, which adds an item
- `append-entry`, which appends an entry with a given timestamp and item hash
- `assert-root-hash`, which asserts the current value of the Merkle tree root
We worked around this by using the generic column names `command`, `val1` and `val2`.
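A minimal sketch of how a reader might dispatch on the `command` column follows. The sample values, the assignment of timestamp to `val1` and item hash to `val2`, and the `parse_commands` helper are assumptions made for illustration, not the format's definition.

```python
import csv
import io

# Hypothetical sample data; the values and column usage are assumptions.
SAMPLE = (
    "command\tval1\tval2\n"
    'add-item\t{"school":"Example Primary"}\t\n'
    "append-entry\t2016-10-01T12:00:00Z\tsha-256:abc123\n"
    "assert-root-hash\tsha-256:def456\t\n"
)


def parse_commands(text):
    """Yield (command, fields) pairs, interpreting val1/val2 per command."""
    reader = csv.DictReader(
        io.StringIO(text, newline=""),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # keep the JSON quotes in val1 untouched
    )
    for row in reader:
        command = row["command"]
        if command == "add-item":
            yield command, {"item": row["val1"]}
        elif command == "append-entry":
            yield command, {"timestamp": row["val1"], "item_hash": row["val2"]}
        elif command == "assert-root-hash":
            yield command, {"root_hash": row["val1"]}
        else:
            raise ValueError(f"unknown command: {command!r}")
```

The sketch makes the bodge visible: the meaning of `val1` and `val2` depends on the row's command, so the column names alone say nothing about the data they hold.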
We wondered whether it makes sense to record each entry as its canonical JSON, the same form that is used in the Merkle tree. This involves including the entry number in the format (which the other options don't do).
There are some interesting questions this raises:
- should we include the entry number in the serialisation format? (we had previously said "no" to this)
- should we revisit the canonical JSON for the entry?
The canonical JSON for the entry is used in the verifiable log for computing the hash of an entry. However, the entry-number is strictly redundant here; it corresponds precisely with the position within the Merkle tree.
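The redundancy can be sketched as follows. Everything here is an assumption made for illustration: the field names, 1-based entry numbers, a canonical form with sorted keys and compact separators, and an RFC 6962 style leaf hash (SHA-256 over a `0x00` prefix plus the data). The real canonical JSON and hashing rules may differ.

```python
import hashlib
import json


def canonical_json(obj):
    """Assumed canonical form: sorted keys, no insignificant whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def leaf_hash(data):
    """Assumed RFC 6962 style leaf hash: SHA-256 of 0x00 followed by the data."""
    return hashlib.sha256(b"\x00" + data).digest()


# Entries as they might be serialised without an entry number (hypothetical fields).
entries = [
    {"timestamp": "2016-10-01T12:00:00Z", "item-hash": "sha-256:aaa"},
    {"timestamp": "2016-10-02T12:00:00Z", "item-hash": "sha-256:bbb"},
]


def with_entry_numbers(entries):
    """Rebuild the entry number purely from each entry's position in the log."""
    return [
        {**entry, "entry-number": position}
        for position, entry in enumerate(entries, start=1)
    ]


# Leaf hashes computed over canonical JSON that includes the derived number.
leaf_hashes = [leaf_hash(canonical_json(e)) for e in with_entry_numbers(entries)]
```

Because the number is derived from position alone, a serialisation that omits it loses no information, provided the entries are stored in log order.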