Skip to content

Instantly share code, notes, and snippets.

@diracdeltas
Created Apr 29, 2014
Embed
What would you like to do?
diff of ruleset.sqlite files after VACUUM
0000000: 5351 4c69 7465 2066 6f72 6d61 7420 3300 SQLite format 3.
-0000010: 0400 0101 0040 2020 0000 00d7 0000 1032 .....@ .......2
-0000020: 0000 0000 0000 0000 0000 00b8 0000 0004 ................
+0000010: 0400 0101 0040 2020 0000 00df 0000 1032 .....@ .......2
+0000020: 0000 0000 0000 0000 0000 00bf 0000 0004 ................
0000030: 0000 0000 0000 0000 0000 0001 0000 0000 ................
0000040: 0000 0000 0000 0000 0000 0000 0000 0000 ................
-0000050: 0000 0000 0000 0000 0000 0000 0000 00d7 ................
+0000050: 0000 0000 0000 0000 0000 0000 0000 00df ................
0000060: 002d e604 0d00 0000 0302 c500 0389 031d .-..............
0000070: 02c5 0000 0000 0000 0000 0000 0000 0000 ................
0000080: 0000 0000 0000 0000 0000 0000 0000 0000 ................
@NacreData

This comment has been minimized.

Copy link

@NacreData NacreData commented Apr 29, 2014

http://sqlite.org/fileformat.html section 1.2
If I'm reading correctly (I kinda have to squint to read hex, not my native language…), the differences are:

  • Size of the database file in pages. The "in-header database size".
  • The schema format number. Supported schema formats are 1, 2, 3, and 4.
  • SQLITE_VERSION_NUMBER
@diracdeltas

This comment has been minimized.

Copy link
Owner Author

@diracdeltas diracdeltas commented Apr 29, 2014

I think you need to look at https://www.sqlite.org/fileformat2.html for sqlite3. I get:

  • offset 27 - part of "File Change Counter"
  • offset 43 - part of "Schema cookie"
  • offset 95 - part of "version valid-for number" (which should usually be the same as the value of "File Change Counter" according to the docs)

I'm guessing (based on the name) that the Schema cookie is initialized with a random value, so it ends up being non-deterministic. It's unclear to me whether the File Change Counter values are different because the sqlite writes are non-deterministic or because they're also being initialized to non-deterministic values. Either way, it seems that it would be fine to hard-code these values in the generated sqlite file.

@NacreData

This comment has been minimized.

Copy link

@NacreData NacreData commented Apr 29, 2014

Sounds like you got what you needed, good sleuthing! What's the goal? Seems from context perhaps you are producing checksum/hash verifiable SQLite files for distribution perhaps?

@diracdeltas

This comment has been minimized.

Copy link
Owner Author

@diracdeltas diracdeltas commented Apr 29, 2014

@NacreData This is part of the build process for HTTPS Everywhere, which is part of the Tor Browser Bundle. We want TBB builds to be fully deterministic (within a VM) to catch certain kinds of attacks. See blog post from Mike Perry: https://blog.torproject.org/blog/deterministic-builds-part-one-cyberwar-and-global-compromise

@NacreData

This comment has been minimized.

Copy link

@NacreData NacreData commented Apr 29, 2014

Cool. Makes sense. Thanks for working on that.

@jsha

This comment has been minimized.

Copy link

@jsha jsha commented Sep 3, 2014

Also related: original ticket (https://trac.torproject.org/projects/tor/ticket/11630#comment:20) and pull request with a fix for one broken behavior (unsorted input): https://github.com/EFForg/https-everywhere/pull/534/files

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment