Skip to content

Instantly share code, notes, and snippets.

@diracdeltas
Created April 29, 2014 17:54
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save diracdeltas/5af9bb793ce164b20c78 to your computer and use it in GitHub Desktop.
Save diracdeltas/5af9bb793ce164b20c78 to your computer and use it in GitHub Desktop.
diff of ruleset.sqlite files after VACUUM
0000000: 5351 4c69 7465 2066 6f72 6d61 7420 3300 SQLite format 3.
-0000010: 0400 0101 0040 2020 0000 00d7 0000 1032 .....@ .......2
-0000020: 0000 0000 0000 0000 0000 00b8 0000 0004 ................
+0000010: 0400 0101 0040 2020 0000 00df 0000 1032 .....@ .......2
+0000020: 0000 0000 0000 0000 0000 00bf 0000 0004 ................
0000030: 0000 0000 0000 0000 0000 0001 0000 0000 ................
0000040: 0000 0000 0000 0000 0000 0000 0000 0000 ................
-0000050: 0000 0000 0000 0000 0000 0000 0000 00d7 ................
+0000050: 0000 0000 0000 0000 0000 0000 0000 00df ................
0000060: 002d e604 0d00 0000 0302 c500 0389 031d .-..............
0000070: 02c5 0000 0000 0000 0000 0000 0000 0000 ................
0000080: 0000 0000 0000 0000 0000 0000 0000 0000 ................
@NacreData
Copy link

http://sqlite.org/fileformat.html section 1.2
If I'm reading correctly (I kinda have to squint to read hex, not my native language…), the differences are:

  • Size of the database file in pages. The "in-header database size".
  • The schema format number. Supported schema formats are 1, 2, 3, and 4.
  • SQLITE_VERSION_NUMBER

@diracdeltas
Copy link
Author

I think you need to look at https://www.sqlite.org/fileformat2.html for sqlite3. I get:

  • offset 27 - part of "File Change Counter"
  • offset 43 - part of "Schema cookie"
  • offset 95 - part of "version valid-for number" (which should usually be the same as the value of "File Change Counter" according to the docs)

I'm guessing (based on the name) that the Schema cookie is initialized with a random value, so it ends up being non-deterministic. It's unclear to me whether the File Change Counter values are different because the sqlite writes are non-deterministic or because they're also being initialized to non-deterministic values. Either way, it seems that it would be fine to hard-code these values in the generated sqlite file.

@NacreData
Copy link

Sounds like you got what you needed, good sleuthing! What's the goal? Seems from context perhaps you are producing checksum/hash verifiable SQLite files for distribution perhaps?

@diracdeltas
Copy link
Author

@NacreData This is part of the build process for HTTPS Everywhere, which is part of the Tor Browser Bundle. We want TBB builds to be fully deterministic (within a VM) to catch certain kinds of attacks. See blog post from Mike Perry: https://blog.torproject.org/blog/deterministic-builds-part-one-cyberwar-and-global-compromise

@NacreData
Copy link

Cool. Makes sense. Thanks for working on that.

@jsha
Copy link

jsha commented Sep 3, 2014

Also related: original ticket (https://trac.torproject.org/projects/tor/ticket/11630#comment:20) and pull request with a fix for one broken behavior (unsorted input): https://github.com/EFForg/https-everywhere/pull/534/files

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment