Last active October 30, 2015 13:56
1/ PDBs informed by transactions and "stable storage" (@uwaterloo CS454 '86), Alice Pascal '85 (HT @bradtem), C#'88.
2/ I wanted MS C++ to have a persistent program database (and fast incremental builds). PDBs were a step by step way to sneak that in.
3/ MS C/C++ and then VC++ build speeds were slow, especially with windows.h growing and growing.
4/ So C7:VC1/2/4 added precompiled headers (critical heap save/reload/move was skunkworks) and PDBs.
5/ First built MSF -- transacted multistream file (filesystem in a file). Update/append new data to streams, commit all chgs w/ one write.
6/ Transactions critical to making PDBs robust. You can interrupt compile at any instant and PDB is always consistent.
7/ This is critical to making incremental compile tools. PDBs updated hundreds of times (e.g. a database) must stay consistent.
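A minimal Python sketch of the one-write commit idea in 5/ to 7/. It assumes a copy-on-write page store with a single root pointer; the real MSF on-disk layout is not described in this thread, and `TransactedFile`, `append`, `commit` and `crash` are hypothetical names used only for illustration.

```python
class TransactedFile:
    """Toy 'filesystem in a file': named streams, copy-on-write, one-write commit."""

    def __init__(self):
        self.pages = [{}]     # append-only page store; page 0 is an empty directory
        self.root = 0         # index of the committed directory page (the "header")
        self.pending = None   # uncommitted directory: stream name -> page index

    def read(self, name, committed_only=False):
        if committed_only or self.pending is None:
            directory = self.pages[self.root]
        else:
            directory = self.pending
        index = directory.get(name)
        return b"" if index is None else self.pages[index]

    def append(self, name, data):
        # Copy-on-write: new stream contents go to a fresh page,
        # so pages reachable from the committed root are never modified.
        if self.pending is None:
            self.pending = dict(self.pages[self.root])
        self.pages.append(self.read(name) + data)
        self.pending[name] = len(self.pages) - 1

    def commit(self):
        if self.pending is None:
            return
        self.pages.append(self.pending)   # new directory goes to a fresh page too
        self.root = len(self.pages) - 1   # the single write that publishes all changes
        self.pending = None

    def crash(self):
        # Simulates interrupting the compile: uncommitted state is lost,
        # and the old root still points at a consistent set of pages.
        self.pending = None
```

Because nothing reachable from the old root is overwritten, an interruption at any point before the root update leaves the previous consistent state visible.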
8/ So atop MSF implemented various services persisted as streams.
9/ First one, typeserver, provides stable persistent mapping from CodeView type record graphs to dense integer type indices.
10/ When front end opens PDB, loads serialized type records from types stream. As it emits CV info it looks up type record graphs bottom up
11/ e.g. it does hash cons'ing. Only new derived types are added -- pre-existing types (from earlier compiles) already have type indices.
12/ Thus type records / graphs are shared and stored exactly once in the shared PDB. The OBJ files refer to the PDB and its type indices.
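The hash cons'ing described in 9/ to 12/ can be sketched roughly as follows, in Python. This is an illustration under assumptions, not the actual typeserver: records are modeled as tuples whose operands are already-assigned type indices, indices start at 0, and `TypeServer`, `intern` and `count` are hypothetical names.

```python
class TypeServer:
    """Toy typeserver: maps type records to dense integer type indices."""

    def __init__(self):
        self._index_of = {}   # type record -> type index
        self._records = []    # type index -> type record

    def intern(self, kind, *operands):
        # Operands are type indices (or plain names) that were interned earlier,
        # so a type graph is looked up bottom up, one record at a time.
        record = (kind, *operands)
        index = self._index_of.get(record)
        if index is None:
            # Only a record never seen before gets a new index.
            index = len(self._records)
            self._records.append(record)
            self._index_of[record] = index
        return index

    def count(self):
        return len(self._records)


ts = TypeServer()
t_int = ts.intern("int")
p_int = ts.intern("pointer", t_int)
# A later compile that emits the same types gets the same indices back.
assert ts.intern("pointer", ts.intern("int")) == p_int
```

Since structurally equal records always map to the same index, each type is stored once no matter how many compiles emit it.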
13/ Uniqifying / packing type info this way, incrementally across compiles, dramatically reduced total debug info written and fixed cvpack
14/ (cvpack was a v slow link time activity of eliminating redundant debug info in the exe via RAM intensive graph isomorphism discovery)
15/ w/ OBJs' types in typeserver in PDB, worst cvpack perf was fixed. Then eliminated cvpack entirely, merging ref'd PDBs' types during link
16/ By VC2 added unique names map, per module symbols, line no maps, etc. to PDB enabling incremental linking.
17/ On full link, build PDB of all modules' debug info. Then when an OBJ is recompiled, subtract its old debug info, and add its new info.
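The subtract-then-add update in 17/ might look something like this Python sketch. It assumes debug info can be modeled as a per-module symbol table merged into one global view; `DebugDatabase`, `full_link` and `relink` are hypothetical names, not the real PDB interfaces.

```python
class DebugDatabase:
    """Toy PDB: remembers each module's contribution so it can be subtracted later."""

    def __init__(self):
        self.by_module = {}   # module name -> {symbol: line number}
        self.symbols = {}     # merged view: symbol -> (module name, line number)

    def _subtract(self, module):
        for symbol in self.by_module.pop(module, {}):
            # Only remove entries that this module actually owns.
            if self.symbols.get(symbol, (None, None))[0] == module:
                del self.symbols[symbol]

    def _add(self, module, info):
        self.by_module[module] = dict(info)
        for symbol, line in info.items():
            self.symbols[symbol] = (module, line)

    def full_link(self, modules):
        self.by_module.clear()
        self.symbols.clear()
        for module, info in modules.items():
            self._add(module, info)

    def relink(self, module, info):
        # Cost is proportional to the one recompiled module, not the whole program.
        self._subtract(module)
        self._add(module, info)
```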
18/ My colleague @ricom and I would joke "twice incremental is still cheap" -- and it was. After a small edit, 5 s links instead of minutes
19/ VC2 ilink/PDB, great colleagues at this point including SteveSm, AzermK, ShankarV, and from afar, RichardS. Great QA by @Dingo, FabriceD
20/ Then in VC4 added another PDB named IDB for two incremental recompile features, incremental recompilation and minimal rebuild.
21/ incr. recompile, son of C#'88 and QuickC 2.0, checksummed source regions, skipped recompile of unchanged fn bodies, patched OBJs
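A rough Python sketch of the checksummed-regions idea in 21/. It assumes the source has already been split into named regions (one per function body) and uses SHA-256 purely for convenience; the checksum actually used, and how OBJs were patched, are not stated in this thread. `region_checksums` and `regions_to_recompile` are hypothetical names.

```python
import hashlib


def region_checksums(regions):
    """Map each region name (e.g. a function) to a checksum of its source text."""
    return {name: hashlib.sha256(text.encode()).hexdigest()
            for name, text in regions.items()}


def regions_to_recompile(old_sums, regions):
    """Return (changed region names, new checksums to persist for next time)."""
    new_sums = region_checksums(regions)
    changed = sorted(name for name, digest in new_sums.items()
                     if old_sums.get(name) != digest)
    return changed, new_sums
```

Only the regions returned as changed need recompiling; everything else can be reused from the previous build.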
22/[redo] VC4 feature minimal rebuild complemented incr recompile. Former helps with rebuild after .h edits, latter after .cpp edits.
23/ idea of MR: remember how each src file depends on each header, and remember what each header declares. Then to rebuild after hdr edit...
24/ ... determine how the header's declarations changed -- then skip recompilation of source files which do not depend on said changes.
25/ MR is a huge win. Add a button to a form (edit form.h, included by n source files). In practice most such srcs not impacted, skipped.
26/ The trick with incremental compile systems is to not add tons of state and code to enable incr. One system I saw added ~1KB state / LOC!
27/ MR and Incr recompile rather cleverly used (reinvented) Bloom filters to represent facts about declarations and dependencies.
28/ A few KB of state *per compiland* to represent all dependencies! Skip recompile of src if empty intersection of dependencies, changes.
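A minimal Python sketch of the Bloom-filter test in 27/ and 28/. The filter size, hash count and SHA-256-based hashing are assumptions for illustration, as are the names `Bloom` and `must_recompile`; the thread does not say how the real filters were built.

```python
import hashlib


class Bloom:
    """Small Bloom filter over declaration names, stored as one integer bitset."""

    def __init__(self, bits=4096, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.mask = 0

    def _positions(self, item):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, item):
        for position in self._positions(item):
            self.mask |= 1 << position

    def may_contain(self, item):
        # False means "definitely not added"; True means "possibly added".
        return all(self.mask >> position & 1 for position in self._positions(item))


def must_recompile(dependencies, changed_declarations):
    """Recompile if any changed declaration may be one this source depends on."""
    return any(dependencies.may_contain(name) for name in changed_declarations)


# Hypothetical compiland that uses two declarations from form.h.
deps = Bloom()
deps.add("Form::title")
deps.add("Form::resize")
print(must_recompile(deps, ["Form::resize"]))
```

A false positive only causes an unnecessary recompile; a real dependency is never missed, which is what makes a compact, lossy representation safe here.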
29/ ICC, MR state persisted, incrementally updated in IDB. The teams included SimonK, DanSp, RicoM, @joncaves IIRC.
30/ After I left LBU, PDB page sizes etc. were expanded again. At this point old format "JG" PDBs became new format "DS" PDBs. Sigh!
31/ I forgot to mention, moving debug info to PDBs made it easy to ship binaries sans debug info, then reunite later. Also PDB servers etc.