@feliam
Last active October 30, 2015 13:56
1/ PDBs informed by transactions and "stable storage" (@uwaterloo CS454 '86), Alice Pascal '85 (HT @bradtem), C#'88.
2/ I wanted MS C++ to have a persistent program database (and fast incremental builds). PDBs were a step by step way to sneak that in.
3/ MS C/C++ and then VC++ build speeds were slow, especially with windows.h growing and growing.
4/ So C7:VC1/2/4 added precompiled headers (critical heap save/reload/move was skunkworks) and PDBs.
5/ First built MSF -- transacted multistream file (filesystem in a file). Update/append new data to streams, commit all chgs w/ one write.
6/ Transactions critical to making PDBs robust. You can interrupt compile at any instant and PDB is always consistent.
7/ This is critical to making incremental compile tools. PDBs updated hundreds of times (like a database) must stay consistent.
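
A toy sketch of the idea in 5/ to 7/, not the real MSF format: new data only ever goes to fresh pages, and a commit publishes a new stream directory in one step, so an interrupted update leaves the last committed state intact. The class `TinyMSF` and all of its method names are hypothetical, and an in-memory list stands in for the file.

```python
class TinyMSF:
    """Toy transacted multistream store (illustrative only, not the PDB layout)."""

    def __init__(self):
        self.pages = []        # append-only page store; stands in for the file
        self.root = {}         # committed directory: stream name -> page numbers
        self._pending = None   # uncommitted copy of the directory, if any

    def _directory(self):
        # Copy the committed directory on first write so it is never mutated.
        if self._pending is None:
            self._pending = {name: list(p) for name, p in self.root.items()}
        return self._pending

    def append(self, stream, data):
        # Data lands in a brand-new page; committed pages are never overwritten.
        self.pages.append(data)
        self._directory().setdefault(stream, []).append(len(self.pages) - 1)

    def commit(self):
        # The single "write": swap in the new directory, publishing all changes.
        if self._pending is not None:
            self.root = self._pending
            self._pending = None

    def abort(self):
        # Models an interrupted compile: the pending directory is simply lost.
        self._pending = None

    def read(self, stream):
        # Readers only ever see the committed directory.
        return b"".join(self.pages[i] for i in self.root.get(stream, []))
```

In a real file the directory swap would be one write of a root pointer; orphaned pages from an aborted update would need reclaiming, which this sketch ignores.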
8/ So atop MSF implemented various services persisted as streams.
9/ First one, typeserver, provides stable persistent mapping from CodeView type record graphs to dense integer type indices.
10/ When front end opens PDB, loads serialized type records from types stream. As it emits CV info it looks up type record graphs bottom up
11/ i.e. it does hash cons'ing. Only new derived types are added -- pre-existing types (from earlier compiles) already have type indices.
12/ Thus type records / graphs are shared and stored exactly once in the shared PDB. The OBJ files refer to the PDB and its type indices.
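
A minimal sketch of the hash consing described in 9/ to 12/, under loud assumptions: records are modeled as plain tuples whose operands are already type indices, indices start at 0, and the names `TypeServer` and `intern` are hypothetical. Real CodeView records and index numbering differ.

```python
class TypeServer:
    """Toy type server: maps each distinct type record to a dense integer index."""

    def __init__(self, records=()):
        self.records = []     # type index -> record (what the types stream would hold)
        self.index_of = {}    # record -> type index (the hash-cons table)
        for record in records:
            self.intern(record)   # reloading preserves the earlier indices

    def intern(self, record):
        # A record such as ("ptr", 0) refers to its operand by type index, so
        # types must be interned bottom up: operands first, then the derived type.
        index = self.index_of.get(record)
        if index is None:
            index = len(self.records)
            self.records.append(record)
            self.index_of[record] = index
        return index


# First compile: two new records are added.
ts = TypeServer()
t_int = ts.intern(("prim", "int"))
t_int_ptr = ts.intern(("ptr", t_int))

# Later compile: reload the stored records; known types resolve to old indices.
ts2 = TypeServer(ts.records)
same_ptr = ts2.intern(("ptr", ts2.intern(("prim", "int"))))
```

Because structurally equal records always resolve to the same index, each type graph is stored once no matter how many compilands mention it.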
13/ Uniquifying / packing type info this way, incrementally across compiles, dramatically reduced total debug info written and fixed cvpack
14/ (cvpack was a v slow link time activity of eliminating redundant debug info in the exe via RAM intensive graph isomorphism discovery)
15/ w/ OBJs' types in typeserver in PDB, worst cvpack perf was fixed. Then eliminated cvpack entirely, merging ref'd PDBs' types during link
16/ By VC2 added unique names map, per module symbols, line no maps, etc. to PDB enabling incremental linking.
17/ On full link, build PDB of all modules' debug info. Then when an OBJ is recompiled, subtract its old debug info, and add its new info.
18/ My colleague @ricom and I would joke "twice incremental is still cheap" -- and it was. After a small edit, 5 s links instead of minutes
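
The subtract-then-add update from 17/ can be sketched roughly as follows. This is an illustration only: `LinkDB`, its fields, and the idea that a module contributes a simple name-to-info mapping are assumptions, not the actual PDB structures.

```python
class LinkDB:
    """Toy per-module debug info store with a merged global view."""

    def __init__(self):
        self.per_module = {}   # module name -> {symbol name: info}
        self.globals = {}      # symbol name -> (module name, info)

    def update_module(self, module, symbols):
        # Subtract: drop everything this module contributed last time.
        for name in self.per_module.get(module, {}):
            self.globals.pop(name, None)
        # Add: record the module's new contribution.
        self.per_module[module] = dict(symbols)
        for name, info in symbols.items():
            self.globals[name] = (module, info)


db = LinkDB()
db.update_module("a.obj", {"foo": 1, "bar": 2})   # full link, module a
db.update_module("b.obj", {"baz": 3})             # full link, module b
db.update_module("a.obj", {"foo": 10, "qux": 4})  # a.obj recompiled
```

The cost of the last call depends only on the size of `a.obj`'s old and new contributions, not on the size of the whole program, which is the "twice incremental" point.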
19/ VC2 ilink/PDB, great colleagues at this point including SteveSm, AzermK, ShankarV, and from afar, RichardS. Great QA by @Dingo, FabriceD
20/ Then in VC4 added another PDB named IDB for two incremental recompile features, incremental recompilation and minimal rebuild.
21/ incr. recompile, son of C#'88 and QuickC 2.0, checksummed source regions, skipped recompile of unchanged fn bodies, patched OBJs
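
A rough sketch of the checksummed-regions idea in 21/, assuming the source has already been split into named regions (here one per function body). The splitting itself, the choice of CRC-32, and the function names are all hypothetical.

```python
import zlib


def region_checksums(regions):
    """Map each region name to a checksum of its source text."""
    return {name: zlib.crc32(text.encode("utf-8")) for name, text in regions.items()}


def changed_regions(old_sums, regions):
    """Return the regions whose checksum is new or different, plus the new sums."""
    new_sums = region_checksums(regions)
    changed = [name for name, s in new_sums.items() if old_sums.get(name) != s]
    return changed, new_sums


# Previous compile remembered these checksums.
saved = region_checksums({"f": "return 1;", "g": "return 2;"})

# After an edit to g only, f's body need not be recompiled.
to_recompile, saved = changed_regions(saved, {"f": "return 1;", "g": "return 3;"})
```

Only the regions listed in `to_recompile` would be compiled and patched into the existing OBJ.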
22/[redo] VC4 feature minimal rebuild complemented incr recompile. Former helps with rebuild after .h edits, latter after .cpp edits.
23/ idea of MR: remember how each src file depends on each header, and remember what each header declares. Then to rebuild after hdr edit...
24/ ... determine how the header's declarations changed -- then skip recompilation of source files which do not depend on said changes.
25/ MR is a huge win. Add a button to a form (edit form.h, included by n source files). In practice most such srcs not impacted, skipped.
26/ The trick with incremental compile systems is to not add tons of state and code to enable incr. One system I saw added ~1KB state / LOC!
27/ MR and Incr recompile rather cleverly used (reinvented) Bloom filters to represent facts about declarations and dependencies.
28/ A few KB of state *per compiland* to represent all dependencies! Skip recompile of src if empty intersection of dependencies, changes.
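
A minimal sketch of the skip test in 27/ and 28/, assuming each compiland's dependencies and each header edit's changes are reduced to sets of declaration names. The filter size, hash count, hash function, and the names `bloom` and `can_skip` are hypothetical; the real IDB encoding is not described in the thread.

```python
import hashlib

FILTER_BITS = 4096   # hypothetical filter size, about 512 bytes per compiland
HASHES = 2           # hypothetical number of bits set per name


def bloom(names):
    """Fold a set of declaration names into one integer used as a bit set."""
    bits = 0
    for name in names:
        digest = hashlib.sha256(name.encode("utf-8")).digest()
        for k in range(HASHES):
            h = int.from_bytes(digest[4 * k:4 * k + 4], "little")
            bits |= 1 << (h % FILTER_BITS)
    return bits


def can_skip(depends_on, changed):
    """True only when the filters share no bit, so no name can be in both sets."""
    return (depends_on & changed) == 0


# A source file that uses two declarations from form.h.
deps = bloom(["Form::okButton", "Form::title"])

# A header edit that touches something the file uses must never be skipped.
must_rebuild = not can_skip(deps, bloom(["Form::title", "Form::newButton"]))
```

The test errs only on the safe side: a shared bit may force an unnecessary recompile, but a genuinely affected source file is never skipped.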
29/ ICC, MR state persisted, incrementally updated in IDB. The teams included SimonK, DanSp, RicoM, @joncaves IIRC.
30/ After I left LBU, PDB page sizes etc. were expanded again. At this point old format "JG" PDBs became new format "DS" PDBs. Sigh!
31/ I forgot to mention, moving debug info to PDBs made it easy to ship binaries sans debug info, then reunite later. Also PDB servers etc.
@jangray