
@hackcasual
Created March 8, 2016 01:49

Emscripten Debugging System

ver -1

Proposal for debug tooling based on DWARF, purpose-designed to be easy to integrate with JavaScript and similar targets.

Scope

This document lays out a record format and several intrinsic-style functions designed to support debugging transpiled C/C++.

Goals

  1. Support viewing heap-allocated objects in a manner similar to existing C/C++ debug interfaces
  2. Support automatically populating a debugger view with variables as they come into scope
  3. Support multiple existing JavaScript debug interfaces
  4. Remain capable of some level of operation even on highly optimized/minified code
  5. Support Source Maps even on highly optimized/minified code

CyberDWARF format

A JSON record set containing type and function information:

"cyberdwarf": {
  "version": 0,
  "types": ...,
  "type_name_map": ...,
  "functions": ...,
  "vtable_offsets": ...,
  "function_name_map": ...
}

Section version

Tracks the version of the file format.

Section types

The most complicated section, this represents a type graph, extracted from LLVM's debug info metadata.

Each record is an array whose first element is the record's type enum. types is a dictionary mapping type IDs to type records.

"3":[1, "",38,"_ZTS13BoneTransform",0,0] defines the type record for type ID 3: 1 marks it as a derived type, "" means it has no name, 38 is the DWARF tag (DW_TAG_const_type, i.e. a const-qualified type), _ZTS13BoneTransform names the base type, and the last two values are the offset and size, both 0.

Type Enum   LLVM DI Subclass    Fields
0           DIBasicType         name, DWARF encoding attribute, offset, size
1           DIDerivedType       name, DWARF Tag, base type ID, offset, size
2           DICompositeType     name, identifier, tag, size, offset, [List of elements as type IDs]
3           DISubroutineType    DWARF Tag
4           DISubrange          count
5           DISubprogram        name
6           DIEnumerator        name, value
10          MDString            value
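
As a rough illustration of how a consumer might read these records, the sketch below turns a raw record array into an object with named fields, following the field order in the table above. The function name decodeTypeRecord and the property names (encoding, baseType, elements, and so on) are made up for this example and are not part of the format.

```javascript
// Field names per type enum, in the order listed in the table.
// The property names are illustrative only; the format defines positions, not names.
const RECORD_LAYOUT = {
  0: { kind: "DIBasicType", fields: ["name", "encoding", "offset", "size"] },
  1: { kind: "DIDerivedType", fields: ["name", "tag", "baseType", "offset", "size"] },
  2: { kind: "DICompositeType", fields: ["name", "identifier", "tag", "size", "offset", "elements"] },
  3: { kind: "DISubroutineType", fields: ["tag"] },
  4: { kind: "DISubrange", fields: ["count"] },
  5: { kind: "DISubprogram", fields: ["name"] },
  6: { kind: "DIEnumerator", fields: ["name", "value"] },
  10: { kind: "MDString", fields: ["value"] },
};

// Convert a raw record array into an object with named fields.
function decodeTypeRecord(record) {
  const layout = RECORD_LAYOUT[record[0]];
  if (layout === undefined) {
    throw new Error("Unknown type enum: " + record[0]);
  }
  const decoded = { kind: layout.kind };
  layout.fields.forEach((field, i) => {
    decoded[field] = record[i + 1]; // element 0 is the type enum itself
  });
  return decoded;
}

// The example record for type ID 3 quoted earlier in this section.
const example = decodeTypeRecord([1, "", 38, "_ZTS13BoneTransform", 0, 0]);
console.log(example.kind, example.tag, example.baseType);
```

Keeping the layout in a lookup table means a new record kind only needs one more entry rather than another branch in the decoder.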

Type IDs used as a base type ID or as elements of a DICompositeType may be either strings or numbers. A numeric ID refers to an entry in types directly; a string ID is a type name and is resolved through type_name_map.
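
The resolution rule could be sketched as follows. The helper names resolveTypeId and resolveType are hypothetical, and the sample types entry for ID 7 is invented for the example; only the entry for ID 3 comes from this document.

```javascript
// Resolve a type reference to a key of `types`.
// Numbers are used directly; strings are treated as type names and looked up
// in type_name_map. Returns undefined when a name is unknown.
function resolveTypeId(cyberdwarf, ref) {
  if (typeof ref === "number") {
    return String(ref); // JSON object keys are always strings
  }
  return cyberdwarf.type_name_map[ref];
}

// Resolve a type reference all the way to its record, or undefined.
function resolveType(cyberdwarf, ref) {
  const id = resolveTypeId(cyberdwarf, ref);
  return id === undefined ? undefined : cyberdwarf.types[id];
}

// Sample data: record 7 is a made-up composite type used only for illustration.
const sample = {
  types: {
    "3": [1, "", 38, "_ZTS13BoneTransform", 0, 0],
    "7": [2, "BoneTransform", "_ZTS13BoneTransform", 19, 64, 0, []],
  },
  type_name_map: { "_ZTS13BoneTransform": "7" },
};

const derived = sample.types["3"];
const base = resolveType(sample, derived[3]); // derived[3] is the base type ID
console.log(base[1]);
```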

DWARF Tag values are those taken from the DW_TAG_ constants, and the encoding attribute is similarly taken from the DW_ATE_ constants.

Section type_name_map

Lookup table for resolving types by name. Future work should look into pre-resolving this to reduce the size of the file.

"_ZTSSt20bad_array_new_length":"1631" maps the type given by name to its ID in types.

Section functions

Maps the name of a function to a dictionary of its variable names and their associated type IDs.

"_Z11ScaleMatrixR13BoneTransformRKS_f":{"result":"1","mat":"2","s":"4"}

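One way a debugger might use the functions section to populate a variables view is sketched below. The helper name listVariables is hypothetical, and the types entries in the sample data are placeholders invented for the example; only the functions entry is taken from this document.

```javascript
// List the variables recorded for a function, pairing each with its type record.
// Returns an empty array when the function has no entry.
function listVariables(cyberdwarf, functionName) {
  const vars = cyberdwarf.functions[functionName];
  if (vars === undefined) {
    return [];
  }
  return Object.entries(vars).map(([name, typeId]) => ({
    name,
    typeId,
    typeRecord: cyberdwarf.types[typeId], // undefined if the ID is not in types
  }));
}

// Sample data: the types records are placeholders, not real compiler output.
const cyberdwarf = {
  types: {
    "1": [1, "", 16, "_ZTS13BoneTransform", 0, 0],
    "2": [1, "", 16, 3, 0, 0],
    "4": [0, "float", 4, 0, 32],
  },
  functions: {
    "_Z11ScaleMatrixR13BoneTransformRKS_f": { result: "1", mat: "2", s: "4" },
  },
};

const vars = listVariables(cyberdwarf, "_Z11ScaleMatrixR13BoneTransformRKS_f");
for (const v of vars) {
  console.log(v.name, "has type ID", v.typeId);
}
```
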
Section vtable_offsets

Maps the location of a class's VTable in global memory to the VTable's mangled symbol name.

"208":"_ZTVN10__cxxabiv117__class_type_infoE" indicates that the VTable for __cxxabiv1::__class_type_info is at offset 208 in the global static section. Given a pointer or reference to a base type, the debugger can read the object's VTable pointer and look it up in this map to show the object's actual, most-derived type.

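The lookup described above might look roughly like this. It is a sketch under several assumptions: heap32 stands in for the module's 32-bit heap view, the VTable pointer is assumed to be the first word of the object, and the vptrAdjust parameter is a hypothetical knob for the case where the stored pointer does not equal the recorded VTable location. The function name dynamicTypeName is also made up.

```javascript
// Look up the VTable symbol for the object at byte address objectPtr.
// Assumptions (not confirmed by the format description):
//   - the first 32-bit word of the object is its VTable pointer
//   - vtable_offsets keys are decimal byte offsets into the same memory
//   - vptrAdjust is subtracted in case the pointer is offset from the recorded location
function dynamicTypeName(cyberdwarf, heap32, objectPtr, vptrAdjust = 0) {
  const vptr = heap32[objectPtr >> 2]; // byte address to 32-bit word index
  return cyberdwarf.vtable_offsets[String(vptr - vptrAdjust)];
}

const info = {
  vtable_offsets: { "208": "_ZTVN10__cxxabiv117__class_type_infoE" },
};

// Fake heap with one object at byte address 64 whose first word is 208.
const heap32 = new Int32Array(128);
heap32[64 >> 2] = 208;

console.log(dynamicTypeName(info, heap32, 64));
```

The returned symbol can then be demangled, or matched against type names, to pick the type record used to display the object.
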
Section minified_name_map

Contains a map from minified function name to original emitted name. Generated from the symbol map.
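
A minimal sketch of using this map to get back to an original function name, which can then be used as a key into functions. The helper name originalFunctionName and the sample entry mapping "a" are invented for the example.

```javascript
// Translate a name as it appears in the emitted JS back to the original name.
// Falls back to the given name when there is no entry (for example, in an
// unminified build), which is an assumption of this sketch.
function originalFunctionName(cyberdwarf, emittedName) {
  const map = cyberdwarf.minified_name_map || {};
  return Object.prototype.hasOwnProperty.call(map, emittedName)
    ? map[emittedName]
    : emittedName;
}

// Sample data: the minified name "a" is made up.
const debugInfo = {
  minified_name_map: { a: "_Z11ScaleMatrixR13BoneTransformRKS_f" },
};

console.log(originalFunctionName(debugInfo, "a"));
console.log(originalFunctionName(debugInfo, "_main"));
```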

Metadata intrinsics

llvm.dbg intrinsics

Based on llvm.dbg.declare and llvm.dbg.value, these intrinsics are designed to be no-ops during normal execution. While a program is being debugged, the debugger can replace them to visualize values from the running program.

metadata_llvm_dbg_value_local

Called when the value of a program variable is held in a JavaScript local variable.

Arguments
  1. Local JavaScript variable
  2. Type ID (a key into types)
  3. Offset
  4. Metadata ID for the DWARF expression used to read the variable

metadata_llvm_dbg_value_constant

Called when the variable has a known constant value.

Arguments
  1. Constant value
  2. Type ID (a key into types)
  3. Offset
  4. Metadata ID for the DWARF expression used to read the variable
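
To illustrate the no-op-then-replace idea, here is a small sketch. It assumes the intrinsics are reachable as reassignable bindings, which this proposal does not specify; attachDebugger, the log format, and the scale function standing in for compiled output are all hypothetical.

```javascript
// Default definitions: no-ops, so normal execution is unaffected.
let metadata_llvm_dbg_value_local = function (value, typeId, offset, metadataId) {};
let metadata_llvm_dbg_value_constant = function (value, typeId, offset, metadataId) {};

// Hypothetical debugger hook: replace the no-ops with functions that record
// every reported value into `log`.
function attachDebugger(log) {
  const recorder = (source) => (value, typeId, offset, metadataId) => {
    log.push({ source, value, typeId, offset, metadataId });
  };
  metadata_llvm_dbg_value_local = recorder("local");
  metadata_llvm_dbg_value_constant = recorder("constant");
}

// Stand-in for compiled output; the type and metadata IDs are made up.
function scale(s) {
  metadata_llvm_dbg_value_local(s, 4, 0, 12);
  metadata_llvm_dbg_value_constant(2, 4, 0, 13);
  return s * 2;
}

const observed = [];
scale(1.5); // no debugger attached, nothing is recorded
attachDebugger(observed);
scale(3); // both intrinsic calls are now recorded
console.log(observed);
```

Because the calls are resolved when scale runs, swapping the bindings takes effect without touching the compiled function itself.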

Location intrinsics

metadata_filelocation

Unlike the dbg intrinsics, this is not designed to end up in the final JavaScript. Instead, it should be stripped out as the last step in building the output JS, which lets the various JS optimization passes remain mostly oblivious to offset tracking.

Arguments
  1. Line number
  2. Filename
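
The final stripping step could be sketched along these lines. This assumes each marker appears on its own line as metadata_filelocation(<line>, "<file>"); which is a simplification chosen for the example, since the proposal does not fix the emitted syntax. The function name stripFileLocations and the shape of the returned locations array are also invented here.

```javascript
// Assumed marker syntax: one call per line, line number first, then filename.
const MARKER = /^\s*metadata_filelocation\((\d+),\s*"([^"]*)"\);?\s*$/;

// Remove marker lines and record which output line each marker applied to.
// Output line numbers are 1-based.
function stripFileLocations(js) {
  const output = [];
  const locations = [];
  let pending = null;
  for (const line of js.split("\n")) {
    const match = MARKER.exec(line);
    if (match) {
      pending = { sourceLine: Number(match[1]), file: match[2] };
      continue; // drop the marker itself
    }
    output.push(line);
    if (pending !== null) {
      // The marker describes the first line that follows it.
      locations.push({ outputLine: output.length, ...pending });
      pending = null;
    }
  }
  return { code: output.join("\n"), locations };
}

const input = [
  "function f(a) {",
  '  metadata_filelocation(10, "bone.cpp");',
  "  a = a | 0;",
  '  metadata_filelocation(11, "bone.cpp");',
  "  return a + 1 | 0;",
  "}",
].join("\n");

const result = stripFileLocations(input);
console.log(result.code);
console.log(result.locations);
```

The collected locations are the raw material a Source Map generator would need; producing the actual map is outside the scope of this sketch.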