Proposal for debug tooling based on DWARF, purpose-built to be easy to integrate with JavaScript and similar targets.
This document lays out a record format as well as several intrinsic-style functions designed to support debugging transpiled C/C++.
- Support viewing heap-allocated objects in a manner similar to existing C/C++ debug interfaces
- Support automatically populating a debugger view with variables as they come into scope
- Support multiple existing JavaScript debug interfaces
- Remain capable of some level of operation even on highly optimized/minified code
- Support Source Maps even on highly optimized/minified code
JSON record set containing type and function information
"cyberdwarf": {
"version": 0,
"types": ...,
"type_name_map": ...,
"functions": ...,
"vtable_offsets": ...,
"function_name_map": ...
}
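As a minimal sketch of consuming this layout, the snippet below parses a JSON string and checks that each section listed above is present. It assumes the `"cyberdwarf"` key sits inside an enclosing JSON object, which the fragment above does not show, and `loadCyberDwarf` is a hypothetical helper name rather than part of the proposal.

```javascript
// Section names taken from the record layout above.
const CYBERDWARF_SECTIONS = [
  "version",
  "types",
  "type_name_map",
  "functions",
  "vtable_offsets",
  "function_name_map",
];

// Hypothetical helper: parse the JSON text and return the "cyberdwarf" object.
// Assumption: the record is wrapped in an outer JSON object.
function loadCyberDwarf(jsonText) {
  const root = JSON.parse(jsonText);
  const record = root.cyberdwarf;
  if (record === undefined || record === null || typeof record !== "object") {
    throw new Error('missing top-level "cyberdwarf" object');
  }
  for (const section of CYBERDWARF_SECTIONS) {
    if (!(section in record)) {
      throw new Error(`missing section "${section}"`);
    }
  }
  return record;
}
```

The check is deliberately shallow: it only confirms the sections exist and leaves their contents to the consumers of each section.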
The `version` field tracks the version of the file.
`types` is the most complicated section: it represents a type graph extracted from LLVM's debug info metadata. It is a dictionary associating type IDs with type records. Each record is an array whose first element is the record type.
"3":[1, "",38,"_ZTS13BoneTransform",0,0]
defines a type record for type ID 3: record type 1 (a derived type), with no name and DWARF tag 38 (`DW_TAG_const_type`), making it a const-qualified version of the type `_ZTS13BoneTransform`, with 0 size and 0 offset.
Type Enum | LLVM DI Subclass | Fields |
---|---|---|
0 | DIBasicType | name, DWARF encoding attribute, offset, size |
1 | DIDerivedType | name, DWARF Tag, base type ID, offset, size |
2 | DICompositeType | name, identifier, tag, size, offset, [List of elements as type IDs] |
3 | DISubroutineType | DWARF Tag |
4 | DISubrange | count |
5 | DISubprogram | name |
6 | DIEnumerator | name, value |
10 | MDString | value |
Type IDs in the base type ID field or in the element list of a DICompositeType can be either strings or numbers. If a type ID is a string, it is resolved through `type_name_map`.
DWARF Tag values are taken from the `DW_TAG_` constants, and the encoding attribute is similarly taken from the `DW_ATE_` constants.
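
The table above can be turned into a small decoder. The sketch below names each array slot in the order the table lists it. The field keys (`baseType`, `encoding`, and so on) and the helper name `decodeTypeRecord` are illustrative choices, not part of the format, and only a handful of `DW_TAG_` constants from the DWARF standard are included.

```javascript
// Field names per record kind, in the order given in the table above.
const RECORD_FIELDS = {
  0: ["name", "encoding", "offset", "size"],                      // DIBasicType
  1: ["name", "tag", "baseType", "offset", "size"],               // DIDerivedType
  2: ["name", "identifier", "tag", "size", "offset", "elements"], // DICompositeType
  3: ["tag"],                                                     // DISubroutineType
  4: ["count"],                                                   // DISubrange
  5: ["name"],                                                    // DISubprogram
  6: ["name", "value"],                                           // DIEnumerator
  10: ["value"],                                                  // MDString
};

// A few DW_TAG_ constants from the DWARF standard, for display purposes.
const DW_TAG_NAMES = {
  0x0f: "DW_TAG_pointer_type",
  0x10: "DW_TAG_reference_type",
  0x16: "DW_TAG_typedef",
  0x26: "DW_TAG_const_type",
};

// Hypothetical helper: turn a raw record array into an object with named fields.
function decodeTypeRecord(record) {
  const kind = record[0];
  const fields = RECORD_FIELDS[kind];
  if (fields === undefined) {
    throw new Error(`unknown record kind ${kind}`);
  }
  const decoded = { kind };
  fields.forEach((field, i) => {
    decoded[field] = record[i + 1];
  });
  return decoded;
}

// The example record for type ID 3 from the text above.
const decodedExample = decodeTypeRecord([1, "", 38, "_ZTS13BoneTransform", 0, 0]);
console.log(decodedExample.baseType, DW_TAG_NAMES[decodedExample.tag]);
```

Keeping the field order in one lookup table means a change to the record layout only needs to be made in one place.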
`type_name_map` is a lookup table for resolving types by name. Future work should look into pre-resolving these references to reduce the size of the file.
"_ZTSSt20bad_array_new_length":"1631"
maps the type given by name to its ID in `types`.
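
A minimal sketch of the resolution step follows. It assumes that a string found in `type_name_map` is a name, and that any other value is already a type ID; the format description does not spell out the second case. The sample record is made up for illustration, apart from the type ID 3 entry quoted earlier, and `resolveTypeId` and `lookupType` are hypothetical helper names.

```javascript
// Made-up sample data; only the record for type ID 3 comes from the text above.
const sampleRecord = {
  types: {
    "3": [1, "", 38, "_ZTS13BoneTransform", 0, 0],
    "7": [2, "BoneTransform", "_ZTS13BoneTransform", 19, 0, 0, []],
  },
  type_name_map: {
    "_ZTS13BoneTransform": "7",
  },
};

// Resolve a type reference (string or number) to a key of `types`.
// Assumption: strings absent from type_name_map are already type IDs.
function resolveTypeId(cyberdwarf, ref) {
  const names = cyberdwarf.type_name_map;
  if (typeof ref === "string" && Object.prototype.hasOwnProperty.call(names, ref)) {
    return String(names[ref]);
  }
  return String(ref);
}

// Fetch the record for a type reference, or undefined if it is unknown.
function lookupType(cyberdwarf, ref) {
  return cyberdwarf.types[resolveTypeId(cyberdwarf, ref)];
}

// Follow the base type of record 3 to the composite type it refers to.
const baseRef = sampleRecord.types["3"][3];
console.log(resolveTypeId(sampleRecord, baseRef));
```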
`functions` maps the name of a function to a dictionary of variable names with their associated type IDs.
"_Z11ScaleMatrixR13BoneTransformRKS_f":{"result":"1","mat":"2","s":"4"}
`vtable_offsets` maps the location of a VTable entry in global memory to the name of the VTable entry for a class.
"208":"_ZTVN10__cxxabiv117__class_type_infoE"
indicates that the VTable for `__cxxabiv1::__class_type_info` is at offset 208 in the global static section. By examining a reference to a base type and looking up the pointer's VTable, the debugger can show the specific type of the instance.
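
The lookup described above could be sketched as follows. The sketch rests on several assumptions that the text does not confirm: memory is exposed as a 32-bit typed array view, the VTable pointer is the first word of a polymorphic object, and the stored pointer equals a key of `vtable_offsets` exactly. A real ABI may store a pointer offset from the start of the VTable, in which case an adjustment would be needed. `dynamicTypeName` is a hypothetical helper name.

```javascript
// The `vtable_offsets` entry quoted above, wrapped in a minimal record.
const vtableExample = {
  vtable_offsets: {
    "208": "_ZTVN10__cxxabiv117__class_type_infoE",
  },
};

// Hypothetical helper: find the VTable symbol for the object at objectPtr.
// Assumptions: heap32 is an Int32Array over linear memory, objectPtr is a
// 4-byte-aligned byte address, and the object's first word is its VTable pointer.
function dynamicTypeName(cyberdwarf, heap32, objectPtr) {
  const vtablePtr = heap32[objectPtr >> 2];
  const name = cyberdwarf.vtable_offsets[String(vtablePtr)];
  return name === undefined ? null : name;
}

// Fake heap with one object at byte address 64 whose VTable pointer is 208.
const fakeHeap = new Int32Array(128);
fakeHeap[64 >> 2] = 208;
console.log(dynamicTypeName(vtableExample, fakeHeap, 64));
```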
`function_name_map` contains a map from each minified function name to its original emitted name. It is generated from the symbol map.
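
As a rough illustration, a debugger could translate a minified name back to the emitted name before consulting `functions`. The minified name `b` and the helper `originalFunctionName` are made up for this sketch, and falling back to the input name when no mapping exists is an assumption for builds that are not minified.

```javascript
// Made-up sample: "b" stands in for a minified name.
const nameMapExample = {
  function_name_map: {
    b: "_Z11ScaleMatrixR13BoneTransformRKS_f",
  },
  functions: {
    "_Z11ScaleMatrixR13BoneTransformRKS_f": { result: "1", mat: "2", s: "4" },
  },
};

// Hypothetical helper: map a minified name to the original emitted name.
// Assumption: names without an entry were never minified.
function originalFunctionName(cyberdwarf, emittedName) {
  const map = cyberdwarf.function_name_map;
  return Object.prototype.hasOwnProperty.call(map, emittedName)
    ? map[emittedName]
    : emittedName;
}

const resolvedName = originalFunctionName(nameMapExample, "b");
console.log(Object.keys(nameMapExample.functions[resolvedName]));
```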
Based on `llvm.dbg.declare` and `llvm.dbg.value`, these intrinsics are designed to be no-ops during normal execution. While a program is being debugged, the debugger can replace them to visualize values from the running program.
Called when the value of a program variable is held in a JavaScript value.
- Local JavaScript variable
- Type ID for `types`
- Offset
- Metadata ID for the DWARF expression to read the variable
Called when the variable is fixed to a constant value.
- Constant value
- Type ID for `types`
- Offset
- Metadata ID for the DWARF expression to read the variable
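
The no-op-until-replaced behaviour of the two intrinsics can be sketched as below. The names `valueLocal`, `valueConstant`, `debugIntrinsics`, and `attachDebugger` are hypothetical, as is routing calls through a shared table; only the argument order follows the lists above.

```javascript
// Hypothetical intrinsic table: both entries do nothing during normal execution.
const debugIntrinsics = {
  valueLocal(localValue, typeId, offset, expressionId) {},
  valueConstant(constantValue, typeId, offset, expressionId) {},
};

// Stand-in for transpiled code that reports a local variable.
function scaleExample(s) {
  debugIntrinsics.valueLocal(s, "4", 0, 0);
  return s * 2;
}

// A debugger swaps in hooks that record each report, and can undo the swap.
function attachDebugger(intrinsics, log) {
  const original = { ...intrinsics };
  intrinsics.valueLocal = (value, typeId, offset, expressionId) => {
    log.push({ kind: "local", value, typeId, offset, expressionId });
  };
  intrinsics.valueConstant = (value, typeId, offset, expressionId) => {
    log.push({ kind: "constant", value, typeId, offset, expressionId });
  };
  return function detach() {
    Object.assign(intrinsics, original);
  };
}
```

Because the program calls through the table on every invocation, attaching and detaching take effect immediately and require no changes to the transpiled code.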
Unlike the dbg intrinsics, the offset-tracking intrinsic is not designed to end up in the final JavaScript. Instead, it should be stripped out as the last step in building the output JS. This allows a diverse set of JS optimization passes to remain mostly oblivious to offset tracking.
- Line number
- Filename
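
A final strip pass could look roughly like the sketch below. The marker spelling `__debug_offset(line, "filename");` is invented for illustration, since the concrete syntax is not given here. The pass removes each marker and records the position in the stripped output where it stood, together with the line number and filename it carried.

```javascript
// Hypothetical marker syntax: __debug_offset(<line>, "<filename>");
function stripOffsetMarkers(js) {
  const marker = /__debug_offset\((\d+),\s*"([^"]*)"\);?/g;
  const offsets = [];
  let code = "";
  let last = 0;
  for (const m of js.matchAll(marker)) {
    // Copy the text before the marker, then note where the marker stood.
    code += js.slice(last, m.index);
    offsets.push({ offset: code.length, line: Number(m[1]), filename: m[2] });
    last = m.index + m[0].length;
  }
  code += js.slice(last);
  return { code, offsets };
}

const stripped = stripOffsetMarkers(
  'a=1;__debug_offset(3,"x.cpp");b=2;__debug_offset(4,"x.cpp");c=3;'
);
console.log(stripped.code);
console.log(stripped.offsets);
```

The recorded offsets refer to positions in the stripped output, so they stay valid only if this pass really is the last transformation applied to the file.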