My background: I am not an expert on debuggers or embedded systems.
Scope: GNU toolchain. What code does gcc generate? How does gdb interpret it?
END GOAL: An embedded developer should have an intuition for how call flow, control flow and data flow are presented by the debugger.
Many people have described DWARF and how to debug optimized code, but I haven't found an article that gives practical examples of debugging sessions with optimized code.
- Michael J. Eager's Introduction to the DWARF Debugging Format
- The DWARF Debugging Information Format Version 5
- Li, Y., Ding, S., Zhang, Q., & Italiano, D. (2020, June). Debug information validation for optimized code. In PLDI (pp. 1052-1065).
- Hennessy, John. "Symbolic debugging of optimized code." ACM Transactions on Programming Languages and Systems (TOPLAS) 4.3 (1982): 323-344
- Sami Al Bahra's !!Con 2016 presentation Debugging Debuggers
- Eli Bendersky's 6-part series on Debuggers
- Djordje Todorovic's Triplefault presentation about Recovering optimized out variables by finding parameter values in parent frame
- Alexandre Oliva's GCC gOlogy: studying the impact of optimizations on debugging
- Greg Law's CppCon presentation Under the hood of Linux C++ debugging tools
- How to Update Debug Info: A Guide for LLVM Pass Authors
- Use some common library such as newlib or libopencm3 and inspect various functions with optimizations disabled and enabled.
- Present the internals of DWARF in a readable format using Eli Bendersky's pyelftools.
- What optimization level to choose? Should I discuss anything besides -Os? A great advantage of -Os is that the generated code is far easier for a human to read than with -O2.
- Should I run the code on target? A Nucleo board, Renode, or just Compiler Explorer?
- How much knowledge of assembly to expect?
- What embedded-specific things are worth mentioning?
- What is the overall structure of the debug information in an ELF file?
- A tree of Debugging Information Entries (DIEs)
- Try not to get too bogged down in internals of the file format.
- Touched on in https://interrupt.memfault.com/blog/gnu-binutils#dumping-dwarf-information
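The DIE tree mentioned above could be printed in a readable, indented form with a small walker like the one below. This is only a sketch: it relies on the objects having `tag`, `attributes` and `iter_children()`, which is how pyelftools DIE objects look (you would pass in `cu.get_top_DIE()`), but the `fake_die` helper and the sample tree are made-up stand-ins so the example runs without an ELF file.

```python
from types import SimpleNamespace


def die_name(die):
    """Return the DW_AT_name of a DIE as text, or a placeholder if absent."""
    attr = die.attributes.get("DW_AT_name")
    if attr is None:
        return "<anonymous>"
    value = attr.value
    # pyelftools stores string attributes as bytes.
    return value.decode() if isinstance(value, bytes) else str(value)


def dump_tree(die, depth=0, out=None):
    """Collect one indented line per DIE, depth-first."""
    out = [] if out is None else out
    out.append("  " * depth + f"{die.tag} {die_name(die)}")
    for child in die.iter_children():
        dump_tree(child, depth + 1, out)
    return out


def fake_die(tag, name=None, children=()):
    """Hypothetical stand-in for a real DIE, for illustration only."""
    attrs = {}
    if name is not None:
        attrs["DW_AT_name"] = SimpleNamespace(value=name.encode())
    return SimpleNamespace(
        tag=tag, attributes=attrs, iter_children=lambda: iter(children)
    )


# Made-up compilation unit: one function with a parameter, a local
# variable and an unnamed lexical block.
CU = fake_die("DW_TAG_compile_unit", "main.c", [
    fake_die("DW_TAG_subprogram", "main", [
        fake_die("DW_TAG_formal_parameter", "argc"),
        fake_die("DW_TAG_variable", "i"),
        fake_die("DW_TAG_lexical_block"),
    ]),
])

TREE = dump_tree(CU)
print("\n".join(TREE))
```

The nesting in the output mirrors the scoping in the source: parameters and locals hang under their function, which hangs under its compilation unit.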
- .debug_line for mapping PC to src line
- address->src is a N:1 mapping
- src->address is a 1:N mapping
- compiler explorer uses the dwarf .debug_line data for coloring lines
- readelf -wL
- dwarfdump -l
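To make the N:1 and 1:N mappings concrete, here is a toy model of a decoded line table. The rows are invented and file names are ignored for brevity. The optional `rows_from_elf` helper assumes pyelftools is installed and is not exercised by the example itself.

```python
from bisect import bisect_right
from collections import defaultdict


def rows_from_elf(path):
    """Yield (address, line) rows from .debug_line (needs pyelftools)."""
    from elftools.elf.elffile import ELFFile  # imported lazily on purpose

    with open(path, "rb") as f:
        dwarf = ELFFile(f).get_dwarf_info()
        for cu in dwarf.iter_CUs():
            for entry in dwarf.line_program_for_CU(cu).get_entries():
                state = entry.state
                if state is None or state.end_sequence:
                    continue
                yield state.address, state.line


def build_maps(rows):
    """Return (line_for(pc), {line: [addresses]}) for a list of rows."""
    rows = sorted(rows)
    addrs = [addr for addr, _ in rows]
    line_to_addrs = defaultdict(list)
    for addr, line in rows:
        line_to_addrs[line].append(addr)

    def line_for(pc):
        # The row that covers pc is the last one starting at or before it.
        i = bisect_right(addrs, pc) - 1
        return rows[i][1] if i >= 0 else None

    return line_for, dict(line_to_addrs)


# Invented rows: line 11 (say, a loop condition) was emitted twice,
# once before and once after the loop body on line 12.
ROWS = [(0x100, 10), (0x104, 11), (0x108, 12), (0x10C, 11), (0x110, 13)]
line_for, line_to_addrs = build_maps(ROWS)

print(line_for(0x106))                      # address -> one line
print([hex(a) for a in line_to_addrs[11]])  # line -> several addresses
```

A breakpoint on line 11 therefore has two candidate addresses, which is one reason a single source breakpoint can show up as several locations.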
- .debug_frame
- .eh_frame
- The entry point of a function is important: it is where you want to place a breakpoint
- You want to have a mapping [addr_low, addr_high] => function name
- There's backtrace(3) which relies on .dynsym symbols
- can't use .eh_frame for unwinding
- you need to export functions with -rdynamic
- inlined functions have no stack frames
- tail-call optimization replaces the caller's stack frame with the callee's
- So you need dwarf info for proper backtraces.
- Many debuggers can link variables to their location on the stack, but I feel there should be better visualizations of the stack.
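The [addr_low, addr_high] => function name mapping, together with the point that inlined functions have no stack frames of their own, can be sketched like this. The `Scope` class, the addresses and the function names are all made up. The nested entries play the role of DW_TAG_inlined_subroutine DIEs, and ranges are treated as half-open, with the high address being the first one past the function.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    """A function, or an inlined call nested inside one (illustrative)."""
    name: str
    low_pc: int
    high_pc: int  # exclusive: first address past the last instruction
    inlined: list = field(default_factory=list)

    def contains(self, pc):
        return self.low_pc <= pc < self.high_pc


def frames_at(pc, functions):
    """Names a debugger could show for one physical frame, innermost first."""
    for fn in functions:
        if not fn.contains(pc):
            continue
        chain = [fn.name]
        scope = fn
        while True:
            # Descend into whichever inlined call covers pc, if any.
            inner = next((s for s in scope.inlined if s.contains(pc)), None)
            if inner is None:
                break
            chain.append(inner.name)
            scope = inner
        return chain[::-1]
    return []


# Made-up layout: clamp() was inlined into main().
FUNCS = [
    Scope("main", 0x200, 0x240, inlined=[Scope("clamp", 0x210, 0x220)]),
    Scope("isr_handler", 0x240, 0x260),
]

print(frames_at(0x214, FUNCS))  # inside the inlined body
print(frames_at(0x230, FUNCS))  # main proper
```

A symbol-table-only unwinder would report just `main` for 0x214. The extra `clamp` entry exists only because the debug information records the inlined range.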
- .debug_info
- Describe how the debug information can find if a variable is a constant, lives in memory or in a register.
- Local variables: DW_TAG_variable
- Parameters: DW_TAG_formal_parameter
- Segger Ozone shows if a variable is in memory, register or is a compile constant.
- It would be nice to have a tool that showed a marker in the left column for where a variable was defined when hovering. Maybe I can write something along those lines using pyelftools.
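How a debugger decides whether a variable is a constant, in a register or in memory can be modelled as a lookup in a location list keyed by the program counter. The entries below are invented. In real DWARF they would be expressions such as DW_OP_reg0 or DW_OP_fbreg, or a DW_AT_const_value attribute, and this sketch makes no attempt to decode those.

```python
# Each entry: (low_pc, high_pc_exclusive, kind, where). All values made up.
LOCLIST_X = [
    (0x100, 0x108, "register", "r0"),       # freshly loaded
    (0x108, 0x120, "memory", "fbreg-8"),    # spilled to the stack frame
    (0x120, 0x128, "constant", 42),         # folded to a known value
]


def describe(pc, loclist):
    """Say where a variable lives at pc, the way a debugger might."""
    for low, high, kind, where in loclist:
        if low <= pc < high:
            return f"{kind}: {where}"
    # No entry covers pc: the value is not recoverable here.
    return "<optimized out>"


for pc in (0x104, 0x110, 0x124, 0x130):
    print(hex(pc), describe(pc, LOCLIST_X))
```

The gaps are what matter in practice: a variable that is in scope in the source can still be unavailable at a given instruction if no location entry covers it.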
A compiler may split a function into a hot and cold part. Happens a lot with JIT compilers and I guess with whole program optimization. But is it worth bringing it up here?
- gdb pretends that the call site and the start of the inlined function are different instructions.
- Stepi and nexti always show the inlined body though.
- Setting breakpoints at the call site of an inlined function may not work.
- gdb may fail to locate the return value of inlined calls after using the finish command.
- Do different IDEs have different ways of displaying inlined functions?
- If code has been reordered you may find the debugger jumping back and forth
- Show simple transformations that compilers do to loops: the canonical do-while form
- Are there any clever ways of visualizing that, which I haven't heard of?
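The do-while form mentioned above can be illustrated by writing the same loop twice. This shows the shape of the transformation only, not actual gcc output, and since Python has no do-while the rotated version emulates one with `while True` and `break`.

```python
def sum_while(n):
    """The loop as written in the source: test at the top."""
    total, i = 0, 0
    while i < n:
        total += i
        i += 1
    return total


def sum_rotated(n):
    """The same loop in guarded do-while form: test at the bottom."""
    total, i = 0, 0
    if i < n:                # guard: the loop test, run once up front
        while True:          # do {
            total += i
            i += 1
            if not (i < n):  # } while (i < n);
                break
    return total


for n in (0, 1, 5):
    print(n, sum_while(n), sum_rotated(n))
```

In the rotated form the condition sits after the body, so single-stepping can appear to visit the loop header line after the body line, which is one source of the jumping back and forth noted above.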
- Some things hinder the compiler from optimizing, such as pointers, volatile and compiler barriers.
- Discuss some tradeoffs.
- More aggressive inlining
- More abstractions to see through
- operator overloading
- You may want to skip certain functions in your standard library when stepping
- The joys of vtable overwrites (most likely out of scope)
- Clang and GCC differ.
- Mention something about differences between DWARF versions 3, 4 and 5?
- -g vs -g3
- dwarfdump
- readelf
- pyelftools
- Compare some IDEs?
- .debug_abbrev: Abbreviations used in the .debug_info section
- .debug_aranges: Lookup table for mapping addresses to compilation units
- .debug_frame: Call frame information
- .debug_info: The core DWARF information section
- .debug_line: Line number information
- .debug_loc: Location lists used in DW_AT_location attributes
- .debug_macinfo: Macro information
- .debug_pubnames: Lookup table for mapping object and function names to compilation units
- .debug_pubtypes: Lookup table for mapping type names to compilation units
- .debug_ranges: Address ranges used in DW_AT_ranges attributes
- .debug_str: String table used in .debug_info