@dannas
Last active September 6, 2020 17:58
Tentative outline for debugging optimized code.

My background: Not an expert on debuggers nor embedded systems.

Scope: GNU toolchain. What code does gcc generate? How does gdb interpret it?

END GOAL: An embedded developer should have an intuition for how call flow, control flow and data flow are presented by the debugger.

Prior Art

Many people have described DWARF and how to debug optimized code, but I haven't found an article that gives practical examples of debugging sessions with optimized code.

Approach

  • Use some common library such as newlib or libopencm3 and inspect various functions with optimizations disabled and enabled.
  • Present the internals of DWARF in a readable format using Eli Bendersky's pyelftools.

Questions

  • What optimization level to choose? Should I discuss anything besides -Os? A great advantage of -Os is that it's far easier for a human to read the generated code than with -O2.
  • Should I run the code on target? A Nucleo board, Renode, or just Compiler Explorer?
  • How much knowledge of assembly to expect?
  • Which embedded-specific things are worth mentioning?

DWARF

How do I map my source code to the assembly and vice versa?

  • .debug_line for mapping PC to src line
  • address->src is a N:1 mapping
  • src->address is a 1:N mapping
  • Compiler Explorer uses the DWARF .debug_line data for coloring lines
  • readelf -wL
  • dwarfdump -l
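To make the two mapping directions concrete, here is a minimal sketch in C of the lookups a decoded line table supports. The rows, addresses and function names are all made up for illustration. A real table (as printed by readelf -wL) also carries file, column and end-of-sequence information that this sketch ignores.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of a decoded line table: the instructions starting at `addr`
 * were generated from source line `line`. */
struct line_row {
    uint32_t addr;
    unsigned line;
};

/* Hypothetical rows for one optimized function, sorted by address.
 * Line 12 appears twice, as if its instructions had been split up
 * by the instruction scheduler. */
static const struct line_row rows[] = {
    { 0x100, 10 },
    { 0x104, 12 },
    { 0x108, 11 },
    { 0x10c, 12 },
    { 0x112, 13 },
};
static const size_t nrows = sizeof rows / sizeof rows[0];

/* address -> src (N:1): the last row at or below pc decides the line.
 * Returns 0 if pc lies before the first row. A real table would also
 * mark where the sequence ends; this sketch does not. */
unsigned line_for_pc(uint32_t pc)
{
    unsigned line = 0;
    for (size_t i = 0; i < nrows && rows[i].addr <= pc; i++)
        line = rows[i].line;
    return line;
}

/* src -> address (1:N): collect the start address of every row for `line`.
 * Stores at most `cap` addresses in `out`, returns the total match count. */
size_t addrs_for_line(unsigned line, uint32_t *out, size_t cap)
{
    size_t n = 0;
    for (size_t i = 0; i < nrows; i++) {
        if (rows[i].line == line) {
            if (n < cap)
                out[n] = rows[i].addr;
            n++;
        }
    }
    return n;
}
```

With these rows, line 12 resolves to two addresses, which is the situation where a debugger has to place a breakpoint at more than one location for a single source line.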

How does a debugger generate a backtrace?

  • .debug_frame
  • .eh_frame
  • The entry point of a function is important: that is where you want to place a breakpoint
  • You want to have a mapping [addr_low, addr_high] => function name
  • There's backtrace(3) which relies on .dynsym symbols
    • can't use .eh_frame for unwinding
    • you need to export functions with -rdynamic
    • inlined functions have no stack frames
    • tail-call optimization replaces one stack frame with another
  • So you need dwarf info for proper backtraces.
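A small test case for the two frame-related points above could look like the sketch below. The function names are hypothetical, and the attributes are GCC/Clang extensions used only to pin down which calls get inlined. Whether the tail call actually becomes a jump depends on the compiler, target and optimization level.

```c
/* always_inline: the body is copied into the caller, so there is no call
 * instruction and no stack frame for scale() at run time. */
static inline __attribute__((always_inline)) int scale(int x)
{
    return x * 3;
}

/* noinline keeps leaf() as a real function with its own frame. */
__attribute__((noinline)) int leaf(int x)
{
    return scale(x) + 1;
}

/* The call to leaf() is in tail position. With sibling-call optimization
 * the compiler may emit a jump instead of a call, and middle()'s frame
 * is then reused by leaf(). */
__attribute__((noinline)) int middle(int x)
{
    return leaf(x + 1);
}

/* Not a tail call: the result of middle() is still needed for the +1. */
__attribute__((noinline)) int top(int x)
{
    return middle(x) + 1;
}
```

Stopping inside leaf() in an -Os build and asking for a backtrace is the interesting experiment. A frame-walking unwinder can only report frames that physically exist, so I would expect middle() to be missing and scale() never to show up on its own. A DWARF-aware debugger can still present scale() as an inlined frame, and may be able to reconstruct middle() if the compiler emitted call-site information.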
  • Many debuggers can link variables to their location on the stack, but I feel there should be better visualizations of the stack.

How does a debugger know where a variable lives and how does it deal with relabelling?

  • .debug_info
  • Describe how the debug information tells the debugger whether a variable is a constant, lives in memory or lives in a register.
  • Local variables: DW_TAG_variable
  • Parameters: DW_TAG_formal_parameter
  • Segger Ozone shows whether a variable is in memory, in a register or is a compile-time constant.
  • Would be nice to have a tool that showed a marker in the left column for where a variable was defined when hovering. Maybe I can write something along those lines using pyelftools
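A possible specimen for studying the three kinds of location is sketched below. The function and variable names are made up, and the comments describe what I would expect from an optimizing build, not what any particular compiler version is guaranteed to produce.

```c
#include <stddef.h>
#include <stdint.h>

/* noinline, so the caller really has to pass the address of its local. */
__attribute__((noinline)) void accumulate(uint32_t *acc, uint32_t v)
{
    *acc += v;
}

uint32_t checksum(const uint8_t *buf, size_t len)
{
    /* Candidate for a compile-time constant: the value can be folded into
     * the initialization of sum, leaving no storage for seed at all. */
    const uint32_t seed = 17;

    /* The address of sum escapes to accumulate(), so it most likely
     * needs a stack slot, i.e. a memory location. */
    uint32_t sum = seed;

    /* The loop counter is a typical candidate for living in a register. */
    for (size_t i = 0; i < len; i++)
        accumulate(&sum, buf[i]);

    return sum;
}
```

Compiling this with -Os -g and dumping .debug_info (readelf -wi) should show a DW_TAG_variable entry per local and a DW_TAG_formal_parameter entry per parameter. In gdb, info scope checksum and info address sum report how the debugger interprets those entries.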

Outlining

A compiler may split a function into a hot and cold part. Happens a lot with JIT compilers and I guess with whole program optimization. But is it worth bringing it up here?

Inlining

  • gdb pretends that the call site and the start of the inlined function are different instructions.
  • stepi and nexti always show the inlined body though.
  • Setting breakpoints at the call site of an inlined function may not work.
  • Gdb may fail to locate the return value of inlined calls after using the finish command.
  • Do different IDEs have different ways of displaying inlined functions?
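A minimal function pair for trying out these points might look like the sketch below. The names are hypothetical, and whether clamp() is actually inlined is up to the compiler, although a small static inline function is a likely candidate at -Os or -O2.

```c
/* Small helper that an optimizing build will most likely inline. */
static inline int clamp(int v, int lo, int hi)
{
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}

int clamp_sum(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        /* Call site: once clamp() is inlined there is no call instruction
         * here, only the compare and select code of its body. */
        sum += clamp(v[i], 0, 100);
    }
    return sum;
}
```

Things worth trying in gdb on an optimized build: step into clamp() from the loop, look at the backtrace and at info frame while inside it, and then use finish to see whether a return value is reported.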

Reordering

  • If code has been reordered you may find the debugger jumping back and forth
  • Show simple transformations that compilers do to loops: the canonical do-while form
  • Are there any clever ways of visualizing that, which I haven't heard of?
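The do-while form can be illustrated by writing the transformation out by hand. The second function below is my approximation of what a loop looks like after rotation, not actual compiler output, and both function names are made up.

```c
/* The loop as written in the source. */
int sum_while(const int *v, int n)
{
    int sum = 0;
    int i = 0;
    while (i < n) {
        sum += v[i];
        i++;
    }
    return sum;
}

/* The same loop in guarded do-while form: the condition is tested once
 * up front and then again at the bottom of every iteration. */
int sum_rotated(const int *v, int n)
{
    int sum = 0;
    int i = 0;
    if (i < n) {              /* guard, executed once */
        do {
            sum += v[i];
            i++;
        } while (i < n);      /* loop test moved to the bottom */
    }
    return sum;
}
```

In the rotated form the code for the loop condition exists in two places but still belongs to the single source line holding the while. That is one reason the debugger seems to visit the loop header both before the body and after it when stepping.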

Volatile

  • Some things hinder the compiler from optimizing, such as pointers, volatile and compiler barriers.
  • Discuss some tradeoffs.
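As a starting point for that discussion, here is a sketch with one volatile object and one compiler barrier. status_reg and shared_counter are hypothetical stand-ins for a memory-mapped register and a variable updated from an interrupt handler. The empty asm statement with a "memory" clobber is the GCC/Clang idiom for a compiler-only barrier.

```c
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped status register. */
volatile uint32_t status_reg;

/* Because status_reg is volatile, every iteration performs a real load.
 * Without volatile the compiler would be free to read it once and hoist
 * the test out of the loop. */
int wait_ready(unsigned max_spins)
{
    while (max_spins--) {
        if (status_reg & 1u)
            return 1;
    }
    return 0;
}

/* Plain (non-volatile) variable, imagined to be updated by an ISR. */
uint32_t shared_counter;

/* The barrier tells the compiler that memory may have changed, so the
 * second read cannot reuse the value from the first one. It emits no
 * instruction and is not a hardware memory barrier. */
uint32_t delta_across_barrier(void)
{
    uint32_t before = shared_counter;
    __asm__ volatile("" ::: "memory");
    uint32_t after = shared_counter;
    return after - before;
}
```

The tradeoff as I see it: volatile forces every access to go to memory, which keeps the variable visible and modifiable from the debugger at each step but costs loads and stores in hot code. A barrier only constrains the compiler at one point and leaves it free to keep values in registers everywhere else.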

C++ specific considerations

  • More aggressive inlining
  • More abstractions to see through
  • operator overloading
  • You may want to skip certain functions in your standard library when stepping
  • The joys of vtable overwrites (most likely out of scope)

Quality of Debuginfo

  • Clang and GCC differ.
  • Mention something about the differences between DWARF versions 3, 4 and 5?
  • -g vs -g3

Tools

  • dwarfdump
  • readelf
  • pyelftools
  • Compare some IDEs?

DWARF sections

  • .debug_abbrev: Abbreviations used in the .debug_info section
  • .debug_aranges: Lookup table for mapping addresses to compilation units
  • .debug_frame: Call frame information
  • .debug_info: The core DWARF information section
  • .debug_line: Line number information
  • .debug_loc: Location lists used in DW_AT_location attributes
  • .debug_macinfo: Macro information
  • .debug_pubnames: Lookup table for mapping object and function names to compilation units
  • .debug_pubtypes: Lookup table for mapping type names to compilation units
  • .debug_ranges: Address ranges used in DW_AT_ranges attributes
  • .debug_str: String table used in .debug_info
