@hdarjus
Last active September 27, 2021 13:32
Debug C++ in an R package

Description

Feel free to share the content but please refer back to this original post whenever possible. Thanks, I appreciate it. The code comes without warranty and no responsibility is taken. There may be typos.

Setup

Here I document my setup for debugging the C++ code of any R package. The files in this collection assume that the package being debugged is stochvol. The same setup should work with C code as well, and it does not matter whether the package uses Rcpp. The setup uses GDB. Most importantly, the package has to be compiled with debug information and, ideally, with all optimizations turned off.

For pretty printing of Armadillo objects, the gdb_armadillo_helpers GitHub repo is used from the directory ~/Development. To have syntax highlighting for this Gist, I renamed the files here and gave them extensions. The actual file names are shown in the comment at the beginning of each file where applicable.

I work in the Windows Subsystem for Linux (version 2). There is no GUI, but since my editor of choice is a terminal editor (Vim with many plugins), this is not an issue for me. Alternatives are a dual-boot system with Ubuntu installed alongside Windows, or a virtual machine on Windows or macOS.

The project directory is ~/Development/stochvol.

Debugging

To start debugging, execute

R_MAKEVARS_USER=~/.R/Makevars-debug Rscript --no-init-file --debugger=gdb --no-save --no-restore debug-stochvol.R

Useful GDB commands are

  • break
  • print VARIABLE
  • info breakpoints
  • info locals
  • info args
  • disable
  • continue
  • call EXPRESSION

A neat trick: to set the value of, e.g., a double variable called d to 14.76, do

print &d
set *((double *) 0xbfbb0000) = 14.76

where the result of the print command was 0xbfbb0000. (Source)

Memory leaks

You can reproduce the memory leak check of CRAN at home by going to the parent directory of the R package (i.e. above its root directory; in our case ~/Development) and executing

R CMD build --no-build-vignettes --no-manual stochvol && \
  R_MAKEVARS_USER=~/.R/Makevars-debug R CMD check --as-cran --use-valgrind stochvol_3.1.0.tar.gz
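
The command above hard-codes the version in the tarball name. Since R CMD build names its output <Package>_<Version>.tar.gz, the name can instead be read from the package's DESCRIPTION file. The following is only a sketch: the function name tarball_name is made up for illustration, and it assumes that the Package and Version fields each sit on a single line.

```shell
# Hypothetical helper (not part of the original setup): print the tarball
# name that R CMD build produces for the package in directory $1.
tarball_name() {
  desc="${1:-stochvol}/DESCRIPTION"
  # DESCRIPTION consists of "Field: value" lines
  pkg=$(sed -n 's/^Package:[[:space:]]*//p' "$desc")
  ver=$(sed -n 's/^Version:[[:space:]]*//p' "$desc")
  printf '%s_%s.tar.gz\n' "$pkg" "$ver"
}
```

With this in place, the check step could be written as R CMD check --as-cran --use-valgrind "$(tarball_name stochvol)", so the command keeps working after a version bump.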

The most useful tool for debugging memory leaks is the combination of Valgrind and GDB. Please refer to this description on how to use them in combination. The code to execute is

R_MAKEVARS_USER=~/.R/Makevars-debug Rscript --no-init-file --debugger=valgrind --debugger-args='--track-origins=yes --leak-check=full --show-reachable=yes --vgdb=yes --vgdb-error=0' --no-save --no-restore Development/debug-stochvol.R

It makes sense to use Valgrind without GDB too just as an automated check with your own script, such as debug-stochvol.R. For that, just omit --vgdb=yes --vgdb-error=0 from the above command.
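
As a rough sketch of such an automated check, the command can be wrapped in a small shell function. The function name valgrind_check and the DRY_RUN switch are hypothetical. The extra --error-exitcode=1 is a standard Valgrind option that is not part of the original command; whether the exit status actually propagates through Rscript on a given machine is an assumption that should be verified.

```shell
# Hypothetical helper: run an R script under Valgrind's memcheck, without GDB.
# Set DRY_RUN=1 to print the command instead of executing it.
valgrind_check() {
  script="${1:-debug-stochvol.R}"
  makevars="${R_MAKEVARS_USER:-$HOME/.R/Makevars-debug}"
  # Same flags as above, minus the vgdb ones; --error-exitcode=1 is an addition
  vg_args='--track-origins=yes --leak-check=full --show-reachable=yes --error-exitcode=1'
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "R_MAKEVARS_USER=$makevars Rscript --no-init-file --debugger=valgrind --debugger-args='$vg_args' --no-save --no-restore $script"
    return 0
  fi
  R_MAKEVARS_USER="$makevars" Rscript --no-init-file --debugger=valgrind \
    --debugger-args="$vg_args" --no-save --no-restore "$script"
}
```

Calling DRY_RUN=1 valgrind_check debug-stochvol.R first is a cheap way to inspect the exact command before committing to a slow Valgrind run.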

Cache misses

Valgrind has a tool for counting cache misses as well. Execute from the parent directory

SOURCE_TMPDIR=~/Development/tmp-stochvol  # needed for cg_annotate line-by-line annotation to work well
mkdir -p ${SOURCE_TMPDIR}

R_MAKEVARS_USER=~/.R/Makevars-cache R CMD build --no-build-vignettes --no-manual stochvol && \
  R_MAKEVARS_USER=~/.R/Makevars-cache TMPDIR=${SOURCE_TMPDIR} R CMD INSTALL --no-staged-install stochvol_3.1.0.tar.gz && \
  Rscript --no-init-file --debugger=valgrind --debugger-args='--tool=cachegrind' --no-save --no-restore Development/debug-stochvol.R

The flag --no-staged-install does not work on my computer and temporary directories are created anyway. For line-by-line annotations, we need to find out what this temporary directory was. The steps:

  1. Execute cg_annotate --auto=yes cachegrind.out.<pid> and check at the end which files were not annotated because they have been removed. This gives us a hint about what the temporary directory was during package installation for R. We have to copy the source files to that path for cg_annotate to be able to annotate the lines.
  2. Create these directories, then run something like cp -r stochvol ${SOURCE_TMPDIR}/RtmpXXXXXXXXX/R.INSTALLXXXXXXXX/stochvol, replacing the X placeholders with the actual values.
  3. At the end, execute cg_annotate --auto=yes cachegrind.out.<pid> | less.
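
Step 2 above can be sketched as a small shell function. The name restore_sources is hypothetical, and the destination path still has to be read off the cg_annotate output by hand; the function only creates the directories and copies the sources.

```shell
# Hypothetical helper: recreate the temporary install path that cg_annotate
# reports as missing and copy the package sources there.
#   $1  package source directory, e.g. stochvol
#   $2  the missing directory reported by cg_annotate, e.g.
#       ${SOURCE_TMPDIR}/RtmpXXXXXXXXX/R.INSTALLXXXXXXXX/stochvol
restore_sources() {
  src="$1"
  dest="$2"
  if [ ! -d "$src" ]; then
    echo "restore_sources: no such directory: $src" >&2
    return 1
  fi
  # "src/." copies the contents of src, including hidden files, into dest
  mkdir -p "$dest" && cp -R "$src"/. "$dest"/
}
```

After the copy, rerunning cg_annotate as in step 3 should find the source files at the paths recorded in the debug information.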

Profiling

Valgrind has a profiling tool as well. There are similar problems with temporary directories and line-by-line annotations as for cache misses. The only part that changes is that this should be executed:

SOURCE_TMPDIR=~/Development/tmp-stochvol  # needed for callgrind_annotate line-by-line annotation to work well

R_MAKEVARS_USER=~/.R/Makevars-cache R CMD build --no-build-vignettes --no-manual stochvol && \
  R_MAKEVARS_USER=~/.R/Makevars-cache TMPDIR=${SOURCE_TMPDIR} R CMD INSTALL --no-staged-install stochvol_3.1.0.tar.gz && \
  Rscript --no-init-file --debugger=valgrind --debugger-args='--tool=callgrind' --no-save --no-restore Development/debug-stochvol.R

and then run callgrind_annotate --inclusive=yes --auto=yes callgrind.out.<pid>.
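
Looking up the <pid> part by hand gets tedious after a few runs. A minimal sketch for picking the most recently written output file follows; the function name latest_out is made up, and it assumes the output files are in the current directory and have names without spaces or newlines.

```shell
# Hypothetical helper: print the newest file named <prefix>.* in the
# current directory, e.g. the latest callgrind.out.<pid>.
latest_out() {
  prefix="${1:-callgrind.out}"
  # ls -t sorts by modification time, newest first
  ls -t "$prefix".* 2>/dev/null | head -n 1
}
```

For example, callgrind_annotate --inclusive=yes --auto=yes "$(latest_out callgrind.out)" | less annotates the latest profiling run, and latest_out cachegrind.out does the same lookup for the cache-miss output.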

# Script that is run under the debugger. File name:
# debug-stochvol.R
message("Executing debug-stochvol.R")
set.seed(5)
devtools::load_all()
dat <- svsim(5000, mu = -9, phi = 0.95, sigma = 0.3, nu = 90, rho = -0.4)
y <- dat$y
h <- 2 * log(dat$vol)
theta <- list(mu = -9, phi = 0.95, sigma = 0.3, nu = 90, rho = -0.4)
priorspec <- specify_priors(mu = sv_normal(-10, 1),
                            phi = sv_beta(15, 1.5),
                            sigma2 = sv_gamma(0.5, 0.5),
                            rho = sv_beta(4, 4),
                            nu = sv_exponential(0.1))
def_gen_sv <- get_default_general_sv(priorspec, multi_asis = 2L)
def_gen_sv$nu_asis_setup <- c(2L, 2L, 2L)
def_gen_sv$theta_asis_setup <- c(2L, 2L, 2L)
a <- svsample_general_cpp(
  y = y, draws = 100, burnin = 0, designmatrix = matrix(NA),
  priorspec = priorspec, thinpara = 1, thinlatent = 1,
  keeptime = "all", startpara = theta, startlatent = h,
  keeptau = TRUE, print_settings = list(quiet = TRUE,
                                        n_chains = 1,
                                        chain = 1),
  correct_model_misspecification = TRUE, interweave = TRUE,
  myoffset = 0, general_sv = def_gen_sv)
# This is the main gdbinit file in the home directory. Path and file name:
# ~/.gdbinit
python
__import__('eigengdb').register_eigen_printers(None)
end
source ~/Development/gdb_armadillo_helpers/gdb_helpers/gdb_armadillo_printers.py
source ~/Development/gdb_armadillo_helpers/gdb_helpers/gdb_std_complex_printer.py
source ~/Development/gdb_armadillo_helpers/gdb_helpers/gdb_armadillo_xmethods.py
source ~/Development/gdb_armadillo_helpers/gdb_helpers/gdb_armadillo_to_numpy.py
# Print through valgrind
# https://stackoverflow.com/questions/23377559/how-can-i-use-a-variable-name-instead-of-addresses-when-debugging-valgrind-runs/23381044#23381044
# Bit value 0: initialized
# Bit value 1: uninitialized
# The bits are shown in hexadecimal. For example, ffffffff is the output for a fully uninitialized 4-byte integer
define wat
eval "monitor get_vbits %p %d", &$arg0, sizeof($arg0)
end
# Print SEXP objects
define printR
eval "call Rf_PrintValue(%p->data)", &$arg0
end
set auto-load safe-path /
# Another central Makevars file for R. Path and file name:
# ~/.R/Makevars-cache
CLANGCOMPILER=clang++
GCCCOMPILER=g++
GCCVER=-10
CLANGVER=-12
CCACHE=ccache
DEBUGGER=gdb
#CC=${CCACHE} clang${CLANGVER}
CC=${CCACHE} gcc${GCCVER}
CFLAGS+=-Wall -g3
#CFLAGS+=-g${DEBUGGER} --debug -O0
CPPCOMPILER=${CCACHE} ${GCCCOMPILER}${GCCVER}
#CPPCOMPILER=${CCACHE} ${CLANGCOMPILER}${CLANGVER}
CXX=${CPPCOMPILER}
CXX11=${CPPCOMPILER}
CXX14=${CPPCOMPILER}
CXX17=${CPPCOMPILER}
CXX20=${CPPCOMPILER}
CXX11STD=-std=c++11
CXX14STD=-std=c++14
CXX17STD=-std=c++17
CXX20STD=-std=c++20
CXXSTD=CXX11
CXXFLAGS=-pedantic -Wall
CXX11FLAGS=-pedantic -Wall
CXX14FLAGS=-pedantic -Wall
CXX17FLAGS=-pedantic -Wall
# -ffast-math improves speed by ignoring NaN-safety and floating point issues, among others
CXXFLAGS+=-O3 -g3 -ffast-math
CXX11FLAGS+=-O3 -g3 -ffast-math
CXX14FLAGS+=-O3 -g3 -ffast-math
CXX17FLAGS+=-O3 -g3 -ffast-math
LDFLAGS=-g
# Central Makevars file for R. Path and file name:
# ~/.R/Makevars-debug
GCCCOMPILER=g++
GCCVER=-10
CCACHE=ccache
DEBUGGER=gdb
CC=${CCACHE} gcc${GCCVER}
CFLAGS=-Wall
CFLAGS+=-g${DEBUGGER} --debug -O0 -fsanitize=undefined -fno-omit-frame-pointer -fno-inline
CFLAGS+=-UNDEBUG
CPPCOMPILER=${CCACHE} ${GCCCOMPILER}${GCCVER}
#CPPCOMPILER=${CCACHE} ${CLANGCOMPILER}${CLANGVER}
CXX=${CPPCOMPILER}
CXX11=${CPPCOMPILER}
CXX14=${CPPCOMPILER}
CXX17=${CPPCOMPILER}
CXX20=${CPPCOMPILER}
CXX11STD=-std=c++11
CXX14STD=-std=c++14
CXX17STD=-std=c++17
CXX20STD=-std=c++20
CXXSTD=CXX20
CXXFLAGS=-pedantic -Wall
CXX11FLAGS=-pedantic -Wall
CXX14FLAGS=-pedantic -Wall
CXX17FLAGS=-pedantic -Wall
CXXFLAGS+=-g${DEBUGGER} --debug -O0 -UNDEBUG -fsanitize=undefined -fno-omit-frame-pointer -fno-inline
CXX11FLAGS+=-g${DEBUGGER} --debug -O0 -UNDEBUG -fsanitize=undefined -fno-omit-frame-pointer -fno-inline
CXX14FLAGS+=-g${DEBUGGER} --debug -O0 -UNDEBUG -fsanitize=undefined -fno-omit-frame-pointer -fno-inline
CXX17FLAGS+=-g${DEBUGGER} --debug -O0 -UNDEBUG -fsanitize=undefined -fno-omit-frame-pointer -fno-inline
# Project-specific gdbinit file in the project directory. File name:
# .gdbinit
# Pre-set some interesting breakpoints that are then disabled by default
set breakpoint pending on
break stochvol::general_sv::centered::draw_theta(double, double, double, double, stochvol::general_sv::centered::SufficientStatistic const&, arma::Col<unsigned int> const&, stochvol::PriorSpec const&, stochvol::ExpertSpec_GeneralSV const&, stochvol::ProposalDiffusionKen const&)
break stochvol::general_sv::centered::theta_log_likelihood(double, double, double, double, stochvol::general_sv::centered::SufficientStatistic const&, stochvol::PriorSpec const&)
break stochvol::general_sv::noncentered::draw_theta(double, double, double, double, stochvol::general_sv::noncentered::SufficientStatistic const&, arma::Col<unsigned int> const&, stochvol::PriorSpec const&, stochvol::ExpertSpec_GeneralSV const&, stochvol::ProposalDiffusionKen const&)
break stochvol::general_sv::noncentered::theta_log_likelihood(double, double, double, double, stochvol::general_sv::noncentered::SufficientStatistic const&, stochvol::PriorSpec const&)
break stochvol::general_sv::update
break update_t_error(arma::Col<double> const&, arma::Col<double>&, arma::Col<double> const&, arma::Col<double> const&, double&, stochvol::PriorSpec const&, bool, stochvol::Adaptation&, stochvol::ExpertSpec_GeneralSV::Strategy const&)
disable breakpoints
# These breakpoints are enabled by default
break sampling_latent_states.cc:219
break utils_main.h:88
break stochvol::general_sv::test_function
set breakpoint pending off
# Always enable exceptions to be caught by gdb
catch throw
catch catch
# Start R
run