Feel free to share the content but please refer back to this original post whenever possible. Thanks, I appreciate it. The code comes without warranty and no responsibility is taken. There may be typos.
Here I document my setup for debugging the C++ code of any R package. This collection of files assumes that the stochvol package is being debugged. This should work with C code as well, and it does not matter whether the package uses Rcpp.
The setup uses GDB. Most importantly, the package has to be compiled with debug information and ideally with all optimizations turned off.
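As a rough sketch of what a debug Makevars file might contain, the snippet below writes one with GCC/Clang-style flags. This is an assumption for illustration and not necessarily identical to the Makevars-debug file of this setup; MAKEVARS_FILE is a placeholder path so that an existing ~/.R/Makevars-debug is not overwritten.

```shell
#!/bin/sh
# Sketch: write a Makevars file that requests debug symbols and no optimization.
# MAKEVARS_FILE is a placeholder path for illustration; the setup described
# here expects the real file at ~/.R/Makevars-debug.
MAKEVARS_FILE="${MAKEVARS_FILE:-./Makevars-debug.example}"

cat > "$MAKEVARS_FILE" <<'EOF'
# -g adds debug information, -O0 turns optimizations off (GCC/Clang syntax)
CFLAGS = -g -O0
CXXFLAGS = -g -O0
CXX11FLAGS = -g -O0
CXX14FLAGS = -g -O0
CXX17FLAGS = -g -O0
EOF

echo "Wrote $MAKEVARS_FILE"
```

Setting the flags for several C++ standards is deliberate: which variable R picks up depends on the standard the package requests.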
For pretty printing of armadillo objects, the gdb_armadillo_helpers GitHub repo is used from the directory ~/Development. To have syntax highlighting for this Gist, I renamed the files here and gave them extensions. The actual file names are shown in the comment at the beginning of each file where applicable.
I work in the Windows Subsystem for Linux (version 2). There is no GUI, but since my editor of choice is a terminal editor (Vim with many plugins), this is not an issue for me. An alternative is a dual-boot system with Ubuntu installed alongside Windows, or a virtual machine on Windows or macOS.
The project directory is ~/Development/stochvol.
To start debugging, execute
R_MAKEVARS_USER=~/.R/Makevars-debug Rscript --no-init-file --debugger=gdb --no-save --no-restore debug-stochvol.R
Useful GDB commands are
break
print VARIABLE
info breakpoints
info locals
info args
disable
continue
call EXPRESSION
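Commands like these can also be collected in a GDB command file and replayed at start-up. The following is a minimal sketch: the file name stochvol.gdb and the function name my_sampler_function are hypothetical, and passing -x through --debugger-args is assumed to work the same way as the Valgrind options further down. The script only writes the file and prints the invocation; it does not start GDB.

```shell
#!/bin/sh
# Sketch: collect GDB commands in a file so they can be replayed at start-up.
# GDB_CMDS and the function name my_sampler_function are hypothetical.
GDB_CMDS="${GDB_CMDS:-./stochvol.gdb}"

cat > "$GDB_CMDS" <<'EOF'
# The package's shared library is loaded only after R starts,
# so allow breakpoints on symbols that are not known yet.
set breakpoint pending on
break my_sampler_function
run
info args
info locals
EOF

# Print (rather than run) the resulting invocation.
echo "Rscript --no-init-file --debugger=gdb --debugger-args='-x $GDB_CMDS' --no-save --no-restore debug-stochvol.R"
```

Once the breakpoint is hit, the session continues interactively, so the remaining commands (continue, disable, call) can be typed as usual.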
A neat trick: to overwrite the value of, e.g., a double variable called d with 14.76, do
print &d
set *((double *) 0xbfbb0000) = 14.76
where 0xbfbb0000 is the address returned by the print command.
(Source)
You can reproduce the memory leak check of CRAN at home by going to the parent directory of the R package (i.e. above its root directory; in our case ~/Development) and executing
R CMD build --no-build-vignettes --no-manual stochvol && \
R_MAKEVARS_USER=~/.R/Makevars-debug R CMD check --as-cran --use-valgrind stochvol_3.1.0.tar.gz
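The version number in the tarball name is hard-coded above. As a small sketch, the name can instead be derived from the package's DESCRIPTION file; the helper tarball_name is a name made up for this example, and the demo runs on a throwaway DESCRIPTION file rather than the real package.

```shell
#!/bin/sh
# Sketch: read Package and Version from a DESCRIPTION file and
# assemble the tarball name that R CMD build produces.
tarball_name() {
    desc_file="$1"
    pkg=$(sed -n 's/^Package:[[:space:]]*//p' "$desc_file")
    ver=$(sed -n 's/^Version:[[:space:]]*//p' "$desc_file")
    printf '%s_%s.tar.gz\n' "$pkg" "$ver"
}

# Demo on a throwaway DESCRIPTION file.
demo_dir=$(mktemp -d)
printf 'Package: stochvol\nVersion: 3.1.0\n' > "$demo_dir/DESCRIPTION"
tarball_name "$demo_dir/DESCRIPTION"   # prints stochvol_3.1.0.tar.gz
```

From the parent directory of the package, the check could then be called with "$(tarball_name stochvol/DESCRIPTION)" in place of the literal file name.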
The most useful tool for debugging memory leaks is the combination of Valgrind and GDB. Please refer to this description on how to use them in combination. The code to execute is
R_MAKEVARS_USER=~/.R/Makevars-debug Rscript --no-init-file --debugger=valgrind --debugger-args='--track-origins=yes --leak-check=full --show-reachable=yes --vgdb=yes --vgdb-error=0' --no-save --no-restore Development/debug-stochvol.R
It also makes sense to use Valgrind without GDB, simply as an automated check with your own script, such as debug-stochvol.R. For that, just omit --vgdb=yes --vgdb-error=0 from the above command.
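For an automated check, it can help if the run fails loudly. The sketch below assembles the Valgrind options without the vgdb flags and adds --error-exitcode=1, which makes Valgrind return a non-zero status when it reports errors. Whether that status is passed on unchanged through Rscript is an assumption worth verifying on your system; the helper name valgrind_args is made up for this example, and the command is only printed, not executed.

```shell
#!/bin/sh
# Sketch: assemble the Valgrind options for a non-interactive run.
# --error-exitcode makes Valgrind return a non-zero status when it reports errors.
valgrind_args() {
    printf '%s' '--track-origins=yes --leak-check=full --show-reachable=yes --error-exitcode=1'
}

# Print the command for inspection; paste it into the shell to run the check.
echo "R_MAKEVARS_USER=~/.R/Makevars-debug Rscript --no-init-file --debugger=valgrind --debugger-args='$(valgrind_args)' --no-save --no-restore Development/debug-stochvol.R"
```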
Valgrind has a tool for counting cache misses as well. Execute from the parent directory
SOURCE_TMPDIR=~/Development/tmp-stochvol # needed for cg_annotate line-by-line annotation to work well
mkdir -p ${SOURCE_TMPDIR}
R_MAKEVARS_USER=~/.R/Makevars-cache R CMD build --no-build-vignettes --no-manual stochvol && \
R_MAKEVARS_USER=~/.R/Makevars-cache TMPDIR=${SOURCE_TMPDIR} R CMD INSTALL --no-staged-install stochvol_3.1.0.tar.gz && \
Rscript --no-init-file --debugger=valgrind --debugger-args='--tool=cachegrind' --no-save --no-restore Development/debug-stochvol.R
The flag --no-staged-install does not work on my computer and temporary directories are created anyway. For line-by-line annotations, we need to find out what this temporary directory was. The steps:
- Execute
cg_annotate --auto=yes cachegrind.out.<pid>
and check at the end which files were not annotated because they have been removed. This gives us a hint about what the temporary directory was during package installation for R. We have to copy the source files to that path for cg_annotate to be able to annotate the lines.
- Do something like
cp -r stochvol ${SOURCE_TMPDIR}/RtmpXXXXXXXXX/R.INSTALLXXXXXXXX/stochvol
after creating these directories and with the XXXXXXX having the corresponding values.
- At the end, execute
cg_annotate --auto=yes cachegrind.out.<pid> | less
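The copy step above can be sketched as a small helper. Both paths are supplied by hand, since the temporary directory has to be read off the cg_annotate output; the function name stage_sources and the RtmpDEMO/R.INSTALLDEMO names are placeholders for this example, and the demo works on throwaway directories only.

```shell
#!/bin/sh
# Sketch: copy the package sources to the path where cg_annotate looks for them.
# Both arguments are supplied by the user; nothing is detected automatically.
stage_sources() {
    src_dir="$1"      # package source directory, e.g. stochvol
    target_dir="$2"   # path reported by cg_annotate, ending in the package name
    mkdir -p "$(dirname "$target_dir")" && cp -r "$src_dir" "$target_dir"
}

# Demo with throwaway directories; the Rtmp/R.INSTALL names are placeholders.
demo=$(mktemp -d)
mkdir -p "$demo/stochvol/src"
echo '// placeholder' > "$demo/stochvol/src/example.cc"
stage_sources "$demo/stochvol" "$demo/tmp/RtmpDEMO/R.INSTALLDEMO/stochvol"
ls "$demo/tmp/RtmpDEMO/R.INSTALLDEMO/stochvol/src"
```

Note that the target directory itself must not exist yet; otherwise cp -r would place the sources one level deeper than intended.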
Valgrind has a profiling tool as well. There are similar problems with temporary directories and line-by-line annotations as for cache misses. The only part that changes is that this should be executed:
SOURCE_TMPDIR=~/Development/tmp-stochvol # needed for callgrind_annotate line-by-line annotation to work well
R_MAKEVARS_USER=~/.R/Makevars-cache R CMD build --no-build-vignettes --no-manual stochvol && \
R_MAKEVARS_USER=~/.R/Makevars-cache TMPDIR=${SOURCE_TMPDIR} R CMD INSTALL --no-staged-install stochvol_3.1.0.tar.gz && \
Rscript --no-init-file --debugger=valgrind --debugger-args='--tool=callgrind' --no-save --no-restore Development/debug-stochvol.R
and then run callgrind_annotate --inclusive=yes --auto=yes callgrind.out.<pid>.
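Since the <pid> suffix changes with every run, a small helper can pick the most recent output file. This is only a sketch: the function name latest_output is made up for this example, and the demo uses empty placeholder files with artificial timestamps.

```shell
#!/bin/sh
# Sketch: print the most recently modified file whose name starts with PREFIX.
latest_output() {
    prefix="$1"
    ls -t "$prefix".* 2>/dev/null | head -n 1
}

# Demo with two empty placeholder files and distinct modification times.
demo=$(mktemp -d)
touch -t 202001010000 "$demo/callgrind.out.111"
touch -t 202101010000 "$demo/callgrind.out.222"
latest_output "$demo/callgrind.out"   # prints the path ending in callgrind.out.222
```

In the working directory, this could be used as callgrind_annotate --inclusive=yes --auto=yes "$(latest_output callgrind.out)", and likewise with cachegrind.out for cg_annotate.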