Experiments with gcc auto instrumentration of C/C++ source code for run-time function call tree tracing continued...
- In the original experiments [1] we managed to auto instrument C but not C++.
- So now it's back to the drawing board to figure out how to do the same for C++.
[1] Original auto instrumentation experiments
- There are several reasons why the technique used for C does not carry over directly to C++:
- (a) There's only -aux-info for the C compiler but not for the C++ compiler.
- (b) We want the technique to be line number neutral, meaning that auto instrumented files will not change the line numbers of the original source code.
- Being line number neutral with C was easy because we can either force include a file 'before' the original source code, or append new functions after the original source code; either way, the original source code line numbers remain the same.
- However, assuming we found a way to generate wrapper functions in C++ without -aux-info, C++ still wants class member functions defined within the same single class scope.
- This means the wrapper function needs to be inserted into the class itself, which disrupts line numbers, e.g.
class Foo { bar(...){...} wrap_bar(...){...} };
- Apparently there's a GCC compiler extension which supports RAII [1] for C [2], allowing e.g. the possibility of tracing function entry and exit points [3] [4] etc.
- Since the attribute cleanup RAII extension works when compiling both C and C++, this means we are half way there to auto instrument both C and C++ code with the same technique.
- The other good thing about the technique is that it's line number neutral.
- Now we just need to figure out a way to auto insert a preprocessor macro at the start of every function.
[1] Resource acquisition is initialization
[2] RAII in C: cleanup gcc compiler extension
[3] Create macro in C that traces current scope
[4] Poor man’s function_graph trace for C programs
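As a rough illustration of the cleanup attribute idea, here is a minimal sketch of a scope tracing macro. All names in it (TRACE_SCOPE, trace_enter, trace_exit, trace_log) are made up for this example and are not the macros used later in this document; it only assumes a compiler that supports the GCC cleanup attribute.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char trace_log[512];  /* trace text is collected here and echoed to stdout */
static int  trace_depth;     /* current nesting level, used for indentation */

/* Append one indented line to trace_log and echo it; output is truncated if the buffer fills up. */
static void trace_line(const char *prefix, const char *name, const char *suffix) {
    size_t used = strlen(trace_log);
    snprintf(trace_log + used, sizeof trace_log - used, "%*s%s%s%s\n",
             trace_depth * 2, "", prefix, name, suffix);
    fputs(trace_log + used, stdout);
}

/* Called at function entry; returns the name so it can initialise the scope variable. */
static const char *trace_enter(const char *name) {
    trace_line("+ ", name, "() {");
    trace_depth++;
    return name;
}

/* GCC calls this with a pointer to the annotated variable when it goes out of scope. */
static void trace_exit(const char **name) {
    assert(trace_depth > 0);
    trace_depth--;
    trace_line("} // ", *name, "()");
}

/* One line at the top of a function body traces both entry and exit. */
#define TRACE_SCOPE() \
    const char *trace_scope_name __attribute__((__cleanup__(trace_exit))) = trace_enter(__func__)

static int baz(int a) { TRACE_SCOPE(); return a; }
static int bar(int a) { TRACE_SCOPE(); return baz(a); }
```

Calling bar(7) from main() should log the entry and exit of bar() with baz() nested inside it. The macro occupies the same line as the opening brace, which is what keeps the approach line number neutral.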
Next problem: How to automatically insert a preprocessor macro at the start of every C or C++ function?
- This is a difficult problem because C and C++ are notoriously difficult to parse, and we don't want to go to the trouble of creating a new parser.
- Turns out that there is a tool called CastXML [1] (successor to GCC-XML [2]) which describes much of C and C++ files in XML format.
- 'Much of' includes a list of all functions including constructors and destructors.
- So by using CastXML we can get a list of all functions and the exact file and line number where the source code of each function can be found.
- This makes the task of finding the insertion point for the instrumentation macros much easier, and it can be done with a simple regex, rather than writing our own C or C++ parser.
- However, there is still the issue of figuring out the exact number of source files which need to be changed.
[1] CastXML is a C-family abstract syntax tree XML output tool
[2] Welcome to GCC-XML, the XML output extension to GCC
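Once the file and line number of a function are known, the edit itself is small. The sketch below shows the idea in C rather than as a regex; the function name insert_after_open_brace is hypothetical, and the actual instrumentation script is not shown in this document. It assumes the first { on the reported line is the one that opens the function body, which will not hold for every coding style.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copies `line` into `out`, inserting `macro` directly after the first '{'.
 * Returns 0 on success, -1 if there is no '{' or `out` is too small. */
static int insert_after_open_brace(const char *line, const char *macro,
                                   char *out, size_t out_size) {
    assert(line != NULL && macro != NULL && out != NULL);
    const char *brace = strchr(line, '{');
    if (brace == NULL) return -1;
    size_t head = (size_t)(brace - line) + 1;   /* up to and including '{' */
    int n = snprintf(out, out_size, "%.*s%s%s", (int)head, line, macro, brace + 1);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Because the macro is spliced into the existing line, no new lines are introduced and the original line numbers are preserved.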
- Because CastXML only includes a subset of C and C++ info as XML data, it doesn't always list all the source files necessary to compile an individual object file.
- For this we must go back to the compiler and ask it to tell us all the dependencies using the -MM [1] option, e.g. [2].
- In this way we can generate an exact list of dependent files without worrying that CastXML has excluded one or more of them because they contain no info relevant to CastXML.
[1] -MM
[2] how to generate the dependency between files automatically?
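The -MM option prints a make rule of the form "target: prerequisites", leaving out headers found in system directories. A minimal sketch of pulling the file names out of such a rule follows; parse_make_deps and DEP_NAME_MAX are names invented for this example. It treats every backslash as a line continuation, so file names containing escaped spaces are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEP_NAME_MAX 128   /* arbitrary limit for this sketch */

/* Extracts the prerequisites from a make rule as printed by `gcc -MM`,
 * e.g. "foo.o: foo.c foo.h \\\n bar.h". Returns how many names were stored. */
static size_t parse_make_deps(const char *rule, char deps[][DEP_NAME_MAX],
                              size_t max_deps) {
    assert(rule != NULL && deps != NULL);
    const char *p = strchr(rule, ':');          /* prerequisites follow the target */
    if (p == NULL) return 0;
    p++;
    size_t count = 0;
    const char *separators = " \t\r\n\\";       /* blanks and line continuations */
    while (count < max_deps) {
        p += strspn(p, separators);             /* skip to the next name */
        size_t len = strcspn(p, separators);    /* length of that name */
        if (len == 0) break;                    /* end of the rule */
        size_t copy = len < DEP_NAME_MAX ? len : DEP_NAME_MAX - 1;
        memcpy(deps[count], p, copy);           /* over-long names are truncated */
        deps[count][copy] = '\0';
        count++;
        p += len;
    }
    return count;
}
```

The returned list includes the main source file itself, so it can be used directly as the set of files to instrument.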
- The previous mechanism, e.g., counted the number of function calls and displayed it upon exit.
- If data is only stored on the stack for the duration of the function call, how do we store the function call count?
- Also, in the future, further info will be stored such as the verbosity level.
- The instrumentation script, which inserts the macros, can also create a static variable per macro insertion and pass the address of the variable to the macro.
- The definition for the static variable can be appended to the main source file being compiled.
- This can cause an issue if a function from a header file is included in two or more main source files being compiled.
- The issue can be fixed by defining the static variable multiple times, but using attribute weak [1] linking.
[1] https://en.wikipedia.org/wiki/Weak_symbol
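A minimal sketch of the per-function variable idea, assuming GCC-style attributes on an ELF platform. The names demo_data and DEMO_ENTRY are hypothetical stand-ins; the real CWRAP_DATA layout and CWRAP_ENTRY() definition are not shown in this document.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-function record that outlives any single call. */
typedef struct demo_data {
    const char *name;    /* set on entry */
    unsigned    calls;   /* how often the function has been entered */
} demo_data;

/* Declares the record as an extern weak symbol, then updates it. Because the
 * declaration lives inside the macro, the definition can come later in the file. */
#define DEMO_ENTRY(var, fname) \
    extern demo_data var __attribute__((weak)); \
    var.name = (fname); \
    var.calls++

static int baz(int a) { DEMO_ENTRY(demo_baz, "baz"); return a; }
static int bar(int a) { DEMO_ENTRY(demo_bar, "bar"); return baz(a); }

/* Prints one record, as an exit-time report might. */
static void demo_print(const demo_data *d) {
    assert(d != NULL);
    printf("%s() called %u times\n", d->name ? d->name : "?", d->calls);
}

/* Definitions appended after the original source, so no line numbers shift.
 * Weak linkage allows the same definition to appear in several object files. */
demo_data demo_baz __attribute__((weak)) = { NULL, 0 };
demo_data demo_bar __attribute__((weak)) = { NULL, 0 };
```

With weak definitions the linker is expected to keep a single copy of each record, so a function defined in a shared header would update one counter no matter which object file calls it.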
Next problem: How to instrument a file dependent only once even if it is used by multiple source files?
- For example, baz.hpp might be used by foo.cpp and bar.cpp.
- Todo!
- Note: The illustration does not include pretending to be gcc or g++ to work with any makefile.
- Note: The illustration does not include fancy instrumentation formatting and folding, etc.
- Note: The illustration does not include any mechanism to control the verbosity at run-time, or exclude instrumentation at compile time.
- Note: This is purely to illustrate the automatic instrumentation of source code using the above described techniques.
- Example main source files if compiling everything as C:
$ cat --number c-example-1.c
1 #include "cpp-example-1.cpp"
$ cat --number c-example-2.c
1 #include "cpp-example-2.cpp"
- Example main source files if compiling everything as C++:
$ cat --number cpp-example-1.cpp
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include "cpp-example-1.hpp"
4
5 void clean_up(int * my_int) { printf("- my_int=%d\n", *my_int); }
6 void bye_baz (void ) { printf("- called via atexit() via baz()\n"); }
7 void bye (void ) { printf("- called via atexit() via main()\n"); }
8 int baz (int a ) { atexit(bye_baz); return a ; }
9 int bar (int a ) { return baz(a); }
10
11 #ifdef __cplusplus
12 class Foo {
13 private:
14 my_struct my_struct_3;
15 #endif
16 int my_private(int a) { return bar(a); }
17
18 #ifdef __cplusplus
19 public:
20 Foo(const char* arg1) : my_struct_3("my_struct_3", arg1) { printf("- constructing Foo\n"); }
21 ~Foo() { printf("- deconstructing Foo\n"); }
22 #endif
23 int my_public(int a) { return my_private(a); }
24 #ifdef __cplusplus
25 };
26 #endif
27
28 int main() {
29 atexit(bye);
30 int my_int __attribute__ ((__cleanup__(clean_up))) = 1;
31 my_int = 5;
32 printf("- hello world\n");
33 int b = baz(1);
34 #ifdef __cplusplus
35 int x,y;
36 get_max <int> (x,y);
37 my_struct my_struct_2("my_struct_2");
38 Foo foo_1("a");
39 Foo foo_2("b");
40 return bar(foo_1.my_public(0));
41 #else
42 return bar( my_public(0));
43 #endif
44 }
$ cat --number cpp-example-2.cpp
1 #include <stdio.h>
2 #include "cpp-example-1.hpp"
3
4 #ifdef __cplusplus
5 my_struct my_struct_1("my_struct_1");
6 #endif
- Example header file that CastXML would miss / not list in its XML output:
$ cat cpp-example-1.hpp
#include "cpp-example-2.hpp"
- Example header file containing C++ functions to auto instrument:
$ cat cpp-example-2.hpp
#ifdef __cplusplus
struct my_struct {
my_struct(const char *f ) { f_ = f; printf("- ""constructing my_struct: arg1:%s\n" , f_ ); }
my_struct(const char *f, const char *b) { f_ = f; printf("- ""constructing my_struct: arg1:%s arg2:%s\n", f_, b); }
~my_struct( ) { printf("- deconstructing my_struct: %s\n" , f_ ); }
const char *f_;
};
template <class my_type>
my_type get_max (my_type a, my_type b) {
return (a>b?a:b);
}
#endif
$ rm -f ./c-example
$ gcc -g -o c-example-1.o -c c-example-1.c
$ gcc -g -o c-example-2.o -c c-example-2.c
$ gcc -g -o c-example c-example-1.o c-example-2.o && ./c-example
- hello world
- my_int=5
- called via atexit() via baz()
- called via atexit() via baz()
- called via atexit() via baz()
- called via atexit() via main()
$ rm -f ./cpp-example
$ gcc -g -o cpp-example-1.o -c cpp-example-1.cpp
$ gcc -g -o cpp-example-2.o -c cpp-example-2.cpp
$ gcc -g -o cpp-example cpp-example-1.o cpp-example-2.o -lstdc++ && ./cpp-example
- constructing my_struct: arg1:my_struct_1
- hello world
- constructing my_struct: arg1:my_struct_2
- constructing my_struct: arg1:my_struct_3 arg2:a
- constructing Foo
- constructing my_struct: arg1:my_struct_3 arg2:b
- constructing Foo
- deconstructing Foo
- deconstructing my_struct: my_struct_3
- deconstructing Foo
- deconstructing my_struct: my_struct_3
- deconstructing my_struct: my_struct_2
- my_int=5
- called via atexit() via baz()
- called via atexit() via baz()
- called via atexit() via baz()
- called via atexit() via main()
- deconstructing my_struct: my_struct_1
- The script takes the C or C++ file to be compiled, creates a CastXML command line, runs it, and loads the generated XML.
- In the case of C we tell CastXML to use C, and also to 'disable' C++ via -U__cplusplus.
- The script runs the compiler using -MM to create a list of local dependencies.
- The script modifies each local dependency, adding the macros and changing the include names to the modified file names.
- Finally the newly written modified source files can be compiled and run.
- Note: In this example, the non-instrumented printf() output appears 'magically' indented because we override printf() to make the output look nicer.
$ perl cpp-xml-cpp.pl c-example-1.c
- cpp_file=c-example-1.c
- grab AST in XML via command: castxml --verbose --castxml-cc-gnu gcc -std=c89 -U__cplusplus --castxml-gccxml -o c-example-1.c.xml c-example-1.c 2>&1
> Ubuntu clang version 3.7.1-1ubuntu4 (tags/RELEASE_371/final) (based on LLVM 3.7.1)
> Target: x86_64-pc-linux-gnu
> Thread model: posix
> Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/6.0.0
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.4.0
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.0.0
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.4.0
> Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.4.0
> Candidate multilib: .;@m64
> Selected multilib: .;@m64
> clang -cc1 version 3.7.1 based upon LLVM 3.7.1 default target x86_64-pc-linux-gnu
> #include "..." search starts here:
> #include <...> search starts here:
> /usr/include/c++/5
> /usr/include/x86_64-linux-gnu/c++/5
> /usr/include/c++/5/backward
> /usr/share/castxml/clang/include
> /usr/local/include
> /usr/lib/gcc/x86_64-linux-gnu/5/include-fixed
> /usr/include/x86_64-linux-gnu
> /usr/include
> End of search list.
- xml bytes: 130968
- xml_lines=1538
- cpp file local deps via command: gcc -MM -g c-example-1.c: c-example-1.o: c-example-1.c cpp-example-1.cpp cpp-example-1.hpp cpp-example-2.hpp
- looping over 4 local deps: c-example-1.c cpp-example-1.cpp cpp-example-1.hpp cpp-example-2.hpp
- cpp_file_local_dep: c-example-1.c
- cpp_lines=1
- file_id=
- call_lines=0
- replaced 1 times: cpp-example-1.cpp with cpp-example-1.cpp.cwrap.2.cpp
- wrote file: c-example-1.c.cwrap.2.c
- cpp_file_local_dep: cpp-example-1.cpp
- cpp_lines=44
- file_id=f22
- call_lines=8
- line 5+0: context :: :: name clean_up : what Function 1: void clean_up(int * my_int) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_clean_up_5, "cpp-example-1.cpp", "clean_up", "5"); printf("- my_int=%d\n", *my_int); }
- line 6+0: context :: :: name bye_baz : what Function 1: void bye_baz (void ) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_bye_baz_6, "cpp-example-1.cpp", "bye_baz", "6"); printf("- called via atexit() via baz()\n"); }
- line 7+0: context :: :: name bye : what Function 1: void bye (void ) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_bye_7, "cpp-example-1.cpp", "bye", "7"); printf("- called via atexit() via main()\n"); }
- line 8+0: context :: :: name baz : what Function 1: int baz (int a ) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_baz_8, "cpp-example-1.cpp", "baz", "8"); atexit(bye_baz); return a ; }
- line 9+0: context :: :: name bar : what Function 1: int bar (int a ) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_bar_9, "cpp-example-1.cpp", "bar", "9"); return baz(a); }
- line 16+0: context :: :: name my_private : what Function 1: int my_private(int a) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_my_private_16, "cpp-example-1.cpp", "my_private", "16"); return bar(a); }
- line 23+0: context :: :: name my_public : what Function 1: int my_public(int a) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_my_public_23, "cpp-example-1.cpp", "my_public", "23"); return my_private(a); }
- line 28+0: context :: :: name main : what Function 1: int main() {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_main_28, "cpp-example-1.cpp", "main", "28");
- replaced 1 times: cpp-example-1.hpp with cpp-example-1.hpp.cwrap.2.hpp
- wrote file: cpp-example-1.cpp.cwrap.2.cpp
- cpp_file_local_dep: cpp-example-1.hpp
- cpp_lines=1
- file_id=
- call_lines=0
- replaced 1 times: cpp-example-2.hpp with cpp-example-2.hpp.cwrap.2.hpp
- wrote file: cpp-example-1.hpp.cwrap.2.hpp
- cpp_file_local_dep: cpp-example-2.hpp
- cpp_lines=13
- file_id=
- call_lines=0
- wrote file: cpp-example-2.hpp.cwrap.2.hpp
- appending 8 function contexts to: c-example-1.c.cwrap.2.c
- wrote file: cwrap.h
$ perl cpp-xml-cpp.pl c-example-2.c
- cpp_file=c-example-2.c
- grab AST in XML via command: castxml --verbose --castxml-cc-gnu gcc -std=c89 -U__cplusplus --castxml-gccxml -o c-example-2.c.xml c-example-2.c 2>&1
> Ubuntu clang version 3.7.1-1ubuntu4 (tags/RELEASE_371/final) (based on LLVM 3.7.1)
> Target: x86_64-pc-linux-gnu
> Thread model: posix
> Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/6.0.0
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.4.0
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.0.0
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.4.0
> Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.4.0
> Candidate multilib: .;@m64
> Selected multilib: .;@m64
> clang -cc1 version 3.7.1 based upon LLVM 3.7.1 default target x86_64-pc-linux-gnu
> #include "..." search starts here:
> #include <...> search starts here:
> /usr/include/c++/5
> /usr/include/x86_64-linux-gnu/c++/5
> /usr/include/c++/5/backward
> /usr/share/castxml/clang/include
> /usr/local/include
> /usr/lib/gcc/x86_64-linux-gnu/5/include-fixed
> /usr/include/x86_64-linux-gnu
> /usr/include
> End of search list.
- xml bytes: 58169
- xml_lines=716
- cpp file local deps via command: gcc -MM -g c-example-2.c: c-example-2.o: c-example-2.c cpp-example-2.cpp cpp-example-1.hpp cpp-example-2.hpp
- looping over 4 local deps: c-example-2.c cpp-example-2.cpp cpp-example-1.hpp cpp-example-2.hpp
- cpp_file_local_dep: c-example-2.c
- cpp_lines=1
- file_id=
- call_lines=0
- replaced 1 times: cpp-example-2.cpp with cpp-example-2.cpp.cwrap.2.cpp
- wrote file: c-example-2.c.cwrap.2.c
- cpp_file_local_dep: cpp-example-2.cpp
- cpp_lines=6
- file_id=
- call_lines=0
- replaced 1 times: cpp-example-1.hpp with cpp-example-1.hpp.cwrap.2.hpp
- wrote file: cpp-example-2.cpp.cwrap.2.cpp
- cpp_file_local_dep: cpp-example-1.hpp
- cpp_lines=1
- file_id=
- call_lines=0
- replaced 1 times: cpp-example-2.hpp with cpp-example-2.hpp.cwrap.2.hpp
- wrote file: cpp-example-1.hpp.cwrap.2.hpp
- cpp_file_local_dep: cpp-example-2.hpp
- cpp_lines=13
- file_id=
- call_lines=0
- wrote file: cpp-example-2.hpp.cwrap.2.hpp
- appending 0 function contexts to: c-example-2.c.cwrap.2.c
- wrote file: cwrap.h
$ rm -f ./c-example
$ gcc -g -o c-example-1.o -c c-example-1.c.cwrap.2.c -include cwrap.h
$ gcc -g -o c-example-2.o -c c-example-2.c.cwrap.2.c -include cwrap.h
$ gcc -g -o c-example c-example-1.o c-example-2.o && ./c-example
+ main() {
- hello world
+ baz() {
} // baz() #1
+ my_public() {
+ my_private() {
+ bar() {
+ baz() {
} // baz() #2
} // bar() #1
} // my_private() #1
} // my_public() #1
+ bar() {
+ baz() {
} // baz() #3
} // bar() #2
+ clean_up() {
- my_int=5
} // clean_up() #1
} // main() #1
+ bye_baz() {
- called via atexit() via baz()
} // bye_baz() #1
+ bye_baz() {
- called via atexit() via baz()
} // bye_baz() #2
+ bye_baz() {
- called via atexit() via baz()
} // bye_baz() #3
+ bye() {
- called via atexit() via main()
} // bye() #1
$ perl cpp-xml-cpp.pl cpp-example-1.cpp
- cpp_file=cpp-example-1.cpp
- grab AST in XML via command: castxml --verbose --castxml-cc-gnu gcc -std=gnu++98 --castxml-gccxml -o cpp-example-1.cpp.xml cpp-example-1.cpp 2>&1
> Ubuntu clang version 3.7.1-1ubuntu4 (tags/RELEASE_371/final) (based on LLVM 3.7.1)
> Target: x86_64-pc-linux-gnu
> Thread model: posix
> Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/6.0.0
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.4.0
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.0.0
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.4.0
> Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.4.0
> Candidate multilib: .;@m64
> Selected multilib: .;@m64
> clang -cc1 version 3.7.1 based upon LLVM 3.7.1 default target x86_64-pc-linux-gnu
> #include "..." search starts here:
> #include <...> search starts here:
> /usr/include/c++/5
> /usr/include/x86_64-linux-gnu/c++/5
> /usr/include/c++/5/backward
> /usr/share/castxml/clang/include
> /usr/local/include
> /usr/lib/gcc/x86_64-linux-gnu/5/include-fixed
> /usr/include/x86_64-linux-gnu
> /usr/include
> End of search list.
- xml bytes: 168661
- xml_lines=1956
- cpp file local deps via command: gcc -MM -g cpp-example-1.cpp: cpp-example-1.o: cpp-example-1.cpp cpp-example-1.hpp cpp-example-2.hpp
- looping over 3 local deps: cpp-example-1.cpp cpp-example-1.hpp cpp-example-2.hpp
- cpp_file_local_dep: cpp-example-1.cpp
- cpp_lines=44
- file_id=f23
- call_lines=11
- line 5+0: context :: :: name clean_up : what Function 1: void clean_up(int * my_int) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_clean_up_5, "cpp-example-1.cpp", "clean_up", "5"); printf("- my_int=%d\n", *my_int); }
- line 6+0: context :: :: name bye_baz : what Function 1: void bye_baz (void ) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_bye_baz_6, "cpp-example-1.cpp", "bye_baz", "6"); printf("- called via atexit() via baz()\n"); }
- line 7+0: context :: :: name bye : what Function 1: void bye (void ) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_bye_7, "cpp-example-1.cpp", "bye", "7"); printf("- called via atexit() via main()\n"); }
- line 8+0: context :: :: name baz : what Function 1: int baz (int a ) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_baz_8, "cpp-example-1.cpp", "baz", "8"); atexit(bye_baz); return a ; }
- line 9+0: context :: :: name bar : what Function 1: int bar (int a ) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_bar_9, "cpp-example-1.cpp", "bar", "9"); return baz(a); }
- line 12+0: context :: :: name Foo : what Class 1: class Foo {
- line 28+0: context :: :: name main : what Function 1: int main() {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_main_28, "cpp-example-1.cpp", "main", "28");
- line 16+0: context Foo :: name my_private : what Method 1: int my_private(int a) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_foo_my_private_16, "cpp-example-1.cpp", "Foo::my_private", "16"); return bar(a); }
- line 20+0: context Foo :: name Foo : what Constructor 1: Foo(const char* arg1) : my_struct_3("my_struct_3", arg1) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_foo_foo_20, "cpp-example-1.cpp", "Foo::Foo", "20"); printf("- constructing Foo\n"); }
- line 21+0: context Foo :: name Foo : what Destructor 1: ~Foo() {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_foo_un_foo_21, "cpp-example-1.cpp", "Foo::~Foo", "21"); printf("- deconstructing Foo\n"); }
- line 23+0: context Foo :: name my_public : what Method 1: int my_public(int a) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_foo_my_public_23, "cpp-example-1.cpp", "Foo::my_public", "23"); return my_private(a); }
- replaced 1 times: cpp-example-1.hpp with cpp-example-1.hpp.cwrap.2.hpp
- wrote file: cpp-example-1.cpp.cwrap.2.cpp
- cpp_file_local_dep: cpp-example-1.hpp
- cpp_lines=1
- file_id=
- call_lines=0
- replaced 1 times: cpp-example-2.hpp with cpp-example-2.hpp.cwrap.2.hpp
- wrote file: cpp-example-1.hpp.cwrap.2.hpp
- cpp_file_local_dep: cpp-example-2.hpp
- cpp_lines=13
- file_id=f22
- call_lines=5
- line 2+0: context :: :: name my_struct : what Struct 1: struct my_struct {
- line 10+0: context :: :: name get_max : what Function 1: my_type get_max (my_type a, my_type b) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_get_max_10, "cpp-example-2.hpp", "get_max", "10");
- line 3+0: context my_struct :: name my_struct : what Constructor 1: my_struct(const char *f ) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_my_struct_3, "cpp-example-2.hpp", "my_struct::my_struct", "3"); f_ = f; printf("- ""constructing my_struct: arg1:%s\n" , f_ ); }
- line 4+0: context my_struct :: name my_struct : what Constructor 1: my_struct(const char *f, const char *b) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_my_struct_4, "cpp-example-2.hpp", "my_struct::my_struct", "4"); f_ = f; printf("- ""constructing my_struct: arg1:%s arg2:%s\n", f_, b); }
- line 5+0: context my_struct :: name my_struct : what Destructor 1: ~my_struct( ) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_un_my_struct_5, "cpp-example-2.hpp", "my_struct::~my_struct", "5"); printf("- deconstructing my_struct: %s\n" , f_ ); }
- wrote file: cpp-example-2.hpp.cwrap.2.hpp
- appending 14 function contexts to: cpp-example-1.cpp.cwrap.2.cpp
- wrote file: cwrap.h
$ perl cpp-xml-cpp.pl cpp-example-2.cpp
- cpp_file=cpp-example-2.cpp
- grab AST in XML via command: castxml --verbose --castxml-cc-gnu gcc -std=gnu++98 --castxml-gccxml -o cpp-example-2.cpp.xml cpp-example-2.cpp 2>&1
> Ubuntu clang version 3.7.1-1ubuntu4 (tags/RELEASE_371/final) (based on LLVM 3.7.1)
> Target: x86_64-pc-linux-gnu
> Thread model: posix
> Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/6.0.0
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.4.0
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.0.0
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.4.0
> Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.4.0
> Candidate multilib: .;@m64
> Selected multilib: .;@m64
> clang -cc1 version 3.7.1 based upon LLVM 3.7.1 default target x86_64-pc-linux-gnu
> #include "..." search starts here:
> #include <...> search starts here:
> /usr/include/c++/5
> /usr/include/x86_64-linux-gnu/c++/5
> /usr/include/c++/5/backward
> /usr/share/castxml/clang/include
> /usr/local/include
> /usr/lib/gcc/x86_64-linux-gnu/5/include-fixed
> /usr/include/x86_64-linux-gnu
> /usr/include
> End of search list.
- xml bytes: 69560
- xml_lines=837
- cpp file local deps via command: gcc -MM -g cpp-example-2.cpp: cpp-example-2.o: cpp-example-2.cpp cpp-example-1.hpp cpp-example-2.hpp
- looping over 3 local deps: cpp-example-2.cpp cpp-example-1.hpp cpp-example-2.hpp
- cpp_file_local_dep: cpp-example-2.cpp
- cpp_lines=6
- file_id=f10
- call_lines=0
- replaced 1 times: cpp-example-1.hpp with cpp-example-1.hpp.cwrap.2.hpp
- wrote file: cpp-example-2.cpp.cwrap.2.cpp
- cpp_file_local_dep: cpp-example-1.hpp
- cpp_lines=1
- file_id=
- call_lines=0
- replaced 1 times: cpp-example-2.hpp with cpp-example-2.hpp.cwrap.2.hpp
- wrote file: cpp-example-1.hpp.cwrap.2.hpp
- cpp_file_local_dep: cpp-example-2.hpp
- cpp_lines=13
- file_id=f9
- call_lines=4
- line 2+0: context :: :: name my_struct : what Struct 1: struct my_struct {
- line 3+0: context my_struct :: name my_struct : what Constructor 1: my_struct(const char *f ) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_my_struct_3, "cpp-example-2.hpp", "my_struct::my_struct", "3"); f_ = f; printf("- ""constructing my_struct: arg1:%s\n" , f_ ); }
- line 4+0: context my_struct :: name my_struct : what Constructor 1: my_struct(const char *f, const char *b) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_my_struct_4, "cpp-example-2.hpp", "my_struct::my_struct", "4"); f_ = f; printf("- ""constructing my_struct: arg1:%s arg2:%s\n", f_, b); }
- line 5+0: context my_struct :: name my_struct : what Destructor 1: ~my_struct( ) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_un_my_struct_5, "cpp-example-2.hpp", "my_struct::~my_struct", "5"); printf("- deconstructing my_struct: %s\n" , f_ ); }
- wrote file: cpp-example-2.hpp.cwrap.2.hpp
- appending 3 function contexts to: cpp-example-2.cpp.cwrap.2.cpp
- wrote file: cwrap.h
$ rm -f ./cpp-example
$ gcc -g -o cpp-example-1.o -c cpp-example-1.cpp.cwrap.2.cpp -include cwrap.h
$ gcc -g -o cpp-example-2.o -c cpp-example-2.cpp.cwrap.2.cpp -include cwrap.h
$ gcc -g -o cpp-example cpp-example-1.o cpp-example-2.o -lstdc++ && ./cpp-example
+ my_struct::my_struct() {
- constructing my_struct: arg1:my_struct_1
} // my_struct::my_struct() #1
+ main() {
- hello world
+ baz() {
} // baz() #1
+ my_struct::my_struct() {
- constructing my_struct: arg1:my_struct_2
} // my_struct::my_struct() #2
+ my_struct::my_struct() {
- constructing my_struct: arg1:my_struct_3 arg2:a
} // my_struct::my_struct() #1
+ Foo::Foo() {
- constructing Foo
} // Foo::Foo() #1
+ my_struct::my_struct() {
- constructing my_struct: arg1:my_struct_3 arg2:b
} // my_struct::my_struct() #2
+ Foo::Foo() {
- constructing Foo
} // Foo::Foo() #2
+ Foo::my_public() {
+ Foo::my_private() {
+ bar() {
+ baz() {
} // baz() #2
} // bar() #1
} // Foo::my_private() #1
} // Foo::my_public() #1
+ bar() {
+ baz() {
} // baz() #3
} // bar() #2
+ Foo::~Foo() {
- deconstructing Foo
} // Foo::~Foo() #1
+ my_struct::~my_struct() {
- deconstructing my_struct: my_struct_3
} // my_struct::~my_struct() #1
+ Foo::~Foo() {
- deconstructing Foo
} // Foo::~Foo() #2
+ my_struct::~my_struct() {
- deconstructing my_struct: my_struct_3
} // my_struct::~my_struct() #2
+ my_struct::~my_struct() {
- deconstructing my_struct: my_struct_2
} // my_struct::~my_struct() #3
+ clean_up() {
- my_int=5
} // clean_up() #1
} // main() #1
+ bye_baz() {
- called via atexit() via baz()
} // bye_baz() #1
+ bye_baz() {
- called via atexit() via baz()
} // bye_baz() #2
+ bye_baz() {
- called via atexit() via baz()
} // bye_baz() #3
+ bye() {
- called via atexit() via main()
} // bye() #1
+ my_struct::~my_struct() {
- deconstructing my_struct: my_struct_1
} // my_struct::~my_struct() #4
- Note: The local include files get updated to their instrumented counterparts.
- Note: The CWRAP_ENTRY() macros get inserted after the { of every function.
- Note: The CWRAP_ENTRY() macros are given the static function variable address and other strings.
- Note: The static function variables are appended to the end of the file.
$ cat --number cpp-example-1.cpp.cwrap.2.cpp
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include "cpp-example-1.hpp.cwrap.2.hpp"
4
5 void clean_up(int * my_int) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_clean_up_5, "cpp-example-1.cpp", "clean_up", "5"); printf("- my_int=%d\n", *my_int); }
6 void bye_baz (void ) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_bye_baz_6, "cpp-example-1.cpp", "bye_baz", "6"); printf("- called via atexit() via baz()\n"); }
7 void bye (void ) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_bye_7, "cpp-example-1.cpp", "bye", "7"); printf("- called via atexit() via main()\n"); }
8 int baz (int a ) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_baz_8, "cpp-example-1.cpp", "baz", "8"); atexit(bye_baz); return a ; }
9 int bar (int a ) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_bar_9, "cpp-example-1.cpp", "bar", "9"); return baz(a); }
10
11 #ifdef __cplusplus
12 class Foo {
13 private:
14 my_struct my_struct_3;
15 #endif
16 int my_private(int a) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_foo_my_private_16, "cpp-example-1.cpp", "Foo::my_private", "16"); return bar(a); }
17
18 #ifdef __cplusplus
19 public:
20 Foo(const char* arg1) : my_struct_3("my_struct_3", arg1) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_foo_foo_20, "cpp-example-1.cpp", "Foo::Foo", "20"); printf("- constructing Foo\n"); }
21 ~Foo() {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_foo_un_foo_21, "cpp-example-1.cpp", "Foo::~Foo", "21"); printf("- deconstructing Foo\n"); }
22 #endif
23 int my_public(int a) {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_foo_my_public_23, "cpp-example-1.cpp", "Foo::my_public", "23"); return my_private(a); }
24 #ifdef __cplusplus
25 };
26 #endif
27
28 int main() {CWRAP_ENTRY(cwrap_cpp_example_1_cpp_main_28, "cpp-example-1.cpp", "main", "28");
29 atexit(bye);
30 int my_int __attribute__ ((__cleanup__(clean_up))) = 1;
31 my_int = 5;
32 printf("- hello world\n");
33 int b = baz(1);
34 #ifdef __cplusplus
35 int x,y;
36 get_max <int> (x,y);
37 my_struct my_struct_2("my_struct_2");
38 Foo foo_1("a");
39 Foo foo_2("b");
40 return bar(foo_1.my_public(0));
41 #else
42 return bar( my_public(0));
43 #endif
44 }
45 CWRAP_DATA cwrap_cpp_example_1_cpp_bar_9 = {NULL};
46 CWRAP_DATA cwrap_cpp_example_1_cpp_baz_8 = {NULL};
47 CWRAP_DATA cwrap_cpp_example_1_cpp_bye_7 = {NULL};
48 CWRAP_DATA cwrap_cpp_example_1_cpp_bye_baz_6 = {NULL};
49 CWRAP_DATA cwrap_cpp_example_1_cpp_clean_up_5 = {NULL};
50 CWRAP_DATA cwrap_cpp_example_1_cpp_foo_foo_20 = {NULL};
51 CWRAP_DATA cwrap_cpp_example_1_cpp_foo_my_private_16 = {NULL};
52 CWRAP_DATA cwrap_cpp_example_1_cpp_foo_my_public_23 = {NULL};
53 CWRAP_DATA cwrap_cpp_example_1_cpp_foo_un_foo_21 = {NULL};
54 CWRAP_DATA cwrap_cpp_example_1_cpp_main_28 = {NULL};
- Note: Here the static function variables used by the struct constructor and destructor functions are duplicated, but without multiple-definition errors when linking, due to the extern weak attribute in the CWRAP_ENTRY() macro:
$ cat --number cpp-example-2.cpp.cwrap.2.cpp
1 #include <stdio.h>
2 #include "cpp-example-1.hpp.cwrap.2.hpp"
3
4 #ifdef __cplusplus
5 my_struct my_struct_1("my_struct_1");
6 #endif
$ cat --number cpp-example-2.hpp.cwrap.2.hpp
1 #ifdef __cplusplus
2 struct my_struct {
3 my_struct(const char *f ) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_my_struct_3, "cpp-example-2.hpp", "my_struct::my_struct", "3"); f_ = f; printf("- ""constructing my_struct: arg1:%s\n" , f_ ); }
4 my_struct(const char *f, const char *b) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_my_struct_4, "cpp-example-2.hpp", "my_struct::my_struct", "4"); f_ = f; printf("- ""constructing my_struct: arg1:%s arg2:%s\n", f_, b); }
5 ~my_struct( ) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_un_my_struct_5, "cpp-example-2.hpp", "my_struct::~my_struct", "5"); printf("- deconstructing my_struct: %s\n" , f_ ); }
6 const char *f_;
7 };
8
9 template <class my_type>
10 my_type get_max (my_type a, my_type b) {
11 return (a>b?a:b);
12 }
13 #endif
14 CWRAP_DATA cwrap_cpp_example_2_hpp_my_struct_my_struct_3 = {NULL};
15 CWRAP_DATA cwrap_cpp_example_2_hpp_my_struct_my_struct_4 = {NULL};
16 CWRAP_DATA cwrap_cpp_example_2_hpp_my_struct_un_my_struct_5 = {NULL};
- Unfortunately, CastXML only lists templates if they are used, whereas, e.g. it'll list structs whether they are used or not! [1].
- This means the above technique creates two different cpp-example-2.hpp.cwrap.2.hpp files; one for cpp-example-1.cpp and one for cpp-example-2.cpp.
- Unfortunately they are different, because only cpp-example-1.cpp actually uses the template, whereas cpp-example-2.cpp does not use the template:
$ perl cpp-xml-cpp.pl cpp-example-1.cpp
...
- cpp_file_local_dep: cpp-example-2.hpp
...
- line 2+0: context :: :: name my_struct : what Struct 1: struct my_struct {
- line 10+0: context :: :: name get_max : what Function 1: my_type get_max (my_type a, my_type b) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_get_max_10, "cpp-example-2.hpp", "get_max", "10");
- line 3+0: context my_struct :: name my_struct : what Constructor 1: my_struct(const char *f ) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_my_struct_3, "cpp-example-2.hpp", "my_struct::my_struct", "3"); f_ = f; printf("- ""constructing my_struct: arg1:%s\n" , f_ ); }
- line 4+0: context my_struct :: name my_struct : what Constructor 1: my_struct(const char *f, const char *b) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_my_struct_4, "cpp-example-2.hpp", "my_struct::my_struct", "4"); f_ = f; printf("- ""constructing my_struct: arg1:%s arg2:%s\n", f_, b); }
- line 5+0: context my_struct :: name my_struct : what Destructor 1: ~my_struct( ) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_un_my_struct_5, "cpp-example-2.hpp", "my_struct::~my_struct", "5"); printf("- deconstructing my_struct: %s\n" , f_ ); }
$ perl cpp-xml-cpp.pl cpp-example-2.cpp
...
- cpp_file_local_dep: cpp-example-2.hpp
...
- line 2+0: context :: :: name my_struct : what Struct 1: struct my_struct {
- line 3+0: context my_struct :: name my_struct : what Constructor 1: my_struct(const char *f ) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_my_struct_3, "cpp-example-2.hpp", "my_struct::my_struct", "3"); f_ = f; printf("- ""constructing my_struct: arg1:%s\n" , f_ ); }
- line 4+0: context my_struct :: name my_struct : what Constructor 1: my_struct(const char *f, const char *b) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_my_struct_4, "cpp-example-2.hpp", "my_struct::my_struct", "4"); f_ = f; printf("- ""constructing my_struct: arg1:%s arg2:%s\n", f_, b); }
- line 5+0: context my_struct :: name my_struct : what Destructor 1: ~my_struct( ) {CWRAP_ENTRY(cwrap_cpp_example_2_hpp_my_struct_un_my_struct_5, "cpp-example-2.hpp", "my_struct::~my_struct", "5"); printf("- deconstructing my_struct: %s\n" , f_ ); }
- This is a tricky and annoying issue to solve, and needs some brain-storming.
- Idea #1: Create a unique version of cpp-example-2.hpp for each source file that includes it.
- Cons: This might become messy. If 100+ source files include a particular header then we'll end up with 100+ instrumented versions of the header; many of them the same.
- Idea #2: Like idea #1 except we SHA256 each header to find the unique ones, which are the only versions to be persisted on disk.
- Cons: A header file full of templates might still result in very many instrumented versions of the header.
- Idea #3: Only have one version of the header file but incrementally update it as different source files using it get compiled.
- In the example above, the first source file processed updates the header and adds 4 CWRAP_ENTRY() macros, including the one for the template.
- If we somehow remembered those 4 macros added, then when it comes to processing the next source file, we'd only update the already instrumented header, and only if a new macro needs adding.
- Cons: We'd need to use the timestamp of the original header file to decide if the instrumented header version needs to start from scratch again. Is this enough to make it work as expected?
[1] CastXML behaviour: templates vs structs
Integrate this auto instrumentation technique with the latest cwrap script supporting C and coroutines
- Todo!