arfon/compiling.md

## compiling.md

      
Fixing problems when compiling programs

Most of us in science learn to write and build computer programs in an ad-hoc way, copying bits of Makefiles from other people, and using trial and error to change things.
That's fine when everything works, but when stuff starts to go wrong it's useful to have an idea of what steps happen during compilation, so you can figure out what's going wrong.  In this post I'll talk at a fairly high level about what happens when you compile code, and in a bit more detail about fixing problems you might see when doing so.
Interpreted vs Compiled

One of the most important distinctions between programming languages is between those that are interpreted and those that are compiled.  The purpose of a programming language is to translate from a human-readable form to a machine language.  Compiled languages, like C, C++, and Fortran, do this translation in advance, turning the whole program into a series of machine instructions before any of it is executed.  Interpreted languages, like Python, Ruby, Perl, and IDL, step through the program line by line, translating each one as they get to it.
The main advantage of compiled languages is runtime speed - doing everything in advance allows clever optimizations to be made in the translation.  The main advantage of interpreted ones is programmer ease and speed of writing - it's generally much easier to write interpreted programs, and they can be made very easy to use.
Most of the information in this note applies to compiled languages only.
Compiling & Linking

There are two steps to building programs from source code: compiling, and linking.
Compiling is translating the source code in each individual file into machine code which can be executed by the processor directly.
Linking is connecting together functions and data from multiple files, and figuring out how they fit together.  This might be because the user has their code in multiple files, or because they call an external library, or because they use some standard functions from the programming language.
For example, if your Fortran code calls the function "sqrt", then the compiler needs to go off to the Fortran standard library and find that sqrt function, so it can call it when the time comes.  Or, for example, if your code is in two different files, and the first one calls a function defined in the second, then the compiler needs to figure this out and link them together.
A typical compilation command for a C program that uses a library called "xyz" might look something like this:
gcc -o program program.c functions.c -O2 -I/home/myname/xyz/include -L/home/myname/xyz/lib -lxyz
Let's break this down into its parts.
gcc is the name of the compiler, the GNU C compiler.  You might also use a different compiler like icc or clang in different situations or on different machines.
-o program tells the compiler that the generated, compiled, output file should be called "program".  If you don't specify this it will use a default, which is usually "a.out".
program.c functions.c  This is the list of files to compile.
-O2 -I/home/myname/xyz/include  These are compilation flags.  The -O2 chooses how aggressively to optimize the code, and the -I flag specifies a directory to search for .h header files requested (by #include directives) in the C files.
-L/home/myname/xyz/lib -lxyz  These are linker flags.  They tell the compiler to look in the directory specified by -L (by convention a directory called "lib") for a file libxyz.a or libxyz.so (or sometimes libxyz.dylib on a Mac).  Any functions not found in the .c files are expected to be found in this library.
Compilation errors

Compilation errors are mostly specific to the particular programming language that you are using, but there are some general patterns that can affect all of them.
The first thing to do when trying to solve compilation errors is to scroll up to the top of the error messages you get. Very often one problem will trigger off a cascade of others and the first message is usually the one to look at.
The most common type of compilation error is a syntax error, which usually just means there is a typo or stray character somewhere in your code.  Unfortunately the error messages you get can often be wildly opaque for this kind of error, and you are better off just making note of the line number the error is reported at and looking at that line or the ones just before it to look for anything odd.
I won't talk in much detail about compilation errors here because they are much more diverse than linker errors.  The usual procedure when you find one you can't understand is just to google the text of it.
Libraries

Very simple codes may have nearly all their code in files that you, the author, have written, but in more complex programs you often use other people's libraries to perform specific tasks.  For example, you might use the FFTW library to run Fourier transforms, the LAPACK library to do matrix algebra, or many others.
Using libraries is essential to good programming - for any reasonably complicated but generic task it's likely someone else has already solved it computationally and shared the solution.
Incidentally - your job as a scientist who codes is not to write programs.  It's to write libraries.
Using Libraries

You (usually) need to tell your code about a library in two ways - once for the compilation phase and once for the linking.  A library usually consists of two different kinds of files.  The first is "headers", which are for the compiler, and end in .h for C, .mod for Fortran, and .h, .hpp, or .hh for C++.  The second is the actual "library", which is for the linker and ends in .a, .so, or .dylib.
Using Headers

For the compilation phase the compiler doesn't need to know the details of how the function is implemented - it just needs to know how to call the function - how many arguments it takes, what types they are (integer, real, etc.), and what kind of value (if any) it returns.  That's what the headers contain.
In C/C++ you tell the compiler about a library by using a directive like #include "fftw3.h", which tells the compiler to go and find the named file fftw3.h and paste it into the current file.
In Fortran you do it by writing "use fftw", which tells the compiler to look for a file called fftw.mod and use the definitions in it.
In both cases you also need to tell the compiler on the command line which directories to search for these header files.  That's done with the flag "-I/name/of/directory".  By convention, these header files are stored in directories called "include".
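As a rough sketch of that split, here is what a header, a caller, and a library source might contain.  The names xyz_mean and first_minus_mean are made up for illustration, and the three pieces are shown in one listing only to keep it short; in a real project they would be three separate files.

```c
/* --- xyz.h (hypothetical header): what the compiler needs --- */
/* A declaration: the name, argument types, and return type, but no body. */
double xyz_mean(const double *values, int count);

/* --- program.c: compiled using only the declaration above --- */
double first_minus_mean(const double *values, int count)
{
    return values[0] - xyz_mean(values, count);
}

/* --- xyz.c, built into libxyz: what the linker needs --- */
/* The definition: the actual code that runs when xyz_mean is called. */
double xyz_mean(const double *values, int count)
{
    double total = 0.0;
    for (int i = 0; i < count; i++) {
        total += values[i];
    }
    return count > 0 ? total / count : 0.0;
}
```

If the definition at the bottom were missing, the compile step for program.c would still succeed, because the declaration is all the compiler asks for.  The failure would only show up at the link step, as an undefined symbol for xyz_mean.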
Using Libs

When you get to the linking stage the compiler actually needs to know how to run the functions you're looking for - what actual machine code to execute when a particular function is called.  It finds this in the "library" files.  You tell the compiler where to find these libraries, and what libraries to use, using the flags "-L/name/of/directory -lname_of_library", which make the linker look for a file called libname_of_library.a, libname_of_library.so, or libname_of_library.dylib.
Static & Dynamic Libraries

There are two ways that other libraries can be linked into your code: static and dynamic.  In static linking, when the program is compiled all the different functions and data from all the different libraries are gathered together and put into the program you execute.  In dynamic linking the libraries are not put into the executable - instead the system waits until the program is actually run, and then the other library files are found and the functions connected together.  Libraries used this way are called shared libraries.
Traditionally the main advantage of shared libraries was that if multiple executables used the same library you didn't waste loads of disc space putting a copy of it in every single one - they all shared the same copy.  Nowadays that's less of an issue, but there are still some important advantages to shared libs.
One is that you can (with care) replace the library that a program is linked to without recompiling the program.  If someone comes out with a new FFTW, for example, you can just replace your copy of it and all the programs that link to it will use the new version (a warning: this can absolutely go horribly, horribly wrong if the library changes too much!).  Another important feature is that dynamic languages like Python can only load code from shared libraries, so if you want to wrap your fast C code for use in Python then you need to use this method.
Linker Errors

A problem in linking usually generates a very specific kind of error message.  Here's an example from C:
Undefined symbols for architecture x86_64:
"_gsl_integration_glfixed", referenced from:
_shear_shear_kernel in ccY8qjvH.o
"_gsl_integration_glfixed_table_alloc", referenced from:
_shear_shear_kernel in ccY8qjvH.o
"_gsl_integration_glfixed_table_free", referenced from:
_shear_shear_kernel in ccY8qjvH.o
"_gsl_interp_accel_alloc", referenced from:
_shear_shear_kernel in ccY8qjvH.o
"_gsl_interp_accel_free", referenced from:
_shear_shear_kernel in ccY8qjvH.o
Here's another example from a Fortran program:
Undefined symbols for architecture x86_64:
"___amlutils_MOD_fileexists", referenced from:
___camb_interface_tools_MOD_camb_initial_setup in ccA2l81i.o
"___camb_MOD_camb_cleanup", referenced from:
cleanup in ccyVOXic.o
"___camb_MOD_camb_getcls", referenced from:
___camb_interface_tools_MOD_camb_interface_save_cls in ccA2l81i.o
"___camb_MOD_camb_getresults", referenced from:
execute_thermal in ccyVOXic.o
execute_all in ccyVOXic.o
execute_cmb in ccyVOXic.o
"___camb_MOD_camb_setdefparams", referenced from:
___camb_interface_tools_MOD_camb_interface_set_params in ccA2l81i.o
This is telling you that some function that's used in your code has not been found in any of the files you want to compile, nor in any of the libraries that you told the compiler about.
The thing in the quotation marks is the name of the function it can't find, like "_gsl_integration_glfixed" in the first example.  One confusion here is that sometimes names can be mangled by the system a bit - there are nearly always some extra underscores added somewhere, and in the second Fortran example the name of the module the function belongs to has been squashed in there with the word "MOD" too.  This is compiler-dependent, and particularly horrific with C++, where the names could become anything.
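To make the Fortran case concrete, here is a small sketch of a C helper that pulls a symbol like the ones above apart into its module and procedure names.  It assumes the pattern seen in the example output (leading underscores, then the module name, then "_MOD_", then the procedure name), which is what gfortran produces; other compilers mangle names differently.  The function name split_fortran_symbol is invented for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Split a symbol such as "___camb_MOD_camb_getcls" into its module
 * ("camb") and procedure ("camb_getcls") parts.
 * Assumes the <underscores><module>_MOD_<procedure> pattern.
 * Returns 1 on success, or 0 if the symbol has no "_MOD_" marker or the
 * output buffers are too small. */
int split_fortran_symbol(const char *symbol,
                         char *module, size_t module_size,
                         char *procedure, size_t procedure_size)
{
    const char *marker = strstr(symbol, "_MOD_");
    if (marker == NULL) {
        return 0; /* not a module procedure, e.g. a plain C symbol */
    }

    /* Skip the extra leading underscores added by the system. */
    while (symbol < marker && *symbol == '_') {
        symbol++;
    }

    size_t module_length = (size_t)(marker - symbol);
    const char *procedure_start = marker + strlen("_MOD_");

    if (module_length + 1 > module_size ||
        strlen(procedure_start) + 1 > procedure_size) {
        return 0; /* output buffers too small */
    }

    memcpy(module, symbol, module_length);
    module[module_length] = '\0';
    strcpy(procedure, procedure_start);
    return 1;
}
```

Given "___camb_interface_tools_MOD_camb_initial_setup" this splits at the first "_MOD_", giving the module camb_interface_tools and the procedure camb_initial_setup, which is usually enough to tell you which source file or library should have provided the missing function.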
Solving Linker Errors

When you see an error like the one above, one of these things might be wrong:

a function name might be spelled wrong in code so it's not found
you may have not included a library on the command line
the library you included may not have been found
the library may have been found but for some reason be invalid, or the wrong version
the library may be found but be a different version so that the name you are looking for is not included in it
(Fortran only) An expression like x(4) in Fortran can mean either the 4th element of an array called x or a call to a function called x.  You may have forgotten to declare an array with the given name.