Environment Modules is a utility that has been used to manage executables and paths on high-performance computing clusters for decades (since 1991!). The basic idea is that you use modules to adapt your processing environment (and $PATH) so that it stays consistent. Importantly, this lets system administrators install and maintain multiple versions of software for different users. The same tool can be used to manage your bioinformatics processing pipelines and ensure consistent analysis runs. For example, if you have a set of samples that need to be analyzed consistently over a long time span, you can use modules to make sure that the same version of a program is used throughout the entire experiment, while still letting you use newer versions for other experiments.
If you are running your samples on a computing cluster, chances are you are already using modules to configure your environment (add programs to your path, etc). I'll first discuss installing modules from scratch using a FreeBSD system as the example. Next, I'll discuss using your own modules in the context of an existing cluster setup. Finally we'll look at how to add new modules to keep track of versions of your tools.
Environment Modules is likely available in your default OS package repository. For FreeBSD, it is in ports under the name "sysutils/modules", or as a downloadable package under the name "modules". For CentOS/RHEL, it is part of the EPEL repository under the name "environment-modules". I will assume that you have installed the software from the OS repository. If you need source-level installation instructions, see here: http://www.admin-magazine.com/HPC/Articles/Environment-Modules or https://github.com/hpcugent/easybuild/wiki/Installing-environment-modules-without-root-permissions
For FreeBSD, your command would be: pkg install modules
For CentOS (with EPEL enabled), you'd run: yum install environment-modules
The CentOS/RHEL EPEL package will take care of shell initialization for you. If you are using FreeBSD, you'll need to make sure that modules are enabled in your shell's initialization yourself. You can do this on a per-user basis or for all users of a system. Basically, you'll need to add this line to your $HOME/.bashrc file (or the appropriate file for your shell):
source /usr/local/Modules/$VERSION/init/bash
There is a version of this for each type of shell, so choose the one for the shell you are using. Replace the path with the correct path for your installation.
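If you'd like that line to be a bit more defensive, here is a minimal sketch: it only sources the init script when the file is actually readable, so a wrong path doesn't break every new shell. The helper name source_modules_init is my own placeholder (it is not something the package provides), and the path in the example call is the same one used above.

```shell
# source_modules_init FILE
# Source the Modules init script only if it is readable.
# (Hypothetical helper name; FILE depends on your installation.)
source_modules_init() {
    if [ -r "$1" ]; then
        . "$1"
    else
        echo "modules init script not found: $1" >&2
        return 1
    fi
}

# Example call for $HOME/.bashrc (substitute your real version directory):
# source_modules_init "/usr/local/Modules/$VERSION/init/bash"
```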
At this point, you can test whether or not the installation worked by logging out, logging back in, and running module. You should see a list of possible commands. To see which modules are currently available, you can run module avail.
If you want to use a specific path for your managed modules for all users, you'll want to edit the file $MODULEHOME/$MODULE_VERSION/init/.modulespath. To add a new directory for modules, just add the pathname to the end of this file.
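As a rough sketch of that edit, the function below appends a directory to a .modulespath-style file only if it isn't already listed. It assumes the file holds one directory per line; add_modulespath is a name I made up for illustration, and you would normally need root permissions to touch the system-wide file.

```shell
# add_modulespath FILE DIR
# Append DIR to FILE unless an identical line is already present.
# (Hypothetical helper; assumes one directory per line in FILE.)
add_modulespath() {
    file=$1
    dir=$2
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq -- "$dir" "$file" 2>/dev/null; then
        printf '%s\n' "$dir" >> "$file"
    fi
}

# Example call (as root):
# add_modulespath "$MODULEHOME/$MODULE_VERSION/init/.modulespath" /opt/modulefiles
```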
If you are trying to set up a personal path without root permissions (such as on an existing HPC cluster), you can use the command module use $HOME/.local/mysoftware. This can be set up in your shell initialization or in the file $HOME/.modulerc:
#%Module
# .modulerc is read as Tcl, so the home directory comes from $env(HOME)
module use $env(HOME)/.local/mysoftware
It's important to note that this custom path isn't necessarily the path that contains all of your programs. Rather, this path contains the module definitions that tell the system what programs/versions are available for loading. You can actually put the programs themselves in another location.
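For the shell-initialization route, something like the following sketch could go in your $HOME/.bashrc. It creates the directory if needed and only calls module use when the module command is actually defined in the current shell. The function name use_personal_modules is a placeholder of mine.

```shell
# use_personal_modules DIR
# Create DIR if needed, then add it to the module search path.
# (Hypothetical helper name.)
use_personal_modules() {
    mkdir -p "$1" || return 1
    # 'module' is normally a shell function defined by the init script;
    # warn and carry on if it is not available in this shell.
    if command -v module >/dev/null 2>&1; then
        module use "$1"
    else
        echo "module command not found; created $1 only" >&2
    fi
}

# Example call:
# use_personal_modules "$HOME/.local/mysoftware"
```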
Note: If you can't see your custom module, perhaps one of the earlier entries in $MODULEHOME/$MODULE_VERSION/init/.modulespath is causing an error. In the stock FreeBSD setup, this file contains /usr/local/lib, which causes the program to segfault. If you comment out this line, the program will keep going and show your custom module path.
To test that your custom path is configured properly, copy the null module file from the primary module path (check $MODULEHOME/$MODULE_VERSION/modulefiles) to your custom path. This is an empty module definition file that can be used to demonstrate that your custom path is working.
Once you've added the null module, run module avail. You should see your new custom module path in the list, with the null module listed under it.
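The copy step can be sketched as a small function that refuses to run if the stock null file isn't where you expect it. The name copy_null_module is hypothetical, and both directories in the example call are the ones used in this post, so adjust them to your setup.

```shell
# copy_null_module SRC_DIR DEST_DIR
# Copy the stock 'null' modulefile from SRC_DIR into your custom path.
# (Hypothetical helper name.)
copy_null_module() {
    if [ ! -f "$1/null" ]; then
        echo "no null modulefile in $1" >&2
        return 1
    fi
    mkdir -p "$2" && cp "$1/null" "$2/null"
}

# Example call, followed by the check:
# copy_null_module "$MODULEHOME/$MODULE_VERSION/modulefiles" "$HOME/.local/mysoftware"
# module avail
```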
The first module that we will add is a simple one that will set the directory that will store our programs. Here, I've chosen the following directory setup:
$HOME/.local/modules
$HOME/.local/modules/modulefiles
All of the programs will go into $HOME/.local/modules, whereas all of the definition files will be in $HOME/.local/modules/modulefiles. The definition directory has been added to the module path with module use $HOME/.local/modules/modulefiles
For example, for a program named 'foo' with a version of 0.1.2, all of the program's files would go in $HOME/.local/modules/foo/foo-0.1.2, and the definition file would be named $HOME/.local/modules/modulefiles/foo/0.1.2.
The program file directory can be set up any way that you'd like. There is a lot of flexibility in how you set up each module. What I'm going to do is have the following folders: bin, man, and build. Binary files go in bin, man pages go in man, and we will use build to store the downloaded source code and intermediate files.
$HOME/.local/modules/foo
$HOME/.local/modules/foo/foo-0.1.2
$HOME/.local/modules/foo/foo-0.1.2/bin
$HOME/.local/modules/foo/foo-0.1.2/build
$HOME/.local/modules/foo/foo-0.1.2/man
$HOME/.local/modules/modulefiles/foo/0.1.2
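The layout above can be created in one go with mkdir -p. This is just a sketch of how I'd script it; make_module_tree is a made-up helper name, and the definition file itself still has to be written separately.

```shell
# make_module_tree BASE NAME VERSION
# Create BASE/NAME/NAME-VERSION/{bin,build,man} and BASE/modulefiles/NAME.
# (Hypothetical helper name.)
make_module_tree() {
    prog_dir="$1/$2/$2-$3"
    mkdir -p "$prog_dir/bin" "$prog_dir/build" "$prog_dir/man" \
             "$1/modulefiles/$2"
}

# Example call:
# make_module_tree "$HOME/.local/modules" foo 0.1.2
```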
If you use a directory structure like this, the only tricky part is the definition file $HOME/.local/modules/modulefiles/foo/0.1.2.
Here is an example of what that file could look like. This example simply adds $HOME/.local/modules/foo/foo-0.1.2/bin to your $PATH (and the matching man directory to your $MANPATH). Module files are Tcl scripts, so the home directory is read from $env(HOME) rather than $HOME:
#%Module1.0#################################################################
##
## foo-0.1.2
##
proc ModulesHelp { } {
    global version
    puts stderr "\tfoo is a program"
    puts stderr "\n\tVersion $version\n"
}
module-whatis "Foo does something..."
# only one version of foo can be loaded at a time
conflict foo
# for Tcl script use only
set version "0.1.2"
prepend-path PATH $env(HOME)/.local/modules/foo/foo-0.1.2/bin
prepend-path MANPATH $env(HOME)/.local/modules/foo/foo-0.1.2/man
# report what happened when the module is loaded, switched to, or removed
if [ module-info mode load ] {
    puts stderr "foo version $version loaded."
}
if [ module-info mode switch2 ] {
    puts stderr "foo version $version loaded."
}
if [ module-info mode remove ] {
    puts stderr "foo version $version unloaded."
}
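Since every new version of a tool needs its own definition file, it can be handy to generate a bare-bones one from a template. The sketch below writes a stripped-down definition file (no help text or load messages) for a given name and version. write_modulefile is a hypothetical helper of mine, and it assumes the directory layout used in this post and a base path without spaces.

```shell
# write_modulefile BASE NAME VERSION
# Write BASE/modulefiles/NAME/VERSION pointing at BASE/NAME/NAME-VERSION.
# (Hypothetical helper; assumes BASE contains no spaces.)
write_modulefile() {
    base=$1
    name=$2
    version=$3
    root="$base/$name/$name-$version"
    mkdir -p "$base/modulefiles/$name" || return 1
    # The shell expands $name, $version and $root before the file is written.
    cat > "$base/modulefiles/$name/$version" <<EOF
#%Module1.0
## $name-$version (generated)
module-whatis "$name version $version"
conflict $name
prepend-path PATH $root/bin
prepend-path MANPATH $root/man
EOF
}

# Example call:
# write_modulefile "$HOME/.local/modules" foo 0.1.2
```

Once the file is in place, module load foo/0.1.2 should put that version on your $PATH, module list shows what is currently loaded, and module unload foo removes it again.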