brfitzpatrick/R2PBS.md

## R2PBS.md

      
How to submit R code to a Portable Batch System (PBS) managed High Performance Computing (HPC) facility (e.g. QUT's HPC facility 'Lyra').

You will need:


An account with the relevant HPC facility (QUT staff and HDR students can request access to the QUT facility here),
the R Script (.R file) you want to run,
a Job Script (.sub file) that tells Lyra how to run your R Script, and
a file containing any data your R Script needs to run.

Note:
These instructions work on a system running Debian (GNU + Linux) with the Gnome Desktop Environment.
The main difference for MS Windows and MacOS users is how you connect to the HPC File Store; instructions on how to do that are here (that page also hosts several useful guides on topics relevant to using the HPC facilities at QUT).
MS Windows users wishing to use a secure shell to interact with a PBS system have the option of downloading and installing the terminal emulation software PuTTY.
Step 1 - creating your .sub file

Example .sub file to run an R script:
#!/bin/bash -l
#PBS -N job_name_here
#PBS -l walltime=1:00:00
#PBS -l select=1:ncpus=1:mem=8G
#PBS -j oe

module load r/3.3.1-foss-2016a

cd $PBS_O_WORKDIR

R --file=/home/username/your_R_script.R

The lines of this .sub file that you need to understand for submitting simple R jobs that run in serial are explained below. The remaining lines you can copy and paste unchanged into your .sub file without worrying about them too much at this stage.
#PBS -l walltime=1:00:00 requests 1 hour of processing time ('walltime').
Your job will be killed once this time limit is reached, regardless of whether or not it has finished running your script.  Thus it is good practice to try to predict how long your job will take to run and request that amount of walltime.
For instance, if your job involves iterative calculations, run a few of these on your machine (if possible) to get an average time per iteration.  The CPU on your computer will likely run at a different speed to those of the QUT HPC nodes, so once you have an estimate of how long an iteration takes on your computer, submit a test job to Lyra that also runs only a few iterations and request your best estimate of the required walltime.  You can then use the qjobs -x command (more on this below) to see how long your job took to run on Lyra, calculate an average time per iteration and forecast the walltime required for your full job accordingly.
#PBS -l select=1:ncpus=1:mem=8G requests that your job be run on a single CPU of a single node with 8 GB of RAM. If possible, it is worth testing your job on your local machine to determine how much RAM it requires, as you will encounter errors if your compute job attempts to use more RAM than you have requested.
module load r/3.3.1-foss-2016a loads R version 3.3.1 (see the section at the end of this guide for how to check which versions of R are currently available on Lyra).
R --file=/home/username/your_R_script.R uses the version of R loaded by the line above to run the R script located at the file path /home/username/your_R_script.R on the HPC filestore (Step 2 details how to copy files across to the HPC filestore).
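
The walltime forecast described above is simple arithmetic, and a minimal shell sketch of it might look like the following. The function name forecast_walltime and the 20% safety margin are assumptions made for illustration; they are not part of PBS or of Lyra's tooling.

```shell
#!/bin/bash
# Hypothetical helper (not a PBS command): turn a timing test into a
# walltime string of the form H:MM:SS for the "#PBS -l walltime=" line.
# Arguments: whole seconds per iteration, number of iterations,
#            optional safety margin in percent (assumed default: 20).
forecast_walltime() {
    local secs_per_iter=$1 iterations=$2 margin=${3:-20}
    # Integer arithmetic, rounded up to the next whole second.
    local total=$(( (secs_per_iter * iterations * (100 + margin) + 99) / 100 ))
    printf '%d:%02d:%02d\n' \
        $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

# A test job averaged 3 s per iteration; the full job needs 1000 iterations.
forecast_walltime 3 1000    # prints 1:00:00
```

The margin is there because the job is killed at the limit, so a slight over-request is usually cheaper than a rerun.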
MS Windows users, your .sub file must use Unix/Linux/OSX style line endings.
You can make such a file with Notepad++ (or any good text editor) by activating the relevant setting e.g. in Notepad++:
Settings -> Preferences -> New Document -> Format (Line Ending) Unix/OSX.
Alternatively you can convert a .sub file with MS Windows style line endings to a .sub file with Unix/OSX style line endings with the dos2unix command line tool on Lyra itself.
Type man dos2unix when logged into Lyra via ssh to read about how to do this.
If you are curious about the difference between these two styles of line endings a succinct explanation can be found here.
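
As a rough illustration of the line-ending issue, the sketch below builds a small file with MS Windows style (CRLF) line endings, detects them and strips them with tr, a standard tool used here as a stand-in for dos2unix. The function names and the demo file are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; on Lyra, dos2unix does the same job.

# A single carriage return (CR) character.
cr=$(printf '\r')

# Succeeds if the file contains at least one carriage return.
has_crlf() { grep -q "$cr" "$1"; }

# Remove every carriage return, leaving Unix style (LF only) endings.
strip_crlf() {
    local tmp
    tmp=$(mktemp) || return 1
    tr -d '\r' < "$1" > "$tmp" && mv "$tmp" "$1"
}

# Demo: write a two-line job script fragment with CRLF endings.
demo_sub=$(mktemp)
printf '#!/bin/bash -l\r\n#PBS -N job_name_here\r\n' > "$demo_sub"

has_crlf "$demo_sub" && echo "CRLF found, converting"
strip_crlf "$demo_sub"
has_crlf "$demo_sub" || echo "file now has Unix line endings"
```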
Step 2 - copy your files to the HPC Filestore

This step sends your Job Script (.sub file), the R Script (.R file) referenced in the Job Script and any required data file(s) to the HPC File Store.  MS Windows and MacOS users: instructions on how to connect to the HPC File Store from your machine and copy files back and forth may be found here.
Under GNU+Linux with a GNOME based GUI:


Open the Nautilus File Browser


Click Connect to Server


Server Address smb://hpc-fs.qut.edu.au/username
(use your QUT Username, the same one you use to log into your QUT Webmail)
Domain QUTAD
Username your_QUT_Username
Password your_QUT_password

Copy files to your directory on the HPC Filestore (i.e. copy your .sub file, your .R file and any .Rdata or .csv files of data you need).

Note: if your .R file needs to load some data you will need to copy this across to the HPC file store and have a load( ) or read.table( ) line in your .R file that specifies the location of the data on the HPC filestore with a filepath something like /home/your_qut_username/where_ever_you_put_the_data_file.
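
Before moving on, it can help to confirm that everything the job needs actually arrived in one place. The following is a minimal sketch of such a check; the function name check_job_files and the example file names are hypothetical.

```shell
#!/bin/bash
# Hypothetical pre-submission check: report any missing file.
# Usage: check_job_files DIRECTORY FILE...
check_job_files() {
    local dir=$1 missing=0 f
    shift
    for f in "$@"; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Demo in a scratch directory standing in for your HPC home directory.
job_dir=$(mktemp -d)
touch "$job_dir/your_sub_file.sub" "$job_dir/your_R_script.R"

if check_job_files "$job_dir" your_sub_file.sub your_R_script.R data.csv; then
    echo "all files present"
else
    echo "copy the missing files before submitting"
fi
```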
Step 3 - Use a secure shell to log into the HPC facility and submit your .sub file to the PBS system

Open a terminal (on Windows open PuTTY).
Log into Lyra via a secure shell.  If you're on campus you just need to be connected to the network.  If you're off campus you need to be using the QUT Virtual Private Network (VPN), e.g. with the QUT endorsed Cisco AnyConnect VPN (have a look at the IT Helpdesk pages on this).  The VPN also enables you to connect to the HPC filestore from off campus.
Log in:
ssh -l your_qut_username lyra.qut.edu.au

Enter your QUT password (the same one you use to log into your QUT webmail).
Set the working directory to wherever you copied your .sub, .R and data files.
cd /home/username/where_ever_you_put_files

Submit your job to the queue:
qsub your_sub_file.sub

You can check the progress of your active jobs with:
qjobs

If you have requested sufficient time for your R script to run and it runs without errors, any results it writes out should appear in the current directory (unless you have changed directories in your .R file):
/home/your_qut_username/where_ever_you_put_files

A copy of the terminal output of running your script will also be written to this directory (this is very useful for debugging your jobs).
Once your job has completed you can copy your results back to your machine with Nautilus (or whatever you are using to access the HPC File Store).
You can also view information on completed jobs with the command qjobs -x.  This will output information such as the amount of RAM used over the duration of a job and the overall CPU utilisation given as a percentage.  If you see that the CPU utilisation for a completed job is less than 50% you could well benefit from the advice of the HPC Support Team on optimising your code.
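
The submission step can be wrapped in a small guard so that obvious mistakes are caught before anything reaches the queue. This is only a sketch: submit_job is a hypothetical function name, and the only PBS command it calls is qsub, as used above.

```shell
#!/bin/bash
# Hypothetical wrapper around qsub (illustrative, not part of PBS).
submit_job() {
    local sub_file=$1
    if [ ! -f "$sub_file" ]; then
        echo "no such job script: $sub_file" >&2
        return 1
    fi
    if ! command -v qsub >/dev/null 2>&1; then
        echo "qsub not found: log into the HPC facility first" >&2
        return 1
    fi
    # qsub prints the identifier of the newly queued job.
    qsub "$sub_file"
}

# Away from the HPC facility this reports the problem instead of queueing.
submit_job your_sub_file.sub || echo "job was not submitted"
```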
Checking the versions of R currently available for use on Lyra

Log into Lyra with ssh as above.
Issue the module avail r command.
The output should look something like this:
---------------------------------------------- /pkg/suse12/modules/all -----------------------------------------------
r/3.3.1-foss-2016a                 raxml/8.2.9-foss-2016a-hybrid-avx2 rsem/1.2.30-foss-2016a
r/3.3.1-intel-2016b                renderproto/0.11-foss-2016a
raxml/8.2.9-foss-2016a-hybrid-avx  renderproto/0.11-intel-2015b

--------------------------------------------- /pkg/suse12/modules/devel ----------------------------------------------
renderproto/0.11-foss-2016a  renderproto/0.11-intel-2015b

---------------------------------------------- /pkg/suse12/modules/lang ----------------------------------------------
r/3.3.1-foss-2016a  r/3.3.1-intel-2016b

---------------------------------------------- /pkg/suse12/modules/bio -----------------------------------------------
raxml/8.2.9-foss-2016a-hybrid-avx  raxml/8.2.9-foss-2016a-hybrid-avx2 rsem/1.2.30-foss-2016a

For standard use of R use the versions that end in -foss-2016a (these have been compiled with the GNU Compiler Collection). Versions of R that end in -intel-2016b have been compiled with Intel compilers.
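
If the listing is long, the R entries can be picked out mechanically. The sketch below works on a saved copy of the output shown above rather than calling module directly, so that it runs anywhere; list_r_modules is a hypothetical helper name. On many systems module avail writes to standard error, so piping it live may need 2>&1, which is an assumption worth checking on Lyra.

```shell
#!/bin/bash
# Hypothetical helper: read module names on standard input and print
# the unique entries that belong to R (names starting with "r/").
list_r_modules() {
    tr -s ' ' '\n' | grep '^r/' | sort -u
}

# Two lines copied from the sample listing above.
sample='r/3.3.1-foss-2016a                 raxml/8.2.9-foss-2016a-hybrid-avx2 rsem/1.2.30-foss-2016a
r/3.3.1-intel-2016b                renderproto/0.11-foss-2016a'

printf '%s\n' "$sample" | list_r_modules
# prints:
# r/3.3.1-foss-2016a
# r/3.3.1-intel-2016b
```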
If you need to find the model number of the CPU your job ran on, execute qjobs -x and find the Host/Array/GPU/mics entry for your job.  It will be something like cl2n098/0*0.
Pass the node name to pbsnodes, e.g. for node cl3n004:
pbsnodes cl3n004 | grep cputype
resources_available.cputype = E5-2680v3,avx,avx2
This output tells you that cl3n004 (Cluster 3 Node 4) has an E5-2680 v3 CPU.
You can then Google this model number to discover its clock speed.
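
The two lookups above can be scripted. The sketch below works on the example strings quoted in this guide instead of calling qjobs or pbsnodes, so it can be tried on any machine; node_from_host and cpu_model are hypothetical helper names, and the assumption that the node name is the text before the slash is inferred from the example entry.

```shell
#!/bin/bash
# Hypothetical helpers for illustration.

# Keep the text before the first "/" of a Host/Array/GPU/mics entry.
node_from_host() { printf '%s\n' "${1%%/*}"; }

# Read pbsnodes output on standard input and print the CPU model,
# i.e. the first comma-separated field of the cputype line.
cpu_model() {
    sed -n 's/^ *resources_available\.cputype *= *\([^,]*\).*/\1/p'
}

node_from_host 'cl2n098/0*0'    # prints cl2n098
echo 'resources_available.cputype = E5-2680v3,avx,avx2' | cpu_model    # prints E5-2680v3

# On Lyra the live equivalent would be: pbsnodes cl3n004 | cpu_model
```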
Shared Memory Parallel Computing on a Single Node of the HPC Cluster

R includes a variety of packages for parallel computing summarised on the CRAN HPC Task View here.

In this example I will use the doMC package for parallel computing.
To use doMC you need to write your .sub file slightly differently:
#!/bin/bash -l
#PBS -N a_name_for_your_job
#PBS -l walltime=20:00:00
#PBS -l select=1:ncpus=16:mem=120G
#PBS -j oe

cd $PBS_O_WORKDIR

module load r/3.3.1-foss-2016a
export MC_CORES=16
export OMP_NUM_THREADS=1

R CMD BATCH --slave /home/username/your_R_script.R your_R_terminal_output.out


export MC_CORES=16 sets the environment variable MC_CORES, which R's parallel package uses to initialise the mc.cores option that we read in the .R script.
You must set MC_CORES to the number you supplied to ncpus in the .sub file line
#PBS -l select=1:ncpus=16:mem=120G
We use this value to inform R how many 'cores' are available to it, which in turn is the number of parallel processes R can run.
The Lyra nodes each have 16 or more 'cores' (actually they have 8 or more physical cores each, and each core has hyperthreading, which allows it to efficiently execute 2 'cores' worth of work).
As we are using parallel computing at the R level we need to ensure that external libraries called by R do not attempt to use parallelism.  This is the function of the line:
export OMP_NUM_THREADS=1
For parallel computing with the doMC package we run the R script non-interactively in batch mode.  This is achieved with the line:
R CMD BATCH --slave /home/username/your_R_script.R your_R_terminal_output.out
Your R script will need to load the doMC package with require(doMC) and set the number of 'cores' doMC uses to the value we stored in MC_CORES:
registerDoMC(cores = getOption("mc.cores", 2L))
#PBS -j oe merges the job's terminal output and errors into a single file, while R CMD BATCH writes the output of R itself to the file named in its last argument, in this case the following file in the working directory:
your_R_terminal_output.out
For a more comprehensive example of parallel computing with R on the QUT HPC Lyra please see Marcela's example here.
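
Because MC_CORES must match ncpus, a quick consistency check on the .sub file can save a wasted job. The following is a minimal sketch; check_cores is a hypothetical helper, and it assumes the .sub file contains one select line and one export MC_CORES line written as in the example above.

```shell
#!/bin/bash
# Hypothetical helper: compare ncpus in the "#PBS -l select=" line with
# the value exported as MC_CORES in the same job script.
check_cores() {
    local sub_file=$1 ncpus mc_cores
    ncpus=$(sed -n 's/^#PBS -l select=.*ncpus=\([0-9][0-9]*\).*/\1/p' "$sub_file")
    mc_cores=$(sed -n 's/^export MC_CORES=\([0-9][0-9]*\).*/\1/p' "$sub_file")
    if [ -n "$ncpus" ] && [ "$ncpus" = "$mc_cores" ]; then
        echo "ok: ncpus and MC_CORES are both $ncpus"
    else
        echo "mismatch: ncpus=$ncpus MC_CORES=$mc_cores" >&2
        return 1
    fi
}

# Demo with a cut-down copy of the job script above.
demo_sub=$(mktemp)
printf '%s\n' '#PBS -l select=1:ncpus=16:mem=120G' 'export MC_CORES=16' > "$demo_sub"
check_cores "$demo_sub"    # prints ok: ncpus and MC_CORES are both 16
```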
Installing an R Package for personal use on the HPC System


Download the package source from CRAN e.g. ranger_0.6.0.tar.gz
Copy the package source across to the HPC filestore
Add the command to install the package to your .sub file (note the R CMD INSTALL command must come after R has been loaded and before the command to execute the R script that loads the package).

#!/bin/bash -l
#PBS -N ranger_test
#PBS -l walltime=00:10:00
#PBS -l select=1:ncpus=1:mem=8G
#PBS -j oe

cd $PBS_O_WORKDIR

module load r/3.3.1-foss-2016a 

R CMD INSTALL -l /home/username/pkgs /home/username/ranger_0.6.0.tar.gz

R --file=/home/username/ranger_install_run.R


The command
R CMD INSTALL -l /home/username/pkgs /home/username/ranger_0.6.0.tar.gz
installs the package ranger from the source file located at /home/username/ranger_0.6.0.tar.gz to the location /home/username/pkgs.
To load the newly installed package in your .R script, use the lib.loc argument in your library( ) command to load the package from the location to which you have installed it on the HPC filestore, e.g.
library('ranger', lib.loc = '/home/username/pkgs/')
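
One detail worth guarding against: R CMD INSTALL generally expects the target library directory to exist already, so creating it first avoids a failed job. The sketch below also sets the R_LIBS environment variable, which R consults when searching for packages, as an alternative to passing lib.loc. The scratch directory stands in for /home/username/pkgs and the variable names are illustrative assumptions.

```shell
#!/bin/bash
# Stand-in for /home/username/pkgs so the sketch runs anywhere.
lib_dir="${TMPDIR:-/tmp}/pkgs_demo"
pkg_src="/home/username/ranger_0.6.0.tar.gz"

# Create the library directory if it is not there yet.
mkdir -p "$lib_dir"

# Put the personal library first on R's package search path.
export R_LIBS="$lib_dir${R_LIBS:+:$R_LIBS}"

# Install only where R and the package source are actually available.
if command -v R >/dev/null 2>&1 && [ -f "$pkg_src" ]; then
    R CMD INSTALL -l "$lib_dir" "$pkg_src"
else
    echo "skipping install: R or $pkg_src not available here"
fi
```

In a real .sub file the mkdir and export lines would sit after module load and before R CMD INSTALL.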