iaradsouza1/Nextflow_on_NPAD_UFRN.md Secret

## Nextflow_on_NPAD_UFRN.md

      
This is a basic guide to the configuration needed to run Nextflow pipelines at the NPAD.
I'm assuming that your analysis will need more than 9 GB (your home quota), so you'll need to use the scratch disk to store the results and the singularity images.

0. Install conda and nf-core

I recommend using conda to install Nextflow and nf-core.
Download and install conda: https://docs.conda.io/en/latest/miniconda.html
Install nf-core with conda: https://nf-co.re/docs/usage/installation#bioconda-installation
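
Once the installation is done, a quick check that the tools are reachable can save a failed run later. The sketch below is only an illustration: `missing_tools` is a hypothetical helper, and the commented install commands assume an environment named `nf-core` built from the conda-forge and bioconda channels, so check them against the linked docs before use.

```shell
# Possible install commands (assumption: run once, by hand; verify against the linked docs):
#   conda create --name nf-core -c conda-forge -c bioconda nextflow nf-core
#   conda activate nf-core

# missing_tools: print each given command name that is not found on PATH.
# (Hypothetical helper, not part of conda, Nextflow or nf-core.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Prints nothing when all three commands are available.
missing_tools conda nextflow nf-core
```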

1. Create a directory to be used as singularity cache in your scratch

Complex pipelines need several singularity images, and you have to download them beforehand into a cache directory. Create that directory on the scratch disk:
mkdir /scratch/iddsouza/singularity_images
mkdir /scratch/iddsouza/singularity_images/cache
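
The same step can be wrapped in a small function that takes the scratch base as an argument. This is useful because this guide refers to scratch both as `/scratch/iddsouza` and as `/home/iddsouza/scratch`; which one applies to your account is something to confirm on the cluster. `make_singularity_dirs` is a hypothetical name used only for this sketch.

```shell
# make_singularity_dirs BASE: create BASE/singularity_images/cache (and parents)
# and print the images directory. Hypothetical helper for illustration.
make_singularity_dirs() {
  base="${1:-}"
  if [ -z "$base" ]; then
    echo "usage: make_singularity_dirs BASE" >&2
    return 1
  fi
  # -p creates missing parents and does not fail if the directory exists
  mkdir -p "$base/singularity_images/cache" &&
    echo "$base/singularity_images"
}

# Example call (path taken from this guide; replace it with your own scratch):
# make_singularity_dirs /scratch/iddsouza
```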

2. Export env variables to your local profile

Add the following variables to your ~/.bashrc file. Also load the singularity module there, so that you won't forget to load it before running the pipeline:
export SINGULARITY_CACHEDIR='/home/iddsouza/scratch/singularity_images/cache'
export NXF_SINGULARITY_CACHEDIR='/home/iddsouza/scratch/singularity_images'
module load singularity

After modifying the ~/.bashrc file, reload it in your current session with source ~/.bashrc.
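
To confirm that the variables were picked up and point to existing directories, something like the following could be run after reloading the profile. It is a minimal sketch: `check_singularity_env` is a hypothetical helper, and it only checks the two variables named above.

```shell
# check_singularity_env: report whether the two cache variables are set
# and point to existing directories. Returns non-zero on any problem.
# (Hypothetical helper for illustration.)
check_singularity_env() {
  rc=0
  for name in SINGULARITY_CACHEDIR NXF_SINGULARITY_CACHEDIR; do
    # Look up the value of the variable whose name is stored in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set"
      rc=1
    elif [ ! -d "$value" ]; then
      echo "$name points to a missing directory: $value"
      rc=1
    else
      echo "$name=$value"
    fi
  done
  return "$rc"
}

# Usage, after `source ~/.bashrc`:
# check_singularity_env
```

A "missing directory" message here usually means the path in ~/.bashrc does not match the directory created in step 1.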

3. Download the singularity images

After creating the directories, download the images. This is an example for the sarek pipeline, version 3.2.3. Adapt the command to your pipeline and version:
nf-core download --container-system singularity --container-cache-utilisation amend -r 3.2.3 -p 5 nf-core/sarek
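
After the download finishes, it may be worth checking that image files actually landed in the cache. The sketch below assumes the images are stored directly in `$NXF_SINGULARITY_CACHEDIR` with a `.img` or `.sif` extension; that layout is an assumption, and `count_images` is a hypothetical helper.

```shell
# count_images DIR: print how many *.img / *.sif files sit directly in DIR.
# (Hypothetical helper; the file extensions are an assumption.)
count_images() {
  find "$1" -maxdepth 1 -type f \( -name '*.img' -o -name '*.sif' \) |
    wc -l | tr -d ' '
}

# Usage, once the download has finished:
# count_images "$NXF_SINGULARITY_CACHEDIR"
```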

4. Change the ~/.nextflow/config file

If this file does not exist in your ~/.nextflow/ directory, create it.
This is an example of my ~/.nextflow/config file:
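singularity {
  autoMounts = true
}

process {
  executor = 'slurm'
  // Retry once on exit codes 125 and 139; on any other error, let running tasks finish and stop
  errorStrategy = { task.exitStatus in [125,139] ? 'retry' : 'finish' }
  maxRetries = 1
  // check_max is a helper defined by nf-core pipelines such as sarek 3.2.3
  memory = { check_max( 4.GB * task.attempt, 'memory' ) }
}

// queueSize belongs to the executor scope, not the process scope
executor {
  queueSize = 50
}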


The required settings are singularity{ autoMounts = true } and process { executor = 'slurm' }. With the slurm executor, Nextflow submits each process as a job to the processing nodes instead of running it on the login node.
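
A quick way to confirm that the two required settings are present in the file is a pair of `grep` checks. This is a rough sketch that only looks for the text of the settings and does not parse the config; `check_nextflow_config` is a hypothetical helper.

```shell
# check_nextflow_config [FILE]: look for the two required settings in FILE
# (default: ~/.nextflow/config). Text search only, not a real config parser.
# (Hypothetical helper for illustration.)
check_nextflow_config() {
  config="${1:-$HOME/.nextflow/config}"
  [ -f "$config" ] || { echo "missing: $config"; return 1; }
  rc=0
  grep -Eq "autoMounts[[:space:]]*=[[:space:]]*true" "$config" ||
    { echo "autoMounts = true not found"; rc=1; }
  grep -Eq "executor[[:space:]]*=[[:space:]]*['\"]slurm['\"]" "$config" ||
    { echo "executor = 'slurm' not found"; rc=1; }
  return "$rc"
}

# Usage:
# check_nextflow_config
```

For a fuller view, `nextflow config` prints the configuration as Nextflow resolves it.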

5. Create a screen to run the pipeline

Put the files needed for your run on the scratch disk. Then start a screen session to run the pipeline, for example with screen -S analysis, and activate the nf-core environment inside it.
If you prefer to use Nextflow Tower, export your access token in your ~/.bashrc file (https://help.tower.nf/22.2/getting-started/usage/#nextflow-with-tower).
Don't forget to point --outdir to a directory on the scratch disk.
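
As a guard against writing results to the home quota by accident, the launch command can be assembled by a small function that refuses an --outdir outside scratch. This is a sketch under assumptions: `build_run_command` is a hypothetical helper, the `results` directory name is made up, `-profile singularity` assumes the pipeline ships a singularity profile (nf-core pipelines do), and pipeline-specific parameters such as the input sheet still have to be added by hand.

```shell
# build_run_command PIPELINE REVISION OUTDIR: print a nextflow command line,
# refusing an OUTDIR that is not under /scratch or ~/scratch.
# (Hypothetical helper; it only prints the command, it does not run it.)
build_run_command() {
  pipeline="$1"
  revision="$2"
  outdir="$3"
  case "$outdir" in
    /scratch/*|"$HOME"/scratch/*) ;;
    *)
      echo "--outdir should be on scratch, got: $outdir" >&2
      return 1
      ;;
  esac
  echo "nextflow run $pipeline -r $revision -profile singularity --outdir $outdir"
}

# Inside the screen session, after activating the nf-core environment:
build_run_command nf-core/sarek 3.2.3 /scratch/iddsouza/results
```

The printed line can then be completed with the pipeline's own parameters and run inside the screen session.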