Guide to running Rosetta with SLURM

How to run Rosetta with SLURM on the Genome Center's cluster Cabernet

To log in from a Bash prompt, run

ssh username@cabernet.genomecenter.ucdavis.edu
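
If you log in often, a host alias in ~/.ssh/config saves some typing (a minimal sketch; substitute your own username):

Host cabernet
    HostName cabernet.genomecenter.ucdavis.edu
    User username

After that, ssh cabernet is enough.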

Today, this banner greeted me when I logged in:

           ________________________________________
          / Cabernet currently consists of 27 nodes \
          | with 808 cores total. Type sinfo for    |
          \ more options.                           /  
           ----------------------------------------- . .
  ______________________________________________       . .
 /\    ___       _                              \         . .
/  :  / (_)     | |                              \          . .
|  : |      __, | |   _   ,_    _  _    _ _|_     \________ ___  
|  : |     /  | |/ \_|/  /  |  / |/ |  |/  |               |   |
|  :  \___/\_/|_/\_/ |__/   |_/  |  |_/|__/|_/     ________|___|
|  :               .genomecenter.ucdavis.edu      /  
\  ;                                             /
 \/_____________________________________________/
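
As the banner suggests, sinfo summarizes the cluster's partitions and nodes; these are standard SLURM commands (sinfo -N -l gives a per-node view):

sinfo
sinfo -N -l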

I want to run a large -nstruct job in parallel

First, write a script that runs your Rosetta protocol. In this case, we want to relax an input crystal structure into Rosetta's energy function with the relax app, producing 1000 output structures in parallel.

Put the following in a file called sub.sh:

#!/bin/bash
#
#SBATCH --job-name=relax
#SBATCH --output=log.txt
#SBATCH --array=1-1000

module load rosetta 
relax.linuxgccrelease @flags -suffix $SLURM_ARRAY_TASK_ID

Passing -suffix $SLURM_ARRAY_TASK_ID gives each SLURM array task a unique output suffix, so the 1000 parallel tasks don't overwrite each other's output files.

To run relax on an apo protein structure, the flags file contains the following Rosetta flags:

-s <input PDB>
-constrain_relax_to_start_coords 1
-renumber_pdb 1

The renumber_pdb flag will renumber all the residues in the structure starting with 1.

Change to the same directory as your sub.sh and input files and submit your job with:

sbatch sub.sh
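
While the array runs, you can check on it with standard SLURM commands (substitute the job ID that sbatch prints):

squeue -u $USER                                # your pending and running tasks
sacct -j <jobid> --format=JobID,State,Elapsed  # per-task accounting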

I want to automate running similar jobs with variable flags (-s, -cst, -resfile, etc.)

When you want to run the same protocol on multiple input structures, you can take an embarrassingly parallel approach and run all of the jobs concurrently rather than consecutively. This setup adds a file called list with one set of inputs per line, and uses the SLURM_ARRAY_TASK_ID environment variable to select which line to use. When you submit a batch array, each task gets an integer ID (1, 2, ..., n), and a short Bash command prints the corresponding line of the list, which holds the Rosetta flags for that run. The list, in this case specifying three different input PDB structures, looks like

-s 3JYO.pdb
-s 3ORF.pdb
-s 1VLJ.pdb
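
With many more inputs, writing list by hand gets tedious; here is a minimal sketch that builds it from every PDB in the current directory:

# write one "-s <pdb>" line per input structure
for pdb in *.pdb; do
    echo "-s $pdb"
done > list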

For a parallel run, make a sub.sh like this:

#!/bin/bash
#
#SBATCH --job-name=parallel_relax
#SBATCH --output=log.txt
#SBATCH --array=1-3 

S=$( head -${SLURM_ARRAY_TASK_ID} list | tail -1 ) 
module load rosetta 
relax.linuxgccrelease @flags $S

The command head -${SLURM_ARRAY_TASK_ID} list | tail -1 returns the nth line of the file list, where n equals SLURM_ARRAY_TASK_ID.
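
An equivalent one-liner, if you prefer sed (standard sed, nothing cluster-specific):

S=$( sed -n "${SLURM_ARRAY_TASK_ID}p" list )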

Submit the run on Cabernet with

sbatch sub.sh

I want to scan over a parameter in RosettaScripts XML

This approach also works well when you need to vary parameters inside your XML. For example, to make a series of point mutations to the same protein, we can add special %%variable%% placeholders to our RosettaScripts XML protocol that get replaced at runtime.

Note the special %% %% variables in the MutateResidue mover declaration.

<ROSETTASCRIPTS>
<SCOREFXNS>
  <myscore weights="talaris2013_cst.wts"/>
</SCOREFXNS>
<TASKOPERATIONS>
  <DetectProteinLigandInterface name="repack_sphere" design="0" cut1="8.0" cut2="10.0" cut3="12.0" cut4="14.0" catres_interface="1" />
</TASKOPERATIONS>
<FILTERS>
  <EnzScore name="allcst" score_type="cstE" scorefxn="myscore" whole_pose="1" energy_cutoff="100" />
</FILTERS>
<MOVERS>
  <MutateResidue name="mutate" target="%%target%%" new_res="%%new_res%%" />
  <AddOrRemoveMatchCsts name="cstadd" cst_instruction="add_new" accept_blocks_missing_header="1" fail_on_constraints_missing="0" />
  <PredesignPerturbMover name="predock" />
  <EnzRepackMinimize name="cst_opt" cst_opt="1" minimize_rb="1" minimize_sc="1" minimize_bb="0" min_in_stages="0" minimize_lig="1" />
  <EnzRepackMinimize name="repack_wbb" design="0" repack_only="1" scorefxn_minimize="myscore" scorefxn_repack="myscore" minimize_rb="1" minimize_sc="1" minimize_bb="1" minimize_lig="1" min_in_stages="0" backrub="0" task_operations="repack_sphere" />
  <ParsedProtocol name="iterate">
    <Add mover="predock"/>
    <Add mover="cst_opt"/>
    <Add mover="repack_wbb"/>
  </ParsedProtocol>
  <GenericMonteCarlo name="monte_repack" mover_name="iterate" filter_name="allcst" />
</MOVERS>
<APPLY_TO_POSE>
</APPLY_TO_POSE>
<PROTOCOLS>
  <Add mover="cstadd" />
  <Add mover="mutate" />
  <Add mover="monte_repack" />
</PROTOCOLS>
</ROSETTASCRIPTS>

Now we can generate a list of the mutations we want, again in a file called list:

-suffix "_325GLU" -parser:script_vars target=325 new_res=GLU
-suffix "_220GLU" -parser:script_vars target=220 new_res=GLU
-suffix "_298GLU" -parser:script_vars target=298 new_res=GLU
-suffix "_294LEU" -parser:script_vars target=294 new_res=LEU
-suffix "_407TYR" -parser:script_vars target=407 new_res=TYR
-suffix "_315ARG" -parser:script_vars target=315 new_res=ARG
-suffix "_164ASP" -parser:script_vars target=164 new_res=ASP
-suffix "_166GLU" -parser:script_vars target=166 new_res=GLU
-suffix "_415ASN" -parser:script_vars target=415 new_res=ASN
-suffix "_227TRP" -parser:script_vars target=227 new_res=TRP

Note that I've added suffixes to the output structures so they are written out with unique names containing the mutation.
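
A sketch for generating this file programmatically, assuming a hypothetical mutations.txt with one "position residue" pair per line (for example, "325 GLU"):

# mutations.txt is hypothetical: one mutation per line, e.g. "325 GLU"
while read target new_res; do
    echo "-suffix \"_${target}${new_res}\" -parser:script_vars target=${target} new_res=${new_res}"
done < mutations.txt > list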

The sub.sh is the same except for which binary we're calling:

#!/bin/bash
#
#SBATCH --job-name=mutants
#SBATCH --output=log.txt
#SBATCH --array=1-10

S=$( head -${SLURM_ARRAY_TASK_ID} list | tail -1 ) 
module load rosetta 
rosetta_scripts.linuxgccrelease @flags $S

and I've used these flags:

# options 
-s bglb.pdb
-out:path:all out 
-parser::protocol protocol.xml
-extra_res_fa pNPG.params
-enzdes::cstfile pNPG.enzdes.cst 

# packing
-packing::ex1
-packing::ex2
-packing::ex1aro:level 6
-packing::ex2aro
-packing::extrachi_cutoff 1
-packing::use_input_sc
-packing::flip_HNQ
-packing::no_optH false
-packing::optH_MCA false

# enzdes-specific 
-score::weights talaris2013_cst
-jd2::enzdes_out

# run options and memory
-run::preserve_header
-run:version
-nblist_autoupdate
-linmem_ig 10
-chemical:exclude_patches LowerDNA  UpperDNA Cterm_amidation VirtualBB ShoveBB VirtualDNAPhosphate VirtualNTerm CTermConnect sc_orbitals pro_hydroxylated_case1 pro_hydroxylated_case2 ser_phosphorylated thr_phosphorylated  tyr_phosphorylated tyr_sulfated lys_dimethylated lys_monomethylated  lys_trimethylated lys_acetylated glu_carboxylated cys_acetylated tyr_diiodinated N_acetylated C_methylamidated MethylatedProteinCterm
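
Before submitting all ten tasks, it can be worth a quick smoke test of a single mutation (a sketch that combines the first line of list with the flags above; -nstruct 1 keeps it to one output structure):

module load rosetta
rosetta_scripts.linuxgccrelease @flags -suffix "_325GLU" -parser:script_vars target=325 new_res=GLU -nstruct 1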

When the test looks good, submit the full run with

sbatch sub.sh

Some tips and tricks

Cancel a running job

Stop a running job and kill all associated processes with

scancel <jobid>
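
scancel also accepts finer-grained targets (standard SLURM syntax):

scancel <jobid>_<taskid>   # cancel a single task of an array job
scancel -u $USER           # cancel all of your jobs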

Watching the output of batch jobs

If you submit a job and you want to watch the output, do

sbatch --output=log.txt sub.sh
tail -f log.txt

tail will follow the progress of the log file; quit with ^C. Note that sbatch options must come before the script name: anything after sub.sh is passed as an argument to the script itself, not to sbatch.

Jupyter notebooks

rsyncing tons of PDBs and log files back and forth to the cluster sucks. Running your data analysis in Jupyter notebooks on the cluster rocks. The recommended style is:

On Cabernet:

screen -d -m jupyter-notebook --no-browser --port 8889

then disconnect your session with exit. Note: we share the ports on Cabernet, so if 8889 is taken, try a number in the range 8000 to 9000.
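
To check on or stop the notebook server later, reattach the screen session (standard screen usage):

screen -ls   # list your detached sessions
screen -r    # reattach; add the session name if you have several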

On your machine:

ssh -N -f -L localhost:8888:localhost:8889 <user name>@cabernet.genomecenter.ucdavis.edu

and open localhost:8888 in your browser.
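
The -N -f flags background the tunnel, so it outlives the terminal you started it from; a portable sketch for finding and killing it later:

ps aux | grep "[s]sh -N -f -L"   # the [s] trick hides the grep itself; note the PID
kill <pid>                       # then kill the tunnel by PID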

If you've used Epiphany

For those familiar with Sun Grid Engine (the scheduler used on Epiphany), the SLURM commands are very similar to their SGE counterparts, although they are configured differently.

SGE                        SLURM
qsub sub.sh                srun sub.sh
qsub -t 1-100 sub.sh       sbatch --array=1-100 sub.sh
qstat (your jobs)          squeue -u <user name> (your jobs)
qstat -u '*' (all jobs)    squeue (all jobs)
qlogin                     salloc