## Feynman: feynman.princeton.edu
Feynman-hepheno has somewhere between 12 and 16 nodes (the exact count is unclear).
- Three areas -- home (`~`), group (`/group/hepheno/`) and group storage (`/mnt/hepheno/`)
- Best to work from your user directory in `/group/hepheno`, since `~` has very little storage space
- The contents of `/group/hepheno/group-setup.sh` are automatically sourced on `cd` into the group directory `/group/hepheno/`:
  - Activates a python virtual environment (ML: chmod others into the venv so they can pip install etc?)
  - Initiates MPI and sets a few environment variables
- Different compiler versions (intel/gnu) as well as OpenMPI/Intel MPI versions can be activated using `module load <module>`; available modules can be listed with `module avail` (see the example after the env snippet below)
- The Fermi data is stored in `/mnt/hepheno/CTBCORE/` and `/mnt/hepheno/FermiData`
- `/group/hepheno/heptools/` contains relevant programs (MultiNest, Pythia etc.) as well as the NPTF code
- Typical env:

```bash
# env.sh
# Feynman installation
export work_dir=$'/group/hepheno/heptools/test_code'
export psf_dir=$'/mnt/hepheno/CTBCORE/psf_data'
export maps_dir=$'/mnt/hepheno/CTBCORE/'
export fermi_data_dir=$'/mnt/hepheno/FermiData/'
export PYTHONPATH=$PYTHONPATH:/group/hepheno/heptools/Fermi-NPTF-edep-nrodd
```
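For example, to pick an MPI flavor and set up the analysis environment in an interactive shell (the module name below is illustrative; check `module avail` for what is actually installed on Feynman):

```bash
module avail                                      # list the available compiler/MPI modules
module load openmpi                               # illustrative module name; load whichever flavor you need
source /group/hepheno/heptools/test_code/env.sh   # sets work_dir, psf_dir, maps_dir, fermi_data_dir
echo $fermi_data_dir                              # quick sanity check that the environment was picked up
```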
### Submitting jobs
- PBS scheduler. Typical batch file:

```bash
#!/bin/sh
#PBS -j oe
#PBS -o /home/smsharma/scratch/job.out
#PBS -l nodes=1:ppn=1 -l cput=200:59:00
#PBS -q hepheno
#PBS -m abe
#PBS -k oe
#PBS -M smsharma@princeton.edu
export PYTHONPATH=/group/hepheno/heptools/Fermi-NPTF-edep-nrodd/
export PATH=/opt/intel/impi/5.0.3.048/lib64/:$PATH
source /group/hepheno/heptools/test_code/env.sh
source /opt/intel/impi/5.0.3.048/bin64/mpivars.sh
cd /group/hepheno/venv
source bin/activate
cd /group/hepheno/heptools/test_code
./shell_scripts/run_nptf_edep_minuit.sh
```
- To submit a job: `qsub -l mem=4gb submit.batch`
- Standard output and error are stored in your home directory as the job runs
- Useful commands: `qstat [-u <username>]` to view jobs, `qdel <jobid>` to cancel a job, or this script to cancel all of them (a minimal equivalent is sketched below)
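The cancel-all script isn't reproduced here, but a minimal equivalent looks something like the following (a sketch, assuming the standard PBS `qselect` utility is available on the login node):

```bash
#!/bin/bash
# Cancel every PBS job belonging to the current user (sketch; not the script linked above)
qselect -u $USER | xargs --no-run-if-empty qdel
```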
## Tiger/Della: tiger.princeton.edu and della.princeton.edu

Tiger has 16 cores/node and Della has 20 cores/node. There are four partitions on tiger: all, nongpu, gpu and serial.

|             | all  | gpu  | nongpu | serial |
|-------------|------|------|--------|--------|
| priority    | 4000 | 5000 | 1      | 5000   |
| total nodes | 544  | 50   | 594    | 4      |
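A partition can be requested explicitly with the standard SLURM `-p`/`--partition` flag, e.g. in a minimal batch script targeting the gpu partition (partition names from the table above; everything else is illustrative):

```bash
#!/bin/bash
#SBATCH -p gpu          # one of: all, nongpu, gpu, serial
#SBATCH -N 1            # single node
#SBATCH -t 1:00:00      # one hour of walltime
hostname                # placeholder for the real job
```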
- Home dir `~` (negligible storage space), `/scratch/gpfs/` (temp work, short-lived branches etc.), `/tigress/` (storage area, everything else)
- Both tiger and della use the common storage space `/tigress/`, so it is probably best to work from there
- Fermi data in `/tigress/smsharma/public`
- Software in `/tigress/smsharma/public/heptools` (including the NPTF code)
- No group setup, so the following needs to be run before using:

```bash
cd /tigress/smsharma/public/heptools/MultiNest/
export LD_LIBRARY_PATH=$(pwd)/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
cd /tigress/smsharma/public/heptools
source venv/bin/activate
export PYTHONPATH=$PYTHONPATH:/tigress/smsharma/public/heptools/Fermi-NPTF-edep
module load openmpi
```

- Typical env:

```bash
# env.sh
# Tigress installation
export work_dir=$(pwd)
export psf_dir=$'/tigress/smsharma/public/CTBCORE/psf_data'
export maps_dir=$'/tigress/smsharma/public/CTBCORE/'
export fermi_data_dir=$'/tigress/smsharma/public/FermiData/'
export PYTHONPATH=$PYTHONPATH:/tigress/smsharma/public/heptools/Fermi-NPTF-edep
```
### Submitting jobs
- SLURM scheduler. Check the quick start guide: http://slurm.schedmd.com/quickstart.html. A quick interactive run with `srun`:

```
[jiange@tiger1 ~]$ srun -N3 -l -t 10:0:0 --gres=gpu:2 /bin/hostname
0: tiger-r12n12
1: tiger-r12n13
2: tiger-r12n14
```

The `--gres` argument is a list of `name[[:type]:count]` entries, denoting consumable resources; the specified resources are allocated to the job on each node.
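For instance (the GPU type string below is purely illustrative; actual type names depend on the cluster configuration):

```bash
srun -N 2 --gres=gpu:2 /bin/hostname        # two GPUs per node, any type
srun -N 2 --gres=gpu:k80:2 /bin/hostname    # two GPUs of a specific type per node (name:type:count form)
```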
Typical batch file:

```bash
#!/bin/bash
#SBATCH -N 1 # node count
#SBATCH --ntasks-per-node=12
#SBATCH -t 4:00:00
#SBATCH --mail-type=begin
#SBATCH --mail-type=end
#SBATCH --mail-user=smsharma@princeton.edu
cd /tigress/smsharma/public/heptools/MultiNest/
export LD_LIBRARY_PATH=$(pwd)/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
cd /tigress/smsharma/public/heptools
source venv/bin/activate
export PYTHONPATH=$PYTHONPATH:/tigress/smsharma/public/heptools/Fermi-NPTF-edep
module load openmpi
cd /tigress/smsharma/public/heptools/test_code
source env_example.sh
./shell_scripts/run_nptf_edep_high_lat.sh
```
- To submit a job: `sbatch submit.batch`
- Standard output and error appear in the directory the job was submitted from
- Annoyingly, `mpirun`/`mpiexec` does not work properly on Della, so `srun` has to be used in the `run.sh` files instead (see the sketch after this list)
- Useful commands: `squeue [-u <username>]` to view jobs, `scancel <jobid>` or `scancel -u <username>` to cancel jobs
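Concretely, on Della the launch line inside a `run.sh`-style script would look something like the sketch below (the script and driver names are placeholders, not the actual contents of the repository scripts):

```bash
#!/bin/bash
# run_nptf_example.sh -- hypothetical Della launch script
# srun inherits the allocation from sbatch, so no host file or -np bookkeeping is needed:
srun python example_nptf_scan.py
# On clusters where the MPI launcher works, the equivalent line would typically be:
# mpirun -np $SLURM_NTASKS python example_nptf_scan.py
```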
## Miscellaneous

Most of this is just random stuff that I find useful while running on the cluster.

If you want to use Sublime Text, either use rsub or Sublime FTP.
- Install a local version of `tmux` using this script
- `tmux new-session -s <name>` to start a new session
- `tmux attach -t <name>` to attach a named session
- `Ctrl-B D` to detach the current session
- `tmux attach` to reattach a session
- `Ctrl-B $` to rename the current session
- To kill stray processes on a node: `kill -9 -1` sends SIGKILL to every process you own, or use `pkill -u <username>` to kill by user
To run an IPython notebook on the cluster and view it locally over an SSH tunnel:

- On Feynman, run: `ipython notebook --no-browser --port=7000`
- On your computer, run: `ssh -N -f -L localhost:7000:localhost:7000 <username>@feynman.princeton.edu`
- In your web browser, go to http://localhost:7000
Python packages to install into the virtual environment: `pip install numpy matplotlib scipy healpy pandas Cython mpmath pymultinest mpi4py numexpr`
To follow the output of a running job, e.g.: `tail -f ~/HL_Sim1_NPTF_0-15_np3.batch.o288395`