* Home: /home/IT4I-USERNAME (25 GB, 5k entries, per user)
* Work: /mnt/proj2/dd-**-**/ (20 TB, 5M entries, per project)
* Scratch: /scratch/project/dd-**-**/ (10 TB, 10M entries, per user)
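To see how much of each quota is already in use, you can run du on the individual areas (the paths reuse the placeholders above; replace them with your own login and project ID):

du -sh /home/IT4I-USERNAME            # home usage (can take a while on large trees)
du -sh /scratch/project/dd-**-**/     # scratch usage of the project
# if available on the cluster, the it4ifsusage utility prints all your quotas at once
it4ifsusage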
gpu_info.pbs:
#!/bin/bash
#PBS -q qnvidia
#PBS -N GPU_Info
#PBS -l select=1,walltime=00:05:00
#PBS -A DD-??-??
# Define the output directory for GPU information
SCRDIR=/scratch/project/${PBS_ACCOUNT,,}/${USER}/GPU_Info
# Create the output directory if it does not exist yet
mkdir -p $SCRDIR
# Change to the local scratch directory, exit on failure
cd /lscratch/${PBS_JOBID} || exit
# Load CUDA module
ml CUDA/11.3.1
# Use nvidia-smi to get GPU information
nvidia-smi -L > GPU_Info.txt
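# Append total memory (GB), CPU details, local scratch usage, and hostname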
free -g | grep Mem | awk '{print $2}' >> GPU_Info.txt
lscpu >> GPU_Info.txt
df -h . >> GPU_Info.txt
hostname >> GPU_Info.txt
# Move the report to the scratch project directory
mv GPU_Info.txt $SCRDIR
# Exit the script
exit
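Submit the job with qsub; after it finishes, the report is in the scratch directory defined by SCRDIR (the path below merely expands the variables used in the script, with the project ID kept as a placeholder):

qsub gpu_info.pbs
# once the job has finished, inspect the collected report
cat /scratch/project/dd-**-**/$USER/GPU_Info/GPU_Info.txt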
This setup was optimised for an RPA simulation. See the advice on parallelisation at decomposition-of-the-box. Generally, a lower number of MPI processes should be used.
In this example, we run SMILEI on 2 nodes (128 CPUs each) in the express queue; the maximum simulation time is 1 hour. On each node we initiate 4 MPI processes, each with a single OpenMP thread.
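The distribution of processes and threads is controlled by the select statement of the submission script. As a purely illustrative line matching the layout just described (2 nodes, 4 MPI processes per node, one OpenMP thread per process), the request would read as follows; note that the script below instead starts one MPI process per core:

#PBS -l select=2:ncpus=128:mpiprocs=4:ompthreads=1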
The input file is located in the same folder as the submission script and is named SIMNAME.py. The submission script copies the input to the scratch directory, where the simulation is launched.
File runSMILEI.sh:
#!/bin/bash
#PBS -q qexp
#PBS -N RPA5
#PBS -l select=2:ncpus=128:mpiprocs=128:ompthreads=1,walltime=0:59:00
#PBS -A DD-**-**
SIMNAME=rpa5
OURSCRDIR=/scratch/project/dd-**-**/USERNAME/$SIMNAME
SMILEI=/home/it4i-USERNAME/Smilei/smilei
# load the parallel HDF5 and Python modules
module add HDF5/1.10.6-intel-2020a-parallel
module add Python/3.8.2-GCCcore-9.3.0
# Create the scratch run directory and switch to it, exit on failure
mkdir -p $OURSCRDIR
cd $OURSCRDIR || exit
# Copy the input file to the scratch run directory
cp /home/it4i-USERNAME/SMILEI_runs/$SIMNAME.py $SIMNAME.py
date
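# Run the simulation; the solver's standard output is redirected to the SMILEI_runs folder in home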
mpirun $SMILEI $SIMNAME.py > /home/it4i-USERNAME/SMILEI_runs/$PBS_JOBID.out
date
exit
Now submit it:
qsub runSMILEI.sh
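You can monitor the job with the standard PBS commands, for example:

qstat -u $USER

Once the job finishes, the simulation data remain in the scratch directory defined by OURSCRDIR, and the solver's standard output is written to $PBS_JOBID.out in the SMILEI_runs folder.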