Startup steps for a Julia CUDA MPI application relying on ParallelStencil.jl and ImplicitGlobalGrid.jl.
GPU cluster config:
- CUDA 11.0
- CUDA-aware OpenMPI 3.0.6
- gcc 8.3
The following steps should enable a successful multi-GPU run:
- Define the project's working home dir, usually on scratch:
project_dir=<path-to-project-dir>
app_dir=<relative-path-to-app-dir>
export HOME2=${project_dir}
cd ${HOME2}
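With concrete values filled in, the step above might look as follows (the scratch path and app directory name are hypothetical, substitute your own):

```shell
# Hypothetical paths -- replace with your cluster's scratch layout.
project_dir=/scratch/myuser/myproject   # hypothetical project dir
app_dir=diffusion_app                   # hypothetical, relative to project_dir
export HOME2=${project_dir}
# cd may fail on a machine where the scratch path does not exist
cd ${HOME2} 2>/dev/null || echo "note: ${HOME2} not found on this machine"
```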
- If using modules, clear the environment of any previously loaded modules and load the required ones:
module purge > /dev/null 2>&1
module load julia
module load cuda/11.0
module load openmpi/gcc83-306-c110
- Define the Julia-specific configuration
export JULIA_PROJECT=${HOME2}/${app_dir}        # project environment to activate
export JULIA_DEPOT_PATH=${HOME2}/julia_depot    # where packages and artifacts are stored
export JULIA_MPI_BINARY=system                  # build MPI.jl against the system MPI
export JULIA_CUDA_USE_BINARYBUILDER=false       # use the local CUDA toolkit, not downloaded artifacts
export IGG_CUDAAWARE_MPI=1                      # tell ImplicitGlobalGrid the MPI is CUDA-aware
export JULIA_NUM_THREADS=4                      # number of Julia threads per process
- a) On the first run, add the Julia packages in order to generate the Project.toml and Manifest.toml files
cd ${app_dir}
julia
Then, within Julia:
julia> ]
(project) pkg> activate .
(project) pkg> add ImplicitGlobalGrid
(project) pkg> add CUDA
(project) pkg> add MPI
(project) pkg> add ParallelStencil
(project) pkg> add Plots
julia> using ImplicitGlobalGrid
julia> using CUDA
julia> using MPI
julia> using ParallelStencil
julia> using Plots
julia> exit()
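The interactive session above can also be scripted. A hedged, non-interactive sketch, relying on Pkg.add accepting a vector of package names (run from within the app directory; requires network access):

```shell
# Non-interactive equivalent of the REPL session above.
pkg_cmd='using Pkg; Pkg.activate("."); Pkg.add(["ImplicitGlobalGrid", "CUDA", "MPI", "ParallelStencil", "Plots"])'
echo "julia -e '${pkg_cmd}'"    # inspect the command first
# julia -e "${pkg_cmd}"         # uncomment to execute on the cluster
```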
- b) Then, on every run, trigger the build and precompilation of MPI
cd ${app_dir}
julia --project -e 'using Pkg; pkg"instantiate"; pkg"build MPI"'
julia --project -e 'using Pkg; pkg"precompile"'
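As an optional sanity check before launching a full job, a small script (the file name check_env.jl is hypothetical) can report whether CUDA.jl sees the local toolkit and whether MPI.jl detects CUDA awareness; MPI.has_cuda() is assumed to be available in the installed MPI.jl version:

```shell
# Write a hypothetical sanity-check script; run it on a GPU node.
cat > check_env.jl <<'EOF'
using CUDA
CUDA.versioninfo()                        # should report the local CUDA 11.0 toolkit
using MPI
println("CUDA-aware MPI: ", MPI.has_cuda())
EOF
# julia --project check_env.jl   # uncomment to run on the cluster
```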
- Then, the most convenient approach is to define a
submit_bash.sh
script containing the following launch parameters
export HOME2=${project_dir}
export JULIA_PROJECT=${HOME2}/${app_dir}
export JULIA_DEPOT_PATH=${HOME2}/julia_depot
export JULIA_CUDA_USE_BINARYBUILDER=false
export JULIA_MPI_BINARY=system
export IGG_CUDAAWARE_MPI=1
export JULIA_NUM_THREADS=4
module purge > /dev/null 2>&1
module load julia
module load Qt
module load cuda/11.0
module load openmpi/gcc83-306-c110
julia_=$(which julia)
$julia_ -O3 --check-bounds=no <my-Julia-app>.jl
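Collected into a file, the script above might be created as follows. The paths and the app name are hypothetical; note that project_dir and app_dir must be defined inside the script itself, since they are not exported by the interactive session (single quotes on the heredoc delimiter keep the ${...} expansions literal until the script runs):

```shell
# Sketch of submit_bash.sh under the assumptions above -- adapt paths,
# module names, and the app file name to your cluster.
cat > submit_bash.sh <<'EOF'
#!/bin/bash
project_dir=/scratch/myuser/myproject   # hypothetical
app_dir=diffusion_app                   # hypothetical
export HOME2=${project_dir}
export JULIA_PROJECT=${HOME2}/${app_dir}
export JULIA_DEPOT_PATH=${HOME2}/julia_depot
export JULIA_CUDA_USE_BINARYBUILDER=false
export JULIA_MPI_BINARY=system
export IGG_CUDAAWARE_MPI=1
export JULIA_NUM_THREADS=4
module purge > /dev/null 2>&1
module load julia
module load Qt
module load cuda/11.0
module load openmpi/gcc83-306-c110
julia_=$(which julia)
$julia_ -O3 --check-bounds=no my_app.jl   # hypothetical app name
EOF
chmod +x submit_bash.sh
```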
- Finally, define the full path to the mpirun executable and run the
submit_bash.sh
script, using a rankfile if needed
mpirun_=$(which mpirun)
$mpirun_ -np 8 -rf <gpu_rankfile> ./submit_bash.sh
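An OpenMPI rankfile pins each MPI rank to a host and slot. A hypothetical layout for the 8-rank run above, spread over two nodes with four GPUs each (the hostnames node01/node02 are placeholders):

```shell
# Hypothetical rankfile for 2 nodes x 4 GPUs each; one rank per GPU slot.
cat > gpu_rankfile <<'EOF'
rank 0=node01 slot=0
rank 1=node01 slot=1
rank 2=node01 slot=2
rank 3=node01 slot=3
rank 4=node02 slot=0
rank 5=node02 slot=1
rank 6=node02 slot=2
rank 7=node02 slot=3
EOF
```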