
@zbeekman
Created August 31, 2018 00:00
Instructions for running Intel Advisor-xe to generate roofline plots (intel 18+)

Notes for Intel Advisor Roofline Generation of MPI/OpenMP Fortran Programs

These notes target a Cray XC40/XC50 system; however, they should be relatively easy to generalize to other systems.

Environment

The following assumes your shell is bash. Intel provides .csh scripts too.

module swap PrgEnv-cray PrgEnv-intel
module swap intel/18.0.3.222
# For the next step, `which ifort` should point you in the right direction
source /opt/intel/advisor_2018/advixe-vars.sh intel64
export LD_LIBRARY_PATH="/opt/intel/advisor_2018/lib64:${LD_LIBRARY_PATH}"
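
Before going further, it can help to confirm that the tools are actually on your PATH. The following is a minimal sketch; `require_tool` is a hypothetical helper written for this note, not something shipped with Advisor.

```shell
# Sketch: confirm the tools used in these notes are on PATH after sourcing
# advixe-vars.sh. require_tool is a hypothetical helper, not part of Advisor.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Non-fatal check of each tool; inspect the output before submitting a job.
for tool in ifort advixe-cl; do
    require_tool "${tool}" || true
done
```

If `advixe-cl` is reported missing, the path passed to `source` above probably does not match your installation.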

The program must be compiled with debug symbols and dynamically linked. On Cray machines this means passing the following FCFLAGS

-g -dynamic

in addition to any other optimization flags being used.
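
As an illustration, the flags might be combined as follows. This is only a sketch: `OPT_FLAGS`, `main.f90`, and the `check_fcflags` helper are placeholders invented here, and it assumes the Cray `ftn` compiler wrapper is what you build with.

```shell
# Sketch only: OPT_FLAGS, main.f90 and check_fcflags are illustrative
# placeholders; ftn is assumed to be the Cray compiler wrapper.
OPT_FLAGS="-O2 -qopenmp"            # whatever optimization flags you already use
FCFLAGS="${OPT_FLAGS} -g -dynamic"  # add debug symbols and dynamic linking

# Succeeds only if both -g and -dynamic appear as separate words.
check_fcflags() {
    case " $1 " in *" -g "*) ;; *) return 1 ;; esac
    case " $1 " in *" -dynamic "*) ;; *) return 1 ;; esac
}

if check_fcflags "${FCFLAGS}"; then
    # Print the compile line rather than running it.
    echo "ftn ${FCFLAGS} -o a.out main.f90"
fi
```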

Collecting the data

Two wrapper scripts collect the data. The first runs the survey, which records realistic timings; the second runs the loop trip count analysis, which causes very large runtime dilation. Run the survey first, then the trip count analysis.

#!/bin/bash

# survey.sh

# set this locally, or set the ADVIXE_PROJ_DIR environment variable in your environment,
# to choose where the Intel Advisor-xe sample & analysis files will go

export _local_proj_dir=${ADVIXE_PROJ_DIR:-./proj}

export PMI_RANK=${ALPS_APP_PE}
export PMI_NO_FORK=1 # Otherwise we'll be instrumenting ALPS
export PMI_NO_PREINITIALIZE=1
export PMI_MMAP_SYNC_WAIT_TIME=300

advixe-cl -collect survey -trace-mpi --no-auto-finalize -project-dir "${_local_proj_dir}" "$@"

#!/bin/bash

# tripcounts.sh

# set this locally, or set the ADVIXE_PROJ_DIR environment variable in your environment,
# to choose where the Intel Advisor-xe sample & analysis files will go

export _local_proj_dir=${ADVIXE_PROJ_DIR:-./proj}

export PMI_RANK=${ALPS_APP_PE}
export PMI_NO_FORK=1
export PMI_NO_PREINITIALIZE=1
export PMI_MMAP_SYNC_WAIT_TIME=300

advixe-cl -collect tripcounts -flop -trace-mpi -project-dir "${_local_proj_dir}" "$@"
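
Both scripts pick the project directory with the shell's `${VAR:-default}` expansion, which falls back to `./proj` when `ADVIXE_PROJ_DIR` is unset or empty. A small sketch of that behavior, where `resolve_proj_dir` is a hypothetical helper used only for illustration:

```shell
# Sketch of the default-value expansion used in both scripts.
# resolve_proj_dir is a hypothetical helper for illustration only.
resolve_proj_dir() {
    echo "${ADVIXE_PROJ_DIR:-./proj}"
}

unset ADVIXE_PROJ_DIR
resolve_proj_dir    # prints ./proj

ADVIXE_PROJ_DIR=/some/path/to/project/directory
resolve_proj_dir    # prints /some/path/to/project/directory
```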

Ensure both scripts are readable and executable with something like:

chmod +rx ./survey.sh ./tripcounts.sh

Then, to collect the data, pick a suitably small problem size, since:

  1. You will only be able to examine results on a single MPI rank at any given time
  2. The runtime dilation during the trip count phase can be quite large

In your batch script, or in an interactive job, ensure your environment is set up correctly, as shown above. Then perform the survey analysis followed by the trip count analysis:

export ADVIXE_PROJ_DIR=/some/path/to/project/directory # if you don't want ./proj to be used
aprun -B ./survey.sh ./a.out       # -B will grab parameters from PBS; you can set -n, -N, etc. explicitly instead
aprun -B ./tripcounts.sh ./a.out
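
Because the order matters, one option is to wrap the two launches so that the trip count run is skipped when the survey fails. This is a sketch under assumptions: `collect_roofline` and the `LAUNCHER` variable are names made up for this note, and the default launcher simply mirrors the `aprun -B` invocation above.

```shell
# Sketch: run the survey, then tripcounts only if the survey succeeded.
# collect_roofline and LAUNCHER are hypothetical names, not part of
# Advisor or ALPS.
LAUNCHER="${LAUNCHER:-aprun -B}"

collect_roofline() {
    # LAUNCHER is left unquoted on purpose so "aprun -B" splits into words.
    ${LAUNCHER} ./survey.sh "$@" || {
        echo "survey failed; skipping tripcounts" >&2
        return 1
    }
    ${LAUNCHER} ./tripcounts.sh "$@"
}
```

Calling `collect_roofline ./a.out` would launch both steps in sequence. Setting `LAUNCHER="echo aprun -B"` first turns it into a dry run that only prints the two commands.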

Advisor-xe will create the directory ./proj or ${ADVIXE_PROJ_DIR} if it does not exist and will place sampling/report data there. To ensure the analysis is performed for the same architecture on which the data was collected, create a snapshot with the --snapshot flag, running it on a compute node if needed. (This is in fact required if you wish to analyze results from the KNL partition.)

aprun -n 1 -b advixe-cl --snapshot \
                        --project-dir ${ADVIXE_PROJ_DIR:-./proj} \
                        --pack \
                        --cache-sources \
                        --cache-binaries \
                        -- ${ADVIXE_PROJ_DIR:-./proj}_snapshot

Some survey data can also be exported as a CSV table for analysis with another tool using:

advixe-cl --report survey \
          --project-dir ${ADVIXE_PROJ_DIR:-./proj} \
          --show-all-columns \
          --format=csv \
          --report-output ./proj.csv

This article from Intel has more details about running Advisor-xe on MPI applications: https://software.intel.com/en-us/articles/analyzing-intel-mpi-applications-using-intel-advisor

Conclusion

With this setup you should be able to collect roofline data for your HPC programs, provided you have access to a recent Intel Parallel Studio (18.x+) and Intel Advisor-xe. If you have questions or want to share your experiences, please comment below and/or tweet them to me.

