@cgmb
Last active February 22, 2023 18:59
profile rocblas_initialize() on Ubuntu 20.04
#!/usr/bin/env bash
# How to profile rocblas_initialize() on Ubuntu 20.04
#
# This guide is written as a script, but it's really intended for you to copy/paste
# the bits you need into your interactive shell as you go along.
#
# I have not yet figured out how to set up all the permissions to do this in docker,
# so I would suggest doing it on a real system.
set -exuo pipefail
# build everything with debug symbols and better stack traces
export CXXFLAGS="-g -fno-omit-frame-pointer"
# build ROCm from source
# https://gist.github.com/cgmb/948455f2ab1f7132815c6fe5200bce38
./build-rocm-5.4.2.sh
# install perf
# The command below assumes you are using the Ubuntu 20.04 Hardware Enablement stack.
# Check that your kernel version from `uname -a` matches the version of
# the package reported by `apt search linux-tools-generic`
apt install linux-tools-generic-hwe-20.04
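# A rough sketch for checking that match after installing. It assumes Ubuntu's
# /usr/bin/perf wrapper looks for the real binary under
# /usr/lib/linux-tools/<kernel release>/ (an assumption about the Ubuntu packaging,
# not something perf itself guarantees). The variable names are illustrative only.
kernel_release="$(uname -r)"
if [ -x "/usr/lib/linux-tools/${kernel_release}/perf" ]; then
  perf_status="found"
else
  perf_status="missing" # the installed linux-tools may target a different kernel
fi
echo "perf for kernel ${kernel_release}: ${perf_status}"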
# build the benchmark program (rbi) from the CMakeLists.txt and main.c in this gist
cmake -S. -Bbuild -DCMAKE_PREFIX_PATH=/opt/rocm
make -C build
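# Before profiling, it may help to check whether the kernel lets unprivileged users
# run perf at all. A minimal sketch, assuming the standard
# /proc/sys/kernel/perf_event_paranoid sysctl is present. Treating values above 2 as
# "perf blocked for regular users" is an assumption based on Debian/Ubuntu kernels.
paranoid="$(cat /proc/sys/kernel/perf_event_paranoid 2>/dev/null || echo unknown)"
case "${paranoid}" in
  unknown) perf_access="unknown" ;;
  -1|0|1|2) perf_access="user-space profiling allowed" ;;
  *) perf_access="likely needs root or a lower setting" ;;
esac
echo "perf_event_paranoid=${paranoid}: ${perf_access}"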
# libhsa-amd-aqlprofile64 is closed source, so you'll see a warning about the missing library.
# Don't worry. It's an optional component. aqlprofile is used for GPU profiling. It's not needed
# for profiling the host code.
perf stat build/rbi # view some stats
perf record build/rbi # create perf.data
perf report -i perf.data # browse the recorded profile
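# The frame pointers enabled via CXXFLAGS above only pay off if perf records call stacks.
# A sketch, not part of the original steps: --call-graph fp asks perf to unwind using
# frame pointers, and the output file name perf-callgraph.data is an arbitrary choice.
# The guard skips the run when perf or build/rbi is not available.
callgraph_data="perf-callgraph.data"
if command -v perf >/dev/null 2>&1 && [ -x build/rbi ]; then
  perf record --call-graph fp -o "${callgraph_data}" build/rbi
  perf report -i "${callgraph_data}" --no-children
fi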
# CMakeLists.txt
cmake_minimum_required(VERSION 3.16)
project(rocblas-init-benchmark)
find_package(rocblas REQUIRED)
add_executable(rbi main.c)
target_link_libraries(rbi PRIVATE roc::rocblas)
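/* main.c */
#include <rocblas/rocblas.h>

int main(void) {
    rocblas_initialize(); /* initialize rocBLAS on the current device; this is the call being profiled */
    return 0;
}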