@mwrnd
Created October 16, 2019 14:56
AMD ROCm Installation Notes
sudo apt-get update
sudo apt-get upgrade
sudo reboot
**** NOTE: did not install 5.0 kernel image; did not run dist-upgrade
sudo gedit /etc/default/grub
--> change the following
GRUB_TIMEOUT=1
GRUB_CMDLINE_LINUX_DEFAULT=""
GRUB_SAVEDEFAULT=true
GRUB_DEFAULT=saved
sudo update-grub
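**** NOTE: a minimal sketch for confirming that the four values above are present in /etc/default/grub. The helper names check_grub_setting and report_grub_settings are hypothetical (not part of GRUB or ROCm); they only read the file and change nothing.

```shell
# check_grub_setting FILE KEY VALUE
# Succeeds if FILE contains an uncommented line that is exactly "KEY=VALUE".
check_grub_setting() {
    grep -qxF -- "$2=$3" "$1"
}

# report_grub_settings FILE
# Prints ok/MISSING for each of the four values these notes set.
report_grub_settings() {
    for kv in 'GRUB_TIMEOUT=1' 'GRUB_CMDLINE_LINUX_DEFAULT=""' \
              'GRUB_SAVEDEFAULT=true' 'GRUB_DEFAULT=saved'; do
        if check_grub_setting "$1" "${kv%%=*}" "${kv#*=}"; then
            echo "ok      $kv"
        else
            echo "MISSING $kv"
        fi
    done
}
```

Usage: report_grub_settings /etc/default/grub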
sudo apt install libnuma-dev glibc-doc glibc-doc-reference libc-dev-bin
wget http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key
sha1sum rocm.gpg.key
sudo apt-key add rocm.gpg.key
echo 'deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main' | sudo tee /etc/apt/sources.list.d/rocm.list
sudo apt-get update
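**** NOTE: the sha1sum step above only prints a hash; it checks nothing by itself. A minimal sketch of an actual comparison follows. The helper name verify_sha1 is hypothetical, and the expected hash is an assumption: these notes do not record one, so it has to come from a source you trust.

```shell
# verify_sha1 FILE EXPECTED_HEX
# Succeeds only if the SHA-1 digest of FILE equals EXPECTED_HEX.
# EXPECTED_HEX is supplied by the caller; no reference value is given in these notes.
verify_sha1() {
    actual=$(sha1sum "$1" | awk '{print $1}')
    [ "$actual" = "$2" ]
}
```

Usage: verify_sha1 rocm.gpg.key <expected-hash> || echo "key hash mismatch"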
sudo reboot
**** NOTE: only rocm-dkms is strictly required
sudo apt install rocm-dkms cmake autoconf automake libtool flex bison gcc-doc python-doc python3-doc python3-nose gnu-standards autoconf-doc bison-doc cmake-doc flex-doc python-nose-doc python3-examples
echo 'export PATH=$PATH:/opt/rocm/bin:/opt/rocm/profiler/bin:/opt/rocm/opencl/bin/x86_64' | sudo tee -a /etc/profile.d/rocm.sh
sudo usermod -a -G video $LOGNAME
sudo reboot
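**** NOTE: the new group membership only applies to sessions started after the usermod call, which is why a reboot (or at least a fresh login) follows it. A minimal sketch for checking membership; the helper name in_group is hypothetical.

```shell
# in_group GROUP [USER]
# Succeeds if USER is a member of GROUP.
# Without USER, `id` reports the groups of the current session, so this
# stays false until you have logged in again after the usermod call.
in_group() {
    id -nG ${2:+"$2"} | tr ' ' '\n' | grep -qxF -- "$1"
}
```

Usage: in_group video || echo "not in video group yet; log out and back in"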
sudo apt-get install build-essential clang clang-format clang-tidy cmake cmake-qt-gui ssh curl apt-utils pkg-config g++-multilib git libunwind-dev libfftw3-dev libelf-dev libncurses5-dev libpthread-stubs0-dev vim gfortran libboost-program-options-dev libssl-dev libboost-dev libboost-system-dev libboost-filesystem-dev rpm wget libboost1.65-doc libboost-atomic1.65-dev libboost-numpy1.65-dev
sudo apt-get install rocm-dkms rocm-dev rocm-libs hipcub rccl rocm-device-libs hsa-ext-rocr-dev hsakmt-roct-dev hsa-rocr-dev rocm-opencl rocm-opencl-dev rocm-utils miopen-hip miopengemm rocminfo rocm-profiler rocm_bandwidth_test rocr_debug_agent
sudo reboot
rocm_agent_enumerator
gfx000
gfx803
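**** NOTE: a minimal sketch for checking that the enumerator reports at least one GPU. It assumes gfx000 stands for the CPU agent and that any other gfx entry is a GPU; the helper name list_gpu_agents is hypothetical.

```shell
# list_gpu_agents
# Reads rocm_agent_enumerator output on stdin and prints every agent
# except gfx000 (assumed here to be the CPU placeholder).
# Exits non-zero when no other agent is listed.
list_gpu_agents() {
    grep -E '^gfx[0-9a-f]+$' | grep -vx 'gfx000'
}
```

Usage: rocm_agent_enumerator | list_gpu_agents || echo "no GPU agent found"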
rocm_bandwidth_test
Unidirectional copy peak bandwidth GB/s
D/D  0          1
0    N/A        11.257026
1    11.117093  32.398932
Bdirectional copy peak bandwidth GB/s
D/D  0          1
0    N/A        14.794945
1    14.794945  N/A
dmesg | grep kfd
[ 3.328899] kfd kfd: Allocated 3969056 bytes on gart
[ 3.329348] kfd kfd: added device 1002:67df
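**** NOTE: a minimal sketch for pulling the PCI vendor:device IDs out of the kfd lines shown above (1002 is AMD's PCI vendor ID). The helper name kfd_devices is hypothetical, and the pattern assumes the "added device" message format seen in this log.

```shell
# kfd_devices
# Reads kernel log text on stdin and prints the vendor:device ID from each
# "kfd kfd: added device" line, one ID per line.
kfd_devices() {
    sed -n 's/.*kfd kfd: added device \([0-9a-f]\{4\}:[0-9a-f]\{4\}\).*/\1/p'
}
```

Usage: dmesg | kfd_devices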
sudo apt-get install python-numpy python-dev python-wheel python-mock python-future python-pip python-yaml python-setuptools python-crypto-doc python-cryptography-doc python-cryptography-vectors python-numpy-doc python-nose
sudo apt-get install python3-numpy python3-dev python3-wheel python3-mock python3-future python3-pip python3-yaml python3-setuptools
sudo pip install tensorflow-rocm --upgrade
python
>>> import tensorflow
>>> tensorflow.__version__
'1.14.2'
>>>
wget -O benchmarks.zip https://codeload.github.com/tensorflow/benchmarks/zip/abb1aec2f2db4ba73fac2e1359227aef59b10258
unzip benchmarks.zip
mv benchmarks-abb1aec2f2db4ba73fac2e1359227aef59b10258 ~/benchmarks
cd ~/benchmarks/scripts/tf_cnn_benchmarks
python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=resnet50
[...]
MIOpen(HIP): Warning [FindRecordUnsafe] File is unreadable: /opt/rocm/miopen/share/miopen/db/gfx803_32.cd.pdb.txt
MIOpen(HIP): Warning [FindRecordUnsafe] File is unreadable: /opt/rocm/miopen/share/miopen/db/gfx803_32.cd.pdb.txt
[...]
Step Img/sec total_loss
1 images/sec: 52.2 +/- 0.0 (jitter = 0.0) 8.220
10 images/sec: 52.1 +/- 0.0 (jitter = 0.1) 7.880
20 images/sec: 52.0 +/- 0.0 (jitter = 0.1) 7.910
30 images/sec: 52.0 +/- 0.0 (jitter = 0.1) 7.820
40 images/sec: 52.0 +/- 0.0 (jitter = 0.1) 8.005
50 images/sec: 52.0 +/- 0.0 (jitter = 0.1) 7.768
60 images/sec: 51.9 +/- 0.0 (jitter = 0.1) 8.114
70 images/sec: 51.9 +/- 0.0 (jitter = 0.1) 7.817
80 images/sec: 51.9 +/- 0.0 (jitter = 0.1) 7.978
90 images/sec: 51.9 +/- 0.0 (jitter = 0.1) 8.099
100 images/sec: 51.9 +/- 0.0 (jitter = 0.1) 8.042
----------------------------------------------------------------
total images/sec: 51.87
----------------------------------------------------------------
sudo cp /opt/rocm/miopen/share/miopen/db/gfx803_36.cd.pdb.txt /opt/rocm/miopen/share/miopen/db/gfx803_32.cd.pdb.txt
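**** NOTE: a minimal sketch of a more cautious version of the copy above, which skips the copy when the target file is already readable (for example, if a later package ships it). The helper name copy_if_unreadable and the SUDO variable are hypothetical conveniences, not part of ROCm or MIOpen.

```shell
# copy_if_unreadable SRC DST
# Copies SRC to DST only when DST is not already a readable file.
# Set SUDO=sudo in the environment to run the copy with elevated rights.
copy_if_unreadable() {
    if [ -r "$2" ]; then
        echo "already present: $2"
    else
        ${SUDO:-} cp -- "$1" "$2"
    fi
}
```

Usage: SUDO=sudo copy_if_unreadable /opt/rocm/miopen/share/miopen/db/gfx803_36.cd.pdb.txt /opt/rocm/miopen/share/miopen/db/gfx803_32.cd.pdb.txt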
python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=resnet50
[...]
Step Img/sec total_loss
1 images/sec: 52.5 +/- 0.0 (jitter = 0.0) 8.220
10 images/sec: 52.6 +/- 0.0 (jitter = 0.1) 7.880
20 images/sec: 52.5 +/- 0.0 (jitter = 0.1) 7.910
30 images/sec: 52.5 +/- 0.0 (jitter = 0.1) 7.821
40 images/sec: 52.5 +/- 0.0 (jitter = 0.1) 8.004
50 images/sec: 52.5 +/- 0.0 (jitter = 0.1) 7.769
60 images/sec: 52.5 +/- 0.0 (jitter = 0.1) 8.114
70 images/sec: 52.4 +/- 0.0 (jitter = 0.1) 7.819
80 images/sec: 52.4 +/- 0.0 (jitter = 0.1) 7.979
90 images/sec: 52.4 +/- 0.0 (jitter = 0.1) 8.096
100 images/sec: 52.4 +/- 0.0 (jitter = 0.1) 8.026
----------------------------------------------------------------
total images/sec: 52.38
----------------------------------------------------------------
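**** NOTE: a minimal sketch for comparing the two runs above. The helper names total_images_per_sec and percent_change are hypothetical; the first assumes the "total images/sec:" summary line format shown in the benchmark output.

```shell
# total_images_per_sec
# Reads tf_cnn_benchmarks output on stdin and prints the number from the
# "total images/sec:" summary line.
total_images_per_sec() {
    awk '/^total images\/sec:/ {print $3}'
}

# percent_change BEFORE AFTER
# Prints the relative change from BEFORE to AFTER in percent, 2 decimals.
percent_change() {
    awk -v a="$1" -v b="$2" 'BEGIN {printf "%.2f\n", (b - a) / a * 100}'
}
```

Usage: percent_change 51.87 52.38 prints 0.98, so copying the database file gained roughly 1% here.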
rocm-smi
========================ROCm System Management Interface========================
================================================================================
GPU  Temp   AvgPwr    SCLK     MCLK     Fan     Perf  PwrCap  VRAM%  GPU%
0    71.0c  120.002W  1233Mhz  1750Mhz  38.82%  high  120.0W  97%    100%
================================================================================
==============================End of ROCm SMI Log ==============================
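**** NOTE: a minimal sketch for logging GPU temperature during a benchmark run. It assumes the column layout shown above (GPU index first, temperature second with a trailing "c"); other rocm-smi versions may print a different layout. The helper name gpu_temps is hypothetical.

```shell
# gpu_temps
# Reads rocm-smi output on stdin; for each row that starts with a numeric
# GPU index, prints "index temperature" with the trailing "c" removed.
gpu_temps() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /c$/ {sub(/c$/, "", $2); print $1, $2}'
}
```

Usage: rocm-smi | gpu_temps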
**** NOTE: the NCF and DenseNet benchmarks require the TensorFlow models repository to be installed:
wget https://codeload.github.com/tensorflow/models/zip/v1.13.0
unzip v1.13.0
sudo mv models-1.13.0 /usr/local/share/
sudo chown -R root:root /usr/local/share/models-1.13.0
export PYTHONPATH=$PYTHONPATH:/usr/local/share/models-1.13.0/
**** NOTE: my setup uses:
export PYTHONPATH=/usr/local/share/models-1.13.0
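**** NOTE: the two export forms above differ only when PYTHONPATH starts out empty: the first then leaves a leading colon in the value, the second does not. A minimal sketch that handles both cases; the helper name add_to_pythonpath is hypothetical.

```shell
# add_to_pythonpath DIR
# Appends DIR to PYTHONPATH. When PYTHONPATH is empty or unset the result
# is just DIR, with no leading colon.
add_to_pythonpath() {
    PYTHONPATH="${PYTHONPATH:+$PYTHONPATH:}$1"
    export PYTHONPATH
}
```

Usage: add_to_pythonpath /usr/local/share/models-1.13.0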