@jacobtomlinson
Last active September 27, 2021 11:06
UCX-Py testing on Azure
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Azure UXC-Py Testing\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Build VM\n",
"\n",
"- Create new VM instance.\n",
"- Select `East US` region.\n",
"- Change `Availability options` to `Availability set` and create a set. \n",
" - If building multiple instances put additional instances in the same set.\n",
"- Use the 2nd Gen Ubuntu 18.04 image.\n",
" - Search all images for `Ubuntu Server 18.04` and choose the second one down on the list.\n",
"- Change size to `ND40rs_v2`.\n",
"- Set password login with credentials.\n",
" - User `someuser`\n",
" - Password `somepassword`\n",
"- Leave all other options as default."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Install software\n",
"\n",
"Before installing the drivers ensure the system is up to date.\n",
"\n",
"```\n",
"sudo apt update\n",
"sudo apt upgrade -y\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## NVIDIA Drivers\n",
"\n",
"```\n",
"wget http://uk.download.nvidia.com/tesla/440.33.01/nvidia-driver-local-repo-ubuntu1804-440.33.01_1.0-1_amd64.deb\n",
"sudo dpkg -i nvidia-driver-local-repo-ubuntu1804-440.33.01_1.0-1_amd64.deb\n",
"sudo apt-key add /var/nvidia-driver-local-repo-440.33.01/7fa2af80.pub\n",
"sudo apt update\n",
"sudo apt install cuda-drivers -y\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## IB Drivers\n",
"\n",
"We need to install the drivers as per the [Azure Documentation](https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/hpc/enable-infiniband).\n",
"\n",
"```\n",
"# Modify the variable to desired Mellanox OFED version\n",
"MOFED_VERSION=4.7-3.2.9.0\n",
"# Modify the variable to desired OS\n",
"MOFED_OS=ubuntu18.04\n",
"pushd /tmp\n",
"curl -fSsL https://www.mellanox.com/downloads/ofed/MLNX_OFED-${MOFED_VERSION}/MLNX_OFED_LINUX-${MOFED_VERSION}-${MOFED_OS}-x86_64.tgz | tar -zxpf -\n",
"cd MLNX_OFED_LINUX-*\n",
"sudo ./mlnxofedinstall\n",
"popd\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Driver errors\n",
"\n",
"Running without the `apt upgrade` before hand resulted in a package conflict and error. \n",
"\n",
"<details>\n",
"<summary>Expand to see error and resolution</summary>\n",
"\n",
"```\n",
"Failed command: apt-get install -y automake libnl-route-3-200 dpatch flex graphviz m4 tk bison libnl-3-200 tcl swig gfortran libglib2.0-0 autoconf quilt libltdl-dev debhelper libgfortran4 autotools-dev chrpath\n",
"```\n",
"\n",
"Running that command manually gave this error.\n",
"\n",
"```\n",
"$ sudo apt-get install -y automake libnl-route-3-200 dpatch flex graphviz m4 tk bison libnl-3-200 tcl swig gfortran libglib2.0-0 autoconf quilt libltdl-dev debhelper libgfortran4 autotools-dev chrpath\n",
"Reading package lists... Done\n",
"Building dependency tree\n",
"Reading state information... Done\n",
"libglib2.0-0 is already the newest version (2.56.4-0ubuntu0.18.04.4).\n",
"libglib2.0-0 set to manually installed.\n",
"You might want to run 'apt --fix-broken install' to correct these.\n",
"The following packages have unmet dependencies:\n",
" bison : Depends: libbison-dev (= 2:3.0.4.dfsg-1build1) but it is not going to be installed\n",
" cuda-drivers : Depends: nvidia-440 (>= 440.33.01) but it is not going to be installed\n",
" debhelper : Depends: dh-autoreconf (>= 17~) but it is not going to be installed\n",
" Depends: dh-strip-nondeterminism (>= 0.028~) but it is not going to be installed\n",
" Depends: po-debconf but it is not going to be installed\n",
" gfortran : Depends: gfortran-7 (>= 7.4.0-1~) but it is not going to be installed\n",
" graphviz : Depends: libann0 but it is not going to be installed\n",
" Depends: libcdt5 but it is not going to be installed\n",
" Depends: libcgraph6 but it is not going to be installed\n",
" Depends: libgd3 (>= 2.1.0~alpha~) but it is not going to be installed\n",
" Depends: libgts-0.7-5 (>= 0.7.6) but it is not going to be installed\n",
" Depends: libgvc6 but it is not going to be installed\n",
" Depends: libgvpr2 but it is not going to be installed\n",
" Depends: liblab-gamut1 but it is not going to be installed\n",
" Recommends: fonts-liberation but it is not going to be installed\n",
" libltdl-dev : Depends: libltdl7 (= 2.4.6-2) but it is not going to be installed\n",
" Recommends: libtool but it is not going to be installed\n",
" nvidia-440-dev : Depends: nvidia-440 (>= 440.33.01) but it is not going to be installed\n",
" quilt : Depends: diffstat but it is not going to be installed\n",
" Depends: gettext but it is not going to be installed\n",
" swig : Depends: swig3.0 (>= 3.0.12-1) but it is not going to be installed\n",
" tcl : Depends: tcl8.6 (>= 8.6.0-2) but it is not going to be installed\n",
" tk : Depends: tk8.6 (>= 8.6.0-2) but it is not going to be installed\n",
"E: Unmet dependencies. Try 'apt --fix-broken install' with no packages (or specify a solution).\n",
"```\n",
"\n",
"Then ran the fix broken command as instructed.\n",
"\n",
"```\n",
"$ sudo apt --fix-broken install\n",
"Reading package lists... Done\n",
"Building dependency tree\n",
"Reading state information... Done\n",
"Correcting dependencies... Done\n",
"The following packages were automatically installed and are no longer required:\n",
" grub-pc-bin linux-headers-4.15.0-76\n",
"Use 'sudo apt autoremove' to remove them.\n",
"The following additional packages will be installed:\n",
" nvidia-440\n",
"The following NEW packages will be installed:\n",
" nvidia-440\n",
"0 upgraded, 1 newly installed, 0 to remove and 7 not upgraded.\n",
"173 not fully installed or removed.\n",
"Need to get 0 B/131 MB of archives.\n",
"After this operation, 458 MB of additional disk space will be used.\n",
"Do you want to continue? [Y/n] y\n",
"(Reading database ... 94872 files and directories currently installed.)\n",
"Preparing to unpack .../nvidia-440_440.33.01-0ubuntu1_amd64.deb ...\n",
"Unpacking nvidia-440 (440.33.01-0ubuntu1) ...\n",
"dpkg: error processing archive /var/cache/apt/archives/nvidia-440_440.33.01-0ubuntu1_amd64.deb (--unpack):\n",
" trying to overwrite '/usr/lib/x86_64-linux-gnu/libGLX_indirect.so.0', which is also in package libglx-mesa0:amd64 19.2.8-0ubuntu0~18.04.1\n",
"Errors were encountered while processing:\n",
" /var/cache/apt/archives/nvidia-440_440.33.01-0ubuntu1_amd64.deb\n",
"E: Sub-process /usr/bin/dpkg returned an error code (1)\n",
"```\n",
"\n",
"The nvidia driver seems to complain about a package conflict with `libglx-mesa0`.\n",
"\n",
"So manually installed the package with forced overwriting.\n",
"\n",
"```\n",
"sudo dpkg -i --force-overwrite /var/cache/apt/archives/nvidia-440_440.33.01-0ubuntu1_amd64.deb\n",
"```\n",
"\n",
"Once this completed I was able to successfully run `sudo apt --fix-broken install`.\n",
"\n",
"Then ran `sudo ./mlnxofedinstall` again.\n",
" \n",
"</details>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install GPUDirect RDMA kernel module\n",
"\n",
"As per the [Mellanox Documentation](https://www.mellanox.com/related-docs/prod_software/Mellanox_GPUDirect_User_Manual_v1.5.pdf).\n",
"\n",
"```\n",
"cd /tmp\n",
"wget https://www.mellanox.com/sites/default/files/downloads/ofed/nvidia-peer-memory_1.0-8.tar.gz\n",
"tar xvzf nvidia-peer-memory_1.0-8.tar.gz\n",
"cd nvidia-peer-memory-1.0/\n",
"./build_module.sh\n",
"dpkg-buildpackage -us -uc\n",
"sudo dpkg -i ../nvidia-peer-memory*.deb\n",
"modprobe nv_peer_mem\n",
"```\n",
"\n",
"Check install\n",
"\n",
"```\n",
"$ lsmod | grep nv_peer_mem\n",
"nv_peer_mem 16384 0\n",
"nvidia 19894272 102 nvidia_uvm,nv_peer_mem,nvidia_modeset\n",
"ib_core 327680 12 rdma_cm,ib_ipoib,mlx4_ib,nv_peer_mem,iw_cm,ib_iser,ib_umad,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm,ib_ucm\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Enable IPoIB\n",
"\n",
"```\n",
"sudo sed -i -e 's/# OS.EnableRDMA=y/OS.EnableRDMA=y/g' /etc/waagent.conf\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reboot\n",
"\n",
"Rebooted the machine to ensure all changes were applied.\n",
"\n",
"```\n",
"sudo reboot\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Check IB\n",
"\n",
"```\n",
"$ ip addr show dev ib0\n",
"4: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256\n",
" link/infiniband 20:00:09:26:fe:80:00:00:00:00:00:00:00:15:5d:ff:fd:33:ff:40 brd 00:ff:ff:ff:ff:12:40:1b:80:04:00:00:00:00:00:00:ff:ff:ff:ff\n",
" inet 172.16.1.55/16 brd 172.16.255.255 scope global ib0\n",
" valid_lft forever preferred_lft forever\n",
" inet6 fe80::215:5dff:fd33:ff40/64 scope link\n",
" valid_lft forever preferred_lft forever\n",
"```\n",
"\n",
"```\n",
"$ nvidia-smi topo -m\n",
" GPU0 GPU1 GPU2 GPU3 GPU4 GPU5 GPU6 GPU7 mlx5_0 CPU Affinity\n",
"GPU0 X NV1 NV2 NODE NODE NV2 NODE NV1 NODE 0-19\n",
"GPU1 NV1 X NV2 NV1 NODE NODE NODE NV2 NODE 0-19\n",
"GPU2 NV2 NV2 X NODE NV1 NODE NODE NV1 NODE 0-19\n",
"GPU3 NODE NV1 NODE X NV2 NV1 NV2 NODE NODE 0-19\n",
"GPU4 NODE NODE NV1 NV2 X NV2 NV1 NODE NODE 0-19\n",
"GPU5 NV2 NODE NODE NV1 NV2 X NV1 NODE NODE 0-19\n",
"GPU6 NODE NODE NODE NV2 NV1 NV1 X NV2 NODE 0-19\n",
"GPU7 NV1 NV2 NV1 NODE NODE NODE NV2 X NODE 0-19\n",
"mlx5_0 NODE NODE NODE NODE NODE NODE NODE NODE X\n",
"\n",
"Legend:\n",
"\n",
" X = Self\n",
" SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)\n",
" NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node\n",
" PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)\n",
" PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)\n",
" PIX = Connection traversing at most a single PCIe bridge\n",
" NV# = Connection traversing a bonded set of # NVLinks\n",
" ```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Install UCX-Py and tools"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```\n",
"$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh\n",
"$ bash Miniconda3-latest-Linux-x86_64.sh\n",
"```\n",
"\n",
"Accept the default and allow `conda init` to run. Then start a new shell.\n",
"\n",
"Create a conda environment ([see ucx-py docs](https://ucx-py.readthedocs.io/en/latest/install.html#conda))\n",
"\n",
"```\n",
"$ conda create -n ucxpy -c conda-forge -c rapidsai python=3.7 ipython ucx-proc=*=gpu ucx ucx-py dask distributed numpy cupy -y\n",
"...\n",
"$ conda activate ucxpy\n",
"```\n",
"\n",
"Clone ucx-py repo locally\n",
"\n",
"```\n",
"git clone https://github.com/rapidsai/ucx-py.git\n",
"cd ucx-py/benchmarks\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Run benchmarks\n",
"\n",
"[GitHub Issue on Benchmarks](https://github.com/rapidsai/ucx-py/issues/311)\n",
"\n",
"## Single Node\n",
"\n",
"### TCP\n",
"\n",
"```\n",
"$ UCX_MEMTYPE_CACHE=n UCX_SOCKADDR_TLS_PRIORITY=sockcm UCX_TLS=tcp,cuda_copy,sockcm python local-send-recv.py -o cupy -n \"100MB\" --server-dev 1 --client-dev 2 --reuse-alloc\n",
"[1581000782.671561] [rapids-ucxpy-1:6714 :0] parser.c:1578 UCX WARN unused env variable: UCX_CUDA_IPC_CACHE (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning)\n",
"[1581000785.889780] [rapids-ucxpy-1:6783 :0] parser.c:1578 UCX WARN unused env variable: UCX_CUDA_IPC_CACHE (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning)\n",
"[1581000793.139156] [rapids-ucxpy-1:6783 :0] mpool.c:43 UCX WARN object 0x56296beea780 was not returned to mpool ucp_am_bufs\n",
"Roundtrip benchmark\n",
"--------------------------\n",
"n_iter | 10\n",
"n_bytes | 100.00 MB\n",
"object | cupy\n",
"reuse alloc | True\n",
"==========================\n",
"Device(s) | 1, 2\n",
"Average | 395.07 MB/s\n",
"--------------------------\n",
"Iterations\n",
"--------------------------\n",
"000 |403.10 MB/s\n",
"001 |387.50 MB/s\n",
"002 |389.13 MB/s\n",
"003 |388.82 MB/s\n",
"004 |385.98 MB/s\n",
"005 |410.38 MB/s\n",
"006 |423.28 MB/s\n",
"007 |387.83 MB/s\n",
"008 |393.22 MB/s\n",
"009 |385.06 MB/s\n",
"```\n",
"\n",
"### IB\n",
"\n",
"```\n",
"$ UCX_MEMTYPE_CACHE=n UCX_SOCKADDR_TLS_PRIORITY=sockcm UCX_TLS=tcp,cuda_copy,sockcm,rc python local-send-recv.py -o cupy -n \"100MB\" --server-dev 1 --client-dev 2 --reuse-alloc [36/1091]\n",
"[1581000806.378520] [rapids-ucxpy-1:6893 :0] parser.c:1578 UCX WARN unused env variable: UCX_CUDA_IPC_CACHE (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning)\n",
"[1581000806.852800] [rapids-ucxpy-1:6960 :0] parser.c:1578 UCX WARN unused env variable: UCX_CUDA_IPC_CACHE (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning)\n",
"Roundtrip benchmark\n",
"--------------------------\n",
"n_iter | 10\n",
"n_bytes | 100.00 MB\n",
"object | cupy\n",
"reuse alloc | True\n",
"==========================\n",
"Device(s) | 1, 2\n",
"Average | 2.12 GB/s\n",
"--------------------------\n",
"Iterations\n",
"--------------------------\n",
"000 | 1.58 GB/s\n",
"001 | 2.20 GB/s\n",
"002 | 2.21 GB/s\n",
"003 | 2.21 GB/s\n",
"004 | 2.21 GB/s\n",
"005 | 2.21 GB/s\n",
"006 | 2.21 GB/s\n",
"007 | 2.21 GB/s\n",
"008 | 2.21 GB/s\n",
"009 | 2.21 GB/s\n",
"```\n",
"\n",
"### NVLINK\n",
"\n",
"#### NV1\n",
"\n",
"```\n",
"$ UCX_MEMTYPE_CACHE=n UCX_SOCKADDR_TLS_PRIORITY=sockcm UCX_TLS=tcp,cuda_copy,sockcm,cuda_ipc python local-send-recv.py -o cupy -n \"100MB\" --server-dev 1 --client-dev 0 --reuse-alloc\n",
"[1581000977.744833] [rapids-ucxpy-1:7444 :0] sock.c:224 UCX ERROR recv(fd=53) failed: Connection reset by peer\n",
"\n",
"Roundtrip benchmark\n",
"--------------------------\n",
"n_iter | 10\n",
"n_bytes | 100.00 MB\n",
"object | cupy\n",
"reuse alloc | True\n",
"==========================\n",
"Device(s) | 1, 0\n",
"Average | 17.08 GB/s\n",
"--------------------------\n",
"Iterations\n",
"--------------------------\n",
"000 | 16.54 GB/s\n",
"001 | 14.28 GB/s\n",
"002 | 17.58 GB/s\n",
"003 | 17.52 GB/s\n",
"004 | 17.51 GB/s\n",
"005 | 17.66 GB/s\n",
"006 | 17.72 GB/s\n",
"007 | 17.67 GB/s\n",
"008 | 17.31 GB/s\n",
"009 | 17.66 GB/s\n",
"```\n",
"\n",
"#### NV2\n",
"\n",
"```\n",
"$ UCX_MEMTYPE_CACHE=n UCX_SOCKADDR_TLS_PRIORITY=sockcm UCX_TLS=tcp,cuda_copy,sockcm,cuda_ipc python local-send-recv.py -o cupy -n \"100MB\" --server-dev 1 --client-dev 2 --reuse-alloc\n",
"[1581000819.685852] [rapids-ucxpy-1:7127 :0] mpool.c:43 UCX WARN object 0x55d8125fe500 was not returned to mpool ucp_am_bufs\n",
"Roundtrip benchmark\n",
"--------------------------\n",
"n_iter | 10\n",
"n_bytes | 100.00 MB\n",
"object | cupy\n",
"reuse alloc | True\n",
"==========================\n",
"Device(s) | 1, 2\n",
"Average | 26.12 GB/s\n",
"--------------------------\n",
"Iterations\n",
"--------------------------\n",
"000 | 19.73 GB/s\n",
"001 | 20.11 GB/s\n",
"002 | 29.31 GB/s\n",
"003 | 28.27 GB/s\n",
"004 | 28.40 GB/s\n",
"005 | 28.23 GB/s\n",
"006 | 27.69 GB/s\n",
"007 | 28.45 GB/s\n",
"008 | 28.59 GB/s\n",
"009 | 27.70 GB/s\n",
"```"
]
},
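{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Checking the reported averages\n",
"\n",
"The `Average` line in the single-node runs above is lower than a plain mean of the ten iterations. A minimal sketch of one way to check this, assuming the benchmark reports total bytes over total time, which for equal-sized messages is the harmonic mean of the per-iteration bandwidths. That assumption is inferred from the printed numbers only, not confirmed from the benchmark script, and the helper names are hypothetical.\n",
"\n",
"```python\n",
"# Per-iteration bandwidths in GB/s, copied from the IB and NV2 runs above.\n",
"ib_iterations = [1.58, 2.20, 2.21, 2.21, 2.21, 2.21, 2.21, 2.21, 2.21, 2.21]\n",
"nv2_iterations = [19.73, 20.11, 29.31, 28.27, 28.40, 28.23, 27.69, 28.45, 28.59, 27.70]\n",
"\n",
"\n",
"def harmonic_mean(values):\n",
"    # For equal-sized messages, total bytes / total time reduces to this.\n",
"    return len(values) / sum(1.0 / v for v in values)\n",
"\n",
"\n",
"def arithmetic_mean(values):\n",
"    # Plain mean of the per-iteration figures, for comparison.\n",
"    return sum(values) / len(values)\n",
"\n",
"\n",
"for name, values in [('IB', ib_iterations), ('NV2', nv2_iterations)]:\n",
"    print(f'{name}: harmonic {harmonic_mean(values):.2f} GB/s, '\n",
"          f'arithmetic {arithmetic_mean(values):.2f} GB/s')\n",
"```\n",
"\n",
"Under that assumption the harmonic mean matches the reported 2.12 GB/s for IB and 26.12 GB/s for NV2, while the arithmetic mean comes out slightly higher because the slow first iterations weigh less."
]
},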
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Multi node\n",
"\n",
"Make note of the ib addresses with `ip addr show dev ib0`.\n",
"\n",
"### Server (node 1)\n",
"\n",
"```\n",
"UCX_MEMTYPE_CACHE=n UCX_SOCKADDR_TLS_PRIORITY=sockcm UCX_TLS=tcp,cuda_copy,sockcm,rc python recv-into-client.py -r recv_into -o cupy --n-bytes 1000Mb -p 13337 --n-iter 100\n",
"```\n",
"\n",
"### Client (node 2)\n",
"\n",
"```\n",
"UCX_MEMTYPE_CACHE=n UCX_SOCKADDR_TLS_PRIORITY=sockcm UCX_TLS=tcp,cuda_copy,sockcm,rc python recv-into-client.py -r recv_into -o cupy --n-bytes 1000Mb -p 13337 -s 172.16.1.55 --n-iter 100\n",
"```\n",
"\n",
"### Results\n",
"\n",
"```\n",
"[1581000858.688259] [rapids-ucxpy-1:7223 :0] parser.c:1578 UCX WARN unused env variable: UCX_CUDA_IPC_CACHE (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning)\n",
"CUDA RUNTIME DEVICE: 0\n",
"Roundtrip benchmark\n",
"-------------------\n",
"n_iter | 100\n",
"n_bytes | 1000.00 MB\n",
"recv | recv_into\n",
"object | cupy\n",
"inc | False\n",
"\n",
"===================\n",
"2.74 GB / s\n",
"===================\n",
"[1581000944.094646] [rapids-ucxpy-1:7223 :0] rc_ep.c:321 UCX WARN destroying rc ep 0x561f4ab84b88 with uncompleted operation 0x561f52681140\n",
"[1581000944.099832] [rapids-ucxpy-1:7223 :0] mpool.c:43 UCX WARN object 0x561f4abe13c0 was not returned to mpool ucp_requests\n",
"[1581000944.099844] [rapids-ucxpy-1:7223 :0] callbackq.c:447 UCX WARN 0 fast-path and 1 slow-path callbacks remain in the queue\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## ucx_perftest"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```\n",
"$ UCX_MEMTYPE_CACHE=n UCX_SOCKADDR_TLS_PRIORITY=sockcm UCX_TLS=tcp,cuda_copy,sockcm,rc ucx_perftest localhost -t tag_bw -m cuda -n 100 -s 10000000\n",
"[1581000133.545295] [rapids-ucxpy-2:14894:0] perftest.c:1376 UCX WARN CPU affinity is not set (bound to 40 cpus). Performance may be impacted.\n",
"+--------------+-----------------------------+---------------------+-----------------------+\n",
"| | latency (usec) | bandwidth (MB/s) | message rate (msg/s) |\n",
"+--------------+---------+---------+---------+----------+----------+-----------+-----------+\n",
"| # iterations | typical | average | overall | average | overall | average | overall |\n",
"+--------------+---------+---------+---------+----------+----------+-----------+-----------+\n",
" 100 0.000 1487.861 1487.861 6409.70 6409.70 672 672\n",
"```"
]
},
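{
"cell_type": "markdown",
"metadata": {},
"source": [
"The latency, bandwidth and message-rate columns in the `ucx_perftest` row above are related by simple arithmetic. A minimal sketch, assuming the `MB/s` column means 2^20 bytes per second. That assumption is what makes the figures agree; it is not stated in the output. The function names are hypothetical.\n",
"\n",
"```python\n",
"MESSAGE_SIZE_BYTES = 10_000_000  # the -s argument passed to ucx_perftest\n",
"AVG_LATENCY_USEC = 1487.861      # average latency column from the run above\n",
"BYTES_PER_MIB = 1 << 20          # assumption: MB/s in the table means MiB/s\n",
"\n",
"\n",
"def message_rate(latency_usec):\n",
"    # Messages per second when each message takes latency_usec on average.\n",
"    return 1e6 / latency_usec\n",
"\n",
"\n",
"def bandwidth_mib_s(size_bytes, latency_usec):\n",
"    # Bytes moved per second, expressed in MiB/s.\n",
"    return size_bytes * message_rate(latency_usec) / BYTES_PER_MIB\n",
"\n",
"\n",
"print(f'message rate: {message_rate(AVG_LATENCY_USEC):.0f} msg/s')\n",
"print(f'bandwidth: {bandwidth_mib_s(MESSAGE_SIZE_BYTES, AVG_LATENCY_USEC):.2f} MiB/s')\n",
"```\n",
"\n",
"With these inputs the sketch reproduces the 672 msg/s and 6409.70 MB/s shown in the table, which is roughly 6.7 GB/s in decimal units."
]
},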
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 4
}