Last active
September 27, 2021 11:06
UCX-Py testing on Azure
{ | |
"cells": [ | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Azure UCX-Py Testing\n",
"\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Build VM\n", | |
"\n", | |
"- Create new VM instance.\n", | |
"- Select `East US` region.\n", | |
"- Change `Availability options` to `Availability set` and create a set. \n", | |
" - If building multiple instances put additional instances in the same set.\n", | |
"- Use the 2nd Gen Ubuntu 18.04 image.\n", | |
" - Search all images for `Ubuntu Server 18.04` and choose the second one down on the list.\n", | |
"- Change size to `ND40rs_v2`.\n", | |
"- Set password login with credentials.\n", | |
" - User `someuser`\n", | |
" - Password `somepassword`\n", | |
"- Leave all other options as default." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Install software\n", | |
"\n", | |
"Before installing the drivers ensure the system is up to date.\n", | |
"\n", | |
"```\n", | |
"sudo apt update\n", | |
"sudo apt upgrade -y\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## NVIDIA Drivers\n", | |
"\n", | |
"```\n", | |
"wget http://uk.download.nvidia.com/tesla/440.33.01/nvidia-driver-local-repo-ubuntu1804-440.33.01_1.0-1_amd64.deb\n", | |
"sudo dpkg -i nvidia-driver-local-repo-ubuntu1804-440.33.01_1.0-1_amd64.deb\n", | |
"sudo apt-key add /var/nvidia-driver-local-repo-440.33.01/7fa2af80.pub\n", | |
"sudo apt update\n", | |
"sudo apt install cuda-drivers -y\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## IB Drivers\n", | |
"\n", | |
"We need to install the drivers as per the [Azure Documentation](https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/hpc/enable-infiniband).\n", | |
"\n", | |
"```\n", | |
"# Modify the variable to desired Mellanox OFED version\n", | |
"MOFED_VERSION=4.7-3.2.9.0\n", | |
"# Modify the variable to desired OS\n", | |
"MOFED_OS=ubuntu18.04\n", | |
"pushd /tmp\n", | |
"curl -fSsL https://www.mellanox.com/downloads/ofed/MLNX_OFED-${MOFED_VERSION}/MLNX_OFED_LINUX-${MOFED_VERSION}-${MOFED_OS}-x86_64.tgz | tar -zxpf -\n", | |
"cd MLNX_OFED_LINUX-*\n", | |
"sudo ./mlnxofedinstall\n", | |
"popd\n", | |
"```" | |
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Driver errors\n",
"\n",
"Running the Mellanox OFED installer without running `apt upgrade` beforehand resulted in a package conflict and the error below.\n",
"\n", | |
"<details>\n", | |
"<summary>Expand to see error and resolution</summary>\n", | |
"\n", | |
"```\n", | |
"Failed command: apt-get install -y automake libnl-route-3-200 dpatch flex graphviz m4 tk bison libnl-3-200 tcl swig gfortran libglib2.0-0 autoconf quilt libltdl-dev debhelper libgfortran4 autotools-dev chrpath\n", | |
"```\n", | |
"\n", | |
"Running that command manually gave this error.\n", | |
"\n", | |
"```\n", | |
"$ sudo apt-get install -y automake libnl-route-3-200 dpatch flex graphviz m4 tk bison libnl-3-200 tcl swig gfortran libglib2.0-0 autoconf quilt libltdl-dev debhelper libgfortran4 autotools-dev chrpath\n", | |
"Reading package lists... Done\n", | |
"Building dependency tree\n", | |
"Reading state information... Done\n", | |
"libglib2.0-0 is already the newest version (2.56.4-0ubuntu0.18.04.4).\n", | |
"libglib2.0-0 set to manually installed.\n", | |
"You might want to run 'apt --fix-broken install' to correct these.\n", | |
"The following packages have unmet dependencies:\n", | |
" bison : Depends: libbison-dev (= 2:3.0.4.dfsg-1build1) but it is not going to be installed\n", | |
" cuda-drivers : Depends: nvidia-440 (>= 440.33.01) but it is not going to be installed\n", | |
" debhelper : Depends: dh-autoreconf (>= 17~) but it is not going to be installed\n", | |
" Depends: dh-strip-nondeterminism (>= 0.028~) but it is not going to be installed\n", | |
" Depends: po-debconf but it is not going to be installed\n", | |
" gfortran : Depends: gfortran-7 (>= 7.4.0-1~) but it is not going to be installed\n", | |
" graphviz : Depends: libann0 but it is not going to be installed\n", | |
" Depends: libcdt5 but it is not going to be installed\n", | |
" Depends: libcgraph6 but it is not going to be installed\n", | |
" Depends: libgd3 (>= 2.1.0~alpha~) but it is not going to be installed\n", | |
" Depends: libgts-0.7-5 (>= 0.7.6) but it is not going to be installed\n", | |
" Depends: libgvc6 but it is not going to be installed\n", | |
" Depends: libgvpr2 but it is not going to be installed\n", | |
" Depends: liblab-gamut1 but it is not going to be installed\n", | |
" Recommends: fonts-liberation but it is not going to be installed\n", | |
" libltdl-dev : Depends: libltdl7 (= 2.4.6-2) but it is not going to be installed\n", | |
" Recommends: libtool but it is not going to be installed\n", | |
" nvidia-440-dev : Depends: nvidia-440 (>= 440.33.01) but it is not going to be installed\n", | |
" quilt : Depends: diffstat but it is not going to be installed\n", | |
" Depends: gettext but it is not going to be installed\n", | |
" swig : Depends: swig3.0 (>= 3.0.12-1) but it is not going to be installed\n", | |
" tcl : Depends: tcl8.6 (>= 8.6.0-2) but it is not going to be installed\n", | |
" tk : Depends: tk8.6 (>= 8.6.0-2) but it is not going to be installed\n", | |
"E: Unmet dependencies. Try 'apt --fix-broken install' with no packages (or specify a solution).\n", | |
"```\n", | |
"\n", | |
"Then ran the fix broken command as instructed.\n", | |
"\n", | |
"```\n", | |
"$ sudo apt --fix-broken install\n", | |
"Reading package lists... Done\n", | |
"Building dependency tree\n", | |
"Reading state information... Done\n", | |
"Correcting dependencies... Done\n", | |
"The following packages were automatically installed and are no longer required:\n", | |
" grub-pc-bin linux-headers-4.15.0-76\n", | |
"Use 'sudo apt autoremove' to remove them.\n", | |
"The following additional packages will be installed:\n", | |
" nvidia-440\n", | |
"The following NEW packages will be installed:\n", | |
" nvidia-440\n", | |
"0 upgraded, 1 newly installed, 0 to remove and 7 not upgraded.\n", | |
"173 not fully installed or removed.\n", | |
"Need to get 0 B/131 MB of archives.\n", | |
"After this operation, 458 MB of additional disk space will be used.\n", | |
"Do you want to continue? [Y/n] y\n", | |
"(Reading database ... 94872 files and directories currently installed.)\n", | |
"Preparing to unpack .../nvidia-440_440.33.01-0ubuntu1_amd64.deb ...\n", | |
"Unpacking nvidia-440 (440.33.01-0ubuntu1) ...\n", | |
"dpkg: error processing archive /var/cache/apt/archives/nvidia-440_440.33.01-0ubuntu1_amd64.deb (--unpack):\n", | |
" trying to overwrite '/usr/lib/x86_64-linux-gnu/libGLX_indirect.so.0', which is also in package libglx-mesa0:amd64 19.2.8-0ubuntu0~18.04.1\n", | |
"Errors were encountered while processing:\n", | |
" /var/cache/apt/archives/nvidia-440_440.33.01-0ubuntu1_amd64.deb\n", | |
"E: Sub-process /usr/bin/dpkg returned an error code (1)\n", | |
"```\n", | |
"\n", | |
"The nvidia driver seems to complain about a package conflict with `libglx-mesa0`.\n", | |
"\n", | |
"So manually installed the package with forced overwriting.\n", | |
"\n", | |
"```\n", | |
"sudo dpkg -i --force-overwrite /var/cache/apt/archives/nvidia-440_440.33.01-0ubuntu1_amd64.deb\n", | |
"```\n", | |
"\n", | |
"Once this completed I was able to successfully run `sudo apt --fix-broken install`.\n", | |
"\n", | |
"Then ran `sudo ./mlnxofedinstall` again.\n", | |
" \n", | |
"</details>" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Install GPUDirect RDMA kernel module\n", | |
"\n", | |
"As per the [Mellanox Documentation](https://www.mellanox.com/related-docs/prod_software/Mellanox_GPUDirect_User_Manual_v1.5.pdf).\n", | |
"\n", | |
"```\n", | |
"cd /tmp\n", | |
"wget https://www.mellanox.com/sites/default/files/downloads/ofed/nvidia-peer-memory_1.0-8.tar.gz\n", | |
"tar xvzf nvidia-peer-memory_1.0-8.tar.gz\n", | |
"cd nvidia-peer-memory-1.0/\n", | |
"./build_module.sh\n", | |
"dpkg-buildpackage -us -uc\n", | |
"sudo dpkg -i ../nvidia-peer-memory*.deb\n", | |
"modprobe nv_peer_mem\n", | |
"```\n", | |
"\n", | |
"Check install\n", | |
"\n", | |
"```\n", | |
"$ lsmod | grep nv_peer_mem\n", | |
"nv_peer_mem 16384 0\n", | |
"nvidia 19894272 102 nvidia_uvm,nv_peer_mem,nvidia_modeset\n", | |
"ib_core 327680 12 rdma_cm,ib_ipoib,mlx4_ib,nv_peer_mem,iw_cm,ib_iser,ib_umad,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm,ib_ucm\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Enable IPoIB\n", | |
"\n", | |
"```\n", | |
"sudo sed -i -e 's/# OS.EnableRDMA=y/OS.EnableRDMA=y/g' /etc/waagent.conf\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Reboot\n", | |
"\n", | |
"Rebooted the machine to ensure all changes were applied.\n", | |
"\n", | |
"```\n", | |
"sudo reboot\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Check IB\n", | |
"\n", | |
"```\n", | |
"$ ip addr show dev ib0\n", | |
"4: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256\n", | |
" link/infiniband 20:00:09:26:fe:80:00:00:00:00:00:00:00:15:5d:ff:fd:33:ff:40 brd 00:ff:ff:ff:ff:12:40:1b:80:04:00:00:00:00:00:00:ff:ff:ff:ff\n", | |
" inet 172.16.1.55/16 brd 172.16.255.255 scope global ib0\n", | |
" valid_lft forever preferred_lft forever\n", | |
" inet6 fe80::215:5dff:fd33:ff40/64 scope link\n", | |
" valid_lft forever preferred_lft forever\n", | |
"```\n", | |
"\n", | |
"```\n", | |
"$ nvidia-smi topo -m\n", | |
" GPU0 GPU1 GPU2 GPU3 GPU4 GPU5 GPU6 GPU7 mlx5_0 CPU Affinity\n", | |
"GPU0 X NV1 NV2 NODE NODE NV2 NODE NV1 NODE 0-19\n", | |
"GPU1 NV1 X NV2 NV1 NODE NODE NODE NV2 NODE 0-19\n", | |
"GPU2 NV2 NV2 X NODE NV1 NODE NODE NV1 NODE 0-19\n", | |
"GPU3 NODE NV1 NODE X NV2 NV1 NV2 NODE NODE 0-19\n", | |
"GPU4 NODE NODE NV1 NV2 X NV2 NV1 NODE NODE 0-19\n", | |
"GPU5 NV2 NODE NODE NV1 NV2 X NV1 NODE NODE 0-19\n", | |
"GPU6 NODE NODE NODE NV2 NV1 NV1 X NV2 NODE 0-19\n", | |
"GPU7 NV1 NV2 NV1 NODE NODE NODE NV2 X NODE 0-19\n", | |
"mlx5_0 NODE NODE NODE NODE NODE NODE NODE NODE X\n", | |
"\n", | |
"Legend:\n", | |
"\n", | |
" X = Self\n", | |
" SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)\n", | |
" NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node\n", | |
" PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)\n", | |
" PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)\n", | |
" PIX = Connection traversing at most a single PCIe bridge\n", | |
" NV# = Connection traversing a bonded set of # NVLinks\n", | |
" ```" | |
] | |
}, | |
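{
"cell_type": "markdown",
"metadata": {},
"source": [
"The matrix above determines which `--server-dev`/`--client-dev` pairs exercise which link in the benchmarks below. The following is a minimal sketch, not part of the original test run, that lists GPU pairs by link type. The `ROWS` table is copied by hand from the GPU columns of the `nvidia-smi topo -m` output above, and the helper name `pairs_with_link` is hypothetical.\n",
"\n",
"```python\n",
"# GPU-to-GPU part of the `nvidia-smi topo -m` matrix above, one row per GPU0..GPU7.\n",
"ROWS = [\n",
"    'X    NV1  NV2  NODE NODE NV2  NODE NV1',\n",
"    'NV1  X    NV2  NV1  NODE NODE NODE NV2',\n",
"    'NV2  NV2  X    NODE NV1  NODE NODE NV1',\n",
"    'NODE NV1  NODE X    NV2  NV1  NV2  NODE',\n",
"    'NODE NODE NV1  NV2  X    NV2  NV1  NODE',\n",
"    'NV2  NODE NODE NV1  NV2  X    NV1  NODE',\n",
"    'NODE NODE NODE NV2  NV1  NV1  X    NV2',\n",
"    'NV1  NV2  NV1  NODE NODE NODE NV2  X',\n",
"]\n",
"MATRIX = [row.split() for row in ROWS]\n",
"\n",
"\n",
"def pairs_with_link(link):\n",
"    # Return (i, j) GPU index pairs with i < j whose connection type equals `link`.\n",
"    n = len(MATRIX)\n",
"    return [(i, j) for i in range(n) for j in range(i + 1, n) if MATRIX[i][j] == link]\n",
"\n",
"\n",
"for link in ('NV1', 'NV2', 'NODE'):\n",
"    print(link, pairs_with_link(link))\n",
"```\n",
"\n",
"On this machine GPUs 0 and 1 share an NV1 link and GPUs 1 and 2 share an NV2 link, which is why the NVLink benchmarks below use those device pairs."
]
},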
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Install UCX-Py and tools" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"```\n", | |
"$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh\n", | |
"$ bash Miniconda3-latest-Linux-x86_64.sh\n", | |
"```\n", | |
"\n", | |
"Accept the default and allow `conda init` to run. Then start a new shell.\n", | |
"\n", | |
"Create a conda environment ([see ucx-py docs](https://ucx-py.readthedocs.io/en/latest/install.html#conda))\n", | |
"\n", | |
"```\n", | |
"$ conda create -n ucxpy -c conda-forge -c rapidsai python=3.7 ipython ucx-proc=*=gpu ucx ucx-py dask distributed numpy cupy -y\n", | |
"...\n", | |
"$ conda activate ucxpy\n", | |
"```\n", | |
"\n", | |
"Clone ucx-py repo locally\n", | |
"\n", | |
"```\n", | |
"git clone https://github.com/rapidsai/ucx-py.git\n", | |
"cd ucx-py/benchmarks\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Run benchmarks\n", | |
"\n", | |
"[GitHub Issue on Benchmarks](https://github.com/rapidsai/ucx-py/issues/311)\n", | |
"\n", | |
"## Single Node\n", | |
"\n", | |
"### TCP\n", | |
"\n", | |
"```\n", | |
"$ UCX_MEMTYPE_CACHE=n UCX_SOCKADDR_TLS_PRIORITY=sockcm UCX_TLS=tcp,cuda_copy,sockcm python local-send-recv.py -o cupy -n \"100MB\" --server-dev 1 --client-dev 2 --reuse-alloc\n", | |
"[1581000782.671561] [rapids-ucxpy-1:6714 :0] parser.c:1578 UCX WARN unused env variable: UCX_CUDA_IPC_CACHE (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning)\n", | |
"[1581000785.889780] [rapids-ucxpy-1:6783 :0] parser.c:1578 UCX WARN unused env variable: UCX_CUDA_IPC_CACHE (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning)\n", | |
"[1581000793.139156] [rapids-ucxpy-1:6783 :0] mpool.c:43 UCX WARN object 0x56296beea780 was not returned to mpool ucp_am_bufs\n", | |
"Roundtrip benchmark\n", | |
"--------------------------\n", | |
"n_iter | 10\n", | |
"n_bytes | 100.00 MB\n", | |
"object | cupy\n", | |
"reuse alloc | True\n", | |
"==========================\n", | |
"Device(s) | 1, 2\n", | |
"Average | 395.07 MB/s\n", | |
"--------------------------\n", | |
"Iterations\n", | |
"--------------------------\n", | |
"000 |403.10 MB/s\n", | |
"001 |387.50 MB/s\n", | |
"002 |389.13 MB/s\n", | |
"003 |388.82 MB/s\n", | |
"004 |385.98 MB/s\n", | |
"005 |410.38 MB/s\n", | |
"006 |423.28 MB/s\n", | |
"007 |387.83 MB/s\n", | |
"008 |393.22 MB/s\n", | |
"009 |385.06 MB/s\n", | |
"```\n", | |
"\n", | |
"### IB\n", | |
"\n", | |
"```\n", | |
"$ UCX_MEMTYPE_CACHE=n UCX_SOCKADDR_TLS_PRIORITY=sockcm UCX_TLS=tcp,cuda_copy,sockcm,rc python local-send-recv.py -o cupy -n \"100MB\" --server-dev 1 --client-dev 2 --reuse-alloc [36/1091]\n", | |
"[1581000806.378520] [rapids-ucxpy-1:6893 :0] parser.c:1578 UCX WARN unused env variable: UCX_CUDA_IPC_CACHE (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning)\n", | |
"[1581000806.852800] [rapids-ucxpy-1:6960 :0] parser.c:1578 UCX WARN unused env variable: UCX_CUDA_IPC_CACHE (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning)\n", | |
"Roundtrip benchmark\n", | |
"--------------------------\n", | |
"n_iter | 10\n", | |
"n_bytes | 100.00 MB\n", | |
"object | cupy\n", | |
"reuse alloc | True\n", | |
"==========================\n", | |
"Device(s) | 1, 2\n", | |
"Average | 2.12 GB/s\n", | |
"--------------------------\n", | |
"Iterations\n", | |
"--------------------------\n", | |
"000 | 1.58 GB/s\n", | |
"001 | 2.20 GB/s\n", | |
"002 | 2.21 GB/s\n", | |
"003 | 2.21 GB/s\n", | |
"004 | 2.21 GB/s\n", | |
"005 | 2.21 GB/s\n", | |
"006 | 2.21 GB/s\n", | |
"007 | 2.21 GB/s\n", | |
"008 | 2.21 GB/s\n", | |
"009 | 2.21 GB/s\n", | |
"```\n", | |
"\n", | |
"### NVLINK\n", | |
"\n", | |
"#### NV1\n", | |
"\n", | |
"```\n", | |
"$ UCX_MEMTYPE_CACHE=n UCX_SOCKADDR_TLS_PRIORITY=sockcm UCX_TLS=tcp,cuda_copy,sockcm,cuda_ipc python local-send-recv.py -o cupy -n \"100MB\" --server-dev 1 --client-dev 0 --reuse-alloc\n", | |
"[1581000977.744833] [rapids-ucxpy-1:7444 :0] sock.c:224 UCX ERROR recv(fd=53) failed: Connection reset by peer\n", | |
"\n", | |
"Roundtrip benchmark\n", | |
"--------------------------\n", | |
"n_iter | 10\n", | |
"n_bytes | 100.00 MB\n", | |
"object | cupy\n", | |
"reuse alloc | True\n", | |
"==========================\n", | |
"Device(s) | 1, 0\n", | |
"Average | 17.08 GB/s\n", | |
"--------------------------\n", | |
"Iterations\n", | |
"--------------------------\n", | |
"000 | 16.54 GB/s\n", | |
"001 | 14.28 GB/s\n", | |
"002 | 17.58 GB/s\n", | |
"003 | 17.52 GB/s\n", | |
"004 | 17.51 GB/s\n", | |
"005 | 17.66 GB/s\n", | |
"006 | 17.72 GB/s\n", | |
"007 | 17.67 GB/s\n", | |
"008 | 17.31 GB/s\n", | |
"009 | 17.66 GB/s\n", | |
"```\n", | |
"\n", | |
"#### NV2\n", | |
"\n", | |
"```\n", | |
"$ UCX_MEMTYPE_CACHE=n UCX_SOCKADDR_TLS_PRIORITY=sockcm UCX_TLS=tcp,cuda_copy,sockcm,cuda_ipc python local-send-recv.py -o cupy -n \"100MB\" --server-dev 1 --client-dev 2 --reuse-alloc\n", | |
"[1581000819.685852] [rapids-ucxpy-1:7127 :0] mpool.c:43 UCX WARN object 0x55d8125fe500 was not returned to mpool ucp_am_bufs\n", | |
"Roundtrip benchmark\n", | |
"--------------------------\n", | |
"n_iter | 10\n", | |
"n_bytes | 100.00 MB\n", | |
"object | cupy\n", | |
"reuse alloc | True\n", | |
"==========================\n", | |
"Device(s) | 1, 2\n", | |
"Average | 26.12 GB/s\n", | |
"--------------------------\n", | |
"Iterations\n", | |
"--------------------------\n", | |
"000 | 19.73 GB/s\n", | |
"001 | 20.11 GB/s\n", | |
"002 | 29.31 GB/s\n", | |
"003 | 28.27 GB/s\n", | |
"004 | 28.40 GB/s\n", | |
"005 | 28.23 GB/s\n", | |
"006 | 27.69 GB/s\n", | |
"007 | 28.45 GB/s\n", | |
"008 | 28.59 GB/s\n", | |
"009 | 27.70 GB/s\n", | |
"```" | |
] | |
}, | |
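{
"cell_type": "markdown",
"metadata": {},
"source": [
"The reported `Average` in the runs above is not the arithmetic mean of the per-iteration figures. The printed numbers are consistent with a harmonic mean, which is what total bytes divided by total time gives when every iteration moves the same amount of data. This is an inference from the output, not something taken from the benchmark source. A minimal check using the IB and NV2 iterations:\n",
"\n",
"```python\n",
"from statistics import harmonic_mean, mean\n",
"\n",
"# Per-iteration bandwidths (GB/s) copied from the IB and NV2 runs above.\n",
"ib = [1.58, 2.20, 2.21, 2.21, 2.21, 2.21, 2.21, 2.21, 2.21, 2.21]\n",
"nv2 = [19.73, 20.11, 29.31, 28.27, 28.40, 28.23, 27.69, 28.45, 28.59, 27.70]\n",
"\n",
"for name, rates in (('IB', ib), ('NV2', nv2)):\n",
"    arithmetic = round(mean(rates), 2)\n",
"    harmonic = round(harmonic_mean(rates), 2)\n",
"    print(f'{name}: arithmetic {arithmetic} GB/s, harmonic {harmonic} GB/s')\n",
"```\n",
"\n",
"The harmonic values reproduce the reported averages of 2.12 GB/s and 26.12 GB/s, so a slow first iteration pulls the average down more than an arithmetic mean would suggest."
]
},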
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Multi node\n", | |
"\n", | |
"Make note of the ib addresses with `ip addr show dev ib0`.\n", | |
"\n", | |
"### Server (node 1)\n", | |
"\n", | |
"```\n", | |
"UCX_MEMTYPE_CACHE=n UCX_SOCKADDR_TLS_PRIORITY=sockcm UCX_TLS=tcp,cuda_copy,sockcm,rc python recv-into-client.py -r recv_into -o cupy --n-bytes 1000Mb -p 13337 --n-iter 100\n", | |
"```\n", | |
"\n", | |
"### Client (node 2)\n", | |
"\n", | |
"```\n", | |
"UCX_MEMTYPE_CACHE=n UCX_SOCKADDR_TLS_PRIORITY=sockcm UCX_TLS=tcp,cuda_copy,sockcm,rc python recv-into-client.py -r recv_into -o cupy --n-bytes 1000Mb -p 13337 -s 172.16.1.55 --n-iter 100\n", | |
"```\n", | |
"\n", | |
"### Results\n", | |
"\n", | |
"```\n", | |
"[1581000858.688259] [rapids-ucxpy-1:7223 :0] parser.c:1578 UCX WARN unused env variable: UCX_CUDA_IPC_CACHE (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning)\n", | |
"CUDA RUNTIME DEVICE: 0\n", | |
"Roundtrip benchmark\n", | |
"-------------------\n", | |
"n_iter | 100\n", | |
"n_bytes | 1000.00 MB\n", | |
"recv | recv_into\n", | |
"object | cupy\n", | |
"inc | False\n", | |
"\n", | |
"===================\n", | |
"2.74 GB / s\n", | |
"===================\n", | |
"[1581000944.094646] [rapids-ucxpy-1:7223 :0] rc_ep.c:321 UCX WARN destroying rc ep 0x561f4ab84b88 with uncompleted operation 0x561f52681140\n", | |
"[1581000944.099832] [rapids-ucxpy-1:7223 :0] mpool.c:43 UCX WARN object 0x561f4abe13c0 was not returned to mpool ucp_requests\n", | |
"[1581000944.099844] [rapids-ucxpy-1:7223 :0] callbackq.c:447 UCX WARN 0 fast-path and 1 slow-path callbacks remain in the queue\n", | |
"```\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## ucx_perftest" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"```\n", | |
"$ UCX_MEMTYPE_CACHE=n UCX_SOCKADDR_TLS_PRIORITY=sockcm UCX_TLS=tcp,cuda_copy,sockcm,rc ucx_perftest localhost -t tag_bw -m cuda -n 100 -s 10000000\n", | |
"[1581000133.545295] [rapids-ucxpy-2:14894:0] perftest.c:1376 UCX WARN CPU affinity is not set (bound to 40 cpus). Performance may be impacted.\n", | |
"+--------------+-----------------------------+---------------------+-----------------------+\n", | |
"| | latency (usec) | bandwidth (MB/s) | message rate (msg/s) |\n", | |
"+--------------+---------+---------+---------+----------+----------+-----------+-----------+\n", | |
"| # iterations | typical | average | overall | average | overall | average | overall |\n", | |
"+--------------+---------+---------+---------+----------+----------+-----------+-----------+\n", | |
" 100 0.000 1487.861 1487.861 6409.70 6409.70 672 672\n", | |
"```" | |
] | |
}, | |
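{
"cell_type": "markdown",
"metadata": {},
"source": [
"The columns of the `ucx_perftest` table above are related by simple arithmetic. A small sketch, assuming the bandwidth column is in units of 2**20 bytes per second (an inference from the numbers, not something stated in the output):\n",
"\n",
"```python\n",
"# Values copied from the ucx_perftest run above.\n",
"message_size = 10_000_000   # bytes, from `-s 10000000`\n",
"avg_latency_us = 1487.861   # average latency per message, in microseconds\n",
"\n",
"msg_rate = 1e6 / avg_latency_us                  # messages per second\n",
"bandwidth_mib = msg_rate * message_size / 2**20  # MiB/s (1 MiB = 2**20 bytes)\n",
"bandwidth_gb = msg_rate * message_size / 1e9     # GB/s (1 GB = 1e9 bytes)\n",
"\n",
"print(f'{msg_rate:.0f} msg/s, {bandwidth_mib:.2f} MiB/s, {bandwidth_gb:.2f} GB/s')\n",
"```\n",
"\n",
"Under that assumption the latency reproduces both the 672 msg/s and the 6409.70 figure in the table, which corresponds to roughly 6.72 GB/s in decimal units."
]
},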
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.7.4" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 4 | |
} |