@froody
Created September 24, 2020 21:01
benchmark results ucc vs nccl
% BACKEND=ucc TORCH_UCC_COLL_BACKEND=xccl python bench_ucx.py
rate for 2 = 258.2414863086176 b/s
rate for 4 = 1952.35747642124 b/s
rate for 8 = 124301.30042724375 b/s
rate for 16 = 252311.19377294756 b/s
rate for 32 = 513740.5466085539 b/s
rate for 64 = 993526.2242752169 b/s
rate for 128 = 2058748.3761917958 b/s
rate for 256 = 4092866.0471633147 b/s
rate for 512 = 5094305.640168373 b/s
rate for 1024 = 4976505.564964323 b/s
rate for 2048 = 17873660.44916952 b/s
rate for 4096 = 33127361.691933107 b/s
rate for 8192 = 61588879.825289965 b/s
rate for 16384 = 103268.53229209674 b/s
rate for 32768 = 212261.9263405997 b/s
rate for 65536 = 428639.060356609 b/s
rate for 131072 = 820493.6823829501 b/s
rate for 262144 = 1743540.0736871355 b/s
rate for 524288 = 3306969.334451216 b/s
rate for 1048576 = 6742672.959379208 b/s
rate for 2097152 = 14012786.819533842 b/s
rate for 4194304 = 24884232.04820429 b/s
rate for 8388608 = 51405933.109305225 b/s
rate for 16777216 = 97203524.63267504 b/s
rate for 33554432 = 174005164.0374851 b/s
rate for 67108864 = 291430088.5111464 b/s
rate for 134217728 = 377169425.41109014 b/s
rate for 268435456 = 465558437.4535709 b/s
rate for 536870912 = 482488396.92367446 b/s
rate for 1073741824 = 512633659.56535065 b/s
<crash>
% BACKEND=nccl TORCH_UCC_COLL_BACKEND=xccl python bench_ucx.py
rate for 2 = 3.098399357517365 b/s
rate for 4 = 84246.43094486182 b/s
rate for 8 = 529923.9403445458 b/s
rate for 16 = 1059096.505139863 b/s
rate for 32 = 2307145.3135250374 b/s
rate for 64 = 4718043.063867767 b/s
rate for 128 = 9415400.398842247 b/s
rate for 256 = 19029277.047005884 b/s
rate for 512 = 38034198.5151772 b/s
rate for 1024 = 75960664.45195945 b/s
rate for 2048 = 152168376.8222126 b/s
rate for 4096 = 291947725.5205284 b/s
rate for 8192 = 538786458.3377793 b/s
rate for 16384 = 1067211795.7696589 b/s
rate for 32768 = 2071923686.8846686 b/s
rate for 65536 = 3079460162.6915236 b/s
rate for 131072 = 6864992176.155897 b/s
rate for 262144 = 10549542345.679306 b/s
rate for 524288 = 11646899040.986288 b/s
rate for 1048576 = 19501927534.070766 b/s
rate for 2097152 = 34171766538.211407 b/s
rate for 4194304 = 41720567107.11991 b/s
rate for 8388608 = 53617951554.66589 b/s
rate for 16777216 = 61488779619.388245 b/s
rate for 33554432 = 69105675982.65294 b/s
rate for 67108864 = 69543341456.73645 b/s
rate for 134217728 = 71583869883.00964 b/s
rate for 268435456 = 69797371917.14935 b/s
rate for 536870912 = 71902971192.31456 b/s
rate for 1073741824 = 71646979600.41766 b/s
rate for 2147483648 = 72985546531.51923 b/s
rate for 4294967296 = 72979710930.46205 b/s
<crash>
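
To compare the two runs size by size, the `rate for N = R b/s` lines can be parsed into dictionaries and divided. The following is a minimal sketch, not part of `bench_ucx.py`: the helper names `parse_rates` and `speedup` are hypothetical, and it assumes each log is available as a string. It makes no assumption about what unit `N` or `b/s` denotes, since the output does not say.

```python
import re

# Matches lines such as "rate for 1024 = 4976505.564964323 b/s".
RATE_LINE = re.compile(r"^rate for (\d+) = ([0-9.eE+-]+) b/s$")


def parse_rates(log_text):
    """Return {size: rate} for every 'rate for N = R b/s' line in the log.

    Other lines (the shell command, '<crash>') are ignored.
    """
    rates = {}
    for line in log_text.splitlines():
        match = RATE_LINE.match(line.strip())
        if match:
            rates[int(match.group(1))] = float(match.group(2))
    return rates


def speedup(numerator, denominator):
    """Return {size: numerator rate / denominator rate} for shared sizes."""
    shared = sorted(numerator.keys() & denominator.keys())
    return {size: numerator[size] / denominator[size] for size in shared}


if __name__ == "__main__":
    # Two lines copied from each run above; paste the full logs in practice.
    ucc_log = """rate for 8 = 124301.30042724375 b/s
rate for 1073741824 = 512633659.56535065 b/s
<crash>"""
    nccl_log = """rate for 8 = 529923.9403445458 b/s
rate for 1073741824 = 71646979600.41766 b/s
<crash>"""
    ratios = speedup(parse_rates(nccl_log), parse_rates(ucc_log))
    for size, ratio in ratios.items():
        print(f"size {size}: nccl/ucc = {ratio:.1f}x")
```

Only sizes present in both logs are compared, so the two largest sizes, which appear only in the nccl run, are skipped.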