Created
September 24, 2020 21:01
benchmark results ucc vs nccl
% BACKEND=ucc TORCH_UCC_COLL_BACKEND=xccl python bench_ucx.py
rate for 2 = 258.2414863086176 b/s
rate for 4 = 1952.35747642124 b/s
rate for 8 = 124301.30042724375 b/s
rate for 16 = 252311.19377294756 b/s
rate for 32 = 513740.5466085539 b/s
rate for 64 = 993526.2242752169 b/s
rate for 128 = 2058748.3761917958 b/s
rate for 256 = 4092866.0471633147 b/s
rate for 512 = 5094305.640168373 b/s
rate for 1024 = 4976505.564964323 b/s
rate for 2048 = 17873660.44916952 b/s
rate for 4096 = 33127361.691933107 b/s
rate for 8192 = 61588879.825289965 b/s
rate for 16384 = 103268.53229209674 b/s
rate for 32768 = 212261.9263405997 b/s
rate for 65536 = 428639.060356609 b/s
rate for 131072 = 820493.6823829501 b/s
rate for 262144 = 1743540.0736871355 b/s
rate for 524288 = 3306969.334451216 b/s
rate for 1048576 = 6742672.959379208 b/s
rate for 2097152 = 14012786.819533842 b/s
rate for 4194304 = 24884232.04820429 b/s
rate for 8388608 = 51405933.109305225 b/s
rate for 16777216 = 97203524.63267504 b/s
rate for 33554432 = 174005164.0374851 b/s
rate for 67108864 = 291430088.5111464 b/s
rate for 134217728 = 377169425.41109014 b/s
rate for 268435456 = 465558437.4535709 b/s
rate for 536870912 = 482488396.92367446 b/s
rate for 1073741824 = 512633659.56535065 b/s
<crash>
% BACKEND=nccl TORCH_UCC_COLL_BACKEND=xccl python bench_ucx.py
rate for 2 = 3.098399357517365 b/s
rate for 4 = 84246.43094486182 b/s
rate for 8 = 529923.9403445458 b/s
rate for 16 = 1059096.505139863 b/s
rate for 32 = 2307145.3135250374 b/s
rate for 64 = 4718043.063867767 b/s
rate for 128 = 9415400.398842247 b/s
rate for 256 = 19029277.047005884 b/s
rate for 512 = 38034198.5151772 b/s
rate for 1024 = 75960664.45195945 b/s
rate for 2048 = 152168376.8222126 b/s
rate for 4096 = 291947725.5205284 b/s
rate for 8192 = 538786458.3377793 b/s
rate for 16384 = 1067211795.7696589 b/s
rate for 32768 = 2071923686.8846686 b/s
rate for 65536 = 3079460162.6915236 b/s
rate for 131072 = 6864992176.155897 b/s
rate for 262144 = 10549542345.679306 b/s
rate for 524288 = 11646899040.986288 b/s
rate for 1048576 = 19501927534.070766 b/s
rate for 2097152 = 34171766538.211407 b/s
rate for 4194304 = 41720567107.11991 b/s
rate for 8388608 = 53617951554.66589 b/s
rate for 16777216 = 61488779619.388245 b/s
rate for 33554432 = 69105675982.65294 b/s
rate for 67108864 = 69543341456.73645 b/s
rate for 134217728 = 71583869883.00964 b/s
rate for 268435456 = 69797371917.14935 b/s
rate for 536870912 = 71902971192.31456 b/s
rate for 1073741824 = 71646979600.41766 b/s
rate for 2147483648 = 72985546531.51923 b/s
rate for 4294967296 = 72979710930.46205 b/s
<crash>