root@managed-worker-3jq5:/# mpirun --allow-run-as-root -H 10.73.0.209:1,10.73.0.244:1 -np 2 -mca btl_tcp_if_include ens12 -x NCCL_IB_DISABLE=1 -x LD_LIBRARY_PATH -x NCCL_SOCKET_IFNAME=ens12 -x NCCL_DEBUG=VERSION -x NCCL_NSOCKS_PERTHREAD=1 -x NCCL_SOCKET_NTHREADS=4 /nccl-tests/build/all_reduce_perf -b 1M -e 1G -f 2 -g 1 -c 0
nThread 1 nGpus 1 minBytes 1048576 maxBytes 1073741824 step: 2(factor) warmup iters: 5 iters: 20 validation: 0
NCCL version 2.4.7ms0+cuda10.0
# NCCL Tests compiled with NCCL 2.4
# Using devices
# Rank 0 on managed-worker-3jq5 device 0 [0x00] Tesla V100-SXM2-16GB
# Rank 1 on managed-worker-0khr device 0 [0x00] Tesla V100-SXM2-16GB
#                                          out-of-place                  in-place
#      bytes          N   type   op     time  algbw  busbw  res     time  algbw  busbw  res
     1048576     262144  float  sum    2.059   0.51   0.51  N/A    3.314   0.32   0.32  N/A
     2097152     524288  float  sum    3.047   0.69   0.69  N/A    6.873   0.31   0.31  N/A
     4194304    1048576  float  sum    5.074   0.83   0.83  N/A    5.525   0.76   0.76  N/A
     8388608    2097152  float  sum    7.145   1.17   1.17  N/A    8.716   0.96   0.96  N/A
    16777216    4194304  float  sum   17.616   0.95   0.95  N/A   17.165   0.98   0.98  N/A
    33554432    8388608  float  sum   37.980   0.88   0.88  N/A   35.914   0.93   0.93  N/A
    67108864   16777216  float  sum   68.106   0.99   0.99  N/A   72.205   0.93   0.93  N/A
   134217728   33554432  float  sum  169.402   0.79   0.79  N/A  121.435   1.11   1.11  N/A
   268435456   67108864  float  sum  243.531   1.10   1.10  N/A  252.268   1.06   1.06  N/A
   536870912  134217728  float  sum  497.448   1.08   1.08  N/A  519.911   1.03   1.03  N/A
  1073741824  268435456  float  sum  962.883   1.12   1.12  N/A  970.829   1.11   1.11  N/A
Out of bounds values : 0 OK
Avg bus bandwidth : 0.89094
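
The log does not print units, so the following Python sketch is one way to sanity-check the columns rather than an authoritative description of the tool. It assumes the `time` column is in milliseconds (inferred from the numbers: 1048576 bytes in 2.059 ms is about 0.51 GB/s, matching the printed `algbw`), that bandwidths are in GB/s with 1 GB = 1e9 bytes, and that all-reduce bus bandwidth is `algbw * 2 * (n - 1) / n`, as described in the nccl-tests performance notes. With two ranks that factor is 1, which is consistent with `algbw` and `busbw` being identical in every row. The function and variable names are hypothetical, chosen for this illustration.

```python
# Rows copied from the log: (bytes, out-of-place time, in-place time).
# ASSUMPTION: times are in milliseconds; the log itself prints no units.
ROWS = [
    (1048576, 2.059, 3.314),
    (2097152, 3.047, 6.873),
    (4194304, 5.074, 5.525),
    (8388608, 7.145, 8.716),
    (16777216, 17.616, 17.165),
    (33554432, 37.980, 35.914),
    (67108864, 68.106, 72.205),
    (134217728, 169.402, 121.435),
    (268435456, 243.531, 252.268),
    (536870912, 497.448, 519.911),
    (1073741824, 962.883, 970.829),
]
N_RANKS = 2  # from "-np 2" with one GPU per rank ("-g 1")


def alg_bw_gbps(size_bytes, time_ms):
    """Algorithm bandwidth: payload size divided by elapsed time, in GB/s."""
    return size_bytes / (time_ms * 1e-3) / 1e9


def bus_bw_gbps(size_bytes, time_ms, n_ranks):
    """Bus bandwidth for all-reduce: algbw scaled by 2 * (n - 1) / n."""
    return alg_bw_gbps(size_bytes, time_ms) * 2 * (n_ranks - 1) / n_ranks


def average_bus_bw(rows, n_ranks):
    """Mean bus bandwidth over both out-of-place and in-place measurements."""
    values = []
    for size_bytes, t_out_ms, t_in_ms in rows:
        values.append(bus_bw_gbps(size_bytes, t_out_ms, n_ranks))
        values.append(bus_bw_gbps(size_bytes, t_in_ms, n_ranks))
    return sum(values) / len(values)


if __name__ == "__main__":
    for size_bytes, t_out_ms, t_in_ms in ROWS:
        print(size_bytes,
              round(bus_bw_gbps(size_bytes, t_out_ms, N_RANKS), 2),
              round(bus_bw_gbps(size_bytes, t_in_ms, N_RANKS), 2))
    print("average bus bandwidth:", average_bus_bw(ROWS, N_RANKS))
```

Under these assumptions the recomputed average lands within rounding distance of the reported `Avg bus bandwidth : 0.89094`, which suggests the summary line is a plain mean over all 22 measurements.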