@cswinter
Created April 20, 2019 00:40
root@managed-worker-l83z:/nccl-tests# mpirun --allow-run-as-root -H 10.73.0.52:1,10.73.0.15:1 -np 2 -mca btl_tcp_if_include ens12 -x LD_LIBRARY_PATH -x NCCL_SOCKET_IFNAME=ens12 -x NCCL_MIN_NRINGS=1 -x NCCL_MAX_NRINGS=1 -x NCCL_DEBUG=TRACE /nccl-tests/build/all_reduce_perf -b 1G -e 1G -f 2 -g 1 -c 0
nThread 1 nGpus 1 minBytes 1073741824 maxBytes 1073741824 step: 2(factor) warmup iters: 5 iters: 20 validation: 0
managed-worker-l83z:16157:16157 [0] NCCL INFO NET : Using interface ens12:10.73.0.52<0>
managed-worker-l83z:16157:16157 [0] NCCL INFO NET/IB : Using interface ens12 for sideband communication
managed-worker-l83z:16157:16157 [0] NCCL INFO Using internal Network Socket
managed-worker-l83z:16157:16157 [0] NCCL INFO NET : Using interface ens12:10.73.0.52<0>
managed-worker-l83z:16157:16157 [0] NCCL INFO NET/Socket : 1 interfaces found
NCCL version 2.3.7+cuda10.0
managed-worker-l83z:16157:16157 [0] NCCL INFO rank 0 nranks 2
managed-worker-l83z:16157:16163 [0] NCCL INFO comm 0x7f11600566a0 rank 0 nranks 2
managed-worker-jbk7:16322:16322 [0] NCCL INFO NET : Using interface ens12:10.73.0.15<0>
managed-worker-jbk7:16322:16322 [0] NCCL INFO NET/IB : Using interface ens12 for sideband communication
managed-worker-jbk7:16322:16322 [0] NCCL INFO Using internal Network Socket
managed-worker-jbk7:16322:16322 [0] NCCL INFO rank 1 nranks 2
managed-worker-jbk7:16322:16327 [0] NCCL INFO comm 0x7efb100566a0 rank 1 nranks 2
managed-worker-jbk7:16322:16327 [0] NCCL INFO NET : Using interface ens12:10.73.0.15<0>
managed-worker-jbk7:16322:16327 [0] NCCL INFO NET/Socket : 1 interfaces found
managed-worker-l83z:16157:16163 [0] NCCL INFO CUDA Dev 0, IP Interfaces : ens12(PHB)
managed-worker-jbk7:16322:16327 [0] NCCL INFO CUDA Dev 0, IP Interfaces : ens12(PHB)
managed-worker-l83z:16157:16163 [0] NCCL INFO NCCL_MAX_NRINGS set by environment to 1.
managed-worker-l83z:16157:16163 [0] NCCL INFO NCCL_MIN_NRINGS set by environment to 1.
managed-worker-l83z:16157:16163 [0] NCCL INFO Limiting to 1 rings per user request.
managed-worker-l83z:16157:16163 [0] NCCL INFO Using 256 threads
managed-worker-l83z:16157:16163 [0] NCCL INFO Min Comp Cap 7
managed-worker-jbk7:16322:16327 [0] NCCL INFO NCCL_MAX_NRINGS set by environment to 1.
managed-worker-jbk7:16322:16327 [0] NCCL INFO NCCL_MIN_NRINGS set by environment to 1.
managed-worker-l83z:16157:16163 [0] NCCL INFO Ring 00 : 0 1
managed-worker-jbk7:16322:16327 [0] NCCL INFO Ring 00 : 0 -> 1 via NET/Socket/0
managed-worker-l83z:16157:16163 [0] NCCL INFO Ring 00 : 1 -> 0 via NET/Socket/0
managed-worker-l83z:16157:16163 [0] NCCL INFO comm 0x7f11600566a0 rank 0 nranks 2 - COMPLETE
# NCCL Tests compiled with NCCL 2.4
# Using devices
# Rank 0 on managed-worker-l83z device 0 [0x00] Tesla V100-SXM2-16GB
managed-worker-jbk7:16322:16327 [0] NCCL INFO comm 0x7efb100566a0 rank 1 nranks 2 - COMPLETE
# out-of-place in-place
# bytes N type op time algbw busbw res time algbw busbw res
# Rank 1 on managed-worker-jbk7 device 0 [0x00] Tesla V100-SXM2-16GB
managed-worker-l83z:16157:16157 [0] NCCL INFO Launch mode Parallel
1073741824 268435456 float sum 547.908 1.96 1.96 N/A 520.709 2.06 2.06 N/A
Out of bounds values : 0 OK
Avg bus bandwidth : 2.0109
root@managed-worker-l83z:/nccl-tests# mpirun --allow-run-as-root -H 10.73.0.52:1,10.73.0.15:1 -np 2 -mca btl_tcp_if_include ens12 -x LD_LIBRARY_PATH -x NCCL_SOCKET_IFNAME=ens12 -x NCCL_MIN_NRINGS=2 -x NCCL_MAX_NRINGS=2 -x NCCL_DEBUG=TRACE /nccl-tests/build/all_reduce_perf -b 1G -e 1G -f 2 -g 1 -c 0
nThread 1 nGpus 1 minBytes 1073741824 maxBytes 1073741824 step: 2(factor) warmup iters: 5 iters: 20 validation: 0
managed-worker-l83z:16172:16172 [0] NCCL INFO NET : Using interface ens12:10.73.0.52<0>
managed-worker-l83z:16172:16172 [0] NCCL INFO NET/IB : Using interface ens12 for sideband communication
managed-worker-l83z:16172:16172 [0] NCCL INFO Using internal Network Socket
managed-worker-l83z:16172:16172 [0] NCCL INFO NET : Using interface ens12:10.73.0.52<0>
managed-worker-l83z:16172:16172 [0] NCCL INFO NET/Socket : 1 interfaces found
NCCL version 2.3.7+cuda10.0
managed-worker-l83z:16172:16172 [0] NCCL INFO rank 0 nranks 2
managed-worker-l83z:16172:16178 [0] NCCL INFO comm 0x7fb5400566a0 rank 0 nranks 2
managed-worker-jbk7:16350:16350 [0] NCCL INFO NET : Using interface ens12:10.73.0.15<0>
managed-worker-jbk7:16350:16350 [0] NCCL INFO NET/IB : Using interface ens12 for sideband communication
managed-worker-jbk7:16350:16350 [0] NCCL INFO Using internal Network Socket
managed-worker-jbk7:16350:16350 [0] NCCL INFO rank 1 nranks 2
managed-worker-jbk7:16350:16355 [0] NCCL INFO comm 0x7f65d40566a0 rank 1 nranks 2
managed-worker-jbk7:16350:16355 [0] NCCL INFO NET : Using interface ens12:10.73.0.15<0>
managed-worker-jbk7:16350:16355 [0] NCCL INFO NET/Socket : 1 interfaces found
managed-worker-jbk7:16350:16355 [0] NCCL INFO CUDA Dev 0, IP Interfaces : ens12(PHB)
managed-worker-l83z:16172:16178 [0] NCCL INFO CUDA Dev 0, IP Interfaces : ens12(PHB)
managed-worker-l83z:16172:16178 [0] NCCL INFO NCCL_MAX_NRINGS set by environment to 2.
managed-worker-l83z:16172:16178 [0] NCCL INFO NCCL_MIN_NRINGS set by environment to 2.
managed-worker-l83z:16172:16178 [0] NCCL INFO Duplicating rings to 2 per user request.
managed-worker-l83z:16172:16178 [0] NCCL INFO Using 256 threads
managed-worker-l83z:16172:16178 [0] NCCL INFO Min Comp Cap 7
managed-worker-jbk7:16350:16355 [0] NCCL INFO NCCL_MAX_NRINGS set by environment to 2.
managed-worker-jbk7:16350:16355 [0] NCCL INFO NCCL_MIN_NRINGS set by environment to 2.
managed-worker-l83z:16172:16178 [0] NCCL INFO Ring 00 : 0 1
managed-worker-l83z:16172:16178 [0] NCCL INFO Ring 01 : 0 1
managed-worker-l83z:16172:16178 [0] NCCL INFO Ring 00 : 1 -> 0 via NET/Socket/0
managed-worker-jbk7:16350:16355 [0] NCCL INFO Ring 00 : 0 -> 1 via NET/Socket/0
managed-worker-jbk7:16350:16355 [0] NCCL INFO Ring 01 : 0 -> 1 via NET/Socket/0
managed-worker-l83z:16172:16178 [0] NCCL INFO Ring 01 : 1 -> 0 via NET/Socket/0
managed-worker-l83z:16172:16178 [0] NCCL INFO comm 0x7fb5400566a0 rank 0 nranks 2 - COMPLETE
# NCCL Tests compiled with NCCL 2.4
# Using devices
# Rank 0 on managed-worker-l83z device 0 [0x00] Tesla V100-SXM2-16GB
managed-worker-jbk7:16350:16355 [0] NCCL INFO comm 0x7f65d40566a0 rank 1 nranks 2 - COMPLETE
# out-of-place in-place
# bytes N type op time algbw busbw res time algbw busbw res
# Rank 1 on managed-worker-jbk7 device 0 [0x00] Tesla V100-SXM2-16GB
managed-worker-l83z:16172:16172 [0] NCCL INFO Launch mode Parallel
1073741824 268435456 float sum 393.468 2.73 2.73 N/A 387.913 2.77 2.77 N/A
Out of bounds values : 0 OK
Avg bus bandwidth : 2.74846
root@managed-worker-l83z:/nccl-tests# mpirun --allow-run-as-root -H 10.73.0.52:1,10.73.0.15:1 -np 2 -mca btl_tcp_if_include ens12 -x LD_LIBRARY_PATH -x NCCL_SOCKET_IFNAME=ens12 -x NCCL_MIN_NRINGS=4 -x NCCL_MAX_NRINGS=4 -x NCCL_DEBUG=TRACE /nccl-tests/build/all_reduce_perf -b 1G -e 1G -f 2 -g 1 -c 0
nThread 1 nGpus 1 minBytes 1073741824 maxBytes 1073741824 step: 2(factor) warmup iters: 5 iters: 20 validation: 0
managed-worker-l83z:16189:16189 [0] NCCL INFO NET : Using interface ens12:10.73.0.52<0>
managed-worker-l83z:16189:16189 [0] NCCL INFO NET/IB : Using interface ens12 for sideband communication
managed-worker-l83z:16189:16189 [0] NCCL INFO Using internal Network Socket
managed-worker-l83z:16189:16189 [0] NCCL INFO NET : Using interface ens12:10.73.0.52<0>
managed-worker-l83z:16189:16189 [0] NCCL INFO NET/Socket : 1 interfaces found
NCCL version 2.3.7+cuda10.0
managed-worker-l83z:16189:16189 [0] NCCL INFO rank 0 nranks 2
managed-worker-l83z:16189:16195 [0] NCCL INFO comm 0x7f06fc0566a0 rank 0 nranks 2
managed-worker-jbk7:16380:16380 [0] NCCL INFO NET : Using interface ens12:10.73.0.15<0>
managed-worker-jbk7:16380:16380 [0] NCCL INFO NET/IB : Using interface ens12 for sideband communication
managed-worker-jbk7:16380:16380 [0] NCCL INFO Using internal Network Socket
managed-worker-jbk7:16380:16380 [0] NCCL INFO rank 1 nranks 2
managed-worker-jbk7:16380:16385 [0] NCCL INFO comm 0x7f3eb00566a0 rank 1 nranks 2
managed-worker-jbk7:16380:16385 [0] NCCL INFO NET : Using interface ens12:10.73.0.15<0>
managed-worker-jbk7:16380:16385 [0] NCCL INFO NET/Socket : 1 interfaces found
managed-worker-l83z:16189:16195 [0] NCCL INFO CUDA Dev 0, IP Interfaces : ens12(PHB)
managed-worker-l83z:16189:16195 [0] NCCL INFO NCCL_MAX_NRINGS set by environment to 4.
managed-worker-l83z:16189:16195 [0] NCCL INFO NCCL_MIN_NRINGS set by environment to 4.
managed-worker-l83z:16189:16195 [0] NCCL INFO Duplicating rings to 4 per user request.
managed-worker-jbk7:16380:16385 [0] NCCL INFO CUDA Dev 0, IP Interfaces : ens12(PHB)
managed-worker-l83z:16189:16195 [0] NCCL INFO Using 256 threads
managed-worker-jbk7:16380:16385 [0] NCCL INFO NCCL_MAX_NRINGS set by environment to 4.
managed-worker-jbk7:16380:16385 [0] NCCL INFO NCCL_MIN_NRINGS set by environment to 4.
managed-worker-l83z:16189:16195 [0] NCCL INFO Min Comp Cap 7
managed-worker-l83z:16189:16195 [0] NCCL INFO Ring 00 : 0 1
managed-worker-l83z:16189:16195 [0] NCCL INFO Ring 01 : 0 1
managed-worker-l83z:16189:16195 [0] NCCL INFO Ring 02 : 0 1
managed-worker-l83z:16189:16195 [0] NCCL INFO Ring 03 : 0 1
managed-worker-l83z:16189:16195 [0] NCCL INFO Ring 00 : 1 -> 0 via NET/Socket/0
managed-worker-jbk7:16380:16385 [0] NCCL INFO Ring 00 : 0 -> 1 via NET/Socket/0
managed-worker-l83z:16189:16195 [0] NCCL INFO Ring 01 : 1 -> 0 via NET/Socket/0
managed-worker-jbk7:16380:16385 [0] NCCL INFO Ring 01 : 0 -> 1 via NET/Socket/0
managed-worker-l83z:16189:16195 [0] NCCL INFO Ring 02 : 1 -> 0 via NET/Socket/0
managed-worker-jbk7:16380:16385 [0] NCCL INFO Ring 02 : 0 -> 1 via NET/Socket/0
managed-worker-l83z:16189:16195 [0] NCCL INFO Ring 03 : 1 -> 0 via NET/Socket/0
managed-worker-jbk7:16380:16385 [0] NCCL INFO Ring 03 : 0 -> 1 via NET/Socket/0
managed-worker-l83z:16189:16195 [0] NCCL INFO comm 0x7f06fc0566a0 rank 0 nranks 2 - COMPLETE
# NCCL Tests compiled with NCCL 2.4
# Using devices
# Rank 0 on managed-worker-l83z device 0 [0x00] Tesla V100-SXM2-16GB
managed-worker-jbk7:16380:16385 [0] NCCL INFO comm 0x7f3eb00566a0 rank 1 nranks 2 - COMPLETE
# out-of-place in-place
# bytes N type op time algbw busbw res time algbw busbw res
# Rank 1 on managed-worker-jbk7 device 0 [0x00] Tesla V100-SXM2-16GB
managed-worker-l83z:16189:16189 [0] NCCL INFO Launch mode Parallel
1073741824 268435456 float sum 331.588 3.24 3.24 N/A 327.730 3.28 3.28 N/A
Out of bounds values : 0 OK
Avg bus bandwidth : 3.25724
root@managed-worker-l83z:/nccl-tests# mpirun --allow-run-as-root -H 10.73.0.52:1,10.73.0.15:1 -np 2 -mca btl_tcp_if_include ens12 -x LD_LIBRARY_PATH -x NCCL_SOCKET_IFNAME=ens12 -x NCCL_MIN_NRINGS=8 -x NCCL_MAX_NRINGS=8 -x NCCL_DEBUG=TRACE /nccl-tests/build/all_reduce_perf -b 1G -e 1G -f 2 -g 1 -c 0
nThread 1 nGpus 1 minBytes 1073741824 maxBytes 1073741824 step: 2(factor) warmup iters: 5 iters: 20 validation: 0
managed-worker-l83z:16210:16210 [0] NCCL INFO NET : Using interface ens12:10.73.0.52<0>
managed-worker-l83z:16210:16210 [0] NCCL INFO NET/IB : Using interface ens12 for sideband communication
managed-worker-l83z:16210:16210 [0] NCCL INFO Using internal Network Socket
managed-worker-l83z:16210:16210 [0] NCCL INFO NET : Using interface ens12:10.73.0.52<0>
managed-worker-l83z:16210:16210 [0] NCCL INFO NET/Socket : 1 interfaces found
NCCL version 2.3.7+cuda10.0
managed-worker-l83z:16210:16210 [0] NCCL INFO rank 0 nranks 2
managed-worker-l83z:16210:16216 [0] NCCL INFO comm 0x7f4a100566a0 rank 0 nranks 2
managed-worker-jbk7:16414:16414 [0] NCCL INFO NET : Using interface ens12:10.73.0.15<0>
managed-worker-jbk7:16414:16414 [0] NCCL INFO NET/IB : Using interface ens12 for sideband communication
managed-worker-jbk7:16414:16414 [0] NCCL INFO Using internal Network Socket
managed-worker-jbk7:16414:16414 [0] NCCL INFO rank 1 nranks 2
managed-worker-jbk7:16414:16419 [0] NCCL INFO comm 0x7f9ab80566a0 rank 1 nranks 2
managed-worker-jbk7:16414:16419 [0] NCCL INFO NET : Using interface ens12:10.73.0.15<0>
managed-worker-jbk7:16414:16419 [0] NCCL INFO NET/Socket : 1 interfaces found
managed-worker-l83z:16210:16216 [0] NCCL INFO CUDA Dev 0, IP Interfaces : ens12(PHB)
managed-worker-jbk7:16414:16419 [0] NCCL INFO CUDA Dev 0, IP Interfaces : ens12(PHB)
managed-worker-l83z:16210:16216 [0] NCCL INFO NCCL_MAX_NRINGS set by environment to 8.
managed-worker-l83z:16210:16216 [0] NCCL INFO NCCL_MIN_NRINGS set by environment to 8.
managed-worker-l83z:16210:16216 [0] NCCL INFO Duplicating rings to 8 per user request.
managed-worker-l83z:16210:16216 [0] NCCL INFO Using 256 threads
managed-worker-l83z:16210:16216 [0] NCCL INFO Min Comp Cap 7
managed-worker-jbk7:16414:16419 [0] NCCL INFO NCCL_MAX_NRINGS set by environment to 8.
managed-worker-jbk7:16414:16419 [0] NCCL INFO NCCL_MIN_NRINGS set by environment to 8.
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 00 : 0 1
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 01 : 0 1
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 02 : 0 1
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 03 : 0 1
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 04 : 0 1
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 05 : 0 1
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 06 : 0 1
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 07 : 0 1
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 00 : 1 -> 0 via NET/Socket/0
managed-worker-jbk7:16414:16419 [0] NCCL INFO Ring 00 : 0 -> 1 via NET/Socket/0
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 01 : 1 -> 0 via NET/Socket/0
managed-worker-jbk7:16414:16419 [0] NCCL INFO Ring 01 : 0 -> 1 via NET/Socket/0
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 02 : 1 -> 0 via NET/Socket/0
managed-worker-jbk7:16414:16419 [0] NCCL INFO Ring 02 : 0 -> 1 via NET/Socket/0
managed-worker-jbk7:16414:16419 [0] NCCL INFO Ring 03 : 0 -> 1 via NET/Socket/0
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 03 : 1 -> 0 via NET/Socket/0
managed-worker-jbk7:16414:16419 [0] NCCL INFO Ring 04 : 0 -> 1 via NET/Socket/0
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 04 : 1 -> 0 via NET/Socket/0
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 05 : 1 -> 0 via NET/Socket/0
managed-worker-jbk7:16414:16419 [0] NCCL INFO Ring 05 : 0 -> 1 via NET/Socket/0
managed-worker-jbk7:16414:16419 [0] NCCL INFO Ring 06 : 0 -> 1 via NET/Socket/0
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 06 : 1 -> 0 via NET/Socket/0
managed-worker-jbk7:16414:16419 [0] NCCL INFO Ring 07 : 0 -> 1 via NET/Socket/0
managed-worker-l83z:16210:16216 [0] NCCL INFO Ring 07 : 1 -> 0 via NET/Socket/0
managed-worker-l83z:16210:16216 [0] NCCL INFO comm 0x7f4a100566a0 rank 0 nranks 2 - COMPLETE
# NCCL Tests compiled with NCCL 2.4
# Using devices
managed-worker-jbk7:16414:16419 [0] NCCL INFO comm 0x7f9ab80566a0 rank 1 nranks 2 - COMPLETE
# Rank 0 on managed-worker-l83z device 0 [0x00] Tesla V100-SXM2-16GB
# out-of-place in-place
# bytes N type op time algbw busbw res time algbw busbw res
# Rank 1 on managed-worker-jbk7 device 0 [0x00] Tesla V100-SXM2-16GB
managed-worker-l83z:16210:16210 [0] NCCL INFO Launch mode Parallel
1073741824 268435456 float sum 286.626 3.75 3.75 N/A 283.991 3.78 3.78 N/A
Out of bounds values : 0 OK
Avg bus bandwidth : 3.76352
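
The bandwidth columns in the four result rows above can be re-derived from the byte count and the time column. The sketch below is not part of the original runs. It assumes the time column is in milliseconds (inferred from the reported values, since 1 GiB in about 548 ms gives about 1.96 GB/s) and uses the all-reduce bus bandwidth factor 2(n-1)/n described in the nccl-tests performance notes. The function name and the `runs` table are illustrative; the numbers are copied from the logs.

```python
def allreduce_bandwidth(size_bytes, time_ms, nranks):
    """Return (algbw, busbw) in GB/s for one all-reduce measurement.

    Assumes time_ms is in milliseconds and GB means 1e9 bytes.
    """
    algbw = size_bytes / (time_ms / 1000.0) / 1e9
    # For all-reduce, bus bandwidth scales algbw by 2(n-1)/n.
    # With nranks == 2 the factor is 1, so busbw equals algbw.
    busbw = algbw * 2 * (nranks - 1) / nranks
    return algbw, busbw


SIZE_BYTES = 1073741824  # -b 1G -e 1G
NRANKS = 2               # -np 2, one GPU per host

# rings -> (out-of-place time, in-place time), copied from the logs above
runs = {
    1: (547.908, 520.709),
    2: (393.468, 387.913),
    4: (331.588, 327.730),
    8: (286.626, 283.991),
}

for nrings, (out_of_place_ms, in_place_ms) in runs.items():
    _, bus_oop = allreduce_bandwidth(SIZE_BYTES, out_of_place_ms, NRANKS)
    _, bus_ip = allreduce_bandwidth(SIZE_BYTES, in_place_ms, NRANKS)
    avg = (bus_oop + bus_ip) / 2
    print(f"rings={nrings} out-of-place={bus_oop:.2f} in-place={bus_ip:.2f} avg={avg:.4f}")
```

With two ranks the factor is 1, which is why algbw and busbw are identical in every row. Going from 1 ring to 8 rings raises the average bus bandwidth from about 2.01 to about 3.76 GB/s over the same socket interface.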