Created
June 27, 2018 10:07
Horovod Debug Output
10.0.0.5,10.0.0.4
docker0   Link encap:Ethernet HWaddr 02:42:cd:55:69:8d
          inet addr:172.17.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
          UP BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

eth0      Link encap:Ethernet HWaddr 00:0d:3a:1b:b6:a2
          inet addr:10.0.0.5 Bcast:10.0.0.255 Mask:255.255.255.0
          inet6 addr: fe80::20d:3aff:fe1b:b6a2/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:36650860 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3183577 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:51453575812 (51.4 GB) TX bytes:40502753493 (40.5 GB)

eth1      Link encap:Ethernet HWaddr 00:15:5d:33:ff:0e
          inet addr:172.16.1.5 Bcast:172.16.255.255 Mask:255.255.0.0
          inet6 addr: fe80::215:5dff:fe33:ff0e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:8 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6178 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:336 (336.0 B) TX bytes:2097208 (2.0 MB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:65536 Metric:1
          RX packets:2860 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2860 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:274654 (274.6 KB) TX bytes:274654 (274.6 KB)
[0] MPI startup(): Intel(R) MPI Library, Version 2017 Update 3 Build 20170405 (id: 17193) | |
[0] MPI startup(): Copyright (C) 2003-2017 Intel Corporation. All rights reserved. | |
[0] MPI startup(): Multi-threaded optimized library | |
[0] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-ib0 | |
[2] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-ib0 | |
[3] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-ib0 | |
[1] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-ib0 | |
[4] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-ib0 | |
[5] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-ib0 | |
[6] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-ib0 | |
[7] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-ib0 | |
[0] MPI startup(): DAPL provider ofa-v2-ib0 | |
[0] MPI startup(): dapl data transfer mode | |
[4] MPI startup(): DAPL provider ofa-v2-ib0 | |
[4] MPI startup(): dapl data transfer mode | |
[1] MPI startup(): DAPL provider ofa-v2-ib0 | |
[3] MPI startup(): DAPL provider ofa-v2-ib0 | |
[1] MPI startup(): dapl data transfer mode | |
[3] MPI startup(): dapl data transfer mode | |
[2] MPI startup(): DAPL provider ofa-v2-ib0 | |
[2] MPI startup(): dapl data transfer mode | |
[6] MPI startup(): DAPL provider ofa-v2-ib0 | |
[5] MPI startup(): DAPL provider ofa-v2-ib0 | |
[6] MPI startup(): dapl data transfer mode | |
[5] MPI startup(): dapl data transfer mode | |
[7] MPI startup(): DAPL provider ofa-v2-ib0 | |
[7] MPI startup(): dapl data transfer mode | |
[0] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000 | |
[0] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000 | |
[1] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000 | |
[1] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000 | |
[2] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000 | |
[2] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000 | |
[3] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000 | |
[3] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000 | |
[4] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000 | |
[4] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000 | |
[5] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000 | |
[5] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000 | |
[6] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000 | |
[6] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000 | |
[7] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000 | |
[7] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000 | |
[0] MPI startup(): Device_reset_idx=4 | |
[0] MPI startup(): Allgather: 3: 0-0 & 0-8 | |
[0] MPI startup(): Allgather: 1: 1-16 & 0-8 | |
[0] MPI startup(): Allgather: 5: 17-41 & 0-8 | |
[0] MPI startup(): Allgather: 1: 42-8192 & 0-8 | |
[0] MPI startup(): Allgather: 5: 8193-123115 & 0-8 | |
[0] MPI startup(): Allgather: 3: 123116-2303247 & 0-8 | |
[0] MPI startup(): Allgather: 5: 0-2147483647 & 0-8 | |
[0] MPI startup(): Allgather: 2: 0-0 & 9-16 | |
[0] MPI startup(): Allgather: 1: 1-256 & 9-16 | |
[0] MPI startup(): Allgather: 5: 257-1024 & 9-16 | |
[0] MPI startup(): Allgather: 1: 1025-4345 & 9-16 | |
[0] MPI startup(): Allgather: 5: 4346-131072 & 9-16 | |
[0] MPI startup(): Allgather: 3: 131073-1058271 & 9-16 | |
[0] MPI startup(): Allgather: 5: 0-2147483647 & 9-16 | |
[0] MPI startup(): Allgather: 1: 0-2 & 17-32 | |
[0] MPI startup(): Allgather: 5: 3-11 & 17-32 | |
[0] MPI startup(): Allgather: 1: 12-64 & 17-32 | |
[0] MPI startup(): Allgather: 5: 65-48283 & 17-32 | |
[0] MPI startup(): Allgather: 3: 48284-740967 & 17-32 | |
[0] MPI startup(): Allgather: 5: 0-2147483647 & 17-32 | |
[0] MPI startup(): Allgather: 2: 0-0 & 33-64 | |
[0] MPI startup(): Allgather: 1: 1-256 & 33-64 | |
[0] MPI startup(): Allgather: 5: 0-2147483647 & 33-64 | |
[0] MPI startup(): Allgather: 1: 0-256 & 65-128 | |
[0] MPI startup(): Allgather: 5: 257-78452 & 65-128 | |
[0] MPI startup(): Allgather: 3: 78453-167908 & 65-128 | |
[0] MPI startup(): Allgather: 5: 167909-335315 & 65-128 | |
[0] MPI startup(): Allgather: 1: 335316-524288 & 65-128 | |
[0] MPI startup(): Allgather: 5: 0-2147483647 & 65-128 | |
[0] MPI startup(): Allgather: 3: 0-0 & 129-2147483647 | |
[0] MPI startup(): Allgather: 1: 1-2 & 129-2147483647 | |
[0] MPI startup(): Allgather: 5: 3-4 & 129-2147483647 | |
[0] MPI startup(): Allgather: 1: 5-256 & 129-2147483647 | |
[0] MPI startup(): Allgather: 5: 257-27716 & 129-2147483647 | |
[0] MPI startup(): Allgather: 3: 0-2147483647 & 129-2147483647 | |
[0] MPI startup(): Allgatherv: 3: 0-0 & 0-8 | |
[0] MPI startup(): Allgatherv: 1: 1-4096 & 0-8 | |
[0] MPI startup(): Allgatherv: 3: 4097-8192 & 0-8 | |
[0] MPI startup(): Allgatherv: 1: 8193-19391 & 0-8 | |
[0] MPI startup(): Allgatherv: 3: 0-2147483647 & 0-8 | |
[0] MPI startup(): Allgatherv: 1: 0-8192 & 9-16 | |
[0] MPI startup(): Allgatherv: 2: 8193-16384 & 9-16 | |
[0] MPI startup(): Allgatherv: 3: 0-2147483647 & 9-16 | |
[0] MPI startup(): Allgatherv: 1: 0-1 & 17-32 | |
[0] MPI startup(): Allgatherv: 2: 2-43 & 17-32 | |
[0] MPI startup(): Allgatherv: 1: 44-10031 & 17-32 | |
[0] MPI startup(): Allgatherv: 2: 10032-21034 & 17-32 | |
[0] MPI startup(): Allgatherv: 3: 0-2147483647 & 17-32 | |
[0] MPI startup(): Allgatherv: 1: 0-8192 & 33-64 | |
[0] MPI startup(): Allgatherv: 2: 8193-26640 & 33-64 | |
[0] MPI startup(): Allgatherv: 3: 0-2147483647 & 33-64 | |
[0] MPI startup(): Allgatherv: 1: 0-128 & 65-128 | |
[0] MPI startup(): Allgatherv: 2: 129-3017 & 65-128 | |
[0] MPI startup(): Allgatherv: 1: 3018-6818 & 65-128 | |
[0] MPI startup(): Allgatherv: 2: 6819-8192 & 65-128 | |
[0] MPI startup(): Allgatherv: 1: 8193-30133 & 65-128 | |
[0] MPI startup(): Allgatherv: 3: 0-2147483647 & 65-128 | |
[0] MPI startup(): Allgatherv: 2: 0-1 & 129-2147483647 | |
[0] MPI startup(): Allgatherv: 1: 2-2 & 129-2147483647 | |
[0] MPI startup(): Allgatherv: 2: 3-4 & 129-2147483647 | |
[0] MPI startup(): Allgatherv: 1: 5-64 & 129-2147483647 | |
[0] MPI startup(): Allgatherv: 2: 65-128 & 129-2147483647 | |
[0] MPI startup(): Allgatherv: 1: 129-8125 & 129-2147483647 | |
[0] MPI startup(): Allgatherv: 3: 0-2147483647 & 129-2147483647 | |
[0] MPI startup(): Allreduce: 4: 0-0 & 0-8 | |
[0] MPI startup(): Allreduce: 11: 1-4 & 0-8 | |
[0] MPI startup(): Allreduce: 10: 5-8 & 0-8 | |
[0] MPI startup(): Allreduce: 11: 9-16 & 0-8 | |
[0] MPI startup(): Allreduce: 12: 17-37 & 0-8 | |
[0] MPI startup(): Allreduce: 10: 38-64 & 0-8 | |
[0] MPI startup(): Allreduce: 11: 65-128 & 0-8 | |
[0] MPI startup(): Allreduce: 10: 129-512 & 0-8 | |
[0] MPI startup(): Allreduce: 12: 513-2048 & 0-8 | |
[0] MPI startup(): Allreduce: 11: 2049-4096 & 0-8 | |
[0] MPI startup(): Allreduce: 10: 4097-8192 & 0-8 | |
[0] MPI startup(): Allreduce: 2: 0-2147483647 & 0-8 | |
[0] MPI startup(): Allreduce: 1: 0-0 & 9-16 | |
[0] MPI startup(): Allreduce: 12: 1-6 & 9-16 | |
[0] MPI startup(): Allreduce: 11: 7-11 & 9-16 | |
[0] MPI startup(): Allreduce: 10: 12-29 & 9-16 | |
[0] MPI startup(): Allreduce: 12: 30-32 & 9-16 | |
[0] MPI startup(): Allreduce: 11: 33-64 & 9-16 | |
[0] MPI startup(): Allreduce: 12: 65-128 & 9-16 | |
[0] MPI startup(): Allreduce: 11: 129-369 & 9-16 | |
[0] MPI startup(): Allreduce: 10: 370-512 & 9-16 | |
[0] MPI startup(): Allreduce: 11: 513-6523 & 9-16 | |
[0] MPI startup(): Allreduce: 10: 6524-8192 & 9-16 | |
[0] MPI startup(): Allreduce: 2: 0-2147483647 & 9-16 | |
[0] MPI startup(): Allreduce: 1: 0-0 & 17-32 | |
[0] MPI startup(): Allreduce: 10: 1-5 & 17-32 | |
[0] MPI startup(): Allreduce: 12: 6-43 & 17-32 | |
[0] MPI startup(): Allreduce: 11: 44-64 & 17-32 | |
[0] MPI startup(): Allreduce: 10: 65-512 & 17-32 | |
[0] MPI startup(): Allreduce: 11: 513-7157 & 17-32 | |
[0] MPI startup(): Allreduce: 12: 7158-8192 & 17-32 | |
[0] MPI startup(): Allreduce: 2: 0-2147483647 & 17-32 | |
[0] MPI startup(): Allreduce: 1: 0-0 & 33-64 | |
[0] MPI startup(): Allreduce: 10: 1-29 & 33-64 | |
[0] MPI startup(): Allreduce: 12: 30-36 & 33-64 | |
[0] MPI startup(): Allreduce: 10: 37-85 & 33-64 | |
[0] MPI startup(): Allreduce: 11: 86-198 & 33-64 | |
[0] MPI startup(): Allreduce: 10: 199-4096 & 33-64 | |
[0] MPI startup(): Allreduce: 11: 4097-8192 & 33-64 | |
[0] MPI startup(): Allreduce: 2: 0-2147483647 & 33-64 | |
[0] MPI startup(): Allreduce: 1: 0-0 & 65-128 | |
[0] MPI startup(): Allreduce: 12: 1-4 & 65-128 | |
[0] MPI startup(): Allreduce: 11: 5-29 & 65-128 | |
[0] MPI startup(): Allreduce: 12: 30-32 & 65-128 | |
[0] MPI startup(): Allreduce: 11: 33-118 & 65-128 | |
[0] MPI startup(): Allreduce: 10: 119-170 & 65-128 | |
[0] MPI startup(): Allreduce: 12: 171-256 & 65-128 | |
[0] MPI startup(): Allreduce: 11: 257-512 & 65-128 | |
[0] MPI startup(): Allreduce: 12: 513-2048 & 65-128 | |
[0] MPI startup(): Allreduce: 11: 2049-6280 & 65-128 | |
[0] MPI startup(): Allreduce: 10: 6281-8192 & 65-128 | |
[0] MPI startup(): Allreduce: 2: 0-2147483647 & 65-128 | |
[0] MPI startup(): Allreduce: 8: 0-0 & 129-2147483647 | |
[0] MPI startup(): Allreduce: 11: 1-5 & 129-2147483647 | |
[0] MPI startup(): Allreduce: 10: 6-14 & 129-2147483647 | |
[0] MPI startup(): Allreduce: 11: 15-32 & 129-2147483647 | |
[0] MPI startup(): Allreduce: 10: 33-128 & 129-2147483647 | |
[0] MPI startup(): Allreduce: 12: 129-512 & 129-2147483647 | |
[0] MPI startup(): Allreduce: 11: 513-2048 & 129-2147483647 | |
[0] MPI startup(): Allreduce: 12: 2049-4230 & 129-2147483647 | |
[0] MPI startup(): Allreduce: 11: 4231-13217 & 129-2147483647 | |
[0] MPI startup(): Allreduce: 2: 0-2147483647 & 129-2147483647 | |
[0] MPI startup(): Alltoall: 2: 0-2147483647 & 0-8 | |
[0] MPI startup(): Alltoall: 1: 0-0 & 9-16 | |
[0] MPI startup(): Alltoall: 2: 1-8 & 9-16 | |
[0] MPI startup(): Alltoall: 1: 9-32 & 9-16 | |
[0] MPI startup(): Alltoall: 2: 33-262144 & 9-16 | |
[0] MPI startup(): Alltoall: 3: 0-2147483647 & 9-16 | |
[0] MPI startup(): Alltoall: 1: 0-256 & 17-32 | |
[0] MPI startup(): Alltoall: 2: 257-131072 & 17-32 | |
[0] MPI startup(): Alltoall: 3: 0-2147483647 & 17-32 | |
[0] MPI startup(): Alltoall: 1: 0-256 & 33-64 | |
[0] MPI startup(): Alltoall: 2: 257-65536 & 33-64 | |
[0] MPI startup(): Alltoall: 3: 65537-131072 & 33-64 | |
[0] MPI startup(): Alltoall: 2: 131073-1048576 & 33-64 | |
[0] MPI startup(): Alltoall: 3: 0-2147483647 & 33-64 | |
[0] MPI startup(): Alltoall: 1: 0-128 & 65-128 | |
[0] MPI startup(): Alltoall: 2: 129-131072 & 65-128 | |
[0] MPI startup(): Alltoall: 4: 131073-262144 & 65-128 | |
[0] MPI startup(): Alltoall: 2: 262145-524288 & 65-128 | |
[0] MPI startup(): Alltoall: 3: 0-2147483647 & 65-128 | |
[0] MPI startup(): Alltoall: 3: 0-0 & 129-2147483647 | |
[0] MPI startup(): Alltoall: 1: 1-256 & 129-2147483647 | |
[0] MPI startup(): Alltoall: 2: 0-2147483647 & 129-2147483647 | |
[0] MPI startup(): Alltoallv: 1: 0-2147483647 & 0-2147483647 | |
[0] MPI startup(): Alltoallw: 0: 0-2147483647 & 0-2147483647 | |
[0] MPI startup(): Barrier: 7: 0-2147483647 & 0-8 | |
[0] MPI startup(): Barrier: 9: 0-2147483647 & 9-16 | |
[0] MPI startup(): Barrier: 7: 0-2147483647 & 17-32 | |
[0] MPI startup(): Barrier: 8: 0-2147483647 & 33-64 | |
[0] MPI startup(): Barrier: 7: 0-2147483647 & 65-2147483647 | |
[0] MPI startup(): Bcast: 3: 0-0 & 0-8 | |
[0] MPI startup(): Bcast: 9: 1-64 & 0-8 | |
[0] MPI startup(): Bcast: 10: 65-256 & 0-8 | |
[0] MPI startup(): Bcast: 9: 257-4096 & 0-8 | |
[0] MPI startup(): Bcast: 10: 4097-16384 & 0-8 | |
[0] MPI startup(): Bcast: 11: 16385-136906 & 0-8 | |
[0] MPI startup(): Bcast: 9: 136907-262144 & 0-8 | |
[0] MPI startup(): Bcast: 6: 262145-524288 & 0-8 | |
[0] MPI startup(): Bcast: 3: 0-2147483647 & 0-8 | |
[0] MPI startup(): Bcast: 3: 0-0 & 9-16 | |
[0] MPI startup(): Bcast: 10: 1-16384 & 9-16 | |
[0] MPI startup(): Bcast: 9: 16385-232630 & 9-16 | |
[0] MPI startup(): Bcast: 2: 232631-1048576 & 9-16 | |
[0] MPI startup(): Bcast: 3: 0-2147483647 & 9-16 | |
[0] MPI startup(): Bcast: 1: 0-0 & 17-32 | |
[0] MPI startup(): Bcast: 10: 1-32768 & 17-32 | |
[0] MPI startup(): Bcast: 2: 0-2147483647 & 17-32 | |
[0] MPI startup(): Bcast: 7: 0-0 & 33-64 | |
[0] MPI startup(): Bcast: 8: 1-1 & 33-64 | |
[0] MPI startup(): Bcast: 10: 2-2 & 33-64 | |
[0] MPI startup(): Bcast: 8: 3-6 & 33-64 | |
[0] MPI startup(): Bcast: 10: 7-128 & 33-64 | |
[0] MPI startup(): Bcast: 9: 129-256 & 33-64 | |
[0] MPI startup(): Bcast: 11: 257-512 & 33-64 | |
[0] MPI startup(): Bcast: 10: 513-1024 & 33-64 | |
[0] MPI startup(): Bcast: 1: 1025-2048 & 33-64 | |
[0] MPI startup(): Bcast: 10: 2049-4096 & 33-64 | |
[0] MPI startup(): Bcast: 11: 4097-8192 & 33-64 | |
[0] MPI startup(): Bcast: 10: 8193-16384 & 33-64 | |
[0] MPI startup(): Bcast: 2: 16385-2842493 & 33-64 | |
[0] MPI startup(): Bcast: 6: 0-2147483647 & 33-64 | |
[0] MPI startup(): Bcast: 7: 0-0 & 65-128 | |
[0] MPI startup(): Bcast: 4: 1-1 & 65-128 | |
[0] MPI startup(): Bcast: 10: 2-3 & 65-128 | |
[0] MPI startup(): Bcast: 8: 4-6 & 65-128 | |
[0] MPI startup(): Bcast: 10: 7-32768 & 65-128 | |
[0] MPI startup(): Bcast: 9: 32769-251843 & 65-128 | |
[0] MPI startup(): Bcast: 11: 251844-427405 & 65-128 | |
[0] MPI startup(): Bcast: 6: 0-2147483647 & 65-128 | |
[0] MPI startup(): Bcast: 1: 0-0 & 129-2147483647 | |
[0] MPI startup(): Bcast: 8: 1-16 & 129-2147483647 | |
[0] MPI startup(): Bcast: 10: 17-32 & 129-2147483647 | |
[0] MPI startup(): Bcast: 8: 33-64 & 129-2147483647 | |
[0] MPI startup(): Bcast: 10: 65-512 & 129-2147483647 | |
[0] MPI startup(): Bcast: 1: 513-1024 & 129-2147483647 | |
[0] MPI startup(): Bcast: 10: 1025-9906 & 129-2147483647 | |
[0] MPI startup(): Bcast: 4: 9907-16384 & 129-2147483647 | |
[0] MPI startup(): Bcast: 2: 16385-1346076 & 129-2147483647 | |
[0] MPI startup(): Bcast: 6: 0-2147483647 & 129-2147483647 | |
[0] MPI startup(): Exscan: 0: 0-2147483647 & 0-2147483647 | |
[0] MPI startup(): Gather: 3: 0-21 & 0-8 | |
[0] MPI startup(): Gather: 1: 22-101 & 0-8 | |
[0] MPI startup(): Gather: 3: 0-2147483647 & 0-8 | |
[0] MPI startup(): Gather: 3: 0-8 & 9-16 | |
[0] MPI startup(): Gather: 2: 9-512 & 9-16 | |
[0] MPI startup(): Gather: 3: 0-2147483647 & 9-16 | |
[0] MPI startup(): Gather: 3: 0-0 & 17-32 | |
[0] MPI startup(): Gather: 2: 1-1024 & 17-32 | |
[0] MPI startup(): Gather: 3: 0-2147483647 & 17-32 | |
[0] MPI startup(): Gather: 3: 0-0 & 33-64 | |
[0] MPI startup(): Gather: 1: 1-4 & 33-64 | |
[0] MPI startup(): Gather: 4: 5-12 & 33-64 | |
[0] MPI startup(): Gather: 1: 13-22 & 33-64 | |
[0] MPI startup(): Gather: 4: 23-51 & 33-64 | |
[0] MPI startup(): Gather: 1: 52-256 & 33-64 | |
[0] MPI startup(): Gather: 2: 257-1024 & 33-64 | |
[0] MPI startup(): Gather: 3: 0-2147483647 & 33-64 | |
[0] MPI startup(): Gather: 1: 0-0 & 65-128 | |
[0] MPI startup(): Gather: 4: 1-8 & 65-128 | |
[0] MPI startup(): Gather: 1: 9-32 & 65-128 | |
[0] MPI startup(): Gather: 4: 33-1118 & 65-128 | |
[0] MPI startup(): Gather: 3: 0-2147483647 & 65-128 | |
[0] MPI startup(): Gather: 3: 0-0 & 129-2147483647 | |
[0] MPI startup(): Gather: 1: 1-512 & 129-2147483647 | |
[0] MPI startup(): Gather: 4: 513-1423 & 129-2147483647 | |
[0] MPI startup(): Gather: 3: 1424-16384 & 129-2147483647 | |
[0] MPI startup(): Gather: 2: 16385-32768 & 129-2147483647 | |
[0] MPI startup(): Gather: 3: 0-2147483647 & 129-2147483647 | |
[0] MPI startup(): Gatherv: 1: 0-2147483647 & 0-32 | |
[0] MPI startup(): Gatherv: 3: 0-2147483647 & 33-2147483647 | |
[0] MPI startup(): Reduce_scatter: 4: 0-0 & 0-8 | |
[0] MPI startup(): Reduce_scatter: 1: 1-30 & 0-8 | |
[0] MPI startup(): Reduce_scatter: 3: 31-262144 & 0-8 | |
[0] MPI startup(): Reduce_scatter: 2: 0-2147483647 & 0-8 | |
[0] MPI startup(): Reduce_scatter: 1: 0-0 & 9-16 | |
[0] MPI startup(): Reduce_scatter: 4: 1-4 & 9-16 | |
[0] MPI startup(): Reduce_scatter: 5: 5-16 & 9-16 | |
[0] MPI startup(): Reduce_scatter: 1: 17-50 & 9-16 | |
[0] MPI startup(): Reduce_scatter: 3: 51-105648 & 9-16 | |
[0] MPI startup(): Reduce_scatter: 3: 105649-137527 & 9-16 | |
[0] MPI startup(): Reduce_scatter: 2: 0-2147483647 & 9-16 | |
[0] MPI startup(): Reduce_scatter: 2: 0-0 & 17-32 | |
[0] MPI startup(): Reduce_scatter: 4: 1-4 & 17-32 | |
[0] MPI startup(): Reduce_scatter: 1: 5-1152 & 17-32 | |
[0] MPI startup(): Reduce_scatter: 3: 1153-189120 & 17-32 | |
[0] MPI startup(): Reduce_scatter: 2: 0-2147483647 & 17-32 | |
[0] MPI startup(): Reduce_scatter: 3: 0-0 & 33-64 | |
[0] MPI startup(): Reduce_scatter: 4: 1-4 & 33-64 | |
[0] MPI startup(): Reduce_scatter: 1: 5-1337 & 33-64 | |
[0] MPI startup(): Reduce_scatter: 3: 1338-80624 & 33-64 | |
[0] MPI startup(): Reduce_scatter: 3: 80625-148357 & 33-64 | |
[0] MPI startup(): Reduce_scatter: 3: 148358-335236 & 33-64 | |
[0] MPI startup(): Reduce_scatter: 2: 0-2147483647 & 33-64 | |
[0] MPI startup(): Reduce_scatter: 1: 0-0 & 65-128 | |
[0] MPI startup(): Reduce_scatter: 4: 1-4 & 65-128 | |
[0] MPI startup(): Reduce_scatter: 1: 5-4096 & 65-128 | |
[0] MPI startup(): Reduce_scatter: 3: 4097-1383065 & 65-128 | |
[0] MPI startup(): Reduce_scatter: 2: 0-2147483647 & 65-128 | |
[0] MPI startup(): Reduce_scatter: 2: 0-0 & 129-2147483647 | |
[0] MPI startup(): Reduce_scatter: 4: 1-147 & 129-2147483647 | |
[0] MPI startup(): Reduce_scatter: 1: 148-19938 & 129-2147483647 | |
[0] MPI startup(): Reduce_scatter: 3: 19939-1272411 & 129-2147483647 | |
[0] MPI startup(): Reduce_scatter: 2: 0-2147483647 & 129-2147483647 | |
[0] MPI startup(): Reduce: 1: 0-0 & 0-8 | |
[0] MPI startup(): Reduce: 9: 1-4 & 0-8 | |
[0] MPI startup(): Reduce: 10: 5-8 & 0-8 | |
[0] MPI startup(): Reduce: 9: 9-256 & 0-8 | |
[0] MPI startup(): Reduce: 10: 257-512 & 0-8 | |
[0] MPI startup(): Reduce: 9: 513-2048 & 0-8 | |
[0] MPI startup(): Reduce: 10: 2049-8192 & 0-8 | |
[0] MPI startup(): Reduce: 11: 8193-16384 & 0-8 | |
[0] MPI startup(): Reduce: 5: 16385-71836 & 0-8 | |
[0] MPI startup(): Reduce: 1: 0-2147483647 & 0-8 | |
[0] MPI startup(): Reduce: 1: 0-0 & 9-16 | |
[0] MPI startup(): Reduce: 8: 1-8 & 9-16 | |
[0] MPI startup(): Reduce: 9: 9-32 & 9-16 | |
[0] MPI startup(): Reduce: 10: 33-512 & 9-16 | |
[0] MPI startup(): Reduce: 8: 513-8192 & 9-16 | |
[0] MPI startup(): Reduce: 5: 8193-100753 & 9-16 | |
[0] MPI startup(): Reduce: 1: 0-2147483647 & 9-16 | |
[0] MPI startup(): Reduce: 1: 0-0 & 17-32 | |
[0] MPI startup(): Reduce: 10: 1-4 & 17-32 | |
[0] MPI startup(): Reduce: 8: 5-8 & 17-32 | |
[0] MPI startup(): Reduce: 10: 9-16 & 17-32 | |
[0] MPI startup(): Reduce: 9: 17-52 & 17-32 | |
[0] MPI startup(): Reduce: 10: 53-65 & 17-32 | |
[0] MPI startup(): Reduce: 11: 66-512 & 17-32 | |
[0] MPI startup(): Reduce: 8: 513-1024 & 17-32 | |
[0] MPI startup(): Reduce: 10: 1025-2048 & 17-32 | |
[0] MPI startup(): Reduce: 9: 2049-4096 & 17-32 | |
[0] MPI startup(): Reduce: 8: 4097-8192 & 17-32 | |
[0] MPI startup(): Reduce: 5: 8193-196408 & 17-32 | |
[0] MPI startup(): Reduce: 1: 0-2147483647 & 17-32 | |
[0] MPI startup(): Reduce: 7: 0-0 & 33-64 | |
[0] MPI startup(): Reduce: 9: 1-13 & 33-64 | |
[0] MPI startup(): Reduce: 8: 14-16 & 33-64 | |
[0] MPI startup(): Reduce: 10: 17-879 & 33-64 | |
[0] MPI startup(): Reduce: 8: 880-2048 & 33-64 | |
[0] MPI startup(): Reduce: 10: 2049-4096 & 33-64 | |
[0] MPI startup(): Reduce: 8: 4097-8192 & 33-64 | |
[0] MPI startup(): Reduce: 5: 8193-403968 & 33-64 | |
[0] MPI startup(): Reduce: 1: 0-2147483647 & 33-64 | |
[0] MPI startup(): Reduce: 3: 0-0 & 65-128 | |
[0] MPI startup(): Reduce: 8: 1-16 & 65-128 | |
[0] MPI startup(): Reduce: 10: 17-46 & 65-128 | |
[0] MPI startup(): Reduce: 11: 47-64 & 65-128 | |
[0] MPI startup(): Reduce: 10: 65-1024 & 65-128 | |
[0] MPI startup(): Reduce: 8: 1025-2218 & 65-128 | |
[0] MPI startup(): Reduce: 11: 2219-5549 & 65-128 | |
[0] MPI startup(): Reduce: 8: 5550-9629 & 65-128 | |
[0] MPI startup(): Reduce: 11: 9630-16384 & 65-128 | |
[0] MPI startup(): Reduce: 6: 16385-32768 & 65-128 | |
[0] MPI startup(): Reduce: 5: 32769-1048576 & 65-128 | |
[0] MPI startup(): Reduce: 1: 0-2147483647 & 65-128 | |
[0] MPI startup(): Reduce: 1: 0-0 & 129-2147483647 | |
[0] MPI startup(): Reduce: 10: 1-7 & 129-2147483647 | |
[0] MPI startup(): Reduce: 4: 8-8 & 129-2147483647 | |
[0] MPI startup(): Reduce: 10: 9-1987 & 129-2147483647 | |
[0] MPI startup(): Reduce: 11: 1988-5916 & 129-2147483647 | |
[0] MPI startup(): Reduce: 10: 5917-9580 & 129-2147483647 | |
[0] MPI startup(): Reduce: 11: 9581-16384 & 129-2147483647 | |
[0] MPI startup(): Reduce: 5: 16385-1048576 & 129-2147483647 | |
[0] MPI startup(): Reduce: 1: 0-2147483647 & 129-2147483647 | |
[0] MPI startup(): Scan: 0: 0-2147483647 & 0-2147483647 | |
[0] MPI startup(): Scatter: 2: 0-0 & 0-8 | |
[0] MPI startup(): Scatter: 1: 1-64 & 0-8 | |
[0] MPI startup(): Scatter: 3: 0-2147483647 & 0-8 | |
[0] MPI startup(): Scatter: 2: 0-150 & 9-16 | |
[0] MPI startup(): Scatter: 3: 0-2147483647 & 9-16 | |
[0] MPI startup(): Scatter: 1: 0-0 & 17-32 | |
[0] MPI startup(): Scatter: 2: 1-29 & 17-32 | |
[0] MPI startup(): Scatter: 1: 30-37 & 17-32 | |
[0] MPI startup(): Scatter: 2: 38-138 & 17-32 | |
[0] MPI startup(): Scatter: 1: 139-886 & 17-32 | |
[0] MPI startup(): Scatter: 3: 0-2147483647 & 17-32 | |
[0] MPI startup(): Scatter: 3: 0-0 & 33-64 | |
[0] MPI startup(): Scatter: 2: 1-32 & 33-64 | |
[0] MPI startup(): Scatter: 1: 33-1595 & 33-64 | |
[0] MPI startup(): Scatter: 3: 0-2147483647 & 33-64 | |
[0] MPI startup(): Scatter: 3: 0-0 & 65-128 | |
[0] MPI startup(): Scatter: 1: 1-17376 & 65-128 | |
[0] MPI startup(): Scatter: 3: 0-2147483647 & 65-128 | |
[0] MPI startup(): Scatter: 1: 0-26387 & 129-2147483647 | |
[0] MPI startup(): Scatter: 3: 26388-51099 & 129-2147483647 | |
[0] MPI startup(): Scatter: 1: 51100-95214 & 129-2147483647 | |
[0] MPI startup(): Scatter: 3: 0-2147483647 & 129-2147483647 | |
[0] MPI startup(): Scatterv: 1: 0-2147483647 & 0-2147483647 | |
[1] MPI startup(): Recognition=2 Platform(code=128 ippn=2 dev=3) Fabric(intra=4 inter=4 flags=0x0) | |
[2] MPI startup(): Recognition=2 Platform(code=128 ippn=2 dev=3) Fabric(intra=4 inter=4 flags=0x0) | |
[3] MPI startup(): Recognition=2 Platform(code=128 ippn=2 dev=3) Fabric(intra=4 inter=4 flags=0x0) | |
[0] MPI startup(): Rank Pid Node name Pin cpu | |
[0] MPI startup(): 0 859 4c9d2659524a436086fb2f14ff1bd1cd000001 {0,1,2,3,4,5} | |
[0] MPI startup(): 1 860 4c9d2659524a436086fb2f14ff1bd1cd000001 {6,7,8,9,10,11} | |
[0] MPI startup(): 2 861 4c9d2659524a436086fb2f14ff1bd1cd000001 {12,13,14,15,16,17} | |
[0] MPI startup(): 3 862 4c9d2659524a436086fb2f14ff1bd1cd000001 {18,19,20,21,22,23} | |
[0] MPI startup(): 4 828 4c9d2659524a436086fb2f14ff1bd1cd000002 {0,1,2,3,4,5} | |
[0] MPI startup(): 5 829 4c9d2659524a436086fb2f14ff1bd1cd000002 {6,7,8,9,10,11} | |
[0] MPI startup(): 6 830 4c9d2659524a436086fb2f14ff1bd1cd000002 {12,13,14,15,16,17} | |
[0] MPI startup(): 7 831 4c9d2659524a436086fb2f14ff1bd1cd000002 {18,19,20,21,22,23} | |
[0] MPI startup(): Recognition=2 Platform(code=128 ippn=2 dev=3) Fabric(intra=4 inter=4 flags=0x0) | |
[4] MPI startup(): Recognition=2 Platform(code=128 ippn=2 dev=3) Fabric(intra=4 inter=4 flags=0x0) | |
[5] MPI startup(): Recognition=2 Platform(code=128 ippn=2 dev=3) Fabric(intra=4 inter=4 flags=0x0) | |
[6] MPI startup(): Recognition=2 Platform(code=128 ippn=2 dev=3) Fabric(intra=4 inter=4 flags=0x0) | |
[7] MPI startup(): Recognition=2 Platform(code=128 ippn=2 dev=3) Fabric(intra=4 inter=4 flags=0x0) | |
[0] MPI startup(): I_MPI_DAPL_PROVIDER=ofa-v2-ib0 | |
[0] MPI startup(): I_MPI_DEBUG=6 | |
[0] MPI startup(): I_MPI_DYNAMIC_CONNECTION=0 | |
[0] MPI startup(): I_MPI_FABRICS=dapl | |
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_MAP=mlx4_0:-1 | |
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=2 | |
[0] MPI startup(): I_MPI_PIN_MAPPING=4:0 0,1 6,2 12,3 18 |
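The rank table in the MPI startup output above (columns Rank, Pid, Node name, Pin cpu) is the quickest way to confirm that each node received four ranks and that their CPU sets do not overlap. The following is a minimal sketch of how that table could be extracted from a saved log. The names `ROW` and `parse_pin_table` are illustrative only and are not part of Intel MPI or Horovod, and the regular expression assumes the pin column is a plain comma-separated list, as it is in this log.

```python
import re
from collections import defaultdict

# Matches rank-table rows such as:
# [0] MPI startup(): 0 859 <node name> {0,1,2,3,4,5}
# Assumption: the pin column is a comma-separated list of CPU ids.
ROW = re.compile(
    r"^\[\d+\] MPI startup\(\):\s+(\d+)\s+(\d+)\s+(\S+)\s+\{([\d,]+)\}"
)


def parse_pin_table(lines):
    """Group (rank, pid, cpus) tuples by node name."""
    by_node = defaultdict(list)
    for line in lines:
        m = ROW.match(line.strip())
        if m is None:
            continue  # not a rank-table row (header, tuning entry, etc.)
        rank, pid, node, cpus = m.groups()
        cpu_list = [int(c) for c in cpus.split(",")]
        by_node[node].append((int(rank), int(pid), cpu_list))
    return dict(by_node)


# Two rows copied from the log above, plus the header, which is skipped.
sample = """\
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 859 4c9d2659524a436086fb2f14ff1bd1cd000001 {0,1,2,3,4,5}
[0] MPI startup(): 4 828 4c9d2659524a436086fb2f14ff1bd1cd000002 {0,1,2,3,4,5}
"""

table = parse_pin_table(sample.splitlines())
for node, ranks in table.items():
    print(node, ranks)
```

Run against the full log, each of the two node names should map to four entries with disjoint CPU lists.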
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO NET : Using interface eth0:10.0.0.5<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO NET/IB : Using interface eth0 for sideband communication | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO NET/IB: [0] mlx4_0:1/RoCE | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO Using internal Network IB | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO Using NCCL Low-latency algorithm for sizes below 16384 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO NET : Using interface eth0:10.0.0.5<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO NET/Socket : 1 interfaces found | |
NCCL version 2.2.12+cuda9.0 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:860:970 [1] INFO NET : Using interface eth0:10.0.0.5<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000001:860:970 [1] INFO NET/IB : Using interface eth0 for sideband communication | |
4c9d2659524a436086fb2f14ff1bd1cd000001:860:970 [1] INFO NET/IB: [0] mlx4_0:1/RoCE | |
4c9d2659524a436086fb2f14ff1bd1cd000001:860:970 [1] INFO Using internal Network IB | |
4c9d2659524a436086fb2f14ff1bd1cd000001:860:970 [1] INFO Using NCCL Low-latency algorithm for sizes below 16384 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:861:968 [2] INFO NET : Using interface eth0:10.0.0.5<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] INFO NET : Using interface eth0:10.0.0.5<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] INFO NET/IB : Using interface eth0 for sideband communication | |
4c9d2659524a436086fb2f14ff1bd1cd000001:861:968 [2] INFO NET/IB : Using interface eth0 for sideband communication | |
4c9d2659524a436086fb2f14ff1bd1cd000001:861:968 [2] INFO NET/IB: [0] mlx4_0:1/RoCE | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] INFO NET/IB: [0] mlx4_0:1/RoCE | |
4c9d2659524a436086fb2f14ff1bd1cd000001:861:968 [2] INFO Using internal Network IB | |
4c9d2659524a436086fb2f14ff1bd1cd000001:861:968 [2] INFO 4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] INFO Using internal Network IB | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] INFO Using NCCL Low-latency algorithm for sizes below 16384 | |
Using NCCL Low-latency algorithm for sizes below 16384 | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] INFO NET : Using interface eth0:10.0.0.4<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] INFO NET/IB : Using interface eth0 for sideband communication | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] INFO NET/IB: [0] mlx4_0:1/RoCE | |
4c9d2659524a436086fb2f14ff1bd1cd000002:828:937 [0] INFO NET : Using interface eth0:10.0.0.4<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000002:828:937 [0] INFO NET/IB : Using interface eth0 for sideband communication | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] INFO Using internal Network IB | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] INFO Using NCCL Low-latency algorithm for sizes below 16384 | |
4c9d2659524a436086fb2f14ff1bd1cd000002:828:937 [0] INFO NET/IB: [0] mlx4_0:1/RoCE | |
4c9d2659524a436086fb2f14ff1bd1cd000002:830:939 [2] INFO NET : Using interface eth0:10.0.0.4<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000002:829:936 [1] INFO NET : Using interface eth0:10.0.0.4<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000002:830:939 [2] INFO NET/IB : Using interface eth0 for sideband communication | |
4c9d2659524a436086fb2f14ff1bd1cd000002:829:936 [1] INFO NET/IB : Using interface eth0 for sideband communication | |
4c9d2659524a436086fb2f14ff1bd1cd000002:828:937 [0] INFO Using internal Network IB | |
4c9d2659524a436086fb2f14ff1bd1cd000002:828:937 [0] INFO Using NCCL Low-latency algorithm for sizes below 16384 | |
4c9d2659524a436086fb2f14ff1bd1cd000002:830:939 [2] INFO NET/IB: [0] mlx4_0:1/RoCE | |
4c9d2659524a436086fb2f14ff1bd1cd000002:829:936 [1] INFO NET/IB: [0] mlx4_0:1/RoCE | |
4c9d2659524a436086fb2f14ff1bd1cd000002:830:939 [2] INFO Using internal Network IB | |
4c9d2659524a436086fb2f14ff1bd1cd000002:830:939 [2] INFO 4c9d2659524a436086fb2f14ff1bd1cd000002:829:936 [1] INFO Using internal Network IB | |
Using NCCL Low-latency algorithm for sizes below 16384 | |
4c9d2659524a436086fb2f14ff1bd1cd000002:829:936 [1] INFO Using NCCL Low-latency algorithm for sizes below 16384 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] INFO comm 0x7f0ac826ac70 rank 3 nranks 8 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] INFO NET : Using interface eth0:10.0.0.5<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] INFO NET/Socket : 1 interfaces found | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] INFO CUDA Dev 3, IB Ports : mlx4_0/1(PXB) | |
4c9d2659524a436086fb2f14ff1bd1cd000001:861:968 [2] INFO comm 0x7fd9c826b150 rank 2 nranks 8 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:861:968 [2] INFO NET : Using interface eth0:10.0.0.5<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000001:861:968 [2] INFO NET/Socket : 1 interfaces found | |
4c9d2659524a436086fb2f14ff1bd1cd000001:861:968 [2] INFO CUDA Dev 2, IB Ports : mlx4_0/1(PXB) | |
4c9d2659524a436086fb2f14ff1bd1cd000001:860:970 [1] INFO comm 0x7ff24026b560 rank 1 nranks 8 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:860:970 [1] INFO NET : Using interface eth0:10.0.0.5<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000001:860:970 [1] INFO NET/Socket : 1 interfaces found | |
4c9d2659524a436086fb2f14ff1bd1cd000001:860:970 [1] INFO CUDA Dev 1, IB Ports : mlx4_0/1(PXB) | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO comm 0x7f954829e5d0 rank 0 nranks 8 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO CUDA Dev 0, IB Ports : mlx4_0/1(PXB) | |
4c9d2659524a436086fb2f14ff1bd1cd000002:828:937 [0] INFO comm 0x7f5900268610 rank 4 nranks 8 | |
4c9d2659524a436086fb2f14ff1bd1cd000002:828:937 [0] INFO NET : Using interface eth0:10.0.0.4<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000002:828:937 [0] INFO NET/Socket : 1 interfaces found | |
4c9d2659524a436086fb2f14ff1bd1cd000002:828:937 [0] INFO CUDA Dev 0, IB Ports : mlx4_0/1(PXB) | |
4c9d2659524a436086fb2f14ff1bd1cd000002:829:936 [1] INFO comm 0x7fb6bc26a200 rank 5 nranks 8 | |
4c9d2659524a436086fb2f14ff1bd1cd000002:829:936 [1] INFO NET : Using interface eth0:10.0.0.4<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000002:829:936 [1] INFO NET/Socket : 1 interfaces found | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] INFO comm 0x7fc3e4269730 rank 7 nranks 8 | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] INFO NET : Using interface eth0:10.0.0.4<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] INFO NET/Socket : 1 interfaces found | |
4c9d2659524a436086fb2f14ff1bd1cd000002:829:936 [1] INFO CUDA Dev 1, IB Ports : mlx4_0/1(PXB) | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] INFO CUDA Dev 3, IB Ports : mlx4_0/1(PXB) | |
4c9d2659524a436086fb2f14ff1bd1cd000002:830:939 [2] INFO comm 0x7f4804269b10 rank 6 nranks 8 | |
4c9d2659524a436086fb2f14ff1bd1cd000002:830:939 [2] INFO NET : Using interface eth0:10.0.0.4<0> | |
4c9d2659524a436086fb2f14ff1bd1cd000002:830:939 [2] INFO NET/Socket : 1 interfaces found | |
4c9d2659524a436086fb2f14ff1bd1cd000002:830:939 [2] INFO CUDA Dev 2, IB Ports : mlx4_0/1(PXB) | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO Using 256 threads | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO Min Comp Cap 7 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO NCCL_SINGLE_RING_THRESHOLD=262144 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO Ring 00 : 0 1 2 3 4 5 6 7 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] transport/net_ib.cu:218 WARN No module present for GPU Direct RDMA. | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] transport/net_ib.cu:218 WARN No module present for GPU Direct RDMA. | |
4c9d2659524a436086fb2f14ff1bd1cd000001:860:970 [1] INFO 1[860] -> 2[861] via direct shared memory | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO 7 -> 0 via NET/IB/0 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:861:968 [2] INFO 2[861] -> 3[862] via direct shared memory | |
4c9d2659524a436086fb2f14ff1bd1cd000001:859:967 [0] INFO 0[859] -> 1[860] via direct shared memory | |
4c9d2659524a436086fb2f14ff1bd1cd000002:828:937 [0] transport/net_ib.cu:218 WARN No module present for GPU Direct RDMA. | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] transport/net_ib.cu:218 WARN No module present for GPU Direct RDMA. | |
4c9d2659524a436086fb2f14ff1bd1cd000002:828:937 [0] INFO 3 -> 4 via NET/IB/0 | |
4c9d2659524a436086fb2f14ff1bd1cd000002:829:936 [1] INFO 5[829] -> 6[830] via direct shared memory | |
4c9d2659524a436086fb2f14ff1bd1cd000002:830:939 [2] INFO 6[830] -> 7[831] via direct shared memory | |
4c9d2659524a436086fb2f14ff1bd1cd000002:828:937 [0] INFO 4[828] -> 5[829] via direct shared memory | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] misc/ibvwrap.cu:235 WARN Call to ibv_query_gid failed with error Unknown error -1 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] INFO transport/net_ib.cu:453 -> 2 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] INFO include/net.h:32 -> 2 [Net] | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] INFO transport/net.cu:266 -> 2 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] INFO init.cu:475 -> 2 | |
4c9d2659524a436086fb2f14ff1bd1cd000001:862:969 [3] INFO init.cu:536 -> 2 | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] misc/ibvwrap.cu:235 WARN Call to ibv_query_gid failed with error Unknown error -1 | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] INFO transport/net_ib.cu:453 -> 2 | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] INFO include/net.h:32 -> 2 [Net] | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] INFO transport/net.cu:266 -> 2 | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] INFO init.cu:475 -> 2 | |
4c9d2659524a436086fb2f14ff1bd1cd000002:831:938 [3] INFO init.cu:536 -> 2 |