Created April 20, 2022 15:31
UCX 1.11.2 debug log for failing osu_scatter
This file has been truncated, but you can view the full file.
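The log that follows interleaves output from two processes (PIDs 13452 and 15226 on host ndv4), and a few entries appear out of timestamp order. As a reading aid, here is a minimal sketch for splitting such a log by process. It assumes every entry has the layout `[timestamp] [host:pid:tid] file.c:line UCX LEVEL message`, inferred only from the lines visible here, and that the trailing `| |` is page-extraction residue. The names `parse_line` and `group_by_pid` are hypothetical helpers and not part of UCX.

```python
import re
from collections import defaultdict

# Layout inferred from the visible log lines (an assumption, not a UCX spec):
# [timestamp] [host:pid:tid] source.c:line UCX LEVEL message
LINE_RE = re.compile(
    r"^\[(?P<ts>\d+\.\d+)\]\s+"
    r"\[(?P<host>[^:\]]+):(?P<pid>\d+):(?P<tid>\d+)\]\s+"
    r"(?P<src>\S+):(?P<lineno>\d+)\s+"
    r"UCX\s+(?P<level>\w+)\s+"
    r"(?P<msg>.*)$"
)


def parse_line(line):
    """Parse one log entry into a dict, or return None if it does not match."""
    text = line.strip()
    # Drop the trailing "| |" table residue left by the page scrape.
    # Caveat: a message that genuinely ends in "|" would lose it too.
    text = re.sub(r"(\s*\|)+\s*$", "", text)
    match = LINE_RE.match(text)
    if match is None:
        return None
    rec = match.groupdict()
    rec["ts"] = float(rec["ts"])
    rec["pid"] = int(rec["pid"])
    rec["tid"] = int(rec["tid"])
    rec["lineno"] = int(rec["lineno"])
    return rec


def group_by_pid(lines):
    """Group parsed entries by PID and sort each group by timestamp."""
    groups = defaultdict(list)
    for line in lines:
        rec = parse_line(line)
        if rec is not None:
            groups[rec["pid"]].append(rec)
    for recs in groups.values():
        recs.sort(key=lambda r: r["ts"])
    return dict(groups)
```

Calling `group_by_pid(open(path))` on a saved copy of the log (with `path` being wherever the file was stored) should give one time-ordered list of entries per process, which makes it easier to compare the initialization sequences of the two processes side by side.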
[1650465527.778184] [ndv4:13452:0] debug.c:1198 UCX DEBUG using signal stack 0x2ae65c744000 size 141824 | |
[1650465527.778267] [ndv4:13452:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465527.778291] [ndv4:13452:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2ae65c598000 | |
[1650465527.778320] [ndv4:13452:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465527.778329] [ndv4:13452:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465527.778336] [ndv4:13452:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465527.781699] [ndv4:13452:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465527.781723] [ndv4:13452:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465527.781758] [ndv4:13452:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465527.781761] [ndv4:13452:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465527.781768] [ndv4:13452:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465527.781775] [ndv4:13452:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465527.781778] [ndv4:13452:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465527.781782] [ndv4:13452:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465527.781785] [ndv4:13452:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465527.781787] [ndv4:13452:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465527.781790] [ndv4:13452:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465527.781792] [ndv4:13452:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465527.781800] [ndv4:13452:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465527.783148] [ndv4:13452:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465527.784670] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465527.784697] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465527.784709] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465527.784721] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465527.784732] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465527.784743] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465527.784755] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465527.785278] [ndv4:13452:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465527.790275] [ndv4:15226:0] debug.c:1198 UCX DEBUG using signal stack 0x2b17886a0000 size 141824 | |
[1650465527.790348] [ndv4:15226:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465527.790366] [ndv4:15226:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b17884f4000 | |
[1650465527.790390] [ndv4:15226:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465527.790399] [ndv4:15226:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465527.790405] [ndv4:15226:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465527.793470] [ndv4:15226:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465527.793491] [ndv4:15226:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465527.793524] [ndv4:15226:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465527.793527] [ndv4:15226:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465527.793533] [ndv4:15226:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465527.793540] [ndv4:15226:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465527.793543] [ndv4:15226:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465527.793546] [ndv4:15226:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465527.793549] [ndv4:15226:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465527.793551] [ndv4:15226:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465527.793553] [ndv4:15226:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465527.793556] [ndv4:15226:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465527.793563] [ndv4:15226:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465527.794454] [ndv4:15226:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465527.795265] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465527.795282] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465527.795292] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465527.795303] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465527.795313] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465527.795323] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465527.795333] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465527.795875] [ndv4:13452:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465527.795972] [ndv4:15226:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465527.796347] [ndv4:13452:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465527.803344] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x16121e0 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465527.803464] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465527.803472] [ndv4:13452:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465527.804006] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465527.804017] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465527.804041] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465527.804100] [ndv4:13452:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465527.804122] [ndv4:13452:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465527.805517] [ndv4:13452:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465527.805526] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465527.806360] [ndv4:15226:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465527.805837] [ndv4:13452:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465527.805895] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465527.805960] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465527.806962] [ndv4:15226:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465527.822454] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0xf571e0 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465527.822559] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465527.822567] [ndv4:15226:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465527.823160] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465527.823196] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465527.823215] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465527.823276] [ndv4:15226:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465527.823295] [ndv4:15226:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465527.825920] [ndv4:15226:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465527.825929] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465527.826112] [ndv4:15226:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465527.826137] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465527.826153] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465527.830454] [ndv4:13452:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465527.830463] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465527.839965] [ndv4:15226:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465527.839974] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465527.841862] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465527.841868] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465527.843785] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465527.843794] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465527.847356] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465527.847362] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465527.849888] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465527.849894] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465527.853278] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465527.853286] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465527.856061] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465527.856068] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465527.857394] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465527.857400] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465527.857756] [ndv4:15226:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465527.862370] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465527.862377] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465527.862770] [ndv4:13452:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465527.880765] [ndv4:13452:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465527.881428] [ndv4:13452:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465527.889613] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x160af50 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465527.889670] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465527.889673] [ndv4:13452:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465527.890179] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465527.890279] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465527.890284] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465527.890327] [ndv4:13452:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465527.891910] [ndv4:15226:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465527.892244] [ndv4:13452:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465527.892251] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465527.892670] [ndv4:15226:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465527.892699] [ndv4:13452:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465527.892733] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465527.892739] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465527.894938] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0xf4ff50 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465527.894975] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465527.894979] [ndv4:15226:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465527.895171] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465527.895179] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465527.895185] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465527.895227] [ndv4:15226:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465527.897045] [ndv4:15226:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465527.897052] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465527.897247] [ndv4:15226:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465527.897265] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465527.897270] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465527.899195] [ndv4:15226:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465527.899201] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465527.903846] [ndv4:13452:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465527.903854] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465527.907228] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465527.907235] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465527.915451] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465527.915460] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465527.916069] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465527.916073] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465527.918893] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465527.918902] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465527.924006] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465527.924014] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465527.927121] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465527.927129] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465527.928507] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465527.928514] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465527.928922] [ndv4:13452:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465527.934184] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465527.934191] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465527.934431] [ndv4:15226:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465527.940794] [ndv4:13452:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465527.941792] [ndv4:13452:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465527.942970] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1612ce0 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465527.943008] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465527.943011] [ndv4:13452:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465527.943378] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465527.943454] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465527.943460] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465527.943500] [ndv4:13452:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465527.947804] [ndv4:13452:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465527.947811] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465527.949381] [ndv4:15226:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465527.948135] [ndv4:13452:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465527.948376] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465527.948450] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465527.949948] [ndv4:15226:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465527.954402] [ndv4:13452:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465527.954409] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465527.963970] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0xf57ce0 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465527.964008] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465527.964012] [ndv4:15226:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465527.964494] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465527.964655] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465527.964661] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465527.964703] [ndv4:15226:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465527.966274] [ndv4:15226:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465527.966281] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465527.966721] [ndv4:15226:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465527.966821] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465527.966863] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465527.971194] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465527.971202] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465527.976002] [ndv4:15226:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465527.976010] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465527.980193] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465527.980210] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465527.989042] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465527.989049] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465527.991972] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465527.991980] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465527.993753] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465527.993759] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.004241] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.004249] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.006526] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.006534] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.007634] [ndv4:13452:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.012093] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.012102] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.013110] [ndv4:15226:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.024456] [ndv4:13452:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.025074] [ndv4:13452:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465528.026862] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x160a2b0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.026898] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465528.026902] [ndv4:13452:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465528.027518] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.027571] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.027695] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.027757] [ndv4:13452:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465528.028916] [ndv4:13452:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.028923] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.029212] [ndv4:13452:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.029311] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.029425] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.031079] [ndv4:15226:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.031469] [ndv4:15226:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465528.033716] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0xf4f2b0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.033749] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465528.033752] [ndv4:15226:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465528.034401] [ndv4:13452:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465528.034408] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.034055] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.034250] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.034255] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.034297] [ndv4:15226:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465528.042554] [ndv4:15226:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.042562] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.042903] [ndv4:15226:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.043084] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.043113] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.044906] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.044913] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.048758] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.048765] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.050701] [ndv4:15226:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465528.050708] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.052220] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.052226] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.054683] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.054689] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.068885] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.068892] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.070232] [ndv4:13452:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.077990] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.077999] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.092454] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.092462] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.096412] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.096417] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.097500] [ndv4:15226:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.106003] [ndv4:13452:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.106428] [ndv4:13452:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465528.117245] [ndv4:15226:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.117178] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1cf3bf0 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.117202] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465528.117205] [ndv4:13452:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465528.117810] [ndv4:15226:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465528.117769] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.117810] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.117816] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.117861] [ndv4:13452:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465528.119305] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x1638bf0 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.119334] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465528.119337] [ndv4:15226:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465528.119767] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.119835] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.119841] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.119886] [ndv4:15226:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465528.121357] [ndv4:15226:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.121364] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.121674] [ndv4:15226:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.122061] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.122115] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.129258] [ndv4:13452:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.129266] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.129433] [ndv4:13452:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.129691] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.129835] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.133507] [ndv4:13452:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465528.133513] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.136468] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.136474] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.140939] [ndv4:15226:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465528.140948] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.142095] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.142102] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.151860] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.151868] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.152896] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.152903] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.165486] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.165494] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.166664] [ndv4:13452:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.165619] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.165625] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.177971] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.177978] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.180022] [ndv4:13452:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.180704] [ndv4:13452:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465528.194999] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x160e710 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.195026] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465528.195030] [ndv4:13452:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465528.196012] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.196020] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.195508] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.195622] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.195627] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.195676] [ndv4:13452:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465528.197183] [ndv4:13452:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.197190] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.197679] [ndv4:15226:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.197436] [ndv4:13452:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.197804] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.197848] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.208895] [ndv4:15226:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.209469] [ndv4:15226:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465528.209863] [ndv4:13452:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465528.209871] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.213555] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465528.213561] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.216860] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465528.216867] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.221055] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0xf53710 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.221087] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465528.221090] [ndv4:15226:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465528.221672] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.221813] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.221819] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.221869] [ndv4:15226:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465528.225923] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465528.225929] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.228895] [ndv4:15226:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.228903] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.229121] [ndv4:15226:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.229259] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.229379] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.229785] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465528.229791] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.231186] [ndv4:13452:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.243317] [ndv4:13452:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.243782] [ndv4:13452:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465528.250538] [ndv4:15226:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465528.250546] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.254330] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x160e560 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.254360] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465528.254364] [ndv4:13452:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465528.254558] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465528.254564] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.254880] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465528.255012] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465528.255017] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.255059] [ndv4:13452:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465528.259169] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465528.259175] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.261384] [ndv4:13452:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.261391] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.261762] [ndv4:13452:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.262350] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465528.262423] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465528.274012] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465528.274019] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.277020] [ndv4:13452:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465528.277028] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.278621] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465528.278628] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.279396] [ndv4:15226:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.282646] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465528.282652] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.293469] [ndv4:15226:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.293989] [ndv4:15226:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465528.296875] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0xf53560 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.296911] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465528.296915] [ndv4:15226:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465528.297369] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465528.297477] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465528.297482] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.297525] [ndv4:15226:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465528.299094] [ndv4:15226:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.299101] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.299298] [ndv4:15226:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.299830] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465528.300061] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465528.302285] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465528.302292] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.304780] [ndv4:14205:0] debug.c:1198 UCX DEBUG using signal stack 0x2afda2764000 size 141824 | |
[1650465528.305112] [ndv4:14205:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465528.305135] [ndv4:14205:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2afda25d2000 | |
[1650465528.305502] [ndv4:14205:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465528.305511] [ndv4:14205:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465528.305518] [ndv4:14205:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465528.312444] [ndv4:15226:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465528.312451] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.312220] [ndv4:14205:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465528.312241] [ndv4:14205:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465528.312276] [ndv4:14205:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465528.312279] [ndv4:14205:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465528.312286] [ndv4:14205:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465528.312293] [ndv4:14205:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465528.312296] [ndv4:14205:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465528.312300] [ndv4:14205:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465528.312302] [ndv4:14205:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465528.312305] [ndv4:14205:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465528.312307] [ndv4:14205:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465528.312309] [ndv4:14205:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465528.312318] [ndv4:14205:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465528.316147] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465528.316152] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.318759] [ndv4:14205:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465528.320312] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465528.320319] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.321055] [ndv4:14205:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465528.321072] [ndv4:14205:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465528.321084] [ndv4:14205:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465528.321095] [ndv4:14205:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465528.321106] [ndv4:14205:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465528.321117] [ndv4:14205:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465528.321127] [ndv4:14205:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465528.323881] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465528.323887] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.323841] [ndv4:14205:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465528.325155] [ndv4:13452:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.326841] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465528.326850] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.330030] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465528.330037] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.341640] [ndv4:13452:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.342075] [ndv4:13452:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465528.353919] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1cf4940 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.353953] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465528.353956] [ndv4:13452:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465528.354268] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465528.354539] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465528.354545] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.354652] [ndv4:13452:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465528.356097] [ndv4:13452:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.356104] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.356447] [ndv4:13452:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.356770] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465528.357176] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465528.361357] [ndv4:14205:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.362047] [ndv4:14205:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465528.369768] [ndv4:13452:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465528.369777] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465528.374665] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465528.374674] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.375000] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x1990360 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.375099] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465528.375106] [ndv4:14205:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465528.376036] [ndv4:15226:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.381220] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.381443] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.381467] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.381529] [ndv4:14205:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465528.381552] [ndv4:14205:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465528.389779] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465528.389787] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465528.393364] [ndv4:14205:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.393379] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.393640] [ndv4:14205:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.393872] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.393977] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.397564] [ndv4:15226:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.398504] [ndv4:15226:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465528.398898] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465528.398905] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465528.399995] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x1639940 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.400030] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465528.400035] [ndv4:15226:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465528.400648] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465528.400727] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465528.400735] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.400808] [ndv4:15226:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465528.402497] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465528.402503] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465528.404765] [ndv4:13452:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465528.404771] [ndv4:13452:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465528.405044] [ndv4:13452:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x15f7c90 0x15f7c90 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465528.405640] [ndv4:15226:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.405649] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.405919] [ndv4:15226:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.406203] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465528.406331] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465528.414198] [ndv4:14205:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465528.414207] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.426276] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.426283] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.431665] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.431671] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.433699] [ndv4:15226:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465528.433707] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465528.435816] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.435822] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.437721] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465528.437728] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465528.439106] [ndv4:14949:0] debug.c:1198 UCX DEBUG using signal stack 0x2ad4b6430000 size 141824 | |
[1650465528.439496] [ndv4:14949:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465528.439516] [ndv4:14949:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2ad4b629e000 | |
[1650465528.439571] [ndv4:14949:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465528.439678] [ndv4:14949:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465528.439685] [ndv4:14949:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465528.444391] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.444397] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.446069] [ndv4:14949:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465528.446092] [ndv4:14949:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465528.446127] [ndv4:14949:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465528.446130] [ndv4:14949:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465528.446136] [ndv4:14949:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465528.446143] [ndv4:14949:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465528.446146] [ndv4:14949:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465528.446150] [ndv4:14949:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465528.446153] [ndv4:14949:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465528.446155] [ndv4:14949:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465528.446157] [ndv4:14949:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465528.446159] [ndv4:14949:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465528.446167] [ndv4:14949:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465528.446097] [ndv4:14205:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.446877] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465528.446885] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465528.452836] [ndv4:14949:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465528.454979] [ndv4:14949:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465528.454996] [ndv4:14949:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465528.455007] [ndv4:14949:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465528.455018] [ndv4:14949:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465528.455028] [ndv4:14949:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465528.455038] [ndv4:14949:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465528.455048] [ndv4:14949:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465528.457279] [ndv4:14949:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465528.459481] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465528.459490] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465528.465834] [ndv4:14205:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.466176] [ndv4:14205:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465528.467136] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x1986db0 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.467168] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465528.467172] [ndv4:14205:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465528.472462] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.472725] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.472732] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.472775] [ndv4:14205:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465528.474254] [ndv4:14205:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.474262] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.474496] [ndv4:14205:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.477047] [ndv4:14949:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.477364] [ndv4:15226:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465528.477372] [ndv4:15226:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465528.477745] [ndv4:15226:0] ucp_context.c:1556 UCX DEBUG created ucp context 0xf3cc90 0xf3cc90 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465528.477561] [ndv4:14949:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465528.481298] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.481390] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.484564] [ndv4:13452:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2ae65fa2a000 length 12288 | |
[1650465528.484702] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465528.483866] [ndv4:14205:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465528.483873] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.486287] [ndv4:13443:0] debug.c:1198 UCX DEBUG using signal stack 0x2b750459c000 size 141824 | |
[1650465528.486516] [ndv4:13443:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465528.486539] [ndv4:13443:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b75043f0000 | |
[1650465528.486748] [ndv4:13443:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465528.486760] [ndv4:13443:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465528.486766] [ndv4:13443:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465528.486139] [ndv4:13452:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465528.486148] [ndv4:13452:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2ae65fa2d000 length 4296704 | |
[1650465528.486153] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2ae65fa2d018 of 4296680 bytes with 512 elements | |
[1650465528.486415] [ndv4:13452:0] mm_iface.c:600 UCX DEBUG created mm iface 0x1d74d10 FIFO id 0x400000002ed74ed9 va 0x2ae65fa2a000 size 12288 (128 x 64 elems) | |
[1650465528.486675] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x1d74d10 using posix/memory on worker 0x264e8d0 | |
[1650465528.486701] [ndv4:13452:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465528.486753] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465528.488010] [ndv4:13452:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465528.488025] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2ae664000018 of 4296680 bytes with 512 elements | |
[1650465528.488723] [ndv4:13452:0] mm_iface.c:600 UCX DEBUG created mm iface 0x1d752e0 FIFO id 0x64003b va 0x2ae65fe46000 size 12288 (128 x 64 elems) | |
[1650465528.488734] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x1d752e0 using sysv/memory on worker 0x264e8d0 | |
[1650465528.488747] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465528.488751] [ndv4:13452:0] self.c:220 UCX DEBUG created self iface id 0x8c9f25a5686245c7 send_size 8192 | |
[1650465528.488757] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x1d88ec0 using self/memory0 on worker 0x264e8d0 | |
[1650465528.488781] [ndv4:13452:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465528.488786] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465528.488789] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465528.490728] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.490735] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.491928] [ndv4:13443:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465528.491949] [ndv4:13443:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465528.491986] [ndv4:13443:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465528.491989] [ndv4:13443:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465528.491997] [ndv4:13443:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465528.492004] [ndv4:13443:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465528.492007] [ndv4:13443:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465528.492012] [ndv4:13443:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465528.492015] [ndv4:13443:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465528.492017] [ndv4:13443:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465528.492019] [ndv4:13443:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465528.492022] [ndv4:13443:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465528.492030] [ndv4:13443:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465528.493151] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1d8b000 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465528.493194] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465528.493213] [ndv4:13452:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x1d89820: listening for connections (fd=78) on 10.5.0.5:60081 | |
[1650465528.493557] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x1d89820 using tcp/eth0 on worker 0x264e8d0 | |
[1650465528.493669] [ndv4:13452:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465528.493674] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465528.493677] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465528.493723] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1cd75c0 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465528.493746] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465528.493753] [ndv4:13452:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x1d89ea0: listening for connections (fd=80) on 127.0.0.1:60264 | |
[1650465528.493885] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465528.493894] [ndv4:13452:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465528.494152] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x1d89ea0 using tcp/lo on worker 0x264e8d0 | |
[1650465528.494170] [ndv4:13452:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465528.494173] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465528.494175] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465528.494469] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1ce5960 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465528.494490] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465528.494494] [ndv4:13452:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x1d71530: listening for connections (fd=82) on 172.16.1.242:41147 | |
[1650465528.494890] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x1d71530 using tcp/ib0 on worker 0x264e8d0 | |
[1650465528.495093] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.495389] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.495771] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.495889] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.496007] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x972270 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.496165] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465528.496178] [ndv4:14949:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465528.496664] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.496836] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.496870] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.496951] [ndv4:14949:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465528.496977] [ndv4:14949:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465528.496710] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465528.498148] [ndv4:14949:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.498159] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.498633] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465528.498672] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.498675] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.498688] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.499206] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.499215] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.499704] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.499713] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465528.501039] [ndv4:13443:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465528.500640] [ndv4:14949:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.500812] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.500887] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.503089] [ndv4:13443:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465528.503105] [ndv4:13443:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465528.503117] [ndv4:13443:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465528.503127] [ndv4:13443:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465528.503137] [ndv4:13443:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465528.503147] [ndv4:13443:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465528.503158] [ndv4:13443:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465528.503934] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x1d7f670: created RC QP 0xdb0a on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465528.507210] [ndv4:13443:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465528.509842] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x1d7f670 using rc_verbs/mlx5_ib0:1 on worker 0x264e8d0 | |
[1650465528.514474] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.514483] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.518482] [ndv4:13443:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.517892] [ndv4:14949:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465528.517902] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.518999] [ndv4:13443:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465528.519299] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.519463] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.522169] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.522176] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.524976] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x1d3b080 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.525077] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465528.525085] [ndv4:13443:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465528.525762] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.525999] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.526023] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.526082] [ndv4:13443:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465528.526104] [ndv4:13443:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465528.525830] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.525841] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.526052] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.526276] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.527014] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465528.527420] [ndv4:13452:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465528.527692] [ndv4:14205:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.527695] [ndv4:13443:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.527705] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.527957] [ndv4:13443:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.528046] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.528120] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.528639] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.528657] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.528661] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.528718] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.529232] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x2718010 of 8176 bytes with 127 elements | |
[1650465528.529533] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.529566] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.529668] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465528.529674] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.533146] [ndv4:13443:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465528.533154] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.537623] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1ce4c70 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.537664] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465528.537690] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x1d8d0c0 using rc_mlx5/mlx5_ib0:1 on worker 0x264e8d0 | |
[1650465528.538342] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.538484] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.538725] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.539182] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.543686] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.543705] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.544099] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465528.544799] [ndv4:14205:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.545210] [ndv4:14205:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465528.545956] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x1986e00 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.545992] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465528.545995] [ndv4:14205:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465528.546404] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.546817] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.546829] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.546914] [ndv4:14205:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465528.547812] [ndv4:14205:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.547819] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.548071] [ndv4:14205:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.548415] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.548799] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.550718] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.550729] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.550732] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.550747] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.551230] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465528.551242] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.551273] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465528.551277] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.551289] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1ce3b40 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.551313] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465528.553073] [ndv4:13452:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465528.553481] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.553489] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.555115] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.555123] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.556885] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.556892] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.559726] [ndv4:14949:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.563437] [ndv4:13452:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x2982010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdbb1 | |
[1650465528.564490] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.564725] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.564751] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ae65fe4b008 of 151544 bytes with 1052 elements | |
[1650465528.566397] [ndv4:15226:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b178b99c000 length 12288 | |
[1650465528.566463] [ndv4:14205:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465528.566475] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.566800] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465528.568358] [ndv4:15226:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465528.568376] [ndv4:15226:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b178b99f000 length 4296704 | |
[1650465528.568385] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b178b99f018 of 4296680 bytes with 512 elements | |
[1650465528.568860] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae664600000..0x2ae666c00000 on mlx5_ib0 lkey 0x80400 rkey 0x80400 access 0xf flags 0x3e4 | |
[1650465528.568881] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ae664600018 of 39845864 bytes with 4752 elements | |
[1650465528.569015] [ndv4:13452:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x2982010 | |
[1650465528.569051] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x2982010 using dc_mlx5/mlx5_ib0:1 on worker 0x264e8d0 | |
[1650465528.569523] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.569669] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.569798] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.569940] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.570427] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465528.571685] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.569774] [ndv4:15226:0] mm_iface.c:600 UCX DEBUG created mm iface 0x16b9d10 FIFO id 0x400000000f80c7c5 va 0x2b178b99c000 size 12288 (128 x 64 elems) | |
[1650465528.569895] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x16b9d10 using posix/memory on worker 0x1f938d0 | |
[1650465528.569931] [ndv4:15226:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465528.569993] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465528.570012] [ndv4:15226:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465528.570023] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b1790000018 of 4296680 bytes with 512 elements | |
[1650465528.570713] [ndv4:15226:0] mm_iface.c:600 UCX DEBUG created mm iface 0x16ba2e0 FIFO id 0x648002 va 0x2b178bdb8000 size 12288 (128 x 64 elems) | |
[1650465528.570723] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x16ba2e0 using sysv/memory on worker 0x1f938d0 | |
[1650465528.570735] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465528.570738] [ndv4:15226:0] self.c:220 UCX DEBUG created self iface id 0x45a973c9f245d83d send_size 8192 | |
[1650465528.570744] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x16cdec0 using self/memory0 on worker 0x1f938d0 | |
[1650465528.570767] [ndv4:15226:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465528.570773] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465528.570775] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465528.572471] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x1d95a30: created UD QP 0xdbd1 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.573493] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x16d0000 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465528.573522] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465528.573541] [ndv4:15226:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x16ce820: listening for connections (fd=78) on 10.5.0.5:56827 | |
[1650465528.573882] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.574922] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.574993] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.575194] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.575665] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.576188] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae65fe70000..0x2ae65fef5000 on mlx5_ib0 lkey 0x80500 rkey 0x80500 access 0xf flags 0x3e4 | |
[1650465528.576194] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae65fe70018 of 544744 bytes with 128 elements | |
[1650465528.576199] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.573916] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x16ce820 using tcp/eth0 on worker 0x1f938d0 | |
[1650465528.573938] [ndv4:15226:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465528.573942] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465528.573944] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465528.574008] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x161c5c0 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465528.574030] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465528.574034] [ndv4:15226:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x16ceea0: listening for connections (fd=80) on 127.0.0.1:36308 | |
[1650465528.574050] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465528.574056] [ndv4:15226:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465528.574206] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x16ceea0 using tcp/lo on worker 0x1f938d0 | |
[1650465528.574222] [ndv4:15226:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465528.574225] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465528.574229] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465528.574268] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x162a960 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465528.574292] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465528.574296] [ndv4:15226:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x16b6530: listening for connections (fd=82) on 172.16.1.242:50416 | |
[1650465528.574750] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x16b6530 using tcp/ib0 on worker 0x1f938d0 | |
[1650465528.574918] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.574964] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.575195] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.575375] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.575652] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465528.576215] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.576225] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.577475] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.577485] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.577446] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465528.577484] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.577488] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.577500] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.577813] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.577830] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465528.579366] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x16c4670: created RC QP 0xdbe3 on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465528.581440] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x1d95a30: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465528.582505] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x1d95a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465528.583101] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x1d95a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465528.587527] [ndv4:14949:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.587837] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.587847] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.588170] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x1d95a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465528.588074] [ndv4:14949:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465528.589073] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x968cc0 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.589104] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465528.589108] [ndv4:14949:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465528.590076] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x16c4670 using rc_verbs/mlx5_ib0:1 on worker 0x1f938d0 | |
[1650465528.590241] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.590565] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.589687] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.589774] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.589780] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.589845] [ndv4:14949:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465528.591035] [ndv4:14949:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.591042] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.591241] [ndv4:14949:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.591388] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.591470] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.590823] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.591122] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.591638] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465528.592398] [ndv4:15226:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465528.591962] [ndv4:13552:0] debug.c:1198 UCX DEBUG using signal stack 0x2b264ffd1000 size 141824 | |
[1650465528.592104] [ndv4:13552:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465528.592126] [ndv4:13552:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b26599af000 | |
[1650465528.592291] [ndv4:13552:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465528.592300] [ndv4:13552:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465528.592307] [ndv4:13552:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465528.593626] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x1d95a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465528.593807] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x1d95a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465528.593833] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x1d95a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465528.593953] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x1d95a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465528.593487] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.593496] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.593499] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.593553] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.594106] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x205d010 of 8176 bytes with 127 elements | |
[1650465528.594357] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.594370] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.594398] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.594449] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465528.594455] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.597444] [ndv4:13552:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465528.597465] [ndv4:13552:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465528.597501] [ndv4:13552:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465528.597504] [ndv4:13552:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465528.597511] [ndv4:13552:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465528.597518] [ndv4:13552:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465528.597522] [ndv4:13552:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465528.597527] [ndv4:13552:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465528.597529] [ndv4:13552:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465528.597532] [ndv4:13552:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465528.597534] [ndv4:13552:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465528.597536] [ndv4:13552:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465528.597544] [ndv4:13552:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465528.601759] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1cd52f0 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.601795] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 mode thread_spinlock | |
[1650465528.601825] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x1d95a30 using ud_verbs/mlx5_ib0:1 on worker 0x264e8d0 | |
[1650465528.602008] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.602091] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.602256] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.602363] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.603045] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x1629c70 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.603078] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465528.603105] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x16d20c0 using rc_mlx5/mlx5_ib0:1 on worker 0x1f938d0 | |
[1650465528.603332] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.603350] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.603080] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465528.603916] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.604088] [ndv4:14949:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465528.604096] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.605117] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x266e750: created UD QP 0xdc27 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.605130] [ndv4:13452:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465528.605487] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.605496] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.606525] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.606532] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.606123] [ndv4:13552:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465528.605895] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.606307] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.606659] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.606766] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.606910] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.606671] [ndv4:13443:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.607365] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae65fef5000..0x2ae65ff7a000 on mlx5_ib0 lkey 0x80600 rkey 0x80600 access 0xf flags 0x3e4 | |
[1650465528.607371] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae65fef5018 of 544744 bytes with 128 elements | |
[1650465528.607376] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.607794] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x266e750: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465528.608180] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x266e750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465528.607931] [ndv4:13552:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465528.607948] [ndv4:13552:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465528.607959] [ndv4:13552:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465528.607970] [ndv4:13552:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465528.607980] [ndv4:13552:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465528.607990] [ndv4:13552:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465528.608001] [ndv4:13552:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465528.608688] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x266e750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465528.608920] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x266e750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465528.609045] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x266e750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465528.608863] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.608870] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.610462] [ndv4:13552:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465528.613721] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.613876] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.614246] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.614528] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.615133] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465528.617061] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x266e750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465528.617445] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x266e750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465528.617717] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x266e750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465528.617725] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.617756] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.617761] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1ce36c0 [id=89 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.617780] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 89 events 0x5 mode thread_spinlock | |
[1650465528.617797] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[10]=0x266e750 using ud_mlx5/mlx5_ib0:1 on worker 0x264e8d0 | |
[1650465528.618145] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.618406] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.618431] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.618440] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.618444] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.618459] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.618881] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.618923] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.619086] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465528.619094] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.619127] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465528.619131] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.619142] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x1628b40 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.619168] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465528.619468] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465528.619725] [ndv4:15226:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465528.620830] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465528.620840] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.620843] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.620862] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.621899] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.621907] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465528.622491] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x2f1f050: created RC QP 0xbfb5 on mlx5_ib1:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465528.621774] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.621782] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.624099] [ndv4:14205:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.623762] [ndv4:13552:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.624216] [ndv4:13552:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465528.624190] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.624198] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.628177] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[11]=0x2f1f050 using rc_verbs/mlx5_ib1:1 on worker 0x264e8d0 | |
[1650465528.628248] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.628368] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.628819] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.628832] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.629049] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465528.628491] [ndv4:15226:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x22c7010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdc69 | |
[1650465528.629737] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.629864] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.629887] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b178bdbd008 of 151544 bytes with 1052 elements | |
[1650465528.630309] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.630320] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.630323] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.630340] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.630921] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x2f32010 of 8176 bytes with 127 elements | |
[1650465528.631156] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.631165] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.631200] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465528.631205] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.631217] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1d76f70 [id=92 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.631246] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 92 events 0x1 mode thread_spinlock | |
[1650465528.631257] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[12]=0x2da0010 using rc_mlx5/mlx5_ib1:1 on worker 0x264e8d0 | |
[1650465528.630163] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x13d41e0 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.630272] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465528.630280] [ndv4:13552:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465528.630678] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.631087] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.631111] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.631166] [ndv4:13552:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465528.631188] [ndv4:13552:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465528.632604] [ndv4:13443:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.633052] [ndv4:13552:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.633062] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.633271] [ndv4:13552:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.633903] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.634019] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.634382] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.634391] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.634193] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b1790600000..0x2b1792c00000 on mlx5_ib0 lkey 0x80800 rkey 0x80800 access 0xf flags 0x3e4 | |
[1650465528.634211] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b1790600018 of 39845864 bytes with 4752 elements | |
[1650465528.634356] [ndv4:15226:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x22c7010 | |
[1650465528.634395] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x22c7010 using dc_mlx5/mlx5_ib0:1 on worker 0x1f938d0 | |
[1650465528.634809] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.635036] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.635370] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.635396] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.634418] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.634641] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.634944] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.635034] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.635768] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465528.633876] [ndv4:13443:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465528.636391] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465528.637511] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.637571] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.637631] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.637635] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.637705] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.638111] [ndv4:14205:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.638109] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465528.638118] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.638145] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465528.638148] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.638157] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1d88dd0 [id=94 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.638180] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 94 events 0x1 mode thread_spinlock | |
[1650465528.638688] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x1d3bd70 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.638724] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465528.638730] [ndv4:13443:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465528.638826] [ndv4:13452:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465528.639085] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.639329] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.639339] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.639409] [ndv4:13443:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465528.639773] [ndv4:14205:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465528.639719] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x16daa30: created UD QP 0xdc87 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.641074] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x20705a0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.641095] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465528.641098] [ndv4:14205:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465528.641342] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.641536] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.641544] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.641681] [ndv4:14205:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465528.640409] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.641232] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.641326] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.641566] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.641786] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.642230] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b178bde2000..0x2b178be67000 on mlx5_ib0 lkey 0x80a00 rkey 0x80a00 access 0xf flags 0x3e4 | |
[1650465528.642235] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b178bde2018 of 544744 bytes with 128 elements | |
[1650465528.642239] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.641519] [ndv4:13443:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.641527] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.641788] [ndv4:13443:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.641935] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.642419] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.643101] [ndv4:14205:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.643110] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.643371] [ndv4:14205:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.643744] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.644243] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.647207] [ndv4:13452:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x30ca010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xbfe5 | |
[1650465528.646782] [ndv4:13443:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465528.646789] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.648174] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.648392] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.648402] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ae65ff7c008 of 151544 bytes with 1052 elements | |
[1650465528.647865] [ndv4:14205:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465528.647873] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.652171] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae666e00000..0x2ae669400000 on mlx5_ib1 lkey 0x80800 rkey 0x80800 access 0xf flags 0x3e4 | |
[1650465528.652196] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ae666e00018 of 39845864 bytes with 4752 elements | |
[1650465528.652334] [ndv4:13452:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x30ca010 | |
[1650465528.652370] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[13]=0x30ca010 using dc_mlx5/mlx5_ib1:1 on worker 0x264e8d0 | |
[1650465528.652733] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.652800] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.653245] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.653558] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.652657] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.652664] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.653946] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465528.654810] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.655163] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x2fc93d0: created UD QP 0xbfe9 on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.655102] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x16daa30: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465528.655486] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x16daa30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465528.655711] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x16daa30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465528.655739] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x16daa30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465528.655886] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x16daa30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465528.655847] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.656099] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.656388] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.656899] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.656960] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.657455] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae66941b000..0x2ae6694a0000 on mlx5_ib1 lkey 0x80900 rkey 0x80900 access 0xf flags 0x3e4 | |
[1650465528.657461] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae66941b018 of 544744 bytes with 128 elements | |
[1650465528.657466] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.657643] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2fc93d0: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465528.657684] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2fc93d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465528.657773] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2fc93d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465528.657913] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2fc93d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465528.657973] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2fc93d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465528.658234] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2fc93d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465528.658564] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2fc93d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465528.658837] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2fc93d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465528.659182] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.659191] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1d7d2b0 [id=95 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.659227] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 95 events 0x5 mode thread_spinlock | |
[1650465528.659238] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[14]=0x2fc93d0 using ud_verbs/mlx5_ib1:1 on worker 0x264e8d0 | |
[1650465528.661181] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.661190] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.661709] [ndv4:12741:0] debug.c:1198 UCX DEBUG using signal stack 0x2b7350987000 size 141824 | |
[1650465528.662254] [ndv4:12741:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465528.662276] [ndv4:12741:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b73507db000 | |
[1650465528.662304] [ndv4:12741:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465528.662313] [ndv4:12741:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465528.662319] [ndv4:12741:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465528.664733] [ndv4:13552:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465528.664742] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.665519] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x16daa30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465528.665869] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x16daa30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465528.666144] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x16daa30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465528.666712] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.664732] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.664738] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.667392] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.667673] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.668073] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.668154] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.667455] [ndv4:12741:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465528.667479] [ndv4:12741:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465528.667512] [ndv4:12741:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465528.667515] [ndv4:12741:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465528.667521] [ndv4:12741:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465528.667528] [ndv4:12741:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465528.667531] [ndv4:12741:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465528.667535] [ndv4:12741:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465528.667537] [ndv4:12741:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465528.667540] [ndv4:12741:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465528.667542] [ndv4:12741:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465528.667544] [ndv4:12741:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465528.667552] [ndv4:12741:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465528.668520] [ndv4:14421:0] debug.c:1198 UCX DEBUG using signal stack 0x2ab786fdd000 size 141824 | |
[1650465528.668843] [ndv4:14421:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465528.668864] [ndv4:14421:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2ab786e31000 | |
[1650465528.668920] [ndv4:14421:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465528.668929] [ndv4:14421:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465528.668936] [ndv4:14421:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465528.668801] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465528.669714] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.670173] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x2e90280: created UD QP 0xc000 on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.670185] [ndv4:13452:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465528.670812] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.671182] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.671407] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.671863] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.671959] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.672145] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.672155] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.672366] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae6694a0000..0x2ae669525000 on mlx5_ib1 lkey 0x80a00 rkey 0x80a00 access 0xf flags 0x3e4 | |
[1650465528.672373] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae6694a0018 of 544744 bytes with 128 elements | |
[1650465528.672377] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.672683] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2e90280: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465528.672926] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2e90280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465528.673190] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2e90280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465528.673949] [ndv4:14949:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.674051] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x161a2f0 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.674090] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 mode thread_spinlock | |
[1650465528.674126] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x16daa30 using ud_verbs/mlx5_ib0:1 on worker 0x1f938d0 | |
[1650465528.674670] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.674999] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.675366] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.675412] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.674155] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2e90280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465528.675382] [ndv4:14421:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465528.675405] [ndv4:14421:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465528.675435] [ndv4:14421:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465528.675438] [ndv4:14421:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465528.675446] [ndv4:14421:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465528.675453] [ndv4:14421:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465528.675456] [ndv4:14421:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465528.675460] [ndv4:14421:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465528.675462] [ndv4:14421:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465528.675465] [ndv4:14421:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465528.675467] [ndv4:14421:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465528.675469] [ndv4:14421:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465528.675476] [ndv4:14421:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465528.676733] [ndv4:12741:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465528.676095] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465528.677496] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.679159] [ndv4:12741:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465528.679176] [ndv4:12741:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465528.679370] [ndv4:12741:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465528.679405] [ndv4:12741:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465528.679569] [ndv4:12741:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465528.679669] [ndv4:12741:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465528.679680] [ndv4:12741:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465528.679848] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x1fb3750: created UD QP 0xdcee on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.679863] [ndv4:15226:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465528.678708] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.678716] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.680122] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.680130] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.679782] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2e90280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465528.680202] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2e90280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465528.680922] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.681247] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.681730] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.681939] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.682072] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.682425] [ndv4:12741:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465528.682533] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b178be67000..0x2b178beec000 on mlx5_ib0 lkey 0x80b00 rkey 0x80b00 access 0xf flags 0x3e4 | |
[1650465528.682540] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b178be67018 of 544744 bytes with 128 elements | |
[1650465528.682545] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.682825] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x1fb3750: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465528.682970] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x1fb3750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465528.683143] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x1fb3750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465528.682987] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.682994] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.683637] [ndv4:14421:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465528.687231] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.687237] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.689528] [ndv4:13443:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.692066] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2e90280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465528.692681] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x2e90280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465528.692690] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.692722] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.692728] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x2da8c30 [id=96 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.692755] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 96 events 0x5 mode thread_spinlock | |
[1650465528.692771] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[15]=0x2e90280 using ud_mlx5/mlx5_ib1:1 on worker 0x264e8d0 | |
[1650465528.692984] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.693102] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.693431] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.693467] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.694564] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.694661] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.695669] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465528.695802] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.695810] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.696993] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465528.697012] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.697016] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.697067] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.697100] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.697106] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.696956] [ndv4:14205:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.698189] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.698196] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465528.699617] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.699624] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.698713] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x34b90a0: created RC QP 0xbff3 on mlx5_ib2:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465528.700293] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[16]=0x34b90a0 using rc_verbs/mlx5_ib2:1 on worker 0x264e8d0 | |
[1650465528.701158] [ndv4:14421:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465528.701175] [ndv4:14421:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465528.701187] [ndv4:14421:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465528.701199] [ndv4:14421:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465528.701209] [ndv4:14421:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465528.701219] [ndv4:14421:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465528.701230] [ndv4:14421:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465528.701659] [ndv4:13443:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.700822] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x1fb3750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465528.700894] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x1fb3750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465528.701106] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x1fb3750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465528.701836] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x1fb3750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465528.702185] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x1fb3750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465528.702192] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.702223] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.702228] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x16286c0 [id=89 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.702259] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 89 events 0x5 mode thread_spinlock | |
[1650465528.702273] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[10]=0x1fb3750 using ud_mlx5/mlx5_ib0:1 on worker 0x1f938d0 | |
[1650465528.702410] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.702074] [ndv4:13443:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465528.703315] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x1d43c00 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.703349] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465528.703352] [ndv4:13443:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465528.702201] [ndv4:14949:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.702722] [ndv4:14949:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465528.702748] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.703269] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.703404] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.703832] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.704204] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.704210] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.704252] [ndv4:13443:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465528.704276] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465528.704354] [ndv4:14421:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465528.703879] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x970ad0 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.703906] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465528.703911] [ndv4:14949:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465528.704164] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.704284] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.704293] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.704370] [ndv4:14949:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465528.705119] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465528.705131] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.705133] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.705148] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.705400] [ndv4:14949:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.705408] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.705787] [ndv4:12741:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.706035] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.706043] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465528.706256] [ndv4:12741:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465528.706548] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.706808] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.707094] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.707663] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.706764] [ndv4:14949:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.706930] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.707168] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.706704] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x2864050: created RC QP 0xc041 on mlx5_ib1:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465528.708279] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465528.709932] [ndv4:14205:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.710375] [ndv4:14205:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465528.709660] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[11]=0x2864050 using rc_verbs/mlx5_ib1:1 on worker 0x1f938d0 | |
[1650465528.710745] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.711076] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.710813] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.710825] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.711889] [ndv4:13552:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.711139] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x206fd30 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.711163] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465528.711167] [ndv4:14205:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465528.711673] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.711826] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.711832] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.711873] [ndv4:14205:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465528.711074] [ndv4:13443:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.711086] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.711298] [ndv4:13443:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.711748] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.711829] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.711766] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.711825] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.712274] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465528.712630] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x26f61e0 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.712747] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465528.712754] [ndv4:12741:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465528.713131] [ndv4:14205:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.713139] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.713334] [ndv4:14205:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.713625] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.713663] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.713019] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.713082] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.713104] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.713165] [ndv4:12741:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465528.713187] [ndv4:12741:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465528.713490] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.713499] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.713502] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.713521] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.714023] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x2877010 of 8176 bytes with 127 elements | |
[1650465528.714325] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.714331] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.714367] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465528.714371] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.714381] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x16bbf70 [id=92 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.714421] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 92 events 0x1 mode thread_spinlock | |
[1650465528.714430] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[12]=0x26e5010 using rc_mlx5/mlx5_ib1:1 on worker 0x1f938d0 | |
[1650465528.714750] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.714794] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.715521] [ndv4:12741:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.715531] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.716047] [ndv4:12741:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.716103] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.716146] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.715947] [ndv4:14949:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465528.715957] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.717102] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.717120] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.717124] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.717175] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.717719] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x37da010 of 8176 bytes with 127 elements | |
[1650465528.718505] [ndv4:14421:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.717966] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.717974] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.718010] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465528.718014] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.718024] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1d975f0 [id=99 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.718048] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 99 events 0x1 mode thread_spinlock | |
[1650465528.718073] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[17]=0x3606030 using rc_mlx5/mlx5_ib2:1 on worker 0x264e8d0 | |
[1650465528.718190] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.718393] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.718688] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.718847] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.718936] [ndv4:14421:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465528.719308] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465528.720970] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.721028] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.721102] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.721120] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.721123] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.721177] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.721421] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465528.722839] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.722854] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.722858] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.722912] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.722062] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465528.722070] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.722102] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465528.722105] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.722113] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1d774c0 [id=101 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.722142] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 101 events 0x1 mode thread_spinlock | |
[1650465528.722787] [ndv4:13452:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465528.721958] [ndv4:13443:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465528.721967] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.723368] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465528.723375] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.723409] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465528.723412] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.723419] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x16cddd0 [id=94 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.723444] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 94 events 0x1 mode thread_spinlock | |
[1650465528.723781] [ndv4:14205:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465528.723789] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.723957] [ndv4:15226:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465528.726757] [ndv4:12741:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465528.726765] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.729154] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.729166] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.730973] [ndv4:14421:0] async.c:228 UCX DEBUG added async handler 0xf35280 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.731086] [ndv4:14421:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465528.731094] [ndv4:14421:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465528.731554] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.731560] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.731731] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.731766] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.731788] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.731851] [ndv4:14421:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465528.731873] [ndv4:14421:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465528.731805] [ndv4:13552:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.732315] [ndv4:13552:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465528.733361] [ndv4:14421:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.733369] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.733621] [ndv4:14421:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.734103] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465528.734500] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465528.732367] [ndv4:13452:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x37dc050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc018 | |
[1650465528.732888] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.732895] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.734644] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.734672] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.734717] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.734724] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.732840] [ndv4:15226:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x2a0f010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc07a | |
[1650465528.734112] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.734172] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.734182] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b178beee008 of 151544 bytes with 1052 elements | |
[1650465528.735754] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.735761] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.737717] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.737725] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.738137] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b1792e00000..0x2b1795400000 on mlx5_ib1 lkey 0x80b00 rkey 0x80b00 access 0xf flags 0x3e4 | |
[1650465528.738159] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b1792e00018 of 39845864 bytes with 4752 elements | |
[1650465528.738297] [ndv4:15226:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x2a0f010 | |
[1650465528.738334] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[13]=0x2a0f010 using dc_mlx5/mlx5_ib1:1 on worker 0x1f938d0 | |
[1650465528.738692] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.738775] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.739068] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.739101] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.738791] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.739140] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.739152] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ae65ffa3008 of 151544 bytes with 1052 elements | |
[1650465528.739955] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465528.740085] [ndv4:14421:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465528.740092] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.740774] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.740780] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.741265] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.742518] [ndv4:14949:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.742977] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x290e3d0: created UD QP 0xc08a on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.743545] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.743911] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.744047] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.743161] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.743170] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.743265] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae669600000..0x2ae66bc00000 on mlx5_ib2 lkey 0x80800 rkey 0x80800 access 0xf flags 0x3e4 | |
[1650465528.743294] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ae669600018 of 39845864 bytes with 4752 elements | |
[1650465528.743439] [ndv4:13452:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x37dc050 | |
[1650465528.743473] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[18]=0x37dc050 using dc_mlx5/mlx5_ib2:1 on worker 0x264e8d0 | |
[1650465528.744191] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.744200] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.744754] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.744763] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.744698] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.744709] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.745123] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b178bf13000..0x2b178bf98000 on mlx5_ib1 lkey 0x80c00 rkey 0x80c00 access 0xf flags 0x3e4 | |
[1650465528.745129] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b178bf13018 of 544744 bytes with 128 elements | |
[1650465528.745134] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.745472] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x290e3d0: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465528.745566] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x13ccf50 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.745662] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465528.745666] [ndv4:13552:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465528.745894] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x290e3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465528.745891] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.746143] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.746150] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.746197] [ndv4:13552:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465528.747017] [ndv4:13552:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.747025] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.747660] [ndv4:13552:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.748019] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.748335] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.748223] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.748230] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.750171] [ndv4:12741:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.751796] [ndv4:13552:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465528.751804] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.754860] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.754866] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.758293] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.758299] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.759734] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.759741] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.759684] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.759932] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.760256] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.760481] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.760885] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465528.760729] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x290e3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465528.761849] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.761857] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.761959] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.762287] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x39b8060: created UD QP 0xc028 on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.763066] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.763398] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.763921] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.764214] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.764569] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.762876] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.762884] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.765170] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae66bd26000..0x2ae66bdab000 on mlx5_ib2 lkey 0x80900 rkey 0x80900 access 0xf flags 0x3e4 | |
[1650465528.765177] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae66bd26018 of 544744 bytes with 128 elements | |
[1650465528.765181] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.765243] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x39b8060: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465528.765466] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.765476] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.767647] [ndv4:14205:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.771885] [ndv4:14949:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.772352] [ndv4:14949:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465528.772730] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x9709c0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.772767] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465528.772771] [ndv4:14949:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465528.774257] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x290e3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465528.774389] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x290e3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465528.774625] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.774632] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.774562] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.774795] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.774804] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.774858] [ndv4:14949:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465528.776189] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.776197] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.777074] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x39b8060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465528.777297] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x39b8060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465528.777535] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x39b8060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465528.777551] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.777558] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.778414] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.778422] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.778140] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x39b8060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465528.778442] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x39b8060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465528.778573] [ndv4:14949:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.778635] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.778791] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x39b8060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465528.778903] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x39b8060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465528.779210] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.779219] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x2f35e70 [id=102 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.779240] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 102 events 0x5 mode thread_spinlock | |
[1650465528.779255] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[19]=0x39b8060 using ud_verbs/mlx5_ib2:1 on worker 0x264e8d0 | |
[1650465528.779174] [ndv4:13552:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.779724] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.779809] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.780466] [ndv4:13443:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.780649] [ndv4:14949:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.780771] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.780820] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.782078] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x290e3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465528.782638] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x290e3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465528.782979] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x290e3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465528.783320] [ndv4:12741:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.783929] [ndv4:12741:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465528.784368] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x26eef50 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.784393] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465528.784397] [ndv4:12741:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465528.784716] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.784930] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.784937] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.784979] [ndv4:12741:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465528.783325] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.783338] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x16c22b0 [id=95 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.783372] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 95 events 0x5 mode thread_spinlock | |
[1650465528.783385] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[14]=0x290e3d0 using ud_verbs/mlx5_ib1:1 on worker 0x1f938d0 | |
[1650465528.783802] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.783909] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.784167] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.784276] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.785280] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465528.784026] [ndv4:14949:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465528.784034] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.786167] [ndv4:12741:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.786174] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.786404] [ndv4:12741:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.786687] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.787128] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.788006] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.788136] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.788557] [ndv4:14205:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.788762] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465528.789105] [ndv4:14205:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465528.789796] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.789972] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x198af90 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.790002] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465528.790005] [ndv4:14205:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465528.790107] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x3ad6050: created UD QP 0xc03c on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.790115] [ndv4:13452:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465528.790851] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.790446] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.790972] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.790980] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.791028] [ndv4:14205:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465528.791124] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.791500] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.792850] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.793011] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.794524] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.794537] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.793524] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae66bdab000..0x2ae66be30000 on mlx5_ib2 lkey 0x80a00 rkey 0x80a00 access 0xf flags 0x3e4 | |
[1650465528.793530] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae66bdab018 of 544744 bytes with 128 elements | |
[1650465528.793535] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.794035] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x3ad6050: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465528.794313] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x3ad6050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465528.794523] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x3ad6050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465528.794779] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x3ad6050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465528.795133] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x3ad6050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465528.795068] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.795947] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x3ad6050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465528.796162] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x3ad6050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465528.796543] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x3ad6050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465528.796551] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.796668] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.796674] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x34c1c60 [id=103 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.796697] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 103 events 0x5 mode thread_spinlock | |
[1650465528.796707] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[20]=0x3ad6050 using ud_mlx5/mlx5_ib2:1 on worker 0x264e8d0 | |
[1650465528.796957] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.797226] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.797890] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.797941] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.798658] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465528.797070] [ndv4:13443:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.797494] [ndv4:13443:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465528.798465] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x1d3b3a0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.798494] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465528.798498] [ndv4:13443:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465528.798994] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.799164] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.799170] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.799223] [ndv4:13443:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465528.796796] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x27d5280: created UD QP 0xc101 on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.796805] [ndv4:15226:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465528.797676] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.798420] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.798757] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.798806] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.798828] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.799545] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b179541b000..0x2b17954a0000 on mlx5_ib1 lkey 0x80d00 rkey 0x80d00 access 0xf flags 0x3e4 | |
[1650465528.799552] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b179541b018 of 544744 bytes with 128 elements | |
[1650465528.799557] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.799970] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x27d5280: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465528.798489] [ndv4:12741:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465528.798496] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.801621] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.801627] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.799949] [ndv4:14205:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.799957] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.800177] [ndv4:14205:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.800677] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.800836] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.800161] [ndv4:13443:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.800168] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.800321] [ndv4:13443:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.800692] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.800879] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.800119] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465528.800135] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.800138] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.800188] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.801240] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.801246] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465528.801744] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x3bd60a0: created RC QP 0xbf5b on mlx5_ib3:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465528.797775] [ndv4:13552:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.798193] [ndv4:13552:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465528.798892] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x13d4ce0 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.798915] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465528.798919] [ndv4:13552:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465528.799316] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.799381] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.799387] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.799434] [ndv4:13552:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465528.804068] [ndv4:14205:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465528.804076] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.804014] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[21]=0x3bd60a0 using rc_verbs/mlx5_ib3:1 on worker 0x264e8d0 | |
[1650465528.804103] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.804144] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.804507] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.805606] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.805614] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.805102] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.806120] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465528.806483] [ndv4:13443:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465528.806490] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.810473] [ndv4:13552:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.810482] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.809724] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.809731] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.809842] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x27d5280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465528.810304] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x27d5280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465528.810777] [ndv4:13552:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.810969] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.811248] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.810857] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x27d5280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465528.811627] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x27d5280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465528.811729] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x27d5280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465528.812090] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x27d5280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465528.812165] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x27d5280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465528.812175] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.812279] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.812286] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x26edc30 [id=96 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.812301] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 96 events 0x5 mode thread_spinlock | |
[1650465528.812323] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[15]=0x27d5280 using ud_mlx5/mlx5_ib1:1 on worker 0x1f938d0 | |
[1650465528.812711] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.813530] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.814199] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.814322] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.814658] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465528.815242] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.815261] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.815265] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.815320] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.815725] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465528.815742] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.815745] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.815798] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.815932] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3d24010 of 8176 bytes with 127 elements | |
[1650465528.815531] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465528.815540] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.816269] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.816276] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465528.816909] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x2dfe0a0: created RC QP 0xc052 on mlx5_ib2:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465528.816529] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465528.816543] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465528.816169] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.816178] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.816214] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465528.816218] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.816232] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x2b463b0 [id=106 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.816251] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 106 events 0x1 mode thread_spinlock | |
[1650465528.816269] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[22]=0x3d2b100 using rc_mlx5/mlx5_ib3:1 on worker 0x264e8d0 | |
[1650465528.816334] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.816970] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.817300] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.817461] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.819144] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[16]=0x2dfe0a0 using rc_verbs/mlx5_ib2:1 on worker 0x1f938d0 | |
[1650465528.819285] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.819364] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.820066] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.820226] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.820318] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465528.818826] [ndv4:14421:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.819245] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.819252] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.821231] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.821238] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.820788] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465528.822484] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.822498] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.822501] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.822556] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.823056] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x311f010 of 8176 bytes with 127 elements | |
[1650465528.822948] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.822967] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.822971] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.823028] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.824787] [ndv4:13552:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465528.824796] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.823549] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465528.823557] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.823631] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465528.823635] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.823645] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1d72f00 [id=108 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.823676] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 108 events 0x1 mode thread_spinlock | |
[1650465528.824340] [ndv4:13452:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465528.823364] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.823370] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.823406] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465528.823410] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.823422] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x16dc5f0 [id=99 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.823450] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 99 events 0x1 mode thread_spinlock | |
[1650465528.823477] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[17]=0x2f4b030 using rc_mlx5/mlx5_ib2:1 on worker 0x1f938d0 | |
[1650465528.823750] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.824390] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.824826] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.824863] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.826017] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.826026] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.827743] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.827751] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.831526] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465528.831535] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.831136] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.831143] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.832341] [ndv4:14421:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.832895] [ndv4:14421:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465528.833553] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.833559] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.834138] [ndv4:14421:0] async.c:228 UCX DEBUG added async handler 0xf2bcf0 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.834164] [ndv4:14421:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465528.834168] [ndv4:14421:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465528.834852] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.835777] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.835786] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.835846] [ndv4:14421:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465528.835281] [ndv4:12741:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.834632] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465528.836035] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.836050] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.836054] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.836112] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.836777] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.836785] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.836660] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465528.836666] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.836700] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465528.836704] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.836712] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x16bc4c0 [id=101 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.836732] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 101 events 0x1 mode thread_spinlock | |
[1650465528.837249] [ndv4:15226:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465528.836761] [ndv4:13452:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3efe010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xbf8c | |
[1650465528.837314] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.837488] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.837501] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ae65ffca008 of 151544 bytes with 1052 elements | |
[1650465528.837885] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465528.837891] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.839694] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.839705] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.842096] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae66c000000..0x2ae66e600000 on mlx5_ib3 lkey 0x80400 rkey 0x80400 access 0xf flags 0x3e4 | |
[1650465528.842117] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ae66c000018 of 39845864 bytes with 4752 elements | |
[1650465528.842261] [ndv4:13452:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3efe010 | |
[1650465528.842293] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[23]=0x3efe010 using dc_mlx5/mlx5_ib3:1 on worker 0x264e8d0 | |
[1650465528.842509] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.842753] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.843548] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.843712] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.842337] [ndv4:14421:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.842346] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.842702] [ndv4:14421:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.843556] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465528.843739] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465528.841748] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465528.841754] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.843751] [ndv4:14205:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.844080] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465528.845212] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.845633] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x40da060: created UD QP 0xbf90 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.846130] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.846144] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.846376] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.846798] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.846825] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.846914] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.847098] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.846486] [ndv4:12741:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.847021] [ndv4:12741:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465528.847557] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae66e631000..0x2ae66e6b6000 on mlx5_ib3 lkey 0x80500 rkey 0x80500 access 0xf flags 0x3e4 | |
[1650465528.847563] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae66e631018 of 544744 bytes with 128 elements | |
[1650465528.847567] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.847924] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x40da060: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465528.848083] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x40da060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465528.848405] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x40da060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465528.847396] [ndv4:14949:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.848636] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.848643] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.847938] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x26f6ce0 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.847965] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465528.847969] [ndv4:12741:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465528.848217] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.848255] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.848261] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.848302] [ndv4:12741:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465528.849631] [ndv4:12741:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.849637] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.848024] [ndv4:15226:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3121050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc07b | |
[1650465528.848737] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.849032] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.849044] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b178bf9a008 of 151544 bytes with 1052 elements | |
[1650465528.849803] [ndv4:12741:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.850025] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.850345] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.854433] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x40da060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465528.855118] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x40da060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465528.855089] [ndv4:14421:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465528.855097] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.854480] [ndv4:12741:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465528.854488] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.855888] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x40da060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465528.856352] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x40da060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465528.856010] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b1795600000..0x2b1797c00000 on mlx5_ib2 lkey 0x80b00 rkey 0x80b00 access 0xf flags 0x3e4 | |
[1650465528.856025] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b1795600018 of 39845864 bytes with 4752 elements | |
[1650465528.856194] [ndv4:15226:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3121050 | |
[1650465528.856219] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[18]=0x3121050 using dc_mlx5/mlx5_ib2:1 on worker 0x1f938d0 | |
[1650465528.856792] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.856905] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.857009] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.857251] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.858098] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.858107] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.857855] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.857865] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.859119] [ndv4:13443:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.862676] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465528.864125] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.864400] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x32fd060: created UD QP 0xc08c on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.863306] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.863314] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.863619] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.863626] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.862377] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.862384] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.864171] [ndv4:13552:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.861183] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x40da060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465528.861539] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.861551] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x40dae60 [id=109 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.861656] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 109 events 0x5 mode thread_spinlock | |
[1650465528.861685] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[24]=0x40da060 using ud_verbs/mlx5_ib3:1 on worker 0x264e8d0 | |
[1650465528.862379] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.862825] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.863207] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.863407] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.863982] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465528.863711] [ndv4:14205:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.864143] [ndv4:14205:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465528.865095] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.865026] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x198a720 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.865058] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465528.865063] [ndv4:14205:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465528.865778] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.865982] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.866182] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.866217] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.866676] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b1797ca1000..0x2b1797d26000 on mlx5_ib2 lkey 0x80c00 rkey 0x80c00 access 0xf flags 0x3e4 | |
[1650465528.866682] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b1797ca1018 of 544744 bytes with 128 elements | |
[1650465528.866686] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.865790] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465528.865982] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465528.865987] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.866034] [ndv4:14205:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465528.867641] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.867343] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.867349] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.867809] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x32fd060: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465528.868447] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x32fd060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465528.867888] [ndv4:14205:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.867895] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.868114] [ndv4:14205:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.868453] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465528.868715] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465528.869254] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x32fd060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465528.869024] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x33e6440: created UD QP 0xbf9e on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.869033] [ndv4:13452:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465528.870007] [ndv4:14949:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.870141] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.870560] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.870445] [ndv4:14949:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465528.870820] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.870978] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.871116] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.871516] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x970150 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.871552] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465528.871555] [ndv4:14949:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465528.871698] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae66e6b6000..0x2ae66e73b000 on mlx5_ib3 lkey 0x80600 rkey 0x80600 access 0xf flags 0x3e4 | |
[1650465528.871706] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae66e6b6018 of 544744 bytes with 128 elements | |
[1650465528.871711] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.871877] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33e6440: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465528.871964] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33e6440: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465528.872396] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33e6440: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465528.872534] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33e6440: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465528.872680] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33e6440: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465528.872892] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33e6440: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465528.871803] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.871942] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.871948] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.871999] [ndv4:14949:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465528.873091] [ndv4:14949:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.873099] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.871844] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.871852] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.874219] [ndv4:14949:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.873799] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33e6440: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465528.874413] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33e6440: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465528.874420] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.874450] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.874455] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x2b461e0 [id=110 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.874481] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 110 events 0x5 mode thread_spinlock | |
[1650465528.874495] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[25]=0x33e6440 using ud_mlx5/mlx5_ib3:1 on worker 0x264e8d0 | |
[1650465528.875547] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.875554] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.875039] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.875110] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.874808] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.875112] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.875502] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.875689] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.876321] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465528.878659] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465528.878679] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.878683] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.878736] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.877449] [ndv4:13552:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.877864] [ndv4:13552:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465528.877477] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x32fd060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465528.877866] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x32fd060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465528.878029] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x32fd060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465528.878530] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x32fd060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465528.879875] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.879881] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465528.880689] [ndv4:14205:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465528.880696] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.881921] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x42f3020: created RC QP 0xbf0b on mlx5_ib4:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465528.882002] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.882009] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.883768] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[26]=0x42f3020 using rc_verbs/mlx5_ib4:1 on worker 0x264e8d0 | |
[1650465528.883867] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.883988] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.884660] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.884745] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.885230] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465528.885967] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x32fd060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465528.886170] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x13cc2b0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.886202] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465528.886206] [ndv4:13552:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465528.886401] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.886409] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x287ae70 [id=102 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.886436] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 102 events 0x5 mode thread_spinlock | |
[1650465528.886449] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[19]=0x32fd060 using ud_verbs/mlx5_ib2:1 on worker 0x1f938d0 | |
[1650465528.886782] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.886876] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.887069] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.887472] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.886919] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.886927] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.886663] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.886780] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.886786] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.886830] [ndv4:13552:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465528.887908] [ndv4:13552:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.887915] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.887745] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.887774] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.887778] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.887834] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.888128] [ndv4:12741:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.888175] [ndv4:13552:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.888285] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.888362] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.888803] [ndv4:13443:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.889381] [ndv4:13443:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465528.890650] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4614010 of 8176 bytes with 127 elements | |
[1650465528.890964] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.890972] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.891009] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465528.891013] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.891025] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x3b03690 [id=113 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.891053] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 113 events 0x1 mode thread_spinlock | |
[1650465528.891064] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[27]=0x4440030 using rc_mlx5/mlx5_ib4:1 on worker 0x264e8d0 | |
[1650465528.891504] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.891741] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.892074] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.892354] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.890921] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465528.890929] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.890898] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x1d43280 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.890937] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465528.890942] [ndv4:13443:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465528.892125] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.892355] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.892366] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.892435] [ndv4:13443:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465528.892553] [ndv4:14949:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465528.892563] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.892556] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465528.893870] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.894165] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x341b050: created UD QP 0xc0ad on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.894173] [ndv4:15226:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465528.894416] [ndv4:13443:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.894425] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.894684] [ndv4:13443:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.894825] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.894880] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.895614] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465528.895622] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465528.895807] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465528.895813] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.894858] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.895455] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.895668] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.895826] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.896445] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.896884] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b1797d26000..0x2b1797dab000 on mlx5_ib2 lkey 0x80d00 rkey 0x80d00 access 0xf flags 0x3e4 | |
[1650465528.896890] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b1797d26018 of 544744 bytes with 128 elements | |
[1650465528.896895] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.897232] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x341b050: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465528.897536] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x341b050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465528.897679] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x341b050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465528.897746] [ndv4:14421:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.898117] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.898124] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.899919] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465528.899914] [ndv4:13443:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465528.899921] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.905294] [ndv4:13552:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465528.905305] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.904477] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x341b050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465528.904983] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x341b050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465528.905262] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x341b050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465528.905404] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465528.905411] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.908706] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.908715] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.909434] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.909441] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.910452] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x341b050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465528.910513] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465528.910520] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465528.912112] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.912131] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.912135] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.912192] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.912316] [ndv4:12741:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.912468] [ndv4:14205:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.912643] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465528.912650] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.912684] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465528.912688] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.912697] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x4448f60 [id=115 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.912720] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 115 events 0x1 mode thread_spinlock | |
[1650465528.912936] [ndv4:12741:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465528.913365] [ndv4:13452:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465528.913456] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.913462] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.913617] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x26ee2b0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.913655] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465528.913658] [ndv4:12741:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465528.913859] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.913865] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.913933] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.914541] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.914546] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.914642] [ndv4:12741:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465528.914793] [ndv4:14421:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.915253] [ndv4:14421:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465528.915812] [ndv4:14421:0] async.c:228 UCX DEBUG added async handler 0xf33cb0 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.915836] [ndv4:14421:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465528.915839] [ndv4:14421:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465528.916420] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.916555] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.916560] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.916675] [ndv4:14421:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465528.917864] [ndv4:14421:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.917871] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.918110] [ndv4:14421:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.918224] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465528.918296] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465528.918892] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x341b050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465528.918902] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.918936] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.918941] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2e06c60 [id=103 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.918965] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 103 events 0x5 mode thread_spinlock | |
[1650465528.918976] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[20]=0x341b050 using ud_mlx5/mlx5_ib2:1 on worker 0x1f938d0 | |
[1650465528.919195] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.919383] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.919808] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.919888] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.921262] [ndv4:12741:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.921270] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.921473] [ndv4:12741:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.920495] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465528.922127] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465528.922143] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.922146] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.922198] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.922672] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.922678] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465528.923414] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.923422] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.923504] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.923511] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.923175] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x351b0a0: created RC QP 0xbfc6 on mlx5_ib3:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465528.924279] [ndv4:13452:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4616050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xbf64 | |
[1650465528.925148] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.925220] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.925229] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ae670f3c008 of 151544 bytes with 1052 elements | |
[1650465528.924438] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[21]=0x351b0a0 using rc_verbs/mlx5_ib3:1 on worker 0x1f938d0 | |
[1650465528.924787] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.924952] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.925182] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.925205] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.925512] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465528.924815] [ndv4:14205:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.925190] [ndv4:14205:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465528.926318] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x2070cd0 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.926349] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465528.926352] [ndv4:14205:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465528.924315] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.924326] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.926238] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.926314] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.927151] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.927158] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.928876] [ndv4:13552:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.928774] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.928784] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.928787] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.928841] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.929206] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3669010 of 8176 bytes with 127 elements | |
[1650465528.929049] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae66e800000..0x2ae670e00000 on mlx5_ib4 lkey 0x80800 rkey 0x80800 access 0xf flags 0x3e4 | |
[1650465528.929070] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ae66e800018 of 39845864 bytes with 4752 elements | |
[1650465528.929209] [ndv4:13452:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4616050 | |
[1650465528.929246] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[28]=0x4616050 using dc_mlx5/mlx5_ib4:1 on worker 0x264e8d0 | |
[1650465528.929947] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.930074] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.930240] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.930435] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.929421] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.929427] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.929460] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465528.929464] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.929473] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x248b3b0 [id=106 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.929499] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 106 events 0x1 mode thread_spinlock | |
[1650465528.929509] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[22]=0x3670100 using rc_mlx5/mlx5_ib3:1 on worker 0x1f938d0 | |
[1650465528.929679] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.929757] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.929934] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.930061] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.930979] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465528.932045] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465528.932108] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465528.932114] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.932161] [ndv4:14205:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465528.932177] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.932538] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x47f21f0: created UD QP 0xbf91 on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.933240] [ndv4:14205:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.933247] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.933509] [ndv4:14205:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.933379] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.933030] [ndv4:14421:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465528.933040] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.934680] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465528.934736] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465528.933865] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.934017] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.934132] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.934716] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.935324] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae670f61000..0x2ae670fe6000 on mlx5_ib4 lkey 0x80900 rkey 0x80900 access 0xf flags 0x3e4 | |
[1650465528.935329] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae670f61018 of 544744 bytes with 128 elements | |
[1650465528.935334] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.935704] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x47f21f0: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465528.935697] [ndv4:12741:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465528.935705] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.934871] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.934879] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.936674] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.936680] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.940563] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.940572] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.939486] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x47f21f0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465528.938789] [ndv4:14205:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465528.938795] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465528.940858] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465528.941985] [ndv4:13443:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.942183] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.942191] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.942775] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.942782] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.943124] [ndv4:13552:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.943664] [ndv4:13552:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465528.944322] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x1ab5bf0 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.944349] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465528.944352] [ndv4:13552:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465528.944151] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.944159] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.944781] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.944853] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.944859] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.944904] [ndv4:13552:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465528.945669] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x47f21f0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465528.946062] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x47f21f0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465528.946171] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x47f21f0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465528.946305] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x47f21f0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465528.946480] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x47f21f0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465528.946704] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x47f21f0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465528.945866] [ndv4:14949:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.947028] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.947036] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x42cea70 [id=116 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.947070] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 116 events 0x5 mode thread_spinlock | |
[1650465528.947083] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[29]=0x47f21f0 using ud_verbs/mlx5_ib4:1 on worker 0x264e8d0 | |
[1650465528.947326] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.947412] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.947770] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.948052] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.948297] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465528.949287] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465528.949296] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465528.949642] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.951100] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x4910050: created UD QP 0xbfba on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.951108] [ndv4:13452:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465528.951880] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.951899] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.951904] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.951960] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.952170] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.953278] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.953479] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.953777] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.953875] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.954242] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae670fe6000..0x2ae67106b000 on mlx5_ib4 lkey 0x80a00 rkey 0x80a00 access 0xf flags 0x3e4 | |
[1650465528.954247] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae670fe6018 of 544744 bytes with 128 elements | |
[1650465528.954252] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.955145] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4910050: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465528.956383] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.956393] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.955795] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4910050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465528.954754] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465528.954764] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.954793] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465528.954797] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.954806] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x16b7f00 [id=108 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.954834] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 108 events 0x1 mode thread_spinlock | |
[1650465528.955626] [ndv4:15226:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465528.955491] [ndv4:13552:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.955501] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.955733] [ndv4:13552:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.956092] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465528.956656] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465528.960928] [ndv4:14949:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.961318] [ndv4:14949:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465528.963983] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465528.963992] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465528.964416] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.964425] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.964170] [ndv4:15226:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3843010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xbff4 | |
[1650465528.964691] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4910050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465528.968876] [ndv4:13443:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465528.969405] [ndv4:13443:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465528.969880] [ndv4:13552:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465528.969888] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.970416] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.970424] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.970363] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x1d41f90 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.970395] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465528.970399] [ndv4:13443:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465528.971446] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x970710 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465528.971480] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465528.971484] [ndv4:14949:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465528.971523] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.971769] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.971779] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.971845] [ndv4:13443:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465528.972072] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.972265] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.972271] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465528.972336] [ndv4:14949:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465528.972012] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.972555] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.972566] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b178bfc1008 of 151544 bytes with 1052 elements | |
[1650465528.972330] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4910050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465528.972741] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4910050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465528.972911] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4910050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465528.973199] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4910050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465528.973448] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4910050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465528.973455] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.973488] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465528.973493] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x42ce410 [id=117 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465528.973516] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 117 events 0x5 mode thread_spinlock | |
[1650465528.973528] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[30]=0x4910050 using ud_mlx5/mlx5_ib4:1 on worker 0x264e8d0 | |
[1650465528.973681] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.973754] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.974223] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.974251] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.973984] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465528.973991] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465528.975040] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465528.974016] [ndv4:14949:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.974023] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.974271] [ndv4:14949:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.974783] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.974999] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.975740] [ndv4:12741:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.976208] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465528.976224] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.976227] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.976287] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.976547] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465528.976557] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465528.977452] [ndv4:14421:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465528.976567] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b1797e00000..0x2b179a400000 on mlx5_ib3 lkey 0x80700 rkey 0x80700 access 0xf flags 0x3e4 | |
[1650465528.976633] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b1797e00018 of 39845864 bytes with 4752 elements | |
[1650465528.976775] [ndv4:15226:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3843010 | |
[1650465528.976812] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[23]=0x3843010 using dc_mlx5/mlx5_ib3:1 on worker 0x1f938d0 | |
[1650465528.977257] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.977392] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.977437] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.977443] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465528.978187] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x4a100a0: created RC QP 0xbde6 on mlx5_ib5:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465528.980935] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[31]=0x4a100a0 using rc_verbs/mlx5_ib5:1 on worker 0x264e8d0 | |
[1650465528.981267] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.981442] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.981886] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.982186] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.984641] [ndv4:14949:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465528.984650] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.985548] [ndv4:13443:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465528.985560] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465528.984344] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.984377] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.984837] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465528.985750] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.985759] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.986256] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465528.986572] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x3a1f060: created UD QP 0xc009 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465528.987444] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465528.988027] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.988434] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.988696] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465528.988826] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465528.986667] [ndv4:13443:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465528.986778] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.987032] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.988744] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465528.988750] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.989426] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465528.989438] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465528.990758] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b179a5ac000..0x2b179a631000 on mlx5_ib3 lkey 0x80800 rkey 0x80800 access 0xf flags 0x3e4 | |
[1650465528.990772] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b179a5ac018 of 544744 bytes with 128 elements | |
[1650465528.990781] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465528.991005] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x3a1f060: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465528.991443] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x3a1f060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465528.990902] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465528.991377] [ndv4:13443:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465528.991384] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465528.997102] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465528.997111] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465528.997982] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x3a1f060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465528.998236] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x3a1f060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465528.998285] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465528.998303] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465528.998307] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465528.998357] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465528.998914] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x3a1f060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465528.999291] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x3a1f060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465528.998864] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4d31010 of 8176 bytes with 127 elements | |
[1650465528.999104] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465528.999112] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465528.999152] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465528.999156] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465528.999169] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x3d23e10 [id=120 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465528.999201] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 120 events 0x1 mode thread_spinlock | |
[1650465528.999213] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[32]=0x4b5d030 using rc_mlx5/mlx5_ib5:1 on worker 0x264e8d0 | |
[1650465528.999289] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465528.999545] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465528.999286] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465528.999294] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.000090] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x3a1f060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465528.999852] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.000264] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.000524] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.002038] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.002052] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.002055] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.002110] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.002555] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.002561] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.002641] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465529.002645] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.002653] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x3d23f80 [id=122 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.002681] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 122 events 0x1 mode thread_spinlock | |
[1650465529.003307] [ndv4:13452:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.003641] [ndv4:14205:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.003648] [ndv4:14205:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.004181] [ndv4:14205:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x1973dc0 0x1973dc0 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465529.007651] [ndv4:12741:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.007109] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.007118] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.008236] [ndv4:12741:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465529.008953] [ndv4:14421:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.009644] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2dd7bf0 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.009674] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465529.009678] [ndv4:12741:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465529.009658] [ndv4:14421:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465529.010645] [ndv4:14421:0] async.c:228 UCX DEBUG added async handler 0xf2b280 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.010679] [ndv4:14421:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465529.010683] [ndv4:14421:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465529.011724] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.011797] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.011806] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.011873] [ndv4:14421:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465529.011889] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x3a1f060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465529.012304] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.012316] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x3a1fe60 [id=109 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.012355] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 109 events 0x5 mode thread_spinlock | |
[1650465529.012378] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[24]=0x3a1f060 using ud_verbs/mlx5_ib3:1 on worker 0x1f938d0 | |
[1650465529.012742] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.012978] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.012238] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465529.012250] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465529.013690] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.014409] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.013843] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.013851] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.014020] [ndv4:14421:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.014028] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.014246] [ndv4:14421:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.013801] [ndv4:13452:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4d33050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xbe32 | |
[1650465529.014871] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.014878] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.015108] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.015121] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ae67386c008 of 151544 bytes with 1052 elements | |
[1650465529.015910] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465529.015917] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465529.017244] [ndv4:13552:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.018138] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.018546] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.018552] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.018679] [ndv4:12741:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465529.018858] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae671200000..0x2ae673800000 on mlx5_ib5 lkey 0x80600 rkey 0x80600 access 0xf flags 0x3e4 | |
[1650465529.018882] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ae671200018 of 39845864 bytes with 4752 elements | |
[1650465529.019027] [ndv4:13452:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4d33050 | |
[1650465529.019061] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[33]=0x4d33050 using dc_mlx5/mlx5_ib5:1 on worker 0x264e8d0 | |
[1650465529.019767] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.019822] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.020568] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.020703] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.021184] [ndv4:12741:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.021192] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.022116] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.022355] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.022034] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.022043] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.021927] [ndv4:12741:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.022264] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.022360] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.022899] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.025079] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.025766] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.025887] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x4f0f060: created UD QP 0xbe5c on mlx5_ib5:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.026163] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x2d2b440: created UD QP 0xc032 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.026170] [ndv4:15226:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.026690] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.027211] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.028107] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.028489] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.027121] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.027128] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.028930] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.029312] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.029633] [ndv4:12741:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465529.029639] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465529.029793] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b179a631000..0x2b179a6b6000 on mlx5_ib3 lkey 0x80900 rkey 0x80900 access 0xf flags 0x3e4 | |
[1650465529.029800] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b179a631018 of 544744 bytes with 128 elements | |
[1650465529.029804] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.030261] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2d2b440: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465529.034684] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.034691] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.035959] [ndv4:13552:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.036443] [ndv4:13552:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465529.036707] [ndv4:13443:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.035816] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.035827] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.036895] [ndv4:14949:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.037371] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x13d0710 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.037402] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465529.037405] [ndv4:13552:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465529.037942] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.037982] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.037987] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.038043] [ndv4:13552:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465529.038958] [ndv4:13552:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.038966] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.039193] [ndv4:13552:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.039682] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.039802] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.041259] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.041400] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.041489] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.041880] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.042371] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae673891000..0x2ae673916000 on mlx5_ib5 lkey 0x80700 rkey 0x80700 access 0xf flags 0x3e4 | |
[1650465529.042377] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae673891018 of 544744 bytes with 128 elements | |
[1650465529.042381] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.043030] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4f0f060: adding gid fe80::15:5dff:fd34:0 to hash on device mlx5_ib5 port 1 index 0) | |
[1650465529.043508] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4f0f060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 1) | |
[1650465529.043110] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2d2b440: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465529.043335] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2d2b440: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465529.044207] [ndv4:13552:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465529.044214] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.048756] [ndv4:14421:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465529.048765] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465529.051177] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4f0f060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 2) | |
[1650465529.051420] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4f0f060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 3) | |
[1650465529.051163] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2d2b440: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465529.051754] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2d2b440: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465529.052067] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465529.052074] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465529.051863] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4f0f060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 4) | |
[1650465529.052987] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2d2b440: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465529.053251] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2d2b440: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465529.052915] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4f0f060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 5) | |
[1650465529.053248] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4f0f060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 6) | |
[1650465529.053562] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x4f0f060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 7) | |
[1650465529.053146] [ndv4:13443:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.053546] [ndv4:13443:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465529.053979] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.053989] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x49eb970 [id=123 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.054017] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 123 events 0x5 mode thread_spinlock | |
[1650465529.054030] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[34]=0x4f0f060 using ud_verbs/mlx5_ib5:1 on worker 0x264e8d0 | |
[1650465529.054297] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.054818] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.054992] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.055550] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.055005] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x1d41f40 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.055033] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465529.055037] [ndv4:13443:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465529.055643] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.056038] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.056044] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.056092] [ndv4:13443:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465529.055831] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.055841] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.056298] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.057167] [ndv4:13443:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.057175] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.057435] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.057420] [ndv4:13443:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.057622] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.058069] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.060293] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x502d460: created UD QP 0xbe9f on mlx5_ib5:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.060301] [ndv4:13452:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.061436] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465529.061445] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465529.061070] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2d2b440: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465529.061082] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.061123] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.061128] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x248b1e0 [id=110 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.061153] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 110 events 0x5 mode thread_spinlock | |
[1650465529.061167] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[25]=0x2d2b440 using ud_mlx5/mlx5_ib3:1 on worker 0x1f938d0 | |
[1650465529.061373] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.061461] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.061907] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.062700] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.062787] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.062836] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.062869] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.063316] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae673916000..0x2ae67399b000 on mlx5_ib5 lkey 0x80800 rkey 0x80800 access 0xf flags 0x3e4 | |
[1650465529.063322] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae673916018 of 544744 bytes with 128 elements | |
[1650465529.063326] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.062026] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.062511] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.062938] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.065001] [ndv4:13443:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465529.065008] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.065491] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.065506] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.065510] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.065560] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.064749] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x502d460: adding gid fe80::15:5dff:fd34:0 to hash on device mlx5_ib5 port 1 index 0) | |
[1650465529.065291] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x502d460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 1) | |
[1650465529.065493] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x502d460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 2) | |
[1650465529.065813] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x502d460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 3) | |
[1650465529.065955] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x502d460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 4) | |
[1650465529.066531] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x502d460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 5) | |
[1650465529.066635] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.066642] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.066927] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x502d460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 6) | |
[1650465529.067169] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x502d460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 7) | |
[1650465529.067177] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.067210] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.067215] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x502df60 [id=124 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.067248] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 124 events 0x5 mode thread_spinlock | |
[1650465529.067259] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[35]=0x502d460 using ud_mlx5/mlx5_ib5:1 on worker 0x264e8d0 | |
[1650465529.067818] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.067912] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.067214] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x3c38020: created RC QP 0xc025 on mlx5_ib4:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.068007] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465529.068014] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465529.068682] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[26]=0x3c38020 using rc_verbs/mlx5_ib4:1 on worker 0x1f938d0 | |
[1650465529.068876] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.069361] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.070181] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.070188] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.074960] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.075055] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.075867] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.076549] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465529.076556] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465529.076391] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.076421] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.077097] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.077651] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.077669] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.077672] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.077727] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.077284] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.077292] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.078624] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465529.078631] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465529.079083] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.079097] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.079101] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.079155] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.079569] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3f59010 of 8176 bytes with 127 elements | |
[1650465529.078797] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.078803] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.079313] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x512d0a0: created RC QP 0xbdaa on mlx5_ib6:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.080171] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[36]=0x512d0a0 using rc_verbs/mlx5_ib6:1 on worker 0x264e8d0 | |
[1650465529.080319] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.080636] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.081106] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.082342] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.079949] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.079956] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.079990] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465529.079994] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.080005] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x3448690 [id=113 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.080039] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 113 events 0x1 mode thread_spinlock | |
[1650465529.080049] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[27]=0x3d85030 using rc_mlx5/mlx5_ib4:1 on worker 0x1f938d0 | |
[1650465529.080272] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.080491] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.080908] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.081327] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.082560] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.082630] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.082327] [ndv4:14949:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.083802] [ndv4:14949:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465529.083595] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.083601] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.085073] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465529.085080] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465529.087689] [ndv4:14421:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.087398] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.088413] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.088420] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.088335] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.089833] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.089850] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.089853] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.089926] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.090029] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.090058] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.090063] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.090122] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.090482] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.090490] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.090524] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465529.090527] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.090536] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x3d8df60 [id=115 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.090566] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 115 events 0x1 mode thread_spinlock | |
[1650465529.090871] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x544e010 of 8176 bytes with 127 elements | |
[1650465529.090839] [ndv4:13552:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.091738] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x10532d0 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.091774] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465529.091778] [ndv4:14949:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465529.092256] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.092410] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.092416] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.092469] [ndv4:14949:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465529.091182] [ndv4:15226:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.091198] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.091209] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.091251] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib6 length=2048) failed: Invalid argument | |
[1650465529.091255] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.091270] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x5282c00 [id=127 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.091300] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 127 events 0x1 mode thread_spinlock | |
[1650465529.091318] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[37]=0x527a030 using rc_mlx5/mlx5_ib6:1 on worker 0x264e8d0 | |
[1650465529.091652] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.091808] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.092274] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.092775] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.093834] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.094181] [ndv4:14205:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2afda3df3000 length 12288 | |
[1650465529.094273] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465529.095016] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.095033] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.095036] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.095093] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.095528] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.095535] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.095571] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib6 length=2048) failed: Invalid argument | |
[1650465529.095625] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.095633] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x49eb330 [id=129 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.095660] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 129 events 0x1 mode thread_spinlock | |
[1650465529.096292] [ndv4:13452:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.096113] [ndv4:14205:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465529.096122] [ndv4:14205:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2afda9c64000 length 4296704 | |
[1650465529.096128] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2afda9c64018 of 4296680 bytes with 512 elements | |
[1650465529.096382] [ndv4:14205:0] mm_iface.c:600 UCX DEBUG created mm iface 0x20f3b20 FIFO id 0x4000000046425bb9 va 0x2afda3df3000 size 12288 (128 x 64 elems) | |
[1650465529.096457] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x20f3b20 using posix/memory on worker 0x29ca770 | |
[1650465529.096482] [ndv4:14205:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465529.096525] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465529.096542] [ndv4:14205:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465529.096551] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2afdaa07d018 of 4296680 bytes with 512 elements | |
[1650465529.097204] [ndv4:14205:0] mm_iface.c:600 UCX DEBUG created mm iface 0x2105870 FIFO id 0x64802a va 0x2afda3df6000 size 12288 (128 x 64 elems) | |
[1650465529.097215] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x2105870 using sysv/memory on worker 0x29ca770 | |
[1650465529.097227] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465529.097230] [ndv4:14205:0] self.c:220 UCX DEBUG created self iface id 0xb55d03b1f3198649 send_size 8192 | |
[1650465529.097236] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x2104f10 using self/memory0 on worker 0x29ca770 | |
[1650465529.097259] [ndv4:14205:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.097264] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.097267] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.100141] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x2112bf0 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.100184] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465529.100203] [ndv4:14205:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x2105e40: listening for connections (fd=78) on 10.5.0.5:35403 | |
[1650465529.100509] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x2105e40 using tcp/eth0 on worker 0x29ca770 | |
[1650465529.100529] [ndv4:14205:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.100532] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.100535] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.100664] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x2061700 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.100687] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465529.100691] [ndv4:14205:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x20ed5c0: listening for connections (fd=80) on 127.0.0.1:52878 | |
[1650465529.100709] [ndv4:14205:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.100715] [ndv4:14205:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.100823] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x20ed5c0 using tcp/lo on worker 0x29ca770 | |
[1650465529.100842] [ndv4:14205:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.100845] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.100848] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.100886] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x2060fe0 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.100921] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465529.100925] [ndv4:14205:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x20edc40: listening for connections (fd=82) on 172.16.1.242:55307 | |
[1650465529.101334] [ndv4:14949:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.101343] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.101546] [ndv4:14949:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.101783] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.102051] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.101288] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x20edc40 using tcp/ib0 on worker 0x29ca770 | |
[1650465529.101435] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.101520] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.101778] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.101873] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.101967] [ndv4:15226:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3f5b050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc061 | |
[1650465529.102810] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.102831] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.102842] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b179ceb7008 of 151544 bytes with 1052 elements | |
[1650465529.102815] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.102809] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.102819] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.104132] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.104169] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.104172] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.104185] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.104843] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.104852] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.106058] [ndv4:13452:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x5450050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xbdd7 | |
[1650465529.106571] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.106882] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.106896] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ae67619c008 of 151544 bytes with 1052 elements | |
[1650465529.106892] [ndv4:14949:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465529.106899] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.106070] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x20fb6c0: created RC QP 0xde53 on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.106751] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b179a800000..0x2b179ce00000 on mlx5_ib4 lkey 0x80b00 rkey 0x80b00 access 0xf flags 0x3e4 | |
[1650465529.106765] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b179a800018 of 39845864 bytes with 4752 elements | |
[1650465529.106902] [ndv4:15226:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3f5b050 | |
[1650465529.106937] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[28]=0x3f5b050 using dc_mlx5/mlx5_ib4:1 on worker 0x1f938d0 | |
[1650465529.107298] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.107393] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.107660] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.107790] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.108801] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.110738] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.111111] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae673a00000..0x2ae676000000 on mlx5_ib6 lkey 0x80400 rkey 0x80400 access 0xf flags 0x3e4 | |
[1650465529.111127] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ae673a00018 of 39845864 bytes with 4752 elements | |
[1650465529.111272] [ndv4:13452:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x5450050 | |
[1650465529.111305] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[38]=0x5450050 using dc_mlx5/mlx5_ib6:1 on worker 0x264e8d0 | |
[1650465529.112161] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x41371f0: created UD QP 0xc07d on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.112170] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.112283] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.112283] [ndv4:13552:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.112922] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.112969] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.112980] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.113069] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.113078] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.112894] [ndv4:13552:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465529.113756] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b179cedc000..0x2b179cf61000 on mlx5_ib4 lkey 0x80c00 rkey 0x80c00 access 0xf flags 0x3e4 | |
[1650465529.113763] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b179cedc018 of 544744 bytes with 128 elements | |
[1650465529.113767] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.114220] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x41371f0: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465529.114528] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.114536] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.114748] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x13d0560 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.114782] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465529.114786] [ndv4:13552:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465529.115653] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465529.115667] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465529.115802] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.116041] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.116047] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.116105] [ndv4:13552:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465529.114804] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x41371f0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465529.117038] [ndv4:13443:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.116990] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x20fb6c0 using rc_verbs/mlx5_ib0:1 on worker 0x29ca770 | |
[1650465529.117127] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.117684] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.117916] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.118346] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.118671] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.121109] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.121216] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.121735] [ndv4:14421:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.122301] [ndv4:14421:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465529.121811] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.122967] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x41371f0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465529.123307] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x41371f0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465529.122983] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.123354] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x562c2e0: created UD QP 0xbde1 on mlx5_ib6:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.123310] [ndv4:14421:0] async.c:228 UCX DEBUG added async handler 0x1617280 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.123337] [ndv4:14421:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465529.123341] [ndv4:14421:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465529.124051] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.124489] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.124770] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.125036] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.125181] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.124411] [ndv4:13552:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.124420] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.124695] [ndv4:13552:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.125016] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.125180] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.123890] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x41371f0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465529.124827] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x41371f0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465529.125150] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x41371f0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465529.123835] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.124410] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.124415] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.124461] [ndv4:14421:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465529.125777] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae6761c1000..0x2ae676246000 on mlx5_ib6 lkey 0x80500 rkey 0x80500 access 0xf flags 0x3e4 | |
[1650465529.125782] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae6761c1018 of 544744 bytes with 128 elements | |
[1650465529.125787] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.125847] [ndv4:14421:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.125854] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.126131] [ndv4:14421:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.126382] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.126791] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.127901] [ndv4:14205:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465529.129054] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.129063] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.129066] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.129119] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.129482] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465529.129490] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465529.130173] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x2a94010 of 8176 bytes with 127 elements | |
[1650465529.130801] [ndv4:14421:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465529.130807] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465529.130526] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.130553] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.130658] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465529.130665] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.131557] [ndv4:13443:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.131922] [ndv4:13443:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465529.132520] [ndv4:12741:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.132540] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x1d43820 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.132671] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465529.132677] [ndv4:13443:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465529.132988] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.133131] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.133137] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.133197] [ndv4:13443:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465529.138435] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x205f380 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.138468] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465529.138488] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x2109110 using rc_mlx5/mlx5_ib0:1 on worker 0x29ca770 | |
[1650465529.139346] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.139528] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.139834] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.139977] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.140360] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.139973] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.139980] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.141684] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.141693] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.141696] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.141709] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.142138] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.142145] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.142177] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465529.142181] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.142190] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x205fe00 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.142217] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465529.142673] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x41371f0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465529.143651] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.143664] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x3c13a70 [id=116 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.143695] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 116 events 0x5 mode thread_spinlock | |
[1650465529.143714] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[29]=0x41371f0 using ud_verbs/mlx5_ib4:1 on worker 0x1f938d0 | |
[1650465529.143912] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.144111] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.144446] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.144538] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.142747] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x562c2e0: adding gid fe80::15:5dff:fd34:1 to hash on device mlx5_ib6 port 1 index 0) | |
[1650465529.143124] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x562c2e0: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 1) | |
[1650465529.143376] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x562c2e0: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 2) | |
[1650465529.143520] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x562c2e0: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 3) | |
[1650465529.143785] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x562c2e0: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 4) | |
[1650465529.143918] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x562c2e0: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 5) | |
[1650465529.144124] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x562c2e0: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 6) | |
[1650465529.144453] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x562c2e0: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 7) | |
[1650465529.142987] [ndv4:14205:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.144789] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.144797] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x51088d0 [id=130 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.144824] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 130 events 0x5 mode thread_spinlock | |
[1650465529.144837] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[39]=0x562c2e0 using ud_verbs/mlx5_ib6:1 on worker 0x264e8d0 | |
[1650465529.145065] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.145075] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.145203] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.145661] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.145444] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.146516] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.146556] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.146955] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x4255050: created UD QP 0xc0c7 on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.146963] [ndv4:15226:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.148656] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.149041] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x574a460: created UD QP 0xbe1d on mlx5_ib6:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.149048] [ndv4:13452:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.149953] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.147840] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.148688] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.149079] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.149523] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.150246] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.148943] [ndv4:13443:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.148953] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.149206] [ndv4:13443:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.149456] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.150247] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.150937] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.151379] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.151453] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.151670] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.150730] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b179cf61000..0x2b179cfe6000 on mlx5_ib4 lkey 0x80d00 rkey 0x80d00 access 0xf flags 0x3e4 | |
[1650465529.150736] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b179cf61018 of 544744 bytes with 128 elements | |
[1650465529.150740] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.152271] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae676246000..0x2ae6762cb000 on mlx5_ib6 lkey 0x80600 rkey 0x80600 access 0xf flags 0x3e4 | |
[1650465529.152279] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae676246018 of 544744 bytes with 128 elements | |
[1650465529.152285] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.152999] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x574a460: adding gid fe80::15:5dff:fd34:1 to hash on device mlx5_ib6 port 1 index 0) | |
[1650465529.153035] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.153042] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.152235] [ndv4:14205:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x2cfe010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xde76 | |
[1650465529.153941] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.153999] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.154023] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2afda3dfb008 of 151544 bytes with 1052 elements | |
[1650465529.155021] [ndv4:12741:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.155674] [ndv4:12741:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465529.157067] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x26f2710 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.157099] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465529.157103] [ndv4:12741:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465529.157404] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.157970] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.157976] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.158047] [ndv4:12741:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465529.158054] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdaa600000..0x2afdacc00000 on mlx5_ib0 lkey 0x80c00 rkey 0x80c00 access 0xf flags 0x3e4 | |
[1650465529.158074] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2afdaa600018 of 39845864 bytes with 4752 elements | |
[1650465529.158213] [ndv4:14205:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x2cfe010 | |
[1650465529.158249] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x2cfe010 using dc_mlx5/mlx5_ib0:1 on worker 0x29ca770 | |
[1650465529.158381] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.158716] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.158957] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.159051] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.159723] [ndv4:12741:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.159729] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.158652] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x574a460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 1) | |
[1650465529.159253] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x574a460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 2) | |
[1650465529.158221] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.158228] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.160012] [ndv4:12741:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.160273] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.161021] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.162960] [ndv4:13443:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465529.162968] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.164570] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x574a460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 3) | |
[1650465529.165327] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x574a460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 4) | |
[1650465529.164885] [ndv4:12741:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465529.164892] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.163273] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465529.163283] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465529.164751] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.166302] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.166309] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.168556] [ndv4:14949:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.168887] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4255050: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465529.169421] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4255050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465529.170363] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4255050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465529.171160] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4255050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465529.171283] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4255050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465529.172172] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.172397] [ndv4:13552:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465529.172407] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.172704] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x29ea5f0: created UD QP 0xde5c on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.172468] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4255050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465529.173399] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.173923] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.174567] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.173303] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4255050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465529.175106] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.175190] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.176096] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.176106] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.176458] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x574a460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 5) | |
[1650465529.176737] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x574a460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 6) | |
[1650465529.176320] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465529.176329] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465529.175755] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afda3e20000..0x2afda3ea5000 on mlx5_ib0 lkey 0x80d00 rkey 0x80d00 access 0xf flags 0x3e4 | |
[1650465529.175763] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afda3e20018 of 544744 bytes with 128 elements | |
[1650465529.175767] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.176207] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x29ea5f0: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465529.176726] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x29ea5f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465529.177175] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x29ea5f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465529.177501] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x29ea5f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465529.178089] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x29ea5f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465529.178190] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x29ea5f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465529.178460] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x29ea5f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465529.179347] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4255050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465529.179359] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.179407] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.179412] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x3c13410 [id=117 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.179438] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 117 events 0x5 mode thread_spinlock | |
[1650465529.179453] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[30]=0x4255050 using ud_mlx5/mlx5_ib4:1 on worker 0x1f938d0 | |
[1650465529.179748] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.180141] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.180773] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.180838] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.181200] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465529.181207] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465529.181293] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.181303] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.181366] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.182060] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x574a460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 7) | |
[1650465529.182070] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.182166] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.182172] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x5108a40 [id=131 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.182205] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 131 events 0x5 mode thread_spinlock | |
[1650465529.182218] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[40]=0x574a460 using ud_mlx5/mlx5_ib6:1 on worker 0x264e8d0 | |
[1650465529.182518] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.182536] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.182540] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.182638] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.183031] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.183364] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.183790] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.183868] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.184301] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465529.183194] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.183203] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.183715] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x43550a0: created RC QP 0xbfb1 on mlx5_ib5:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.185497] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.185512] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.185515] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.185566] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.185792] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[31]=0x43550a0 using rc_verbs/mlx5_ib5:1 on worker 0x1f938d0 | |
[1650465529.185984] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.186216] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.185096] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465529.185161] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465529.187171] [ndv4:14421:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.187837] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x29ea5f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465529.188442] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.188451] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.188255] [ndv4:14205:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.189046] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x584a0a0: created RC QP 0xbd08 on mlx5_ib7:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.190127] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[41]=0x584a0a0 using rc_verbs/mlx5_ib7:1 on worker 0x264e8d0 | |
[1650465529.190415] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.190663] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.190904] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.191249] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.194301] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.194311] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.194408] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x28fc370 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.194447] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 mode thread_spinlock | |
[1650465529.194472] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x29ea5f0 using ud_verbs/mlx5_ib0:1 on worker 0x29ca770 | |
[1650465529.194823] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.195159] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.195543] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.195648] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.195250] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.195261] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.195699] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.196816] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.197709] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.199390] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465529.199963] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.200527] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x2111a80: created UD QP 0xde5d on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.200542] [ndv4:14205:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.203292] [ndv4:14421:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.203187] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.203485] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.203493] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.203865] [ndv4:14421:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465529.204261] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.204293] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.204296] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.204314] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.204318] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.204372] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.204738] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.204967] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.204987] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x5b6b010 of 8176 bytes with 127 elements | |
[1650465529.205203] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.205004] [ndv4:14421:0] async.c:228 UCX DEBUG added async handler 0xf2f710 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.205035] [ndv4:14421:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465529.205039] [ndv4:14421:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465529.205696] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.205882] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.205887] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.205936] [ndv4:14421:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465529.207055] [ndv4:14421:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.207062] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.207318] [ndv4:14421:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.207638] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.208263] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.205648] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afda3ea5000..0x2afda3f2a000 on mlx5_ib0 lkey 0x80e00 rkey 0x80e00 access 0xf flags 0x3e4 | |
[1650465529.205656] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afda3ea5018 of 544744 bytes with 128 elements | |
[1650465529.205660] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.206636] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.206646] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib7:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.206687] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib7 length=2048) failed: Invalid argument | |
[1650465529.206692] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.206706] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x57778f0 [id=134 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.206738] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 134 events 0x1 mode thread_spinlock | |
[1650465529.206753] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[42]=0x5997030 using rc_mlx5/mlx5_ib7:1 on worker 0x264e8d0 | |
[1650465529.207042] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.207468] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.207964] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.208286] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.206146] [ndv4:14949:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.206552] [ndv4:14949:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465529.207815] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x95cfa0 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.207840] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465529.207843] [ndv4:14949:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465529.208355] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.208390] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.208395] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.208442] [ndv4:14949:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465529.205297] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.205307] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.208999] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465529.209388] [ndv4:14949:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.209395] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.209609] [ndv4:14949:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.210274] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.210291] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.210295] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.210351] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.210832] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.210841] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib7:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.210877] [ndv4:13452:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib7 length=2048) failed: Invalid argument | |
[1650465529.210880] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.210889] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1d97df0 [id=136 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.210916] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 136 events 0x1 mode thread_spinlock | |
[1650465529.211549] [ndv4:13452:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.210721] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.210907] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.213560] [ndv4:14421:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465529.213567] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.215739] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2111a80: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465529.216172] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2111a80: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465529.216723] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.216744] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.216748] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.216801] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.217252] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4676010 of 8176 bytes with 127 elements | |
[1650465529.218170] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.218236] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.217513] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.217521] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.217563] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465529.217567] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.217632] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x3668e10 [id=120 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.217644] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 120 events 0x1 mode thread_spinlock | |
[1650465529.217661] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[32]=0x44a2030 using rc_mlx5/mlx5_ib5:1 on worker 0x1f938d0 | |
[1650465529.217926] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.218149] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.218934] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.219253] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.220698] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.219607] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.219618] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.218822] [ndv4:13443:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.218829] [ndv4:13443:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.219346] [ndv4:13443:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x1d28d80 0x1d28d80 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465529.221076] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2111a80: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465529.222102] [ndv4:14949:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465529.222111] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.224006] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.224023] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.224026] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.224082] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.224228] [ndv4:13452:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x5b6d050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xbd3d | |
[1650465529.225021] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.225028] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.225063] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465529.225066] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.225074] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x3668f80 [id=122 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.225103] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 122 events 0x1 mode thread_spinlock | |
[1650465529.225696] [ndv4:15226:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.226032] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.226264] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.226284] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ae678acc008 of 151544 bytes with 1052 elements | |
[1650465529.228164] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2111a80: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465529.228709] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2111a80: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465529.229418] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2111a80: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465529.229174] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.229184] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.229241] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.229249] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.230207] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae676400000..0x2ae678a00000 on mlx5_ib7 lkey 0x80a00 rkey 0x80a00 access 0xf flags 0x3e4 | |
[1650465529.230231] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ae676400018 of 39845864 bytes with 4752 elements | |
[1650465529.230371] [ndv4:13452:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x5b6d050 | |
[1650465529.230407] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[43]=0x5b6d050 using dc_mlx5/mlx5_ib7:1 on worker 0x264e8d0 | |
[1650465529.230467] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.230478] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.230954] [ndv4:12741:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.230862] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.230932] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.231515] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.231893] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.232498] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465529.231994] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.232003] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.233874] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.234203] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x5d49060: created UD QP 0xbd48 on mlx5_ib7:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.234897] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.235432] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.235507] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.236079] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.235272] [ndv4:15226:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4678050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc01e | |
[1650465529.235472] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.235779] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.235924] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b179f7e7008 of 151544 bytes with 1052 elements | |
[1650465529.237135] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2111a80: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465529.237048] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.237550] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae678af1000..0x2ae678b76000 on mlx5_ib7 lkey 0x80b00 rkey 0x80b00 access 0xf flags 0x3e4 | |
[1650465529.237557] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae678af1018 of 544744 bytes with 128 elements | |
[1650465529.237561] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.238044] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x5d49060: adding gid fe80::15:5dff:fd34:2 to hash on device mlx5_ib7 port 1 index 0) | |
[1650465529.238075] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x5d49060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 1) | |
[1650465529.238241] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x5d49060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 2) | |
[1650465529.237890] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2111a80: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465529.237899] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.238015] [ndv4:14205:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.238021] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x2051df0 [id=89 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.238048] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 89 events 0x5 mode thread_spinlock | |
[1650465529.238072] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[10]=0x2111a80 using ud_mlx5/mlx5_ib0:1 on worker 0x29ca770 | |
[1650465529.238477] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.239289] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.239719] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.239825] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.237526] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.237532] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.238662] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x5d49060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 3) | |
[1650465529.239259] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x5d49060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 4) | |
[1650465529.240093] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x5d49060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 5) | |
[1650465529.240568] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x5d49060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 6) | |
[1650465529.240351] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.241817] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.241830] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.241833] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.241856] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.242664] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.242671] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.239893] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b179d000000..0x2b179f600000 on mlx5_ib5 lkey 0x80900 rkey 0x80900 access 0xf flags 0x3e4 | |
[1650465529.239915] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b179d000018 of 39845864 bytes with 4752 elements | |
[1650465529.240055] [ndv4:15226:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4678050 | |
[1650465529.240089] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[33]=0x4678050 using dc_mlx5/mlx5_ib5:1 on worker 0x1f938d0 | |
[1650465529.240413] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.240753] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.241147] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.241175] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.242136] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.241124] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x5d49060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 7) | |
[1650465529.241491] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.241498] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x5d49f40 [id=137 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.241534] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 137 events 0x5 mode thread_spinlock | |
[1650465529.241546] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[44]=0x5d49060 using ud_verbs/mlx5_ib7:1 on worker 0x264e8d0 | |
[1650465529.241874] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.242645] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.242881] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.243009] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.243307] [ndv4:13452:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465529.244714] [ndv4:13452:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.245031] [ndv4:13452:0] ib_iface.c:994 UCX DEBUG iface=0x33b97b0: created UD QP 0xbd58 on mlx5_ib7:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.245038] [ndv4:13452:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.243738] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x329b050: created RC QP 0xc450 on mlx5_ib1:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.246223] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.246234] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.245700] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.246401] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.246474] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.245964] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[11]=0x329b050 using rc_verbs/mlx5_ib1:1 on worker 0x29ca770 | |
[1650465529.246389] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.246737] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.247027] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.247268] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.246823] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.246908] [ndv4:13452:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.247316] [ndv4:13452:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae678b76000..0x2ae678bfb000 on mlx5_ib7 lkey 0x80c00 rkey 0x80c00 access 0xf flags 0x3e4 | |
[1650465529.247321] [ndv4:13452:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae678b76018 of 544744 bytes with 128 elements | |
[1650465529.247326] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.247767] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33b97b0: adding gid fe80::15:5dff:fd34:2 to hash on device mlx5_ib7 port 1 index 0) | |
[1650465529.248155] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33b97b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 1) | |
[1650465529.248343] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33b97b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 2) | |
[1650465529.248822] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33b97b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 3) | |
[1650465529.248986] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33b97b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 4) | |
[1650465529.249538] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33b97b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 5) | |
[1650465529.250027] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33b97b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 6) | |
[1650465529.251564] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.251982] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x4854060: created UD QP 0xc055 on mlx5_ib5:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.252680] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.253029] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.253260] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.253711] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.254217] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.255163] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.254779] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b179f80c000..0x2b179f891000 on mlx5_ib5 lkey 0x80a00 rkey 0x80a00 access 0xf flags 0x3e4 | |
[1650465529.254785] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b179f80c018 of 544744 bytes with 128 elements | |
[1650465529.254790] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.255426] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4854060: adding gid fe80::15:5dff:fd34:0 to hash on device mlx5_ib5 port 1 index 0) | |
[1650465529.254698] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.254705] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.255198] [ndv4:13452:0] ud_iface.c:393 UCX DEBUG iface 0x33b97b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 7) | |
[1650465529.255205] [ndv4:13452:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib7:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.255234] [ndv4:13452:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.255238] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x1d759b0 [id=138 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.255269] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 138 events 0x5 mode thread_spinlock | |
[1650465529.255281] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[45]=0x33b97b0 using ud_mlx5/mlx5_ib7:1 on worker 0x264e8d0 | |
[1650465529.255346] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool uct_scopy_iface_tx_mp: align 64, maxelems 4294967295, elemsize 736 | |
[1650465529.256567] [ndv4:13452:0] ucp_worker.c:1159 UCX DEBUG created interface[46]=0x5e67880 using cma/memory on worker 0x264e8d0 | |
[1650465529.256715] [ndv4:13452:0] ucp_worker.c:982 UCX DEBUG selected scalable tl bitmap: 0x7fffffffffff 0x0 (47 tls) | |
[1650465529.256685] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.256696] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.256699] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.256722] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.257312] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x32ae010 of 8176 bytes with 127 elements | |
[1650465529.257679] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.257686] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.257720] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465529.257724] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.257733] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x20f8460 [id=92 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.257765] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 92 events 0x1 mode thread_spinlock | |
[1650465529.257775] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[12]=0x311c010 using rc_mlx5/mlx5_ib1:1 on worker 0x29ca770 | |
[1650465529.258078] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.258215] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.260354] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.260367] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.259953] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465529.259960] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465529.261642] [ndv4:14421:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.262510] [ndv4:13552:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.264323] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.264452] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.266402] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x58257a0 [id=75 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.266434] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 75 events 0x0 mode thread_spinlock | |
[1650465529.266486] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x58255d0 [id=76 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.266532] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 76 events 0x0 mode thread_spinlock | |
[1650465529.266551] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x5825610 [id=77 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.266558] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 77 events 0x0 mode thread_spinlock | |
[1650465529.265835] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.265845] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.265473] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.267259] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.267277] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.267281] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.267344] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.267850] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.267858] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.267890] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465529.267895] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.267903] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x20f7fa0 [id=94 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.267928] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 94 events 0x1 mode thread_spinlock | |
[1650465529.268667] [ndv4:14205:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.268650] [ndv4:12741:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.269213] [ndv4:12741:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465529.270265] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x26f2560 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.270303] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465529.270308] [ndv4:12741:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465529.269751] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4854060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 1) | |
[1650465529.270759] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4854060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 2) | |
[1650465529.271064] [ndv4:14949:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.271070] [ndv4:14949:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.271335] [ndv4:14949:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x955cd0 0x955cd0 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465529.271700] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x33b9600 [id=79 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.271739] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 79 events 0x0 mode thread_spinlock | |
[1650465529.271920] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.271927] [ndv4:13452:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.272372] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.272381] [ndv4:13452:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.272788] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.272795] [ndv4:13452:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.271889] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4854060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 3) | |
[1650465529.272816] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4854060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 4) | |
[1650465529.273458] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4854060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 5) | |
[1650465529.270873] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.271011] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.271017] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.271090] [ndv4:12741:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465529.272244] [ndv4:12741:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.272252] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.272511] [ndv4:12741:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.272895] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.273535] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.273749] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.273756] [ndv4:13452:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.274202] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.274207] [ndv4:13452:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.274816] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.274824] [ndv4:13452:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.275140] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.275145] [ndv4:13452:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.275310] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.275315] [ndv4:13452:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.275386] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.275391] [ndv4:13452:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.275833] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.275840] [ndv4:13452:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.275950] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.275955] [ndv4:13452:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.276233] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.276240] [ndv4:13452:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.276925] [ndv4:13452:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.276933] [ndv4:13452:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.277207] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x33b9640 [id=81 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.277242] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 81 events 0x0 mode thread_spinlock | |
[1650465529.280262] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4854060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 6) | |
[1650465529.280567] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4854060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 7) | |
[1650465529.281021] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.281030] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x4330970 [id=123 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.281068] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 123 events 0x5 mode thread_spinlock | |
[1650465529.281079] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[34]=0x4854060 using ud_verbs/mlx5_ib5:1 on worker 0x1f938d0 | |
[1650465529.280827] [ndv4:14205:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3446010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc473 | |
[1650465529.281804] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.282303] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.282401] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.282436] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.283246] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.282741] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.283256] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.283268] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2afda3f2c008 of 151544 bytes with 1052 elements | |
[1650465529.284376] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.284713] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x4972460: created UD QP 0xc084 on mlx5_ib5:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.284721] [ndv4:15226:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.285064] [ndv4:12741:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465529.285073] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.285367] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.285974] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.286243] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.286540] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.286736] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.288516] [ndv4:13552:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.289161] [ndv4:13552:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465529.287858] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x33b9680 [id=83 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.287895] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 83 events 0x0 mode thread_spinlock | |
[1650465529.287982] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x33b96c0 [id=84 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288004] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 84 events 0x0 mode thread_spinlock | |
[1650465529.288024] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x33b9730 [id=86 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288045] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 86 events 0x0 mode thread_spinlock | |
[1650465529.288094] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x33b9770 [id=90 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288113] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 90 events 0x0 mode thread_spinlock | |
[1650465529.288129] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x40daec0 [id=91 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288177] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 91 events 0x0 mode thread_spinlock | |
[1650465529.288192] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x40daf00 [id=93 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288238] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 93 events 0x0 mode thread_spinlock | |
[1650465529.288276] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x40daf40 [id=97 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288296] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 97 events 0x0 mode thread_spinlock | |
[1650465529.288312] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x40daf80 [id=98 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288352] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 98 events 0x0 mode thread_spinlock | |
[1650465529.288369] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x3494b00 [id=100 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288413] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 100 events 0x0 mode thread_spinlock | |
[1650465529.288453] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x3494b40 [id=104 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288479] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 104 events 0x0 mode thread_spinlock | |
[1650465529.288495] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x3494b80 [id=105 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288515] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 105 events 0x0 mode thread_spinlock | |
[1650465529.288530] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x3494bc0 [id=107 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288552] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 107 events 0x0 mode thread_spinlock | |
[1650465529.288653] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x3494c00 [id=111 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288682] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 111 events 0x0 mode thread_spinlock | |
[1650465529.288702] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x3494c40 [id=112 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288721] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 112 events 0x0 mode thread_spinlock | |
[1650465529.287165] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b179f891000..0x2b179f916000 on mlx5_ib5 lkey 0x80b00 rkey 0x80b00 access 0xf flags 0x3e4 | |
[1650465529.287172] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b179f891018 of 544744 bytes with 128 elements | |
[1650465529.287177] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.287436] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4972460: adding gid fe80::15:5dff:fd34:0 to hash on device mlx5_ib5 port 1 index 0) | |
[1650465529.288144] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4972460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 1) | |
[1650465529.289168] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4972460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 2) | |
[1650465529.289392] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4972460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 3) | |
[1650465529.289738] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4972460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 4) | |
[1650465529.290121] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4972460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 5) | |
[1650465529.287368] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdace00000..0x2afdaf400000 on mlx5_ib1 lkey 0x80e00 rkey 0x80e00 access 0xf flags 0x3e4 | |
[1650465529.287392] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2afdace00018 of 39845864 bytes with 4752 elements | |
[1650465529.287536] [ndv4:14205:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3446010 | |
[1650465529.287570] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[13]=0x3446010 using dc_mlx5/mlx5_ib1:1 on worker 0x29ca770 | |
[1650465529.287907] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.288307] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.289148] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.289687] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.290323] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.288737] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x3494c80 [id=114 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288756] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 114 events 0x0 mode thread_spinlock | |
[1650465529.288796] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x3494cc0 [id=118 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288803] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 118 events 0x0 mode thread_spinlock | |
[1650465529.288819] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x5e67dd0 [id=119 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288823] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 119 events 0x0 mode thread_spinlock | |
[1650465529.288838] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x5e67e10 [id=121 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288841] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 121 events 0x0 mode thread_spinlock | |
[1650465529.288879] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x5e67e50 [id=125 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288904] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 125 events 0x0 mode thread_spinlock | |
[1650465529.288920] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x5e67e90 [id=126 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288938] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 126 events 0x0 mode thread_spinlock | |
[1650465529.288952] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x5e67ed0 [id=128 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.288970] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 128 events 0x0 mode thread_spinlock | |
[1650465529.289006] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x5e67f10 [id=132 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.289030] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 132 events 0x0 mode thread_spinlock | |
[1650465529.289045] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x5e67f50 [id=133 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.289060] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 133 events 0x0 mode thread_spinlock | |
[1650465529.289074] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x5e67f90 [id=135 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.289094] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 135 events 0x0 mode thread_spinlock | |
[1650465529.290636] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.290643] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.291650] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.292041] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x3345450: created UD QP 0xc459 on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.292679] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.291983] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4972460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 6) | |
[1650465529.292370] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4972460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 7) | |
[1650465529.292377] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.292416] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.292421] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x4972f60 [id=124 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.292430] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 124 events 0x5 mode thread_spinlock | |
[1650465529.292443] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[35]=0x4972460 using ud_mlx5/mlx5_ib5:1 on worker 0x1f938d0 | |
[1650465529.292794] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.293169] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.293303] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.294129] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.293792] [ndv4:13452:0] async.c:228 UCX DEBUG added async handler 0x5f42af0 [id=139 ref 1] uct_rdmacm_cm_event_handler() to hash | |
[1650465529.293807] [ndv4:13452:0] async.c:506 UCX DEBUG listening to async event fd 139 events 0x1 mode thread_spinlock | |
[1650465529.293820] [ndv4:13452:0] rdmacm_cm.c:922 UCX DEBUG created rdmacm_cm 0x5f424f0 with event_channel 0x5e67fd0 (fd=139) | |
[1650465529.293850] [ndv4:13452:0] tcp_sockcm.c:186 UCX DEBUG created tcp_sockcm 0x3edd7e0 | |
[1650465529.293862] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ucp_requests: align 64, maxelems 4294967295, elemsize 440 | |
[1650465529.293865] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ucp_rkeys: align 64, maxelems 4294967295, elemsize 168 | |
[1650465529.293870] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ucp_am_bufs: align 64, maxelems 4294967295, elemsize 65624 | |
[1650465529.293873] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ucp_reg_bufs: align 64, maxelems 4294967295, elemsize 8208 | |
[1650465529.293876] [ndv4:13452:0] mpool.c:88 UCX DEBUG mpool ucp_rndv_frags: align 512, maxelems 4294967295, elemsize 524304 | |
[1650465529.293959] [ndv4:13452:0] parser.c:1893 UCX INFO UCX_* env variables: UCX_POSIX_USE_PROC_LINK=n UCX_LOG_LEVEL=debug | |
[1650465529.295205] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.294542] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.294893] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.295319] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.295872] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.296293] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afda3f51000..0x2afda3fd6000 on mlx5_ib1 lkey 0x80f00 rkey 0x80f00 access 0xf flags 0x3e4 | |
[1650465529.296300] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afda3f51018 of 544744 bytes with 128 elements | |
[1650465529.296305] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.296719] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3345450: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465529.296861] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3345450: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465529.297007] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3345450: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465529.296107] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.296114] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.297869] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3345450: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465529.298112] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3345450: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465529.298880] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3345450: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465529.299286] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3345450: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465529.299825] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3345450: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465529.300120] [ndv4:14205:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.300127] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x2113600 [id=95 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.300143] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 95 events 0x5 mode thread_spinlock | |
[1650465529.300153] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[14]=0x3345450 using ud_verbs/mlx5_ib1:1 on worker 0x29ca770 | |
[1650465529.300681] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.300968] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.301119] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.301236] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.299720] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.299727] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.301484] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.304874] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.304880] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.308037] [ndv4:12741:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.310632] [ndv4:14421:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.311246] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.311264] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.311268] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.311321] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.311718] [ndv4:14421:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465529.311956] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.311964] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.312903] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x4a720a0: created RC QP 0xbfc1 on mlx5_ib6:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.312757] [ndv4:14421:0] async.c:228 UCX DEBUG added async handler 0x1619890 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.312795] [ndv4:14421:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465529.312799] [ndv4:14421:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465529.313542] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.313707] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.313713] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.313769] [ndv4:14421:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465529.314173] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.314570] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x320c280: created UD QP 0xc45a on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.314623] [ndv4:14205:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.315269] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.316240] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.316548] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.316875] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.317059] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.316192] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x1ab6940 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.316225] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465529.316229] [ndv4:13552:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465529.316999] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.317061] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.317066] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.317117] [ndv4:13552:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465529.316470] [ndv4:13443:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b7507896000 length 12288 | |
[1650465529.316557] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465529.315799] [ndv4:14421:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.315806] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.316012] [ndv4:14421:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.314180] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[36]=0x4a720a0 using rc_verbs/mlx5_ib6:1 on worker 0x1f938d0 | |
[1650465529.314395] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.314438] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.314522] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.314529] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.316476] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.318617] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.318631] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.318634] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.318688] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.317462] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdaf498000..0x2afdaf51d000 on mlx5_ib1 lkey 0x81000 rkey 0x81000 access 0xf flags 0x3e4 | |
[1650465529.317468] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afdaf498018 of 544744 bytes with 128 elements | |
[1650465529.317473] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.318185] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x320c280: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465529.318403] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x320c280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465529.318605] [ndv4:13552:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.318613] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.318885] [ndv4:13552:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.319246] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.319665] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.317920] [ndv4:13443:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465529.317930] [ndv4:13443:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b7507899000 length 4296704 | |
[1650465529.317935] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b7507899018 of 4296680 bytes with 512 elements | |
[1650465529.318199] [ndv4:13443:0] mm_iface.c:600 UCX DEBUG created mm iface 0x24a5d50 FIFO id 0x400000005e8b39bc va 0x2b7507896000 size 12288 (128 x 64 elems) | |
[1650465529.318548] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x24a5d50 using posix/memory on worker 0x2d7f8d0 | |
[1650465529.318572] [ndv4:13443:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465529.318701] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465529.318717] [ndv4:13443:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465529.318727] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b750c000018 of 4296680 bytes with 512 elements | |
[1650465529.319309] [ndv4:13443:0] mm_iface.c:600 UCX DEBUG created mm iface 0x24a6320 FIFO id 0x648035 va 0x2b7507cb2000 size 12288 (128 x 64 elems) | |
[1650465529.319316] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x24a6320 using sysv/memory on worker 0x2d7f8d0 | |
[1650465529.319328] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465529.319331] [ndv4:13443:0] self.c:220 UCX DEBUG created self iface id 0x5dd80def44468893 send_size 8192 | |
[1650465529.319338] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x24b9f00 using self/memory0 on worker 0x2d7f8d0 | |
[1650465529.319361] [ndv4:13443:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.319366] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.319369] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.319130] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4d93010 of 8176 bytes with 127 elements | |
[1650465529.319402] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.319408] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.319445] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib6 length=2048) failed: Invalid argument | |
[1650465529.319450] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.319460] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x4bc7c00 [id=127 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.319471] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 127 events 0x1 mode thread_spinlock | |
[1650465529.319479] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[37]=0x4bbf030 using rc_mlx5/mlx5_ib6:1 on worker 0x1f938d0 | |
[1650465529.319356] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x320c280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465529.319995] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x320c280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465529.320313] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x320c280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465529.320627] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x320c280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465529.321292] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x320c280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465529.322147] [ndv4:12741:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.322690] [ndv4:12741:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465529.322612] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.323030] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.324294] [ndv4:13552:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465529.324301] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.323926] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x24bc040 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.323966] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465529.323985] [ndv4:13443:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x24ba860: listening for connections (fd=78) on 10.5.0.5:35601 | |
[1650465529.324737] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x24ba860 using tcp/eth0 on worker 0x2d7f8d0 | |
[1650465529.324759] [ndv4:13443:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.324763] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.324766] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.324925] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x1d1f7b0 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.324949] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465529.324953] [ndv4:13443:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x24baee0: listening for connections (fd=80) on 127.0.0.1:60761 | |
[1650465529.325015] [ndv4:13443:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.325021] [ndv4:13443:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.325471] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x24baee0 using tcp/lo on worker 0x2d7f8d0 | |
[1650465529.325487] [ndv4:13443:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.325490] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.325493] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.325523] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.325756] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.326240] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.326497] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.326410] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x24177d0 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.326431] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465529.326435] [ndv4:13443:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x24a2570: listening for connections (fd=82) on 172.16.1.242:42120 | |
[1650465529.326979] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x24a2570 using tcp/ib0 on worker 0x2d7f8d0 | |
[1650465529.327523] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.328082] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.328520] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.328642] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2dd8940 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.328682] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465529.328686] [ndv4:12741:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465529.329338] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.329551] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.329557] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.329695] [ndv4:12741:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465529.328997] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.330255] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.330541] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.331712] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.331753] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.331757] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.331769] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.331830] [ndv4:12741:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.331837] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.332053] [ndv4:12741:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.332201] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.332210] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.333050] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x320c280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465529.333061] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.333095] [ndv4:14205:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.333099] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x33452c0 [id=96 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.333134] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 96 events 0x5 mode thread_spinlock | |
[1650465529.333147] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[15]=0x320c280 using ud_mlx5/mlx5_ib1:1 on worker 0x29ca770 | |
[1650465529.333802] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.334164] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.333020] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.333039] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.333042] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.333104] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.334224] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.334231] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.334266] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib6 length=2048) failed: Invalid argument | |
[1650465529.334269] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.334277] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x4330330 [id=129 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.334306] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 129 events 0x1 mode thread_spinlock | |
[1650465529.333212] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x24b06b0: created RC QP 0xde5e on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.332830] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.333522] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.334958] [ndv4:15226:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.334953] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.335113] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.335956] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.335964] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.341779] [ndv4:12741:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465529.341786] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.342860] [ndv4:14421:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465529.342870] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.344257] [ndv4:15226:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4d95050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc02b | |
[1650465529.345156] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.345198] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.345208] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b17a2117008 of 151544 bytes with 1052 elements | |
[1650465529.349245] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b179fa00000..0x2b17a2000000 on mlx5_ib6 lkey 0x80700 rkey 0x80700 access 0xf flags 0x3e4 | |
[1650465529.349268] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b179fa00018 of 39845864 bytes with 4752 elements | |
[1650465529.349410] [ndv4:15226:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4d95050 | |
[1650465529.349449] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[38]=0x4d95050 using dc_mlx5/mlx5_ib6:1 on worker 0x1f938d0 | |
[1650465529.349815] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.349989] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.350360] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.350407] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.349122] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.351425] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.352777] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.353995] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x4f712e0: created UD QP 0xc04b on mlx5_ib6:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.354789] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.356247] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.356328] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.356488] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.357299] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.357783] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b17a213c000..0x2b17a21c1000 on mlx5_ib6 lkey 0x80800 rkey 0x80800 access 0xf flags 0x3e4 | |
[1650465529.357790] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b17a213c018 of 544744 bytes with 128 elements | |
[1650465529.357794] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.358397] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4f712e0: adding gid fe80::15:5dff:fd34:1 to hash on device mlx5_ib6 port 1 index 0) | |
[1650465529.358690] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.358711] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.358715] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.358776] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.359541] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.359550] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.359456] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4f712e0: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 1) | |
[1650465529.360088] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x24b06b0 using rc_verbs/mlx5_ib0:1 on worker 0x2d7f8d0 | |
[1650465529.360297] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.361076] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.361491] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.361922] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.362951] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.360516] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x38350a0: created RC QP 0xc4aa on mlx5_ib2:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.359764] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4f712e0: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 2) | |
[1650465529.360219] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4f712e0: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 3) | |
[1650465529.360937] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4f712e0: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 4) | |
[1650465529.362067] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4f712e0: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 5) | |
[1650465529.362371] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4f712e0: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 6) | |
[1650465529.362424] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x4f712e0: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 7) | |
[1650465529.362710] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.362719] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x4a4d8d0 [id=130 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.362755] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 130 events 0x5 mode thread_spinlock | |
[1650465529.362771] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[39]=0x4f712e0 using ud_verbs/mlx5_ib6:1 on worker 0x1f938d0 | |
[1650465529.362999] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.363552] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.361096] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.361106] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.363954] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.364182] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.363923] [ndv4:13443:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465529.365136] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.365151] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.365163] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.365167] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.365222] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.366566] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.365871] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x2e49010 of 8176 bytes with 127 elements | |
[1650465529.366167] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.366190] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.366239] [ndv4:13443:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465529.366244] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.365554] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[16]=0x38350a0 using rc_verbs/mlx5_ib2:1 on worker 0x29ca770 | |
[1650465529.366792] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.367687] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.367992] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.367519] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x508f460: created UD QP 0xc067 on mlx5_ib6:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.367527] [ndv4:15226:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.368692] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.369326] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.369889] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.370735] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.370800] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.370915] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.371166] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.371062] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.371077] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.371080] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.371131] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.371685] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b17a21c1000..0x2b17a2246000 on mlx5_ib6 lkey 0x80900 rkey 0x80900 access 0xf flags 0x3e4 | |
[1650465529.371691] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b17a21c1018 of 544744 bytes with 128 elements | |
[1650465529.371696] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.371928] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x508f460: adding gid fe80::15:5dff:fd34:1 to hash on device mlx5_ib6 port 1 index 0) | |
[1650465529.372335] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x508f460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 1) | |
[1650465529.372401] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x508f460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 2) | |
[1650465529.372453] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x508f460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 3) | |
[1650465529.371681] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3b56010 of 8176 bytes with 127 elements | |
[1650465529.371937] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.371944] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.371979] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465529.371982] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.371995] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x32a3bb0 [id=99 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.372004] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 99 events 0x1 mode thread_spinlock | |
[1650465529.372027] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[17]=0x3982030 using rc_mlx5/mlx5_ib2:1 on worker 0x29ca770 | |
[1650465529.372254] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.372665] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.372881] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.373196] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.374057] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.373696] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x2416090 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.373728] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465529.373748] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x24be100 using rc_mlx5/mlx5_ib0:1 on worker 0x2d7f8d0 | |
[1650465529.374175] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.374854] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.375130] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.375698] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.375714] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.375717] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.375773] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.375934] [ndv4:14949:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2ad4b7eeb000 length 12288 | |
[1650465529.376013] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465529.376172] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.376181] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.376215] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465529.376218] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.376226] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x20f33e0 [id=101 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.376246] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 101 events 0x1 mode thread_spinlock | |
[1650465529.375883] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.376519] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.376925] [ndv4:14205:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.377210] [ndv4:14949:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465529.377219] [ndv4:14949:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2ad4bd841000 length 4296704 | |
[1650465529.377224] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2ad4bd841018 of 4296680 bytes with 512 elements | |
[1650465529.377480] [ndv4:14949:0] mm_iface.c:600 UCX DEBUG created mm iface 0x10d5a10 FIFO id 0x400000000c4c9420 va 0x2ad4b7eeb000 size 12288 (128 x 64 elems) | |
[1650465529.377673] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.377683] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.377686] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.377699] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.378113] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.378120] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.378153] [ndv4:13443:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465529.378156] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.378165] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x2417480 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.378187] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465529.377823] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x10d5a10 using posix/memory on worker 0x19ac660 | |
[1650465529.377852] [ndv4:14949:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465529.377895] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465529.377914] [ndv4:14949:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465529.377924] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2ad4bdc5a018 of 4296680 bytes with 512 elements | |
[1650465529.378500] [ndv4:14949:0] mm_iface.c:600 UCX DEBUG created mm iface 0x10e7760 FIFO id 0x648038 va 0x2ad4b7eee000 size 12288 (128 x 64 elems) | |
[1650465529.378509] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x10e7760 using sysv/memory on worker 0x19ac660 | |
[1650465529.378521] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465529.378524] [ndv4:14949:0] self.c:220 UCX DEBUG created self iface id 0xa7861ef0e5659aa9 send_size 8192 | |
[1650465529.378530] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x10e6e00 using self/memory0 on worker 0x19ac660 | |
[1650465529.378553] [ndv4:14949:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.378558] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.378561] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.378876] [ndv4:13443:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.380233] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.380241] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.379806] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.379817] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.382017] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x10f4ae0 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.382061] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465529.382081] [ndv4:14949:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x10e7d30: listening for connections (fd=78) on 10.5.0.5:34331 | |
[1650465529.382717] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x10e7d30 using tcp/eth0 on worker 0x19ac660 | |
[1650465529.382740] [ndv4:14949:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.382744] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.382747] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.382872] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x10435f0 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.382904] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465529.382908] [ndv4:14949:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x10cf4b0: listening for connections (fd=80) on 127.0.0.1:37455 | |
[1650465529.382939] [ndv4:14949:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.382945] [ndv4:14949:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.383144] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x10cf4b0 using tcp/lo on worker 0x19ac660 | |
[1650465529.383160] [ndv4:14949:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.383163] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.383165] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.383442] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x508f460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 4) | |
[1650465529.384013] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x508f460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 5) | |
[1650465529.384072] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x508f460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 6) | |
[1650465529.384286] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x508f460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 7) | |
[1650465529.384294] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.384398] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.384404] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x4a4da40 [id=131 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.384432] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 131 events 0x5 mode thread_spinlock | |
[1650465529.384451] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[40]=0x508f460 using ud_mlx5/mlx5_ib6:1 on worker 0x1f938d0 | |
[1650465529.384712] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.385490] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.383374] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x1042ed0 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.383403] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465529.383407] [ndv4:14949:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x10cfb30: listening for connections (fd=82) on 172.16.1.242:45429 | |
[1650465529.384048] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x10cfb30 using tcp/ib0 on worker 0x19ac660 | |
[1650465529.384265] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.384927] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.385043] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.385052] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.385638] [ndv4:14205:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3b58050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc4bd | |
[1650465529.386063] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.386508] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.386516] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2afda3fd8008 of 151544 bytes with 1052 elements | |
[1650465529.385859] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.386194] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.386546] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.385760] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.386376] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.388063] [ndv4:13443:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x30b3010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xde84 | |
[1650465529.389327] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.389395] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.389399] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.389431] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.389768] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.389778] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.389346] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.389800] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.389823] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b7507cb7008 of 151544 bytes with 1052 elements | |
[1650465529.390411] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdaf600000..0x2afdb1c00000 on mlx5_ib2 lkey 0x81000 rkey 0x81000 access 0xf flags 0x3e4 | |
[1650465529.390430] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2afdaf600018 of 39845864 bytes with 4752 elements | |
[1650465529.390566] [ndv4:14205:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3b58050 | |
[1650465529.390658] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[18]=0x3b58050 using dc_mlx5/mlx5_ib2:1 on worker 0x29ca770 | |
[1650465529.391375] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.390340] [ndv4:14949:0] ib_iface.c:994 UCX DEBUG iface=0x10dd5b0: created RC QP 0xde67 on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.392170] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.392680] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.392931] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.393380] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.393800] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b750c600000..0x2b750ec00000 on mlx5_ib0 lkey 0x80f00 rkey 0x80f00 access 0xf flags 0x3e4 | |
[1650465529.393822] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b750c600018 of 39845864 bytes with 4752 elements | |
[1650465529.393959] [ndv4:13443:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x30b3010 | |
[1650465529.393996] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x30b3010 using dc_mlx5/mlx5_ib0:1 on worker 0x2d7f8d0 | |
[1650465529.394430] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.394469] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x10dd5b0 using rc_verbs/mlx5_ib0:1 on worker 0x19ac660 | |
[1650465529.395035] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.395385] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x3d34060: created UD QP 0xc4b3 on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.394841] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.395383] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.395765] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.394875] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.395105] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.395568] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.395806] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.395992] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.396656] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.396697] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.397146] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.397298] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.397314] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.397967] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.398357] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x24c6a70: created UD QP 0xde68 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.397739] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdb1d1e000..0x2afdb1da3000 on mlx5_ib2 lkey 0x81100 rkey 0x81100 access 0xf flags 0x3e4 | |
[1650465529.397746] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afdb1d1e018 of 544744 bytes with 128 elements | |
[1650465529.397752] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.398373] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3d34060: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465529.399089] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.399718] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.400033] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.400427] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.400915] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.401422] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7507cdc000..0x2b7507d61000 on mlx5_ib0 lkey 0x81000 rkey 0x81000 access 0xf flags 0x3e4 | |
[1650465529.401436] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7507cdc018 of 544744 bytes with 128 elements | |
[1650465529.401442] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.402289] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24c6a70: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465529.403165] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24c6a70: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465529.404506] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.404514] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.408911] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465529.408879] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.409969] [ndv4:14949:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465529.409654] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.409663] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.410174] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3d34060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465529.410357] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3d34060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465529.410342] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.410360] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.410364] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.410418] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.410936] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.410945] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.411110] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.411117] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.411076] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.411090] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.411094] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.411153] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.411676] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x1a76010 of 8176 bytes with 127 elements | |
[1650465529.411952] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.411984] [ndv4:14949:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.412044] [ndv4:14949:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465529.412050] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.412372] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x518f0a0: created RC QP 0xbe92 on mlx5_ib7:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.413822] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[41]=0x518f0a0 using rc_verbs/mlx5_ib7:1 on worker 0x1f938d0 | |
[1650465529.414299] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.414308] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.414521] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.414658] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.414855] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.414863] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.414706] [ndv4:12741:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.414712] [ndv4:12741:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.414881] [ndv4:12741:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x26dbc90 0x26dbc90 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465529.415086] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465529.415844] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24c6a70: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465529.416455] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24c6a70: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465529.417000] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24c6a70: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465529.417375] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.417392] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.417395] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.417445] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.417904] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24c6a70: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465529.419014] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24c6a70: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465529.419174] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24c6a70: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465529.419499] [ndv4:13443:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.418029] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x54b0010 of 8176 bytes with 127 elements | |
[1650465529.418506] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.418512] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib7:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.418554] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib7 length=2048) failed: Invalid argument | |
[1650465529.418558] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.418569] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x50bc8f0 [id=134 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.418683] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 134 events 0x1 mode thread_spinlock | |
[1650465529.418696] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[42]=0x52dc030 using rc_mlx5/mlx5_ib7:1 on worker 0x1f938d0 | |
[1650465529.418929] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.419047] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.419382] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.419451] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.419562] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x1035490 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.419675] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465529.419790] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x10eb000 using rc_mlx5/mlx5_ib0:1 on worker 0x19ac660 | |
[1650465529.419939] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.420012] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.420216] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.420224] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.419784] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3d34060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465529.419960] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3d34060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465529.420794] [ndv4:13552:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.420802] [ndv4:13552:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.421086] [ndv4:13552:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x13b9c90 0x13b9c90 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465529.420915] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.422066] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.422077] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.422080] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.422094] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.422531] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.422539] [ndv4:14949:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.422619] [ndv4:14949:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465529.422623] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.422632] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x10447d0 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.422660] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465529.423275] [ndv4:14949:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.425807] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x24ae4a0 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.425846] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 mode thread_spinlock | |
[1650465529.425888] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x24c6a70 using ud_verbs/mlx5_ib0:1 on worker 0x2d7f8d0 | |
[1650465529.426135] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.426393] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.428194] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465529.428203] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465529.427023] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.427236] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.427827] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465529.427778] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3d34060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465529.428154] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3d34060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465529.428902] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.429654] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.429673] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.429677] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.429734] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.429949] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.430258] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x2d9f750: created UD QP 0xde6d on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.430271] [ndv4:13443:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.430356] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.430364] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib7:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.430398] [ndv4:15226:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib7 length=2048) failed: Invalid argument | |
[1650465529.430402] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.430410] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x16dcdf0 [id=136 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.430437] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 136 events 0x1 mode thread_spinlock | |
[1650465529.430394] [ndv4:14421:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.430960] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.431137] [ndv4:15226:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.432149] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.432373] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.432734] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.432908] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.433621] [ndv4:14949:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x1ce0010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xde91 | |
[1650465529.434712] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.435085] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.435111] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ad4b7ef3008 of 151544 bytes with 1052 elements | |
[1650465529.433395] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7507d61000..0x2b7507de6000 on mlx5_ib0 lkey 0x81100 rkey 0x81100 access 0xf flags 0x3e4 | |
[1650465529.433402] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7507d61018 of 544744 bytes with 128 elements | |
[1650465529.433406] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.437928] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x3d34060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465529.438922] [ndv4:14949:0] ib_md.c:812 UCX DEBUG registered memory 0x2ad4be200000..0x2ad4c0800000 on mlx5_ib0 lkey 0x81200 rkey 0x81200 access 0xf flags 0x3e4 | |
[1650465529.438947] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ad4be200018 of 39845864 bytes with 4752 elements | |
[1650465529.439086] [ndv4:14949:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x1ce0010 | |
[1650465529.439124] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x1ce0010 using dc_mlx5/mlx5_ib0:1 on worker 0x19ac660 | |
[1650465529.438290] [ndv4:14205:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.438301] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x2113f20 [id=102 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.438332] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 102 events 0x5 mode thread_spinlock | |
[1650465529.438352] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[19]=0x3d34060 using ud_verbs/mlx5_ib2:1 on worker 0x29ca770 | |
[1650465529.438671] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.439281] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.439532] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.440288] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.440439] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.440435] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.441375] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.441742] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.442281] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.442513] [ndv4:15226:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x54b2050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xbef3 | |
[1650465529.442940] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.443190] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.443201] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b17a4a47008 of 151544 bytes with 1052 elements | |
[1650465529.445084] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x2d9f750: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465529.446634] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.446143] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x2d9f750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465529.446909] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x2d9f750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465529.447511] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x2d9f750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465529.447388] [ndv4:14949:0] ib_iface.c:994 UCX DEBUG iface=0x19cc4e0: created UD QP 0xde72 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.447073] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b17a2400000..0x2b17a4a00000 on mlx5_ib7 lkey 0x80f00 rkey 0x80f00 access 0xf flags 0x3e4 | |
[1650465529.447096] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b17a2400018 of 39845864 bytes with 4752 elements | |
[1650465529.447235] [ndv4:15226:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x54b2050 | |
[1650465529.447273] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[43]=0x54b2050 using dc_mlx5/mlx5_ib7:1 on worker 0x1f938d0 | |
[1650465529.447879] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.447981] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.448458] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.448651] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.448101] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x2d9f750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465529.449070] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x2d9f750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465529.448021] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.448848] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.448878] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.449262] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.449396] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.449891] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.450069] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465529.450199] [ndv4:14949:0] ib_md.c:812 UCX DEBUG registered memory 0x2ad4b7f18000..0x2ad4b7f9d000 on mlx5_ib0 lkey 0x81300 rkey 0x81300 access 0xf flags 0x3e4 | |
[1650465529.450207] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ad4b7f18018 of 544744 bytes with 128 elements | |
[1650465529.450212] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.451066] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x19cc4e0: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465529.451394] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x19cc4e0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465529.450105] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x2d9f750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465529.451755] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.452420] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x2071930: created UD QP 0xc4b4 on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.452429] [ndv4:14205:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.451871] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x19cc4e0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465529.451964] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x19cc4e0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465529.452395] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x19cc4e0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465529.452278] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.452717] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x568e060: created UD QP 0xbf3c on mlx5_ib7:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.452886] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x19cc4e0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465529.453232] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x19cc4e0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465529.454199] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x19cc4e0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465529.454475] [ndv4:14949:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.453424] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.454963] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.455137] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.453344] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.455982] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.456205] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.456645] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b17a4a6c000..0x2b17a4af1000 on mlx5_ib7 lkey 0x81000 rkey 0x81000 access 0xf flags 0x3e4 | |
[1650465529.456652] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b17a4a6c018 of 544744 bytes with 128 elements | |
[1650465529.456656] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.457668] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x568e060: adding gid fe80::15:5dff:fd34:2 to hash on device mlx5_ib7 port 1 index 0) | |
[1650465529.458067] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x568e060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 1) | |
[1650465529.457951] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x2d9f750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465529.457961] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.458022] [ndv4:13443:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.458027] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x2408220 [id=89 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.458056] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 89 events 0x5 mode thread_spinlock | |
[1650465529.458071] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[10]=0x2d9f750 using ud_mlx5/mlx5_ib0:1 on worker 0x2d7f8d0 | |
[1650465529.459263] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x568e060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 2) | |
[1650465529.459662] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x568e060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 3) | |
[1650465529.460665] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x568e060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 4) | |
[1650465529.459020] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.459307] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.460280] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.461141] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.460980] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.461127] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.461569] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.460849] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x18e3970 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.460893] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 mode thread_spinlock | |
[1650465529.460916] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x19cc4e0 using ud_verbs/mlx5_ib0:1 on worker 0x19ac660 | |
[1650465529.461337] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.461725] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.462271] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.462457] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.462171] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.462772] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdb1da3000..0x2afdb1e28000 on mlx5_ib2 lkey 0x81200 rkey 0x81200 access 0xf flags 0x3e4 | |
[1650465529.462778] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afdb1da3018 of 544744 bytes with 128 elements | |
[1650465529.462784] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.463508] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2071930: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465529.463828] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2071930: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465529.463958] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.464994] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.465284] [ndv4:14421:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.465519] [ndv4:14949:0] ib_iface.c:994 UCX DEBUG iface=0x10f3970: created UD QP 0xde74 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.465530] [ndv4:14949:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.466631] [ndv4:14421:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465529.466142] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.466976] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.467094] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.468381] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x568e060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 5) | |
[1650465529.468885] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x568e060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 6) | |
[1650465529.468966] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x568e060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 7) | |
[1650465529.469411] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.469419] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x568ef40 [id=137 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.469451] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 137 events 0x5 mode thread_spinlock | |
[1650465529.469463] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[44]=0x568e060 using ud_verbs/mlx5_ib7:1 on worker 0x1f938d0 | |
[1650465529.469880] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.470458] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.469008] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.470747] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.470760] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.471365] [ndv4:15226:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465529.472316] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2071930: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465529.473269] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2071930: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465529.473020] [ndv4:15226:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.473333] [ndv4:15226:0] ib_iface.c:994 UCX DEBUG iface=0x2cfe7b0: created UD QP 0xbf7c on mlx5_ib7:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.473340] [ndv4:15226:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.474122] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.474383] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.474948] [ndv4:14949:0] ib_md.c:812 UCX DEBUG registered memory 0x2ad4c0874000..0x2ad4c08f9000 on mlx5_ib0 lkey 0x81400 rkey 0x81400 access 0xf flags 0x3e4 | |
[1650465529.474957] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ad4c0874018 of 544744 bytes with 128 elements | |
[1650465529.474963] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.475833] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x10f3970: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465529.476349] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x10f3970: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465529.474130] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.475022] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.476069] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.474060] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2071930: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465529.474590] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2071930: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465529.475213] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2071930: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465529.475654] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x2071930: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465529.475663] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.475706] [ndv4:14205:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.475711] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x3d34ef0 [id=103 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.475740] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 103 events 0x5 mode thread_spinlock | |
[1650465529.475758] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[20]=0x2071930 using ud_mlx5/mlx5_ib2:1 on worker 0x29ca770 | |
[1650465529.476709] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x10f3970: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465529.477711] [ndv4:14421:0] async.c:228 UCX DEBUG added async handler 0xf330a0 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.477729] [ndv4:14421:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465529.477733] [ndv4:14421:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465529.478644] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.478967] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.478974] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.479027] [ndv4:14421:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465529.478949] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.478960] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.478964] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.478981] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.480129] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.480135] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.480892] [ndv4:14421:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.480899] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.481256] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x3650050: created RC QP 0xc45b on mlx5_ib1:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.481206] [ndv4:14421:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.481878] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.481906] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.486418] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.486620] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.486988] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.487303] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.485556] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[11]=0x3650050 using rc_verbs/mlx5_ib1:1 on worker 0x2d7f8d0 | |
[1650465529.485896] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.486367] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.487323] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.486847] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x10f3970: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465529.486460] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.486621] [ndv4:15226:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.487390] [ndv4:15226:0] ib_md.c:812 UCX DEBUG registered memory 0x2b17a4af1000..0x2b17a4b76000 on mlx5_ib7 lkey 0x81100 rkey 0x81100 access 0xf flags 0x3e4 | |
[1650465529.487396] [ndv4:15226:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b17a4af1018 of 544744 bytes with 128 elements | |
[1650465529.487401] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.488229] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2cfe7b0: adding gid fe80::15:5dff:fd34:2 to hash on device mlx5_ib7 port 1 index 0) | |
[1650465529.487710] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.488777] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2cfe7b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 1) | |
[1650465529.488437] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.489680] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.489948] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.489966] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.489969] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.490020] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.491113] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.491119] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.491687] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.491705] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.491709] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.491734] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.492091] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x3f520a0: created RC QP 0xc43b on mlx5_ib3:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.492247] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3663010 of 8176 bytes with 127 elements | |
[1650465529.492679] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.492687] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.492724] [ndv4:13443:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465529.492729] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.492742] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x24ade90 [id=92 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.492772] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 92 events 0x1 mode thread_spinlock | |
[1650465529.492797] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[12]=0x34d1010 using rc_mlx5/mlx5_ib1:1 on worker 0x2d7f8d0 | |
[1650465529.493333] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.493679] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.494682] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.495042] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.495400] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[21]=0x3f520a0 using rc_verbs/mlx5_ib3:1 on worker 0x29ca770 | |
[1650465529.495741] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.495834] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.496465] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.496714] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.496342] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x10f3970: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465529.497140] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x10f3970: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465529.497427] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x10f3970: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465529.496309] [ndv4:14421:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465529.496318] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.496282] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.497006] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2cfe7b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 2) | |
[1650465529.497569] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2cfe7b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 3) | |
[1650465529.498136] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.498154] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.498157] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.498215] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.498387] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x10f3970: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465529.498396] [ndv4:14949:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.498430] [ndv4:14949:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.498435] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x1041670 [id=89 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.498466] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 89 events 0x5 mode thread_spinlock | |
[1650465529.498481] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[10]=0x10f3970 using ud_mlx5/mlx5_ib0:1 on worker 0x19ac660 | |
[1650465529.498675] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.499419] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.498683] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.498690] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.498722] [ndv4:13443:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465529.498725] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.498733] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x24a9170 [id=94 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.498760] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 94 events 0x1 mode thread_spinlock | |
[1650465529.499418] [ndv4:13443:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.498368] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2cfe7b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 4) | |
[1650465529.498569] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2cfe7b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 5) | |
[1650465529.498941] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2cfe7b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 6) | |
[1650465529.499787] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.499943] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.500483] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.501699] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.501709] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.501712] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.501726] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.501983] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.501988] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.502406] [ndv4:14949:0] ib_iface.c:994 UCX DEBUG iface=0x227d050: created RC QP 0xc45e on mlx5_ib1:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.503491] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.504497] [ndv4:15226:0] ud_iface.c:393 UCX DEBUG iface 0x2cfe7b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 7) | |
[1650465529.504518] [ndv4:15226:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib7:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.503154] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[11]=0x227d050 using rc_verbs/mlx5_ib1:1 on worker 0x19ac660 | |
[1650465529.503371] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.503633] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.504106] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.504207] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.504696] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.504714] [ndv4:15226:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.504724] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x16ba9b0 [id=138 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.504763] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 138 events 0x5 mode thread_spinlock | |
[1650465529.504796] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[45]=0x2cfe7b0 using ud_mlx5/mlx5_ib7:1 on worker 0x1f938d0 | |
[1650465529.504868] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool uct_scopy_iface_tx_mp: align 64, maxelems 4294967295, elemsize 736 | |
[1650465529.505167] [ndv4:15226:0] ucp_worker.c:1159 UCX DEBUG created interface[46]=0x57ac880 using cma/memory on worker 0x1f938d0 | |
[1650465529.505175] [ndv4:15226:0] ucp_worker.c:982 UCX DEBUG selected scalable tl bitmap: 0x7fffffffffff 0x0 (47 tls) | |
[1650465529.505664] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.505685] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.505689] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.505741] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.506027] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.506037] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.506040] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.506055] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.506351] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4273010 of 8176 bytes with 127 elements | |
[1650465529.506399] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x2290010 of 8176 bytes with 127 elements | |
[1650465529.506686] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.506694] [ndv4:14949:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.506732] [ndv4:14949:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465529.506736] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.506747] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x2285fc0 [id=92 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.506775] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 92 events 0x1 mode thread_spinlock | |
[1650465529.506784] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[12]=0x20fe010 using rc_mlx5/mlx5_ib1:1 on worker 0x19ac660 | |
[1650465529.507121] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.507144] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.507394] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.507418] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.506698] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.506709] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.506773] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465529.506777] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.506793] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x3e52ae0 [id=106 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.506822] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 106 events 0x1 mode thread_spinlock | |
[1650465529.506844] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[22]=0x409f030 using rc_mlx5/mlx5_ib3:1 on worker 0x29ca770 | |
[1650465529.507222] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.507235] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.507389] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.507418] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.508271] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.508440] [ndv4:12741:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b7353c83000 length 12288 | |
[1650465529.508507] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465529.508962] [ndv4:13443:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x37fb010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc481 | |
[1650465529.510127] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.510266] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.510279] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b7507de8008 of 151544 bytes with 1052 elements | |
[1650465529.509900] [ndv4:12741:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465529.509909] [ndv4:12741:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b7358000000 length 4296704 | |
[1650465529.509915] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b7358000018 of 4296680 bytes with 512 elements | |
[1650465529.510170] [ndv4:12741:0] mm_iface.c:600 UCX DEBUG created mm iface 0x2e58d10 FIFO id 0x400000006f2aa999 va 0x2b7353c83000 size 12288 (128 x 64 elems) | |
[1650465529.510322] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x2e58d10 using posix/memory on worker 0x37328d0 | |
[1650465529.510345] [ndv4:12741:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465529.510384] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465529.510402] [ndv4:12741:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465529.510411] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b7358419018 of 4296680 bytes with 512 elements | |
[1650465529.511050] [ndv4:12741:0] mm_iface.c:600 UCX DEBUG created mm iface 0x2e592e0 FIFO id 0x64803d va 0x2b7353c86000 size 12288 (128 x 64 elems) | |
[1650465529.511060] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x2e592e0 using sysv/memory on worker 0x37328d0 | |
[1650465529.511072] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465529.511075] [ndv4:12741:0] self.c:220 UCX DEBUG created self iface id 0x19986a38e0b4fb7d send_size 8192 | |
[1650465529.511081] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x2e6cec0 using self/memory0 on worker 0x37328d0 | |
[1650465529.511103] [ndv4:12741:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.511108] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.511111] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.515348] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x516a7a0 [id=75 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.515389] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 75 events 0x0 mode thread_spinlock | |
[1650465529.515437] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x516a5d0 [id=76 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.515443] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 76 events 0x0 mode thread_spinlock | |
[1650465529.515458] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x516a610 [id=77 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.515465] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 77 events 0x0 mode thread_spinlock | |
[1650465529.514360] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2e6f000 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.514398] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465529.514415] [ndv4:12741:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x2e6d820: listening for connections (fd=78) on 10.5.0.5:39592 | |
[1650465529.514886] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x2e6d820 using tcp/eth0 on worker 0x37328d0 | |
[1650465529.514906] [ndv4:12741:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.514910] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.514913] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.514952] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2dbb5c0 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.514979] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465529.514983] [ndv4:12741:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x2e6dea0: listening for connections (fd=80) on 127.0.0.1:39725 | |
[1650465529.514999] [ndv4:12741:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.515005] [ndv4:12741:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.515669] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x2e6dea0 using tcp/lo on worker 0x37328d0 | |
[1650465529.515687] [ndv4:12741:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.515690] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.515692] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.515733] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2dc9960 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.515766] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465529.515770] [ndv4:12741:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x2e55530: listening for connections (fd=82) on 172.16.1.242:41065 | |
[1650465529.514229] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b750ee00000..0x2b7511400000 on mlx5_ib1 lkey 0x81100 rkey 0x81100 access 0xf flags 0x3e4 | |
[1650465529.514248] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b750ee00018 of 39845864 bytes with 4752 elements | |
[1650465529.514386] [ndv4:13443:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x37fb010 | |
[1650465529.514424] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[13]=0x37fb010 using dc_mlx5/mlx5_ib1:1 on worker 0x2d7f8d0 | |
[1650465529.514805] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.515525] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.515893] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.516325] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.516361] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x2e55530 using tcp/ib0 on worker 0x37328d0 | |
[1650465529.517559] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.519112] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2cfe600 [id=79 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.519151] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 79 events 0x0 mode thread_spinlock | |
[1650465529.519169] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.519175] [ndv4:15226:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.519274] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.519279] [ndv4:15226:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.519346] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.519352] [ndv4:15226:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.519715] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.519722] [ndv4:15226:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.520196] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.520202] [ndv4:15226:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.519059] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.519075] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.519078] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.519137] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.519661] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.519668] [ndv4:14949:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.519701] [ndv4:14949:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465529.519706] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.519713] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x10d6590 [id=94 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.519746] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 94 events 0x1 mode thread_spinlock | |
[1650465529.520165] [ndv4:14949:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.519534] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.519556] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.519559] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.519670] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.520151] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.520159] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.520200] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465529.520203] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.520216] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x20f1c90 [id=108 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.520248] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 108 events 0x1 mode thread_spinlock | |
[1650465529.520834] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.520841] [ndv4:15226:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.520926] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.520931] [ndv4:15226:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.521161] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.521166] [ndv4:15226:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.521002] [ndv4:14205:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.521807] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.521814] [ndv4:15226:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.521883] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.521887] [ndv4:15226:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.522279] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.522284] [ndv4:15226:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.522723] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.522730] [ndv4:15226:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.523001] [ndv4:15226:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.523005] [ndv4:15226:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.523341] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2cfe640 [id=81 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.523381] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 81 events 0x0 mode thread_spinlock | |
[1650465529.525389] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.525761] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.526339] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.526402] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.527225] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.528967] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.529005] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.529009] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.529027] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.528351] [ndv4:14949:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x2428010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc48d | |
[1650465529.529442] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.529451] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.530078] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2cfe680 [id=83 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530107] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 83 events 0x0 mode thread_spinlock | |
[1650465529.530189] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2cfe6c0 [id=84 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530213] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 84 events 0x0 mode thread_spinlock | |
[1650465529.530233] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2cfe730 [id=86 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530257] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 86 events 0x0 mode thread_spinlock | |
[1650465529.530306] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2cfe770 [id=90 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530327] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 90 events 0x0 mode thread_spinlock | |
[1650465529.530343] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x3a1fec0 [id=91 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530365] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 91 events 0x0 mode thread_spinlock | |
[1650465529.530381] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x3a1ff00 [id=93 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530407] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 93 events 0x0 mode thread_spinlock | |
[1650465529.530444] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x3a1ff40 [id=97 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530470] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 97 events 0x0 mode thread_spinlock | |
[1650465529.530486] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x3a1ff80 [id=98 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530508] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 98 events 0x0 mode thread_spinlock | |
[1650465529.530523] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2dd9b00 [id=100 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530543] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 100 events 0x0 mode thread_spinlock | |
[1650465529.530672] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2dd9b40 [id=104 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530707] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 104 events 0x0 mode thread_spinlock | |
[1650465529.530724] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2dd9b80 [id=105 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530751] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 105 events 0x0 mode thread_spinlock | |
[1650465529.530767] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2dd9bc0 [id=107 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530792] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 107 events 0x0 mode thread_spinlock | |
[1650465529.530829] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2dd9c00 [id=111 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530850] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 111 events 0x0 mode thread_spinlock | |
[1650465529.530867] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2dd9c40 [id=112 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530891] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 112 events 0x0 mode thread_spinlock | |
[1650465529.530905] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2dd9c80 [id=114 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530963] [ndv4:14205:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4275050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc44e | |
[1650465529.530331] [ndv4:12741:0] ib_iface.c:994 UCX DEBUG iface=0x2e63670: created RC QP 0xde78 on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.530939] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 114 events 0x0 mode thread_spinlock | |
[1650465529.530975] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x2dd9cc0 [id=118 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.530999] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 118 events 0x0 mode thread_spinlock | |
[1650465529.531015] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x57acdd0 [id=119 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.531034] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 119 events 0x0 mode thread_spinlock | |
[1650465529.531048] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x57ace10 [id=121 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.531074] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 121 events 0x0 mode thread_spinlock | |
[1650465529.531110] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x57ace50 [id=125 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.531133] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 125 events 0x0 mode thread_spinlock | |
[1650465529.531149] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x57ace90 [id=126 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.531171] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 126 events 0x0 mode thread_spinlock | |
[1650465529.531187] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x57aced0 [id=128 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.531211] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 128 events 0x0 mode thread_spinlock | |
[1650465529.531248] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x57acf10 [id=132 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.531266] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 132 events 0x0 mode thread_spinlock | |
[1650465529.531281] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x57acf50 [id=133 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.531299] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 133 events 0x0 mode thread_spinlock | |
[1650465529.531314] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x57acf90 [id=135 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465529.531332] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 135 events 0x0 mode thread_spinlock | |
[1650465529.531750] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.531844] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.531856] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2afdb4629008 of 151544 bytes with 1052 elements | |
[1650465529.532635] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.532645] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.533180] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.534719] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.535101] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x36fa420: created UD QP 0xc46d on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.534869] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.535121] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.535152] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ad4b7f9f008 of 151544 bytes with 1052 elements | |
[1650465529.535705] [ndv4:15226:0] async.c:228 UCX DEBUG added async handler 0x5887af0 [id=139 ref 1] uct_rdmacm_cm_event_handler() to hash | |
[1650465529.535769] [ndv4:15226:0] async.c:506 UCX DEBUG listening to async event fd 139 events 0x1 mode thread_spinlock | |
[1650465529.535797] [ndv4:15226:0] rdmacm_cm.c:922 UCX DEBUG created rdmacm_cm 0x58874f0 with event_channel 0x57acfd0 (fd=139) | |
[1650465529.535861] [ndv4:15226:0] tcp_sockcm.c:186 UCX DEBUG created tcp_sockcm 0x38227e0 | |
[1650465529.535882] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ucp_requests: align 64, maxelems 4294967295, elemsize 440 | |
[1650465529.535886] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ucp_rkeys: align 64, maxelems 4294967295, elemsize 168 | |
[1650465529.535891] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ucp_am_bufs: align 64, maxelems 4294967295, elemsize 65624 | |
[1650465529.535894] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ucp_reg_bufs: align 64, maxelems 4294967295, elemsize 8208 | |
[1650465529.535896] [ndv4:15226:0] mpool.c:88 UCX DEBUG mpool ucp_rndv_frags: align 512, maxelems 4294967295, elemsize 524304 | |
[1650465529.535989] [ndv4:15226:0] parser.c:1893 UCX INFO UCX_* env variables: UCX_POSIX_USE_PROC_LINK=n UCX_LOG_LEVEL=debug | |
[1650465529.535733] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.536134] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.536627] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.537167] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.537683] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.535890] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdb2000000..0x2afdb4600000 on mlx5_ib3 lkey 0x80a00 rkey 0x80a00 access 0xf flags 0x3e4 | |
[1650465529.535905] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2afdb2000018 of 39845864 bytes with 4752 elements | |
[1650465529.536049] [ndv4:14205:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4275050 | |
[1650465529.536081] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[23]=0x4275050 using dc_mlx5/mlx5_ib3:1 on worker 0x29ca770 | |
[1650465529.536534] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.536728] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.537153] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.537372] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.538434] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.538210] [ndv4:13552:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b265bf47000 length 12288 | |
[1650465529.538297] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465529.539998] [ndv4:13552:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465529.540011] [ndv4:13552:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b2660ec7000 length 4296704 | |
[1650465529.540018] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b2660ec7018 of 4296680 bytes with 512 elements | |
[1650465529.540305] [ndv4:13552:0] mm_iface.c:600 UCX DEBUG created mm iface 0x1b36d10 FIFO id 0x400000002adac201 va 0x2b265bf47000 size 12288 (128 x 64 elems) | |
[1650465529.538161] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7507e0d000..0x2b7507e92000 on mlx5_ib1 lkey 0x81200 rkey 0x81200 access 0xf flags 0x3e4 | |
[1650465529.538168] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7507e0d018 of 544744 bytes with 128 elements | |
[1650465529.538173] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.538455] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x36fa420: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465529.539302] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x36fa420: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465529.539735] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x36fa420: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465529.540288] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x36fa420: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465529.539134] [ndv4:14949:0] ib_md.c:812 UCX DEBUG registered memory 0x2ad4c0a00000..0x2ad4c3000000 on mlx5_ib1 lkey 0x81300 rkey 0x81300 access 0xf flags 0x3e4 | |
[1650465529.539148] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ad4c0a00018 of 39845864 bytes with 4752 elements | |
[1650465529.539300] [ndv4:14949:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x2428010 | |
[1650465529.539337] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[13]=0x2428010 using dc_mlx5/mlx5_ib1:1 on worker 0x19ac660 | |
[1650465529.539702] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.539914] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.540084] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.540118] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.541109] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.540737] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x1b36d10 using posix/memory on worker 0x24108d0 | |
[1650465529.540767] [ndv4:13552:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465529.540823] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465529.540845] [ndv4:13552:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465529.540855] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b26612e0018 of 4296680 bytes with 512 elements | |
[1650465529.541452] [ndv4:13552:0] mm_iface.c:600 UCX DEBUG created mm iface 0x1b372e0 FIFO id 0x64803f va 0x2b265bf4a000 size 12288 (128 x 64 elems) | |
[1650465529.541461] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x1b372e0 using sysv/memory on worker 0x24108d0 | |
[1650465529.541475] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465529.541478] [ndv4:13552:0] self.c:220 UCX DEBUG created self iface id 0xdd7e8e124eead7c send_size 8192 | |
[1650465529.541486] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x1b4aec0 using self/memory0 on worker 0x24108d0 | |
[1650465529.541513] [ndv4:13552:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.541519] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.541522] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.540844] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x36fa420: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465529.541408] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x36fa420: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465529.542212] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x36fa420: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465529.542346] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.542796] [ndv4:14949:0] ib_iface.c:994 UCX DEBUG iface=0x2327300: created UD QP 0xc46e on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.543226] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x36fa420: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465529.544301] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.544817] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.545178] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.545911] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.546180] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.544969] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x1b4d000 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.545015] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465529.545038] [ndv4:13552:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x1b4b820: listening for connections (fd=78) on 10.5.0.5:45569 | |
[1650465529.545957] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x1b4b820 using tcp/eth0 on worker 0x24108d0 | |
[1650465529.545981] [ndv4:13552:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.545984] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.545987] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.546274] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x1a995c0 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.546303] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465529.546307] [ndv4:13552:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x1b4bea0: listening for connections (fd=80) on 127.0.0.1:53720 | |
[1650465529.546323] [ndv4:13552:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.546329] [ndv4:13552:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.546661] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x1b4bea0 using tcp/lo on worker 0x24108d0 | |
[1650465529.546679] [ndv4:13552:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.546683] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.546685] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.546727] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x1aa7960 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.546755] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465529.546759] [ndv4:13552:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x1b33530: listening for connections (fd=82) on 172.16.1.242:37867 | |
[1650465529.544808] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x2e63670 using rc_verbs/mlx5_ib0:1 on worker 0x37328d0 | |
[1650465529.545164] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.545565] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.546149] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.546364] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.543493] [ndv4:13443:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.543502] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x24a8500 [id=95 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.543533] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 95 events 0x5 mode thread_spinlock | |
[1650465529.543556] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[14]=0x36fa420 using ud_verbs/mlx5_ib1:1 on worker 0x2d7f8d0 | |
[1650465529.543643] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.543886] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.544712] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.545089] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.545662] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.546671] [ndv4:14949:0] ib_md.c:812 UCX DEBUG registered memory 0x2ad4c30fa000..0x2ad4c317f000 on mlx5_ib1 lkey 0x80600 rkey 0x80600 access 0xf flags 0x3e4 | |
[1650465529.546678] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ad4c30fa018 of 544744 bytes with 128 elements | |
[1650465529.546682] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.546995] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.547259] [ndv4:12741:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465529.548703] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.548716] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.548720] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.548777] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.547065] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.547410] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x35c1280: created UD QP 0xc46f on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.547417] [ndv4:13443:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.548042] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.547202] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x1b33530 using tcp/ib0 on worker 0x24108d0 | |
[1650465529.547700] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.548016] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.548475] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.548863] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.549086] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.549143] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.549315] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.549537] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.550729] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7507e92000..0x2b7507f17000 on mlx5_ib1 lkey 0x80700 rkey 0x80700 access 0xf flags 0x3e4 | |
[1650465529.550741] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7507e92018 of 544744 bytes with 128 elements | |
[1650465529.550747] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.550856] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x35c1280: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465529.551164] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x35c1280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465529.549862] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.550637] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.550678] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.550682] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.550701] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.550966] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.550974] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.551637] [ndv4:13552:0] ib_iface.c:994 UCX DEBUG iface=0x1b41670: created RC QP 0xde79 on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.549257] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x37fc010 of 8176 bytes with 127 elements | |
[1650465529.549533] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.549559] [ndv4:12741:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.549668] [ndv4:12741:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465529.549675] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.552026] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x35c1280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465529.549991] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.550466] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x4451060: created UD QP 0xc444 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.551146] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.552209] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.552692] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.552860] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.552957] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.553389] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdb464e000..0x2afdb46d3000 on mlx5_ib3 lkey 0x80b00 rkey 0x80b00 access 0xf flags 0x3e4 | |
[1650465529.553396] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afdb464e018 of 544744 bytes with 128 elements | |
[1650465529.553401] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.557175] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2dc8c70 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.557212] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465529.557236] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x2e710c0 using rc_mlx5/mlx5_ib0:1 on worker 0x37328d0 | |
[1650465529.557885] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.558081] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.558390] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.558782] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.558721] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2327300: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465529.558852] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2327300: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465529.559084] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.560087] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.560096] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.560099] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.560113] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.560568] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.560627] [ndv4:12741:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.560659] [ndv4:12741:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465529.560663] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.560673] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2dc7b40 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.560699] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465529.561234] [ndv4:12741:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.567210] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x35c1280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465529.568080] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x35c1280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465529.568668] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x35c1280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465529.568827] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4451060: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465529.569541] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4451060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465529.571134] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2327300: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465529.570189] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4451060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465529.570888] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4451060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465529.571118] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4451060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465529.571418] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4451060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465529.570310] [ndv4:12741:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3a66010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdea3 | |
[1650465529.571049] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.571286] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.571308] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b7353c8b008 of 151544 bytes with 1052 elements | |
[1650465529.572697] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4451060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465529.572722] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4451060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465529.573094] [ndv4:14205:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.573112] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x3810fc0 [id=109 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.573149] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 109 events 0x5 mode thread_spinlock | |
[1650465529.573180] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[24]=0x4451060 using ud_verbs/mlx5_ib3:1 on worker 0x29ca770 | |
[1650465529.573400] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.573479] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.573985] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.574007] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.574622] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.575319] [ndv4:12741:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7358a00000..0x2b735b000000 on mlx5_ib0 lkey 0x81900 rkey 0x81900 access 0xf flags 0x3e4 | |
[1650465529.575332] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b7358a00018 of 39845864 bytes with 4752 elements | |
[1650465529.575476] [ndv4:12741:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3a66010 | |
[1650465529.575505] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x3a66010 using dc_mlx5/mlx5_ib0:1 on worker 0x37328d0 | |
[1650465529.575745] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.575811] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.576138] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.576332] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.577065] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.576542] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.576551] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.577642] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x456f460: created UD QP 0xc445 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.577691] [ndv4:14205:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.578071] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.578359] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.578940] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2327300: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465529.579078] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2327300: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465529.579558] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2327300: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465529.578897] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x1b41670 using rc_verbs/mlx5_ib0:1 on worker 0x24108d0 | |
[1650465529.579033] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.579086] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.579257] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.579649] [ndv4:12741:0] ib_iface.c:994 UCX DEBUG iface=0x2e79a30: created UD QP 0xde86 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.581016] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2327300: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465529.580314] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.581290] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.581691] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.582274] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.582351] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.581764] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2327300: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465529.582054] [ndv4:14949:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.582060] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x23270c0 [id=95 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.582092] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 95 events 0x5 mode thread_spinlock | |
[1650465529.582106] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[14]=0x2327300 using ud_verbs/mlx5_ib1:1 on worker 0x19ac660 | |
[1650465529.582774] [ndv4:12741:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7353cb0000..0x2b7353d35000 on mlx5_ib0 lkey 0x81a00 rkey 0x81a00 access 0xf flags 0x3e4 | |
[1650465529.582782] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7353cb0018 of 544744 bytes with 128 elements | |
[1650465529.582787] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.584405] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x2e79a30: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465529.585211] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x2e79a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465529.585405] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x2e79a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465529.585558] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x2e79a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465529.589830] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x35c1280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465529.590457] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x35c1280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465529.590468] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.590673] [ndv4:13443:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.590683] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x34d9be0 [id=96 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.590716] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 96 events 0x5 mode thread_spinlock | |
[1650465529.590734] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[15]=0x35c1280 using ud_mlx5/mlx5_ib1:1 on worker 0x2d7f8d0 | |
[1650465529.591169] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.591236] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.591440] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.591446] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.590814] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.590968] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.591018] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.591108] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.591556] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdb46d3000..0x2afdb4758000 on mlx5_ib3 lkey 0x80c00 rkey 0x80c00 access 0xf flags 0x3e4 | |
[1650465529.591562] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afdb46d3018 of 544744 bytes with 128 elements | |
[1650465529.591566] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.591177] [ndv4:14419:0] debug.c:1198 UCX DEBUG using signal stack 0x2b79b7fdb000 size 141824 | |
[1650465529.591264] [ndv4:14419:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465529.591284] [ndv4:14419:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b79c23f9000 | |
[1650465529.591314] [ndv4:14419:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465529.591323] [ndv4:14419:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465529.591330] [ndv4:14419:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465529.591896] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.592305] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.592329] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.592785] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.593060] [ndv4:13552:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465529.594465] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.594477] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.594555] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.594565] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.594206] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.594216] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.594220] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.594278] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.594943] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x24da010 of 8176 bytes with 127 elements | |
[1650465529.595303] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.595328] [ndv4:13552:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.595387] [ndv4:13552:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465529.595393] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.594811] [ndv4:14419:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465529.594834] [ndv4:14419:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465529.594869] [ndv4:14419:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465529.594872] [ndv4:14419:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465529.594879] [ndv4:14419:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465529.594887] [ndv4:14419:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465529.594889] [ndv4:14419:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465529.594894] [ndv4:14419:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465529.594896] [ndv4:14419:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465529.594899] [ndv4:14419:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465529.594902] [ndv4:14419:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465529.594904] [ndv4:14419:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465529.594912] [ndv4:14419:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465529.594822] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.595864] [ndv4:14419:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465529.596861] [ndv4:14419:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465529.597044] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x2e79a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465529.597071] [ndv4:14419:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465529.597720] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x2e79a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465529.598726] [ndv4:14419:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465529.598748] [ndv4:14419:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465529.598972] [ndv4:14419:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465529.599228] [ndv4:14419:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465529.599243] [ndv4:14419:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465529.599891] [ndv4:14419:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465529.604958] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.605021] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.604163] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x456f460: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465529.604326] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x456f460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465529.604367] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x456f460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465529.604383] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x456f460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465529.604410] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x456f460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465529.604429] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x456f460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465529.604468] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x456f460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465529.604900] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x456f460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465529.604907] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.604951] [ndv4:14205:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.604957] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x456ff60 [id=110 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.604993] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 110 events 0x5 mode thread_spinlock | |
[1650465529.605009] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[25]=0x456f460 using ud_mlx5/mlx5_ib3:1 on worker 0x29ca770 | |
[1650465529.605039] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.605046] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.605181] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.605196] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.605257] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x1aa6c70 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.605299] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465529.605324] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x1b4f0c0 using rc_mlx5/mlx5_ib0:1 on worker 0x24108d0 | |
[1650465529.605390] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.605404] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.605496] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.605501] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.605607] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.605812] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.606848] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.606859] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.606862] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.606880] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.607350] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.607358] [ndv4:13552:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.607399] [ndv4:13552:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465529.607403] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.607413] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x1aa5b40 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.607439] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465529.607810] [ndv4:14419:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.608037] [ndv4:13552:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.608270] [ndv4:14419:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465529.608921] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.609419] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x2e79a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465529.609902] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x2e79a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465529.609410] [ndv4:14949:0] ib_iface.c:994 UCX DEBUG iface=0x2293070: created UD QP 0xc471 on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.609418] [ndv4:14949:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.610217] [ndv4:12741:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.610198] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.610669] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.610676] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.610693] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.610707] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.611248] [ndv4:14949:0] ib_md.c:812 UCX DEBUG registered memory 0x2ad4c317f000..0x2ad4c3204000 on mlx5_ib1 lkey 0x81400 rkey 0x81400 access 0xf flags 0x3e4 | |
[1650465529.611255] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ad4c317f018 of 544744 bytes with 128 elements | |
[1650465529.611260] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.612442] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2293070: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465529.612887] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2293070: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465529.616305] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2db92f0 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.616343] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 mode thread_spinlock | |
[1650465529.616365] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x2e79a30 using ud_verbs/mlx5_ib0:1 on worker 0x37328d0 | |
[1650465529.616555] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.616567] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.616728] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.616735] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.617019] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.618013] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.618377] [ndv4:12741:0] ib_iface.c:994 UCX DEBUG iface=0x3752750: created UD QP 0xde8f on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.618390] [ndv4:12741:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.619162] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.619181] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.619185] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.619240] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.619144] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.619175] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.619181] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.619215] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.619220] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.619693] [ndv4:12741:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7353d35000..0x2b7353dba000 on mlx5_ib0 lkey 0x81b00 rkey 0x81b00 access 0xf flags 0x3e4 | |
[1650465529.619699] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7353d35018 of 544744 bytes with 128 elements | |
[1650465529.619703] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.620050] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3752750: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465529.620082] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3752750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465529.620545] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3752750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465529.618804] [ndv4:13552:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x2744010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdeaf | |
[1650465529.619008] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.619023] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.619046] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b265bf4f008 of 151544 bytes with 1052 elements | |
[1650465529.619872] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.619878] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.619943] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.619965] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.619969] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.620020] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.620727] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.620732] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.621033] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x3bea0a0: created RC QP 0xc4b5 on mlx5_ib2:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.620846] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3752750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465529.621670] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x466f0a0: created RC QP 0xc4a6 on mlx5_ib4:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.623079] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2293070: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465529.622399] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[26]=0x466f0a0 using rc_verbs/mlx5_ib4:1 on worker 0x29ca770 | |
[1650465529.622553] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.622625] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.622026] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[16]=0x3bea0a0 using rc_verbs/mlx5_ib2:1 on worker 0x2d7f8d0 | |
[1650465529.622134] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.622144] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.622518] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.622549] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.623306] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.624034] [ndv4:13552:0] ib_md.c:812 UCX DEBUG registered memory 0x2b2661800000..0x2b2663e00000 on mlx5_ib0 lkey 0x81c00 rkey 0x81c00 access 0xf flags 0x3e4 | |
[1650465529.624062] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b2661800018 of 39845864 bytes with 4752 elements | |
[1650465529.624204] [ndv4:13552:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x2744010 | |
[1650465529.624241] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x2744010 using dc_mlx5/mlx5_ib0:1 on worker 0x24108d0 | |
[1650465529.624268] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.624278] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.624535] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.624540] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.624714] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.624731] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.624735] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.624785] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.625224] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3f0b010 of 8176 bytes with 127 elements | |
[1650465529.625646] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.625652] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.625687] [ndv4:13443:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465529.625692] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.625702] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x2e53440 [id=99 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.625730] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 99 events 0x1 mode thread_spinlock | |
[1650465529.625752] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[17]=0x3d37030 using rc_mlx5/mlx5_ib2:1 on worker 0x2d7f8d0 | |
[1650465529.624836] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2293070: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465529.624861] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2293070: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465529.625243] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2293070: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465529.625457] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2293070: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465529.625486] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2293070: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465529.625492] [ndv4:14949:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.625535] [ndv4:14949:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.625541] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x2106c50 [id=96 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.625571] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 96 events 0x5 mode thread_spinlock | |
[1650465529.625665] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[15]=0x2293070 using ud_mlx5/mlx5_ib1:1 on worker 0x19ac660 | |
[1650465529.625680] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.625703] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.626346] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.626351] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.626662] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.626670] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.626936] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.626971] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.627969] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.628651] [ndv4:14421:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465529.628660] [ndv4:14421:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465529.628766] [ndv4:14421:0] ucp_context.c:1556 UCX DEBUG created ucp context 0xf18cd0 0xf18cd0 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465529.629849] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.629867] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.629870] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.629938] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.630386] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3752750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465529.631506] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3752750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465529.631535] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3752750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465529.630345] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.630352] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.630387] [ndv4:13443:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465529.630391] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.630398] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x3d3ff60 [id=101 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.630427] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 101 events 0x1 mode thread_spinlock | |
[1650465529.631061] [ndv4:13443:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.633960] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.633984] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.634416] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.635688] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.635704] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.635707] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.635761] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.635698] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.636362] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4990010 of 8176 bytes with 127 elements | |
[1650465529.636274] [ndv4:14419:0] async.c:228 UCX DEBUG added async handler 0x8d61e0 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.636390] [ndv4:14419:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465529.636399] [ndv4:14419:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465529.636469] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.636610] [ndv4:14419:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.636620] [ndv4:14419:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.636642] [ndv4:14419:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.636702] [ndv4:14419:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465529.636723] [ndv4:14419:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465529.636523] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.637009] [ndv4:13552:0] ib_iface.c:994 UCX DEBUG iface=0x1b57a30: created UD QP 0xde93 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.636678] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.636685] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.636724] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465529.636728] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.636740] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x4677f60 [id=113 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.636769] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 113 events 0x1 mode thread_spinlock | |
[1650465529.636781] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[27]=0x47bc030 using rc_mlx5/mlx5_ib4:1 on worker 0x29ca770 | |
[1650465529.637035] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.637053] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.637221] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.637232] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.637744] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.637796] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.637803] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.638355] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.638393] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.638274] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.638292] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.638309] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.638361] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.637879] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.638710] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.638717] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.639343] [ndv4:14949:0] ib_iface.c:994 UCX DEBUG iface=0x28170a0: created RC QP 0xc4c1 on mlx5_ib2:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.638785] [ndv4:13552:0] ib_md.c:812 UCX DEBUG registered memory 0x2b265bf74000..0x2b265bff9000 on mlx5_ib0 lkey 0x81d00 rkey 0x81d00 access 0xf flags 0x3e4 | |
[1650465529.638792] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b265bf74018 of 544744 bytes with 128 elements | |
[1650465529.638797] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.639410] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x1b57a30: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465529.639478] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.639495] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.639498] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.639556] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.639880] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[16]=0x28170a0 using rc_verbs/mlx5_ib2:1 on worker 0x19ac660 | |
[1650465529.639927] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.639963] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.640064] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.640077] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.640029] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.640036] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.640069] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465529.640072] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.640080] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x3762580 [id=115 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.640107] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 115 events 0x1 mode thread_spinlock | |
[1650465529.639570] [ndv4:13443:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3f0d050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc4cb | |
[1650465529.639884] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.639955] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.639965] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b7507f19008 of 151544 bytes with 1052 elements | |
[1650465529.640888] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.640828] [ndv4:14205:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.641936] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.641950] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.641953] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.642005] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.642387] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x2b38010 of 8176 bytes with 127 elements | |
[1650465529.642690] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.642697] [ndv4:14949:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.642736] [ndv4:14949:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465529.642739] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.642749] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x10d4e50 [id=99 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.642786] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 99 events 0x1 mode thread_spinlock | |
[1650465529.642812] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[17]=0x2964030 using rc_mlx5/mlx5_ib2:1 on worker 0x19ac660 | |
[1650465529.642938] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.642945] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.643112] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.643129] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.642501] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3752750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465529.642512] [ndv4:12741:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.642621] [ndv4:12741:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.642626] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2dc76c0 [id=89 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.642659] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 89 events 0x5 mode thread_spinlock | |
[1650465529.642676] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[10]=0x3752750 using ud_mlx5/mlx5_ib0:1 on worker 0x37328d0 | |
[1650465529.642905] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.642926] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.643103] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.643129] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.643483] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.643897] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.643886] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7511600000..0x2b7513c00000 on mlx5_ib2 lkey 0x81300 rkey 0x81300 access 0xf flags 0x3e4 | |
[1650465529.643911] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b7511600018 of 39845864 bytes with 4752 elements | |
[1650465529.644051] [ndv4:13443:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3f0d050 | |
[1650465529.644087] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[18]=0x3f0d050 using dc_mlx5/mlx5_ib2:1 on worker 0x2d7f8d0 | |
[1650465529.644269] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.644292] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.644523] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.644552] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.644989] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.644998] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.645001] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.645019] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.645483] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.645523] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.646339] [ndv4:12741:0] ib_iface.c:994 UCX DEBUG iface=0x4003050: created RC QP 0xc474 on mlx5_ib1:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.647845] [ndv4:14419:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.647857] [ndv4:14419:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.647116] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[11]=0x4003050 using rc_verbs/mlx5_ib1:1 on worker 0x37328d0 | |
[1650465529.647536] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.647570] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.647865] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.647871] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.648063] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.649392] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.649401] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.649404] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.649422] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.649863] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4016010 of 8176 bytes with 127 elements | |
[1650465529.650089] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.650095] [ndv4:12741:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.650133] [ndv4:12741:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465529.650136] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.650145] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2e5af70 [id=92 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.650167] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 92 events 0x1 mode thread_spinlock | |
[1650465529.650176] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[12]=0x3e84010 using rc_mlx5/mlx5_ib1:1 on worker 0x37328d0 | |
[1650465529.650297] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.650311] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.650443] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.650509] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.649357] [ndv4:14419:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.649428] [ndv4:14419:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.649445] [ndv4:14419:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.649322] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x1b57a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465529.649895] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x1b57a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465529.650229] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x1b57a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465529.650489] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x1b57a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465529.651075] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.651249] [ndv4:14419:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465529.651257] [ndv4:14419:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465529.650501] [ndv4:14205:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4992050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc4b9 | |
[1650465529.650745] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.650833] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.650842] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2afdb6f5a008 of 151544 bytes with 1052 elements | |
[1650465529.652681] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.652713] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.652717] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.652778] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.653215] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.653230] [ndv4:12741:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.653305] [ndv4:12741:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465529.653310] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.653329] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2e6cdd0 [id=94 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.653369] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 94 events 0x1 mode thread_spinlock | |
[1650465529.653935] [ndv4:12741:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.655249] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.654912] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdb4800000..0x2afdb6e00000 on mlx5_ib4 lkey 0x80600 rkey 0x80600 access 0xf flags 0x3e4 | |
[1650465529.654925] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2afdb4800018 of 39845864 bytes with 4752 elements | |
[1650465529.655064] [ndv4:14205:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4992050 | |
[1650465529.655097] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[28]=0x4992050 using dc_mlx5/mlx5_ib4:1 on worker 0x29ca770 | |
[1650465529.655263] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.655273] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.655698] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.655718] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.655721] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.655778] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.656186] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.656193] [ndv4:14949:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.656231] [ndv4:14949:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465529.656236] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.656245] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x296ce80 [id=101 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.656272] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 101 events 0x1 mode thread_spinlock | |
[1650465529.656378] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.656728] [ndv4:14949:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.656826] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x40e9060: created UD QP 0xc4c2 on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.657486] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.657593] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.657600] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.657638] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.657643] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.658054] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7507f3e000..0x2b7507fc3000 on mlx5_ib2 lkey 0x81400 rkey 0x81400 access 0xf flags 0x3e4 | |
[1650465529.658060] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7507f3e018 of 544744 bytes with 128 elements | |
[1650465529.658065] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.658307] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x40e9060: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465529.658715] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x40e9060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465529.658907] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x40e9060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465529.658989] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x40e9060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465529.659120] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x40e9060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465529.659227] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x40e9060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465529.659252] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x40e9060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465529.659276] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x40e9060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465529.659658] [ndv4:13443:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.659669] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x24b9630 [id=102 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.659706] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 102 events 0x5 mode thread_spinlock | |
[1650465529.659718] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[19]=0x40e9060 using ud_verbs/mlx5_ib2:1 on worker 0x2d7f8d0 | |
[1650465529.659729] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.659734] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.659832] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.659837] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.660686] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x1b57a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465529.660962] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x1b57a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465529.661278] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x1b57a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465529.661536] [ndv4:13552:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.661629] [ndv4:14419:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465529.661638] [ndv4:14419:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465529.664303] [ndv4:12741:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x41ae010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc49d | |
[1650465529.664743] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.664751] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.664762] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b7353dbc008 of 151544 bytes with 1052 elements | |
[1650465529.664191] [ndv4:14949:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x2b3a050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc4d7 | |
[1650465529.664647] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.664656] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.664667] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ad4b7fc6008 of 151544 bytes with 1052 elements | |
[1650465529.664891] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.664903] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.665412] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.666503] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.666935] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x4b6e250: created UD QP 0xc4af on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.667541] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.667859] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.667875] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.667904] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.667920] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.667467] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x1a972f0 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.667506] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 mode thread_spinlock | |
[1650465529.667532] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x1b57a30 using ud_verbs/mlx5_ib0:1 on worker 0x24108d0 | |
[1650465529.668358] [ndv4:14949:0] ib_md.c:812 UCX DEBUG registered memory 0x2ad4c3400000..0x2ad4c5a00000 on mlx5_ib2 lkey 0x81500 rkey 0x81500 access 0xf flags 0x3e4 | |
[1650465529.668378] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ad4c3400018 of 39845864 bytes with 4752 elements | |
[1650465529.668521] [ndv4:14949:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x2b3a050 | |
[1650465529.668558] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[18]=0x2b3a050 using dc_mlx5/mlx5_ib2:1 on worker 0x19ac660 | |
[1650465529.668766] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.668777] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.668907] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.668914] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.669493] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.668389] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdb6f7f000..0x2afdb7004000 on mlx5_ib4 lkey 0x80700 rkey 0x80700 access 0xf flags 0x3e4 | |
[1650465529.668397] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afdb6f7f018 of 544744 bytes with 128 elements | |
[1650465529.668402] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.668944] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4b6e250: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465529.668971] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4b6e250: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465529.669191] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4b6e250: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465529.669246] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4b6e250: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465529.669319] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4b6e250: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465529.669731] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4b6e250: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465529.670266] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4b6e250: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465529.669328] [ndv4:12741:0] ib_md.c:812 UCX DEBUG registered memory 0x2b735b200000..0x2b735d800000 on mlx5_ib1 lkey 0x81500 rkey 0x81500 access 0xf flags 0x3e4 | |
[1650465529.669343] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b735b200018 of 39845864 bytes with 4752 elements | |
[1650465529.669483] [ndv4:12741:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x41ae010 | |
[1650465529.669514] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[13]=0x41ae010 using dc_mlx5/mlx5_ib1:1 on worker 0x37328d0 | |
[1650465529.669540] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.669547] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.670343] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.670870] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4b6e250: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465529.671502] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.671832] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x24269c0: created UD QP 0xc4ce on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.671839] [ndv4:13443:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.671200] [ndv4:14205:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.671208] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x3e7f3f0 [id=116 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.671236] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 116 events 0x5 mode thread_spinlock | |
[1650465529.671256] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[29]=0x4b6e250 using ud_verbs/mlx5_ib4:1 on worker 0x29ca770 | |
[1650465529.671340] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.671347] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.671429] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.671434] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.671966] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.672548] [ndv4:14419:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465529.672666] [ndv4:14419:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465529.672366] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.672622] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.672629] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.672754] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.672814] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.673180] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7513c1c000..0x2b7513ca1000 on mlx5_ib2 lkey 0x81600 rkey 0x81600 access 0xf flags 0x3e4 | |
[1650465529.673192] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7513c1c018 of 544744 bytes with 128 elements | |
[1650465529.673198] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.673434] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24269c0: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465529.673616] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24269c0: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465529.673761] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24269c0: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465529.677598] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.677621] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.677683] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.677694] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.679026] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.679036] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.679988] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.679895] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.680236] [ndv4:14949:0] ib_iface.c:994 UCX DEBUG iface=0x2d16060: created UD QP 0xc4cf on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.680817] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.681442] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.681450] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.681471] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.681475] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.681678] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.682500] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.682997] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x4c8c050: created UD QP 0xc4b0 on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.683004] [ndv4:14205:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.682020] [ndv4:12741:0] ib_iface.c:994 UCX DEBUG iface=0x40ad3d0: created UD QP 0xc47d on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.681931] [ndv4:14949:0] ib_md.c:812 UCX DEBUG registered memory 0x2ad4c5a05000..0x2ad4c5a8a000 on mlx5_ib2 lkey 0x81700 rkey 0x81700 access 0xf flags 0x3e4 | |
[1650465529.681938] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ad4c5a05018 of 544744 bytes with 128 elements | |
[1650465529.681942] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.682782] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2d16060: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465529.683820] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.683956] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24269c0: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465529.684737] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24269c0: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465529.683534] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.683808] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.683827] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.684088] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.684093] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.684510] [ndv4:12741:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7353de1000..0x2b7353e66000 on mlx5_ib1 lkey 0x81600 rkey 0x81600 access 0xf flags 0x3e4 | |
[1650465529.684516] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7353de1018 of 544744 bytes with 128 elements | |
[1650465529.684520] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.684834] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x40ad3d0: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465529.685888] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x40ad3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465529.685745] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24269c0: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465529.685776] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24269c0: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465529.687226] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.688613] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.689076] [ndv4:13552:0] ib_iface.c:994 UCX DEBUG iface=0x2430750: created UD QP 0xde94 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.689091] [ndv4:13552:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.689826] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.689965] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.689973] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.690059] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.690068] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.690550] [ndv4:13552:0] ib_md.c:812 UCX DEBUG registered memory 0x2b2663efa000..0x2b2663f7f000 on mlx5_ib0 lkey 0x81e00 rkey 0x81e00 access 0xf flags 0x3e4 | |
[1650465529.690557] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b2663efa018 of 544744 bytes with 128 elements | |
[1650465529.690561] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.690828] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2430750: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465529.691012] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2430750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465529.691131] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2430750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465529.691417] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2430750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465529.692231] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.692242] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.692344] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.692350] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.692111] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2d16060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465529.692743] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdb7004000..0x2afdb7089000 on mlx5_ib4 lkey 0x80e00 rkey 0x80e00 access 0xf flags 0x3e4 | |
[1650465529.692752] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afdb7004018 of 544744 bytes with 128 elements | |
[1650465529.692757] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.693168] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4c8c050: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465529.693376] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4c8c050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465529.693433] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4c8c050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465529.693518] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4c8c050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465529.693757] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4c8c050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465529.693825] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4c8c050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465529.693984] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4c8c050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465529.694164] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x4c8c050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465529.694171] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.694205] [ndv4:14205:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.694210] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x464aa10 [id=117 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.694235] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 117 events 0x5 mode thread_spinlock | |
[1650465529.694248] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[30]=0x4c8c050 using ud_mlx5/mlx5_ib4:1 on worker 0x29ca770 | |
[1650465529.694312] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.694318] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.694396] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.694415] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.694807] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x24269c0: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465529.694819] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.694852] [ndv4:13443:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.694858] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x36666b0 [id=103 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.694879] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 103 events 0x5 mode thread_spinlock | |
[1650465529.694890] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[20]=0x24269c0 using ud_mlx5/mlx5_ib2:1 on worker 0x2d7f8d0 | |
[1650465529.694907] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.694914] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.695058] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.695068] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.694864] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.694737] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x40ad3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465529.695412] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.696087] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.696104] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.696108] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.696159] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.696551] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.696568] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.696571] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.696643] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.696864] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.696870] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.697175] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.697183] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.697794] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x4d8c0a0: created RC QP 0xc443 on mlx5_ib5:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.698112] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x43070a0: created RC QP 0xc446 on mlx5_ib3:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.699010] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[21]=0x43070a0 using rc_verbs/mlx5_ib3:1 on worker 0x2d7f8d0 | |
[1650465529.699207] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.699215] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.699284] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.699288] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.698610] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[31]=0x4d8c0a0 using rc_verbs/mlx5_ib5:1 on worker 0x29ca770 | |
[1650465529.698858] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.698866] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.699042] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.699047] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.700142] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.700188] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.700792] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2430750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465529.701175] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2430750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465529.702115] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2d16060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465529.702071] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.702085] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.702088] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.702137] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.702697] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x50ad010 of 8176 bytes with 127 elements | |
[1650465529.701906] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2430750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465529.702356] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2430750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465529.702364] [ndv4:13552:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.702399] [ndv4:13552:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.702404] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x1aa56c0 [id=89 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.702430] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 89 events 0x5 mode thread_spinlock | |
[1650465529.702445] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[10]=0x2430750 using ud_mlx5/mlx5_ib0:1 on worker 0x24108d0 | |
[1650465529.702656] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.702663] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.702981] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.702988] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.703023] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465529.703026] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.703039] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x3e7f740 [id=120 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.703061] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 120 events 0x1 mode thread_spinlock | |
[1650465529.703072] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[32]=0x4ed9030 using rc_mlx5/mlx5_ib5:1 on worker 0x29ca770 | |
[1650465529.703192] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.703199] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.703363] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.703368] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.704194] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x40ad3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465529.704496] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x40ad3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465529.704997] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x40ad3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465529.705356] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x40ad3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465529.705393] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x40ad3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465529.705710] [ndv4:12741:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.705718] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2e612b0 [id=95 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.705748] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 95 events 0x5 mode thread_spinlock | |
[1650465529.705760] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[14]=0x40ad3d0 using ud_verbs/mlx5_ib1:1 on worker 0x37328d0 | |
[1650465529.705876] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.705883] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.706018] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.706025] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.708504] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.708519] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.708522] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.708629] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.709075] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4628010 of 8176 bytes with 127 elements | |
[1650465529.710302] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.710315] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.711642] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.709988] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2d16060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465529.710604] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2d16060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465529.711313] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2d16060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465529.712063] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2d16060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465529.712729] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2d16060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465529.709360] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.709367] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.709403] [ndv4:13443:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465529.709407] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.709418] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x4207ae0 [id=106 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.709446] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 106 events 0x1 mode thread_spinlock | |
[1650465529.709456] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[22]=0x4454030 using rc_mlx5/mlx5_ib3:1 on worker 0x2d7f8d0 | |
[1650465529.709530] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.709537] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.709696] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.709702] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.710496] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.711468] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.712938] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.712954] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.712957] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.713015] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.712838] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.712848] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.712851] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.712868] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.712988] [ndv4:14949:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.712997] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x1ea4f00 [id=102 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.713032] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 102 events 0x5 mode thread_spinlock | |
[1650465529.713043] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[19]=0x2d16060 using ud_verbs/mlx5_ib2:1 on worker 0x19ac660 | |
[1650465529.713143] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.713149] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.713254] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.713259] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.713209] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.713215] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.713458] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.713467] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.713501] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465529.713505] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.713514] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x4ee1fc0 [id=122 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.713542] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 122 events 0x1 mode thread_spinlock | |
[1650465529.713863] [ndv4:13552:0] ib_iface.c:994 UCX DEBUG iface=0x2ce1050: created RC QP 0xc47f on mlx5_ib1:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.714298] [ndv4:14205:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.714653] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[11]=0x2ce1050 using rc_verbs/mlx5_ib1:1 on worker 0x24108d0 | |
[1650465529.714696] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.714702] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.714839] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.714844] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.715153] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.716539] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.716549] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.716552] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.716570] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.717064] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x2cf4010 of 8176 bytes with 127 elements | |
[1650465529.717277] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.717282] [ndv4:13552:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.717321] [ndv4:13552:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465529.717325] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.717336] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x1b38f70 [id=92 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.717364] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 92 events 0x1 mode thread_spinlock | |
[1650465529.717374] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[12]=0x2b62010 using rc_mlx5/mlx5_ib1:1 on worker 0x24108d0 | |
[1650465529.717415] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.717422] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.717527] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.717533] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.719325] [ndv4:14419:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465529.719337] [ndv4:14419:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465529.720835] [ndv4:14419:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465529.720842] [ndv4:14419:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465529.721162] [ndv4:14419:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.722361] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.722381] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.722384] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.722440] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.722760] [ndv4:14205:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x50af050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc456 | |
[1650465529.723010] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.723017] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.723026] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2afdb988c008 of 151544 bytes with 1052 elements | |
[1650465529.722900] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.722908] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.722940] [ndv4:13443:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465529.722944] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.722951] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x24a3f40 [id=108 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.722974] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 108 events 0x1 mode thread_spinlock | |
[1650465529.723586] [ndv4:13443:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.724718] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.725714] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.726005] [ndv4:14949:0] ib_iface.c:994 UCX DEBUG iface=0x2e34050: created UD QP 0xc4d2 on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.726013] [ndv4:14949:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.726322] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.726918] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdb7200000..0x2afdb9800000 on mlx5_ib5 lkey 0x80e00 rkey 0x80e00 access 0xf flags 0x3e4 | |
[1650465529.726935] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2afdb7200018 of 39845864 bytes with 4752 elements | |
[1650465529.727121] [ndv4:14205:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x50af050 | |
[1650465529.727155] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[33]=0x50af050 using dc_mlx5/mlx5_ib5:1 on worker 0x29ca770 | |
[1650465529.726691] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.726782] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.726788] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.726815] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.726820] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.727269] [ndv4:14949:0] ib_md.c:812 UCX DEBUG registered memory 0x2ad4c5a8a000..0x2ad4c5b0f000 on mlx5_ib2 lkey 0x81800 rkey 0x81800 access 0xf flags 0x3e4 | |
[1650465529.727276] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ad4c5a8a018 of 544744 bytes with 128 elements | |
[1650465529.727280] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.727552] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2e34050: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465529.727304] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.727807] [ndv4:12741:0] ib_iface.c:994 UCX DEBUG iface=0x3f74280: created UD QP 0xc483 on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.727816] [ndv4:12741:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.728684] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.728834] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.728844] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.728932] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.728937] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.729436] [ndv4:12741:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7353e66000..0x2b7353eeb000 on mlx5_ib1 lkey 0x81700 rkey 0x81700 access 0xf flags 0x3e4 | |
[1650465529.729443] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7353e66018 of 544744 bytes with 128 elements | |
[1650465529.729447] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.729803] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3f74280: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465529.729892] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3f74280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465529.730106] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.731425] [ndv4:13443:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x462a050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc45c | |
[1650465529.731490] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.731497] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.731506] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b7507fc5008 of 151544 bytes with 1052 elements | |
[1650465529.731392] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.731412] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.731415] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.731472] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.731998] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.732005] [ndv4:13552:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.732038] [ndv4:13552:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465529.732042] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.732050] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x1b4add0 [id=94 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.732074] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 94 events 0x1 mode thread_spinlock | |
[1650465529.732518] [ndv4:13552:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.735331] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7513e00000..0x2b7516400000 on mlx5_ib3 lkey 0x80d00 rkey 0x80d00 access 0xf flags 0x3e4 | |
[1650465529.735347] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b7513e00018 of 39845864 bytes with 4752 elements | |
[1650465529.735487] [ndv4:13443:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x462a050 | |
[1650465529.735522] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[23]=0x462a050 using dc_mlx5/mlx5_ib3:1 on worker 0x2d7f8d0 | |
[1650465529.735817] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.735828] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.735895] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.735900] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.736289] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.737734] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.738173] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x4806060: created UD QP 0xc452 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.738815] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.738958] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.738966] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.738998] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.739002] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.739934] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.739966] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.740076] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.740081] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.740487] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.739494] [ndv4:13552:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x2e8c010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc4ab | |
[1650465529.739612] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.739620] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.739629] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b2666780008 of 151544 bytes with 1052 elements | |
[1650465529.739377] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b75164a2000..0x2b7516527000 on mlx5_ib3 lkey 0x80e00 rkey 0x80e00 access 0xf flags 0x3e4 | |
[1650465529.739384] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b75164a2018 of 544744 bytes with 128 elements | |
[1650465529.739388] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.739548] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4806060: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465529.739927] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4806060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465529.740146] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4806060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465529.740324] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4806060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465529.740470] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4806060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465529.740650] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4806060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465529.740710] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4806060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465529.740747] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4806060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465529.740958] [ndv4:13443:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.740968] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x3bc5e10 [id=109 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.740993] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 109 events 0x5 mode thread_spinlock | |
[1650465529.741005] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[24]=0x4806060 using ud_verbs/mlx5_ib3:1 on worker 0x2d7f8d0 | |
[1650465529.741024] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.741031] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.741081] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.741085] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.741243] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.742316] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.742711] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x4924460: created UD QP 0xc453 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.742719] [ndv4:13443:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.742814] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3f74280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465529.743060] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3f74280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465529.743251] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3f74280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465529.743487] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3f74280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465529.743548] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3f74280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465529.743684] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x3f74280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465529.743693] [ndv4:12741:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.743734] [ndv4:12741:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.743742] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x3e8cc30 [id=96 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.743772] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 96 events 0x5 mode thread_spinlock | |
[1650465529.743793] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[15]=0x3f74280 using ud_mlx5/mlx5_ib1:1 on worker 0x37328d0 | |
[1650465529.743873] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.743880] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.743940] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.743944] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.744302] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.745319] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.745335] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.745338] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.745390] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.743195] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.743325] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.743331] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.743342] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.743346] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.743709] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7516527000..0x2b75165ac000 on mlx5_ib3 lkey 0x80f00 rkey 0x80f00 access 0xf flags 0x3e4 | |
[1650465529.743714] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7516527018 of 544744 bytes with 128 elements | |
[1650465529.743718] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.744200] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4924460: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465529.744557] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4924460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465529.744597] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4924460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465529.744611] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4924460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465529.744624] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4924460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465529.744637] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4924460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465529.744651] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4924460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465529.744868] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4924460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465529.744874] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.744904] [ndv4:13443:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.744909] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x24b8f70 [id=110 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.744936] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 110 events 0x5 mode thread_spinlock | |
[1650465529.744946] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[25]=0x4924460 using ud_mlx5/mlx5_ib3:1 on worker 0x2d7f8d0 | |
[1650465529.745003] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.745010] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.745085] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.745091] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.745794] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.745801] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.743283] [ndv4:13552:0] ib_md.c:812 UCX DEBUG registered memory 0x2b2664000000..0x2b2666600000 on mlx5_ib1 lkey 0x81800 rkey 0x81800 access 0xf flags 0x3e4 | |
[1650465529.743301] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b2664000018 of 39845864 bytes with 4752 elements | |
[1650465529.743443] [ndv4:13552:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x2e8c010 | |
[1650465529.743478] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[13]=0x2e8c010 using dc_mlx5/mlx5_ib1:1 on worker 0x24108d0 | |
[1650465529.743551] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.743562] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.743658] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.743663] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.743963] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.745180] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.745504] [ndv4:13552:0] ib_iface.c:994 UCX DEBUG iface=0x2d8b3d0: created UD QP 0xc48e on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.745962] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.745997] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.746475] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.746483] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.746520] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.746528] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.746478] [ndv4:12741:0] ib_iface.c:994 UCX DEBUG iface=0x459d0a0: created RC QP 0xc4d3 on mlx5_ib2:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.746926] [ndv4:13552:0] ib_md.c:812 UCX DEBUG registered memory 0x2b26667a5000..0x2b266682a000 on mlx5_ib1 lkey 0x81900 rkey 0x81900 access 0xf flags 0x3e4 | |
[1650465529.746932] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b26667a5018 of 544744 bytes with 128 elements | |
[1650465529.746936] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.747409] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.747427] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.747430] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.747482] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.747302] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[16]=0x459d0a0 using rc_verbs/mlx5_ib2:1 on worker 0x37328d0 | |
[1650465529.747483] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.747490] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.747959] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.747966] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.748819] [ndv4:14421:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2ab787e7e000 length 12288 | |
[1650465529.748907] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465529.748761] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x4a240a0: created RC QP 0xc4b1 on mlx5_ib4:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.749689] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[26]=0x4a240a0 using rc_verbs/mlx5_ib4:1 on worker 0x2d7f8d0 | |
[1650465529.749795] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.749802] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.749961] [ndv4:14421:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465529.749971] [ndv4:14421:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2ab78e454000 length 4296704 | |
[1650465529.749976] [ndv4:14421:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2ab78e454018 of 4296680 bytes with 512 elements | |
[1650465529.750229] [ndv4:14421:0] mm_iface.c:600 UCX DEBUG created mm iface 0x16ac390 FIFO id 0x400000002dc1aa42 va 0x2ab787e7e000 size 12288 (128 x 64 elems) | |
[1650465529.750282] [ndv4:14421:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x16ac390 using posix/memory on worker 0x1f71f60 | |
[1650465529.750308] [ndv4:14421:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465529.750347] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465529.750365] [ndv4:14421:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465529.750374] [ndv4:14421:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2ab78e86d018 of 4296680 bytes with 512 elements | |
[1650465529.750974] [ndv4:14421:0] mm_iface.c:600 UCX DEBUG created mm iface 0x1698370 FIFO id 0x650006 va 0x2ab787e81000 size 12288 (128 x 64 elems) | |
[1650465529.750982] [ndv4:14421:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x1698370 using sysv/memory on worker 0x1f71f60 | |
[1650465529.750994] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465529.750998] [ndv4:14421:0] self.c:220 UCX DEBUG created self iface id 0x28eaabbf9b96ab84 send_size 8192 | |
[1650465529.751004] [ndv4:14421:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x1698940 using self/memory0 on worker 0x1f71f60 | |
[1650465529.751027] [ndv4:14421:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.751032] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.751035] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.751974] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.752415] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x528b060: created UD QP 0xc44c on mlx5_ib5:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.753037] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.753160] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.753167] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.753224] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.753229] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.753593] [ndv4:14419:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.753507] [ndv4:14421:0] async.c:228 UCX DEBUG added async handler 0x169d3a0 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.753543] [ndv4:14421:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465529.753562] [ndv4:14421:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x16ace80: listening for connections (fd=78) on 10.5.0.5:52989 | |
[1650465529.753711] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdb98b1000..0x2afdb9936000 on mlx5_ib5 lkey 0x80f00 rkey 0x80f00 access 0xf flags 0x3e4 | |
[1650465529.753719] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afdb98b1018 of 544744 bytes with 128 elements | |
[1650465529.753725] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.754323] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x528b060: adding gid fe80::15:5dff:fd34:0 to hash on device mlx5_ib5 port 1 index 0) | |
[1650465529.754418] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x528b060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 1) | |
[1650465529.754439] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x528b060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 2) | |
[1650465529.754602] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x528b060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 3) | |
[1650465529.754695] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x528b060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 4) | |
[1650465529.754925] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x528b060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 5) | |
[1650465529.755039] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x528b060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 6) | |
[1650465529.755209] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x528b060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 7) | |
[1650465529.755564] [ndv4:14205:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.755571] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x4d679b0 [id=123 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.755647] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 123 events 0x5 mode thread_spinlock | |
[1650465529.755660] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[34]=0x528b060 using ud_verbs/mlx5_ib5:1 on worker 0x29ca770 | |
[1650465529.755860] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.755866] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.756027] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.756032] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.754225] [ndv4:14419:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465529.755208] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2d8b3d0: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465529.755236] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2d8b3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465529.755252] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2d8b3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465529.755267] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2d8b3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465529.755511] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2d8b3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465529.755761] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2d8b3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465529.755783] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2d8b3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465529.755799] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2d8b3d0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465529.756060] [ndv4:13552:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.756069] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x1b3f2b0 [id=95 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.756098] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 95 events 0x5 mode thread_spinlock | |
[1650465529.756109] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[14]=0x2d8b3d0 using ud_verbs/mlx5_ib1:1 on worker 0x24108d0 | |
[1650465529.756242] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.756248] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.756405] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.756410] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.753758] [ndv4:14421:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x16ace80 using tcp/eth0 on worker 0x1f71f60 | |
[1650465529.753780] [ndv4:14421:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.753784] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.753786] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.753826] [ndv4:14421:0] async.c:228 UCX DEBUG added async handler 0x1608900 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.753847] [ndv4:14421:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465529.753852] [ndv4:14421:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x16ad500: listening for connections (fd=80) on 127.0.0.1:60246 | |
[1650465529.753868] [ndv4:14421:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465529.753873] [ndv4:14421:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465529.753930] [ndv4:14421:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x16ad500 using tcp/lo on worker 0x1f71f60 | |
[1650465529.753947] [ndv4:14421:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465529.753950] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465529.753953] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465529.753991] [ndv4:14421:0] async.c:228 UCX DEBUG added async handler 0x1608630 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465529.754009] [ndv4:14421:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465529.754013] [ndv4:14421:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x1694b90: listening for connections (fd=82) on 172.16.1.242:56708 | |
[1650465529.754283] [ndv4:14421:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x1694b90 using tcp/ib0 on worker 0x1f71f60 | |
[1650465529.754312] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.754320] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.754466] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.754471] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.756170] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.757161] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.755856] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.755866] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.756462] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.757471] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x53a9460: created UD QP 0xc44d on mlx5_ib5:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.757477] [ndv4:14205:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.758052] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.758068] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.758071] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.758122] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.758073] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.758237] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.758244] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.758379] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.758384] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.758647] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x48be010 of 8176 bytes with 127 elements | |
[1650465529.760023] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.760033] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.760501] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.758910] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.758916] [ndv4:12741:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.758952] [ndv4:12741:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465529.758956] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.758968] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2e7b5f0 [id=99 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.759002] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 99 events 0x1 mode thread_spinlock | |
[1650465529.759029] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[17]=0x46ea030 using rc_mlx5/mlx5_ib2:1 on worker 0x37328d0 | |
[1650465529.759216] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.759222] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.759398] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.759403] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.759734] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.758750] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdb9936000..0x2afdb99bb000 on mlx5_ib5 lkey 0x81000 rkey 0x81000 access 0xf flags 0x3e4 | |
[1650465529.758755] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afdb9936018 of 544744 bytes with 128 elements | |
[1650465529.758760] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.759331] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x53a9460: adding gid fe80::15:5dff:fd34:0 to hash on device mlx5_ib5 port 1 index 0) | |
[1650465529.759496] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x53a9460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 1) | |
[1650465529.759512] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x53a9460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 2) | |
[1650465529.759944] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x53a9460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 3) | |
[1650465529.759970] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x53a9460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 4) | |
[1650465529.759984] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x53a9460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 5) | |
[1650465529.760543] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x53a9460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 6) | |
[1650465529.760689] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x53a9460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 7) | |
[1650465529.760695] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.760734] [ndv4:14205:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.760738] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x528bf20 [id=124 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.760761] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 124 events 0x5 mode thread_spinlock | |
[1650465529.760771] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[35]=0x53a9460 using ud_mlx5/mlx5_ib5:1 on worker 0x29ca770 | |
[1650465529.760915] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.760921] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.761104] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.761109] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.761114] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.761132] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.761135] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.761191] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.761540] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.761546] [ndv4:12741:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.761607] [ndv4:12741:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465529.761611] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.761618] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2e5b4c0 [id=101 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.761649] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 101 events 0x1 mode thread_spinlock | |
[1650465529.761667] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.761634] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.761649] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.761653] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.761704] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.762145] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4d45010 of 8176 bytes with 127 elements | |
[1650465529.762211] [ndv4:12741:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.762856] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.762875] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.762878] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.762927] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.762397] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2e34050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465529.762402] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.762408] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.762443] [ndv4:13443:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465529.762447] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.762458] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x4b79da0 [id=113 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.762478] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 113 events 0x1 mode thread_spinlock | |
[1650465529.762489] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[27]=0x4b71030 using rc_mlx5/mlx5_ib4:1 on worker 0x2d7f8d0 | |
[1650465529.762654] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.762660] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.762756] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.762761] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.763148] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.763644] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.763650] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.764505] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.764523] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.764527] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.764635] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.764506] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x54a90a0: created RC QP 0xc494 on mlx5_ib6:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.764989] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.764996] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.765030] [ndv4:13443:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465529.765034] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.765041] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x4b79fc0 [id=115 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.765068] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 115 events 0x1 mode thread_spinlock | |
[1650465529.765137] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[36]=0x54a90a0 using rc_verbs/mlx5_ib6:1 on worker 0x29ca770 | |
[1650465529.765273] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.765280] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.765440] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.765445] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.765643] [ndv4:13443:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.766340] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.767700] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.767716] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.767719] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.767771] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.768376] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x57ca010 of 8176 bytes with 127 elements | |
[1650465529.768662] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.768668] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.768706] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib6 length=2048) failed: Invalid argument | |
[1650465529.768710] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.768720] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x55fec40 [id=127 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.768744] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 127 events 0x1 mode thread_spinlock | |
[1650465529.768754] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[37]=0x55f6030 using rc_mlx5/mlx5_ib6:1 on worker 0x29ca770 | |
[1650465529.768769] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.768774] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.768891] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.768897] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.768863] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465529.769163] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.770432] [ndv4:12741:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x48c0050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc4e8 | |
[1650465529.770727] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.770735] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.770744] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b7353eed008 of 151544 bytes with 1052 elements | |
[1650465529.773084] [ndv4:13443:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4d47050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc4c7 | |
[1650465529.773183] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.773190] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.773199] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b7518dad008 of 151544 bytes with 1052 elements | |
[1650465529.774468] [ndv4:12741:0] ib_md.c:812 UCX DEBUG registered memory 0x2b735da00000..0x2b7360000000 on mlx5_ib2 lkey 0x81b00 rkey 0x81b00 access 0xf flags 0x3e4 | |
[1650465529.774495] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b735da00018 of 39845864 bytes with 4752 elements | |
[1650465529.774663] [ndv4:12741:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x48c0050 | |
[1650465529.774697] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[18]=0x48c0050 using dc_mlx5/mlx5_ib2:1 on worker 0x37328d0 | |
[1650465529.774827] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.774840] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.774932] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.774937] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.775320] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.776352] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.776898] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7516600000..0x2b7518c00000 on mlx5_ib4 lkey 0x80f00 rkey 0x80f00 access 0xf flags 0x3e4 | |
[1650465529.776915] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b7516600018 of 39845864 bytes with 4752 elements | |
[1650465529.777051] [ndv4:13443:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4d47050 | |
[1650465529.777087] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[28]=0x4d47050 using dc_mlx5/mlx5_ib4:1 on worker 0x2d7f8d0 | |
[1650465529.777228] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.777239] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.777327] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.777332] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.776764] [ndv4:12741:0] ib_iface.c:994 UCX DEBUG iface=0x4a9c060: created UD QP 0xc4de on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.777331] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.777502] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.777507] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.777640] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.777645] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.776676] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2e34050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465529.778001] [ndv4:12741:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7353f12000..0x2b7353f97000 on mlx5_ib2 lkey 0x81c00 rkey 0x81c00 access 0xf flags 0x3e4 | |
[1650465529.778007] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7353f12018 of 544744 bytes with 128 elements | |
[1650465529.778012] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.779744] [ndv4:14421:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.780384] [ndv4:14419:0] async.c:228 UCX DEBUG added async handler 0x8cef50 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.780414] [ndv4:14419:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465529.780420] [ndv4:14419:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465529.782937] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.783306] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.783333] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.783337] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.783395] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.783400] [ndv4:13552:0] ib_iface.c:994 UCX DEBUG iface=0x2c52280: created UD QP 0xc48f on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.783408] [ndv4:13552:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.783867] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.783879] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.783921] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib6 length=2048) failed: Invalid argument | |
[1650465529.783924] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.783937] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x5484360 [id=129 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.783964] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 129 events 0x1 mode thread_spinlock | |
[1650465529.784046] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.784640] [ndv4:14205:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.786763] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4a9c060: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465529.786796] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.786340] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2e34050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465529.786912] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2e34050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465529.787129] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2e34050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465529.787457] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2e34050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465529.787823] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x2e34050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465529.787833] [ndv4:14949:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.787867] [ndv4:14949:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.787872] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x2d16e10 [id=103 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.787899] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 103 events 0x5 mode thread_spinlock | |
[1650465529.787911] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[20]=0x2e34050 using ud_mlx5/mlx5_ib2:1 on worker 0x19ac660 | |
[1650465529.788245] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.788251] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.788344] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.788348] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.788335] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.788780] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x4f23220: created UD QP 0xc4bd on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.789152] [ndv4:14419:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.789169] [ndv4:14419:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.789177] [ndv4:14419:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.789256] [ndv4:14419:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465529.790680] [ndv4:14419:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.790686] [ndv4:14419:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.790921] [ndv4:14419:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.791038] [ndv4:14419:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.791044] [ndv4:14419:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.789495] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.789542] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.789549] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.789631] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.789637] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.790035] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7518dd2000..0x2b7518e57000 on mlx5_ib4 lkey 0x81000 rkey 0x81000 access 0xf flags 0x3e4 | |
[1650465529.790041] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7518dd2018 of 544744 bytes with 128 elements | |
[1650465529.790046] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.790847] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4f23220: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465529.791212] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4f23220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465529.791523] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4f23220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465529.789069] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.792653] [ndv4:14205:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x57cc050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc4b7 | |
[1650465529.792829] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.792836] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.792844] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2afdbc1be008 of 151544 bytes with 1052 elements | |
[1650465529.793748] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.793761] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.793921] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465529.793926] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465529.794370] [ndv4:13552:0] ib_md.c:812 UCX DEBUG registered memory 0x2b266682a000..0x2b26668af000 on mlx5_ib1 lkey 0x81a00 rkey 0x81a00 access 0xf flags 0x3e4 | |
[1650465529.794376] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b266682a018 of 544744 bytes with 128 elements | |
[1650465529.794380] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.794917] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2c52280: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465529.795408] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2c52280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465529.795708] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2c52280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465529.796746] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdb9a00000..0x2afdbc000000 on mlx5_ib6 lkey 0x80a00 rkey 0x80a00 access 0xf flags 0x3e4 | |
[1650465529.796767] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2afdb9a00018 of 39845864 bytes with 4752 elements | |
[1650465529.796905] [ndv4:14205:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x57cc050 | |
[1650465529.796943] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[38]=0x57cc050 using dc_mlx5/mlx5_ib6:1 on worker 0x29ca770 | |
[1650465529.797246] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.797256] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.797341] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.797346] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.797903] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.798217] [ndv4:14421:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.798260] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.798263] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.798278] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.798723] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.798733] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.799698] [ndv4:14421:0] ib_iface.c:994 UCX DEBUG iface=0x16a2cd0: created RC QP 0xde95 on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.800702] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.800721] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.800724] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.800776] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.801240] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.801247] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.802007] [ndv4:14949:0] ib_iface.c:994 UCX DEBUG iface=0x2f340a0: created RC QP 0xc454 on mlx5_ib3:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.802816] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[21]=0x2f340a0 using rc_verbs/mlx5_ib3:1 on worker 0x19ac660 | |
[1650465529.802894] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.802900] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.802964] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.802969] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.801628] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4f23220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465529.802320] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4f23220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465529.802778] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4f23220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465529.803450] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4f23220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465529.803471] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x4f23220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465529.801686] [ndv4:14421:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x16a2cd0 using rc_verbs/mlx5_ib0:1 on worker 0x1f71f60 | |
[1650465529.801744] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.801751] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.801913] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.801919] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.802329] [ndv4:14421:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.803760] [ndv4:13443:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.803769] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x42343a0 [id=116 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.803799] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 116 events 0x5 mode thread_spinlock | |
[1650465529.803811] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[29]=0x4f23220 using ud_verbs/mlx5_ib4:1 on worker 0x2d7f8d0 | |
[1650465529.803910] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.803916] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.804010] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.804015] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.803666] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.804862] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2c52280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465529.805281] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2c52280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465529.804323] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.805913] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.806277] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x5041050: created UD QP 0xc4be on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.806285] [ndv4:13443:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.806785] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.806902] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.806908] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.806922] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.806925] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.807213] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.807655] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x59a8340: created UD QP 0xc49d on mlx5_ib6:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.808275] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.808779] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.808788] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.808983] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.808988] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.809417] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdbc1e3000..0x2afdbc268000 on mlx5_ib6 lkey 0x80b00 rkey 0x80b00 access 0xf flags 0x3e4 | |
[1650465529.809423] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afdbc1e3018 of 544744 bytes with 128 elements | |
[1650465529.809427] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.810173] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x59a8340: adding gid fe80::15:5dff:fd34:1 to hash on device mlx5_ib6 port 1 index 0) | |
[1650465529.807290] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7518e57000..0x2b7518edc000 on mlx5_ib4 lkey 0x81100 rkey 0x81100 access 0xf flags 0x3e4 | |
[1650465529.807297] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7518e57018 of 544744 bytes with 128 elements | |
[1650465529.807302] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.807835] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5041050: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465529.808384] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5041050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465529.809889] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5041050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465529.810328] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5041050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465529.811041] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5041050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465529.810566] [ndv4:14419:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465529.810591] [ndv4:14419:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465529.810985] [ndv4:14421:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465529.811858] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5041050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465529.812081] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5041050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465529.812102] [ndv4:14421:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.812111] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.812114] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.812162] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.812663] [ndv4:14421:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x203b010 of 8176 bytes with 127 elements | |
[1650465529.814261] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.814275] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.814278] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.814331] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.814793] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3255010 of 8176 bytes with 127 elements | |
[1650465529.815014] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.815019] [ndv4:14949:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.815057] [ndv4:14949:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465529.815061] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.815071] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x10d1d40 [id=106 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.815091] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 106 events 0x1 mode thread_spinlock | |
[1650465529.815101] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[22]=0x3081030 using rc_mlx5/mlx5_ib3:1 on worker 0x19ac660 | |
[1650465529.814922] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4a9c060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465529.812925] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.812950] [ndv4:14421:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.813003] [ndv4:14421:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465529.813009] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.813967] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2c52280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465529.814250] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2c52280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465529.814516] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x2c52280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465529.814523] [ndv4:13552:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.814559] [ndv4:13552:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.814564] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x2b6ac30 [id=96 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.814645] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 96 events 0x5 mode thread_spinlock | |
[1650465529.814659] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[15]=0x2c52280 using ud_mlx5/mlx5_ib1:1 on worker 0x24108d0 | |
[1650465529.815001] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.815008] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.815280] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.815286] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.818805] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x59a8340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 1) | |
[1650465529.819106] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x59a8340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 2) | |
[1650465529.819549] [ndv4:14421:0] async.c:228 UCX DEBUG added async handler 0x15f8390 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.819660] [ndv4:14421:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465529.819679] [ndv4:14421:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x16ae710 using rc_mlx5/mlx5_ib0:1 on worker 0x1f71f60 | |
[1650465529.819860] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.819867] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.820060] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.820065] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.821828] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5041050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465529.821837] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.821867] [ndv4:13443:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.821872] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x4a2cc00 [id=117 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.821894] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 117 events 0x5 mode thread_spinlock | |
[1650465529.821904] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[30]=0x5041050 using ud_mlx5/mlx5_ib4:1 on worker 0x2d7f8d0 | |
[1650465529.821972] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.821978] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.822035] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.822040] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.822403] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.823971] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.823981] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.824052] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.824056] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.823661] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.823678] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.823681] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.823732] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.824171] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.824177] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.824912] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x51410a0: created RC QP 0xc44e on mlx5_ib5:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.823616] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4a9c060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465529.824072] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4a9c060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465529.824459] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4a9c060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465529.824922] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4a9c060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465529.825352] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4a9c060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465529.825709] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[31]=0x51410a0 using rc_verbs/mlx5_ib5:1 on worker 0x2d7f8d0 | |
[1650465529.825984] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.825992] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.826124] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.826129] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.826736] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.830763] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x59a8340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 3) | |
[1650465529.830807] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x59a8340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 4) | |
[1650465529.830823] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x59a8340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 5) | |
[1650465529.831331] [ndv4:14419:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465529.831339] [ndv4:14419:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465529.834391] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.834704] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.835366] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4a9c060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465529.835669] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.835687] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.835690] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.835745] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.836655] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.836663] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.835721] [ndv4:12741:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.835732] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x4019e70 [id=102 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.835756] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 102 events 0x5 mode thread_spinlock | |
[1650465529.835777] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[19]=0x4a9c060 using ud_verbs/mlx5_ib2:1 on worker 0x37328d0 | |
[1650465529.835843] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.835849] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.835929] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.835935] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.836592] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.836145] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.836160] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.836163] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.836219] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.836659] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.836666] [ndv4:14949:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.836699] [ndv4:14949:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465529.836703] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.836709] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x1053e70 [id=108 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.836729] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 108 events 0x1 mode thread_spinlock | |
[1650465529.837311] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.837365] [ndv4:13552:0] ib_iface.c:994 UCX DEBUG iface=0x327b0a0: created RC QP 0xc4df on mlx5_ib2:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.837267] [ndv4:14949:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.837678] [ndv4:12741:0] ib_iface.c:994 UCX DEBUG iface=0x4bba050: created UD QP 0xc4e0 on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.837687] [ndv4:12741:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.838109] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[16]=0x327b0a0 using rc_verbs/mlx5_ib2:1 on worker 0x24108d0 | |
[1650465529.838279] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.838286] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.838423] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.838429] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.838396] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.838574] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.838622] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.838641] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.838646] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.838701] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.838716] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.838719] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.838773] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.839256] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x5462010 of 8176 bytes with 127 elements | |
[1650465529.839524] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.839532] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.839569] [ndv4:13443:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465529.839635] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.839647] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x4234e50 [id=120 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.839667] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 120 events 0x1 mode thread_spinlock | |
[1650465529.839678] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[32]=0x528e030 using rc_mlx5/mlx5_ib5:1 on worker 0x2d7f8d0 | |
[1650465529.839755] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.839762] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.839843] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.839848] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.839058] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.839084] [ndv4:12741:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7360035000..0x2b73600ba000 on mlx5_ib2 lkey 0x81d00 rkey 0x81d00 access 0xf flags 0x3e4 | |
[1650465529.839091] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b7360035018 of 544744 bytes with 128 elements | |
[1650465529.839095] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.839130] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4bba050: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465529.839146] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4bba050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465529.839222] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4bba050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465529.839633] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4bba050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465529.840504] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.840661] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.840679] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.840682] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.840733] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.841206] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x359c010 of 8176 bytes with 127 elements | |
[1650465529.840747] [ndv4:14421:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.842075] [ndv4:14421:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.842086] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.842089] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.842105] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.841797] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.841814] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.841818] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.841876] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.842247] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.842254] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.842288] [ndv4:13443:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465529.842292] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.842300] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x5296f80 [id=122 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.842325] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 122 events 0x1 mode thread_spinlock | |
[1650465529.841687] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.841696] [ndv4:13552:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.841731] [ndv4:13552:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465529.841735] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.841748] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x1b595f0 [id=99 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.841772] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 99 events 0x1 mode thread_spinlock | |
[1650465529.841795] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[17]=0x33c8030 using rc_mlx5/mlx5_ib2:1 on worker 0x24108d0 | |
[1650465529.841815] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.841822] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.841978] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.841983] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.842347] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.842494] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.842502] [ndv4:14421:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.842534] [ndv4:14421:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465529.842538] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.842548] [ndv4:14421:0] async.c:228 UCX DEBUG added async handler 0x1606a40 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.842569] [ndv4:14421:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465529.842934] [ndv4:13443:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.843307] [ndv4:14421:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.844650] [ndv4:14949:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3257050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc46a | |
[1650465529.844736] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.844743] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.844751] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ad4c8310008 of 151544 bytes with 1052 elements | |
[1650465529.848421] [ndv4:14949:0] ib_md.c:812 UCX DEBUG registered memory 0x2ad4c5c00000..0x2ad4c8200000 on mlx5_ib3 lkey 0x81000 rkey 0x81000 access 0xf flags 0x3e4 | |
[1650465529.848440] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ad4c5c00018 of 39845864 bytes with 4752 elements | |
[1650465529.848602] [ndv4:14949:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3257050 | |
[1650465529.848636] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[23]=0x3257050 using dc_mlx5/mlx5_ib3:1 on worker 0x19ac660 | |
[1650465529.848756] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.848768] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.848882] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.848886] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.849484] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.850215] [ndv4:13443:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x5464050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc464 | |
[1650465529.850336] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.850343] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.850351] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b751b6dd008 of 151544 bytes with 1052 elements | |
[1650465529.852527] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4bba050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465529.852557] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4bba050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465529.852631] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4bba050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465529.852646] [ndv4:12741:0] ud_iface.c:393 UCX DEBUG iface 0x4bba050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465529.852653] [ndv4:12741:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.852683] [ndv4:12741:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.852687] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x45a5c60 [id=103 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.852707] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 103 events 0x5 mode thread_spinlock | |
[1650465529.852718] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[20]=0x4bba050 using ud_mlx5/mlx5_ib2:1 on worker 0x37328d0 | |
[1650465529.852775] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.852781] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.852922] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.852926] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.853372] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.850941] [ndv4:14421:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x22a6010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdebe | |
[1650465529.851064] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.851070] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.851091] [ndv4:14421:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ab787e86008 of 151544 bytes with 1052 elements | |
[1650465529.850537] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.850907] [ndv4:14949:0] ib_iface.c:994 UCX DEBUG iface=0x27f2700: created UD QP 0xc460 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.851379] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.851731] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.851738] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.851894] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.851900] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.852264] [ndv4:14949:0] ib_md.c:812 UCX DEBUG registered memory 0x2ad4c8335000..0x2ad4c83ba000 on mlx5_ib3 lkey 0x81100 rkey 0x81100 access 0xf flags 0x3e4 | |
[1650465529.852270] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ad4c8335018 of 544744 bytes with 128 elements | |
[1650465529.852275] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.852409] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x27f2700: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465529.852660] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x27f2700: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465529.852777] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x27f2700: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465529.853070] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x27f2700: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465529.853409] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x27f2700: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465529.853426] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x27f2700: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465529.853517] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x27f2700: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465529.853894] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x27f2700: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465529.853135] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x59a8340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 6) | |
[1650465529.853342] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x59a8340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 7) | |
[1650465529.853674] [ndv4:14205:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.853681] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x59a8e90 [id=130 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.853703] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 130 events 0x5 mode thread_spinlock | |
[1650465529.853717] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[39]=0x59a8340 using ud_verbs/mlx5_ib6:1 on worker 0x29ca770 | |
[1650465529.853877] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.853883] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.854052] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.854058] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.854861] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.854525] [ndv4:14419:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465529.854533] [ndv4:14419:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465529.854880] [ndv4:14419:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465529.854884] [ndv4:14419:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465529.854131] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7519000000..0x2b751b600000 on mlx5_ib5 lkey 0x81100 rkey 0x81100 access 0xf flags 0x3e4 | |
[1650465529.854146] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b7519000018 of 39845864 bytes with 4752 elements | |
[1650465529.854282] [ndv4:13443:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x5464050 | |
[1650465529.854317] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[33]=0x5464050 using dc_mlx5/mlx5_ib5:1 on worker 0x2d7f8d0 | |
[1650465529.854766] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.854777] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.854639] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.854658] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.854662] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.854714] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.855052] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.855060] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.855522] [ndv4:12741:0] ib_iface.c:994 UCX DEBUG iface=0x4cba0a0: created RC QP 0xc461 on mlx5_ib3:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.854117] [ndv4:14949:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.854128] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x34338e0 [id=109 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.854151] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 109 events 0x5 mode thread_spinlock | |
[1650465529.854161] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[24]=0x27f2700 using ud_verbs/mlx5_ib3:1 on worker 0x19ac660 | |
[1650465529.854465] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.854472] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.854789] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.854796] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.855376] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.854881] [ndv4:14421:0] ib_md.c:812 UCX DEBUG registered memory 0x2ab78ee00000..0x2ab791400000 on mlx5_ib0 lkey 0x81f00 rkey 0x81f00 access 0xf flags 0x3e4 | |
[1650465529.854904] [ndv4:14421:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ab78ee00018 of 39845864 bytes with 4752 elements | |
[1650465529.855037] [ndv4:14421:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x22a6010 | |
[1650465529.855073] [ndv4:14421:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x22a6010 using dc_mlx5/mlx5_ib0:1 on worker 0x1f71f60 | |
[1650465529.855240] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.855251] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.855411] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.855417] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.856054] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[21]=0x4cba0a0 using rc_verbs/mlx5_ib3:1 on worker 0x37328d0 | |
[1650465529.856117] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.856125] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.856128] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.856412] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.856542] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x5ac6460: created UD QP 0xc49e on mlx5_ib6:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.856548] [ndv4:14205:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.856697] [ndv4:14949:0] ib_iface.c:994 UCX DEBUG iface=0x3551460: created UD QP 0xc462 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.856705] [ndv4:14949:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.857152] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.857341] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.857348] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.857431] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.857435] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.857156] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.857342] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.857348] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.857376] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.857383] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.857820] [ndv4:14949:0] ib_md.c:812 UCX DEBUG registered memory 0x2ad4c83ba000..0x2ad4c843f000 on mlx5_ib3 lkey 0x81200 rkey 0x81200 access 0xf flags 0x3e4 | |
[1650465529.857826] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ad4c83ba018 of 544744 bytes with 128 elements | |
[1650465529.857831] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.857983] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x3551460: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465529.858383] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x3551460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465529.858744] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x3551460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465529.859237] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x3551460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465529.857761] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdbc268000..0x2afdbc2ed000 on mlx5_ib6 lkey 0x80c00 rkey 0x80c00 access 0xf flags 0x3e4 | |
[1650465529.857768] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afdbc268018 of 544744 bytes with 128 elements | |
[1650465529.857772] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.858201] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x5ac6460: adding gid fe80::15:5dff:fd34:1 to hash on device mlx5_ib6 port 1 index 0) | |
[1650465529.858340] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x5ac6460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 1) | |
[1650465529.858958] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x5ac6460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 2) | |
[1650465529.859226] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x5ac6460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 3) | |
[1650465529.859386] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x5ac6460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 4) | |
[1650465529.857726] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.857745] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.857748] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.857803] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.858233] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.858241] [ndv4:13552:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.858274] [ndv4:13552:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465529.858277] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.858284] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x1b394c0 [id=101 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.858311] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 101 events 0x1 mode thread_spinlock | |
[1650465529.858834] [ndv4:13552:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.861622] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.861631] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.862507] [ndv4:14421:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.862821] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.862833] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.863551] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.862616] [ndv4:14419:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465529.862624] [ndv4:14419:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465529.862897] [ndv4:14419:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465529.862561] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.863965] [ndv4:14421:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.864283] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.864332] [ndv4:14421:0] ib_iface.c:994 UCX DEBUG iface=0x2049050: created UD QP 0xde9f on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.864691] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x5640060: created UD QP 0xc45a on mlx5_ib5:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.864602] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.864612] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.864615] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.864665] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.864923] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.865144] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.865151] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.865176] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.865180] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.865538] [ndv4:14421:0] ib_md.c:812 UCX DEBUG registered memory 0x2ab787eab000..0x2ab787f30000 on mlx5_ib0 lkey 0x82000 rkey 0x82000 access 0xf flags 0x3e4 | |
[1650465529.865544] [ndv4:14421:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ab787eab018 of 544744 bytes with 128 elements | |
[1650465529.865548] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.865627] [ndv4:14421:0] ud_iface.c:393 UCX DEBUG iface 0x2049050: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465529.865877] [ndv4:14421:0] ud_iface.c:393 UCX DEBUG iface 0x2049050: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465529.865195] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.865378] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.865386] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.865521] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.865526] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.865076] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4e08010 of 8176 bytes with 127 elements | |
[1650465529.865282] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.865288] [ndv4:12741:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.865324] [ndv4:12741:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465529.865328] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.865338] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x3c2a3b0 [id=106 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.865357] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 106 events 0x1 mode thread_spinlock | |
[1650465529.865367] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[22]=0x4e0f100 using rc_mlx5/mlx5_ib3:1 on worker 0x37328d0 | |
[1650465529.865477] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.865483] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.865555] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.865561] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.866027] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b751b702000..0x2b751b787000 on mlx5_ib5 lkey 0x81200 rkey 0x81200 access 0xf flags 0x3e4 | |
[1650465529.866034] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b751b702018 of 544744 bytes with 128 elements | |
[1650465529.866038] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.866233] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5640060: adding gid fe80::15:5dff:fd34:0 to hash on device mlx5_ib5 port 1 index 0) | |
[1650465529.866255] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5640060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 1) | |
[1650465529.867050] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5640060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 2) | |
[1650465529.867500] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5640060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 3) | |
[1650465529.868031] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5640060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 4) | |
[1650465529.868354] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5640060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 5) | |
[1650465529.868860] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5640060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 6) | |
[1650465529.869571] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x5640060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 7) | |
[1650465529.866473] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x5ac6460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 5) | |
[1650465529.866973] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x5ac6460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 6) | |
[1650465529.867480] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x5ac6460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 7) | |
[1650465529.867485] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.867518] [ndv4:14205:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.867522] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x5484970 [id=131 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.867542] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 131 events 0x5 mode thread_spinlock | |
[1650465529.867553] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[40]=0x5ac6460 using ud_mlx5/mlx5_ib6:1 on worker 0x29ca770 | |
[1650465529.867717] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.867724] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.867869] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.867874] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.868325] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465529.866557] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x3551460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465529.866969] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x3551460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465529.867467] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x3551460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465529.868351] [ndv4:14949:0] ud_iface.c:393 UCX DEBUG iface 0x3551460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465529.868361] [ndv4:14949:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.868391] [ndv4:14949:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.868396] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x3551f60 [id=110 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.868418] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 110 events 0x5 mode thread_spinlock | |
[1650465529.868430] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[25]=0x3551460 using ud_mlx5/mlx5_ib3:1 on worker 0x19ac660 | |
[1650465529.868812] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.868819] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.869010] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.869015] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.869506] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.868567] [ndv4:13552:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x359e050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc4f6 | |
[1650465529.868933] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.868940] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.868949] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b26690b0008 of 151544 bytes with 1052 elements | |
[1650465529.870636] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.870652] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.870655] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.870709] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.869896] [ndv4:13443:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.869907] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x511c9b0 [id=123 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.869937] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 123 events 0x5 mode thread_spinlock | |
[1650465529.869947] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[34]=0x5640060 using ud_verbs/mlx5_ib5:1 on worker 0x2d7f8d0 | |
[1650465529.870108] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.870115] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.870336] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.870342] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.870025] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.870045] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.870048] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.870097] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.871195] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.871201] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.871032] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465529.871173] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.871180] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.871926] [ndv4:14949:0] ib_iface.c:994 UCX DEBUG iface=0x36510a0: created RC QP 0xc4bf on mlx5_ib4:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.872101] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x5bc60a0: created RC QP 0xc43a on mlx5_ib7:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.872366] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.872819] [ndv4:14419:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465529.873199] [ndv4:14419:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465529.872752] [ndv4:13552:0] ib_md.c:812 UCX DEBUG registered memory 0x2b2666a00000..0x2b2669000000 on mlx5_ib2 lkey 0x81e00 rkey 0x81e00 access 0xf flags 0x3e4 | |
[1650465529.872774] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b2666a00018 of 39845864 bytes with 4752 elements | |
[1650465529.872917] [ndv4:13552:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x359e050 | |
[1650465529.872951] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[18]=0x359e050 using dc_mlx5/mlx5_ib2:1 on worker 0x24108d0 | |
[1650465529.872972] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.872981] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.873086] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.873091] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.873472] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.872675] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x575e460: created UD QP 0xc45b on mlx5_ib5:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.872682] [ndv4:13443:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.873227] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.874068] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.874075] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.874090] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465529.874093] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465529.872672] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[26]=0x36510a0 using rc_verbs/mlx5_ib4:1 on worker 0x19ac660 | |
[1650465529.872842] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.872849] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.872920] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.872925] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.873088] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.874451] [ndv4:13443:0] ib_md.c:812 UCX DEBUG registered memory 0x2b751b787000..0x2b751b80c000 on mlx5_ib5 lkey 0x81300 rkey 0x81300 access 0xf flags 0x3e4 | |
[1650465529.874457] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b751b787018 of 544744 bytes with 128 elements | |
[1650465529.874462] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.874761] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x575e460: adding gid fe80::15:5dff:fd34:0 to hash on device mlx5_ib5 port 1 index 0) | |
[1650465529.873770] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.874363] [ndv4:14421:0] ud_iface.c:393 UCX DEBUG iface 0x2049050: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465529.874740] [ndv4:14421:0] ud_iface.c:393 UCX DEBUG iface 0x2049050: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465529.875101] [ndv4:14421:0] ud_iface.c:393 UCX DEBUG iface 0x2049050: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465529.875232] [ndv4:14421:0] ud_iface.c:393 UCX DEBUG iface 0x2049050: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465529.875021] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.875034] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.875037] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.875090] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.875553] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3972010 of 8176 bytes with 127 elements | |
[1650465529.872741] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[41]=0x5bc60a0 using rc_verbs/mlx5_ib7:1 on worker 0x29ca770 | |
[1650465529.872926] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.872933] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.873049] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.873054] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.873679] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465529.875896] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.875911] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.875915] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.875967] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.876443] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x5ee7010 of 8176 bytes with 127 elements | |
[1650465529.875815] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.875823] [ndv4:14949:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.875859] [ndv4:14949:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465529.875863] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.875875] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x2744690 [id=113 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.875895] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 113 events 0x1 mode thread_spinlock | |
[1650465529.875904] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[27]=0x379e030 using rc_mlx5/mlx5_ib4:1 on worker 0x19ac660 | |
[1650465529.876142] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.876149] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.876363] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.876367] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.875769] [ndv4:14421:0] ud_iface.c:393 UCX DEBUG iface 0x2049050: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465529.876536] [ndv4:14421:0] ud_iface.c:393 UCX DEBUG iface 0x2049050: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465529.876696] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.876938] [ndv4:14421:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.876689] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.876696] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib7:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.876732] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib7 length=2048) failed: Invalid argument | |
[1650465529.876736] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.876746] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x5af3960 [id=134 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.876766] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 134 events 0x1 mode thread_spinlock | |
[1650465529.876775] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[42]=0x5d13030 using rc_mlx5/mlx5_ib7:1 on worker 0x29ca770 | |
[1650465529.876857] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.876864] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.877072] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.877078] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.878095] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465529.879642] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.879660] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.879663] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.879719] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.880093] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.880102] [ndv4:14205:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib7:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.880136] [ndv4:14205:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib7 length=2048) failed: Invalid argument | |
[1650465529.880139] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.880146] [ndv4:14205:0] async.c:228 UCX DEBUG added async handler 0x20ef2f0 [id=136 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.880164] [ndv4:14205:0] async.c:506 UCX DEBUG listening to async event fd 136 events 0x1 mode thread_spinlock | |
[1650465529.880834] [ndv4:14205:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.881476] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.881814] [ndv4:14419:0] async.c:228 UCX DEBUG added async handler 0x8d6ce0 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465529.881840] [ndv4:14419:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465529.881844] [ndv4:14419:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465529.881980] [ndv4:13552:0] ib_iface.c:994 UCX DEBUG iface=0x377a060: created UD QP 0xc4ec on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.882079] [ndv4:14419:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.882090] [ndv4:14419:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.882096] [ndv4:14419:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465529.882139] [ndv4:14419:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465529.883475] [ndv4:14419:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465529.883480] [ndv4:14419:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465529.883703] [ndv4:14419:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465529.883977] [ndv4:14419:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.883983] [ndv4:14419:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.882569] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.882872] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.882879] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.883005] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.883010] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.883389] [ndv4:13552:0] ib_md.c:812 UCX DEBUG registered memory 0x2b26690d5000..0x2b266915a000 on mlx5_ib2 lkey 0x81f00 rkey 0x81f00 access 0xf flags 0x3e4 | |
[1650465529.883395] [ndv4:13552:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b26690d5018 of 544744 bytes with 128 elements | |
[1650465529.883399] [ndv4:13552:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.883982] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x377a060: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465529.884569] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x377a060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465529.884753] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x377a060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465529.883223] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x575e460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 1) | |
[1650465529.883622] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x575e460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 2) | |
[1650465529.884134] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x575e460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 3) | |
[1650465529.884974] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x575e460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 4) | |
[1650465529.885484] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x575e460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 5) | |
[1650465529.885890] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x575e460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 6) | |
[1650465529.886229] [ndv4:13443:0] ud_iface.c:393 UCX DEBUG iface 0x575e460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 7) | |
[1650465529.886236] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.886272] [ndv4:13443:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.886278] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x5640f20 [id=124 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.886308] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 124 events 0x5 mode thread_spinlock | |
[1650465529.886319] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[35]=0x575e460 using ud_mlx5/mlx5_ib5:1 on worker 0x2d7f8d0 | |
[1650465529.886384] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.886390] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.886509] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.886514] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.882192] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.882211] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.882214] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.882272] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.882670] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.882677] [ndv4:12741:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.882711] [ndv4:12741:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465529.882715] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.882723] [ndv4:12741:0] async.c:228 UCX DEBUG added async handler 0x2e56f00 [id=108 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.882742] [ndv4:12741:0] async.c:506 UCX DEBUG listening to async event fd 108 events 0x1 mode thread_spinlock | |
[1650465529.883193] [ndv4:12741:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.882730] [ndv4:14421:0] async.c:228 UCX DEBUG added async handler 0x160a520 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.882767] [ndv4:14421:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 mode thread_spinlock | |
[1650465529.882791] [ndv4:14421:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x2049050 using ud_verbs/mlx5_ib0:1 on worker 0x1f71f60 | |
[1650465529.882896] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.882902] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.883065] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.883070] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.883711] [ndv4:14421:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465529.885002] [ndv4:14421:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.885397] [ndv4:14421:0] ib_iface.c:994 UCX DEBUG iface=0x16b9090: created UD QP 0xdea1 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.885409] [ndv4:14421:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.886028] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.886202] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.886209] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.886235] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465529.886240] [ndv4:14421:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465529.886697] [ndv4:14421:0] ib_md.c:812 UCX DEBUG registered memory 0x2ab787f30000..0x2ab787fb5000 on mlx5_ib0 lkey 0x82100 rkey 0x82100 access 0xf flags 0x3e4 | |
[1650465529.886703] [ndv4:14421:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ab787f30018 of 544744 bytes with 128 elements | |
[1650465529.886707] [ndv4:14421:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.886944] [ndv4:14421:0] ud_iface.c:393 UCX DEBUG iface 0x16b9090: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465529.885944] [ndv4:14949:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.885963] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.885966] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.886022] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.886449] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.886455] [ndv4:14949:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.886488] [ndv4:14949:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465529.886492] [ndv4:14949:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.886500] [ndv4:14949:0] async.c:228 UCX DEBUG added async handler 0x37a6f10 [id=115 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.886519] [ndv4:14949:0] async.c:506 UCX DEBUG listening to async event fd 115 events 0x1 mode thread_spinlock | |
[1650465529.887117] [ndv4:14949:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.890809] [ndv4:12741:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4fe2010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc478 | |
[1650465529.890885] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.890892] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.890902] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b7353f99008 of 151544 bytes with 1052 elements | |
[1650465529.889806] [ndv4:14205:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x5ee9050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc45d | |
[1650465529.889939] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.889946] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.889955] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2afdbeaf0008 of 151544 bytes with 1052 elements | |
[1650465529.895331] [ndv4:12741:0] ib_md.c:812 UCX DEBUG registered memory 0x2b7360200000..0x2b7362800000 on mlx5_ib3 lkey 0x81300 rkey 0x81300 access 0xf flags 0x3e4 | |
[1650465529.895344] [ndv4:12741:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b7360200018 of 39845864 bytes with 4752 elements | |
[1650465529.895485] [ndv4:12741:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4fe2010 | |
[1650465529.895506] [ndv4:12741:0] ucp_worker.c:1159 UCX DEBUG created interface[23]=0x4fe2010 using dc_mlx5/mlx5_ib3:1 on worker 0x37328d0 | |
[1650465529.895535] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.895545] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.895660] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.895666] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.896054] [ndv4:12741:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465529.896075] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x377a060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465529.896406] [ndv4:14419:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465529.896414] [ndv4:14419:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465529.894569] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdbc400000..0x2afdbea00000 on mlx5_ib7 lkey 0x81200 rkey 0x81200 access 0xf flags 0x3e4 | |
[1650465529.894618] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2afdbc400018 of 39845864 bytes with 4752 elements | |
[1650465529.894809] [ndv4:14205:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x5ee9050 | |
[1650465529.894850] [ndv4:14205:0] ucp_worker.c:1159 UCX DEBUG created interface[43]=0x5ee9050 using dc_mlx5/mlx5_ib7:1 on worker 0x29ca770 | |
[1650465529.894527] [ndv4:14949:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3974050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xc4d5 | |
[1650465529.894720] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.894728] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.894737] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ad4cac40008 of 151544 bytes with 1052 elements | |
[1650465529.897997] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.899180] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465529.899198] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.899201] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.899256] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.898433] [ndv4:14949:0] ib_md.c:812 UCX DEBUG registered memory 0x2ad4c8600000..0x2ad4cac00000 on mlx5_ib4 lkey 0x81200 rkey 0x81200 access 0xf flags 0x3e4 | |
[1650465529.898455] [ndv4:14949:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ad4c8600018 of 39845864 bytes with 4752 elements | |
[1650465529.898645] [ndv4:14949:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3974050 | |
[1650465529.898676] [ndv4:14949:0] ucp_worker.c:1159 UCX DEBUG created interface[28]=0x3974050 using dc_mlx5/mlx5_ib4:1 on worker 0x19ac660 | |
[1650465529.898819] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.898831] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.898970] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465529.898975] [ndv4:14949:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465529.899697] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.899703] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465529.900412] [ndv4:13443:0] ib_iface.c:994 UCX DEBUG iface=0x585e0a0: created RC QP 0xc49f on mlx5_ib6:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465529.901183] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[36]=0x585e0a0 using rc_verbs/mlx5_ib6:1 on worker 0x2d7f8d0 | |
[1650465529.901271] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.901277] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.901441] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.901446] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.901775] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.903093] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.903111] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.903115] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.903165] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.903592] [ndv4:13443:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x5b7f010 of 8176 bytes with 127 elements | |
[1650465529.903841] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465529.903847] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.903889] [ndv4:13443:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib6 length=2048) failed: Invalid argument | |
[1650465529.903894] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.903906] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x59b3c40 [id=127 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.903926] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 127 events 0x1 mode thread_spinlock | |
[1650465529.903937] [ndv4:13443:0] ucp_worker.c:1159 UCX DEBUG created interface[37]=0x59ab030 using rc_mlx5/mlx5_ib6:1 on worker 0x2d7f8d0 | |
[1650465529.904005] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.904011] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.904162] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465529.904167] [ndv4:13443:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465529.904483] [ndv4:13443:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465529.905762] [ndv4:13443:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465529.905779] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465529.905782] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465529.905844] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465529.905341] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.905352] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.905433] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.905438] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.905868] [ndv4:14205:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465529.906202] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465529.906209] [ndv4:13443:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465529.906247] [ndv4:13443:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib6 length=2048) failed: Invalid argument | |
[1650465529.906252] [ndv4:13443:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465529.906260] [ndv4:13443:0] async.c:228 UCX DEBUG added async handler 0x5839360 [id=129 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465529.906284] [ndv4:13443:0] async.c:506 UCX DEBUG listening to async event fd 129 events 0x1 mode thread_spinlock | |
[1650465529.906373] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x377a060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465529.906520] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x377a060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465529.906567] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x377a060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465529.906726] [ndv4:13552:0] ud_iface.c:393 UCX DEBUG iface 0x377a060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465529.906838] [ndv4:13443:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465529.906994] [ndv4:13552:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465529.907004] [ndv4:13552:0] async.c:228 UCX DEBUG added async handler 0x2cf7e70 [id=102 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465529.907033] [ndv4:13552:0] async.c:506 UCX DEBUG listening to async event fd 102 events 0x5 mode thread_spinlock | |
[1650465529.907044] [ndv4:13552:0] ucp_worker.c:1159 UCX DEBUG created interface[19]=0x377a060 using ud_verbs/mlx5_ib2:1 on worker 0x24108d0 | |
[1650465529.907060] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.907067] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.907153] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465529.907157] [ndv4:13552:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465529.907013] [ndv4:14205:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.907411] [ndv4:14205:0] ib_iface.c:994 UCX DEBUG iface=0x60c5060: created UD QP 0xc443 on mlx5_ib7:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.907148] [ndv4:12741:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.907531] [ndv4:12741:0] ib_iface.c:994 UCX DEBUG iface=0x51be060: created UD QP 0xc46e on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.908007] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.908051] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.908058] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.908104] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465529.908109] [ndv4:14205:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465529.908477] [ndv4:14205:0] ib_md.c:812 UCX DEBUG registered memory 0x2afdbeb15000..0x2afdbeb9a000 on mlx5_ib7 lkey 0x81300 rkey 0x81300 access 0xf flags 0x3e4 | |
[1650465529.908483] [ndv4:14205:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2afdbeb15018 of 544744 bytes with 128 elements | |
[1650465529.908487] [ndv4:14205:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465529.908792] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x60c5060: adding gid fe80::15:5dff:fd34:2 to hash on device mlx5_ib7 port 1 index 0) | |
[1650465529.909092] [ndv4:14205:0] ud_iface.c:393 UCX DEBUG iface 0x60c5060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 1) | |
[1650465529.908970] [ndv4:14949:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465529.908123] [ndv4:12741:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465529.908227] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465529.908233] [ndv4:12741:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465529.907549] [ndv4:13552:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465529.908500] [ndv4:13552:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465529.908905] [ndv4:13552:0] ib_iface.c:994 UCX DEBUG iface=0x3898050: created UD QP 0xc4ed on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465529.908912] [ndv4:13552:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465529.907788] [ndv4:14421:0] ud_iface.c:393 UCX DEBUG iface 0x16b9090: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465529.907944] [ndv4:14421:0] ud_iface.c:393 UCX DEBUG iface 0x16b9090: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) |