Created April 20, 2022 15:06
UCX 1.11.2 debug log for successful osu_scatter
This file has been truncated.
[1650465628.842779] [ndv4:54965:0] debug.c:1198 UCX DEBUG using signal stack 0x2b4b1d33f000 size 141824 | |
[1650465628.842855] [ndv4:54965:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465628.842874] [ndv4:54965:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b4b1d1ad000 | |
[1650465628.842890] [ndv4:54965:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465628.842899] [ndv4:54965:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465628.842905] [ndv4:54965:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465628.844926] [ndv4:54965:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465628.844945] [ndv4:54965:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465628.844977] [ndv4:54965:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465628.844980] [ndv4:54965:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465628.844987] [ndv4:54965:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465628.844994] [ndv4:54965:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465628.844996] [ndv4:54965:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465628.845001] [ndv4:54965:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465628.845003] [ndv4:54965:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465628.845006] [ndv4:54965:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465628.845008] [ndv4:54965:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465628.845010] [ndv4:54965:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465628.845018] [ndv4:54965:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465628.845871] [ndv4:54965:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465628.846460] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465628.846473] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465628.846485] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465628.846495] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465628.846506] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465628.846516] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465628.846526] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465628.846950] [ndv4:54965:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465628.871052] [ndv4:54965:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465628.871410] [ndv4:54965:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465628.876869] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x13a1770 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465628.876968] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465628.876976] [ndv4:54965:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465628.877132] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465628.877144] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465628.877170] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465628.877226] [ndv4:54965:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465628.877248] [ndv4:54965:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465628.877792] [ndv4:54965:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465628.877798] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465628.878043] [ndv4:54965:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465628.886764] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465628.886776] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465628.906117] [ndv4:54965:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465628.906125] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465628.915312] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465628.915319] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465628.916888] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465628.916895] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465628.917160] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465628.917164] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465628.926198] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465628.926205] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465628.926420] [ndv4:54965:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465628.933862] [ndv4:54965:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465628.934181] [ndv4:54965:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465628.934712] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1a84230 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465628.934736] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465628.934740] [ndv4:54965:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465628.935038] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465628.935046] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465628.935051] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465628.935092] [ndv4:54965:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465628.945862] [ndv4:54965:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465628.945870] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465628.946088] [ndv4:54965:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465628.946180] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465628.946186] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465628.946684] [ndv4:54965:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465628.946689] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465628.947385] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465628.947389] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465628.947864] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465628.947870] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465628.957359] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465628.957366] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465628.978279] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465628.978287] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465628.978495] [ndv4:54965:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465628.984708] [ndv4:54965:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465628.985074] [ndv4:54965:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465628.986231] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1a841c0 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465628.986256] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465628.986259] [ndv4:54965:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465628.986504] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465628.986513] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465628.986518] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465628.986562] [ndv4:54965:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465628.987418] [ndv4:54965:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465628.987423] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465628.987660] [ndv4:54965:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465628.987747] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465628.987754] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.000031] [ndv4:54965:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465629.000039] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.000114] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.000117] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.001343] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.001347] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.003045] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.003050] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.017740] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.017749] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.017960] [ndv4:54965:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.034393] [ndv4:54965:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.034778] [ndv4:54965:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465629.035460] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x139aeb0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.035484] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465629.035487] [ndv4:54965:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465629.035678] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.035686] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.035691] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.035731] [ndv4:54965:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465629.037434] [ndv4:54965:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.037439] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.037676] [ndv4:54965:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.038147] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.038153] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.050057] [ndv4:54965:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465629.050072] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.050784] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.050789] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.061545] [ndv4:54756:0] debug.c:1198 UCX DEBUG using signal stack 0x2b3b50a32000 size 141824 | |
[1650465629.061711] [ndv4:54756:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.061736] [ndv4:54756:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b3b50886000 | |
[1650465629.061754] [ndv4:54756:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465629.061764] [ndv4:54756:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465629.061770] [ndv4:54756:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465629.061746] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.061754] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.065140] [ndv4:54756:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.065160] [ndv4:54756:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465629.065199] [ndv4:54756:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465629.065202] [ndv4:54756:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465629.065209] [ndv4:54756:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465629.065217] [ndv4:54756:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465629.065220] [ndv4:54756:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465629.065225] [ndv4:54756:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465629.065228] [ndv4:54756:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465629.065230] [ndv4:54756:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465629.065232] [ndv4:54756:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465629.065234] [ndv4:54756:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465629.065243] [ndv4:54756:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465629.066028] [ndv4:54756:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465629.068450] [ndv4:55152:0] debug.c:1198 UCX DEBUG using signal stack 0x2b3b93d85000 size 141824 | |
[1650465629.068526] [ndv4:55152:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.068545] [ndv4:55152:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b3b93bf3000 | |
[1650465629.068563] [ndv4:55152:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465629.068683] [ndv4:55152:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465629.068693] [ndv4:55152:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465629.071186] [ndv4:55152:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.071204] [ndv4:55152:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465629.071236] [ndv4:55152:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465629.071240] [ndv4:55152:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465629.071247] [ndv4:55152:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465629.071253] [ndv4:55152:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465629.071256] [ndv4:55152:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465629.071261] [ndv4:55152:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465629.071264] [ndv4:55152:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465629.071266] [ndv4:55152:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465629.071268] [ndv4:55152:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465629.071270] [ndv4:55152:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465629.071278] [ndv4:55152:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465629.071966] [ndv4:55152:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465629.073920] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.073928] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.075239] [ndv4:55152:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465629.075255] [ndv4:55152:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465629.075266] [ndv4:55152:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465629.075277] [ndv4:55152:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465629.075286] [ndv4:55152:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465629.075297] [ndv4:55152:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465629.075307] [ndv4:55152:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465629.075753] [ndv4:55152:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465629.074913] [ndv4:54756:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465629.074929] [ndv4:54756:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465629.074940] [ndv4:54756:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465629.074950] [ndv4:54756:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465629.074960] [ndv4:54756:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465629.074970] [ndv4:54756:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465629.074981] [ndv4:54756:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465629.075645] [ndv4:54756:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465629.077694] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.077699] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.077984] [ndv4:54965:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.083669] [ndv4:54756:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.083993] [ndv4:54756:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465629.084750] [ndv4:54965:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.085175] [ndv4:54965:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465629.090063] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x1e5f7f0 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.090170] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465629.090178] [ndv4:54756:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465629.090747] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.090758] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.090784] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.090844] [ndv4:54756:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465629.090867] [ndv4:54756:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465629.093041] [ndv4:54756:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.093049] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.093203] [ndv4:54756:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.093258] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.093265] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.094532] [ndv4:55152:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.094917] [ndv4:55152:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465629.099563] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x25102a0 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.099777] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465629.099786] [ndv4:55152:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465629.099964] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.099978] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.100004] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.100063] [ndv4:55152:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465629.100085] [ndv4:55152:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465629.113420] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x13a2090 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.113451] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465629.113455] [ndv4:54965:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465629.112466] [ndv4:54756:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465629.112475] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.113360] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.113364] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.113752] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.113766] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.113775] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.113834] [ndv4:54965:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465629.117339] [ndv4:54965:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.117347] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.117590] [ndv4:54965:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.117686] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.117693] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.118790] [ndv4:54965:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465629.118794] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.127865] [ndv4:55152:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.127876] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.128053] [ndv4:55152:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.128369] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.128376] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.131875] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.131883] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.143769] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.143776] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.144283] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.144286] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.145399] [ndv4:55152:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465629.145407] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.147500] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.147508] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.152107] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.152114] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.152379] [ndv4:54756:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.153670] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.153677] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.154301] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.154309] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.164640] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.164647] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.168360] [ndv4:54756:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.168696] [ndv4:54756:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465629.169737] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x2540b40 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.169764] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465629.169768] [ndv4:54756:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465629.170094] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.170104] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.170114] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.170175] [ndv4:54756:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465629.171549] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.171557] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.171797] [ndv4:54965:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.172645] [ndv4:54756:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.172652] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.172790] [ndv4:54756:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.172944] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.172950] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.174590] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.174597] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.175991] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.175997] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.176222] [ndv4:55152:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.184025] [ndv4:55152:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.184404] [ndv4:55152:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465629.183925] [ndv4:54756:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465629.183934] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.186506] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2536130 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.186531] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465629.186534] [ndv4:55152:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465629.186862] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.186871] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.186876] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.186917] [ndv4:55152:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465629.187284] [ndv4:54965:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.187733] [ndv4:54965:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465629.189194] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x139ab00 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.189219] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465629.189223] [ndv4:54965:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465629.189492] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.189507] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.189518] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.189654] [ndv4:54965:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465629.189514] [ndv4:55152:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.189522] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.189676] [ndv4:55152:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.189981] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.189989] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.191762] [ndv4:54965:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.191770] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.192260] [ndv4:54965:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.192318] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.192326] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.197904] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.197911] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.202291] [ndv4:55152:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465629.202300] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.206769] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.206777] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.213673] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.213680] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.214761] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.214766] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.217321] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.217327] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.216182] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.216189] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.218392] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.218398] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.218716] [ndv4:54756:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.219680] [ndv4:54965:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465629.219690] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.224211] [ndv4:54756:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.224568] [ndv4:54756:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465629.225194] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x1e5e970 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.225220] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465629.225224] [ndv4:54756:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465629.225503] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.225517] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.225527] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.225608] [ndv4:54756:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465629.230023] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.230031] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.230243] [ndv4:55152:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.235484] [ndv4:55152:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.235791] [ndv4:55152:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465629.236088] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2536c60 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.236112] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465629.236116] [ndv4:55152:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465629.236267] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.236276] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.236281] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.236328] [ndv4:55152:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465629.238518] [ndv4:55152:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.238533] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.238715] [ndv4:55152:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.238878] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.238884] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.242099] [ndv4:54756:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.242107] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.242245] [ndv4:54756:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.242414] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.242422] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.243888] [ndv4:54756:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465629.243895] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.247740] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.247749] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.249244] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.249250] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.254333] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.254342] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.255218] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.255223] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.255684] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.255690] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.255988] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.255992] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.256272] [ndv4:54756:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.259163] [ndv4:55152:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465629.259171] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.260460] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.260465] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.263204] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.263212] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.273165] [ndv4:54756:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.273450] [ndv4:54756:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465629.274790] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x1e5f2f0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.274814] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465629.274817] [ndv4:54756:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465629.275089] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.275099] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.275104] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.275147] [ndv4:54756:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465629.280135] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.280143] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.283382] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.283389] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.283031] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.283039] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.283356] [ndv4:54965:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.285804] [ndv4:54756:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.285826] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.287425] [ndv4:54756:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.287536] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.287548] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.289755] [ndv4:54965:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.290097] [ndv4:54965:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465629.298882] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x139ce10 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.298907] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465629.298910] [ndv4:54965:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465629.299154] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465629.299165] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465629.299170] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.299215] [ndv4:54965:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465629.300013] [ndv4:54965:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.300018] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.300643] [ndv4:54965:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.300759] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465629.300766] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465629.302412] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.302419] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.302653] [ndv4:55152:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.309811] [ndv4:55516:0] debug.c:1198 UCX DEBUG using signal stack 0x2ae248e6a000 size 141824 | |
[1650465629.309894] [ndv4:55516:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.309917] [ndv4:55516:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2ae248cbe000 | |
[1650465629.309939] [ndv4:55516:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465629.309948] [ndv4:55516:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465629.309954] [ndv4:55516:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465629.312273] [ndv4:55512:0] debug.c:1198 UCX DEBUG using signal stack 0x2aea6cc40000 size 141824 | |
[1650465629.312349] [ndv4:55512:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.312371] [ndv4:55512:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2aea6ca94000 | |
[1650465629.312389] [ndv4:55512:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465629.312398] [ndv4:55512:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465629.312404] [ndv4:55512:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465629.312644] [ndv4:55516:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.312664] [ndv4:55516:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465629.312704] [ndv4:55516:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465629.312708] [ndv4:55516:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465629.312716] [ndv4:55516:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465629.312722] [ndv4:55516:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465629.312725] [ndv4:55516:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465629.312730] [ndv4:55516:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465629.312732] [ndv4:55516:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465629.312734] [ndv4:55516:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465629.312737] [ndv4:55516:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465629.312739] [ndv4:55516:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465629.312747] [ndv4:55516:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465629.310936] [ndv4:54965:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465629.310944] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.311368] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.311372] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.311718] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.311722] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.312855] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.312860] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.313123] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.313126] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.313340] [ndv4:54965:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.314760] [ndv4:55512:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.314778] [ndv4:55512:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465629.314811] [ndv4:55512:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465629.314815] [ndv4:55512:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465629.314823] [ndv4:55512:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465629.314830] [ndv4:55512:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465629.314833] [ndv4:55512:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465629.314839] [ndv4:55512:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465629.314842] [ndv4:55512:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465629.314844] [ndv4:55512:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465629.314847] [ndv4:55512:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465629.314849] [ndv4:55512:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465629.314857] [ndv4:55512:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465629.315530] [ndv4:55512:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465629.313572] [ndv4:55516:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465629.314161] [ndv4:55516:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465629.314175] [ndv4:55516:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465629.314187] [ndv4:55516:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465629.314198] [ndv4:55516:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465629.314208] [ndv4:55516:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465629.314218] [ndv4:55516:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465629.314228] [ndv4:55516:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465629.314698] [ndv4:55516:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465629.314383] [ndv4:54756:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465629.314393] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.314701] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.314706] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.315527] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.315531] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.317874] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.317880] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.318854] [ndv4:55152:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.319168] [ndv4:55152:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465629.320782] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2536360 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.320805] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465629.320808] [ndv4:55152:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465629.321279] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.321288] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.321293] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.321336] [ndv4:55152:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465629.323124] [ndv4:55516:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.323422] [ndv4:55516:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465629.323528] [ndv4:55512:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465629.323546] [ndv4:55512:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465629.323557] [ndv4:55512:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465629.323567] [ndv4:55512:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465629.323617] [ndv4:55512:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465629.323628] [ndv4:55512:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465629.323639] [ndv4:55512:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465629.324007] [ndv4:55512:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465629.324434] [ndv4:54965:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.324796] [ndv4:54965:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465629.326239] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x139af10 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.326262] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465629.326265] [ndv4:54965:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465629.326556] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465629.326566] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465629.326571] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.326642] [ndv4:54965:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465629.327840] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.327850] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.328215] [ndv4:54756:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.328712] [ndv4:54965:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.328720] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.328985] [ndv4:54965:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.329023] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465629.329028] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465629.329405] [ndv4:54965:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465629.329409] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.329555] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.329559] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.329687] [ndv4:55512:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.329993] [ndv4:55512:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465629.330214] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x12771f0 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.330311] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465629.330319] [ndv4:55516:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465629.330496] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.330506] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.330532] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.330615] [ndv4:55516:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465629.330638] [ndv4:55516:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465629.331815] [ndv4:55516:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.331823] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.331975] [ndv4:55516:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.332035] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.332041] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.332789] [ndv4:55516:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465629.332794] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.333475] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.333480] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.333258] [ndv4:55152:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.333265] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.333417] [ndv4:55152:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.333473] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.333480] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.333973] [ndv4:55152:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465629.333978] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.334285] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.334289] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.334436] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.334440] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.334554] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.334557] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.335022] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.335027] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.335237] [ndv4:55152:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.334364] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x128d1f0 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.334463] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465629.334470] [ndv4:55512:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465629.334620] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.334628] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.334651] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.334707] [ndv4:55512:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465629.334729] [ndv4:55512:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465629.335310] [ndv4:55512:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.335317] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.335477] [ndv4:55512:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.335501] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.335508] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.338843] [ndv4:54756:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.339146] [ndv4:54756:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465629.341725] [ndv4:55152:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.342047] [ndv4:55152:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465629.342976] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.342983] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.342972] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x253af90 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.342994] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465629.342997] [ndv4:55152:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465629.343339] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.343347] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.343353] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.343392] [ndv4:55152:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465629.344771] [ndv4:55152:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.344777] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.344936] [ndv4:55152:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.345201] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.345207] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.344729] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.344737] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.347206] [ndv4:55512:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465629.347215] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.348357] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x1e58d20 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.348381] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465629.348385] [ndv4:54756:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465629.348767] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.348777] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.348783] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.348829] [ndv4:54756:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465629.350324] [ndv4:54756:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.350329] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.350462] [ndv4:54756:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.350682] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.350688] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.350547] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.350558] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.354037] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.354044] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.356157] [ndv4:55152:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465629.356165] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.355662] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.355670] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.356282] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.356286] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.356505] [ndv4:55516:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.356025] [ndv4:54965:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.356032] [ndv4:54965:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.356126] [ndv4:54965:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x1385b10 0x1385b10 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465629.360922] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.360931] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.360742] [ndv4:54756:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465629.360751] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.363512] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.363518] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.362643] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.362649] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.364730] [ndv4:55516:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.365054] [ndv4:55516:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465629.364878] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.364884] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.365115] [ndv4:55512:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.366114] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x126ff60 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.366137] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465629.366140] [ndv4:55516:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465629.366390] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.366400] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.366405] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.366450] [ndv4:55516:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465629.367662] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.367670] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.368660] [ndv4:55516:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.368668] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.368801] [ndv4:55516:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.368954] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.368960] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.371045] [ndv4:55516:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465629.371053] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.372795] [ndv4:55512:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.372914] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.372921] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.374278] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.374283] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.373106] [ndv4:55512:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465629.373966] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x1285f60 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.373987] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465629.373991] [ndv4:55512:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465629.374179] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.374188] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.374193] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.374233] [ndv4:55512:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465629.375792] [ndv4:55512:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.375798] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.375955] [ndv4:55512:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.376142] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.376149] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.379836] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.379844] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.381304] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.381310] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.384243] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.384251] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.385599] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.385607] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.392869] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.392877] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.393091] [ndv4:55152:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.399992] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.400001] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.400224] [ndv4:55516:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.398544] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.398551] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.400524] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.400532] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.400854] [ndv4:54756:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.400147] [ndv4:55512:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465629.400155] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.400958] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.400963] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.401163] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.401166] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.406677] [ndv4:54756:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.406972] [ndv4:54756:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465629.407443] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x1e5adf0 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.407466] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465629.407469] [ndv4:54756:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465629.412262] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.412268] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.415353] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.415359] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.415568] [ndv4:55512:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.416911] [ndv4:55516:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.417198] [ndv4:55516:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465629.417454] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.417466] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.417471] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.417519] [ndv4:54756:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465629.418651] [ndv4:54756:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.418657] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.418835] [ndv4:54756:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.419020] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.419027] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.417880] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x1277cf0 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.417928] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465629.417932] [ndv4:55516:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465629.418068] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.418076] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.418081] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.418152] [ndv4:55516:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465629.419677] [ndv4:55516:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.419683] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.419816] [ndv4:55516:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.420235] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.420242] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.419998] [ndv4:55152:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.420335] [ndv4:55152:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465629.421262] [ndv4:54756:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465629.421268] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.422223] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x253cf40 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.422247] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465629.422250] [ndv4:55152:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465629.422569] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.422614] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.422619] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.422663] [ndv4:55152:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465629.423827] [ndv4:55152:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.423834] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.423986] [ndv4:55152:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.424098] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.424104] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.426091] [ndv4:54965:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b4b1ff61000 length 12288 | |
[1650465629.426745] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465629.427806] [ndv4:54965:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465629.427815] [ndv4:54965:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b4b246c5000 length 4296704 | |
[1650465629.427819] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b4b246c5018 of 4296680 bytes with 512 elements | |
[1650465629.428085] [ndv4:54965:0] mm_iface.c:600 UCX DEBUG created mm iface 0x1b02d20 FIFO id 0x4000000036dc426d va 0x2b4b1ff61000 size 12288 (128 x 64 elems) | |
[1650465629.428132] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x1b02d20 using posix/memory on worker 0x23dc840 | |
[1650465629.428157] [ndv4:54965:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465629.428193] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465629.429135] [ndv4:54965:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465629.429148] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b4b24ade018 of 4296680 bytes with 512 elements | |
[1650465629.429798] [ndv4:54965:0] mm_iface.c:600 UCX DEBUG created mm iface 0x1b032f0 FIFO id 0x718004 va 0x2b4b1ff64000 size 12288 (128 x 64 elems) | |
[1650465629.429806] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x1b032f0 using sysv/memory on worker 0x23dc840 | |
[1650465629.429818] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465629.429821] [ndv4:54965:0] self.c:220 UCX DEBUG created self iface id 0xc4ee86fb0bd28fe8 send_size 8192 | |
[1650465629.429827] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x1b16ed0 using self/memory0 on worker 0x23dc840 | |
[1650465629.429851] [ndv4:54965:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.429856] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.429858] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.432614] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1b24ef0 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.432643] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465629.432662] [ndv4:54965:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x1b17830: listening for connections (fd=78) on 10.5.0.5:52109 | |
[1650465629.432785] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x1b17830 using tcp/eth0 on worker 0x23dc840 | |
[1650465629.432805] [ndv4:54965:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.432808] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.432811] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.432847] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1a732b0 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.432867] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465629.432870] [ndv4:54965:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x1b17eb0: listening for connections (fd=80) on 127.0.0.1:46545 | |
[1650465629.432887] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465629.432893] [ndv4:54965:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465629.432950] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x1b17eb0 using tcp/lo on worker 0x23dc840 | |
[1650465629.432966] [ndv4:54965:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.432968] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.432971] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.433010] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1a72fe0 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.433029] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465629.433034] [ndv4:54965:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x1aff540: listening for connections (fd=82) on 172.16.1.242:34126 | |
[1650465629.430967] [ndv4:55516:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465629.430976] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.433239] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x1aff540 using tcp/ib0 on worker 0x23dc840 | |
[1650465629.433389] [ndv4:55512:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.433742] [ndv4:55512:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465629.434285] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x128dcf0 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.434308] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465629.434311] [ndv4:55512:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465629.434447] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.434456] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.434461] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.434501] [ndv4:55512:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465629.435063] [ndv4:55512:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.435068] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.435217] [ndv4:55512:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.435244] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.435249] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.435980] [ndv4:55152:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465629.435988] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.440058] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.440066] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.440965] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.440970] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.440501] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.440510] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.441310] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.441321] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.444628] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.444635] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.443854] [ndv4:55512:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465629.443861] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.450697] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.450705] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.458273] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.458285] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.459304] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.459931] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.459942] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.459128] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.459137] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.460569] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.460612] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.461183] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.461190] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.462152] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.462160] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.462387] [ndv4:55516:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.467675] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.467714] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.467718] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.467731] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.468673] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.468682] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.469217] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x1b0d680: created RC QP 0xf475 on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.470360] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.470369] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.470659] [ndv4:54756:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.471295] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.471302] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.472384] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.472388] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.472672] [ndv4:55152:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.478271] [ndv4:55152:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.478630] [ndv4:55152:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465629.479073] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2536cb0 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.479095] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465629.479098] [ndv4:55152:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465629.479252] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465629.479262] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465629.479267] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.479307] [ndv4:55152:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465629.482039] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x1b0d680 using rc_verbs/mlx5_ib0:1 on worker 0x23dc840 | |
[1650465629.482132] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.482138] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.482308] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.482313] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.482503] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.483013] [ndv4:54965:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465629.481708] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.481716] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.483275] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.483279] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.484034] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.484043] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.484046] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.484098] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.484554] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x24a6010 of 8176 bytes with 127 elements | |
[1650465629.484834] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.484858] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.484909] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465629.484914] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.489527] [ndv4:55516:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.489826] [ndv4:55516:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465629.490225] [ndv4:55152:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.490233] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.490398] [ndv4:55152:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.490828] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465629.490835] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465629.490705] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x126f2c0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.490728] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465629.490731] [ndv4:55516:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465629.491006] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.491015] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.491020] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.491060] [ndv4:55516:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465629.491923] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1a64060 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.491950] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465629.491968] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x1b1b0d0 using rc_mlx5/mlx5_ib0:1 on worker 0x23dc840 | |
[1650465629.492371] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.492379] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.492645] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.492651] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.493440] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.495318] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.495327] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.495330] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.495344] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.495661] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.495669] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.495700] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465629.495704] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.495712] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1395710 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.495730] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465629.496339] [ndv4:54965:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.496728] [ndv4:54756:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.497079] [ndv4:54756:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465629.497652] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x1e5e9c0 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.497686] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465629.497690] [ndv4:54756:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465629.497824] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465629.497837] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465629.497845] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.497902] [ndv4:54756:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465629.498924] [ndv4:54756:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.498930] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.499072] [ndv4:54756:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.499109] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465629.499117] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465629.501569] [ndv4:55516:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.501645] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.501794] [ndv4:55516:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.502084] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.502091] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.502917] [ndv4:54965:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x2710010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xf498 | |
[1650465629.503438] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.503445] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.503465] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b4b1ff69008 of 151544 bytes with 1052 elements | |
[1650465629.502643] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.502651] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.502862] [ndv4:55512:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.505819] [ndv4:55516:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465629.505826] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.507161] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b25000000..0x2b4b27600000 on mlx5_ib0 lkey 0x80600 rkey 0x80600 access 0xf flags 0x3e4 | |
[1650465629.507180] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b4b25000018 of 39845864 bytes with 4752 elements | |
[1650465629.507318] [ndv4:54965:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x2710010 | |
[1650465629.507352] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x2710010 using dc_mlx5/mlx5_ib0:1 on worker 0x23dc840 | |
[1650465629.507412] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.507422] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.507490] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.507496] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.507822] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.508592] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.508939] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x23fc6c0: created UD QP 0xf47e on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.509449] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.509763] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.509771] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.510049] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.510054] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.509654] [ndv4:55512:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.509958] [ndv4:55512:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465629.510195] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x12852c0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.510219] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465629.510222] [ndv4:55512:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465629.510392] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.510401] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.510406] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.510445] [ndv4:55512:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465629.511786] [ndv4:55512:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.511793] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.510481] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b276f8000..0x2b4b2777d000 on mlx5_ib0 lkey 0x80700 rkey 0x80700 access 0xf flags 0x3e4 | |
[1650465629.510488] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b276f8018 of 544744 bytes with 128 elements | |
[1650465629.510493] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.511193] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x23fc6c0: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465629.511223] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x23fc6c0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465629.511566] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x23fc6c0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465629.511653] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x23fc6c0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465629.511672] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x23fc6c0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465629.511692] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x23fc6c0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465629.511737] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x23fc6c0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465629.511963] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x23fc6c0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465629.512241] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.511940] [ndv4:55512:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.512021] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.512026] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.512329] [ndv4:55512:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465629.512333] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.512870] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.512874] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.513205] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.513208] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.511719] [ndv4:55152:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465629.511728] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.511828] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.511832] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.513972] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.513979] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.514838] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.514842] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.515080] [ndv4:55512:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.517860] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1a73a90 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.517894] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 mode thread_spinlock | |
[1650465629.517915] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x23fc6c0 using ud_verbs/mlx5_ib0:1 on worker 0x23dc840 | |
[1650465629.518048] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.518056] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.518153] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.518157] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.518971] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.519655] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.519911] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x1b252d0: created UD QP 0xf47f on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.519922] [ndv4:54965:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465629.519152] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.519162] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.520412] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.523329] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.523339] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.529562] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.529570] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.529735] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.529740] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.530328] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.530337] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.530086] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b2777d000..0x2b4b27802000 on mlx5_ib0 lkey 0x80800 rkey 0x80800 access 0xf flags 0x3e4 | |
[1650465629.530092] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b2777d018 of 544744 bytes with 128 elements | |
[1650465629.530097] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.530811] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x1b252d0: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465629.531262] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x1b252d0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465629.531240] [ndv4:55512:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.531560] [ndv4:55512:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465629.533205] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x12877e0 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.533230] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465629.533233] [ndv4:55512:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465629.532066] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x1b252d0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465629.533167] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x1b252d0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465629.533742] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.533752] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.533757] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.533794] [ndv4:55512:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465629.536093] [ndv4:55512:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.536101] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.536252] [ndv4:55512:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.536447] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.536453] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.535708] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.535717] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.539078] [ndv4:54756:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465629.539088] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.541511] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.541520] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.541640] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.541643] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.541854] [ndv4:55516:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.543483] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x1b252d0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465629.547005] [ndv4:55512:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465629.547012] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.548716] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.548723] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.548628] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.548638] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.551958] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x1b252d0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465629.554962] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.554971] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.555184] [ndv4:55152:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.557985] [ndv4:55516:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.558266] [ndv4:55516:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465629.559386] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.559393] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.559547] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.559550] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.559319] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x12717e0 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.559344] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465629.559347] [ndv4:55516:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465629.559457] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.559464] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.559469] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.559510] [ndv4:55516:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465629.560990] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.560998] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.560241] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x1b252d0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465629.561009] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x1b252d0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465629.561015] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.561046] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.561050] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1b02c00 [id=89 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.561070] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 89 events 0x5 mode thread_spinlock | |
[1650465629.561083] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[10]=0x1b252d0 using ud_mlx5/mlx5_ib0:1 on worker 0x23dc840 | |
[1650465629.561233] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.561239] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.569474] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.569491] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.570422] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.570418] [ndv4:55516:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.570426] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.570558] [ndv4:55516:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.570838] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.570845] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.571860] [ndv4:55152:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.572207] [ndv4:55152:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465629.571854] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.571863] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.571865] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.571879] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.571777] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.571785] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.572938] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.572944] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.573342] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x2cad050: created RC QP 0xda68 on mlx5_ib1:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.573673] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[11]=0x2cad050 using rc_verbs/mlx5_ib1:1 on worker 0x23dc840 | |
[1650465629.579071] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.579079] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.579291] [ndv4:55512:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.580977] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x25387e0 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.581001] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465629.581005] [ndv4:55152:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465629.581260] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465629.581270] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465629.581275] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.581317] [ndv4:55152:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465629.582090] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.582097] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.582382] [ndv4:54756:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.583677] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.583688] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.583827] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.583832] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.588997] [ndv4:55512:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.589322] [ndv4:55512:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465629.590196] [ndv4:54756:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.590509] [ndv4:54756:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465629.589850] [ndv4:55516:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465629.589859] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.592048] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.592123] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x196ebf0 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.592154] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465629.592158] [ndv4:55512:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465629.592658] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.592666] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.592671] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.592714] [ndv4:55512:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465629.591428] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.591435] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.592071] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x1e5ad90 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.592095] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465629.592098] [ndv4:54756:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465629.592401] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465629.592410] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465629.592418] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.592469] [ndv4:54756:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465629.593826] [ndv4:54756:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.593834] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.593849] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.593858] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.593861] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.593879] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.594001] [ndv4:54756:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.594038] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465629.594046] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465629.594324] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x2cc0010 of 8176 bytes with 127 elements | |
[1650465629.594551] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.594563] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.594618] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465629.594623] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.594635] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1b0a9a0 [id=92 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.594656] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 92 events 0x1 mode thread_spinlock | |
[1650465629.594669] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[12]=0x2b2e010 using rc_mlx5/mlx5_ib1:1 on worker 0x23dc840 | |
[1650465629.594732] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.594739] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.594879] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.594884] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.595308] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.596673] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.596696] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.596700] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.596757] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.597196] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.597203] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.597238] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465629.597242] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.597249] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1b16de0 [id=94 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.597271] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 94 events 0x1 mode thread_spinlock | |
[1650465629.597864] [ndv4:54965:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.599779] [ndv4:55152:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.599793] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.599955] [ndv4:55152:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.600083] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465629.600090] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465629.601665] [ndv4:55152:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465629.601673] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.602046] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.602050] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.602477] [ndv4:55512:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.602486] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.602675] [ndv4:55512:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.602692] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.602698] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.603230] [ndv4:55512:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465629.603235] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.603898] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.603902] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.605126] [ndv4:55432:0] debug.c:1198 UCX DEBUG using signal stack 0x2aee39a40000 size 141824 | |
[1650465629.605202] [ndv4:55432:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.605224] [ndv4:55432:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2aee398ae000 | |
[1650465629.605241] [ndv4:55432:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465629.605250] [ndv4:55432:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465629.605256] [ndv4:55432:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465629.605219] [ndv4:54965:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x2e58010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xda8b | |
[1650465629.605300] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.605306] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.605314] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b4b1ff90008 of 151544 bytes with 1052 elements | |
[1650465629.609236] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b27a00000..0x2b4b2a000000 on mlx5_ib1 lkey 0x80400 rkey 0x80400 access 0xf flags 0x3e4 | |
[1650465629.609258] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b4b27a00018 of 39845864 bytes with 4752 elements | |
[1650465629.609398] [ndv4:54965:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x2e58010 | |
[1650465629.609432] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[13]=0x2e58010 using dc_mlx5/mlx5_ib1:1 on worker 0x23dc840 | |
[1650465629.609533] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.609545] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.609667] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.609672] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.607726] [ndv4:55432:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.607745] [ndv4:55432:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465629.607777] [ndv4:55432:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465629.607780] [ndv4:55432:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465629.607787] [ndv4:55432:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465629.607794] [ndv4:55432:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465629.607796] [ndv4:55432:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465629.607802] [ndv4:55432:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465629.607804] [ndv4:55432:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465629.607806] [ndv4:55432:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465629.607809] [ndv4:55432:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465629.607811] [ndv4:55432:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465629.607819] [ndv4:55432:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465629.608495] [ndv4:55432:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465629.609063] [ndv4:55432:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465629.609076] [ndv4:55432:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465629.609088] [ndv4:55432:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465629.609098] [ndv4:55432:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465629.609109] [ndv4:55432:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465629.609119] [ndv4:55432:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465629.609129] [ndv4:55432:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465629.609487] [ndv4:55432:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465629.610070] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.610937] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.611250] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x2d57460: created UD QP 0xda71 on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.611788] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.611868] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.611874] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.611919] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.611924] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.613363] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.613371] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.612365] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b2a003000..0x2b4b2a088000 on mlx5_ib1 lkey 0x80500 rkey 0x80500 access 0xf flags 0x3e4 | |
[1650465629.612371] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b2a003018 of 544744 bytes with 128 elements | |
[1650465629.612375] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.612758] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2d57460: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465629.613073] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2d57460: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465629.613339] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2d57460: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465629.615129] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.615137] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.615425] [ndv4:54756:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465629.615436] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.619285] [ndv4:55432:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.619663] [ndv4:55432:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465629.623945] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2d57460: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465629.624195] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2d57460: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465629.624358] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2d57460: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465629.624675] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2d57460: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465629.626343] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.626351] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.625454] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.625462] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.626448] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.626452] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.626754] [ndv4:55516:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.630032] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.630039] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.633515] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2d57460: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465629.633872] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.633882] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1b09820 [id=95 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.633909] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 95 events 0x5 mode thread_spinlock | |
[1650465629.633921] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[14]=0x2d57460 using ud_verbs/mlx5_ib1:1 on worker 0x23dc840 | |
[1650465629.634016] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.634022] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.634170] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.634174] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.635378] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.634295] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.634304] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.635444] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x210e320 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.635540] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465629.635548] [ndv4:55432:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465629.634390] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.634398] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.635993] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.636003] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.636027] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.636085] [ndv4:55432:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465629.636107] [ndv4:55432:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465629.637468] [ndv4:55432:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.637475] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.635502] [ndv4:55516:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.635893] [ndv4:55516:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465629.637206] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x1958bf0 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.637241] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465629.637245] [ndv4:55516:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465629.637482] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.637494] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.637505] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.637609] [ndv4:55516:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465629.637736] [ndv4:55432:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.637893] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.637899] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.636829] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.636835] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.639314] [ndv4:55152:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.639321] [ndv4:55152:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.639407] [ndv4:55152:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x2523d70 0x2523d70 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465629.639961] [ndv4:55516:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.639971] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.640125] [ndv4:55516:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.640159] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.640167] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.641513] [ndv4:55516:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465629.641518] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.640696] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.640729] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.644787] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.644477] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.644484] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.644758] [ndv4:55512:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.645244] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x2c1e280: created UD QP 0xda72 on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.645251] [ndv4:54965:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465629.645932] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.646214] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.646222] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.646358] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.646363] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.647179] [ndv4:55432:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465629.647187] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.646753] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b2a088000..0x2b4b2a10d000 on mlx5_ib1 lkey 0x80600 rkey 0x80600 access 0xf flags 0x3e4 | |
[1650465629.646759] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b2a088018 of 544744 bytes with 128 elements | |
[1650465629.646764] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.647113] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2c1e280: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465629.647366] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2c1e280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465629.649230] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.649237] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.649659] [ndv4:55691:0] debug.c:1198 UCX DEBUG using signal stack 0x2b871c5e3000 size 141824 | |
[1650465629.649731] [ndv4:55691:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.649750] [ndv4:55691:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b871c437000 | |
[1650465629.649771] [ndv4:55691:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465629.649779] [ndv4:55691:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465629.649785] [ndv4:55691:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465629.650472] [ndv4:54756:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.650483] [ndv4:54756:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.650795] [ndv4:54756:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x1e43b50 0x1e43b50 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465629.652420] [ndv4:55691:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.652436] [ndv4:55691:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465629.652472] [ndv4:55691:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465629.652475] [ndv4:55691:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465629.652481] [ndv4:55691:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465629.652488] [ndv4:55691:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465629.652491] [ndv4:55691:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465629.652495] [ndv4:55691:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465629.652498] [ndv4:55691:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465629.652501] [ndv4:55691:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465629.652504] [ndv4:55691:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465629.652506] [ndv4:55691:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465629.652514] [ndv4:55691:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465629.653260] [ndv4:55691:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465629.654170] [ndv4:55691:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465629.654188] [ndv4:55691:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465629.654200] [ndv4:55691:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465629.654212] [ndv4:55691:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465629.654357] [ndv4:55691:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465629.654511] [ndv4:55691:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465629.654528] [ndv4:55691:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465629.654940] [ndv4:55691:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465629.657259] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2c1e280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465629.657690] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2c1e280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465629.658091] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2c1e280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465629.658369] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2c1e280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465629.661807] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.661817] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.661786] [ndv4:55512:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.662107] [ndv4:55512:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465629.663182] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x128da10 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.663204] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465629.663207] [ndv4:55512:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465629.664428] [ndv4:55691:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.663491] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465629.663500] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465629.663505] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.663545] [ndv4:55512:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465629.664728] [ndv4:55512:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.664734] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.664790] [ndv4:55691:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465629.664899] [ndv4:55512:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.665070] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465629.665077] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465629.666607] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2c1e280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465629.666889] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x2c1e280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465629.666896] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.666928] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.666932] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1b18b40 [id=96 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.666953] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 96 events 0x5 mode thread_spinlock | |
[1650465629.666965] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[15]=0x2c1e280 using ud_mlx5/mlx5_ib1:1 on worker 0x23dc840 | |
[1650465629.667120] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.667126] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.667260] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.667265] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.667710] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465629.669124] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.669143] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.669146] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.669206] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.669194] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.669205] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.670232] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.670238] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.670752] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x32470a0: created RC QP 0xdacc on mlx5_ib2:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.671077] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[16]=0x32470a0 using rc_verbs/mlx5_ib2:1 on worker 0x23dc840 | |
[1650465629.671265] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.671271] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.671989] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x23c5280 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.672086] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465629.672093] [ndv4:55691:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465629.672451] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.672462] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.672486] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.672571] [ndv4:55691:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465629.672628] [ndv4:55691:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465629.671666] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.671674] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.676694] [ndv4:55512:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465629.676706] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.680029] [ndv4:55152:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b3b9706c000 length 12288 | |
[1650465629.680097] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465629.681185] [ndv4:55152:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465629.681195] [ndv4:55152:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b3b9706f000 length 4296704 | |
[1650465629.681200] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b3b9706f018 of 4296680 bytes with 512 elements | |
[1650465629.681456] [ndv4:55152:0] mm_iface.c:600 UCX DEBUG created mm iface 0x2ca0d60 FIFO id 0x400000002cee43e8 va 0x2b3b9706c000 size 12288 (128 x 64 elems) | |
[1650465629.681507] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x2ca0d60 using posix/memory on worker 0x357a8c0 | |
[1650465629.681533] [ndv4:55152:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465629.681571] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465629.681606] [ndv4:55152:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465629.681616] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b3b9748b018 of 4296680 bytes with 512 elements | |
[1650465629.682172] [ndv4:55152:0] mm_iface.c:600 UCX DEBUG created mm iface 0x2ca1330 FIFO id 0x718018 va 0x2b3b97488000 size 12288 (128 x 64 elems) | |
[1650465629.682181] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x2ca1330 using sysv/memory on worker 0x357a8c0 | |
[1650465629.682193] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465629.682196] [ndv4:55152:0] self.c:220 UCX DEBUG created self iface id 0xcf9893745fe6be74 send_size 8192 | |
[1650465629.682202] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x2cb4f10 using self/memory0 on worker 0x357a8c0 | |
[1650465629.682224] [ndv4:55152:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.682230] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.682233] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.679925] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.679935] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.680376] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465629.681723] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.681739] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.681743] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.681794] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.682267] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3568010 of 8176 bytes with 127 elements | |
[1650465629.682484] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.682495] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.682531] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465629.682537] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.682550] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1b00f60 [id=99 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.682570] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 99 events 0x1 mode thread_spinlock | |
[1650465629.682616] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[17]=0x3394030 using rc_mlx5/mlx5_ib2:1 on worker 0x23dc840 | |
[1650465629.682726] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.682733] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.682811] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.682816] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.680993] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.681001] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.682909] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.682916] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.683216] [ndv4:55516:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.682942] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465629.684068] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.684084] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.684087] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.684142] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.684507] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.684514] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.684546] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465629.684550] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.684557] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x28e9ef0 [id=101 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.684608] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 101 events 0x1 mode thread_spinlock | |
[1650465629.685205] [ndv4:54965:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.685452] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2cb7050 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.685487] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465629.685505] [ndv4:55152:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x2cb5870: listening for connections (fd=78) on 10.5.0.5:46789 | |
[1650465629.685721] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x2cb5870 using tcp/eth0 on worker 0x357a8c0 | |
[1650465629.685743] [ndv4:55152:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.685747] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.685750] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.685790] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2c119a0 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.685810] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465629.685814] [ndv4:55152:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x2cb5ef0: listening for connections (fd=80) on 127.0.0.1:59444 | |
[1650465629.685829] [ndv4:55152:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465629.685834] [ndv4:55152:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465629.685891] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x2cb5ef0 using tcp/lo on worker 0x357a8c0 | |
[1650465629.685908] [ndv4:55152:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.685911] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.685914] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.685950] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2c0fb10 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.685974] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465629.685978] [ndv4:55152:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x2c9d580: listening for connections (fd=82) on 172.16.1.242:39739 | |
[1650465629.686197] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x2c9d580 using tcp/ib0 on worker 0x357a8c0 | |
[1650465629.686387] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.686395] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.686560] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.686565] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.687243] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.688034] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.688072] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.688076] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.688088] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.688479] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.688489] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.688526] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.688534] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.689697] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x2cab6c0: created RC QP 0xf480 on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.691626] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x2cab6c0 using rc_verbs/mlx5_ib0:1 on worker 0x357a8c0 | |
[1650465629.691668] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.691674] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.691772] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.691776] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.690406] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.690420] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.692232] [ndv4:54965:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x356a050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdadf | |
[1650465629.692290] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.692296] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.692305] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b4b1ffb7008 of 151544 bytes with 1052 elements | |
[1650465629.692641] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.692870] [ndv4:55152:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465629.694006] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.694016] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.694019] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.694072] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.694498] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3644010 of 8176 bytes with 127 elements | |
[1650465629.695644] [ndv4:55691:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.695659] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.695817] [ndv4:55691:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.695974] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.695981] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.694774] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.694797] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.694848] [ndv4:55152:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465629.694854] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.695972] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b2a200000..0x2b4b2c800000 on mlx5_ib2 lkey 0x80600 rkey 0x80600 access 0xf flags 0x3e4 | |
[1650465629.695990] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b4b2a200018 of 39845864 bytes with 4752 elements | |
[1650465629.696129] [ndv4:54965:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x356a050 | |
[1650465629.696160] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[18]=0x356a050 using dc_mlx5/mlx5_ib2:1 on worker 0x23dc840 | |
[1650465629.696407] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.696418] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.696691] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.696696] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.697604] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465629.698909] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.699211] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x3746060: created UD QP 0xdad5 on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.699781] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.699879] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.699886] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.699921] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.699926] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.699778] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.699786] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.700007] [ndv4:55432:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.700379] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b2c90e000..0x2b4b2c993000 on mlx5_ib2 lkey 0x80700 rkey 0x80700 access 0xf flags 0x3e4 | |
[1650465629.700385] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b2c90e018 of 544744 bytes with 128 elements | |
[1650465629.700389] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.700434] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3746060: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465629.700460] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3746060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465629.700476] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3746060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465629.700497] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3746060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465629.700525] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3746060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465629.700553] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3746060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465629.700643] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3746060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465629.700672] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3746060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465629.700943] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.700952] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x2cc39c0 [id=102 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.700981] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 102 events 0x5 mode thread_spinlock | |
[1650465629.700992] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[19]=0x3746060 using ud_verbs/mlx5_ib2:1 on worker 0x23dc840 | |
[1650465629.701003] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.701008] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.701050] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.701054] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.701197] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465629.701899] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.702210] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x139f1a0: created UD QP 0xdad6 on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.702217] [ndv4:54965:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465629.701931] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2c01350 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.701961] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465629.701980] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x2cb9110 using rc_mlx5/mlx5_ib0:1 on worker 0x357a8c0 | |
[1650465629.702003] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.702010] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.702087] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.702093] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.702278] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.702508] [ndv4:55516:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.702874] [ndv4:55516:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465629.702774] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.702848] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.702854] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.702865] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.702869] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.702995] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.703006] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.703010] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.703024] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.703198] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b2c993000..0x2b4b2ca18000 on mlx5_ib2 lkey 0x80800 rkey 0x80800 access 0xf flags 0x3e4 | |
[1650465629.703205] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b2c993018 of 544744 bytes with 128 elements | |
[1650465629.703210] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.703391] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.703399] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.703431] [ndv4:55152:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465629.703435] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.703444] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2c0fb80 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.703464] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465629.704067] [ndv4:55152:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.704179] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.704190] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.708180] [ndv4:55432:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.708537] [ndv4:55432:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465629.709052] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x2134fc0 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.709077] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465629.709081] [ndv4:55432:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465629.709269] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.709278] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.709283] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.709325] [ndv4:55432:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465629.709938] [ndv4:55432:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.709943] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.711535] [ndv4:55152:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x38ae010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xf4a6 | |
[1650465629.711884] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.711891] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.711911] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b3b978a6008 of 151544 bytes with 1052 elements | |
[1650465629.712100] [ndv4:55691:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465629.712109] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.712502] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.712506] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.710162] [ndv4:55432:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.710244] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.710250] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.711188] [ndv4:55432:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465629.711192] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.714875] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x139f1a0: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465629.715002] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x139f1a0: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465629.715092] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x139f1a0: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465629.715183] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x139f1a0: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465629.715250] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x139f1a0: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465629.715338] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x139f1a0: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465629.715427] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x139f1a0: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465629.715515] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x139f1a0: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465629.715521] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.715552] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.715556] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3746fc0 [id=103 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.715594] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 103 events 0x5 mode thread_spinlock | |
[1650465629.715607] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[20]=0x139f1a0 using ud_mlx5/mlx5_ib2:1 on worker 0x23dc840 | |
[1650465629.715623] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.715629] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.715712] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.715718] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.715864] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465629.715533] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x1277a10 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.715587] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465629.715593] [ndv4:55516:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465629.715828] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465629.715838] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465629.715844] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.715909] [ndv4:55516:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465629.715594] [ndv4:54756:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b3b53d21000 length 12288 | |
[1650465629.715665] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465629.716781] [ndv4:54756:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465629.716790] [ndv4:54756:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b3b58000000 length 4296704 | |
[1650465629.716796] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b3b58000018 of 4296680 bytes with 512 elements | |
[1650465629.717053] [ndv4:54756:0] mm_iface.c:600 UCX DEBUG created mm iface 0x25c0d70 FIFO id 0x400000004ce22b49 va 0x2b3b53d21000 size 12288 (128 x 64 elems) | |
[1650465629.717103] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x25c0d70 using posix/memory on worker 0x2e9a8d0 | |
[1650465629.717130] [ndv4:54756:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465629.717162] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465629.717178] [ndv4:54756:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465629.717186] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b3b58419018 of 4296680 bytes with 512 elements | |
[1650465629.717847] [ndv4:54756:0] mm_iface.c:600 UCX DEBUG created mm iface 0x25c1340 FIFO id 0x71801c va 0x2b3b53d24000 size 12288 (128 x 64 elems) | |
[1650465629.717856] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x25c1340 using sysv/memory on worker 0x2e9a8d0 | |
[1650465629.717868] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465629.717871] [ndv4:54756:0] self.c:220 UCX DEBUG created self iface id 0x98bcd82d5f8a642c send_size 8192 | |
[1650465629.717877] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x25d4f20 using self/memory0 on worker 0x2e9a8d0 | |
[1650465629.717901] [ndv4:54756:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.717906] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.717909] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.716986] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.717001] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.717004] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.717053] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.718161] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.718166] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.718669] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x39640a0: created RC QP 0xda53 on mlx5_ib3:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.719012] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[21]=0x39640a0 using rc_verbs/mlx5_ib3:1 on worker 0x23dc840 | |
[1650465629.719135] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.719141] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.719261] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.719266] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.719649] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465629.717809] [ndv4:55516:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.717817] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.717986] [ndv4:55516:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.718171] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465629.718178] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465629.719792] [ndv4:55516:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465629.719798] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.717435] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b9c200000..0x2b3b9e800000 on mlx5_ib0 lkey 0x80900 rkey 0x80900 access 0xf flags 0x3e4 | |
[1650465629.717456] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b3b9c200018 of 39845864 bytes with 4752 elements | |
[1650465629.717674] [ndv4:55152:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x38ae010 | |
[1650465629.717706] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x38ae010 using dc_mlx5/mlx5_ib0:1 on worker 0x357a8c0 | |
[1650465629.717912] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.717924] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.718167] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.718173] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.719007] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.720005] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.720350] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x2cc1a80: created UD QP 0xf489 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.716732] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.716751] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.720887] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.720900] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.720903] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.720954] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.721275] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3c85010 of 8176 bytes with 127 elements | |
[1650465629.720922] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.721155] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.721161] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.721225] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.721230] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.720439] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x25e2f40 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.720469] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465629.720488] [ndv4:54756:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x25d5880: listening for connections (fd=78) on 10.5.0.5:49260 | |
[1650465629.720635] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x25d5880 using tcp/eth0 on worker 0x2e9a8d0 | |
[1650465629.720655] [ndv4:54756:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.720659] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.720662] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.720725] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x25319c0 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.720745] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465629.720749] [ndv4:54756:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x25d5f00: listening for connections (fd=80) on 127.0.0.1:33573 | |
[1650465629.720767] [ndv4:54756:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465629.720772] [ndv4:54756:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465629.720837] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x25d5f00 using tcp/lo on worker 0x2e9a8d0 | |
[1650465629.720853] [ndv4:54756:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.720855] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.720858] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.720895] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x252fb30 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.720917] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465629.720920] [ndv4:54756:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x25bd590: listening for connections (fd=82) on 172.16.1.242:52612 | |
[1650465629.721282] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x25bd590 using tcp/ib0 on worker 0x2e9a8d0 | |
[1650465629.721410] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.721418] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.721594] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.721601] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.721773] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.721665] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b978cb000..0x2b3b97950000 on mlx5_ib0 lkey 0x80a00 rkey 0x80a00 access 0xf flags 0x3e4 | |
[1650465629.721672] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b978cb018 of 544744 bytes with 128 elements | |
[1650465629.721676] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.721484] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.721489] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.721526] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465629.721530] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.721540] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3864ae0 [id=106 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.721565] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 106 events 0x1 mode thread_spinlock | |
[1650465629.721573] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[22]=0x3ab1030 using rc_mlx5/mlx5_ib3:1 on worker 0x23dc840 | |
[1650465629.722241] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.722249] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.722674] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.722722] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.722727] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.722766] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.723070] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.723079] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.723623] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x25cb6d0: created RC QP 0xf48a on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.727771] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.727777] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.728073] [ndv4:55512:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.729286] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.729295] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.729518] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.729523] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.730286] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465629.729451] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2cc1a80: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465629.729814] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2cc1a80: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465629.730558] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2cc1a80: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465629.731122] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.731129] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.731859] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.731877] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.731881] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.731935] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.732386] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.732393] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.732425] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465629.732428] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.732435] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x1b05480 [id=108 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.732454] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 108 events 0x1 mode thread_spinlock | |
[1650465629.732808] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.732817] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.733025] [ndv4:54965:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.734355] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.734361] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.737356] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.737362] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.737679] [ndv4:55691:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.740192] [ndv4:54965:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3c87050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xda66 | |
[1650465629.740561] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.740569] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.740633] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b4b2f219008 of 151544 bytes with 1052 elements | |
[1650465629.739538] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2cc1a80: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465629.740143] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2cc1a80: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465629.741014] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2cc1a80: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465629.741525] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2cc1a80: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465629.742380] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2cc1a80: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465629.742533] [ndv4:55512:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.741706] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x25cb6d0 using rc_verbs/mlx5_ib0:1 on worker 0x2e9a8d0 | |
[1650465629.741882] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.741889] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.741961] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.741965] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.742744] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.742915] [ndv4:55512:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465629.742722] [ndv4:55152:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.744295] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b2cc00000..0x2b4b2f200000 on mlx5_ib3 lkey 0x80800 rkey 0x80800 access 0xf flags 0x3e4 | |
[1650465629.744314] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b4b2cc00018 of 39845864 bytes with 4752 elements | |
[1650465629.744452] [ndv4:54965:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3c87050 | |
[1650465629.744486] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[23]=0x3c87050 using dc_mlx5/mlx5_ib3:1 on worker 0x23dc840 | |
[1650465629.744787] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.744798] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.744997] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.745002] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.745738] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465629.749423] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.749431] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.748526] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2ca0c40 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.748558] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 mode thread_spinlock | |
[1650465629.748657] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x2cc1a80 using ud_verbs/mlx5_ib0:1 on worker 0x357a8c0 | |
[1650465629.748817] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.748824] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.749078] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.749083] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.749949] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.750518] [ndv4:54756:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465629.750967] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.751381] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x359a740: created UD QP 0xf48b on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.751397] [ndv4:55152:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465629.752041] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.751510] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.751524] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.751527] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.751621] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.752036] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x2f64010 of 8176 bytes with 127 elements | |
[1650465629.752249] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.752272] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.752323] [ndv4:54756:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465629.752329] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.752346] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x196f4b0 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.752372] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465629.752375] [ndv4:55512:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465629.752633] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465629.752643] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465629.752649] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.752704] [ndv4:55512:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465629.754838] [ndv4:55512:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.754845] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.755007] [ndv4:55512:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.755213] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465629.755220] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465629.752929] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.752940] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.754944] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.754950] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.753717] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.754029] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x3e63060: created UD QP 0xda5c on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.754516] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.754728] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.754736] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.754892] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.754896] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.755340] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b2f23e000..0x2b4b2f2c3000 on mlx5_ib3 lkey 0x80900 rkey 0x80900 access 0xf flags 0x3e4 | |
[1650465629.755350] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b2f23e018 of 544744 bytes with 128 elements | |
[1650465629.755355] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.755497] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3e63060: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465629.756485] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3e63060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465629.757709] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3e63060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465629.758191] [ndv4:55512:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465629.758198] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.758527] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.758534] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.758947] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.758955] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.759173] [ndv4:55432:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.758979] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3e63060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465629.759839] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3e63060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465629.759302] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x25213a0 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.759337] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465629.759360] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x25d9120 using rc_mlx5/mlx5_ib0:1 on worker 0x2e9a8d0 | |
[1650465629.759672] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.759680] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.759770] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.759775] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.759994] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.760004] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.760115] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.760121] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.760535] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.760903] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.760909] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.760637] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b97950000..0x2b3b979d5000 on mlx5_ib0 lkey 0x80b00 rkey 0x80b00 access 0xf flags 0x3e4 | |
[1650465629.760644] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b97950018 of 544744 bytes with 128 elements | |
[1650465629.760648] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.761540] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.761552] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.761555] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.761571] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.761794] [ndv4:55691:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.761988] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.761995] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.762041] [ndv4:54756:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465629.762045] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.762056] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x1e53fb0 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.762076] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465629.762171] [ndv4:55691:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465629.762570] [ndv4:54756:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.763154] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x23bbc50 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.763183] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465629.763187] [ndv4:55691:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465629.763434] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.763444] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.763450] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.763496] [ndv4:55691:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465629.765852] [ndv4:55691:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.765860] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.766047] [ndv4:55691:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.766137] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.766143] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.767435] [ndv4:55691:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465629.767439] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.768569] [ndv4:54756:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x31ce010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xf4b4 | |
[1650465629.768831] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.768837] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.769455] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3e63060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465629.770082] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3e63060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465629.770533] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3e63060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465629.769838] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465629.769846] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465629.770130] [ndv4:55516:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.770779] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x359a740: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465629.770866] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.770879] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3222fc0 [id=109 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.770907] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 109 events 0x5 mode thread_spinlock | |
[1650465629.770921] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[24]=0x3e63060 using ud_verbs/mlx5_ib3:1 on worker 0x23dc840 | |
[1650465629.770940] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.770946] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.771059] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.771064] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.771457] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465629.776945] [ndv4:55516:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.777233] [ndv4:55516:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465629.777699] [ndv4:55432:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.777967] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x19594b0 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.777993] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465629.777998] [ndv4:55516:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465629.778075] [ndv4:55432:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465629.778177] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465629.778187] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465629.778193] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.778245] [ndv4:55516:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465629.780125] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.780132] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.780432] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.780436] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.780694] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.780702] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.780929] [ndv4:55691:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.783957] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x359a740: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465629.784076] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x359a740: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465629.784130] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x359a740: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465629.784161] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x359a740: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465629.784204] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x359a740: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465629.784363] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x359a740: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465629.784552] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x359a740: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465629.784558] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.784604] [ndv4:55152:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.784609] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2533420 [id=89 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.784633] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 89 events 0x5 mode thread_spinlock | |
[1650465629.784647] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[10]=0x359a740 using ud_mlx5/mlx5_ib0:1 on worker 0x357a8c0 | |
[1650465629.784683] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.784689] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.784807] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.784811] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.785215] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.785984] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.785993] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.785996] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.786009] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.784821] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.784828] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.785469] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.785474] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.785894] [ndv4:55512:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.785898] [ndv4:55512:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.785996] [ndv4:55512:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x1272ca0 0x1272ca0 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465629.785516] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.785900] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x3f81460: created UD QP 0xda5d on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.785910] [ndv4:54965:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465629.786622] [ndv4:55691:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.786658] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.786665] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.786475] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.786609] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.786616] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.786679] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.786684] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.786996] [ndv4:55691:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465629.787758] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x3e4b050: created RC QP 0xda73 on mlx5_ib1:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.787684] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b2f2c3000..0x2b4b2f348000 on mlx5_ib3 lkey 0x80a00 rkey 0x80a00 access 0xf flags 0x3e4 | |
[1650465629.787693] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b2f2c3018 of 544744 bytes with 128 elements | |
[1650465629.787698] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.788170] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3f81460: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465629.788603] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[11]=0x3e4b050 using rc_verbs/mlx5_ib1:1 on worker 0x357a8c0 | |
[1650465629.788663] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.788669] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.788733] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.788737] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.789276] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.789982] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.789992] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.790021] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b3b53d29008 of 151544 bytes with 1052 elements | |
[1650465629.791829] [ndv4:55516:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.791853] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.792039] [ndv4:55516:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.793740] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b58a00000..0x2b3b5b000000 on mlx5_ib0 lkey 0x80c00 rkey 0x80c00 access 0xf flags 0x3e4 | |
[1650465629.793769] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b3b58a00018 of 39845864 bytes with 4752 elements | |
[1650465629.794186] [ndv4:54756:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x31ce010 | |
[1650465629.794265] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x31ce010 using dc_mlx5/mlx5_ib0:1 on worker 0x2e9a8d0 | |
[1650465629.794348] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.794357] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.794638] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.794644] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.795669] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.796393] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.796876] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x25e1a90: created UD QP 0xf494 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.797337] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.797882] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.797895] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.798025] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.798031] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.796540] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3f81460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465629.796684] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3f81460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465629.796860] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3f81460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465629.797187] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3f81460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465629.797466] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3f81460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465629.797531] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3f81460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465629.797563] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x3f81460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465629.797569] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.797672] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.797679] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3f81f60 [id=110 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.797697] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 110 events 0x5 mode thread_spinlock | |
[1650465629.797708] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[25]=0x3f81460 using ud_mlx5/mlx5_ib3:1 on worker 0x23dc840 | |
[1650465629.797897] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.797904] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.798070] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.798075] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.799248] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465629.798398] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b53d4e000..0x2b3b53dd3000 on mlx5_ib0 lkey 0x80d00 rkey 0x80d00 access 0xf flags 0x3e4 | |
[1650465629.798412] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b53d4e018 of 544744 bytes with 128 elements | |
[1650465629.798429] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.799392] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e1a90: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465629.799936] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e1a90: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465629.800273] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e1a90: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465629.800538] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e1a90: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465629.800810] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e1a90: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465629.801055] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e1a90: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465629.801217] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e1a90: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465629.801304] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e1a90: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465629.800401] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.800412] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.800415] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.800431] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.800811] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3e5e010 of 8176 bytes with 127 elements | |
[1650465629.801046] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.801052] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.801086] [ndv4:55152:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465629.801090] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.801100] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x3e53fb0 [id=92 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.801123] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 92 events 0x1 mode thread_spinlock | |
[1650465629.801133] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[12]=0x3ccc010 using rc_mlx5/mlx5_ib1:1 on worker 0x357a8c0 | |
[1650465629.801207] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.801213] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.801305] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.801310] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.801590] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.801556] [ndv4:54756:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.803724] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x2134ce0 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.803753] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465629.803758] [ndv4:55432:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465629.803965] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.803977] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.803982] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.804025] [ndv4:55432:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465629.804647] [ndv4:55432:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.804652] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.804866] [ndv4:55432:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.804910] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.804916] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.805499] [ndv4:55432:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465629.805504] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.806165] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.806169] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.803837] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465629.803856] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465629.804262] [ndv4:55516:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465629.804267] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.805062] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.805067] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.805787] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.805793] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.806933] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.806938] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.807288] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x25c94c0 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.807321] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 mode thread_spinlock | |
[1650465629.807354] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x25e1a90 using ud_verbs/mlx5_ib0:1 on worker 0x2e9a8d0 | |
[1650465629.807800] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x23c3cb0 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.807829] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465629.807834] [ndv4:55691:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465629.808019] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.808034] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.808044] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.808100] [ndv4:55691:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465629.809019] [ndv4:55691:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.809025] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.809170] [ndv4:55691:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.809301] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.809308] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.809972] [ndv4:55691:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465629.809977] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.810368] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.810372] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.810487] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.810490] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.810697] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.810704] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.810821] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.810825] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.811049] [ndv4:55691:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.813645] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.813667] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.813670] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.813726] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.814765] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.814771] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.815317] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x40810a0: created RC QP 0xdac5 on mlx5_ib4:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.815668] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[26]=0x40810a0 using rc_verbs/mlx5_ib4:1 on worker 0x23dc840 | |
[1650465629.815690] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.815695] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.815554] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.815594] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.815598] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.815653] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.816073] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.816083] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.816116] [ndv4:55152:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465629.816121] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.816130] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2cb4e80 [id=94 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.816152] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 94 events 0x1 mode thread_spinlock | |
[1650465629.816790] [ndv4:55152:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.818126] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.818136] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.818296] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.818302] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.819015] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.818852] [ndv4:55516:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465629.818860] [ndv4:55516:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465629.818978] [ndv4:55516:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x125cca0 0x125cca0 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465629.820672] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.821138] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x25e3320: created UD QP 0xf496 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.821156] [ndv4:54756:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465629.821666] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.821971] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.821978] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.822088] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.822094] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.822470] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b53dd3000..0x2b3b53e58000 on mlx5_ib0 lkey 0x80e00 rkey 0x80e00 access 0xf flags 0x3e4 | |
[1650465629.822484] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b53dd3018 of 544744 bytes with 128 elements | |
[1650465629.822500] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.823084] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e3320: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465629.823447] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e3320: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465629.823791] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e3320: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465629.824006] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e3320: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465629.824199] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e3320: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465629.824372] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e3320: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465629.824544] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e3320: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465629.824806] [ndv4:55152:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3ff6010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xda99 | |
[1650465629.825072] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.825079] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.825088] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b3b979d7008 of 151544 bytes with 1052 elements | |
[1650465629.826658] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.826670] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.826857] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465629.828238] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.828252] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.828255] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.828308] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.828896] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x43a2010 of 8176 bytes with 127 elements | |
[1650465629.829166] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.829176] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.829210] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465629.829214] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.829227] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x4089f60 [id=113 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.829248] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 113 events 0x1 mode thread_spinlock | |
[1650465629.829258] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[27]=0x41ce030 using rc_mlx5/mlx5_ib4:1 on worker 0x23dc840 | |
[1650465629.829274] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.829280] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.829381] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.829386] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.828988] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b9ea00000..0x2b3ba1000000 on mlx5_ib1 lkey 0x80700 rkey 0x80700 access 0xf flags 0x3e4 | |
[1650465629.829004] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b3b9ea00018 of 39845864 bytes with 4752 elements | |
[1650465629.829143] [ndv4:55152:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3ff6010 | |
[1650465629.829175] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[13]=0x3ff6010 using dc_mlx5/mlx5_ib1:1 on worker 0x357a8c0 | |
[1650465629.829339] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.829349] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.829526] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.829531] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.830772] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.830400] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.830408] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.832092] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.832538] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x3ef5500: created UD QP 0xda7c on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.833229] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.833642] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.833648] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.833878] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.833884] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.835161] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x25e3320: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465629.835180] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.835749] [ndv4:54756:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.835767] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x25d4300 [id=89 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.835798] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 89 events 0x5 mode thread_spinlock | |
[1650465629.835829] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[10]=0x25e3320 using ud_mlx5/mlx5_ib0:1 on worker 0x2e9a8d0 | |
[1650465629.836009] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.836017] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.836291] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.836297] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.836972] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465629.836978] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.837192] [ndv4:55432:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.838031] [ndv4:55691:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.838254] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465629.838447] [ndv4:55691:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465629.839268] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x23bb280 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.839300] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465629.839303] [ndv4:55691:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465629.839396] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.839413] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.839416] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.839476] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.839879] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.839886] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.839918] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465629.839922] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.839929] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3174580 [id=115 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.839952] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 115 events 0x1 mode thread_spinlock | |
[1650465629.840672] [ndv4:54965:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.839482] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.839492] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.839497] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.839539] [ndv4:55691:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465629.841974] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.841993] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.842477] [ndv4:55691:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.842485] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.842747] [ndv4:55691:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.843069] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.843076] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.842410] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b979fc000..0x2b3b97a81000 on mlx5_ib1 lkey 0x80a00 rkey 0x80a00 access 0xf flags 0x3e4 | |
[1650465629.842418] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b979fc018 of 544744 bytes with 128 elements | |
[1650465629.842425] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.843259] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3ef5500: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465629.843868] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3ef5500: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465629.844307] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3ef5500: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465629.844849] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3ef5500: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465629.845316] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3ef5500: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465629.845687] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3ef5500: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465629.845863] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3ef5500: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465629.845974] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3ef5500: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465629.845239] [ndv4:55691:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465629.845245] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.846271] [ndv4:55152:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.846282] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2c9f930 [id=95 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.846317] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 95 events 0x5 mode thread_spinlock | |
[1650465629.846346] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[14]=0x3ef5500 using ud_verbs/mlx5_ib1:1 on worker 0x357a8c0 | |
[1650465629.846371] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.846381] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.846444] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.846449] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.846678] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.847499] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.847922] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x3dbc280: created UD QP 0xda7e on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.847930] [ndv4:55152:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465629.848287] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.848464] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.848917] [ndv4:54965:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x43a4050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdad8 | |
[1650465629.849155] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.849168] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.849181] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b4b31b49008 of 151544 bytes with 1052 elements | |
[1650465629.849867] [ndv4:55512:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2aea6ff3a000 length 12288 | |
[1650465629.849930] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465629.851153] [ndv4:55512:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465629.851162] [ndv4:55512:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2aea74000000 length 4296704 | |
[1650465629.851168] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2aea74000018 of 4296680 bytes with 512 elements | |
[1650465629.851427] [ndv4:55512:0] mm_iface.c:600 UCX DEBUG created mm iface 0x19efd10 FIFO id 0x4000000050eb928a va 0x2aea6ff3a000 size 12288 (128 x 64 elems) | |
[1650465629.851474] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x19efd10 using posix/memory on worker 0x22c98d0 | |
[1650465629.851497] [ndv4:55512:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465629.851533] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465629.851547] [ndv4:55512:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465629.851556] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2aea74419018 of 4296680 bytes with 512 elements | |
[1650465629.849263] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.849284] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.849288] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.849312] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.849662] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.849673] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.850102] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x376b050: created RC QP 0xda7f on mlx5_ib1:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.850747] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[11]=0x376b050 using rc_verbs/mlx5_ib1:1 on worker 0x2e9a8d0 | |
[1650465629.852235] [ndv4:55512:0] mm_iface.c:600 UCX DEBUG created mm iface 0x19f02e0 FIFO id 0x718024 va 0x2aea6ff3d000 size 12288 (128 x 64 elems) | |
[1650465629.852246] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x19f02e0 using sysv/memory on worker 0x22c98d0 | |
[1650465629.852258] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465629.852261] [ndv4:55512:0] self.c:220 UCX DEBUG created self iface id 0x74b2d79bf49b778b send_size 8192 | |
[1650465629.852267] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x1a03ec0 using self/memory0 on worker 0x22c98d0 | |
[1650465629.852290] [ndv4:55512:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.852296] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.852298] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.853967] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b2f400000..0x2b4b31a00000 on mlx5_ib4 lkey 0x80600 rkey 0x80600 access 0xf flags 0x3e4 | |
[1650465629.853994] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b4b2f400018 of 39845864 bytes with 4752 elements | |
[1650465629.854175] [ndv4:54965:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x43a4050 | |
[1650465629.854211] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[28]=0x43a4050 using dc_mlx5/mlx5_ib4:1 on worker 0x23dc840 | |
[1650465629.854335] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.854348] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.854413] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.854417] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.854007] [ndv4:54861:0] debug.c:1198 UCX DEBUG using signal stack 0x2ae884b7c000 size 141824 | |
[1650465629.854086] [ndv4:54861:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.854109] [ndv4:54861:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2ae8849d0000 | |
[1650465629.854126] [ndv4:54861:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465629.854135] [ndv4:54861:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465629.854141] [ndv4:54861:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465629.855426] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x1a06000 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.855461] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465629.855479] [ndv4:55512:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x1a04820: listening for connections (fd=78) on 10.5.0.5:52013 | |
[1650465629.855706] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x1a04820 using tcp/eth0 on worker 0x22c98d0 | |
[1650465629.855728] [ndv4:55512:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.855732] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.855735] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.855776] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x1952250 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.855801] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465629.855805] [ndv4:55512:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x1a04ea0: listening for connections (fd=80) on 127.0.0.1:41660 | |
[1650465629.855821] [ndv4:55512:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465629.855826] [ndv4:55512:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465629.855885] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x1a04ea0 using tcp/lo on worker 0x22c98d0 | |
[1650465629.855901] [ndv4:55512:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.855904] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.855907] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.855946] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x19605b0 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.855969] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465629.855973] [ndv4:55512:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x19ec530: listening for connections (fd=82) on 172.16.1.242:43172 | |
[1650465629.856203] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x19ec530 using tcp/ib0 on worker 0x22c98d0 | |
[1650465629.856413] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.856422] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.856650] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.856655] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.857649] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.856838] [ndv4:54861:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.856861] [ndv4:54861:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465629.856899] [ndv4:54861:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465629.856903] [ndv4:54861:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465629.856910] [ndv4:54861:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465629.856917] [ndv4:54861:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465629.856919] [ndv4:54861:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465629.856925] [ndv4:54861:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465629.856927] [ndv4:54861:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465629.856930] [ndv4:54861:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465629.856933] [ndv4:54861:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465629.856935] [ndv4:54861:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465629.856944] [ndv4:54861:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465629.857706] [ndv4:54861:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465629.858940] [ndv4:54861:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465629.858958] [ndv4:54861:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465629.858969] [ndv4:54861:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465629.858979] [ndv4:54861:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465629.858990] [ndv4:54861:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465629.859001] [ndv4:54861:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465629.859013] [ndv4:54861:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465629.858944] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.858984] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.858988] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.859000] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.859354] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.859365] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.859487] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.859496] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.859508] [ndv4:54861:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465629.860017] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x19fa670: created RC QP 0xf49a on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.859859] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.859872] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.860055] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.860071] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.860488] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b97a81000..0x2b3b97b06000 on mlx5_ib1 lkey 0x80b00 rkey 0x80b00 access 0xf flags 0x3e4 | |
[1650465629.860494] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b97a81018 of 544744 bytes with 128 elements | |
[1650465629.860499] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.860892] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3dbc280: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465629.861193] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.861199] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.861636] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.861649] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.861781] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.861787] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.862012] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.863050] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.863061] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.863064] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.863083] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.863436] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x377e010 of 8176 bytes with 127 elements | |
[1650465629.863671] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.863680] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.863715] [ndv4:54756:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465629.863719] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.863731] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x25c4190 [id=92 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.863755] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 92 events 0x1 mode thread_spinlock | |
[1650465629.863765] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[12]=0x35ec010 using rc_mlx5/mlx5_ib1:1 on worker 0x2e9a8d0 | |
[1650465629.863815] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.863821] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.863895] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.863901] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.864465] [ndv4:55324:0] debug.c:1198 UCX DEBUG using signal stack 0x2b6f65e62000 size 141824 | |
[1650465629.864539] [ndv4:55324:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.864562] [ndv4:55324:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b6f65cd0000 | |
[1650465629.864655] [ndv4:55324:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465629.864666] [ndv4:55324:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465629.864672] [ndv4:55324:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465629.864443] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465629.865707] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.866023] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x4580250: created UD QP 0xdace on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.866616] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.866856] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.866864] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.866981] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.866986] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.867565] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b31b6e000..0x2b4b31bf3000 on mlx5_ib4 lkey 0x80700 rkey 0x80700 access 0xf flags 0x3e4 | |
[1650465629.867572] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b31b6e018 of 544744 bytes with 128 elements | |
[1650465629.867612] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.867933] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4580250: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465629.868116] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4580250: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465629.868337] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4580250: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465629.868937] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4580250: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465629.869275] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4580250: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465629.867167] [ndv4:55324:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.867186] [ndv4:55324:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465629.867223] [ndv4:55324:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465629.867227] [ndv4:55324:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465629.867234] [ndv4:55324:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465629.867240] [ndv4:55324:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465629.867243] [ndv4:55324:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465629.867248] [ndv4:55324:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465629.867251] [ndv4:55324:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465629.867254] [ndv4:55324:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465629.867256] [ndv4:55324:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465629.867258] [ndv4:55324:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465629.867266] [ndv4:55324:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465629.868014] [ndv4:55324:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465629.868742] [ndv4:55324:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465629.868758] [ndv4:55324:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465629.868769] [ndv4:55324:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465629.868780] [ndv4:55324:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465629.868790] [ndv4:55324:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465629.868799] [ndv4:55324:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465629.868810] [ndv4:55324:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465629.869183] [ndv4:55324:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465629.869387] [ndv4:55432:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.869092] [ndv4:55516:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2ae24bd37000 length 12288 | |
[1650465629.869153] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465629.869858] [ndv4:55432:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465629.870979] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x2134ec0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.871001] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465629.871004] [ndv4:55432:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465629.871308] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.871317] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.871322] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.871364] [ndv4:55432:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465629.870346] [ndv4:55516:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465629.870356] [ndv4:55516:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2ae250423000 length 4296704 | |
[1650465629.870361] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2ae250423018 of 4296680 bytes with 512 elements | |
[1650465629.870685] [ndv4:55516:0] mm_iface.c:600 UCX DEBUG created mm iface 0x19d9d10 FIFO id 0x400000007110a1e4 va 0x2ae24bd37000 size 12288 (128 x 64 elems) | |
[1650465629.870741] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x19d9d10 using posix/memory on worker 0x22b38d0 | |
[1650465629.870766] [ndv4:55516:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465629.870801] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465629.870815] [ndv4:55516:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465629.870825] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2ae25083c018 of 4296680 bytes with 512 elements | |
[1650465629.871445] [ndv4:55516:0] mm_iface.c:600 UCX DEBUG created mm iface 0x19da2e0 FIFO id 0x718028 va 0x2ae24bd3a000 size 12288 (128 x 64 elems) | |
[1650465629.871454] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x19da2e0 using sysv/memory on worker 0x22b38d0 | |
[1650465629.871467] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465629.871470] [ndv4:55516:0] self.c:220 UCX DEBUG created self iface id 0xafb26d37f61159f8 send_size 8192 | |
[1650465629.871476] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x19edec0 using self/memory0 on worker 0x22b38d0 | |
[1650465629.871499] [ndv4:55516:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.871537] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.871540] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.871896] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x19fa670 using rc_verbs/mlx5_ib0:1 on worker 0x22c98d0 | |
[1650465629.871954] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.871962] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.872115] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.872120] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.872766] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.873075] [ndv4:55512:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465629.872851] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.872859] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.873459] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.873463] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.874102] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.874111] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.874115] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.874171] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.874669] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x2393010 of 8176 bytes with 127 elements | |
[1650465629.875002] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.875025] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.875069] [ndv4:55512:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465629.875074] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.874311] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.875671] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.875690] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.875693] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.875751] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.876096] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.876103] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.876134] [ndv4:54756:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465629.876137] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.876145] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x25c4fb0 [id=94 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.876164] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 94 events 0x1 mode thread_spinlock | |
[1650465629.876648] [ndv4:54756:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.873750] [ndv4:55691:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.875172] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x19f0000 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.875203] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465629.875225] [ndv4:55516:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x19ee820: listening for connections (fd=78) on 10.5.0.5:47929 | |
[1650465629.875368] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x19ee820 using tcp/eth0 on worker 0x22b38d0 | |
[1650465629.875388] [ndv4:55516:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.875392] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.875395] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.875434] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x193c250 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.875458] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465629.875487] [ndv4:55516:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x19eeea0: listening for connections (fd=80) on 127.0.0.1:33777 | |
[1650465629.875504] [ndv4:55516:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465629.875509] [ndv4:55516:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465629.875608] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x19eeea0 using tcp/lo on worker 0x22b38d0 | |
[1650465629.875628] [ndv4:55516:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465629.875631] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465629.875634] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465629.875675] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x194a5b0 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465629.875699] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465629.875703] [ndv4:55516:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x19d6530: listening for connections (fd=82) on 172.16.1.242:44411 | |
[1650465629.875897] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x19d6530 using tcp/ib0 on worker 0x22b38d0 | |
[1650465629.876738] [ndv4:55324:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.877130] [ndv4:55324:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465629.878071] [ndv4:54861:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.878452] [ndv4:54861:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465629.881837] [ndv4:55324:0] async.c:228 UCX DEBUG added async handler 0x16372a0 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.881938] [ndv4:55324:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465629.881944] [ndv4:55324:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465629.882089] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x195e710 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.882118] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465629.882141] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x1a080c0 using rc_mlx5/mlx5_ib0:1 on worker 0x22c98d0 | |
[1650465629.882205] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.882211] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.882285] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.882289] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.882777] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.882182] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.882193] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.882213] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.882273] [ndv4:55324:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465629.882295] [ndv4:55324:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465629.883420] [ndv4:55324:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.883434] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.883938] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.883948] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.883952] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.883965] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.884322] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.884329] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.884360] [ndv4:55512:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465629.884364] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.884374] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x1282ab0 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.884399] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465629.885146] [ndv4:55512:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.883316] [ndv4:54861:0] async.c:228 UCX DEBUG added async handler 0x18c5360 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.883413] [ndv4:54861:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465629.883421] [ndv4:54861:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465629.883819] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.883829] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.883852] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.883912] [ndv4:54861:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465629.883934] [ndv4:54861:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465629.884328] [ndv4:55432:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.884338] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.884555] [ndv4:55432:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.884582] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.884589] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.885107] [ndv4:55432:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465629.885112] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.885519] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.885523] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.883022] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3dbc280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465629.883376] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3dbc280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465629.883444] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3dbc280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465629.883461] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3dbc280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465629.883475] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3dbc280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465629.883705] [ndv4:55324:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.883830] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.883836] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.882981] [ndv4:54756:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3916010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdaa8 | |
[1650465629.883362] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.883369] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.883379] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b3b53e5a008 of 151544 bytes with 1052 elements | |
[1650465629.886418] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.886424] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.886907] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.886912] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.887071] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b5b200000..0x2b3b5d800000 on mlx5_ib1 lkey 0x80c00 rkey 0x80c00 access 0xf flags 0x3e4 | |
[1650465629.887086] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b3b5b200018 of 39845864 bytes with 4752 elements | |
[1650465629.887250] [ndv4:54756:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3916010 | |
[1650465629.887270] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[13]=0x3916010 using dc_mlx5/mlx5_ib1:1 on worker 0x2e9a8d0 | |
[1650465629.887344] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.887351] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.887413] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.887419] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.886984] [ndv4:54715:0] debug.c:1198 UCX DEBUG using signal stack 0x2ac8ab71e000 size 141824 | |
[1650465629.887058] [ndv4:54715:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.887079] [ndv4:54715:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2ac8ab58c000 | |
[1650465629.887101] [ndv4:54715:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465629.887110] [ndv4:54715:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465629.887116] [ndv4:54715:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465629.887956] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.888875] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.889219] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x3815510: created UD QP 0xda89 on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.889669] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.889729] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.889739] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.889761] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.889765] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.890078] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b53e7f000..0x2b3b53f04000 on mlx5_ib1 lkey 0x80d00 rkey 0x80d00 access 0xf flags 0x3e4 | |
[1650465629.890086] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b53e7f018 of 544744 bytes with 128 elements | |
[1650465629.890093] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.890134] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3815510: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465629.890182] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3815510: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465629.890232] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3815510: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465629.890285] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3815510: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465629.890316] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3815510: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465629.890340] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3815510: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465629.890354] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3815510: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465629.890369] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3815510: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465629.890538] [ndv4:54756:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.890548] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x25bf940 [id=95 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.890590] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 95 events 0x5 mode thread_spinlock | |
[1650465629.890615] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[14]=0x3815510 using ud_verbs/mlx5_ib1:1 on worker 0x2e9a8d0 | |
[1650465629.890627] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.890631] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.890686] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.890691] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.889592] [ndv4:54715:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.889610] [ndv4:54715:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465629.889642] [ndv4:54715:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465629.889645] [ndv4:54715:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465629.889653] [ndv4:54715:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465629.889659] [ndv4:54715:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465629.889662] [ndv4:54715:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465629.889666] [ndv4:54715:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465629.889669] [ndv4:54715:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465629.889671] [ndv4:54715:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465629.889673] [ndv4:54715:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465629.889675] [ndv4:54715:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465629.889685] [ndv4:54715:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465629.890386] [ndv4:54715:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465629.890961] [ndv4:54715:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465629.890978] [ndv4:54715:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465629.890989] [ndv4:54715:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465629.891000] [ndv4:54715:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465629.891010] [ndv4:54715:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465629.891020] [ndv4:54715:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465629.891031] [ndv4:54715:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465629.891402] [ndv4:54715:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465629.890919] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.891833] [ndv4:55512:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x25fd010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xf4c4 | |
[1650465629.892030] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.892038] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.892056] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2aea6ff42008 of 151544 bytes with 1052 elements | |
[1650465629.891680] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.891965] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x3392010: created UD QP 0xda8c on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.891971] [ndv4:54756:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465629.892353] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.892785] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.892792] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.892893] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.892897] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.893289] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b53f04000..0x2b3b53f89000 on mlx5_ib1 lkey 0x80e00 rkey 0x80e00 access 0xf flags 0x3e4 | |
[1650465629.893297] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b53f04018 of 544744 bytes with 128 elements | |
[1650465629.893303] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.893906] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3392010: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465629.894390] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3392010: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465629.891523] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.891538] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.891976] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.891985] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.892387] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.893407] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.893445] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.893449] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.893462] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.893717] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.893726] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.894201] [ndv4:55516:0] ib_iface.c:994 UCX DEBUG iface=0x19e4670: created RC QP 0xf4a7 on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.895761] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea74a00000..0x2aea77000000 on mlx5_ib0 lkey 0x81300 rkey 0x81300 access 0xf flags 0x3e4 | |
[1650465629.895778] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2aea74a00018 of 39845864 bytes with 4752 elements | |
[1650465629.895913] [ndv4:55512:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x25fd010 | |
[1650465629.895946] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x25fd010 using dc_mlx5/mlx5_ib0:1 on worker 0x22c98d0 | |
[1650465629.896007] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.896017] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.896087] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.896092] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.896314] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.896978] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.896924] [ndv4:55691:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.897060] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4580250: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465629.897173] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4580250: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465629.897306] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4580250: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465629.897345] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x1a10a30: created UD QP 0xf4a8 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.897706] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.897713] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x38913f0 [id=116 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.897737] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 116 events 0x5 mode thread_spinlock | |
[1650465629.897755] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[29]=0x4580250 using ud_verbs/mlx5_ib4:1 on worker 0x23dc840 | |
[1650465629.897847] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.897854] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.897916] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.897921] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.897969] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.898187] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.898195] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.898241] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.898245] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.898642] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea6ff67000..0x2aea6ffec000 on mlx5_ib0 lkey 0x81400 rkey 0x81400 access 0xf flags 0x3e4 | |
[1650465629.898649] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2aea6ff67018 of 544744 bytes with 128 elements | |
[1650465629.898653] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.898858] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465629.898835] [ndv4:55691:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465629.899604] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x23c30b0 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.899634] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465629.899638] [ndv4:55691:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465629.899886] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.899895] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.899900] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.899947] [ndv4:55691:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465629.901212] [ndv4:55691:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.901218] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.900004] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.900506] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x469e050: created UD QP 0xdacf on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.900514] [ndv4:54965:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465629.900644] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3dbc280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465629.901158] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x3dbc280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465629.901168] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.901254] [ndv4:55152:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.901263] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2cb6b10 [id=96 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.901295] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 96 events 0x5 mode thread_spinlock | |
[1650465629.901317] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[15]=0x3dbc280 using ud_mlx5/mlx5_ib1:1 on worker 0x357a8c0 | |
[1650465629.901336] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.901342] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.901404] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.901409] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.901594] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.901626] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.901631] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.901644] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.901649] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.902219] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b31bf3000..0x2b4b31c78000 on mlx5_ib4 lkey 0x80800 rkey 0x80800 access 0xf flags 0x3e4 | |
[1650465629.902225] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b31bf3018 of 544744 bytes with 128 elements | |
[1650465629.902230] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.902607] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x469e050: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465629.901508] [ndv4:55324:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465629.901516] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.901793] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.901798] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.901568] [ndv4:55691:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.901602] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.901608] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.902085] [ndv4:55691:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465629.902090] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.901873] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465629.902055] [ndv4:54861:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.902068] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.902261] [ndv4:54861:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.902443] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.902451] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.903038] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.903054] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.903057] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.903107] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.902791] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465629.902798] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465629.903016] [ndv4:55432:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.902731] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.902737] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.903155] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.903159] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.903703] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.903709] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.904463] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x43e50a0: created RC QP 0xdad9 on mlx5_ib2:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.905475] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[16]=0x43e50a0 using rc_verbs/mlx5_ib2:1 on worker 0x357a8c0 | |
[1650465629.905524] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.905529] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.905611] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.905615] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.905812] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465629.906420] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3392010: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465629.907012] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3392010: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465629.907391] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3392010: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465629.907075] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.907089] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.907093] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.907143] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.907629] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4706010 of 8176 bytes with 127 elements | |
[1650465629.908945] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x19e4670 using rc_verbs/mlx5_ib0:1 on worker 0x22b38d0 | |
[1650465629.909131] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.909139] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.909320] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.909326] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.907945] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.907952] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.907986] [ndv4:55152:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465629.907991] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.908001] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2ca4950 [id=99 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.908022] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 99 events 0x1 mode thread_spinlock | |
[1650465629.908045] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[17]=0x4532030 using rc_mlx5/mlx5_ib2:1 on worker 0x357a8c0 | |
[1650465629.908127] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.908133] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.908265] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.908271] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.909264] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465629.909903] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x1a10a30: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465629.910559] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x1a10a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465629.910860] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x1a10a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465629.910875] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x1a10a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465629.910890] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x1a10a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465629.910459] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.911119] [ndv4:55516:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465629.909200] [ndv4:54715:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.909692] [ndv4:54715:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465629.912191] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.912209] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.912213] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.912267] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.912560] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.912661] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.912637] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x237d010 of 8176 bytes with 127 elements | |
[1650465629.912868] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.912901] [ndv4:55516:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.913070] [ndv4:55516:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465629.913077] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.914113] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x469e050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465629.914975] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.914983] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.915199] [ndv4:55324:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.917050] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3392010: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465629.917404] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3392010: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465629.917677] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x3392010: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465629.917684] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.917786] [ndv4:54756:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.917792] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x3392e80 [id=96 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.917815] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 96 events 0x5 mode thread_spinlock | |
[1650465629.917830] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[15]=0x3392010 using ud_mlx5/mlx5_ib1:1 on worker 0x2e9a8d0 | |
[1650465629.917902] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.917908] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.918034] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.918039] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.918244] [ndv4:54717:0] debug.c:1198 UCX DEBUG using signal stack 0x2b7cc1036000 size 141824 | |
[1650465629.918318] [ndv4:54717:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.918338] [ndv4:54717:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b7cc0e8a000 | |
[1650465629.918357] [ndv4:54717:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465629.918367] [ndv4:54717:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465629.918373] [ndv4:54717:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465629.918611] [ndv4:55432:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.919001] [ndv4:55432:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465629.919524] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x1a10a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465629.919787] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x1a10a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465629.919804] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x1a10a30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465629.920095] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.920112] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.920115] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.920175] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.920644] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.920653] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.920692] [ndv4:55152:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465629.920696] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.920703] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x3a87ef0 [id=101 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.920726] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 101 events 0x1 mode thread_spinlock | |
[1650465629.920123] [ndv4:55512:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.920610] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x281dd70 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.920643] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465629.920647] [ndv4:55432:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465629.921358] [ndv4:55152:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.922288] [ndv4:54717:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.922313] [ndv4:54717:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465629.922352] [ndv4:54717:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465629.922356] [ndv4:54717:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465629.922364] [ndv4:54717:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465629.922373] [ndv4:54717:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465629.922376] [ndv4:54717:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465629.922382] [ndv4:54717:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465629.922385] [ndv4:54717:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465629.922387] [ndv4:54717:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465629.922390] [ndv4:54717:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465629.922392] [ndv4:54717:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465629.922405] [ndv4:54717:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465629.923267] [ndv4:54717:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465629.921803] [ndv4:54861:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465629.921814] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.923832] [ndv4:54853:0] debug.c:1198 UCX DEBUG using signal stack 0x2b6769a40000 size 141824 | |
[1650465629.923906] [ndv4:54853:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.923927] [ndv4:54853:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b67698ae000 | |
[1650465629.923949] [ndv4:54853:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465629.923959] [ndv4:54853:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465629.923965] [ndv4:54853:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465629.923864] [ndv4:54717:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465629.923880] [ndv4:54717:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465629.923891] [ndv4:54717:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465629.923901] [ndv4:54717:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465629.923911] [ndv4:54717:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465629.923921] [ndv4:54717:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465629.923931] [ndv4:54717:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465629.924311] [ndv4:54717:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465629.924080] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x469e050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465629.922774] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x1948710 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.922806] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465629.922845] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x19f20c0 using rc_mlx5/mlx5_ib0:1 on worker 0x22b38d0 | |
[1650465629.923181] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.923188] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.923377] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.923383] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.924866] [ndv4:54715:0] async.c:228 UCX DEBUG added async handler 0x2006080 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.924962] [ndv4:54715:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465629.924970] [ndv4:54715:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465629.925358] [ndv4:55324:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.925520] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.925532] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.925555] [ndv4:54715:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.925656] [ndv4:54715:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465629.925679] [ndv4:54715:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465629.925773] [ndv4:55324:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465629.926950] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x195fc50 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.926979] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 mode thread_spinlock | |
[1650465629.927012] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x1a10a30 using ud_verbs/mlx5_ib0:1 on worker 0x22c98d0 | |
[1650465629.927399] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.927407] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.927527] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.927531] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.926657] [ndv4:54853:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465629.926676] [ndv4:54853:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465629.926711] [ndv4:54853:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465629.926714] [ndv4:54853:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465629.926721] [ndv4:54853:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465629.926728] [ndv4:54853:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465629.926733] [ndv4:54853:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465629.926738] [ndv4:54853:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465629.926740] [ndv4:54853:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465629.926742] [ndv4:54853:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465629.926745] [ndv4:54853:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465629.926747] [ndv4:54853:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465629.926756] [ndv4:54853:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465629.927465] [ndv4:54853:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465629.927567] [ndv4:55324:0] async.c:228 UCX DEBUG added async handler 0x165d130 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.927662] [ndv4:55324:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465629.927666] [ndv4:55324:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465629.928108] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.928116] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.928121] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.928162] [ndv4:55324:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465629.928324] [ndv4:54853:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465629.928340] [ndv4:54853:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465629.928351] [ndv4:54853:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465629.928362] [ndv4:54853:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465629.928372] [ndv4:54853:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465629.928383] [ndv4:54853:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465629.928395] [ndv4:54853:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465629.928841] [ndv4:54853:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465629.927794] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.928918] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.929354] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x22e9750: created UD QP 0xf4aa on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.929364] [ndv4:55512:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465629.929935] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.930445] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.930452] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.930535] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.930540] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.931004] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea77033000..0x2aea770b8000 on mlx5_ib0 lkey 0x81700 rkey 0x81700 access 0xf flags 0x3e4 | |
[1650465629.931010] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2aea77033018 of 544744 bytes with 128 elements | |
[1650465629.931015] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.931461] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x22e9750: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465629.929324] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465629.931359] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.931391] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.931395] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.931448] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.929675] [ndv4:55152:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4708050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdaef | |
[1650465629.930570] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.930629] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.930639] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b3b97b08008 of 151544 bytes with 1052 elements | |
[1650465629.931816] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.931823] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.932194] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.932207] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.932213] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.932254] [ndv4:55432:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465629.932267] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x3d060a0: created RC QP 0xdae5 on mlx5_ib2:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.932405] [ndv4:54717:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.932701] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[16]=0x3d060a0 using rc_verbs/mlx5_ib2:1 on worker 0x2e9a8d0 | |
[1650465629.932798] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.932806] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.932877] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.932881] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.932753] [ndv4:54717:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465629.933532] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.933540] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.934788] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.934793] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.934356] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3ba1200000..0x2b3ba3800000 on mlx5_ib2 lkey 0x80d00 rkey 0x80d00 access 0xf flags 0x3e4 | |
[1650465629.934374] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b3ba1200018 of 39845864 bytes with 4752 elements | |
[1650465629.934514] [ndv4:55152:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4708050 | |
[1650465629.934548] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[18]=0x4708050 using dc_mlx5/mlx5_ib2:1 on worker 0x357a8c0 | |
[1650465629.934650] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.934661] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.934943] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.934949] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.933607] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.933620] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.933155] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465629.933951] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.935487] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.935498] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.935501] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.935517] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.935891] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.935898] [ndv4:55516:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.935931] [ndv4:55516:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465629.935935] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.935945] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x126cab0 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.935967] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465629.934323] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x469e050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465629.935021] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x469e050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465629.935453] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x469e050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465629.935359] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465629.936370] [ndv4:55516:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.936463] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.936783] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x48e4060: created UD QP 0xdae6 on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.937258] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.937601] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.937607] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.937793] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.937797] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.937343] [ndv4:54715:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.937354] [ndv4:54715:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.937534] [ndv4:54715:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.937688] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.937694] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.938175] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b97b2d000..0x2b3b97bb2000 on mlx5_ib2 lkey 0x80e00 rkey 0x80e00 access 0xf flags 0x3e4 | |
[1650465629.938181] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b97b2d018 of 544744 bytes with 128 elements | |
[1650465629.938186] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.939183] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x48e4060: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465629.937864] [ndv4:55324:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.937872] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.938124] [ndv4:55324:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.938352] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.938359] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.940044] [ndv4:55324:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465629.940051] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.939904] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x48e4060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465629.940537] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x22e9750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465629.940997] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x22e9750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465629.941544] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x22e9750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465629.941565] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x22e9750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465629.941885] [ndv4:55432:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.941894] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.942086] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x22e9750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465629.942395] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x22e9750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465629.942140] [ndv4:55432:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.942249] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465629.942256] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465629.942680] [ndv4:55432:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465629.942684] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.942804] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.942807] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.942905] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.942908] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.943014] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.943017] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.943120] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.943123] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.943333] [ndv4:55432:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.942076] [ndv4:55516:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x25e7010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xf4d3 | |
[1650465629.942336] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.942344] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.942360] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ae24bd3f008 of 151544 bytes with 1052 elements | |
[1650465629.945948] [ndv4:55516:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae250e00000..0x2ae253400000 on mlx5_ib0 lkey 0x81800 rkey 0x81800 access 0xf flags 0x3e4 | |
[1650465629.945974] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ae250e00018 of 39845864 bytes with 4752 elements | |
[1650465629.946145] [ndv4:55516:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x25e7010 | |
[1650465629.946183] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x25e7010 using dc_mlx5/mlx5_ib0:1 on worker 0x22b38d0 | |
[1650465629.946241] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.946252] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.946332] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.946337] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.946803] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.947002] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.947027] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.947031] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.947083] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.947441] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.947447] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.948171] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.948176] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.948396] [ndv4:54861:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.948287] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x469e050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465629.948991] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x469e050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465629.948999] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.949033] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.949038] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x405ca10 [id=117 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.949058] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 117 events 0x5 mode thread_spinlock | |
[1650465629.949070] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[30]=0x469e050 using ud_mlx5/mlx5_ib4:1 on worker 0x23dc840 | |
[1650465629.949213] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.949219] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.949357] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.949362] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.949716] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465629.949525] [ndv4:54853:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.949945] [ndv4:54853:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465629.948515] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.948524] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.948332] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4027010 of 8176 bytes with 127 elements | |
[1650465629.948665] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.948675] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.948705] [ndv4:54756:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465629.948709] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.948723] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x25c4900 [id=99 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.948748] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 99 events 0x1 mode thread_spinlock | |
[1650465629.948783] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[17]=0x3e53030 using rc_mlx5/mlx5_ib2:1 on worker 0x2e9a8d0 | |
[1650465629.948971] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.948977] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.949232] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.949238] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.949909] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465629.947648] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.947991] [ndv4:55516:0] ib_iface.c:994 UCX DEBUG iface=0x19faa30: created UD QP 0xf4b6 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.948489] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.950627] [ndv4:54715:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465629.950635] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.951394] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.951410] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.951413] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.951472] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.951223] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.951241] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.951244] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.951296] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.952327] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.952334] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.951827] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x48e4060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465629.951812] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.951820] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.951855] [ndv4:54756:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465629.951859] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.951870] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x33a7ef0 [id=101 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.951897] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 101 events 0x1 mode thread_spinlock | |
[1650465629.952292] [ndv4:54756:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.952774] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x479e0a0: created RC QP 0xda55 on mlx5_ib5:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.953135] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[31]=0x479e0a0 using rc_verbs/mlx5_ib5:1 on worker 0x23dc840 | |
[1650465629.953340] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.953346] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.953454] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.953459] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.953799] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465629.953633] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.953640] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.954899] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.954905] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.956796] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.956801] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.955028] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x22e9750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465629.955041] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.955137] [ndv4:55512:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.955145] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x1951060 [id=89 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.955174] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 89 events 0x5 mode thread_spinlock | |
[1650465629.955197] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[10]=0x22e9750 using ud_mlx5/mlx5_ib0:1 on worker 0x22c98d0 | |
[1650465629.955217] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.955224] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.955414] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.955419] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.956532] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.957889] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465629.957898] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.957901] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.957921] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.958284] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.958290] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465629.955142] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.955158] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.955161] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.955212] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.955661] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4abf010 of 8176 bytes with 127 elements | |
[1650465629.955909] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.955915] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.955947] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465629.955951] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.955961] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3891740 [id=120 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.955981] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 120 events 0x1 mode thread_spinlock | |
[1650465629.955991] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[32]=0x48eb030 using rc_mlx5/mlx5_ib5:1 on worker 0x23dc840 | |
[1650465629.956007] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.956013] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.956206] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.956210] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.956418] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465629.958507] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.958522] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.958525] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.958635] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.956590] [ndv4:54861:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.956966] [ndv4:54861:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465629.957904] [ndv4:54861:0] async.c:228 UCX DEBUG added async handler 0x18bbdb0 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.957930] [ndv4:54861:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465629.957933] [ndv4:54861:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465629.958166] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.958173] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.958178] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.958219] [ndv4:54861:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465629.959567] [ndv4:54861:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.959605] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.958957] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.958962] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.959228] [ndv4:55324:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.959132] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x2b9a050: created RC QP 0xda8d on mlx5_ib1:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465629.959081] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.959089] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.959120] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465629.959123] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.959130] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x48f3fc0 [id=122 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.959149] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 122 events 0x1 mode thread_spinlock | |
[1650465629.958465] [ndv4:54756:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4029050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdafc | |
[1650465629.959270] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.959279] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.959291] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b3b53f8b008 of 151544 bytes with 1052 elements | |
[1650465629.957507] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465629.957516] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465629.957849] [ndv4:55691:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.956429] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.956446] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.956803] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.956809] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.957172] [ndv4:55516:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae24bd64000..0x2ae24bde9000 on mlx5_ib0 lkey 0x81900 rkey 0x81900 access 0xf flags 0x3e4 | |
[1650465629.957182] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae24bd64018 of 544744 bytes with 128 elements | |
[1650465629.957192] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.957812] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x19faa30: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465629.958148] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x19faa30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465629.958491] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x19faa30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465629.959752] [ndv4:54965:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.959924] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[11]=0x2b9a050 using rc_verbs/mlx5_ib1:1 on worker 0x22c98d0 | |
[1650465629.960110] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.960116] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.959837] [ndv4:54861:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.960122] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.960128] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.960604] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x48e4060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465629.961158] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x48e4060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465629.961415] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x48e4060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465629.961430] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x48e4060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465629.961539] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x48e4060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465629.962775] [ndv4:54861:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465629.962782] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.961991] [ndv4:55152:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.961998] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2ca9390 [id=102 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.962020] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 102 events 0x5 mode thread_spinlock | |
[1650465629.962030] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[19]=0x48e4060 using ud_verbs/mlx5_ib2:1 on worker 0x357a8c0 | |
[1650465629.962342] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.962347] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.962552] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.962558] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.963189] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b5da00000..0x2b3b60000000 on mlx5_ib2 lkey 0x80f00 rkey 0x80f00 access 0xf flags 0x3e4 | |
[1650465629.963213] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b3b5da00018 of 39845864 bytes with 4752 elements | |
[1650465629.963387] [ndv4:54756:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4029050 | |
[1650465629.963441] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[18]=0x4029050 using dc_mlx5/mlx5_ib2:1 on worker 0x2e9a8d0 | |
[1650465629.963793] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.963805] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.964022] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.964028] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.964840] [ndv4:54853:0] async.c:228 UCX DEBUG added async handler 0x138c080 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.964959] [ndv4:54853:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465629.964967] [ndv4:54853:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465629.964756] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465629.965181] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.965194] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.965229] [ndv4:54853:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.965302] [ndv4:54853:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465629.965329] [ndv4:54853:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465629.966213] [ndv4:54853:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.966222] [ndv4:54853:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.965825] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.966220] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x4205060: created UD QP 0xdaf2 on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.966907] [ndv4:54717:0] async.c:228 UCX DEBUG added async handler 0x23e91e0 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.966996] [ndv4:54717:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465629.967006] [ndv4:54717:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465629.967376] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.967386] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.967405] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.967464] [ndv4:54717:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465629.967483] [ndv4:54717:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465629.966389] [ndv4:54853:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.966719] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.966725] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.967375] [ndv4:54853:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465629.967380] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.966749] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.967155] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.967162] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.967241] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.967246] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.967684] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b60035000..0x2b3b600ba000 on mlx5_ib2 lkey 0x81000 rkey 0x81000 access 0xf flags 0x3e4 | |
[1650465629.967691] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b60035018 of 544744 bytes with 128 elements | |
[1650465629.967696] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.967197] [ndv4:54965:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4ac1050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xda68 | |
[1650465629.967478] [ndv4:55432:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.967905] [ndv4:55432:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465629.969355] [ndv4:55324:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.969757] [ndv4:55324:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465629.969021] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x21367b0 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.969045] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465629.969048] [ndv4:55432:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465629.969252] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.969260] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.969265] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.969308] [ndv4:55432:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465629.970785] [ndv4:55432:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.970791] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.968432] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x19faa30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465629.968932] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x19faa30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465629.969255] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x19faa30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465629.969509] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x19faa30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465629.970084] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.970093] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.970764] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.970964] [ndv4:55432:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.971938] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.971951] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.971954] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.971972] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.972300] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x2bad010 of 8176 bytes with 127 elements | |
[1650465629.972529] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465629.972536] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.972570] [ndv4:55512:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465629.972601] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.972611] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x23a0fc0 [id=92 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.972636] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 92 events 0x1 mode thread_spinlock | |
[1650465629.972646] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[12]=0x2a1b010 using rc_mlx5/mlx5_ib1:1 on worker 0x22c98d0 | |
[1650465629.972776] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.972781] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.972967] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.972972] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.973376] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.973219] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465629.974078] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.974088] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.975249] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.975253] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.974470] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.974746] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x2c21900: created UD QP 0xdaf3 on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.974752] [ndv4:55152:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465629.975390] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.975526] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.975532] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.975704] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.975709] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.976152] [ndv4:55691:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465629.976460] [ndv4:55691:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465629.976147] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b97bb2000..0x2b3b97c37000 on mlx5_ib2 lkey 0x81100 rkey 0x81100 access 0xf flags 0x3e4 | |
[1650465629.976153] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b97bb2018 of 544744 bytes with 128 elements | |
[1650465629.976157] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.976191] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2c21900: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465629.976206] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2c21900: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465629.976221] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2c21900: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465629.976293] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2c21900: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465629.976381] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2c21900: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465629.977225] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x2aa4c40 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.977257] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465629.977261] [ndv4:55691:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465629.979359] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x19faa30: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465629.977958] [ndv4:54717:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.977970] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.978174] [ndv4:54717:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.978406] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.978413] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.977361] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.977373] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.977385] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b4b34479008 of 151544 bytes with 1052 elements | |
[1650465629.977908] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4205060: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465629.978352] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4205060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465629.978372] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4205060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465629.978938] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4205060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465629.979226] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.979234] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.980210] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.980220] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.982388] [ndv4:55432:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465629.982394] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.981324] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b31e00000..0x2b4b34400000 on mlx5_ib5 lkey 0x80400 rkey 0x80400 access 0xf flags 0x3e4 | |
[1650465629.981337] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b4b31e00018 of 39845864 bytes with 4752 elements | |
[1650465629.981479] [ndv4:54965:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4ac1050 | |
[1650465629.981509] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[33]=0x4ac1050 using dc_mlx5/mlx5_ib5:1 on worker 0x23dc840 | |
[1650465629.981827] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.981834] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.981912] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.981916] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.980361] [ndv4:55324:0] async.c:228 UCX DEBUG added async handler 0x165dc60 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465629.980401] [ndv4:55324:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465629.980405] [ndv4:55324:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465629.980728] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.980742] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.980752] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.980838] [ndv4:55324:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465629.982243] [ndv4:55324:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.982254] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.982462] [ndv4:55324:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.982713] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465629.982721] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465629.979726] [ndv4:55516:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.981969] [ndv4:54717:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465629.981977] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.984395] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465629.984412] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465629.984416] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465629.984470] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465629.984857] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465629.984865] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.984898] [ndv4:55512:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465629.984902] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465629.984909] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x1a03db0 [id=94 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465629.984930] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 94 events 0x1 mode thread_spinlock | |
[1650465629.984461] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.984468] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.984538] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.984541] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.984645] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465629.985648] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x1949c50 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.985677] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 mode thread_spinlock | |
[1650465629.985698] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x19faa30 using ud_verbs/mlx5_ib0:1 on worker 0x22b38d0 | |
[1650465629.985726] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.985733] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.985780] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.985785] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.985945] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465629.985664] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2c21900: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465629.986298] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2c21900: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465629.986322] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x2c21900: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465629.986327] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465629.986362] [ndv4:55152:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.986367] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x48e4ef0 [id=103 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.986387] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 103 events 0x5 mode thread_spinlock | |
[1650465629.986402] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[20]=0x2c21900 using ud_mlx5/mlx5_ib2:1 on worker 0x357a8c0 | |
[1650465629.986892] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.986899] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.987571] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465629.987626] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465629.985663] [ndv4:55512:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465629.985771] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.985783] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.985790] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465629.985870] [ndv4:55691:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465629.987527] [ndv4:55691:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465629.987535] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465629.987743] [ndv4:55691:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465629.987923] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.987930] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.986012] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.986484] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x4c9d060: created UD QP 0xda5e on mlx5_ib5:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.987162] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.987974] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.987981] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.988246] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.988251] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.988706] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b3449e000..0x2b4b34523000 on mlx5_ib5 lkey 0x80500 rkey 0x80500 access 0xf flags 0x3e4 | |
[1650465629.988712] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b3449e018 of 544744 bytes with 128 elements | |
[1650465629.988716] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.989368] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4c9d060: adding gid fe80::15:5dff:fd34:0 to hash on device mlx5_ib5 port 1 index 0) | |
[1650465629.989717] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4c9d060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 1) | |
[1650465629.989960] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4c9d060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 2) | |
[1650465629.990024] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4c9d060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 3) | |
[1650465629.990041] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4c9d060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 4) | |
[1650465629.990645] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4c9d060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 5) | |
[1650465629.990944] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4c9d060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 6) | |
[1650465629.991104] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4c9d060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 7) | |
[1650465629.988864] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.988873] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.989951] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.989956] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.990941] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.990947] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.990474] [ndv4:55691:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465629.990481] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.988001] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4205060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465629.989218] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4205060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465629.988092] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.988552] [ndv4:55516:0] ib_iface.c:994 UCX DEBUG iface=0x22d3750: created UD QP 0xf4b7 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.988563] [ndv4:55516:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465629.989189] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.989455] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.989462] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.989740] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465629.989745] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465629.990077] [ndv4:55516:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae24bde9000..0x2ae24be6e000 on mlx5_ib0 lkey 0x81a00 rkey 0x81a00 access 0xf flags 0x3e4 | |
[1650465629.990086] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae24bde9018 of 544744 bytes with 128 elements | |
[1650465629.990094] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.987328] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.987334] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.986867] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465629.986873] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465629.991381] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465629.991388] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x47799b0 [id=123 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465629.991410] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 123 events 0x5 mode thread_spinlock | |
[1650465629.991421] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[34]=0x4c9d060 using ud_verbs/mlx5_ib5:1 on worker 0x23dc840 | |
[1650465629.991560] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.991566] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.991624] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.991628] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.991784] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465629.991982] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.991988] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.992246] [ndv4:54715:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465629.992774] [ndv4:55512:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x2d45010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdab6 | |
[1650465629.992932] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.992938] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.992947] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2aea798b9008 of 151544 bytes with 1052 elements | |
[1650465629.992664] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465629.992944] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x4dbb460: created UD QP 0xda5f on mlx5_ib5:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465629.992952] [ndv4:54965:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465629.993392] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465629.993608] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.993615] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.993646] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465629.993652] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465629.993971] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b34523000..0x2b4b345a8000 on mlx5_ib5 lkey 0x80600 rkey 0x80600 access 0xf flags 0x3e4 | |
[1650465629.993977] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b34523018 of 544744 bytes with 128 elements | |
[1650465629.993982] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465629.994013] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4dbb460: adding gid fe80::15:5dff:fd34:0 to hash on device mlx5_ib5 port 1 index 0) | |
[1650465629.994044] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4dbb460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 1) | |
[1650465629.994269] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4dbb460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 2) | |
[1650465629.992711] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465629.992719] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465629.995395] [ndv4:55324:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465629.995404] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465629.997443] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea77200000..0x2aea79800000 on mlx5_ib1 lkey 0x80f00 rkey 0x80f00 access 0xf flags 0x3e4 | |
[1650465629.997472] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2aea77200018 of 39845864 bytes with 4752 elements | |
[1650465629.998165] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465629.997767] [ndv4:55512:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x2d45010 | |
[1650465629.997826] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[13]=0x2d45010 using dc_mlx5/mlx5_ib1:1 on worker 0x22c98d0 | |
[1650465629.998159] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.998171] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.998241] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465629.998246] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465629.999488] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465629.999680] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465629.999688] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465629.999901] [ndv4:55432:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.000020] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4205060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465630.000701] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4205060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465630.000835] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.001253] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x2c443b0: created UD QP 0xda97 on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.001893] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.001976] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.001983] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.002098] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.002104] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.000947] [ndv4:54756:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.000957] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x3782400 [id=102 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.000981] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 102 events 0x5 mode thread_spinlock | |
[1650465630.000995] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[19]=0x4205060 using ud_verbs/mlx5_ib2:1 on worker 0x2e9a8d0 | |
[1650465630.001151] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.001158] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.001466] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.001471] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.001317] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x22d3750: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465630.002302] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x22d3750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465630.002487] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x22d3750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465630.003115] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x22d3750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465630.003844] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x22d3750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465630.002435] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea798de000..0x2aea79963000 on mlx5_ib1 lkey 0x81000 rkey 0x81000 access 0xf flags 0x3e4 | |
[1650465630.002442] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2aea798de018 of 544744 bytes with 128 elements | |
[1650465630.002446] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.002479] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2c443b0: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465630.002906] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2c443b0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465630.003192] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2c443b0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465630.003756] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2c443b0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465630.003776] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2c443b0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465630.003790] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2c443b0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465630.003809] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2c443b0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465630.002425] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465630.001153] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465630.001162] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.002523] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465630.002529] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.002792] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.002800] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.003988] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4dbb460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 3) | |
[1650465630.003978] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465630.003987] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.005792] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465630.005798] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.006092] [ndv4:54853:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.011605] [ndv4:54853:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.011928] [ndv4:54853:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465630.011998] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.012006] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.012164] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.012167] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.012359] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.012363] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.012682] [ndv4:55691:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.007837] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.007859] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.007864] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.007915] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.008444] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.008456] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.009694] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x4b020a0: created RC QP 0xda5e on mlx5_ib3:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.011049] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[21]=0x4b020a0 using rc_verbs/mlx5_ib3:1 on worker 0x357a8c0 | |
[1650465630.011097] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.011104] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.011185] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.011190] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.011400] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.012444] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.012457] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.012460] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.012508] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.008473] [ndv4:54715:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.008841] [ndv4:54715:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465630.012713] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.012971] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4e23010 of 8176 bytes with 127 elements | |
[1650465630.013188] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x4323050: created UD QP 0xdaf4 on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.013200] [ndv4:54756:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.013209] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2c443b0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465630.013238] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.013246] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.013284] [ndv4:55152:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465630.013289] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.013303] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x4a02ae0 [id=106 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.013328] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 106 events 0x1 mode thread_spinlock | |
[1650465630.013339] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[22]=0x4c4f030 using rc_mlx5/mlx5_ib3:1 on worker 0x357a8c0 | |
[1650465630.013370] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.013376] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.013491] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.013497] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.013666] [ndv4:55512:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.013678] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x19f8340 [id=95 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.013700] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 95 events 0x5 mode thread_spinlock | |
[1650465630.013712] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[14]=0x2c443b0 using ud_verbs/mlx5_ib1:1 on worker 0x22c98d0 | |
[1650465630.013760] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.013766] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.013811] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.013815] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.014024] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465630.013900] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.014173] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.014181] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.014195] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.014198] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.014668] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b600ba000..0x2b3b6013f000 on mlx5_ib2 lkey 0x81200 rkey 0x81200 access 0xf flags 0x3e4 | |
[1650465630.014675] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b600ba018 of 544744 bytes with 128 elements | |
[1650465630.014679] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.014715] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4323050: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465630.014732] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4323050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465630.014746] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4323050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465630.014759] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4323050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465630.014948] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4323050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465630.015206] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4323050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465630.015315] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4323050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465630.015571] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4323050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465630.015597] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.015633] [ndv4:54756:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.015638] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x4323f30 [id=103 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.015662] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 103 events 0x5 mode thread_spinlock | |
[1650465630.015673] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[20]=0x4323050 using ud_mlx5/mlx5_ib2:1 on worker 0x2e9a8d0 | |
[1650465630.015688] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.015694] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.015799] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.015804] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.013707] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x22d3750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465630.014153] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x22d3750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465630.014193] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x22d3750: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465630.014201] [ndv4:55516:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.014305] [ndv4:55516:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.014310] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x193b060 [id=89 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.014334] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 89 events 0x5 mode thread_spinlock | |
[1650465630.014354] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[10]=0x22d3750 using ud_mlx5/mlx5_ib0:1 on worker 0x22b38d0 | |
[1650465630.014371] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.014377] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.014468] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.014472] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.014687] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465630.015498] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.015512] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.015516] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.015543] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.015927] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.015932] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.013884] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4dbb460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 4) | |
[1650465630.013921] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4dbb460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 5) | |
[1650465630.013936] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4dbb460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 6) | |
[1650465630.013949] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x4dbb460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 7) | |
[1650465630.013956] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.013987] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.013992] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x4c9df20 [id=124 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.014016] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 124 events 0x5 mode thread_spinlock | |
[1650465630.014028] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[35]=0x4dbb460 using ud_mlx5/mlx5_ib5:1 on worker 0x23dc840 | |
[1650465630.014153] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.014159] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.014233] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.014238] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.014432] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465630.015648] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.015666] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.015669] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.015718] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.013696] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.014674] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.014688] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.014691] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.014747] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.015142] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.015149] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.015182] [ndv4:55152:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465630.015186] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.015192] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2c9ef50 [id=108 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.015212] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 108 events 0x1 mode thread_spinlock | |
[1650465630.015855] [ndv4:55152:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.014795] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.014803] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.014875] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.014879] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.015180] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.015184] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.027544] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.027550] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.027857] [ndv4:55324:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.035491] [ndv4:55324:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.035867] [ndv4:55324:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465630.036784] [ndv4:55324:0] async.c:228 UCX DEBUG added async handler 0x165d360 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.036809] [ndv4:55324:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465630.036813] [ndv4:55324:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465630.036973] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.036981] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.036986] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.037034] [ndv4:55324:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465630.059798] [ndv4:55324:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.059811] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.060076] [ndv4:55324:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.060361] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.060368] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.083124] [ndv4:55324:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465630.083134] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.094948] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.094956] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.095375] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.095379] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.013807] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465630.013819] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.014323] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465630.014328] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.014695] [ndv4:54717:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.022673] [ndv4:54717:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.023043] [ndv4:54717:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465630.023512] [ndv4:54717:0] async.c:228 UCX DEBUG added async handler 0x23e1f50 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.023543] [ndv4:54717:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465630.023547] [ndv4:54717:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465630.023721] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.023731] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.023738] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.023789] [ndv4:54717:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465630.024852] [ndv4:54717:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.024857] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.024995] [ndv4:54717:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.025016] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.025022] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.036542] [ndv4:54717:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465630.036562] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.058841] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.058849] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.070081] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.070089] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.071923] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.071929] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.084768] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.084776] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.085020] [ndv4:54717:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.093030] [ndv4:54717:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.093337] [ndv4:54717:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465630.093775] [ndv4:54717:0] async.c:228 UCX DEBUG added async handler 0x23e9ce0 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.093805] [ndv4:54717:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465630.093808] [ndv4:54717:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465630.093965] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.093973] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.093979] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.094052] [ndv4:54717:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465630.021239] [ndv4:54715:0] async.c:228 UCX DEBUG added async handler 0x2006d70 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.021267] [ndv4:54715:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465630.021271] [ndv4:54715:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465630.021418] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.021426] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.021431] [ndv4:54715:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.021477] [ndv4:54715:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465630.022671] [ndv4:54715:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.022677] [ndv4:54715:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.022869] [ndv4:54715:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.022946] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.022952] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.023540] [ndv4:54715:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465630.023545] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.036001] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.036009] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.059476] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.059484] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.070631] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.070638] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.083192] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.083200] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.083426] [ndv4:54715:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.090595] [ndv4:54715:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.090946] [ndv4:54715:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465630.091790] [ndv4:54715:0] async.c:228 UCX DEBUG added async handler 0x200ec00 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.091811] [ndv4:54715:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465630.091814] [ndv4:54715:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465630.091981] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.091989] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.091994] [ndv4:54715:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.092035] [ndv4:54715:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465630.094019] [ndv4:54715:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.094026] [ndv4:54715:0] mpool.c:88 UCX D
[1650465630.016506] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.016515] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.016747] [ndv4:54861:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.023498] [ndv4:54861:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.023816] [ndv4:54861:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465630.024121] [ndv4:54861:0] async.c:228 UCX DEBUG added async handler 0x18bbe00 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.024145] [ndv4:54861:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465630.024148] [ndv4:54861:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465630.024424] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.024433] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.024437] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.024477] [ndv4:54861:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465630.025023] [ndv4:54861:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.025028] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.025226] [ndv4:54861:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.025357] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.025363] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.026110] [ndv4:54861:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465630.026115] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.026675] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.026680] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.050561] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.050568] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.062301] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.062309] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.073604] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.073610] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.073825] [ndv4:54861:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.081010] [ndv4:54861:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.081364] [ndv4:54861:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465630.082102] [ndv4:54861:0] async.c:228 UCX DEBUG added async handler 0x1fa55a0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.082125] [ndv4:54861:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465630.082128] [ndv4:54861:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465630.082535] [ndv4:54861:0] ib_md.c:296 UCX D
[1650465630.023694] [ndv4:55152:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4e25050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xda74 | |
[1650465630.023947] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.023954] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.023963] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b3b97c39008 of 151544 bytes with 1052 elements | |
[1650465630.027773] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3ba3a00000..0x2b3ba6000000 on mlx5_ib3 lkey 0x80b00 rkey 0x80b00 access 0xf flags 0x3e4 | |
[1650465630.027791] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b3ba3a00018 of 39845864 bytes with 4752 elements | |
[1650465630.027938] [ndv4:55152:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4e25050 | |
[1650465630.027971] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[23]=0x4e25050 using dc_mlx5/mlx5_ib3:1 on worker 0x357a8c0 | |
[1650465630.028173] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.028183] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.028292] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.028297] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.029110] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.040320] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.040602] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x5001060: created UD QP 0xda78 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.041069] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.041448] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.041455] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.041526] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.041531] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.041894] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b97c5e000..0x2b3b97ce3000 on mlx5_ib3 lkey 0x80e00 rkey 0x80e00 access 0xf flags 0x3e4 | |
[1650465630.041900] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b97c5e018 of 544744 bytes with 128 elements | |
[1650465630.041904] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.051711] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5001060: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465630.052188] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5001060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465630.053143] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5001060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465630.053770] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5001060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465630.054253] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5001060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465630.061938] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5001060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465630.062490] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5001060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465630.063043] [ndv4:55152:0]
[1650465630.027037] [ndv4:54853:0] async.c:228 UCX DEBUG added async handler 0x138cd70 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.027067] [ndv4:54853:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465630.027071] [ndv4:54853:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465630.027393] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.027406] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.027412] [ndv4:54853:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.027460] [ndv4:54853:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465630.029757] [ndv4:54853:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.029764] [ndv4:54853:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.029921] [ndv4:54853:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.030121] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.030128] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.044339] [ndv4:54853:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465630.044351] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.044872] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.044877] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.045054] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.045057] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.045196] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.045199] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.045295] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.045298] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.045708] [ndv4:54853:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.055200] [ndv4:54853:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.055511] [ndv4:54853:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465630.064874] [ndv4:54853:0] async.c:228 UCX DEBUG added async handler 0x1394c00 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.064910] [ndv4:54853:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465630.064914] [ndv4:54853:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465630.065122] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.065133] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.065139] [ndv4:54853:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.065200] [ndv4:54853:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465630.066856] [ndv4:54853:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.066863] [ndv4:54853:0] mpool.c:88 UCX D
[1650465630.056845] [ndv4:55039:0] debug.c:1198 UCX DEBUG using signal stack 0x2b0b39c2e000 size 141824 | |
[1650465630.056922] [ndv4:55039:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465630.056942] [ndv4:55039:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b0b39a9c000 | |
[1650465630.056961] [ndv4:55039:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465630.056969] [ndv4:55039:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465630.056975] [ndv4:55039:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465630.059510] [ndv4:55039:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465630.059527] [ndv4:55039:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465630.059565] [ndv4:55039:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465630.059568] [ndv4:55039:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465630.059593] [ndv4:55039:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465630.059601] [ndv4:55039:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465630.059604] [ndv4:55039:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465630.059608] [ndv4:55039:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465630.059611] [ndv4:55039:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465630.059613] [ndv4:55039:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465630.059615] [ndv4:55039:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465630.059618] [ndv4:55039:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465630.059626] [ndv4:55039:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465630.060357] [ndv4:55039:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465630.062039] [ndv4:55039:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465630.062055] [ndv4:55039:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465630.062068] [ndv4:55039:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465630.062079] [ndv4:55039:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465630.062089] [ndv4:55039:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465630.062100] [ndv4:55039:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465630.062110] [ndv4:55039:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465630.062481] [ndv4:55039:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465630.068823] [ndv4:55039:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.069140] [ndv4:55039:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465630.074085] [ndv4:55039:0] async.c:228 UCX DEBUG added async handler 0x19dd1e0 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.074175] [ndv4:55039:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465630.074182] [ndv4:55039:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465630.074417] [
[1650465630.019115] [ndv4:55432:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.019487] [ndv4:55432:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465630.020966] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x281e640 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.020991] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465630.020995] [ndv4:55432:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465630.021326] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.021335] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.021340] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.021381] [ndv4:55432:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465630.022804] [ndv4:55432:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.022810] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.022974] [ndv4:55432:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.022990] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.022996] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.023670] [ndv4:55432:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465630.023675] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.024164] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.024167] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.024478] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.024481] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.036204] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.036211] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.058795] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.058802] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.059020] [ndv4:55432:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.064844] [ndv4:55432:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.065168] [ndv4:55432:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465630.066033] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x2136760 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.066056] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465630.066059] [ndv4:55432:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465630.066182] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.066194] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.066200] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.066244] [ndv4:55432:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: usin
[1650465630.020999] [ndv4:55691:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.021350] [ndv4:55691:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465630.043708] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x2aa6050 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.043739] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465630.043744] [ndv4:55691:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465630.043917] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.043927] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.043934] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.044005] [ndv4:55691:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465630.046476] [ndv4:55691:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.046482] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.047671] [ndv4:55691:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.047790] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.047797] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.058394] [ndv4:55691:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465630.058401] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.059313] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.059318] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.072100] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.072109] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.083381] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.083388] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.097835] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.097844] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.098119] [ndv4:55691:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.016755] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.016761] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.017287] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x4ebb0a0: created RC QP 0xdaba on mlx5_ib6:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.017681] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[36]=0x4ebb0a0 using rc_verbs/mlx5_ib6:1 on worker 0x23dc840 | |
[1650465630.017812] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.017819] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.017927] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.017933] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.018564] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465630.030946] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.030960] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.030963] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.031013] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.031460] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x51dc010 of 8176 bytes with 127 elements | |
[1650465630.031676] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.031682] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.031717] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib6 length=2048) failed: Invalid argument | |
[1650465630.031721] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.031732] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x5010c40 [id=127 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.031751] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 127 events 0x1 mode thread_spinlock | |
[1650465630.031762] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[37]=0x5008030 using rc_mlx5/mlx5_ib6:1 on worker 0x23dc840 | |
[1650465630.031833] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.031839] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.032112] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.032116] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.033100] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465630.034522] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.034538] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.034541] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.034639] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.035006] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.028017] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.028320] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x2b0b280: created UD QP 0xdaa1 on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.028327] [ndv4:55512:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.028847] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.029387] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.029394] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.029684] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.029694] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.030080] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea79963000..0x2aea799e8000 on mlx5_ib1 lkey 0x81100 rkey 0x81100 access 0xf flags 0x3e4 | |
[1650465630.030085] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2aea79963018 of 544744 bytes with 128 elements | |
[1650465630.030091] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.030614] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2b0b280: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465630.031138] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2b0b280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465630.031300] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2b0b280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465630.052211] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2b0b280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465630.053036] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2b0b280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465630.053778] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2b0b280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465630.054323] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2b0b280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465630.072772] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x2b0b280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465630.072783] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.072834] [ndv4:55512:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.072840] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x19f74c0 [id=96 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.072864] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 96 events 0x5 mode thread_spinlock | |
[1650465630.072885] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[15]=0x2b0b280 using ud_mlx5/mlx5_ib1:1 on worker 0x22c98d0 | |
[1650465630.072974] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.072980] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.073121] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.073126] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.094058] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465630.095473] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t
[1650465630.016046] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.017075] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.017093] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.017096] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.017152] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.017479] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.017484] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.018157] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x44230a0: created RC QP 0xda60 on mlx5_ib3:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.018518] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[21]=0x44230a0 using rc_verbs/mlx5_ib3:1 on worker 0x2e9a8d0 | |
[1650465630.018885] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.018891] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.019102] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.019107] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.019717] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.021497] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.021535] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.021539] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.021620] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.021967] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4744010 of 8176 bytes with 127 elements | |
[1650465630.022145] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.022151] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.022186] [ndv4:54756:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465630.022190] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.022200] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x3815f80 [id=106 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.022226] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 106 events 0x1 mode thread_spinlock | |
[1650465630.022236] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[22]=0x4570030 using rc_mlx5/mlx5_ib3:1 on worker 0x2e9a8d0 | |
[1650465630.022352] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.022359] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.022522] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.022527] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.022939] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.024035] [ndv4:5
[1650465630.016433] [ndv4:55516:0] ib_iface.c:994 UCX DEBUG iface=0x2b84050: created RC QP 0xda9a on mlx5_ib1:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.017064] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[11]=0x2b84050 using rc_verbs/mlx5_ib1:1 on worker 0x22b38d0 | |
[1650465630.017217] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.017225] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.017291] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.017296] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.017784] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465630.019268] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.019278] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.019281] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.019304] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.019643] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x2b97010 of 8176 bytes with 127 elements | |
[1650465630.019876] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.019882] [ndv4:55516:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.019940] [ndv4:55516:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465630.019944] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.019954] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x238afc0 [id=92 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.019972] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 92 events 0x1 mode thread_spinlock | |
[1650465630.019982] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[12]=0x2a05010 using rc_mlx5/mlx5_ib1:1 on worker 0x22b38d0 | |
[1650465630.020223] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.020230] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.020410] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.020416] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.020895] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465630.022244] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.022261] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.022265] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.022318] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.022636] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.022642] [ndv4:55516:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.022675] [ndv4:55516:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465630.022678] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.022685] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x19eddb0 [id=94 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.022705] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 94 events 0x1 mode thread_spinlock | |
[1650465630.023125] [ndv4:55516:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.029890] [ndv4:55516:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x2d2f010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdac3 | |
[1650465630.030379] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.030387] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.030396] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ae24be70008 of 151544 bytes with 1052 elements | |
[1650465630.034249] [ndv4:55516:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae253600000..0x2ae255c00000 on mlx5_ib1 lkey 0x81200 rkey 0x81200 access 0xf flags 0x3e4 | |
[1650465630.034262] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ae253600018 of 39845864 bytes with 4752 elements | |
[1650465630.034430] [ndv4:55516:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x2d2f010 | |
[1650465630.034454] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[13]=0x2d2f010 using dc_mlx5/mlx5_ib1:1 on worker 0x22b38d0 | |
[1650465630.034519] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.034527] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.034677] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.034683] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.048011] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465630.049044] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.049457] [ndv4:55516:0] ib_iface.c:994 UCX DEBUG iface=0x2c2e3b0: created UD QP 0xdaa5 on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.050050] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.050513] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.050521] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.050701] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.050707] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.051087] [ndv4:55516:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae24be95000..0x2ae24bf1a000 on mlx5_ib1 lkey 0x81300 rkey 0x81300 access 0xf flags 0x3e4 | |
[1650465630.051103] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae24be95018 of 544744 bytes with 128 elements | |
[1650465630.051114] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.052056] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2c2e3b0: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465630.052821] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2c2e3b0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465630.053658] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2c2e3b0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465630.054210] [ndv4:55516:0] EBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.094686] [ndv4:54715:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.094827] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.094833] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.096569] [ndv4:54715:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465630.096636] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.098641] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.098646] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
EBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.082543] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.082548] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.082613] [ndv4:54861:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465630.084781] [ndv4:54861:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.084787] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.084978] [ndv4:54861:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.085388] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.085394] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
ud_iface.c:393 UCX DEBUG iface 0x5001060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465630.063343] [ndv4:55152:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.063353] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x43c0e10 [id=109 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.063378] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 109 events 0x5 mode thread_spinlock | |
[1650465630.063392] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[24]=0x5001060 using ud_verbs/mlx5_ib3:1 on worker 0x357a8c0 | |
[1650465630.063407] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.063413] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.063501] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.063506] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.063879] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.064794] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.065144] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x511f460: created UD QP 0xda79 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.065150] [ndv4:55152:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.065792] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.065997] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.066004] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.066128] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.066134] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.066873] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b97ce3000..0x2b3b97d68000 on mlx5_ib3 lkey 0x80f00 rkey 0x80f00 access 0xf flags 0x3e4 | |
[1650465630.066879] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b97ce3018 of 544744 bytes with 128 elements | |
[1650465630.066884] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.079289] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x511f460: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465630.080004] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x511f460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465630.080023] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x511f460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465630.080414] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x511f460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465630.090716] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x511f460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465630.090966] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x511f460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465630.091319] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x511f460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465630.091846] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x511f460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465630.091853] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.067097] [ndv4:54853:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.067178] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.067184] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.067852] [ndv4:54853:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465630.067857] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.080443] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.080450] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.081842] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.081849] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.095452] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.095460] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.074426] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.074446] [ndv4:55039:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.074504] [ndv4:55039:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465630.074525] [ndv4:55039:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465630.075659] [ndv4:55039:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.075667] [ndv4:55039:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.075870] [ndv4:55039:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.076023] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.076029] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.077436] [ndv4:55039:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465630.077441] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.087953] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465630.087960] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.088404] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465630.088408] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
g registration cache | |
[1650465630.067772] [ndv4:55432:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.067778] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.067928] [ndv4:55432:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.067962] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.067968] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.068164] [ndv4:55432:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465630.068168] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.068299] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.068303] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.068431] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.068434] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.102067] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.102076] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.104016] [ndv4:55432:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.104022] [ndv4:55432:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.104089] [ndv4:55432:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x2121df0 0x2121df0 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465630.035012] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.035046] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib6 length=2048) failed: Invalid argument | |
[1650465630.035050] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.035057] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x4e96360 [id=129 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.035076] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 129 events 0x1 mode thread_spinlock | |
[1650465630.035698] [ndv4:54965:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.042707] [ndv4:54965:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x51de050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdadd | |
[1650465630.042810] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.042817] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.042826] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b4b36da9008 of 151544 bytes with 1052 elements | |
[1650465630.046424] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b34600000..0x2b4b36c00000 on mlx5_ib6 lkey 0x80e00 rkey 0x80e00 access 0xf flags 0x3e4 | |
[1650465630.046441] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b4b34600018 of 39845864 bytes with 4752 elements | |
[1650465630.046633] [ndv4:54965:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x51de050 | |
[1650465630.046665] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[38]=0x51de050 using dc_mlx5/mlx5_ib6:1 on worker 0x23dc840 | |
[1650465630.046786] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.046796] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.046923] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.046929] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.047236] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465630.048108] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.048442] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x53ba340: created UD QP 0xdac4 on mlx5_ib6:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.048959] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.049261] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.049268] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.049381] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.049385] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.049838] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b36dce000..0x2b4b36e53000 on mlx5_ib6 lkey 0x80f00 rkey 0x80f00 access 0xf flags 0x3e4 | |
[1650465630.049844] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b36dce018 of 544744 bytes with 128 elements | |
[1650465630.049849] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.050292] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x53ba340: adding gid fe80::15:5dff:fd34:1 to hash on device mlx5_ib6 port 1 index 0) | |
[1650465630.050634] [ndv4:54965:0] ud_iface.c:393 Uheadroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.095489] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.095493] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.095543] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.096132] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.096138] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.097051] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x31340a0: created RC QP 0xdaf5 on mlx5_ib2:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.097907] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[16]=0x31340a0 using rc_verbs/mlx5_ib2:1 on worker 0x22c98d0 | |
[1650465630.098041] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.098048] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.098388] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.098393] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.098853] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465630.099890] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.099904] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.099907] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.099958] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.100318] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3455010 of 8176 bytes with 127 elements | |
[1650465630.100558] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.100564] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.100670] [ndv4:55512:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465630.100675] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.100686] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x1a12610 [id=99 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.100710] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 99 events 0x1 mode thread_spinlock | |
[1650465630.100737] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[17]=0x3281030 using rc_mlx5/mlx5_ib2:1 on worker 0x22c98d0 | |
[1650465630.100930] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.100936] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.101138] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.101144] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.101957] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465630.103548] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.103564] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_4756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.024053] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.024056] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.024114] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.024446] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.024452] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.024484] [ndv4:54756:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465630.024488] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.024495] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x43fe9d0 [id=108 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.024515] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 108 events 0x1 mode thread_spinlock | |
[1650465630.024824] [ndv4:54756:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.030990] [ndv4:54756:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4746050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xda7d | |
[1650465630.031269] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.031277] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.031289] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b3b53fb2008 of 151544 bytes with 1052 elements | |
[1650465630.036239] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b60200000..0x2b3b62800000 on mlx5_ib3 lkey 0x80c00 rkey 0x80c00 access 0xf flags 0x3e4 | |
[1650465630.036262] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b3b60200018 of 39845864 bytes with 4752 elements | |
[1650465630.036430] [ndv4:54756:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4746050 | |
[1650465630.036466] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[23]=0x4746050 using dc_mlx5/mlx5_ib3:1 on worker 0x2e9a8d0 | |
[1650465630.036644] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.036656] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.036816] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.036822] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.037918] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.039001] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.039448] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x4922060: created UD QP 0xda77 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.040034] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.040185] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.040191] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.040327] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.040335] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.040702] [ndv4:5475R support = { <none> } | |
[1650465630.091887] [ndv4:55152:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.091892] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x2cb3f80 [id=110 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.091914] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 110 events 0x5 mode thread_spinlock | |
[1650465630.091925] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[25]=0x511f460 using ud_mlx5/mlx5_ib3:1 on worker 0x357a8c0 | |
[1650465630.092014] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.092020] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.092158] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.092163] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.092779] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.094036] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.094051] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.101935] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.102001] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.102810] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.102817] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.104065] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x521f0a0: created RC QP 0xdad0 on mlx5_ib4:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.105061] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[26]=0x521f0a0 using rc_verbs/mlx5_ib4:1 on worker 0x357a8c0 | |
[1650465630.105335] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.105342] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.105553] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.105558] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.106897] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.050634] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x53ba340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 1) | |
[1650465630.060071] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x53ba340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 2) | |
[1650465630.060532] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x53ba340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 3) | |
[1650465630.070978] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x53ba340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 4) | |
[1650465630.071279] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x53ba340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 5) | |
[1650465630.071298] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x53ba340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 6) | |
[1650465630.071313] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x53ba340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 7) | |
[1650465630.071633] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.071642] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x53bae90 [id=130 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.071670] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 130 events 0x5 mode thread_spinlock | |
[1650465630.071684] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[39]=0x53ba340 using ud_verbs/mlx5_ib6:1 on worker 0x23dc840 | |
[1650465630.104841] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.104852] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.104896] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.104900] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.103564] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.103567] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.103681] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.104083] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.104090] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.104125] [ndv4:55512:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465630.104130] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.104139] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x27c1ef0 [id=101 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.104162] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 101 events 0x1 mode thread_spinlock | |
[1650465630.104734] [ndv4:55512:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.040702] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b62940000..0x2b3b629c5000 on mlx5_ib3 lkey 0x80d00 rkey 0x80d00 access 0xf flags 0x3e4 | |
[1650465630.040715] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b62940018 of 544744 bytes with 128 elements | |
[1650465630.040725] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.041393] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4922060: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465630.041791] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4922060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465630.042100] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4922060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465630.042288] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4922060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465630.042459] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4922060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465630.042607] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4922060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465630.042751] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4922060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465630.105477] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4922060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465630.106287] [ndv4:54756:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.106297] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x3d0ee40 [id=109 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.106327] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 109 events 0x5 mode thread_spinlock | |
[1650465630.106340] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[24]=0x4922060 using ud_verbs/mlx5_ib3:1 on worker 0x2e9a8d0 | |
[1650465630.106434] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.106441] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.106818] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.106825] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.107308] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
ud_iface.c:393 UCX DEBUG iface 0x2c2e3b0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465630.054650] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2c2e3b0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465630.054941] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2c2e3b0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465630.055210] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2c2e3b0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465630.055354] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2c2e3b0: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465630.055566] [ndv4:55516:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.055595] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x19e2340 [id=95 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.055640] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 95 events 0x5 mode thread_spinlock | |
[1650465630.055665] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[14]=0x2c2e3b0 using ud_verbs/mlx5_ib1:1 on worker 0x22b38d0 | |
[1650465630.055746] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.104128] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.105054] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.105064] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.106091] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465630.107930] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.108441] [ndv4:55516:0] ib_iface.c:994 UCX DEBUG iface=0x2af5280: created UD QP 0xdaa6 on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.108449] [ndv4:55516:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.106888] [ndv4:54717:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.106899] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.107066] [ndv4:54717:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.107202] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.107209] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.108413] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.108428] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.108432] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.108482] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.108969] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x5540010 of 8176 bytes with 127 elements | |
[1650465630.109194] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.109201] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.109238] [ndv4:55152:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465630.109242] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.109252] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x5374da0 [id=113 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.109280] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 113 events 0x1 mode thread_spinlock | |
[1650465630.109292] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[27]=0x536c030 using rc_mlx5/mlx5_ib4:1 on worker 0x357a8c0 | |
[1650465630.109423] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.109432] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.109812] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.109818] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.109158] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.109334] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.109341] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.109432] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.109441] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.109920] [ndv4:55516:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae24bf1a000..0x2ae24bf9f000 on mlx5_ib1 lkey 0x81400 rkey 0x81400 access 0xf flags 0x3e4 | |
[1650465630.109930] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae24bf1a018 of 544744 bytes with 128 elements | |
[1650465630.109939] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.110268] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2af5280: adding gid fe80::15:5dff:fd33:fffc to hash on device mlx5_ib1 port 1 index 0) | |
[1650465630.110294] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2af5280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 1) | |
[1650465630.110528] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2af5280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 2) | |
[1650465630.110789] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2af5280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 3) | |
[1650465630.111041] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2af5280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 4) | |
[1650465630.111369] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2af5280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 5) | |
[1650465630.111496] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2af5280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 6) | |
[1650465630.109488] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.110134] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x4a40460: created UD QP 0xda7a on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.110143] [ndv4:54756:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.110873] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.111095] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.111101] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.111174] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.111179] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.111585] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b629c5000..0x2b3b62a4a000 on mlx5_ib3 lkey 0x81000 rkey 0x81000 access 0xf flags 0x3e4 | |
[1650465630.111596] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b629c5018 of 544744 bytes with 128 elements | |
[1650465630.111604] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.111644] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4a40460: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465630.111985] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4a40460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465630.111487] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465630.111496] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.111826] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x2af5280: adding gid fe80:: to hash on device mlx5_ib1 port 1 index 7) | |
[1650465630.111840] [ndv4:55516:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.111934] [ndv4:55516:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.111942] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x19e14c0 [id=96 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.111970] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 96 events 0x5 mode thread_spinlock | |
[1650465630.111996] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[15]=0x2af5280 using ud_mlx5/mlx5_ib1:1 on worker 0x22b38d0 | |
[1650465630.112153] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.112160] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.112517] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.112524] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.122248] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465630.123393] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.123409] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.123413] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.123466] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.123865] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.123881] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.124385] [ndv4:55516:0] ib_iface.c:994 UCX DEBUG iface=0x311e0a0: created RC QP 0xdb01 on mlx5_ib2:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.124832] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[16]=0x311e0a0 using rc_verbs/mlx5_ib2:1 on worker 0x22b38d0 | |
[1650465630.124957] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.124964] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.125108] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.125114] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.140302] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465630.150111] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.150131] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.150135] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.150186] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.150539] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x343f010 of 8176 bytes with 127 elements | |
[1650465630.150773] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.150781] [ndv4:55516:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx
[1650465630.122694] [ndv4:55691:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.123032] [ndv4:55691:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465630.137606] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x2aa7210 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.137635] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465630.137639] [ndv4:55691:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465630.137902] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.137910] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.137916] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.137964] [ndv4:55691:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465630.139657] [ndv4:55691:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.139664] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.139797] [ndv4:55691:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.140029] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.140035] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.161613] [ndv4:55691:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465630.161620] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.118158] [ndv4:54717:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465630.118166] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.118657] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.118663] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.119030] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.119034] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.119245] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.119248] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.119473] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.119476] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.119765] [ndv4:54717:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.125934] [ndv4:54717:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.126231] [ndv4:54717:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465630.126733] [ndv4:54717:0] async.c:228 UCX DEBUG added async handler 0x23e12b0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.126774] [ndv4:54717:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465630.126778] [ndv4:54717:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465630.127004] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.127012] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.127018] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.127062] [ndv4:54717:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465630.127961] [ndv4:54717:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.127967] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.128129] [ndv4:54717:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.128203] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.128209] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.141367] [ndv4:54717:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465630.141380] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.142364] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.142369] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.164446] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.164466] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.168240] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.168247] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.171163] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.171169] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.171507] [ndv4:5471
[1650465630.122254] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465630.123610] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.124154] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x54d8460: created UD QP 0xdac5 on mlx5_ib6:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.124161] [ndv4:54965:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.124885] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.124914] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.124920] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.124932] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.124935] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.125385] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b36e53000..0x2b4b36ed8000 on mlx5_ib6 lkey 0x81000 rkey 0x81000 access 0xf flags 0x3e4 | |
[1650465630.125390] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b36e53018 of 544744 bytes with 128 elements | |
[1650465630.125395] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.125503] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x54d8460: adding gid fe80::15:5dff:fd34:1 to hash on device mlx5_ib6 port 1 index 0) | |
[1650465630.125522] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x54d8460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 1) | |
[1650465630.125651] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x54d8460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 2) | |
[1650465630.125711] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x54d8460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 3) | |
[1650465630.125749] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x54d8460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 4) | |
[1650465630.125771] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x54d8460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 5) | |
[1650465630.125804] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x54d8460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 6) | |
[1650465630.139002] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x54d8460: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 7) | |
[1650465630.139013] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.139043] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.139048] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x4e96970 [id=131 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.139070] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 131 events 0x5 mode thread_spinlock | |
[1650465630.139082] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[40]=0x54d8460 using ud_mlx5/mlx5_ib6:1 on worker 0x23dc840 | |
[1650465630.139098] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.139103] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.139277] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.139282] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.148176] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on
5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.150817] [ndv4:55516:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465630.150821] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.150838] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x19fc610 [id=99 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.150868] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 99 events 0x1 mode thread_spinlock | |
[1650465630.150901] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[17]=0x326b030 using rc_mlx5/mlx5_ib2:1 on worker 0x22b38d0 | |
[1650465630.151123] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.151132] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.151254] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.151263] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.151649] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465630.152927] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.152949] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.152953] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.153014] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.153367] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.153376] [ndv4:55516:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.153411] [ndv4:55516:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib2 length=2048) failed: Invalid argument | |
[1650465630.153416] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.153424] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x27abef0 [id=101 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.153444] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 101 events 0x1 mode thread_spinlock | |
[1650465630.153923] [ndv4:55516:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.161454] [ndv4:55516:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3441050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdb18 | |
[1650465630.161759] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.161767] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.161777] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ae24bfa1008 of 151544 bytes with 1052 elements | |
[1650465630.165488] [ndv4:55516:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae255e00000..0x2ae258400000 on mlx5_ib2 lkey 0x81500 rkey 0x81500 access 0xf flags 0x3e4 | |
[1650465630.165503] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ae255e00018 of 39845864 bytes with 4752 elements | |
[1650465630.165689] [ndv4:55516:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3441050 | |
[1650465630.165717] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[18]=0x3441050 using dc_mlx5/mlx5_ib2:1 on worker 0x22b38d0 | |
[1650465630.165873] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[16504
[1650465630.119225] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.120438] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.120457] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.120461] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.120519] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.121071] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.121083] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.121125] [ndv4:55152:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465630.121129] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.121143] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x5374fc0 [id=115 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.121174] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 115 events 0x1 mode thread_spinlock | |
[1650465630.121900] [ndv4:55152:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.130294] [ndv4:55152:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x5542050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdae6 | |
[1650465630.130720] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.130729] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.130740] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b3b97d6a008 of 151544 bytes with 1052 elements | |
[1650465630.134536] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3ba6200000..0x2b3ba8800000 on mlx5_ib4 lkey 0x80900 rkey 0x80900 access 0xf flags 0x3e4 | |
[1650465630.134567] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b3ba6200018 of 39845864 bytes with 4752 elements | |
[1650465630.134763] [ndv4:55152:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x5542050 | |
[1650465630.134800] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[28]=0x5542050 using dc_mlx5/mlx5_ib4:1 on worker 0x357a8c0 | |
[1650465630.135066] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.135077] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.135137] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.135142] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.153262] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.154664] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.155195] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x571e220: created UD QP 0xdae9 on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.155878] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.156010] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.156017] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.156099] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[165046563
[1650465630.121476] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465630.121483] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.121738] [ndv4:55039:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.128161] [ndv4:55039:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.128480] [ndv4:55039:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465630.129224] [ndv4:55039:0] async.c:228 UCX DEBUG added async handler 0x19d5f50 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.129251] [ndv4:55039:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465630.129256] [ndv4:55039:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465630.129417] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.129425] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.129430] [ndv4:55039:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.129482] [ndv4:55039:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465630.130689] [ndv4:55039:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.130696] [ndv4:55039:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.130899] [ndv4:55039:0] ib_md.c:1604 UCX DEBUG mlx5_ib1: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.131002] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.131008] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.131778] [ndv4:55039:0] topo.c:99 UCX DEBUG bus id 0x102000000 doesn't exist. sys_dev = 1 | |
[1650465630.131783] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.143938] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.143944] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.146970] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.146976] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.148708] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.148713] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.150484] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x102000000 exists. sys_dev = 1 | |
[1650465630.150490] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib1 bus id 258:0:0.0 sys_dev 1 | |
[1650465630.150742] [ndv4:55039:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.158650] [ndv4:55039:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib2 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.159015] [ndv4:55039:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib2: disable ODP because it's not supported for DevX QP | |
[1650465630.171221] [ndv4:55039:0] async.c:228 UCX DEBUG added async handler 0x19ddce0 [id=61 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.171265] [ndv4:55039:0] async.c:506 UCX DEBUG listening to async event fd 61 events 0x1 mode thread_spinlock | |
[1650465630.171272] [ndv4:55039:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib2' (InfiniBand channel adapter) with 1 ports | |
[1650465630.171931] [ndv4:55039:0] ib_md.c:296 UCX D
[1650465630.119657] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.119665] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.119824] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.119828] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.120311] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.120315] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.120526] [ndv4:54715:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.126435] [ndv4:54715:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.126760] [ndv4:54715:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465630.127231] [ndv4:54715:0] async.c:228 UCX DEBUG added async handler 0x20063a0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.127256] [ndv4:54715:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465630.127260] [ndv4:54715:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465630.127731] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.127742] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.127747] [ndv4:54715:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.127788] [ndv4:54715:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465630.128769] [ndv4:54715:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.128774] [ndv4:54715:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.129009] [ndv4:54715:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.129092] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.129098] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.151033] [ndv4:54715:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465630.151041] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.115317] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.115325] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.117406] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.117412] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.117739] [ndv4:55324:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.126006] [ndv4:55324:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.126381] [ndv4:55324:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465630.126893] [ndv4:55324:0] async.c:228 UCX DEBUG added async handler 0x1661f90 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.126922] [ndv4:55324:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465630.126926] [ndv4:55324:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465630.127116] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.127124] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.127129] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.127181] [ndv4:55324:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465630.127800] [ndv4:55324:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.127805] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.127979] [ndv4:55324:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.127999] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.128005] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.141645] [ndv4:55324:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465630.141652] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.152393] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.152399] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.154948] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.154954] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.157562] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.157568] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.158921] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.158927] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.159163] [ndv4:55324:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.117509] [ndv4:54861:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465630.117520] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.129747] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.129764] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.150708] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.150718] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.151603] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.151608] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.152237] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.152241] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.152523] [ndv4:54861:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.170735] [ndv4:54861:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.171185] [ndv4:54861:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465630.172070] [ndv4:54861:0] async.c:228 UCX DEBUG added async handler 0x1fa4d30 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.172100] [ndv4:54861:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465630.172104] [ndv4:54861:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465630.172418] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.172427] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.172435] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.172489] [ndv4:54861:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465630.174338] [ndv4:54861:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.174345] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.174632] [ndv4:54861:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.174765] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.174772] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.176322] [ndv4:54861:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465630.176328] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.114842] [ndv4:55512:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3457050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdb0b | |
[1650465630.115188] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.115196] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.115340] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2aea7c1e9008 of 151544 bytes with 1052 elements | |
[1650465630.119007] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea79a00000..0x2aea7c000000 on mlx5_ib2 lkey 0x81300 rkey 0x81300 access 0xf flags 0x3e4 | |
[1650465630.119026] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2aea79a00018 of 39845864 bytes with 4752 elements | |
[1650465630.119164] [ndv4:55512:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3457050 | |
[1650465630.119200] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[18]=0x3457050 using dc_mlx5/mlx5_ib2:1 on worker 0x22c98d0 | |
[1650465630.119241] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.119252] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.119320] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.119326] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.135156] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465630.136388] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.136766] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x3633060: created UD QP 0xdb02 on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.137380] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.137696] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.137703] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.137810] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.137815] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.138380] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea7c20e000..0x2aea7c293000 on mlx5_ib2 lkey 0x81400 rkey 0x81400 access 0xf flags 0x3e4 | |
[1650465630.138386] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2aea7c20e018 of 544744 bytes with 128 elements | |
[1650465630.138391] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.138913] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3633060: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465630.139464] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3633060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465630.139486] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3633060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465630.140370] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3633060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465630.141095] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3633060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465630.141833] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3633060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465630.142275] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3633060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465630.163136] [ndv4:55512:0]
[1650465630.116983] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.116992] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.117352] [ndv4:54853:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.147418] [ndv4:54853:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.147799] [ndv4:54853:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465630.159497] [ndv4:54853:0] async.c:228 UCX DEBUG added async handler 0x138c3a0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.159525] [ndv4:54853:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465630.159529] [ndv4:54853:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465630.159684] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.159692] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.159698] [ndv4:54853:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.159763] [ndv4:54853:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465630.160592] [ndv4:54853:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.160597] [ndv4:54853:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.160792] [ndv4:54853:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.160903] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.160909] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.173499] [ndv4:54853:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465630.173514] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.120868] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4a40460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465630.121098] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4a40460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465630.121200] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4a40460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465630.121214] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4a40460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465630.121327] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4a40460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465630.121342] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x4a40460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465630.121347] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.121436] [ndv4:54756:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.121441] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x25d3fb0 [id=110 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.121464] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 110 events 0x5 mode thread_spinlock | |
[1650465630.121474] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[25]=0x4a40460 using ud_mlx5/mlx5_ib3:1 on worker 0x2e9a8d0 | |
[1650465630.121647] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.121653] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.121779] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.121784] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.122144] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.123063] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.123082] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.123085] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.123141] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.123459] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.123464] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.124010] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x4b400a0: created RC QP 0xdad2 on mlx5_ib4:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.124601] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[26]=0x4b400a0 using rc_verbs/mlx5_ib4:1 on worker 0x2e9a8d0 | |
[1650465630.124686] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.124692] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.124842] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.124851] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.125249] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.126492] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.126512] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.126517] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.126658] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.126945] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4e61010 of 8176 bytes with 127 elements | |
[1650465630.127130] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.127138] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.127264] [ndv4:54756:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465630.127269] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.127281] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x3c06970 [id=113 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.127304] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 113 events 0x1 mode thread_spinlock | |
[1650465630.127318] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[27]=0x4c8d030 using rc_mlx5/mlx5_ib4:1 on worker 0x2e9a8d0 | |
[1650465630.127389] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.127396] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.127479] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.127485] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.127727] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.128870] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.128886] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.128889] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.128971] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.129283] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.129290] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.129323] [ndv4:54756:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465630.129327] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.129334] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x4e6ffc0 [id=115 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.129372] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 115 events 0x1 mode thread_spinlock | |
[1650465630.129662] [ndv4:54756:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.135245] [ndv4:54756:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4e63050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdaf0 | |
[1650465630.135383] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.135390] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.135398] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b3
65630.165880] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.166002] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.166008] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
0.156104] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.156494] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b97d8f000..0x2b3b97e14000 on mlx5_ib4 lkey 0x80b00 rkey 0x80b00 access 0xf flags 0x3e4 | |
[1650465630.156502] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b97d8f018 of 544744 bytes with 128 elements | |
[1650465630.156508] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.157305] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x571e220: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465630.157796] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x571e220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465630.158198] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x571e220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465630.158455] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x571e220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465630.168295] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x571e220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
EBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.171941] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.171952] [ndv4:55039:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.172063] [ndv4:55039:0] ib_md.c:1319 UCX DEBUG mlx5_ib2: using registration cache | |
[1650465630.173833] [ndv4:55039:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.173840] [ndv4:55039:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.174023] [ndv4:55039:0] ib_md.c:1604 UCX DEBUG mlx5_ib2: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.174195] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.174201] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.174802] [ndv4:55039:0] topo.c:99 UCX DEBUG bus id 0x103000000 doesn't exist. sys_dev = 2 | |
[1650465630.174808] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
ud_iface.c:393 UCX DEBUG iface 0x3633060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465630.163396] [ndv4:55512:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.163404] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x2c441f0 [id=102 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.163434] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 102 events 0x5 mode thread_spinlock | |
[1650465630.163450] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[19]=0x3633060 using ud_verbs/mlx5_ib2:1 on worker 0x22c98d0 | |
[1650465630.163630] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.163637] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.163690] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.163695] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.164011] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465630.165098] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.165455] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x3751050: created UD QP 0xdb0e on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.165462] [ndv4:55512:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.165984] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.166157] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.166163] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.166200] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.166206] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.166539] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea7c293000..0x2aea7c318000 on mlx5_ib2 lkey 0x81600 rkey 0x81600 access 0xf flags 0x3e4 | |
[1650465630.166545] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2aea7c293018 of 544744 bytes with 128 elements | |
[1650465630.166550] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.167494] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3751050: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465630.167516] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3751050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465630.167897] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3751050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465630.168270] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3751050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465630.168761] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3751050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465630.168812] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3751050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465630.168827] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3751050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465630.169285] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3751050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465630.169293] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without A
mlx5_ib7:1 | |
[1650465630.149344] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.149361] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.149364] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.149416] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.150423] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.150428] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.150972] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x55d80a0: created RC QP 0xda4c on mlx5_ib7:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.151302] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[41]=0x55d80a0 using rc_verbs/mlx5_ib7:1 on worker 0x23dc840 | |
[1650465630.151429] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.151435] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.151496] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.151500] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.151826] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465630.153363] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.153378] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.153381] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.153431] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.153889] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x58f9010 of 8176 bytes with 127 elements | |
[1650465630.154108] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.154114] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib7:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.154148] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib7 length=2048) failed: Invalid argument | |
[1650465630.154152] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.154162] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x5505960 [id=134 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.154182] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 134 events 0x1 mode thread_spinlock | |
[1650465630.154192] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[42]=0x5725030 using rc_mlx5/mlx5_ib7:1 on worker 0x23dc840 | |
[1650465630.154310] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.154316] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.154467] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.154473] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.154795] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465630.155774] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs
[1650465630.174478] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.174485] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.175144] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.175148] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.175276] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.175279] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.176593] [ndv4:55691:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.176597] [ndv4:55691:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.176723] [ndv4:55691:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x23a8c60 0x23a8c60 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
7:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
b53fd9008 of 151544 bytes with 1052 elements | |
[1650465630.139036] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b62c00000..0x2b3b65200000 on mlx5_ib4 lkey 0x80a00 rkey 0x80a00 access 0xf flags 0x3e4 | |
[1650465630.139053] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b3b62c00018 of 39845864 bytes with 4752 elements | |
[1650465630.139191] [ndv4:54756:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4e63050 | |
[1650465630.139228] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[28]=0x4e63050 using dc_mlx5/mlx5_ib4:1 on worker 0x2e9a8d0 | |
[1650465630.139392] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.139402] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.139721] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.139727] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.149116] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.161202] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.161476] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x503f220: created UD QP 0xdaeb on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
R support = { <none> } | |
[1650465630.169324] [ndv4:55512:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.169328] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x2bb0cb0 [id=103 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.169349] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 103 events 0x5 mode thread_spinlock | |
[1650465630.169359] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[20]=0x3751050 using ud_mlx5/mlx5_ib2:1 on worker 0x22c98d0 | |
[1650465630.169552] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.169557] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.169750] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.169756] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.170361] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
92 hdr_ofs 90 data_sz 8256 | |
[1650465630.155788] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.155791] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.155846] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.156197] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.156202] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib7:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.156236] [ndv4:54965:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib7 length=2048) failed: Invalid argument | |
[1650465630.156239] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.176948] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x2c1ed50 [id=136 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.176971] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 136 events 0x1 mode thread_spinlock | |
[1650465630.177522] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.177823] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.177831] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.178021] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.178027] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.178428] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b6524b000..0x2b3b652d0000 on mlx5_ib4 lkey 0x80e00 rkey 0x80e00 access 0xf flags 0x3e4 | |
[1650465630.178443] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b6524b018 of 544744 bytes with 128 elements | |
[1650465630.178452] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.179338] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x503f220: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465630.180146] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x503f220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465630.180691] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x503f220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465630.180915] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x503f220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465630.181662] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x503f220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465630.181888] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x503f220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465630.177565] [ndv4:54965:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.177284] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465630.178170] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.178501] [ndv4:55516:0] ib_iface.c:994 UCX DEBUG iface=0x361d060: created UD QP 0xdb0f on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.178982] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.179086] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.179092] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.179108] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.179112] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.179483] [ndv4:55516:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae258458000..0x2ae2584dd000 on mlx5_ib2 lkey 0x81700 rkey 0x81700 access 0xf flags 0x3e4 | |
[1650465630.179493] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae258458018 of 544744 bytes with 128 elements | |
[1650465630.179501] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.180223] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x361d060: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465630.180496] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x361d060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465630.180876] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x361d060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465630.180894] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x361d060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465630.180910] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x361d060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465630.181253] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x361d060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465630.181665] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x361d060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465630.181902] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x361d060: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465630.177352] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.177359] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.178570] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.178646] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.180506] [ndv4:54717:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.180856] [ndv4:54717:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465630.181928] [ndv4:54717:0] async.c:228 UCX DEBUG added async handler 0x2acabf0 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.181959] [ndv4:54717:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465630.181962] [ndv4:54717:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465630.182116] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.182125] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.182130] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.182176] [ndv4:54717:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465630.180471] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x571e220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465630.180842] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x571e220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465630.180863] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x571e220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465630.181203] [ndv4:55152:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.181211] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x4a2f3a0 [id=116 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.181234] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 116 events 0x5 mode thread_spinlock | |
[1650465630.181246] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[29]=0x571e220 using ud_verbs/mlx5_ib4:1 on worker 0x357a8c0 | |
[1650465630.181263] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.181269] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.181422] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.181427] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.182436] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.184518] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.180322] [ndv4:55324:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.180776] [ndv4:55324:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465630.181300] [ndv4:55324:0] async.c:228 UCX DEBUG added async handler 0x1663f40 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.181323] [ndv4:55324:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465630.181327] [ndv4:55324:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465630.181699] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.181708] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.181713] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.181762] [ndv4:55324:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465630.182075] [ndv4:55516:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.182083] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x2c2e1f0 [id=102 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.182107] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 102 events 0x5 mode thread_spinlock | |
[1650465630.182128] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[19]=0x361d060 using ud_verbs/mlx5_ib2:1 on worker 0x22b38d0 | |
[1650465630.182324] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.182329] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.182493] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.182498] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.183099] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib2:1 | |
[1650465630.183958] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.184215] [ndv4:55516:0] ib_iface.c:994 UCX DEBUG iface=0x373b050: created UD QP 0xdb10 on mlx5_ib2:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.184222] [ndv4:55516:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.184640] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.184676] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.184681] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.184707] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: cuda GPUDirect RDMA is enabled | |
[1650465630.184712] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib2: rocm GPUDirect RDMA is disabled | |
[1650465630.184959] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x583c050: created UD QP 0xdaec on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.184965] [ndv4:55152:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.185643] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.185669] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.185674] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.185685] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.185688] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.185011] [ndv4:55516:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae2584dd000..0x2ae258562000 on mlx5_ib2 lkey 0x81800 rkey 0x81800 access 0xf flags 0x3e4 | |
[1650465630.185021] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae2584dd018 of 544744 bytes with 128 elements | |
[1650465630.185029] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.185085] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x373b050: adding gid fe80::15:5dff:fd33:fffd to hash on device mlx5_ib2 port 1 index 0) | |
[1650465630.185117] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x373b050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 1) | |
[1650465630.185167] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x373b050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 2) | |
[1650465630.185198] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x373b050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 3) | |
[1650465630.185222] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x373b050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 4) | |
[1650465630.185246] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x373b050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 5) | |
[1650465630.185276] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x373b050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 6) | |
[1650465630.185294] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x373b050: adding gid fe80:: to hash on device mlx5_ib2 port 1 index 7) | |
[1650465630.185302] [ndv4:55516:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib2:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.185396] [ndv4:55516:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.185401] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x2b9acb0 [id=103 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.185425] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 103 events 0x5 mode thread_spinlock | |
[1650465630.185437] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[20]=0x373b050 using ud_mlx5/mlx5_ib2:1 on worker 0x22b38d0 | |
[1650465630.185450] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.185456] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.185511] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.185516] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.185700] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.184525] [ndv4:54717:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.184531] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.184690] [ndv4:54717:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.184712] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.184718] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.184875] [ndv4:54717:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465630.184879] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.184982] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.184986] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.185083] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.185086] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.185223] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.185227] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.185333] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.185337] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.185553] [ndv4:54717:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.184994] [ndv4:54965:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x58fb050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xda6f | |
[1650465630.185071] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.185078] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.185087] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b4b396d9008 of 151544 bytes with 1052 elements | |
[1650465630.186639] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.186658] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.186662] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.186717] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.186097] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b97e14000..0x2b3b97e99000 on mlx5_ib4 lkey 0x80f00 rkey 0x80f00 access 0xf flags 0x3e4 | |
[1650465630.186103] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b97e14018 of 544744 bytes with 128 elements | |
[1650465630.186108] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.186142] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x583c050: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465630.186158] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x583c050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465630.186172] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x583c050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465630.186187] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x583c050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465630.186201] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x583c050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465630.186215] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x583c050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465630.186229] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x583c050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465630.186242] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x583c050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465630.186246] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.186275] [ndv4:55152:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.186280] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x5227c00 [id=117 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.186300] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 117 events 0x5 mode thread_spinlock | |
[1650465630.186309] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[30]=0x583c050 using ud_mlx5/mlx5_ib4:1 on worker 0x357a8c0 | |
[1650465630.186319] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.186323] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.186377] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.186381] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.186564] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465630.187198] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.187204] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.187494] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.187511] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.187514] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.187567] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.188183] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.188190] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.188110] [ndv4:55516:0] ib_iface.c:994 UCX DEBUG iface=0x383b0a0: created RC QP 0xda7b on mlx5_ib3:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.187997] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.188018] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.188021] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.188073] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.189204] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[21]=0x383b0a0 using rc_verbs/mlx5_ib3:1 on worker 0x22b38d0 | |
[1650465630.189233] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.189240] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.189327] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.189334] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.189661] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.188754] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b37000000..0x2b4b39600000 on mlx5_ib7 lkey 0x80800 rkey 0x80800 access 0xf flags 0x3e4 | |
[1650465630.188770] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b4b37000018 of 39845864 bytes with 4752 elements | |
[1650465630.188911] [ndv4:54965:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x58fb050 | |
[1650465630.188943] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[43]=0x58fb050 using dc_mlx5/mlx5_ib7:1 on worker 0x23dc840 | |
[1650465630.189103] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.189114] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.189253] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.189259] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.189568] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465630.189475] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x593c0a0: created RC QP 0xda60 on mlx5_ib5:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.189543] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.189552] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.189736] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.189740] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.190033] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.190061] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.190133] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.190144] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.190768] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x38510a0: created RC QP 0xda7e on mlx5_ib3:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.190435] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[31]=0x593c0a0 using rc_verbs/mlx5_ib5:1 on worker 0x357a8c0 | |
[1650465630.190536] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.190544] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.190707] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.190714] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.190529] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.190880] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x5ad7060: created UD QP 0xda55 on mlx5_ib7:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.191130] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.191147] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.191150] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.191208] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.191541] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3b5c010 of 8176 bytes with 127 elements | |
[1650465630.191769] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.191776] [ndv4:55516:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.191816] [ndv4:55516:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465630.191820] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.191831] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x19d7ef0 [id=106 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.191860] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 106 events 0x1 mode thread_spinlock | |
[1650465630.191871] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[22]=0x3988030 using rc_mlx5/mlx5_ib3:1 on worker 0x22b38d0 | |
[1650465630.192199] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.192206] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.192309] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.192314] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.191137] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[21]=0x38510a0 using rc_verbs/mlx5_ib3:1 on worker 0x22c98d0 | |
[1650465630.191418] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.191425] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.191487] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.191491] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.191716] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.191065] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465630.191409] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.191873] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.191882] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.192108] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.192113] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.192557] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b396fe000..0x2b4b39783000 on mlx5_ib7 lkey 0x80900 rkey 0x80900 access 0xf flags 0x3e4 | |
[1650465630.192564] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b396fe018 of 544744 bytes with 128 elements | |
[1650465630.192568] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.192913] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x5ad7060: adding gid fe80::15:5dff:fd34:2 to hash on device mlx5_ib7 port 1 index 0) | |
[1650465630.192853] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.193453] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.193468] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.193471] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.193521] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.193260] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.193267] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.193513] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.193517] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.193435] [ndv4:54717:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.193111] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.193128] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.193131] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.193184] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.193672] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x5c5d010 of 8176 bytes with 127 elements | |
[1650465630.193809] [ndv4:54861:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.193787] [ndv4:54717:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465630.194110] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.194126] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.194129] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.194210] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.194505] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.194512] [ndv4:55516:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.194548] [ndv4:55516:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465630.194552] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.194559] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x195ab00 [id=108 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.194614] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 108 events 0x1 mode thread_spinlock | |
[1650465630.195136] [ndv4:55516:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.194181] [ndv4:55432:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2aee3be67000 length 12288 | |
[1650465630.194248] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465630.195325] [ndv4:55432:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465630.195334] [ndv4:55432:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2aee40ec7000 length 4296704 | |
[1650465630.195339] [ndv4:55432:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2aee40ec7018 of 4296680 bytes with 512 elements | |
[1650465630.194320] [ndv4:54717:0] async.c:228 UCX DEBUG added async handler 0x23e5710 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.194346] [ndv4:54717:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465630.194349] [ndv4:54717:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465630.194500] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.194507] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.194512] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.194567] [ndv4:54717:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465630.193973] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3b72010 of 8176 bytes with 127 elements | |
[1650465630.194183] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.194194] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.194272] [ndv4:55512:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465630.194277] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.194287] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x19edef0 [id=106 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.194311] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 106 events 0x1 mode thread_spinlock | |
[1650465630.194322] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[22]=0x399e030 using rc_mlx5/mlx5_ib3:1 on worker 0x22c98d0 | |
[1650465630.194349] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.194355] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.194452] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.194456] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.194685] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.193938] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.193948] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.194021] [ndv4:55152:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465630.194025] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.194038] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x4a2fe50 [id=120 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.194062] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 120 events 0x1 mode thread_spinlock | |
[1650465630.194074] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[32]=0x5a89030 using rc_mlx5/mlx5_ib5:1 on worker 0x357a8c0 | |
[1650465630.194153] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.194160] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.194260] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.194265] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.195724] [ndv4:55432:0] mm_iface.c:600 UCX DEBUG created mm iface 0x289ede0 FIFO id 0x400000005232a315 va 0x2aee3be67000 size 12288 (128 x 64 elems) | |
[1650465630.195774] [ndv4:55432:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x289ede0 using posix/memory on worker 0x3178950 | |
[1650465630.195799] [ndv4:55432:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465630.195836] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465630.195850] [ndv4:55432:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465630.195859] [ndv4:55432:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2aee412e0018 of 4296680 bytes with 512 elements | |
[1650465630.196418] [ndv4:55432:0] mm_iface.c:600 UCX DEBUG created mm iface 0x289f3b0 FIFO id 0x72001f va 0x2aee3be6a000 size 12288 (128 x 64 elems) | |
[1650465630.196428] [ndv4:55432:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x289f3b0 using sysv/memory on worker 0x3178950 | |
[1650465630.196441] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465630.196444] [ndv4:55432:0] self.c:220 UCX DEBUG created self iface id 0xe074c57977df98cc send_size 8192 | |
[1650465630.196450] [ndv4:55432:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x28b2f90 using self/memory0 on worker 0x3178950 | |
[1650465630.196476] [ndv4:55432:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465630.196482] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465630.196486] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465630.197202] [ndv4:55324:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.197219] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.197447] [ndv4:55324:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.195676] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x503f220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465630.195989] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x503f220: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465630.196161] [ndv4:54756:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.196174] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x43503a0 [id=116 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.196200] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 116 events 0x5 mode thread_spinlock | |
[1650465630.196220] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[29]=0x503f220 using ud_verbs/mlx5_ib4:1 on worker 0x2e9a8d0 | |
[1650465630.196244] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.196250] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.196303] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.196308] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.196809] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.195843] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.195859] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.195863] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.195923] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.196291] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.196297] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.196363] [ndv4:55512:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib3 length=2048) failed: Invalid argument | |
[1650465630.196367] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.196375] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x1970b00 [id=108 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.196394] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 108 events 0x1 mode thread_spinlock | |
[1650465630.196694] [ndv4:55512:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.199072] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x28c0fb0 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465630.199109] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465630.199127] [ndv4:55432:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x28b38f0: listening for connections (fd=78) on 10.5.0.5:49327 | |
[1650465630.199266] [ndv4:55432:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x28b38f0 using tcp/eth0 on worker 0x3178950 | |
[1650465630.199288] [ndv4:55432:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465630.199292] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465630.199295] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465630.199335] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x280f370 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465630.199359] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465630.199363] [ndv4:55432:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x28b3f70: listening for connections (fd=80) on 127.0.0.1:43614 | |
[1650465630.199379] [ndv4:55432:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.199384] [ndv4:55432:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.199444] [ndv4:55432:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x28b3f70 using tcp/lo on worker 0x3178950 | |
[1650465630.199461] [ndv4:55432:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465630.199464] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465630.199467] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465630.199508] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x280f0a0 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465630.199531] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465630.199535] [ndv4:55432:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x289b600: listening for connections (fd=82) on 172.16.1.242:42240 | |
[1650465630.199805] [ndv4:55432:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x289b600 using tcp/ib0 on worker 0x3178950 | |
[1650465630.200198] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.200211] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.201278] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.201283] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.201008] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.201017] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.202363] [ndv4:54861:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.202759] [ndv4:54861:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465630.201940] [ndv4:55516:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3b5e050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xda92 | |
[1650465630.202175] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.202182] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.202191] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ae24bfc8008 of 151544 bytes with 1052 elements | |
[1650465630.203322] [ndv4:55512:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3b74050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xda94 | |
[1650465630.203448] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.203455] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.203466] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2aea7eb19008 of 151544 bytes with 1052 elements | |
[1650465630.205218] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x5ad7060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 1) | |
[1650465630.205297] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x5ad7060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 2) | |
[1650465630.205398] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x5ad7060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 3) | |
[1650465630.205502] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x5ad7060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 4) | |
[1650465630.205780] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x5ad7060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 5) | |
[1650465630.206152] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x5ad7060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 6) | |
[1650465630.206419] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x5ad7060: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 7) | |
[1650465630.206699] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.206712] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x2cc3510 [id=137 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.206737] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 137 events 0x5 mode thread_spinlock | |
[1650465630.206750] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[44]=0x5ad7060 using ud_verbs/mlx5_ib7:1 on worker 0x23dc840 | |
[1650465630.206765] [ndv4:54717:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.206787] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.206997] [ndv4:54717:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.207094] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.207106] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.208394] [ndv4:54717:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465630.208406] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.206690] [ndv4:55516:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae258600000..0x2ae25ac00000 on mlx5_ib3 lkey 0x81100 rkey 0x81100 access 0xf flags 0x3e4 | |
[1650465630.206709] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ae258600018 of 39845864 bytes with 4752 elements | |
[1650465630.206869] [ndv4:55516:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3b5e050 | |
[1650465630.206905] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[23]=0x3b5e050 using dc_mlx5/mlx5_ib3:1 on worker 0x22b38d0 | |
[1650465630.207625] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea7c400000..0x2aea7ea00000 on mlx5_ib3 lkey 0x81200 rkey 0x81200 access 0xf flags 0x3e4 | |
[1650465630.207651] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2aea7c400018 of 39845864 bytes with 4752 elements | |
[1650465630.207790] [ndv4:55512:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3b74050 | |
[1650465630.207824] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[23]=0x3b74050 using dc_mlx5/mlx5_ib3:1 on worker 0x22c98d0 | |
[1650465630.207935] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.207947] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.208059] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.208063] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.208391] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.209255] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.209615] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x3d50060: created UD QP 0xda95 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.209714] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.209726] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.210440] [ndv4:55324:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465630.210445] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.210108] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.210186] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.210193] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.210226] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.210230] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.210596] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea7eb3e000..0x2aea7ebc3000 on mlx5_ib3 lkey 0x81300 rkey 0x81300 access 0xf flags 0x3e4 | |
[1650465630.210603] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2aea7eb3e018 of 544744 bytes with 128 elements | |
[1650465630.210609] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.210925] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3d50060: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465630.211142] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3d50060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465630.211308] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3d50060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465630.211494] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3d50060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465630.211832] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3d50060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465630.212147] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3d50060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465630.212018] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.212031] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.212174] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.212180] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.210416] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.210826] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x515d050: created UD QP 0xdaed on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.210834] [ndv4:54756:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.211345] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.211635] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.211646] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.211687] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.211692] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.212056] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b652d0000..0x2b3b65355000 on mlx5_ib4 lkey 0x81000 rkey 0x81000 access 0xf flags 0x3e4 | |
[1650465630.212064] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b652d0018 of 544744 bytes with 128 elements | |
[1650465630.212072] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.212114] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x515d050: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465630.212815] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x515d050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465630.213069] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x515d050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465630.213317] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x515d050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465630.213472] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x515d050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465630.213903] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.213912] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.214129] [ndv4:54715:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.214939] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.214954] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.215556] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.215560] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.216668] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x103000000 exists. sys_dev = 2 | |
[1650465630.216674] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib2 bus id 259:0:0.0 sys_dev 2 | |
[1650465630.217029] [ndv4:55039:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.215738] [ndv4:55691:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b871f8df000 length 12288 | |
[1650465630.215798] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465630.216852] [ndv4:55691:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465630.216860] [ndv4:55691:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b871f8e2000 length 4296704 | |
[1650465630.216865] [ndv4:55691:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b871f8e2018 of 4296680 bytes with 512 elements | |
[1650465630.217126] [ndv4:55691:0] mm_iface.c:600 UCX DEBUG created mm iface 0x2b28a90 FIFO id 0x400000005555a826 va 0x2b871f8df000 size 12288 (128 x 64 elems) | |
[1650465630.217171] [ndv4:55691:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x2b28a90 using posix/memory on worker 0x33ff770 | |
[1650465630.217197] [ndv4:55691:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465630.217231] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465630.217246] [ndv4:55691:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlb | |
[1650465630.217255] [ndv4:55691:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b8724000018 of 4296680 bytes with 512 elements | |
[1650465630.217872] [ndv4:55691:0] mm_iface.c:600 UCX DEBUG created mm iface 0x2b3a7e0 FIFO id 0x720021 va 0x2b871fcfb000 size 12288 (128 x 64 elems) | |
[1650465630.217880] [ndv4:55691:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x2b3a7e0 using sysv/memory on worker 0x33ff770 | |
[1650465630.217892] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465630.217895] [ndv4:55691:0] self.c:220 UCX DEBUG created self iface id 0x442a197005aa060c send_size 8192 | |
[1650465630.217901] [ndv4:55691:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x2b39e80 using self/memory0 on worker 0x33ff770 | |
[1650465630.217924] [ndv4:55691:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465630.217929] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465630.217932] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465630.218533] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465630.217065] [ndv4:54861:0] async.c:228 UCX DEBUG added async handler 0x18bff90 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.217090] [ndv4:54861:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465630.217093] [ndv4:54861:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465630.217305] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.217314] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.217320] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.217368] [ndv4:54861:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465630.219019] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.219031] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.219323] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.219328] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.219440] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.219462] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.219699] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.219705] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.220137] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.220159] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.220163] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.220229] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.220105] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.220502] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x2b481a0 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465630.220536] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465630.220553] [ndv4:55691:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x2b3adb0: listening for connections (fd=78) on 10.5.0.5:54662 | |
[1650465630.220740] [ndv4:55691:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x2b3adb0 using tcp/eth0 on worker 0x33ff770 | |
[1650465630.220761] [ndv4:55691:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465630.220765] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465630.220768] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465630.220805] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x2b47b60 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465630.220835] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465630.220838] [ndv4:55691:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x2b22530: listening for connections (fd=80) on 127.0.0.1:45852 | |
[1650465630.220858] [ndv4:55691:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.220863] [ndv4:55691:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.220922] [ndv4:55691:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x2b22530 using tcp/lo on worker 0x33ff770 | |
[1650465630.220938] [ndv4:55691:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465630.220941] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465630.220943] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465630.220979] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x2a94a20 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465630.220999] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465630.221003] [ndv4:55691:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x2b22bb0: listening for connections (fd=82) on 172.16.1.242:52533 | |
[1650465630.220684] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.220695] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.220734] [ndv4:55152:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465630.220738] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.220752] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x5a91f80 [id=122 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.220781] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 122 events 0x1 mode thread_spinlock | |
[1650465630.221256] [ndv4:55691:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x2b22bb0 using tcp/ib0 on worker 0x33ff770 | |
[1650465630.221324] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.221332] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.221452] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.221458] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.221427] [ndv4:55152:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.221145] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.221678] [ndv4:55516:0] ib_iface.c:994 UCX DEBUG iface=0x3d3a060: created UD QP 0xda96 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.221946] [ndv4:54715:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.222225] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.222405] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.222412] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.222474] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.222479] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.222280] [ndv4:54715:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465630.222900] [ndv4:55516:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae25ad63000..0x2ae25ade8000 on mlx5_ib3 lkey 0x81400 rkey 0x81400 access 0xf flags 0x3e4 | |
[1650465630.222908] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae25ad63018 of 544744 bytes with 128 elements | |
[1650465630.222913] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.223162] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3d3a060: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465630.223309] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3d3a060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465630.223502] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3d3a060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465630.223915] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3d3a060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465630.224133] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3d3a060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465630.224522] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3d3a060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465630.222842] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.222856] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.224995] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3d3a060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465630.225527] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3d3a060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465630.225789] [ndv4:55516:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.225800] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x3d3ae60 [id=109 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.225826] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 109 events 0x5 mode thread_spinlock | |
[1650465630.225843] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[24]=0x3d3a060 using ud_verbs/mlx5_ib3:1 on worker 0x22b38d0 | |
[1650465630.226084] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.226091] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.226234] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.226240] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.225191] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3d50060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465630.225751] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3d50060: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465630.225894] [ndv4:55512:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.225904] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x3d50e60 [id=109 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.225930] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 109 events 0x5 mode thread_spinlock | |
[1650465630.225943] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[24]=0x3d50060 using ud_verbs/mlx5_ib3:1 on worker 0x22c98d0 | |
[1650465630.225960] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.225966] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.226008] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.226012] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.226366] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.225696] [ndv4:55432:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465630.227381] [ndv4:55432:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.227419] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.227422] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.227441] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.227351] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.227682] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x3e6e460: created UD QP 0xda97 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.227690] [ndv4:55512:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.227009] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x515d050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465630.227109] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x515d050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465630.227124] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x515d050: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465630.227130] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.227210] [ndv4:54756:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.227216] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x4b1b340 [id=117 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.227243] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 117 events 0x5 mode thread_spinlock | |
[1650465630.227260] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[30]=0x515d050 using ud_mlx5/mlx5_ib4:1 on worker 0x2e9a8d0 | |
[1650465630.227284] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.227290] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.227463] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.227469] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.227755] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465630.227931] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.227940] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.227974] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.227987] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.228171] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.228254] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.228261] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.228292] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.228297] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.228685] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea7ebc3000..0x2aea7ec48000 on mlx5_ib3 lkey 0x81500 rkey 0x81500 access 0xf flags 0x3e4 | |
[1650465630.228691] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2aea7ebc3018 of 544744 bytes with 128 elements | |
[1650465630.228695] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.228734] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3e6e460: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465630.228967] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3e6e460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465630.229036] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3e6e460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465630.229162] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3e6e460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465630.229329] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3e6e460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465630.229477] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3e6e460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465630.229677] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3e6e460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465630.229909] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x3e6e460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465630.229914] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.229970] [ndv4:55512:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.229976] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x382ca10 [id=110 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.229998] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 110 events 0x5 mode thread_spinlock | |
[1650465630.230011] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[25]=0x3e6e460 using ud_mlx5/mlx5_ib3:1 on worker 0x22c98d0 | |
[1650465630.230061] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.230067] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.230187] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.230193] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.228815] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.228835] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.228839] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.228890] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.229205] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.229212] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.229655] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x525d0a0: created RC QP 0xda6c on mlx5_ib5:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.230189] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[31]=0x525d0a0 using rc_verbs/mlx5_ib5:1 on worker 0x2e9a8d0 | |
[1650465630.230346] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.230352] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.230495] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.230500] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.229241] [ndv4:55152:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x5c5f050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xda76 | |
[1650465630.229531] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.229539] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.229548] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b3b97e9b008 of 151544 bytes with 1052 elements | |
[1650465630.229153] [ndv4:55432:0] ib_iface.c:994 UCX DEBUG iface=0x28a9740: created RC QP 0xf4b8 on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.231397] [ndv4:55432:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x28a9740 using rc_verbs/mlx5_ib0:1 on worker 0x3178950 | |
[1650465630.231481] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.231487] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.230989] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465630.230782] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.231984] [ndv4:54965:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib7:1 | |
[1650465630.233530] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3ba8a00000..0x2b3bab000000 on mlx5_ib5 lkey 0x80700 rkey 0x80700 access 0xf flags 0x3e4 | |
[1650465630.233544] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b3ba8a00018 of 39845864 bytes with 4752 elements | |
[1650465630.233750] [ndv4:55152:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x5c5f050 | |
[1650465630.233785] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[33]=0x5c5f050 using dc_mlx5/mlx5_ib5:1 on worker 0x357a8c0 | |
[1650465630.233845] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.233854] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.233916] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.233921] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.233826] [ndv4:55691:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465630.232258] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.232281] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.232285] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.232341] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.233904] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.233913] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.231812] [ndv4:54861:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.231820] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.232070] [ndv4:54861:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.232166] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.232173] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.232426] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.232439] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.232442] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.232493] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.232835] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x557e010 of 8176 bytes with 127 elements | |
[1650465630.233018] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.233026] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.233062] [ndv4:54756:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465630.233067] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.233079] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x4350e50 [id=120 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.233105] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 120 events 0x1 mode thread_spinlock | |
[1650465630.233116] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[32]=0x53aa030 using rc_mlx5/mlx5_ib5:1 on worker 0x2e9a8d0 | |
[1650465630.233162] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.233168] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.233249] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.233254] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.233588] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465630.234201] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465630.234263] [ndv4:54715:0] async.c:228 UCX DEBUG added async handler 0x200e280 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.234289] [ndv4:54715:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465630.234292] [ndv4:54715:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465630.234542] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.234559] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.234563] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.234661] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.234991] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.234997] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.235029] [ndv4:54756:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465630.235032] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.235039] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x53b2f80 [id=122 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.235079] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 122 events 0x1 mode thread_spinlock | |
[1650465630.234506] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.234514] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.234519] [ndv4:54715:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.234559] [ndv4:54715:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465630.235236] [ndv4:54715:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.235241] [ndv4:54715:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.234554] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x3f6e0a0: created RC QP 0xdaee on mlx5_ib4:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.234675] [ndv4:55691:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.234718] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.234723] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.234755] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.235073] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.235084] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.235079] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.235382] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.235390] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.235856] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.235861] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.235407] [ndv4:54756:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.235557] [ndv4:55691:0] ib_iface.c:994 UCX DEBUG iface=0x2b30630: created RC QP 0xf4b9 on mlx5_ib0:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.235563] [ndv4:55039:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib3 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.236129] [ndv4:55039:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib3: disable ODP because it's not supported for DevX QP | |
[1650465630.235375] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x5e3b060: created UD QP 0xda6d on mlx5_ib5:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.235968] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.236014] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.236020] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.236055] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.236059] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.236365] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b97ec0000..0x2b3b97f45000 on mlx5_ib5 lkey 0x80800 rkey 0x80800 access 0xf flags 0x3e4 | |
[1650465630.236372] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b97ec0018 of 544744 bytes with 128 elements | |
[1650465630.236376] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.236409] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5e3b060: adding gid fe80::15:5dff:fd34:0 to hash on device mlx5_ib5 port 1 index 0) | |
[1650465630.236424] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5e3b060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 1) | |
[1650465630.236439] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5e3b060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 2) | |
[1650465630.236454] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5e3b060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 3) | |
[1650465630.235612] [ndv4:54715:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.235647] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.235653] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.235781] [ndv4:54715:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465630.235785] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.236116] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.236120] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.235648] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[26]=0x3f6e0a0 using rc_verbs/mlx5_ib4:1 on worker 0x22c98d0 | |
[1650465630.235804] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.235811] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.235877] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.235882] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.236084] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.236812] [ndv4:55039:0] async.c:228 UCX DEBUG added async handler 0x19d52b0 [id=63 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.236837] [ndv4:55039:0] async.c:506 UCX DEBUG listening to async event fd 63 events 0x1 mode thread_spinlock | |
[1650465630.236841] [ndv4:55039:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib3' (InfiniBand channel adapter) with 1 ports | |
[1650465630.236961] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.236970] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.236975] [ndv4:55039:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.237031] [ndv4:55039:0] ib_md.c:1319 UCX DEBUG mlx5_ib3: using registration cache | |
[1650465630.237655] [ndv4:55039:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.237661] [ndv4:55039:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.236370] [ndv4:55691:0] ucp_worker.c:1159 UCX DEBUG created interface[6]=0x2b30630 using rc_verbs/mlx5_ib0:1 on worker 0x33ff770 | |
[1650465630.236480] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.236487] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.236624] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.236630] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.236917] [ndv4:55691:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465630.236952] [ndv4:55691:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465630.237541] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.237556] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.237559] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.237623] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.238022] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x428f010 of 8176 bytes with 127 elements | |
[1650465630.238277] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.238284] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.238324] [ndv4:55512:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465630.238327] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.238341] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x3061690 [id=113 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.238370] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 113 events 0x1 mode thread_spinlock | |
[1650465630.238381] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[27]=0x40bb030 using rc_mlx5/mlx5_ib4:1 on worker 0x22c98d0 | |
[1650465630.238400] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.238406] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.238479] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.238484] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.238882] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.237923] [ndv4:55691:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.237936] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.237939] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.237990] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.238434] [ndv4:55691:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x34c9010 of 8176 bytes with 127 elements | |
[1650465630.238677] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.238695] [ndv4:55691:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.238742] [ndv4:55691:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465630.238747] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.237898] [ndv4:55039:0] ib_md.c:1604 UCX DEBUG mlx5_ib3: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.237984] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.237992] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.238468] [ndv4:55039:0] topo.c:99 UCX DEBUG bus id 0x104000000 doesn't exist. sys_dev = 3 | |
[1650465630.238473] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.239125] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.239130] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.239315] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.239319] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.238875] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib3:1 | |
[1650465630.240058] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.240472] [ndv4:55516:0] ib_iface.c:994 UCX DEBUG iface=0x3e58460: created UD QP 0xda98 on mlx5_ib3:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.240479] [ndv4:55516:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.240048] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.240065] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.240068] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.240121] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.240478] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.240483] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.240516] [ndv4:55512:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465630.240519] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.240527] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x40c3f10 [id=115 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.240546] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 115 events 0x1 mode thread_spinlock | |
[1650465630.240974] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.241129] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.241136] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.241215] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: cuda GPUDirect RDMA is enabled | |
[1650465630.241220] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib3: rocm GPUDirect RDMA is disabled | |
[1650465630.241596] [ndv4:55516:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae25ade8000..0x2ae25ae6d000 on mlx5_ib3 lkey 0x81600 rkey 0x81600 access 0xf flags 0x3e4 | |
[1650465630.241611] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae25ade8018 of 544744 bytes with 128 elements | |
[1650465630.241621] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.241661] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3e58460: adding gid fe80::15:5dff:fd33:fffe to hash on device mlx5_ib3 port 1 index 0) | |
[1650465630.242026] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3e58460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 1) | |
[1650465630.242195] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3e58460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 2) | |
[1650465630.241165] [ndv4:54756:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x5580050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xda82 | |
[1650465630.241326] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.241332] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.241342] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b3b67b56008 of 151544 bytes with 1052 elements | |
[1650465630.241099] [ndv4:55512:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.240898] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.240913] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.241185] [ndv4:54853:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.240780] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.240788] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.241366] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x104000000 exists. sys_dev = 3 | |
[1650465630.241371] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib3 bus id 260:0:0.0 sys_dev 3 | |
[1650465630.241746] [ndv4:55039:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.243504] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.243516] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.244933] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b65400000..0x2b3b67a00000 on mlx5_ib5 lkey 0x80900 rkey 0x80900 access 0xf flags 0x3e4 | |
[1650465630.244951] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b3b65400018 of 39845864 bytes with 4752 elements | |
[1650465630.245116] [ndv4:54756:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x5580050 | |
[1650465630.245143] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[33]=0x5580050 using dc_mlx5/mlx5_ib5:1 on worker 0x2e9a8d0 | |
[1650465630.245287] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.245295] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.245460] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.245466] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.246392] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465630.248209] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.248494] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x575c060: created UD QP 0xda79 on mlx5_ib5:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.245760] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x2a95fc0 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.245787] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465630.245803] [ndv4:55691:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x2b3e080 using rc_mlx5/mlx5_ib0:1 on worker 0x33ff770 | |
[1650465630.245951] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.245958] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.246109] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.246115] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.246944] [ndv4:55691:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465630.247652] [ndv4:55691:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.247661] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.247664] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.247678] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.247970] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.247976] [ndv4:55691:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.248008] [ndv4:55691:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465630.248012] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.248021] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x2a8aef0 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.248040] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465630.248639] [ndv4:55691:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.244194] [ndv4:55432:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465630.244846] [ndv4:55432:0] ib_device.c:1394 UCX DEBUG max IB CQE size is 128 | |
[1650465630.245851] [ndv4:55432:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.245859] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.245862] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.245917] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.246281] [ndv4:55432:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3242010 of 8176 bytes with 127 elements | |
[1650465630.246487] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.246514] [ndv4:55432:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.246564] [ndv4:55432:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465630.246569] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.245438] [ndv4:54965:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.245917] [ndv4:54965:0] ib_iface.c:994 UCX DEBUG iface=0x31477b0: created UD QP 0xda56 on mlx5_ib7:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.245924] [ndv4:54965:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.247343] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.247772] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.247780] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.247841] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.247846] [ndv4:54965:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.248235] [ndv4:54965:0] ib_md.c:812 UCX DEBUG registered memory 0x2b4b39783000..0x2b4b39808000 on mlx5_ib7 lkey 0x80a00 rkey 0x80a00 access 0xf flags 0x3e4 | |
[1650465630.248240] [ndv4:54965:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b4b39783018 of 544744 bytes with 128 elements | |
[1650465630.248245] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.248491] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x31477b0: adding gid fe80::15:5dff:fd34:2 to hash on device mlx5_ib7 port 1 index 0) | |
[1650465630.248772] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x31477b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 1) | |
[1650465630.248893] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x31477b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 2) | |
[1650465630.248901] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.249027] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.249037] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.249081] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.249085] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.248866] [ndv4:55039:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.249290] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.249297] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.249411] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b67b7b000..0x2b3b67c00000 on mlx5_ib5 lkey 0x80a00 rkey 0x80a00 access 0xf flags 0x3e4 | |
[1650465630.249420] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b67b7b018 of 544744 bytes with 128 elements | |
[1650465630.249428] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.249561] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x575c060: adding gid fe80::15:5dff:fd34:0 to hash on device mlx5_ib5 port 1 index 0) | |
[1650465630.250065] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x575c060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 1) | |
[1650465630.250445] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x575c060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 2) | |
[1650465630.250869] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x575c060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 3) | |
[1650465630.251101] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x575c060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 4) | |
[1650465630.251274] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x575c060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 5) | |
[1650465630.251476] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x575c060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 6) | |
[1650465630.251686] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x575c060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 7) | |
[1650465630.249325] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5e3b060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 4) | |
[1650465630.249779] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5e3b060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 5) | |
[1650465630.250229] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5e3b060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 6) | |
[1650465630.249850] [ndv4:55512:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x4291050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdb03 | |
[1650465630.250142] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.250150] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.250159] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2aea81449008 of 151544 bytes with 1052 elements | |
[1650465630.249461] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.249469] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.249636] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.249640] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.251757] [ndv4:55039:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465630.251950] [ndv4:54756:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.251961] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x52389b0 [id=123 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.251989] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 123 events 0x5 mode thread_spinlock | |
[1650465630.252013] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[34]=0x575c060 using ud_verbs/mlx5_ib5:1 on worker 0x2e9a8d0 | |
[1650465630.252105] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.252112] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.252206] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.252210] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.253468] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x27fedf0 [id=85 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.253497] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 85 events 0x1 mode thread_spinlock | |
[1650465630.253516] [ndv4:55432:0] ucp_worker.c:1159 UCX DEBUG created interface[7]=0x28b7190 using rc_mlx5/mlx5_ib0:1 on worker 0x3178950 | |
[1650465630.253641] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.253650] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.253728] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.253734] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.253978] [ndv4:55432:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465630.254922] [ndv4:55432:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.254931] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.254934] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.254948] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.253789] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea7ee00000..0x2aea81400000 on mlx5_ib4 lkey 0x81100 rkey 0x81100 access 0xf flags 0x3e4 | |
[1650465630.253806] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2aea7ee00018 of 39845864 bytes with 4752 elements | |
[1650465630.253945] [ndv4:55512:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x4291050 | |
[1650465630.253984] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[28]=0x4291050 using dc_mlx5/mlx5_ib4:1 on worker 0x22c98d0 | |
[1650465630.254123] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.254135] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.253985] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3e58460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 3) | |
[1650465630.254137] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3e58460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 4) | |
[1650465630.254306] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3e58460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 5) | |
[1650465630.254419] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3e58460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 6) | |
[1650465630.254608] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x3e58460: adding gid fe80:: to hash on device mlx5_ib3 port 1 index 7) | |
[1650465630.254614] [ndv4:55516:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib3:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.254708] [ndv4:55516:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.254715] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x3816a10 [id=110 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.254740] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 110 events 0x5 mode thread_spinlock | |
[1650465630.254759] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[25]=0x3e58460 using ud_mlx5/mlx5_ib3:1 on worker 0x22b38d0 | |
[1650465630.254793] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.254799] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.254918] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.254923] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.255146] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.255313] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.255320] [ndv4:55432:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.255352] [ndv4:55432:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib0 length=2048) failed: Invalid argument | |
[1650465630.255356] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.255367] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x280db70 [id=87 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.255389] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 87 events 0x1 mode thread_spinlock | |
[1650465630.255649] [ndv4:55691:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3733010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xf4e3 | |
[1650465630.255822] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.255855] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.255875] [ndv4:55691:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b871fd00008 of 151544 bytes with 1052 elements | |
[1650465630.255717] [ndv4:55432:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.262408] [ndv4:55432:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x34ac010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xf4eb | |
[1650465630.262610] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.262616] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.262636] [ndv4:55432:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2aee3be6f008 of 151544 bytes with 1052 elements | |
[1650465630.266343] [ndv4:55432:0] ib_md.c:812 UCX DEBUG registered memory 0x2aee41800000..0x2aee43e00000 on mlx5_ib0 lkey 0x81d00 rkey 0x81d00 access 0xf flags 0x3e4 | |
[1650465630.266361] [ndv4:55432:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2aee41800018 of 39845864 bytes with 4752 elements | |
[1650465630.266500] [ndv4:55432:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x34ac010 | |
[1650465630.266534] [ndv4:55432:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x34ac010 using dc_mlx5/mlx5_ib0:1 on worker 0x3178950 | |
[1650465630.266766] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.266776] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.266918] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.266923] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.267420] [ndv4:55432:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465630.268260] [ndv4:55432:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.268626] [ndv4:55432:0] ib_iface.c:994 UCX DEBUG iface=0x28bfb00: created UD QP 0xf4cf on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.269126] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.269510] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.269517] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.269810] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.269816] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.270146] [ndv4:55432:0] ib_md.c:812 UCX DEBUG registered memory 0x2aee3be94000..0x2aee3bf19000 on mlx5_ib0 lkey 0x81e00 rkey 0x81e00 access 0xf flags 0x3e4 | |
[1650465630.270151] [ndv4:55432:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2aee3be94018 of 544744 bytes with 128 elements | |
[1650465630.270155] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.271218] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28bfb00: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465630.271833] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28bfb00: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465630.272268] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28bfb00: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465630.283461] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28bfb00: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465630.283604] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28bfb00: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465630.295321] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28bfb00: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465630.295564] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28bfb0
[1650465630.256048] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.256069] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.256074] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.256128] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.256471] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.256476] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.256989] [ndv4:55516:0] ib_iface.c:994 UCX DEBUG iface=0x3f580a0: created RC QP 0xdaf9 on mlx5_ib4:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.257509] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[26]=0x3f580a0 using rc_verbs/mlx5_ib4:1 on worker 0x22b38d0 | |
[1650465630.257623] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.257631] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.257706] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.257710] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.257839] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.259062] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.259078] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.259081] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.259135] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.259448] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x4279010 of 8176 bytes with 127 elements | |
[1650465630.259642] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.259650] [ndv4:55516:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.259688] [ndv4:55516:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465630.259693] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.259705] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x304b690 [id=113 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.259732] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 113 events 0x1 mode thread_spinlock | |
[1650465630.259742] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[27]=0x40a5030 using rc_mlx5/mlx5_ib4:1 on worker 0x22b38d0 | |
[1650465630.259873] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.259879] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.259950] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.259954] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.260135] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.261171] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs
[1650465630.262191] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.262198] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.262485] [ndv4:55324:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.270201] [ndv4:55324:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.270565] [ndv4:55324:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465630.272334] [ndv4:55324:0] async.c:228 UCX DEBUG added async handler 0x165dcb0 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.272356] [ndv4:55324:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465630.272359] [ndv4:55324:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465630.272696] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.272705] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.272710] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.272760] [ndv4:55324:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465630.273803] [ndv4:55324:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.273809] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.274047] [ndv4:55324:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.286736] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.286748] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.287247] [ndv4:55324:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465630.287252] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.299455] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.299462] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.299817] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.299822] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.300545] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.300549] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.301481] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.301486] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.301766] [ndv4:55324:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.327638] [ndv4:55324:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.328012] [ndv4:55324:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465630.329135] [ndv4:55324:0] async.c:228 UCX DEBUG added async handler 0x165f7e0 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.329158] [ndv4:55324:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465630.329161] [ndv4:55324:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465630.329357] [ndv4:55324:0] ib_md.c:296 UCX D
[1650465630.258933] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.258959] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.259769] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.259774] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.260067] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.260071] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.260440] [ndv4:54717:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.270040] [ndv4:54717:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.270383] [ndv4:54717:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465630.272129] [ndv4:54717:0] async.c:228 UCX DEBUG added async handler 0x23e5560 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.272154] [ndv4:54717:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465630.272159] [ndv4:54717:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465630.272561] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.272570] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.272594] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.272640] [ndv4:54717:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465630.274043] [ndv4:54717:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.274049] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.274195] [ndv4:54717:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.274259] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.274265] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.274801] [ndv4:54717:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465630.274806] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.275201] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.275205] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.275785] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.275789] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.276334] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.276337] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.277238] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.277244] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.277518] [ndv4:54717:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.286409] [ndv4:54717:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.286746] [ndv4:54717:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465630.298564] [ndv4:54717:0] async.c:228 UCX
[1650465630.265298] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465630.266159] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.266501] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x587a460: created UD QP 0xda7a on mlx5_ib5:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.266506] [ndv4:54756:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.267036] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.267196] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.267203] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.267278] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.267283] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.268010] [ndv4:54756:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b67c00000..0x2b3b67c85000 on mlx5_ib5 lkey 0x80b00 rkey 0x80b00 access 0xf flags 0x3e4 | |
[1650465630.268017] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b67c00018 of 544744 bytes with 128 elements | |
[1650465630.268022] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.268051] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x587a460: adding gid fe80::15:5dff:fd34:0 to hash on device mlx5_ib5 port 1 index 0) | |
[1650465630.268067] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x587a460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 1) | |
[1650465630.268309] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x587a460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 2) | |
[1650465630.268712] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x587a460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 3) | |
[1650465630.269344] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x587a460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 4) | |
[1650465630.270456] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x587a460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 5) | |
[1650465630.271401] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x587a460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 6) | |
[1650465630.281963] [ndv4:54756:0] ud_iface.c:393 UCX DEBUG iface 0x587a460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 7) | |
[1650465630.281979] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.282064] [ndv4:54756:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.282073] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x575cf20 [id=124 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.282110] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 124 events 0x5 mode thread_spinlock | |
[1650465630.282139] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[35]=0x587a460 using ud_mlx5/mlx5_ib5:1 on worker 0x2e9a8d0 | |
[1650465630.282210] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.282219] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.282319] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.282324] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.282801] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on
[1650465630.261068] [ndv4:54853:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib4 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.261496] [ndv4:54853:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib4: disable ODP because it's not supported for DevX QP | |
[1650465630.261899] [ndv4:54853:0] async.c:228 UCX DEBUG added async handler 0x1394280 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.261926] [ndv4:54853:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465630.261930] [ndv4:54853:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465630.272037] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.272055] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.272064] [ndv4:54853:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.272128] [ndv4:54853:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465630.274174] [ndv4:54853:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.274180] [ndv4:54853:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.274400] [ndv4:54853:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.274494] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.274500] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.288365] [ndv4:54853:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465630.288372] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.288908] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.288913] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.288976] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.288979] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.289618] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.289621] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.300950] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.300957] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.301185] [ndv4:54853:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.317485] [ndv4:54853:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.317903] [ndv4:54853:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465630.318466] [ndv4:54853:0] async.c:228 UCX DEBUG added async handler 0x1392f90 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.318491] [ndv4:54853:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465630.318495] [ndv4:54853:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465630.318722] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.318731] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.318738] [ndv4:54853:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.318802] [ndv4:54853:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: usin
[1650465630.260402] [ndv4:55691:0] ib_md.c:812 UCX DEBUG registered memory 0x2b8724600000..0x2b8726c00000 on mlx5_ib0 lkey 0x81b00 rkey 0x81b00 access 0xf flags 0x3e4 | |
[1650465630.260426] [ndv4:55691:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b8724600018 of 39845864 bytes with 4752 elements | |
[1650465630.260656] [ndv4:55691:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3733010 | |
[1650465630.260691] [ndv4:55691:0] ucp_worker.c:1159 UCX DEBUG created interface[8]=0x3733010 using dc_mlx5/mlx5_ib0:1 on worker 0x33ff770 | |
[1650465630.260779] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.260790] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.260891] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.260895] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.261283] [ndv4:55691:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465630.262029] [ndv4:55691:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.262357] [ndv4:55691:0] ib_iface.c:994 UCX DEBUG iface=0x341f5f0: created UD QP 0xf4ce on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.262859] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.262949] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.262957] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.263011] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.263016] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.263344] [ndv4:55691:0] ib_md.c:812 UCX DEBUG registered memory 0x2b871fd25000..0x2b871fdaa000 on mlx5_ib0 lkey 0x81c00 rkey 0x81c00 access 0xf flags 0x3e4 | |
[1650465630.263352] [ndv4:55691:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b871fd25018 of 544744 bytes with 128 elements | |
[1650465630.263362] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.263421] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x341f5f0: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465630.263597] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x341f5f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465630.263696] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x341f5f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465630.263852] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x341f5f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465630.263975] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x341f5f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465630.275343] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x341f5f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465630.275442] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x341f5f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465630.275464] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x341f5f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465630.275745] [ndv4:55691:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.281534] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x2b2e420 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.281567] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 m
[1650465630.264245] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.264254] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.264470] [ndv4:54715:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.273815] [ndv4:54715:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.274173] [ndv4:54715:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465630.274794] [ndv4:54715:0] async.c:228 UCX DEBUG added async handler 0x200cf90 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.274817] [ndv4:54715:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465630.274820] [ndv4:54715:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465630.287671] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.287683] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.287688] [ndv4:54715:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.287733] [ndv4:54715:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465630.288948] [ndv4:54715:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.288953] [ndv4:54715:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.289156] [ndv4:54715:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.289246] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.289252] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.289858] [ndv4:54715:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465630.289863] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.301420] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.301428] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.313487] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.313494] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.339279] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.339287] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.353227] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.353234] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.353452] [ndv4:54715:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.371641] [ndv4:54715:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.265772] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.265782] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.267862] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.269303] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.269864] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x446d1c0: created UD QP 0xdb01 on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.271885] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.272209] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.272217] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.272348] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.272354] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.272933] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea8146e000..0x2aea814f3000 on mlx5_ib4 lkey 0x81200 rkey 0x81200 access 0xf flags 0x3e4 | |
[1650465630.272939] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2aea8146e018 of 544744 bytes with 128 elements | |
[1650465630.272944] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.284425] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x446d1c0: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465630.284825] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x446d1c0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465630.285174] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x446d1c0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465630.285453] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x446d1c0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465630.296210] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x446d1c0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465630.296648] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x446d1c0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465630.296904] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x446d1c0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465630.297052] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x446d1c0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465630.297686] [ndv4:55512:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.297694] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x3f499f0 [id=116 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.297723] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 116 events 0x5 mode thread_spinlock | |
[1650465630.297734] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[29]=0x446d1c0 using ud_verbs/mlx5_ib4:1 on worker 0x22c98d0 | |
[1650465630.297773] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.297780] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.297860] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.297865] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.298139] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.299883] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib
[1650465630.261558] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x31477b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 3) | |
[1650465630.261660] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x31477b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 4) | |
[1650465630.261779] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x31477b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 5) | |
[1650465630.261804] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x31477b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 6) | |
[1650465630.262017] [ndv4:54965:0] ud_iface.c:393 UCX DEBUG iface 0x31477b0: adding gid fe80:: to hash on device mlx5_ib7 port 1 index 7) | |
[1650465630.262023] [ndv4:54965:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib7:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.262057] [ndv4:54965:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.262061] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x2cc3580 [id=138 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.262084] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 138 events 0x5 mode thread_spinlock | |
[1650465630.262094] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[45]=0x31477b0 using ud_mlx5/mlx5_ib7:1 on worker 0x23dc840 | |
[1650465630.262155] [ndv4:54965:0] mpool.c:88 UCX DEBUG mpool uct_scopy_iface_tx_mp: align 64, maxelems 4294967295, elemsize 736 | |
[1650465630.262194] [ndv4:54965:0] ucp_worker.c:1159 UCX DEBUG created interface[46]=0x5bf5880 using cma/memory on worker 0x23dc840 | |
[1650465630.262201] [ndv4:54965:0] ucp_worker.c:982 UCX DEBUG selected scalable tl bitmap: 0x7fffffffffff 0x0 (47 tls) | |
[1650465630.272104] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x55b35c0 [id=75 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.272132] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 75 events 0x0 mode thread_spinlock | |
[1650465630.272179] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3147600 [id=76 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.272195] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 76 events 0x0 mode thread_spinlock | |
[1650465630.272213] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3147640 [id=77 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.272230] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 77 events 0x0 mode thread_spinlock | |
[1650465630.273023] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3147680 [id=79 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.273047] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 79 events 0x0 mode thread_spinlock | |
[1650465630.273065] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.273070] [ndv4:54965:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.273131] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.273136] [ndv4:54965:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.273189] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.273194] [ndv4:54965:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.273245] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.273250] [ndv4:54965:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.273301] [ndv4:54965:0] sock.c:88 UCX
0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465630.295804] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28bfb00: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465630.296046] [ndv4:55432:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.301565] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x28a3e10 [id=88 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.301644] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 88 events 0x5 mode thread_spinlock | |
[1650465630.301673] [ndv4:55432:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x28bfb00 using ud_verbs/mlx5_ib0:1 on worker 0x3178950 | |
[1650465630.301771] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.301778] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.301914] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.301919] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.302798] [ndv4:55432:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465630.303656] [ndv4:55432:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.304020] [ndv4:55432:0] ib_iface.c:994 UCX DEBUG iface=0x28c1390: created UD QP 0xf4d1 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.304030] [ndv4:55432:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.304560] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.304625] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.304631] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.304724] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.304729] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.305133] [ndv4:55432:0] ib_md.c:812 UCX DEBUG registered memory 0x2aee3bf19000..0x2aee3bf9e000 on mlx5_ib0 lkey 0x82000 rkey 0x82000 access 0xf flags 0x3e4 | |
[1650465630.305138] [ndv4:55432:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2aee3bf19018 of 544744 bytes with 128 elements | |
[1650465630.305143] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.305337] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28c1390: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465630.305546] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28c1390: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465630.305734] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28c1390: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465630.306009] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28c1390: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465630.306120] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28c1390: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465630.306412] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28c1390: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465630.306721] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28c1390: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465630.307246] [ndv4:55432:0] ud_iface.c:393 UCX DEBUG iface 0x28c1390: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465630.307252] [ndv4:55432:0] ib_mlx5.c:858 UCX DEBUG SL=0 (A[1650465630.265194] [ndv4:55039:0] async.c:228 UCX DEBUG added async handler 0x20bebf0 [id=65 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.265227] [ndv4:55039:0] async.c:506 UCX DEBUG listening to async event fd 65 events 0x1 mode thread_spinlock | |
[1650465630.265231] [ndv4:55039:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib4' (InfiniBand channel adapter) with 1 ports | |
[1650465630.265511] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.265520] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.265526] [ndv4:55039:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.265602] [ndv4:55039:0] ib_md.c:1319 UCX DEBUG mlx5_ib4: using registration cache | |
[1650465630.267981] [ndv4:55039:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.267986] [ndv4:55039:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.268153] [ndv4:55039:0] ib_md.c:1604 UCX DEBUG mlx5_ib4: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.268280] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.268287] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.270522] [ndv4:55039:0] topo.c:99 UCX DEBUG bus id 0x105000000 doesn't exist. sys_dev = 4 | |
[1650465630.270527] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.281643] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.281650] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.282039] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.282043] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.294369] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.294376] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.295817] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x105000000 exists. sys_dev = 4 | |
[1650465630.295821] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib4 bus id 261:0:0.0 sys_dev 4 | |
[1650465630.296054] [ndv4:55039:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.322013] [ndv4:55039:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib5 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.322346] [ndv4:55039:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib5: disable ODP because it's not supported for DevX QP | |
[1650465630.322996] [ndv4:55039:0] async.c:228 UCX DEBUG added async handler 0x19d9710 [id=67 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.323021] [ndv4:55039:0] async.c:506 UCX DEBUG listening to async event fd 67 events 0x1 mode thread_spinlock | |
[1650465630.323024] [ndv4:55039:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib5' (InfiniBand channel adapter) with 1 ports | |
[1650465630.323179] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.323186] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.323191] [ndv4:55039:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.323243] [ndv4:55039:0] ib_md.c:1319 UCX DEBUG mlx5_ib5: using registration cache | |
[1650465630.324276] [ndv4:55039:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.324282] [ndv4:55039:0] mpool.c:88 UCX D[1650465630.257746] [ndv4:54861:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465630.257755] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.269454] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.269461] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.283628] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.283635] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.284656] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.284659] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.308327] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.308334] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.308560] [ndv4:54861:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.318344] [ndv4:54861:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.318756] [ndv4:54861:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465630.319642] [ndv4:54861:0] async.c:228 UCX DEBUG added async handler 0x18bf720 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.319666] [ndv4:54861:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465630.319671] [ndv4:54861:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465630.319906] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.319914] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.319920] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.319986] [ndv4:54861:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465630.334032] [ndv4:54861:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.334041] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.334260] [ndv4:54861:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.334282] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.334288] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.334430] [ndv4:54861:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465630.334435] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.334557] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.334560] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.334727] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.334731] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.334844] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.334847] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.334965] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.334968] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.335258] [ndv4:5486[1650465630.344293] [ndv4:54724:0] debug.c:1198 UCX DEBUG using signal stack 0x2b5dafc84000 size 141824 | |
[1650465630.344370] [ndv4:54724:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465630.344393] [ndv4:54724:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b5dafaf2000 | |
[1650465630.344414] [ndv4:54724:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465630.344423] [ndv4:54724:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465630.344429] [ndv4:54724:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465630.347381] [ndv4:54724:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465630.347404] [ndv4:54724:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465630.347448] [ndv4:54724:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465630.347451] [ndv4:54724:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465630.347460] [ndv4:54724:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465630.347469] [ndv4:54724:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465630.347472] [ndv4:54724:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465630.347479] [ndv4:54724:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465630.347481] [ndv4:54724:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465630.347485] [ndv4:54724:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465630.347487] [ndv4:54724:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465630.347489] [ndv4:54724:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465630.347503] [ndv4:54724:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465630.348391] [ndv4:54724:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465630.349080] [ndv4:54724:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465630.349096] [ndv4:54724:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465630.349107] [ndv4:54724:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465630.349117] [ndv4:54724:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465630.349127] [ndv4:54724:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465630.349137] [ndv4:54724:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465630.349148] [ndv4:54724:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465630.349518] [ndv4:54724:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465630.367046] [ndv4:54724:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.367463] [ndv4:54724:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465630.272858] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5e3b060: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 7) | |
[1650465630.273463] [ndv4:55152:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.273473] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x59179b0 [id=123 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.273503] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 123 events 0x5 mode thread_spinlock | |
[1650465630.273521] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[34]=0x5e3b060 using ud_verbs/mlx5_ib5:1 on worker 0x357a8c0 | |
[1650465630.273688] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.273695] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.273784] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.273789] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.285510] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465630.297920] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.298514] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x5f59460: created UD QP 0xda7b on mlx5_ib5:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.298521] [ndv4:55152:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.299247] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.299425] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.299433] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.299479] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.299484] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.299952] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3b97f45000..0x2b3b97fca000 on mlx5_ib5 lkey 0x80c00 rkey 0x80c00 access 0xf flags 0x3e4 | |
[1650465630.299958] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3b97f45018 of 544744 bytes with 128 elements | |
[1650465630.299962] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.299990] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5f59460: adding gid fe80::15:5dff:fd34:0 to hash on device mlx5_ib5 port 1 index 0) | |
[1650465630.300007] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5f59460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 1) | |
[1650465630.300022] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5f59460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 2) | |
[1650465630.300168] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5f59460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 3) | |
[1650465630.300558] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5f59460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 4) | |
[1650465630.300931] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5f59460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 5) | |
[1650465630.300957] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5f59460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 6) | |
[1650465630.301498] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x5f59460: adding gid fe80:: to hash on device mlx5_ib5 port 1 index 7) | |
[1650465630.301505] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR sup[1650465630.372054] [ndv4:54715:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.300300] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x377e7b0: created UD QP 0xdb07 on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.300309] [ndv4:55512:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.302445] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.302608] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.302614] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.302771] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.302776] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.303975] [ndv4:55512:0] ib_md.c:812 UCX DEBUG registered memory 0x2aea814f3000..0x2aea81578000 on mlx5_ib4 lkey 0x81400 rkey 0x81400 access 0xf flags 0x3e4 | |
[1650465630.303980] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2aea814f3018 of 544744 bytes with 128 elements | |
[1650465630.303985] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.304210] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x377e7b0: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465630.304239] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x377e7b0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465630.304400] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x377e7b0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465630.304521] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x377e7b0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465630.304629] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x377e7b0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465630.304772] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x377e7b0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465630.304809] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x377e7b0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465630.304828] [ndv4:55512:0] ud_iface.c:393 UCX DEBUG iface 0x377e7b0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465630.304835] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.304867] [ndv4:55512:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.304872] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x458b960 [id=117 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.304897] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 117 events 0x5 mode thread_spinlock | |
[1650465630.304908] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[30]=0x377e7b0 using ud_mlx5/mlx5_ib4:1 on worker 0x22c98d0 | |
[1650465630.304921] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.304927] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.305002] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.305008] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.305177] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465630.306205] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.306220] [ndv4:port = { <none> }, SLs without AR support = { <none> } | |
[1650465630.301541] [ndv4:55152:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.301547] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x5e3bf20 [id=124 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.301619] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 124 events 0x5 mode thread_spinlock | |
[1650465630.301637] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[35]=0x5f59460 using ud_mlx5/mlx5_ib5:1 on worker 0x357a8c0 | |
[1650465630.301729] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.301735] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.301903] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.301908] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.302819] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465630.303849] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.303865] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.303868] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.303915] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.304201] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.304206] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.304790] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x60590a0: created RC QP 0xdac7 on mlx5_ib6:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.305350] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[36]=0x60590a0 using rc_verbs/mlx5_ib6:1 on worker 0x357a8c0 | |
[1650465630.305467] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.305474] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.305541] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.305546] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.305876] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465630.308653] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.308672] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.308676] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.308731] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.309127] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x637a010 of 8176 bytes with 127 elements | |
[1650465630.309319] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.309326] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.309362] [ndv4:55152:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib6 length=2048) failed: Invalid argument | |
[1650465630.309367] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpooR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.307287] [ndv4:55432:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.307292] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x2130500 [id=89 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.307313] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 89 events 0x5 mode thread_spinlock | |
[1650465630.307328] [ndv4:55432:0] ucp_worker.c:1159 UCX DEBUG created interface[10]=0x28c1390 using ud_mlx5/mlx5_ib0:1 on worker 0x3178950 | |
[1650465630.307498] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.307505] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.315045] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.315054] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.316524] [ndv4:55432:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
EBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.324431] [ndv4:55039:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.324472] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.324478] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.325019] [ndv4:55039:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465630.325024] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.356146] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.356155] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.368755] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.368762] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.369792] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.369796] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
1:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.365719] [ndv4:54861:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.366124] [ndv4:54861:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.273305] [ndv4:54965:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.273358] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.273362] [ndv4:54965:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.273414] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.273419] [ndv4:54965:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.273470] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.273475] [ndv4:54965:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.273528] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.273533] [ndv4:54965:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.273645] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.273650] [ndv4:54965:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.273706] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.273711] [ndv4:54965:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.273763] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.273768] [ndv4:54965:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.273821] [ndv4:54965:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.273826] [ndv4:54965:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.273874] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x31476c0 [id=81 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.273895] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 81 events 0x0 mode thread_spinlock | |
[1650465630.276446] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3147700 [id=83 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.276467] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 83 events 0x0 mode thread_spinlock | |
[1650465630.276537] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3147740 [id=84 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.276554] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 84 events 0x0 mode thread_spinlock | |
[1650465630.276587] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3e63e80 [id=86 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.276635] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 86 events 0x0 mode thread_spinlock | |
[1650465630.276679] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3e63ec0 [id=90 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.276694] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 90 events 0x0 mode thread_spinlock | |
[1650465630.276710] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3e63f00 [id=91 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.276731] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 91 events 0x0 mode thread_spinlock | |
[1650465630.276745] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x3e63f40 [id=93 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.276761] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 93 e90 data_sz 8256 | |
[1650465630.261186] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.261190] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.261252] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.261558] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.261566] [ndv4:55516:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib4:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.261620] [ndv4:55516:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib4 length=2048) failed: Invalid argument | |
[1650465630.261624] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.261633] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x40adf10 [id=115 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.261655] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 115 events 0x1 mode thread_spinlock | |
[1650465630.262013] [ndv4:55516:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.274772] [ndv4:55516:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x427b050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdb0f | |
[1650465630.275021] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.275029] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.275038] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2ae25d66e008 of 151544 bytes with 1052 elements | |
[1650465630.278826] [ndv4:55516:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae25b000000..0x2ae25d600000 on mlx5_ib4 lkey 0x81300 rkey 0x81300 access 0xf flags 0x3e4 | |
[1650465630.278844] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2ae25b000018 of 39845864 bytes with 4752 elements | |
[1650465630.279017] [ndv4:55516:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x427b050 | |
[1650465630.279048] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[28]=0x427b050 using dc_mlx5/mlx5_ib4:1 on worker 0x22b38d0 | |
[1650465630.279102] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.279109] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.279159] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.279163] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.279458] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
[1650465630.280484] [ndv4:55516:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.280808] [ndv4:55516:0] ib_iface.c:994 UCX DEBUG iface=0x44571c0: created UD QP 0xdb06 on mlx5_ib4:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.281229] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.306131] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.306145] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.306171] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.306176] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.306824] [ndv4:55516:0] ib_md.c:812 UCX DEBUG registered memory 0x2ae25d693000..0x2ae25d718000 on mlx5_ib4 lkeyEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.329366] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.329371] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.329415] [ndv4:55324:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465630.332113] [ndv4:55324:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.332119] [ndv4:55324:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.332304] [ndv4:55324:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.332321] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.332326] [ndv4:55324:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.332470] [ndv4:55324:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465630.332474] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.332642] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.332646] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.332778] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.332781] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.332889] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.332892] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.333010] [ndv4:55324:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.333013] [ndv4:55324:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.333110] [ndv4:55324:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x164ad70 0x164ad70 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
DEBUG added async handler 0x2acb940 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.298633] [ndv4:54717:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465630.298639] [ndv4:54717:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465630.298812] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.298832] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.298839] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.298895] [ndv4:54717:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465630.302000] [ndv4:54717:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.302007] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.302259] [ndv4:54717:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.302434] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.302442] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.303398] [ndv4:54717:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465630.303403] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.303863] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.303867] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.316246] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.316252] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.318294] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.318299] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.320216] [ndv4:54717:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.320227] [ndv4:54717:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.320362] [ndv4:54717:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x23cec90 0x23cec90 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465630.362135] [ndv4:54717:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b7cc3f0f000 length 12288 | |
[1650465630.362204] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465630.363217] [ndv4:54717:0] mm_posix.c:305 UCX DEBUG shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=76) failed: Invalid argument | |
[1650465630.363225] [ndv4:54717:0] mm_posix.c:518 UCX DEBUG allocated posix shared memory at 0x2b7cc8423000 length 4296704 | |
[1650465630.363230] [ndv4:54717:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b7cc8423018 of 4296680 bytes with 512 elements | |
[1650465630.363487] [ndv4:54717:0] mm_iface.c:600 UCX DEBUG created mm iface 0x2b4bd10 FIFO id 0x4000000049a0888e va 0x2b7cc3f0f000 size 12288 (128 x 64 elems) | |
[1650465630.363537] [ndv4:54717:0] ucp_worker.c:1159 UCX DEBUG created interface[0]=0x2b4bd10 using posix/memory on worker 0x34258d0 | |
[1650465630.363562] [ndv4:54717:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 8447 bytes with hugetlb | |
[1650465630.363626] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool mm_recv_desc: align 64, maxelems 4294967295, elemsize 8368 | |
[1650465630.363644] [ndv4:54717:0] mm_sysv.c:94 UCX DEBUG mm failed to allocate 4292720 bytes with hugetlmlx5_ib6:1 | |
[1650465630.283851] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.283871] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.283875] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.283927] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.284420] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.284426] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.285553] [ndv4:54756:0] ib_iface.c:994 UCX DEBUG iface=0x597a0a0: created RC QP 0xdac6 on mlx5_ib6:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.286526] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[36]=0x597a0a0 using rc_verbs/mlx5_ib6:1 on worker 0x2e9a8d0 | |
[1650465630.286547] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.286553] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.286798] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.286805] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.287174] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465630.288444] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.288456] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.288459] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.288511] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.288907] [ndv4:54756:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x5c9b010 of 8176 bytes with 127 elements | |
[1650465630.289145] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.289152] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.289189] [ndv4:54756:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib6 length=2048) failed: Invalid argument | |
[1650465630.289193] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.289204] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x5acfc40 [id=127 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.289227] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 127 events 0x1 mode thread_spinlock | |
[1650465630.289236] [ndv4:54756:0] ucp_worker.c:1159 UCX DEBUG created interface[37]=0x5ac7030 using rc_mlx5/mlx5_ib6:1 on worker 0x2e9a8d0 | |
[1650465630.289313] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.289319] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.289445] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.289450] [ndv4:54756:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.300806] [ndv4:54756:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465630.302532] [ndv4:54756:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs g registration cache | |
[1650465630.321115] [ndv4:54853:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.321122] [ndv4:54853:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.321303] [ndv4:54853:0] ib_md.c:1604 UCX DEBUG mlx5_ib5: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.321366] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.321373] [ndv4:54853:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.339741] [ndv4:54853:0] topo.c:99 UCX DEBUG bus id 0x106000000 doesn't exist. sys_dev = 5 | |
[1650465630.339749] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.340720] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.340725] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.364799] [ndv4:54853:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.364807] [ndv4:54853:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
ode thread_spinlock | |
[1650465630.281613] [ndv4:55691:0] ucp_worker.c:1159 UCX DEBUG created interface[9]=0x341f5f0 using ud_verbs/mlx5_ib0:1 on worker 0x33ff770 | |
[1650465630.281629] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.281635] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.281681] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.281686] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.282184] [ndv4:55691:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib0:1 | |
[1650465630.283167] [ndv4:55691:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.283539] [ndv4:55691:0] ib_iface.c:994 UCX DEBUG iface=0x2b469f0: created UD QP 0xf4d0 on mlx5_ib0:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.283549] [ndv4:55691:0] ib_mlx5.c:568 UCX DEBUG tx wq 65536 bytes [bb=64, nwqe=1024] mmio_mode bf_post | |
[1650465630.284075] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.284229] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.284258] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.284309] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.284314] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.284694] [ndv4:55691:0] ib_md.c:812 UCX DEBUG registered memory 0x2b871fdaa000..0x2b871fe2f000 on mlx5_ib0 lkey 0x81f00 rkey 0x81f00 access 0xf flags 0x3e4 | |
[1650465630.284703] [ndv4:55691:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b871fdaa018 of 544744 bytes with 128 elements | |
[1650465630.284713] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.284771] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x2b469f0: adding gid fe80::15:5dff:fd33:fffb to hash on device mlx5_ib0 port 1 index 0) | |
[1650465630.285199] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x2b469f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 1) | |
[1650465630.285306] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x2b469f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 2) | |
[1650465630.285455] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x2b469f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 3) | |
[1650465630.285616] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x2b469f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 4) | |
[1650465630.285632] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x2b469f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 5) | |
[1650465630.285646] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x2b469f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 6) | |
[1650465630.285751] [ndv4:55691:0] ud_iface.c:393 UCX DEBUG iface 0x2b469f0: adding gid fe80:: to hash on device mlx5_ib0 port 1 index 7) | |
[1650465630.285759] [ndv4:55691:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib0:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.285856] [ndv4:55691:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.285862] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x2a94b00 [id=89 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.285887] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 89 events 0x5 mode thread_spinlock | |
[1650465630.285903] [ndv4:55691:0] ucp_worker.c:1159 UCX DEBUG crevents 0x0 mode thread_spinlock | |
[1650465630.276796] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x393f3c0 [id=97 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.276817] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 97 events 0x0 mode thread_spinlock | |
[1650465630.276832] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x393f400 [id=98 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.276848] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 98 events 0x0 mode thread_spinlock | |
[1650465630.276862] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x393f440 [id=100 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.276877] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 100 events 0x0 mode thread_spinlock | |
[1650465630.276912] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x393f480 [id=104 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.276926] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 104 events 0x0 mode thread_spinlock | |
[1650465630.276940] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x393f4c0 [id=105 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.276953] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 105 events 0x0 mode thread_spinlock | |
[1650465630.373524] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x393f500 [id=107 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.373553] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 107 events 0x0 mode thread_spinlock | |
[1650465630.373619] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x393f540 [id=111 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.373647] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 111 events 0x0 mode thread_spinlock | |
[1650465630.373668] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x393f580 [id=112 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.373717] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 112 events 0x0 mode thread_spinlock | |
[1650465630.373733] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x5bf5dd0 [id=114 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.373771] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 114 events 0x0 mode thread_spinlock | |
[1650465630.373808] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x5bf5e10 [id=118 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.373826] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 118 events 0x0 mode thread_spinlock | |
[1650465630.373843] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x5bf5e50 [id=119 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.373880] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 119 events 0x0 mode thread_spinlock | |
[1650465630.373895] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x5bf5e90 [id=121 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.373929] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 121 events 0x0 mode thread_spinlock | |
[1650465630.373966] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x5bf5ed0 [id=125 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.373985] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 125 events 0x0 mode thread_spinlock | |
[1650465630.374001] [ndv4:54965:0] async.c:228 UCX DEBUG added async handler 0x5bf5f10 [id=126 ref 1] ucp_worker_iface_async_fd_event() to hash | |
[1650465630.374018] [ndv4:54965:0] async.c:506 UCX DEBUG listening to async event fd 126 events 0x0 mode thread_spinlock | |
[1650465630.374033] [n 0x81500 rkey 0x81500 access 0xf flags 0x3e4 | |
[1650465630.306832] [ndv4:55516:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2ae25d693018 of 544744 bytes with 128 elements | |
[1650465630.306837] [ndv4:55516:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.306986] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x44571c0: adding gid fe80::15:5dff:fd33:ffff to hash on device mlx5_ib4 port 1 index 0) | |
[1650465630.307152] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x44571c0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 1) | |
[1650465630.307616] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x44571c0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 2) | |
[1650465630.308115] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x44571c0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 3) | |
[1650465630.308595] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x44571c0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 4) | |
[1650465630.308772] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x44571c0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 5) | |
[1650465630.309322] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x44571c0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 6) | |
[1650465630.374136] [ndv4:55516:0] ud_iface.c:393 UCX DEBUG iface 0x44571c0: adding gid fe80:: to hash on device mlx5_ib4 port 1 index 7) | |
[1650465630.374511] [ndv4:55516:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.374524] [ndv4:55516:0] async.c:228 UCX DEBUG added async handler 0x3f339f0 [id=116 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.374550] [ndv4:55516:0] async.c:506 UCX DEBUG listening to async event fd 116 events 0x5 mode thread_spinlock | |
[1650465630.374571] [ndv4:55516:0] ucp_worker.c:1159 UCX DEBUG created interface[29]=0x44571c0 using ud_verbs/mlx5_ib4:1 on worker 0x22b38d0 | |
[1650465630.374757] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.374764] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.374912] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: cuda GPUDirect RDMA is enabled | |
[1650465630.374918] [ndv4:55516:0] ib_md.c:296 UCX DEBUG mlx5_ib4: rocm GPUDirect RDMA is disabled | |
[1650465630.375267] [ndv4:55516:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib4:1 | |
b | |
[1650465630.363654] [ndv4:54717:0] mpool.c:205 UCX DEBUG mpool mm_recv_desc: allocated chunk 0x2b7cc883c018 of 4296680 bytes with 512 elements | |
[1650465630.364215] [ndv4:54717:0] mm_iface.c:600 UCX DEBUG created mm iface 0x2b4c2e0 FIFO id 0x720035 va 0x2b7cc3f12000 size 12288 (128 x 64 elems) | |
[1650465630.364223] [ndv4:54717:0] ucp_worker.c:1159 UCX DEBUG created interface[1]=0x2b4c2e0 using sysv/memory on worker 0x34258d0 | |
[1650465630.364236] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool self_msg_desc: align 64, maxelems 4294967295, elemsize 8200 | |
[1650465630.364239] [ndv4:54717:0] self.c:220 UCX DEBUG created self iface id 0xce7e29e3d1ed3a06 send_size 8192 | |
[1650465630.364245] [ndv4:54717:0] ucp_worker.c:1159 UCX DEBUG created interface[2]=0x2b5fec0 using self/memory0 on worker 0x34258d0 | |
[1650465630.364268] [ndv4:54717:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465630.364274] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465630.364277] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465630.367264] [ndv4:54717:0] async.c:228 UCX DEBUG added async handler 0x2b62000 [id=78 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465630.367296] [ndv4:54717:0] async.c:506 UCX DEBUG listening to async event fd 78 events 0x5 mode thread_spinlock | |
[1650465630.373550] [ndv4:54717:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x2b60820: listening for connections (fd=78) on 10.5.0.5:33858 | |
[1650465630.373689] [ndv4:54717:0] ucp_worker.c:1159 UCX DEBUG created interface[3]=0x2b60820 using tcp/eth0 on worker 0x34258d0 | |
[1650465630.373711] [ndv4:54717:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465630.373715] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465630.373719] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465630.373758] [ndv4:54717:0] async.c:228 UCX DEBUG added async handler 0x2aae5c0 [id=80 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465630.373783] [ndv4:54717:0] async.c:506 UCX DEBUG listening to async event fd 80 events 0x5 mode thread_spinlock | |
[1650465630.373787] [ndv4:54717:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x2b60ea0: listening for connections (fd=80) on 127.0.0.1:58703 | |
[1650465630.373804] [ndv4:54717:0] sock.c:88 UCX DEBUG ioctl(req=35142, ifr_name=lo) failed: Operation not supported | |
[1650465630.373810] [ndv4:54717:0] tcp_net.c:61 UCX DEBUG speed of lo is UNKNOWN, assuming 100 Mbps | |
[1650465630.373877] [ndv4:54717:0] ucp_worker.c:1159 UCX DEBUG created interface[4]=0x2b60ea0 using tcp/lo on worker 0x34258d0 | |
[1650465630.373895] [ndv4:54717:0] tcp_iface.c:547 UCX DEBUG using TCP port range: 0-0 | |
[1650465630.373898] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_tx_buf_mp: align 64, maxelems 4294967295, elemsize 8205 | |
[1650465630.373901] [ndv4:54717:0] mpool.c:88 UCX DEBUG mpool uct_tcp_iface_rx_buf_mp: align 64, maxelems 4294967295, elemsize 131090 | |
[1650465630.373938] [ndv4:54717:0] async.c:228 UCX DEBUG added async handler 0x2abc960 [id=82 ref 1] uct_tcp_iface_connect_handler() to hash | |
[1650465630.373967] [ndv4:54717:0] async.c:506 UCX DEBUG listening to async event fd 82 events 0x5 mode thread_spinlock | |
[1650465630.373971] [ndv4:54717:0] tcp_iface.c:495 UCX DEBUG tcp_iface 0x2b48530: listening for connections (fd=82) on 172.16.1.242:39096 | |
[1650465630.374283] [ndv4:54717:0] ucp_worker.c:1159 UCX DEBUG created interface[5]=0x2b48530 using tcp/ib0 on worker 0x34258d0 | |
[1650465630.374435] [ndv4:54717:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.374442] [ndv4:54717:0] 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.302547] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.302550] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.302654] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.303028] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.303035] [ndv4:54756:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.303067] [ndv4:54756:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib6 length=2048) failed: Invalid argument | |
[1650465630.303071] [ndv4:54756:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.373565] [ndv4:54756:0] async.c:228 UCX DEBUG added async handler 0x5955360 [id=129 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.373666] [ndv4:54756:0] async.c:506 UCX DEBUG listening to async event fd 129 events 0x1 mode thread_spinlock | |
[1650465630.374474] [ndv4:54756:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.306224] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.306271] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.306682] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.306688] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.307259] [ndv4:55512:0] ib_iface.c:994 UCX DEBUG iface=0x468b0a0: created RC QP 0xda7c on mlx5_ib5:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.307910] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[31]=0x468b0a0 using rc_verbs/mlx5_ib5:1 on worker 0x22c98d0 | |
[1650465630.322761] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.372951] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.373094] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.373099] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.373313] [ndv4:55512:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib5:1 | |
[1650465630.375064] [ndv4:55512:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.375082] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.375085] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.375137] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.375680] [ndv4:55512:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x49ac010 of 8176 bytes with 127 elements | |
[1650465630.376114] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.376120] [ndv4:55512:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib5:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.376151] [ndv4:55512:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib5 length=2048) failed: Invalid argument | |
[1650465630.376155] [ndv4:55512:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.376166] [ndv4:55512:0] async.c:228 UCX DEBUG added async handler 0x4666fc0 [id=120 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.376187] [ndv4:55512:0] async.c:506 UCX DEBUG listening to async event fd 120 events 0x1 mode thread_spinlock | |
[1650465630.376197] [ndv4:55512:0] ucp_worker.c:1159 UCX DEBUG created interface[32]=0x47d8030 using rc_mlx5/mlx5_ib5:1 on worker 0x22c98d0 | |
[1650465630.376505] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.376512] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.376811] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: cuda GPUDirect RDMA is enabled | |
[1650465630.376818] [ndv4:55512:0] ib_md.c:296 UCX DEBUG mlx5_ib5: rocm GPUDirect RDMA is disabled | |
[1650465630.373436] [ndv4:54715:0] async.c:228 UCX DEBUG added async handler 0x200cf40 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.373461] [ndv4:54715:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465630.373464] [ndv4:54715:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465630.373970] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.373980] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.373986] [ndv4:54715:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.374033] [ndv4:54715:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465630.375896] [ndv4:54715:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.375903] [ndv4:54715:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.376100] [ndv4:54715:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.376387] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.376394] [ndv4:54715:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
ated interface[10]=0x2b469f0 using ud_mlx5/mlx5_ib0:1 on worker 0x33ff770 | |
[1650465630.285945] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.285953] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.286041] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.286045] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.287279] [ndv4:55691:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465630.288807] [ndv4:55691:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.288826] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.288830] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.288861] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.289354] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.373599] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.375100] [ndv4:55691:0] ib_iface.c:994 UCX DEBUG iface=0x3cd0050: created RC QP 0xdaa9 on mlx5_ib1:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.376187] [ndv4:55691:0] ucp_worker.c:1159 UCX DEBUG created interface[11]=0x3cd0050 using rc_verbs/mlx5_ib1:1 on worker 0x33ff770 | |
[1650465630.376434] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.376442] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.376653] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.376661] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.386289] [ndv4:55691:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465630.387546] [ndv4:55691:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.387556] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.387559] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.387630] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.387950] [ndv4:55691:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3ce3010 of 8176 bytes with 127 elements | |
[1650465630.388129] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.388136] [ndv4:55691:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.388171] [ndv4:55691:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465630.388175] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.388185] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x2b29590 [id=92 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.388213] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 92 events 0x1 mode thread_spinlock | |
[1650465630.388223] [ndv4:55691:0] ucp_worker.c:1159 UCX DEBUG created interface[12]=0x3b51010 using rc_mlx5/mlx5_ib1:1 on worker 0x33ff770 | |
[1650465630.388886] [ndv4:55691:0] ib_md.c:296 Ul rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.309378] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x61aec40 [id=127 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.309399] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 127 events 0x1 mode thread_spinlock | |
[1650465630.309410] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[37]=0x61a6030 using rc_mlx5/mlx5_ib6:1 on worker 0x357a8c0 | |
[1650465630.309561] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.309567] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.309696] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.309702] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.310478] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465630.312188] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.312203] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.373469] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.373542] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.374267] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.374275] [ndv4:55152:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib6:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.374311] [ndv4:55152:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib6 length=2048) failed: Invalid argument | |
[1650465630.374316] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.374324] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x6034360 [id=129 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.374347] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 129 events 0x1 mode thread_spinlock | |
[1650465630.374666] [ndv4:55152:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.383207] [ndv4:55152:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x637c050: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdaf0 | |
[1650465630.383679] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.383690] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.383701] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b3b97fcc008 of 151544 bytes with 1052 elements | |
[1650465630.387840] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3bab200000..0x2b3bad800000 on mlx5_ib6 lkey 0x81100 rkey 0x81100 access 0xf flags 0x3e4 | |
[1650465630.387860] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b3bab200018 of 39845864 bytes with 4752 elements | |
[1650465630.388011] [ndv4:55152:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x637c050 | |
[1650465630.388049] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[38]=0x637c050 using dc_mlx5/mlx5_ib6:1 on worker 0x357a8c0 | |
[1650465630.388524] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.388536] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.389517] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.389524] [ndv4:55152:0] ib_md.c:296 UCX DEBUG
[1650465630.405356] [ndv4:55674:0] debug.c:1198 UCX DEBUG using signal stack 0x2b83735fe000 size 141824 | |
[1650465630.405433] [ndv4:55674:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465630.405455] [ndv4:55674:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b8373452000 | |
[1650465630.405478] [ndv4:55674:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465630.405488] [ndv4:55674:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465630.405495] [ndv4:55674:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465630.408192] [ndv4:55674:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465630.408211] [ndv4:55674:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465630.408244] [ndv4:55674:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465630.408247] [ndv4:55674:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465630.408254] [ndv4:55674:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465630.408261] [ndv4:55674:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465630.408264] [ndv4:55674:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465630.408269] [ndv4:55674:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465630.408271] [ndv4:55674:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465630.408273] [ndv4:55674:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465630.408276] [ndv4:55674:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465630.408278] [ndv4:55674:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465630.408286] [ndv4:55674:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465630.409011] [ndv4:55674:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465630.410117] [ndv4:55674:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465630.410132] [ndv4:55674:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465630.410143] [ndv4:55674:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465630.410179] [ndv4:55674:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465630.410190] [ndv4:55674:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465630.410201] [ndv4:55674:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465630.410210] [ndv4:55674:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465630.411539] [ndv4:55674:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465630.431799] [ndv4:55674:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.432297] [ndv4:55674:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
[1650465630.437835] [ndv4:55674:0] async.c:228 UCX DEBUG added async handler 0x1d3a2c0 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.437944] [ndv4:55674:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465630.437953] [ndv4:55674:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465630.438209] [
[1650465630.455310] [ndv4:55796:0] debug.c:1198 UCX DEBUG using signal stack 0x2b1095d2f000 size 141824 | |
[1650465630.455394] [ndv4:55796:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465630.455417] [ndv4:55796:0] init.c:113 UCX DEBUG /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 loaded at 0x2b1095b9d000 | |
[1650465630.455444] [ndv4:55796:0] init.c:115 UCX DEBUG cmd line: osu_scatter | |
[1650465630.455453] [ndv4:55796:0] module.c:69 UCX DEBUG ucs library path: /cvmfs/ndv4.azure/CentOS-HPC/7.9.2021052401/EasyBuild/software/UCX/1.11.2-GCCcore-11.2.0/lib64/libucs.so.0 | |
[1650465630.455460] [ndv4:55796:0] module.c:253 UCX DEBUG loading modules for ucs | |
[1650465630.458173] [ndv4:55796:0] cpu.c:231 UCX DEBUG CPU does not support invariant TSC, using fallback timer | |
[1650465630.458193] [ndv4:55796:0] time.c:22 UCX DEBUG measured arch clock speed: 1000000.00 Hz | |
[1650465630.458232] [ndv4:55796:0] ucp_context.c:1361 UCX DEBUG estimated number of endpoints is 96 | |
[1650465630.458235] [ndv4:55796:0] ucp_context.c:1368 UCX DEBUG estimated number of endpoints per node is 96 | |
[1650465630.458242] [ndv4:55796:0] ucp_context.c:1375 UCX DEBUG estimated bcopy bandwidth is 5251268608.000000 | |
[1650465630.458249] [ndv4:55796:0] ucp_context.c:1427 UCX DEBUG allocation method[0] is md 'sysv' | |
[1650465630.458251] [ndv4:55796:0] ucp_context.c:1427 UCX DEBUG allocation method[1] is md 'posix' | |
[1650465630.458256] [ndv4:55796:0] ucp_context.c:1439 UCX DEBUG allocation method[2] is 'huge' | |
[1650465630.458259] [ndv4:55796:0] ucp_context.c:1439 UCX DEBUG allocation method[3] is 'thp' | |
[1650465630.458261] [ndv4:55796:0] ucp_context.c:1427 UCX DEBUG allocation method[4] is md '*' | |
[1650465630.458263] [ndv4:55796:0] ucp_context.c:1439 UCX DEBUG allocation method[5] is 'mmap' | |
[1650465630.458265] [ndv4:55796:0] ucp_context.c:1439 UCX DEBUG allocation method[6] is 'heap' | |
[1650465630.458273] [ndv4:55796:0] module.c:253 UCX DEBUG loading modules for uct | |
[1650465630.459040] [ndv4:55796:0] module.c:253 UCX DEBUG loading modules for uct_ib | |
[1650465630.460164] [ndv4:55796:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib1) failed: Cannot assign requested address | |
[1650465630.460179] [ndv4:55796:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib2) failed: Cannot assign requested address | |
[1650465630.460192] [ndv4:55796:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib3) failed: Cannot assign requested address | |
[1650465630.460232] [ndv4:55796:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib4) failed: Cannot assign requested address | |
[1650465630.460243] [ndv4:55796:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib5) failed: Cannot assign requested address | |
[1650465630.460253] [ndv4:55796:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib6) failed: Cannot assign requested address | |
[1650465630.460264] [ndv4:55796:0] sock.c:88 UCX DEBUG ioctl(req=35093, ifr_name=ib7) failed: Cannot assign requested address | |
[1650465630.460710] [ndv4:55796:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: Success, continuing, but fork may be unsafe. | |
[1650465630.467802] [ndv4:55796:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib0 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.468133] [ndv4:55796:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib0: disable ODP because it's not supported for DevX QP | |
mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.390643] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465630.392437] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.393969] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x6558340: created UD QP 0xdad8 on mlx5_ib6:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.394662] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64, maxelems 4294967295, elemsize 4196 | |
[1650465630.394699] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.394705] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.394724] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.394729] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.395084] [ndv4:55152:0] ib_md.c:812 UCX DEBUG registered memory 0x2b3bad807000..0x2b3bad88c000 on mlx5_ib6 lkey 0x81200 rkey 0x81200 access 0xf flags 0x3e4 | |
[1650465630.395090] [ndv4:55152:0] mpool.c:205 UCX DEBUG mpool ud_recv_skb: allocated chunk 0x2b3bad807018 of 544744 bytes with 128 elements | |
[1650465630.395095] [ndv4:55152:0] mpool.c:88 UCX DEBUG mpool ud_tx_skb: align 64, maxelems 4294967295, elemsize 4168 | |
[1650465630.395542] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x6558340: adding gid fe80::15:5dff:fd34:1 to hash on device mlx5_ib6 port 1 index 0) | |
[1650465630.407754] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x6558340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 1) | |
[1650465630.407796] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x6558340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 2) | |
[1650465630.407811] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x6558340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 3) | |
[1650465630.408169] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x6558340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 4) | |
[1650465630.421657] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x6558340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 5) | |
[1650465630.421997] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x6558340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 6) | |
[1650465630.422103] [ndv4:55152:0] ud_iface.c:393 UCX DEBUG iface 0x6558340: adding gid fe80:: to hash on device mlx5_ib6 port 1 index 7) | |
[1650465630.422402] [ndv4:55152:0] timer_wheel.c:40 UCX DEBUG high res timer created log=12 resolution=4096.000000 usec wanted: 2500.000000 usec | |
[1650465630.422412] [ndv4:55152:0] async.c:228 UCX DEBUG added async handler 0x6558e90 [id=130 ref 1] uct_ud_iface_async_handler() to hash | |
[1650465630.422448] [ndv4:55152:0] async.c:506 UCX DEBUG listening to async event fd 130 events 0x5 mode thread_spinlock | |
[1650465630.422463] [ndv4:55152:0] ucp_worker.c:1159 UCX DEBUG created interface[39]=0x6558340 using ud_verbs/mlx5_ib6:1 on worker 0x357a8c0 | |
[1650465630.422520] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.422527] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.422740] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.422746] [ndv4:55152:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.423395] [ndv4:55152:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib6:1 | |
[1650465630.424731] [ndv4:55152:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.425025] [ndv4:55152:0] ib_iface.c:994 UCX DEBUG iface=0x6676460: created UD QP 0xdadb on mlx5_ibCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.388893] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.400818] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.400829] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.401979] [ndv4:55691:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465630.403845] [ndv4:55691:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.403862] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.403866] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.403949] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.404289] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 112 | |
[1650465630.404296] [ndv4:55691:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.404322] [ndv4:55691:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465630.404326] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.404334] [ndv4:55691:0] async.c:228 UCX DEBUG added async handler 0x2b2cf20 [id=94 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.404354] [ndv4:55691:0] async.c:506 UCX DEBUG listening to async event fd 94 events 0x1 mode thread_spinlock | |
[1650465630.404663] [ndv4:55691:0] dc_mlx5.c:823 UCX DEBUG creating dci pool 0 with 8 QPs | |
[1650465630.410382] [ndv4:55691:0] dc_mlx5.c:1327 UCX DEBUG dc iface 0x3e7b010: using 'dcs_quota' policy with 8 dcis and 4608 cqes, dct 0xdadd | |
[1650465630.410836] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.410844] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.410853] [ndv4:55691:0] mpool.c:205 UCX DEBUG mpool rcache_mp: allocated chunk 0x2b871fe31008 of 151544 bytes with 1052 elements | |
[1650465630.414670] [ndv4:55691:0] ib_md.c:812 UCX DEBUG registered memory 0x2b8726e00000..0x2b8729400000 on mlx5_ib1 lkey 0x81500 rkey 0x81500 access 0xf flags 0x3e4 | |
[1650465630.414688] [ndv4:55691:0] mpool.c:205 UCX DEBUG mpool rc_recv_desc: allocated chunk 0x2b8726e00018 of 39845864 bytes with 4752 elements | |
[1650465630.414860] [ndv4:55691:0] dc_mlx5.c:1346 UCX DEBUG created dc iface 0x3e7b010 | |
[1650465630.414889] [ndv4:55691:0] ucp_worker.c:1159 UCX DEBUG created interface[13]=0x3e7b010 using dc_mlx5/mlx5_ib1:1 on worker 0x33ff770 | |
[1650465630.415444] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.415451] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.415801] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.415807] [ndv4:55691:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.416006] [ndv4:55691:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465630.418121] [ndv4:55691:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 44 data_sz 4096 | |
[1650465630.418456] [ndv4:55691:0] ib_iface.c:994 UCX DEBUG iface=0x3d7a380: created UD QP 0xdabe on mlx5_ib1:1 TX wr:341 sge:5 inl:124 resp:0 RX wr:4096 sge:1 resp:0 | |
[1650465630.418910] [ndv4:55691:0] mpool.c:88 UCX DEBUG mpool ud_recv_skb: align 64,
[1650465630.373905] [ndv4:55432:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 91 data_sz 8256 | |
[1650465630.373918] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.373923] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.373944] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.374461] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.374466] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_verbs_short_desc: align 64, maxelems 4294967295, elemsize 192 | |
[1650465630.375101] [ndv4:55432:0] ib_iface.c:994 UCX DEBUG iface=0x3a49050: created RC QP 0xdaaa on mlx5_ib1:1 TX wr:409 sge:4 inl:124 resp:64 RX wr:0 sge:0 resp:64 | |
[1650465630.376196] [ndv4:55432:0] ucp_worker.c:1159 UCX DEBUG created interface[11]=0x3a49050 using rc_verbs/mlx5_ib1:1 on worker 0x3178950 | |
[1650465630.376218] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.376224] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.376469] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.376475] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.377078] [ndv4:55432:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465630.380177] [ndv4:55432:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90 data_sz 8256 | |
[1650465630.380198] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_recv_desc: align 64, maxelems 4294967295, elemsize 8356 | |
[1650465630.380203] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_send_desc: align 64, maxelems 4294967295, elemsize 8320 | |
[1650465630.380228] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool send-ops-mpool: align 64, maxelems 4294967295, elemsize 48 | |
[1650465630.381817] [ndv4:55432:0] mpool.c:205 UCX DEBUG mpool devx dbrec: allocated chunk 0x3a5c010 of 8176 bytes with 127 elements | |
[1650465630.382162] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool pending-ops: align 1, maxelems 4294967295, elemsize 64 | |
[1650465630.382171] [ndv4:55432:0] ib_mlx5.c:858 UCX DEBUG SL=0 (AR support - unknown) was selected on mlx5_ib1:1, SLs with AR support = { <none> }, SLs without AR support = { <none> } | |
[1650465630.382245] [ndv4:55432:0] rc_mlx5_common.c:727 UCX DEBUG ibv_alloc_dm(dev=mlx5_ib1 length=2048) failed: Invalid argument | |
[1650465630.382249] [ndv4:55432:0] mpool.c:88 UCX DEBUG mpool rc_mlx5_atomic_desc: align 64, maxelems 4294967295, elemsize 72 | |
[1650465630.382262] [ndv4:55432:0] async.c:228 UCX DEBUG added async handler 0x2128910 [id=92 ref 1] uct_rc_mlx5_devx_iface_event_handler() to hash | |
[1650465630.382287] [ndv4:55432:0] async.c:506 UCX DEBUG listening to async event fd 92 events 0x1 mode thread_spinlock | |
[1650465630.382309] [ndv4:55432:0] ucp_worker.c:1159 UCX DEBUG created interface[12]=0x38ca010 using rc_mlx5/mlx5_ib1:1 on worker 0x3178950 | |
[1650465630.382328] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.382334] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.382830] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.382838] [ndv4:55432:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.383007] [ndv4:55432:0] ib_iface.c:857 UCX DEBUG using pkey[0] 0x8014 on mlx5_ib1:1 | |
[1650465630.394096] [ndv4:55432:0] ib_iface.c:1469 UCX DEBUG created uct_ib_iface_t headroom_ofs 12 payload_ofs 92 hdr_ofs 90
[1650465630.380129] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x106000000 exists. sys_dev = 5 | |
[1650465630.380137] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib5 bus id 262:0:0.0 sys_dev 5 | |
[1650465630.380425] [ndv4:55039:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.390503] [ndv4:55039:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib6 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.390830] [ndv4:55039:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib6: disable ODP because it's not supported for DevX QP | |
[1650465630.391511] [ndv4:55039:0] async.c:228 UCX DEBUG added async handler 0x19d9560 [id=69 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.391537] [ndv4:55039:0] async.c:506 UCX DEBUG listening to async event fd 69 events 0x1 mode thread_spinlock | |
[1650465630.391541] [ndv4:55039:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib6' (InfiniBand channel adapter) with 1 ports | |
[1650465630.391984] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.391994] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.391999] [ndv4:55039:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.392048] [ndv4:55039:0] ib_md.c:1319 UCX DEBUG mlx5_ib6: using registration cache | |
[1650465630.397091] [ndv4:55039:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.397116] [ndv4:55039:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.397307] [ndv4:55039:0] ib_md.c:1604 UCX DEBUG mlx5_ib6: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.397474] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib6: cuda GPUDirect RDMA is enabled | |
[1650465630.397485] [ndv4:55039:0] ib_md.c:296 UCX DEBUG mlx5_ib6: rocm GPUDirect RDMA is disabled | |
[1650465630.399974] [ndv4:55039:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465630.399981] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.400693] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.400698] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.425671] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.425679] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.427312] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.427318] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.428613] [ndv4:55039:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.428617] [ndv4:55039:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.428953] [ndv4:55039:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.435239] [ndv4:55039:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.435568] [ndv4:55039:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465630.435875] [ndv4:55039:0] async.c:228 UCX DEBUG added async handler 0x20bf940 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.435905] [ndv4:55039:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465630.435909] [ndv4:55039:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465630.436058] [ndv4:55039:0] ib_md.c:296 UCX D
[1650465630.382124] [ndv4:54724:0] async.c:228 UCX DEBUG added async handler 0x27001f0 [id=54 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.382246] [ndv4:54724:0] async.c:506 UCX DEBUG listening to async event fd 54 events 0x1 mode thread_spinlock | |
[1650465630.382255] [ndv4:54724:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib0' (InfiniBand channel adapter) with 1 ports | |
[1650465630.382767] [ndv4:54724:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.382777] [ndv4:54724:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.382805] [ndv4:54724:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.382870] [ndv4:54724:0] module.c:253 UCX DEBUG loading modules for ucm | |
[1650465630.382892] [ndv4:54724:0] ib_md.c:1319 UCX DEBUG mlx5_ib0: using registration cache | |
[1650465630.385265] [ndv4:54724:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.385274] [ndv4:54724:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.385530] [ndv4:54724:0] ib_md.c:1604 UCX DEBUG mlx5_ib0: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.385891] [ndv4:54724:0] ib_md.c:296 UCX DEBUG mlx5_ib0: cuda GPUDirect RDMA is enabled | |
[1650465630.385898] [ndv4:54724:0] ib_md.c:296 UCX DEBUG mlx5_ib0: rocm GPUDirect RDMA is disabled | |
[1650465630.386007] [ndv4:54724:0] topo.c:99 UCX DEBUG bus id 0x101000000 doesn't exist. sys_dev = 0 | |
[1650465630.386011] [ndv4:54724:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.390822] [ndv4:54724:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465630.390828] [ndv4:54724:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.391210] [ndv4:54724:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465630.391214] [ndv4:54724:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.418782] [ndv4:54724:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465630.418797] [ndv4:54724:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.432140] [ndv4:54724:0] topo.c:91 UCX DEBUG bus id 0x101000000 exists. sys_dev = 0 | |
[1650465630.432160] [ndv4:54724:0] ib_device.c:1124 UCX DEBUG mlx5_ib0 bus id 257:0:0.0 sys_dev 0 | |
[1650465630.432523] [ndv4:54724:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.441107] [ndv4:54724:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib1 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.441448] [ndv4:54724:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib1: disable ODP because it's not supported for DevX QP | |
[1650465630.441850] [ndv4:54724:0] async.c:228 UCX DEBUG added async handler 0x26f8f60 [id=59 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.441878] [ndv4:54724:0] async.c:506 UCX DEBUG listening to async event fd 59 events 0x1 mode thread_spinlock | |
[1650465630.441882] [ndv4:54724:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib1' (InfiniBand channel adapter) with 1 ports | |
[1650465630.442061] [ndv4:54724:0] ib_md.c:296 UCX DEBUG mlx5_ib1: cuda GPUDirect RDMA is enabled | |
[1650465630.442070] [ndv4:54724:0] ib_md.c:296 UCX DEBUG mlx5_ib1: rocm GPUDirect RDMA is disabled | |
[1650465630.442077] [ndv4:54724:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.442132] [ndv4:54724:0] ib_md.c:1319 UCX DEBUG mlx5_ib1: using registration cache | |
[1650465630.451865] [ndv4:54724:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double>
[1650465630.384949] [ndv4:54861:0] async.c:228 UCX DEBUG added async handler 0x1fa5cd0 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.384977] [ndv4:54861:0] async.c:506 UCX DEBUG listening to async event fd 71 events 0x1 mode thread_spinlock | |
[1650465630.384981] [ndv4:54861:0] ib_device.c:655 UCX DEBUG initialized device 'mlx5_ib7' (InfiniBand channel adapter) with 1 ports | |
[1650465630.385225] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.385234] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.385239] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool rcache_mp: align 8, maxelems 4294967295, elemsize 144 | |
[1650465630.385290] [ndv4:54861:0] ib_md.c:1319 UCX DEBUG mlx5_ib7: using registration cache | |
[1650465630.388783] [ndv4:54861:0] ib_md.c:1499 UCX DEBUG incorrect format of current_link_speed file: expected: <double> GT/s, actual: Unknown speed | |
[1650465630.388790] [ndv4:54861:0] mpool.c:88 UCX DEBUG mpool devx dbrec: align 64, maxelems 4294967295, elemsize 40 | |
[1650465630.389035] [ndv4:54861:0] ib_md.c:1604 UCX DEBUG mlx5_ib7: md open by 'uct_ib_mlx5_devx_md_ops' is successful | |
[1650465630.389170] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib7: cuda GPUDirect RDMA is enabled | |
[1650465630.389176] [ndv4:54861:0] ib_md.c:296 UCX DEBUG mlx5_ib7: rocm GPUDirect RDMA is disabled | |
[1650465630.404254] [ndv4:54861:0] topo.c:99 UCX DEBUG bus id 0x108000000 doesn't exist. sys_dev = 7 | |
[1650465630.404268] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.405449] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.405453] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.416878] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.416885] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.418324] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.418328] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.418476] [ndv4:54861:0] topo.c:91 UCX DEBUG bus id 0x108000000 exists. sys_dev = 7 | |
[1650465630.418480] [ndv4:54861:0] ib_device.c:1124 UCX DEBUG mlx5_ib7 bus id 264:0:0.0 sys_dev 7 | |
[1650465630.418657] [ndv4:54861:0] ucp_context.c:1556 UCX DEBUG created ucp context 0x18a8dc0 0x18a8dc0 [13 mds 47 tls] features 0x1 tl bitmap 0x7fffffffffff 0x0 | |
[1650465630.385879] [ndv4:54715:0] topo.c:99 UCX DEBUG bus id 0x107000000 doesn't exist. sys_dev = 6 | |
[1650465630.385887] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.388464] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.388470] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.403732] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.403741] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.417243] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.417269] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.419181] [ndv4:54715:0] topo.c:91 UCX DEBUG bus id 0x107000000 exists. sys_dev = 6 | |
[1650465630.419186] [ndv4:54715:0] ib_device.c:1124 UCX DEBUG mlx5_ib6 bus id 263:0:0.0 sys_dev 6 | |
[1650465630.419522] [ndv4:54715:0] ib_md.c:1592 UCX DEBUG ibv_fork_init() failed: No such file or directory, continuing, but fork may be unsafe. | |
[1650465630.461300] [ndv4:54715:0] ib_device.c:542 UCX DEBUG PF: mlx5_ib7 vendor_id: 0x15b3 device_id: 4124 | |
[1650465630.461778] [ndv4:54715:0] ib_mlx5dv_md.c:489 UCX DEBUG mlx5_ib7: disable ODP because it's not supported for DevX QP | |
[1650465630.462498] [ndv4:54715:0] async.c:228 UCX DEBUG added async handler 0x200e820 [id=71 ref 1] uct_ib_async_event_handler() to hash | |
[1650465630.462529] [ndv4:54715:0] async.c:506 UCX |