ucx error log with UCX_POSIX_USE_PROC_LINK=n for https://github.com/openucx/ucx/issues/8511
--------------------------------------------------------------------------
WARNING: No preset parameters were found for the device that Open MPI
detected:
Local host: slurm-slehpc15-james-hpc-pg0-3
Device name: mlx5_0
Device vendor ID: 0x02c9
Device vendor part ID: 4120
Default device parameters will be used, which may result in lower
performance. You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your
device.
NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_no_device_params_found to 0.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: slurm-slehpc15-james-hpc-pg0-3
Local device: mlx5_0
--------------------------------------------------------------------------
[1665113879.711693] [slurm-slehpc15-james-hpc-pg0-11:18761:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.713112] [slurm-slehpc15-james-hpc-pg0-11:18759:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.712916] [slurm-slehpc15-james-hpc-pg0-12:19273:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.714923] [slurm-slehpc15-james-hpc-pg0-12:19271:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.715086] [slurm-slehpc15-james-hpc-pg0-12:19269:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.716591] [slurm-slehpc15-james-hpc-pg0-12:19270:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.684926] [slurm-slehpc15-james-hpc-pg0-6:19316:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.717970] [slurm-slehpc15-james-hpc-pg0-12:19274:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.718030] [slurm-slehpc15-james-hpc-pg0-12:19272:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.720572] [slurm-slehpc15-james-hpc-pg0-9:18729:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.721028] [slurm-slehpc15-james-hpc-pg0-11:18754:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.721585] [slurm-slehpc15-james-hpc-pg0-9:18733:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.689610] [slurm-slehpc15-james-hpc-pg0-6:19317:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.722893] [slurm-slehpc15-james-hpc-pg0-9:18731:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.723743] [slurm-slehpc15-james-hpc-pg0-9:18730:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.691608] [slurm-slehpc15-james-hpc-pg0-6:19311:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.724262] [slurm-slehpc15-james-hpc-pg0-11:18755:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.724615] [slurm-slehpc15-james-hpc-pg0-9:18728:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.692152] [slurm-slehpc15-james-hpc-pg0-6:19314:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.725917] [slurm-slehpc15-james-hpc-pg0-9:18732:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.727017] [slurm-slehpc15-james-hpc-pg0-10:18719:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.729457] [slurm-slehpc15-james-hpc-pg0-11:18756:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.728502] [slurm-slehpc15-james-hpc-pg0-10:18724:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.696510] [slurm-slehpc15-james-hpc-pg0-6:19315:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.732784] [slurm-slehpc15-james-hpc-pg0-11:18760:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.729356] [slurm-slehpc15-james-hpc-pg0-10:18725:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.733533] [slurm-slehpc15-james-hpc-pg0-11:18757:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.698059] [slurm-slehpc15-james-hpc-pg0-6:19312:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.703771] [slurm-slehpc15-james-hpc-pg0-6:19313:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.734967] [slurm-slehpc15-james-hpc-pg0-11:18758:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.736120] [slurm-slehpc15-james-hpc-pg0-10:18723:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.705746] [slurm-slehpc15-james-hpc-pg0-6:19310:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.739322] [slurm-slehpc15-james-hpc-pg0-5:19583:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.741348] [slurm-slehpc15-james-hpc-pg0-10:18726:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.709121] [slurm-slehpc15-james-hpc-pg0-6:19319:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.740870] [slurm-slehpc15-james-hpc-pg0-5:19589:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.741372] [slurm-slehpc15-james-hpc-pg0-10:18721:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.739706] [slurm-slehpc15-james-hpc-pg0-3:18713:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.741657] [slurm-slehpc15-james-hpc-pg0-5:19587:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.739371] [slurm-slehpc15-james-hpc-pg0-7:19001:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.740673] [slurm-slehpc15-james-hpc-pg0-3:18716:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.739981] [slurm-slehpc15-james-hpc-pg0-7:19004:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.742388] [slurm-slehpc15-james-hpc-pg0-7:19003:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.743870] [slurm-slehpc15-james-hpc-pg0-3:18718:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.711533] [slurm-slehpc15-james-hpc-pg0-6:19318:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.744320] [slurm-slehpc15-james-hpc-pg0-3:18715:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.745322] [slurm-slehpc15-james-hpc-pg0-10:18720:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.746665] [slurm-slehpc15-james-hpc-pg0-4:18729:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.748739] [slurm-slehpc15-james-hpc-pg0-10:18722:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.748524] [slurm-slehpc15-james-hpc-pg0-8:19792:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.749817] [slurm-slehpc15-james-hpc-pg0-3:18717:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.750299] [slurm-slehpc15-james-hpc-pg0-7:19000:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.752327] [slurm-slehpc15-james-hpc-pg0-5:19584:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.752313] [slurm-slehpc15-james-hpc-pg0-5:19588:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.752674] [slurm-slehpc15-james-hpc-pg0-8:19795:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.753482] [slurm-slehpc15-james-hpc-pg0-8:19797:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.755409] [slurm-slehpc15-james-hpc-pg0-8:19790:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.756347] [slurm-slehpc15-james-hpc-pg0-4:18727:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.756444] [slurm-slehpc15-james-hpc-pg0-7:19005:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.756665] [slurm-slehpc15-james-hpc-pg0-8:19796:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.759507] [slurm-slehpc15-james-hpc-pg0-5:19586:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.760004] [slurm-slehpc15-james-hpc-pg0-5:19582:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.761080] [slurm-slehpc15-james-hpc-pg0-7:19006:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.761528] [slurm-slehpc15-james-hpc-pg0-8:19794:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.762966] [slurm-slehpc15-james-hpc-pg0-7:18999:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.763407] [slurm-slehpc15-james-hpc-pg0-5:19585:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.764242] [slurm-slehpc15-james-hpc-pg0-8:19791:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.765307] [slurm-slehpc15-james-hpc-pg0-10:18718:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.765643] [slurm-slehpc15-james-hpc-pg0-7:19002:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.766105] [slurm-slehpc15-james-hpc-pg0-4:18722:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.767097] [slurm-slehpc15-james-hpc-pg0-7:19008:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.767942] [slurm-slehpc15-james-hpc-pg0-8:19793:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.768831] [slurm-slehpc15-james-hpc-pg0-4:18726:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.770177] [slurm-slehpc15-james-hpc-pg0-3:18720:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.770358] [slurm-slehpc15-james-hpc-pg0-4:18724:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.771517] [slurm-slehpc15-james-hpc-pg0-7:19007:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.772183] [slurm-slehpc15-james-hpc-pg0-4:18728:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.774067] [slurm-slehpc15-james-hpc-pg0-4:18723:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.774634] [slurm-slehpc15-james-hpc-pg0-3:18719:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.775701] [slurm-slehpc15-james-hpc-pg0-11:18764:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.780422] [slurm-slehpc15-james-hpc-pg0-4:18725:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.773918] [slurm-slehpc15-james-hpc-pg0-2:19199:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.784385] [slurm-slehpc15-james-hpc-pg0-10:18729:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.784640] [slurm-slehpc15-james-hpc-pg0-10:18727:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.775285] [slurm-slehpc15-james-hpc-pg0-2:19204:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.785162] [slurm-slehpc15-james-hpc-pg0-10:18728:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.779035] [slurm-slehpc15-james-hpc-pg0-2:19202:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.789937] [slurm-slehpc15-james-hpc-pg0-3:18721:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.780897] [slurm-slehpc15-james-hpc-pg0-2:19201:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.781301] [slurm-slehpc15-james-hpc-pg0-2:19198:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.790820] [slurm-slehpc15-james-hpc-pg0-7:18998:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.782305] [slurm-slehpc15-james-hpc-pg0-2:19203:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.792428] [slurm-slehpc15-james-hpc-pg0-4:18732:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.793528] [slurm-slehpc15-james-hpc-pg0-4:18734:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.793716] [slurm-slehpc15-james-hpc-pg0-4:18731:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.794361] [slurm-slehpc15-james-hpc-pg0-5:19581:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.794082] [slurm-slehpc15-james-hpc-pg0-12:19280:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.794201] [slurm-slehpc15-james-hpc-pg0-12:19281:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.787668] [slurm-slehpc15-james-hpc-pg0-2:19197:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.788802] [slurm-slehpc15-james-hpc-pg0-2:19200:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.798925] [slurm-slehpc15-james-hpc-pg0-4:18730:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.800271] [slurm-slehpc15-james-hpc-pg0-5:19590:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.803429] [slurm-slehpc15-james-hpc-pg0-7:19011:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.805439] [slurm-slehpc15-james-hpc-pg0-11:18753:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.805924] [slurm-slehpc15-james-hpc-pg0-11:18762:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.806201] [slurm-slehpc15-james-hpc-pg0-11:18765:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.806237] [slurm-slehpc15-james-hpc-pg0-11:18763:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.807055] [slurm-slehpc15-james-hpc-pg0-4:18736:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.809934] [slurm-slehpc15-james-hpc-pg0-9:18740:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.810328] [slurm-slehpc15-james-hpc-pg0-7:19009:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.809939] [slurm-slehpc15-james-hpc-pg0-8:19798:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.812338] [slurm-slehpc15-james-hpc-pg0-9:18739:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.813542] [slurm-slehpc15-james-hpc-pg0-9:18737:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.814819] [slurm-slehpc15-james-hpc-pg0-9:18741:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.815050] [slurm-slehpc15-james-hpc-pg0-4:18735:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.815820] [slurm-slehpc15-james-hpc-pg0-4:18733:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.819150] [slurm-slehpc15-james-hpc-pg0-8:19800:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.820715] [slurm-slehpc15-james-hpc-pg0-3:18723:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.820660] [slurm-slehpc15-james-hpc-pg0-4:18737:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.825582] [slurm-slehpc15-james-hpc-pg0-12:19278:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.827273] [slurm-slehpc15-james-hpc-pg0-12:19277:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.829257] [slurm-slehpc15-james-hpc-pg0-12:19282:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.802416] [slurm-slehpc15-james-hpc-pg0-6:19309:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.835452] [slurm-slehpc15-james-hpc-pg0-3:18726:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.836998] [slurm-slehpc15-james-hpc-pg0-3:18722:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.838184] [slurm-slehpc15-james-hpc-pg0-8:19806:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.839286] [slurm-slehpc15-james-hpc-pg0-3:18724:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.807140] [slurm-slehpc15-james-hpc-pg0-6:19324:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.808529] [slurm-slehpc15-james-hpc-pg0-6:19322:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.831700] [slurm-slehpc15-james-hpc-pg0-2:19206:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.809197] [slurm-slehpc15-james-hpc-pg0-6:19323:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.832477] [slurm-slehpc15-james-hpc-pg0-2:19196:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.842215] [slurm-slehpc15-james-hpc-pg0-3:18725:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.842220] [slurm-slehpc15-james-hpc-pg0-7:19013:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.833061] [slurm-slehpc15-james-hpc-pg0-2:19205:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.842533] [slurm-slehpc15-james-hpc-pg0-7:19012:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.842431] [slurm-slehpc15-james-hpc-pg0-8:19805:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.843075] [slurm-slehpc15-james-hpc-pg0-3:18727:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.845152] [slurm-slehpc15-james-hpc-pg0-5:19591:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.845635] [slurm-slehpc15-james-hpc-pg0-12:19283:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.848171] [slurm-slehpc15-james-hpc-pg0-9:18742:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.848944] [slurm-slehpc15-james-hpc-pg0-9:18727:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.848970] [slurm-slehpc15-james-hpc-pg0-10:18730:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.849961] [slurm-slehpc15-james-hpc-pg0-3:18729:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.850200] [slurm-slehpc15-james-hpc-pg0-9:18743:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.851232] [slurm-slehpc15-james-hpc-pg0-3:18728:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.851416] [slurm-slehpc15-james-hpc-pg0-9:18744:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.851935] [slurm-slehpc15-james-hpc-pg0-7:19010:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.851719] [slurm-slehpc15-james-hpc-pg0-8:19807:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.844598] [slurm-slehpc15-james-hpc-pg0-2:19213:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.854285] [slurm-slehpc15-james-hpc-pg0-3:18730:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.854723] [slurm-slehpc15-james-hpc-pg0-9:18736:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.855797] [slurm-slehpc15-james-hpc-pg0-8:19803:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.856640] [slurm-slehpc15-james-hpc-pg0-9:18735:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.825870] [slurm-slehpc15-james-hpc-pg0-6:19320:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.827187] [slurm-slehpc15-james-hpc-pg0-6:19325:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.860672] [slurm-slehpc15-james-hpc-pg0-7:19014:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.861275] [slurm-slehpc15-james-hpc-pg0-3:18714:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.861809] [slurm-slehpc15-james-hpc-pg0-9:18747:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.864041] [slurm-slehpc15-james-hpc-pg0-9:18746:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.864854] [slurm-slehpc15-james-hpc-pg0-3:18731:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.865244] [slurm-slehpc15-james-hpc-pg0-9:18738:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.866188] [slurm-slehpc15-james-hpc-pg0-9:18745:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.866290] [slurm-slehpc15-james-hpc-pg0-3:18732:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.866764] [slurm-slehpc15-james-hpc-pg0-9:18734:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.867732] [slurm-slehpc15-james-hpc-pg0-9:18749:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.868055] [slurm-slehpc15-james-hpc-pg0-9:18748:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.868315] [slurm-slehpc15-james-hpc-pg0-12:19279:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.870317] [slurm-slehpc15-james-hpc-pg0-3:18735:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.870683] [slurm-slehpc15-james-hpc-pg0-3:18734:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.871823] [slurm-slehpc15-james-hpc-pg0-3:18733:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.871520] [slurm-slehpc15-james-hpc-pg0-8:19801:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.873957] [slurm-slehpc15-james-hpc-pg0-8:19808:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.875449] [slurm-slehpc15-james-hpc-pg0-8:19799:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.869632] [slurm-slehpc15-james-hpc-pg0-2:19216:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.879539] [slurm-slehpc15-james-hpc-pg0-8:19789:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.875345] [slurm-slehpc15-james-hpc-pg0-2:19210:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.885645] [slurm-slehpc15-james-hpc-pg0-8:19802:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.886795] [slurm-slehpc15-james-hpc-pg0-8:19804:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.887137] [slurm-slehpc15-james-hpc-pg0-4:18721:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.887459] [slurm-slehpc15-james-hpc-pg0-8:19810:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.878445] [slurm-slehpc15-james-hpc-pg0-2:19208:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.888035] [slurm-slehpc15-james-hpc-pg0-8:19809:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.889261] [slurm-slehpc15-james-hpc-pg0-8:19812:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.880504] [slurm-slehpc15-james-hpc-pg0-2:19209:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.889807] [slurm-slehpc15-james-hpc-pg0-8:19811:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.889992] [slurm-slehpc15-james-hpc-pg0-12:19285:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.892216] [slurm-slehpc15-james-hpc-pg0-8:19813:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.894916] [slurm-slehpc15-james-hpc-pg0-10:18732:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.898511] [slurm-slehpc15-james-hpc-pg0-8:19814:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.891863] [slurm-slehpc15-james-hpc-pg0-2:19207:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.891982] [slurm-slehpc15-james-hpc-pg0-2:19215:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.892762] [slurm-slehpc15-james-hpc-pg0-2:19212:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.896266] [slurm-slehpc15-james-hpc-pg0-2:19214:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.905803] [slurm-slehpc15-james-hpc-pg0-4:18741:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.906232] [slurm-slehpc15-james-hpc-pg0-4:18738:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.908594] [slurm-slehpc15-james-hpc-pg0-7:19016:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.910603] [slurm-slehpc15-james-hpc-pg0-1:19597:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.912558] [slurm-slehpc15-james-hpc-pg0-10:18731:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.903181] [slurm-slehpc15-james-hpc-pg0-2:19218:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.903930] [slurm-slehpc15-james-hpc-pg0-2:19217:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.904531] [slurm-slehpc15-james-hpc-pg0-2:19211:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.883675] [slurm-slehpc15-james-hpc-pg0-6:19328:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.907398] [slurm-slehpc15-james-hpc-pg0-2:19219:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.884270] [slurm-slehpc15-james-hpc-pg0-6:19321:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.885237] [slurm-slehpc15-james-hpc-pg0-6:19326:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.885505] [slurm-slehpc15-james-hpc-pg0-6:19327:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.917997] [slurm-slehpc15-james-hpc-pg0-4:18740:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.919242] [slurm-slehpc15-james-hpc-pg0-4:18739:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.919595] [slurm-slehpc15-james-hpc-pg0-4:18743:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.887006] [slurm-slehpc15-james-hpc-pg0-6:19329:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.921378] [slurm-slehpc15-james-hpc-pg0-12:19276:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.924963] [slurm-slehpc15-james-hpc-pg0-12:19268:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.924967] [slurm-slehpc15-james-hpc-pg0-12:19286:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.927461] [slurm-slehpc15-james-hpc-pg0-7:19015:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.933493] [slurm-slehpc15-james-hpc-pg0-7:19018:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.933833] [slurm-slehpc15-james-hpc-pg0-7:19017:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.940891] [slurm-slehpc15-james-hpc-pg0-7:19019:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.945350] [slurm-slehpc15-james-hpc-pg0-4:18742:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.948972] [slurm-slehpc15-james-hpc-pg0-12:19284:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.950600] [slurm-slehpc15-james-hpc-pg0-12:19287:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.966594] [slurm-slehpc15-james-hpc-pg0-11:18766:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.972510] [slurm-slehpc15-james-hpc-pg0-5:19593:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.940302] [slurm-slehpc15-james-hpc-pg0-6:19333:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.940662] [slurm-slehpc15-james-hpc-pg0-6:19335:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.941128] [slurm-slehpc15-james-hpc-pg0-6:19332:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.941643] [slurm-slehpc15-james-hpc-pg0-6:19330:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.942256] [slurm-slehpc15-james-hpc-pg0-6:19331:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.967292] [slurm-slehpc15-james-hpc-pg0-2:19221:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.976083] [slurm-slehpc15-james-hpc-pg0-12:19292:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.976787] [slurm-slehpc15-james-hpc-pg0-4:18745:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.976922] [slurm-slehpc15-james-hpc-pg0-12:19289:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.977896] [slurm-slehpc15-james-hpc-pg0-4:18744:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.977442] [slurm-slehpc15-james-hpc-pg0-12:19291:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.977790] [slurm-slehpc15-james-hpc-pg0-12:19290:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.946187] [slurm-slehpc15-james-hpc-pg0-6:19334:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.982343] [slurm-slehpc15-james-hpc-pg0-12:19288:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113879.992583] [slurm-slehpc15-james-hpc-pg0-8:19815:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.002677] [slurm-slehpc15-james-hpc-pg0-7:19022:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.004258] [slurm-slehpc15-james-hpc-pg0-7:19024:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.005435] [slurm-slehpc15-james-hpc-pg0-7:19021:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.008764] [slurm-slehpc15-james-hpc-pg0-7:19023:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.009562] [slurm-slehpc15-james-hpc-pg0-7:19020:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.020514] [slurm-slehpc15-james-hpc-pg0-3:18740:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.034610] [slurm-slehpc15-james-hpc-pg0-12:19275:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.037717] [slurm-slehpc15-james-hpc-pg0-9:18753:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.041406] [slurm-slehpc15-james-hpc-pg0-5:19594:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.056608] [slurm-slehpc15-james-hpc-pg0-3:18737:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.053219] [slurm-slehpc15-james-hpc-pg0-2:19222:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.069522] [slurm-slehpc15-james-hpc-pg0-8:19822:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.039127] [slurm-slehpc15-james-hpc-pg0-6:19343:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.072285] [slurm-slehpc15-james-hpc-pg0-8:19816:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.041012] [slurm-slehpc15-james-hpc-pg0-6:19336:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.074197] [slurm-slehpc15-james-hpc-pg0-8:19828:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.074753] [slurm-slehpc15-james-hpc-pg0-8:19820:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.042731] [slurm-slehpc15-james-hpc-pg0-6:19342:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.079776] [slurm-slehpc15-james-hpc-pg0-8:19817:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.080575] [slurm-slehpc15-james-hpc-pg0-5:19592:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.080603] [slurm-slehpc15-james-hpc-pg0-5:19595:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.082907] [slurm-slehpc15-james-hpc-pg0-11:18769:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.083803] [slurm-slehpc15-james-hpc-pg0-3:18736:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.084812] [slurm-slehpc15-james-hpc-pg0-8:19826:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.086382] [slurm-slehpc15-james-hpc-pg0-3:18739:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.087069] [slurm-slehpc15-james-hpc-pg0-8:19819:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.088152] [slurm-slehpc15-james-hpc-pg0-7:19030:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.088638] [slurm-slehpc15-james-hpc-pg0-11:18770:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.088701] [slurm-slehpc15-james-hpc-pg0-8:19830:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.091015] [slurm-slehpc15-james-hpc-pg0-8:19821:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.061157] [slurm-slehpc15-james-hpc-pg0-6:19350:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.093878] [slurm-slehpc15-james-hpc-pg0-8:19818:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.095708] [slurm-slehpc15-james-hpc-pg0-8:19827:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.096345] [slurm-slehpc15-james-hpc-pg0-4:18751:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.063848] [slurm-slehpc15-james-hpc-pg0-6:19340:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.096246] [slurm-slehpc15-james-hpc-pg0-8:19829:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.096789] [slurm-slehpc15-james-hpc-pg0-8:19823:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.097344] [slurm-slehpc15-james-hpc-pg0-4:18748:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.097313] [slurm-slehpc15-james-hpc-pg0-8:19824:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.097841] [slurm-slehpc15-james-hpc-pg0-8:19832:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.098394] [slurm-slehpc15-james-hpc-pg0-4:18746:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.098364] [slurm-slehpc15-james-hpc-pg0-8:19831:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.098763] [slurm-slehpc15-james-hpc-pg0-8:19825:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.099356] [slurm-slehpc15-james-hpc-pg0-4:18747:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.090031] [slurm-slehpc15-james-hpc-pg0-2:19234:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.099157] [slurm-slehpc15-james-hpc-pg0-12:19295:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.100997] [slurm-slehpc15-james-hpc-pg0-12:19294:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.092664] [slurm-slehpc15-james-hpc-pg0-2:19232:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.069626] [slurm-slehpc15-james-hpc-pg0-6:19346:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.070664] [slurm-slehpc15-james-hpc-pg0-6:19337:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.095524] [slurm-slehpc15-james-hpc-pg0-2:19223:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.105128] [slurm-slehpc15-james-hpc-pg0-12:19293:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.073296] [slurm-slehpc15-james-hpc-pg0-6:19344:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.105853] [slurm-slehpc15-james-hpc-pg0-11:18768:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.107373] [slurm-slehpc15-james-hpc-pg0-11:18771:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.098305] [slurm-slehpc15-james-hpc-pg0-2:19220:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.075892] [slurm-slehpc15-james-hpc-pg0-6:19339:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.108770] [slurm-slehpc15-james-hpc-pg0-11:18767:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.110677] [slurm-slehpc15-james-hpc-pg0-9:18755:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.078835] [slurm-slehpc15-james-hpc-pg0-6:19338:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.102599] [slurm-slehpc15-james-hpc-pg0-2:19238:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.080705] [slurm-slehpc15-james-hpc-pg0-6:19351:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.104009] [slurm-slehpc15-james-hpc-pg0-2:19224:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.082737] [slurm-slehpc15-james-hpc-pg0-6:19352:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.115480] [slurm-slehpc15-james-hpc-pg0-4:18752:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.116151] [slurm-slehpc15-james-hpc-pg0-3:18743:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.115847] [slurm-slehpc15-james-hpc-pg0-12:19297:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.084186] [slurm-slehpc15-james-hpc-pg0-6:19345:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.116615] [slurm-slehpc15-james-hpc-pg0-11:18772:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.107520] [slurm-slehpc15-james-hpc-pg0-2:19227:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.117751] [slurm-slehpc15-james-hpc-pg0-4:18750:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.085637] [slurm-slehpc15-james-hpc-pg0-6:19349:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.118898] [slurm-slehpc15-james-hpc-pg0-3:18742:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.086719] [slurm-slehpc15-james-hpc-pg0-6:19341:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.121832] [slurm-slehpc15-james-hpc-pg0-9:18760:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.119091] [slurm-slehpc15-james-hpc-pg0-12:19302:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.120371] [slurm-slehpc15-james-hpc-pg0-4:18749:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.110447] [slurm-slehpc15-james-hpc-pg0-2:19237:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.121329] [slurm-slehpc15-james-hpc-pg0-7:19028:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.123176] [slurm-slehpc15-james-hpc-pg0-3:18747:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.087413] [slurm-slehpc15-james-hpc-pg0-6:19348:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.121923] [slurm-slehpc15-james-hpc-pg0-11:18774:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.122314] [slurm-slehpc15-james-hpc-pg0-9:18759:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.122309] [slurm-slehpc15-james-hpc-pg0-4:18755:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.112538] [slurm-slehpc15-james-hpc-pg0-2:19229:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.123487] [slurm-slehpc15-james-hpc-pg0-7:19038:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.125454] [slurm-slehpc15-james-hpc-pg0-3:18754:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.087909] [slurm-slehpc15-james-hpc-pg0-6:19347:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.124545] [slurm-slehpc15-james-hpc-pg0-4:18761:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.115837] [slurm-slehpc15-james-hpc-pg0-2:19228:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.124019] [slurm-slehpc15-james-hpc-pg0-7:19029:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.125217] [slurm-slehpc15-james-hpc-pg0-4:18764:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.124604] [slurm-slehpc15-james-hpc-pg0-7:19031:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.125038] [slurm-slehpc15-james-hpc-pg0-7:19037:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.126076] [slurm-slehpc15-james-hpc-pg0-7:19025:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.126914] [slurm-slehpc15-james-hpc-pg0-4:18759:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.126710] [slurm-slehpc15-james-hpc-pg0-12:19296:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.118167] [slurm-slehpc15-james-hpc-pg0-2:19236:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.128290] [slurm-slehpc15-james-hpc-pg0-9:18752:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.128363] [slurm-slehpc15-james-hpc-pg0-7:19040:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.129067] [slurm-slehpc15-james-hpc-pg0-4:18758:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.129187] [slurm-slehpc15-james-hpc-pg0-9:18751:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.129485] [slurm-slehpc15-james-hpc-pg0-12:19303:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.120657] [slurm-slehpc15-james-hpc-pg0-2:19233:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.121242] [slurm-slehpc15-james-hpc-pg0-2:19225:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.122048] [slurm-slehpc15-james-hpc-pg0-2:19231:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.131457] [slurm-slehpc15-james-hpc-pg0-4:18760:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.122664] [slurm-slehpc15-james-hpc-pg0-2:19230:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.132659] [slurm-slehpc15-james-hpc-pg0-9:18750:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.123366] [slurm-slehpc15-james-hpc-pg0-2:19235:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.132934] [slurm-slehpc15-james-hpc-pg0-4:18757:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.123810] [slurm-slehpc15-james-hpc-pg0-2:19239:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.124331] [slurm-slehpc15-james-hpc-pg0-2:19226:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.134081] [slurm-slehpc15-james-hpc-pg0-4:18763:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.134829] [slurm-slehpc15-james-hpc-pg0-3:18741:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.134933] [slurm-slehpc15-james-hpc-pg0-9:18764:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.134875] [slurm-slehpc15-james-hpc-pg0-4:18762:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.135448] [slurm-slehpc15-james-hpc-pg0-4:18756:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.136001] [slurm-slehpc15-james-hpc-pg0-4:18754:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.136296] [slurm-slehpc15-james-hpc-pg0-9:18765:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.136489] [slurm-slehpc15-james-hpc-pg0-4:18753:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.137802] [slurm-slehpc15-james-hpc-pg0-3:18748:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.139597] [slurm-slehpc15-james-hpc-pg0-3:18749:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.140308] [slurm-slehpc15-james-hpc-pg0-3:18745:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.140445] [slurm-slehpc15-james-hpc-pg0-10:18733:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.140635] [slurm-slehpc15-james-hpc-pg0-7:19036:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.141111] [slurm-slehpc15-james-hpc-pg0-10:18734:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.141756] [slurm-slehpc15-james-hpc-pg0-3:18752:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.142528] [slurm-slehpc15-james-hpc-pg0-7:19039:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.143817] [slurm-slehpc15-james-hpc-pg0-3:18746:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.143926] [slurm-slehpc15-james-hpc-pg0-7:19027:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.144244] [slurm-slehpc15-james-hpc-pg0-7:19041:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.144692] [slurm-slehpc15-james-hpc-pg0-9:18754:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.144990] [slurm-slehpc15-james-hpc-pg0-7:19034:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.145457] [slurm-slehpc15-james-hpc-pg0-9:18767:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.145574] [slurm-slehpc15-james-hpc-pg0-7:19032:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.146061] [slurm-slehpc15-james-hpc-pg0-7:19035:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.146273] [slurm-slehpc15-james-hpc-pg0-3:18753:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.146596] [slurm-slehpc15-james-hpc-pg0-9:18770:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.146670] [slurm-slehpc15-james-hpc-pg0-7:19026:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.146893] [slurm-slehpc15-james-hpc-pg0-3:18756:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.147172] [slurm-slehpc15-james-hpc-pg0-7:19033:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.147612] [slurm-slehpc15-james-hpc-pg0-3:18744:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.147666] [slurm-slehpc15-james-hpc-pg0-9:18769:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.148394] [slurm-slehpc15-james-hpc-pg0-3:18751:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.148729] [slurm-slehpc15-james-hpc-pg0-3:18755:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.149257] [slurm-slehpc15-james-hpc-pg0-3:18738:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.149559] [slurm-slehpc15-james-hpc-pg0-3:18750:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.151318] [slurm-slehpc15-james-hpc-pg0-9:18758:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.152345] [slurm-slehpc15-james-hpc-pg0-9:18768:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.152615] [slurm-slehpc15-james-hpc-pg0-9:18757:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.153508] [slurm-slehpc15-james-hpc-pg0-9:18761:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.154517] [slurm-slehpc15-james-hpc-pg0-9:18756:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.154846] [slurm-slehpc15-james-hpc-pg0-9:18763:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.155138] [slurm-slehpc15-james-hpc-pg0-9:18766:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.157418] [slurm-slehpc15-james-hpc-pg0-9:18762:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.156945] [slurm-slehpc15-james-hpc-pg0-12:19298:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.159299] [slurm-slehpc15-james-hpc-pg0-12:19309:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.160174] [slurm-slehpc15-james-hpc-pg0-12:19306:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.162479] [slurm-slehpc15-james-hpc-pg0-12:19300:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.166235] [slurm-slehpc15-james-hpc-pg0-12:19311:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.170204] [slurm-slehpc15-james-hpc-pg0-12:19308:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.171480] [slurm-slehpc15-james-hpc-pg0-12:19299:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.172289] [slurm-slehpc15-james-hpc-pg0-12:19310:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.172832] [slurm-slehpc15-james-hpc-pg0-12:19307:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.173393] [slurm-slehpc15-james-hpc-pg0-12:19305:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.174030] [slurm-slehpc15-james-hpc-pg0-12:19304:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.174856] [slurm-slehpc15-james-hpc-pg0-12:19301:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.178597] [slurm-slehpc15-james-hpc-pg0-5:19597:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.179199] [slurm-slehpc15-james-hpc-pg0-5:19596:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.192091] [slurm-slehpc15-james-hpc-pg0-11:18780:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.198925] [slurm-slehpc15-james-hpc-pg0-11:18777:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.203809] [slurm-slehpc15-james-hpc-pg0-11:18781:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.213927] [slurm-slehpc15-james-hpc-pg0-11:18776:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.218085] [slurm-slehpc15-james-hpc-pg0-11:18778:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.221881] [slurm-slehpc15-james-hpc-pg0-11:18773:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.225329] [slurm-slehpc15-james-hpc-pg0-5:19598:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.225600] [slurm-slehpc15-james-hpc-pg0-10:18736:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.231902] [slurm-slehpc15-james-hpc-pg0-10:18738:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.234880] [slurm-slehpc15-james-hpc-pg0-11:18782:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.235410] [slurm-slehpc15-james-hpc-pg0-10:18735:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.237769] [slurm-slehpc15-james-hpc-pg0-10:18743:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.240100] [slurm-slehpc15-james-hpc-pg0-11:18790:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.242214] [slurm-slehpc15-james-hpc-pg0-11:18775:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.244336] [slurm-slehpc15-james-hpc-pg0-11:18779:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.251219] [slurm-slehpc15-james-hpc-pg0-1:19581:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.255719] [slurm-slehpc15-james-hpc-pg0-10:18741:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.257092] [slurm-slehpc15-james-hpc-pg0-10:18744:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.256712] [slurm-slehpc15-james-hpc-pg0-11:18787:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.257227] [slurm-slehpc15-james-hpc-pg0-11:18789:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.258791] [slurm-slehpc15-james-hpc-pg0-11:18794:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.259672] [slurm-slehpc15-james-hpc-pg0-10:18737:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.261322] [slurm-slehpc15-james-hpc-pg0-11:18784:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.261924] [slurm-slehpc15-james-hpc-pg0-10:18740:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.264870] [slurm-slehpc15-james-hpc-pg0-5:19602:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.267149] [slurm-slehpc15-james-hpc-pg0-11:18786:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.267631] [slurm-slehpc15-james-hpc-pg0-10:18742:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.267728] [slurm-slehpc15-james-hpc-pg0-11:18792:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.270830] [slurm-slehpc15-james-hpc-pg0-11:18796:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.271372] [slurm-slehpc15-james-hpc-pg0-10:18753:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.273389] [slurm-slehpc15-james-hpc-pg0-11:18788:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.274106] [slurm-slehpc15-james-hpc-pg0-5:19599:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.274075] [slurm-slehpc15-james-hpc-pg0-11:18795:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.275001] [slurm-slehpc15-james-hpc-pg0-10:18746:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.274834] [slurm-slehpc15-james-hpc-pg0-11:18793:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.275527] [slurm-slehpc15-james-hpc-pg0-11:18791:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.276240] [slurm-slehpc15-james-hpc-pg0-11:18783:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.277245] [slurm-slehpc15-james-hpc-pg0-11:18785:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.278557] [slurm-slehpc15-james-hpc-pg0-5:19601:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.281257] [slurm-slehpc15-james-hpc-pg0-5:19600:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.282897] [slurm-slehpc15-james-hpc-pg0-5:19609:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.286757] [slurm-slehpc15-james-hpc-pg0-10:18739:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.289815] [slurm-slehpc15-james-hpc-pg0-10:18759:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.289958] [slurm-slehpc15-james-hpc-pg0-5:19603:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.290524] [slurm-slehpc15-james-hpc-pg0-10:18756:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.291758] [slurm-slehpc15-james-hpc-pg0-5:19604:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.298069] [slurm-slehpc15-james-hpc-pg0-10:18749:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.299558] [slurm-slehpc15-james-hpc-pg0-10:18757:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.300102] [slurm-slehpc15-james-hpc-pg0-10:18747:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.304372] [slurm-slehpc15-james-hpc-pg0-10:18754:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.305833] [slurm-slehpc15-james-hpc-pg0-5:19607:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.306775] [slurm-slehpc15-james-hpc-pg0-5:19611:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.308042] [slurm-slehpc15-james-hpc-pg0-10:18758:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.311813] [slurm-slehpc15-james-hpc-pg0-10:18760:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.312666] [slurm-slehpc15-james-hpc-pg0-10:18751:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.313451] [slurm-slehpc15-james-hpc-pg0-10:18752:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.314476] [slurm-slehpc15-james-hpc-pg0-10:18761:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.314784] [slurm-slehpc15-james-hpc-pg0-5:19610:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.316188] [slurm-slehpc15-james-hpc-pg0-10:18755:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.319197] [slurm-slehpc15-james-hpc-pg0-10:18745:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.320134] [slurm-slehpc15-james-hpc-pg0-10:18748:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.320292] [slurm-slehpc15-james-hpc-pg0-5:19605:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.321805] [slurm-slehpc15-james-hpc-pg0-10:18750:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.324342] [slurm-slehpc15-james-hpc-pg0-5:19612:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.326452] [slurm-slehpc15-james-hpc-pg0-5:19608:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.329311] [slurm-slehpc15-james-hpc-pg0-5:19606:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.333021] [slurm-slehpc15-james-hpc-pg0-5:19613:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.338318] [slurm-slehpc15-james-hpc-pg0-5:19621:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.338737] [slurm-slehpc15-james-hpc-pg0-5:19619:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.341938] [slurm-slehpc15-james-hpc-pg0-5:19618:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.342777] [slurm-slehpc15-james-hpc-pg0-5:19614:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.344331] [slurm-slehpc15-james-hpc-pg0-5:19616:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.346456] [slurm-slehpc15-james-hpc-pg0-5:19623:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.348543] [slurm-slehpc15-james-hpc-pg0-5:19615:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.350274] [slurm-slehpc15-james-hpc-pg0-5:19624:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.350961] [slurm-slehpc15-james-hpc-pg0-5:19617:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.356654] [slurm-slehpc15-james-hpc-pg0-5:19620:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.384363] [slurm-slehpc15-james-hpc-pg0-5:19622:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.563448] [slurm-slehpc15-james-hpc-pg0-1:19582:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.699814] [slurm-slehpc15-james-hpc-pg0-1:19585:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.712001] [slurm-slehpc15-james-hpc-pg0-1:19592:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.734386] [slurm-slehpc15-james-hpc-pg0-1:19604:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.735417] [slurm-slehpc15-james-hpc-pg0-1:19588:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.736451] [slurm-slehpc15-james-hpc-pg0-1:19598:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.740331] [slurm-slehpc15-james-hpc-pg0-1:19594:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.741306] [slurm-slehpc15-james-hpc-pg0-1:19610:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.744685] [slurm-slehpc15-james-hpc-pg0-1:19578:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.746980] [slurm-slehpc15-james-hpc-pg0-1:19571:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.747790] [slurm-slehpc15-james-hpc-pg0-1:19570:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.774834] [slurm-slehpc15-james-hpc-pg0-1:19607:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.775937] [slurm-slehpc15-james-hpc-pg0-1:19601:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.778304] [slurm-slehpc15-james-hpc-pg0-1:19600:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.779278] [slurm-slehpc15-james-hpc-pg0-1:19609:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.779669] [slurm-slehpc15-james-hpc-pg0-1:19580:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.783259] [slurm-slehpc15-james-hpc-pg0-1:19602:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.792719] [slurm-slehpc15-james-hpc-pg0-1:19595:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.792902] [slurm-slehpc15-james-hpc-pg0-1:19605:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.793756] [slurm-slehpc15-james-hpc-pg0-1:19606:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.803893] [slurm-slehpc15-james-hpc-pg0-1:19586:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.804456] [slurm-slehpc15-james-hpc-pg0-1:19587:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.805128] [slurm-slehpc15-james-hpc-pg0-1:19591:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.806193] [slurm-slehpc15-james-hpc-pg0-1:19613:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.806516] [slurm-slehpc15-james-hpc-pg0-1:19572:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.807401] [slurm-slehpc15-james-hpc-pg0-1:19593:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.808578] [slurm-slehpc15-james-hpc-pg0-1:19579:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.810720] [slurm-slehpc15-james-hpc-pg0-1:19576:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.811578] [slurm-slehpc15-james-hpc-pg0-1:19573:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.812734] [slurm-slehpc15-james-hpc-pg0-1:19590:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.816201] [slurm-slehpc15-james-hpc-pg0-1:19611:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.816933] [slurm-slehpc15-james-hpc-pg0-1:19584:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.817484] [slurm-slehpc15-james-hpc-pg0-1:19577:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.818092] [slurm-slehpc15-james-hpc-pg0-1:19589:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.818661] [slurm-slehpc15-james-hpc-pg0-1:19612:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.819113] [slurm-slehpc15-james-hpc-pg0-1:19596:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.819601] [slurm-slehpc15-james-hpc-pg0-1:19599:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.820080] [slurm-slehpc15-james-hpc-pg0-1:19574:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.820489] [slurm-slehpc15-james-hpc-pg0-1:19575:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.820877] [slurm-slehpc15-james-hpc-pg0-1:19603:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.821503] [slurm-slehpc15-james-hpc-pg0-1:19608:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.822151] [slurm-slehpc15-james-hpc-pg0-1:19583:0] parser.c:1895 UCX INFO UCX_* env variables: UCX_LOG_LEVEL=info UCX_POSIX_USE_PROC_LINK=n | |
[1665113880.831167] [slurm-slehpc15-james-hpc-pg0-1:19581:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831163] [slurm-slehpc15-james-hpc-pg0-1:19582:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831265] [slurm-slehpc15-james-hpc-pg0-1:19582:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831211] [slurm-slehpc15-james-hpc-pg0-1:19604:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831163] [slurm-slehpc15-james-hpc-pg0-1:19585:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831264] [slurm-slehpc15-james-hpc-pg0-1:19585:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831219] [slurm-slehpc15-james-hpc-pg0-1:19598:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831308] [slurm-slehpc15-james-hpc-pg0-1:19598:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831190] [slurm-slehpc15-james-hpc-pg0-1:19588:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831295] [slurm-slehpc15-james-hpc-pg0-1:19588:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831215] [slurm-slehpc15-james-hpc-pg0-1:19592:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831309] [slurm-slehpc15-james-hpc-pg0-1:19592:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831249] [slurm-slehpc15-james-hpc-pg0-1:19570:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831211] [slurm-slehpc15-james-hpc-pg0-1:19594:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831310] [slurm-slehpc15-james-hpc-pg0-1:19594:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831262] [slurm-slehpc15-james-hpc-pg0-1:19571:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831347] [slurm-slehpc15-james-hpc-pg0-1:19571:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831230] [slurm-slehpc15-james-hpc-pg0-1:19578:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831326] [slurm-slehpc15-james-hpc-pg0-1:19578:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831273] [slurm-slehpc15-james-hpc-pg0-1:19610:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831313] [slurm-slehpc15-james-hpc-pg0-1:19609:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831211] [slurm-slehpc15-james-hpc-pg0-1:19597:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831308] [slurm-slehpc15-james-hpc-pg0-1:19604:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831311] [slurm-slehpc15-james-hpc-pg0-1:19602:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831320] [slurm-slehpc15-james-hpc-pg0-1:19601:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831299] [slurm-slehpc15-james-hpc-pg0-1:19580:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831313] [slurm-slehpc15-james-hpc-pg0-1:19607:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831321] [slurm-slehpc15-james-hpc-pg0-1:19595:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831338] [slurm-slehpc15-james-hpc-pg0-1:19570:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831383] [slurm-slehpc15-james-hpc-pg0-1:19579:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831384] [slurm-slehpc15-james-hpc-pg0-1:19610:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831395] [slurm-slehpc15-james-hpc-pg0-1:19609:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831346] [slurm-slehpc15-james-hpc-pg0-1:19581:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831399] [slurm-slehpc15-james-hpc-pg0-1:19587:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831399] [slurm-slehpc15-james-hpc-pg0-1:19602:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831401] [slurm-slehpc15-james-hpc-pg0-1:19601:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831390] [slurm-slehpc15-james-hpc-pg0-1:19580:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831398] [slurm-slehpc15-james-hpc-pg0-1:19607:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831377] [slurm-slehpc15-james-hpc-pg0-1:19600:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831403] [slurm-slehpc15-james-hpc-pg0-1:19595:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831387] [slurm-slehpc15-james-hpc-pg0-1:19591:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831415] [slurm-slehpc15-james-hpc-pg0-1:19605:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831467] [slurm-slehpc15-james-hpc-pg0-1:19600:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831440] [slurm-slehpc15-james-hpc-pg0-1:19590:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831471] [slurm-slehpc15-james-hpc-pg0-1:19591:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831458] [slurm-slehpc15-james-hpc-pg0-1:19576:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831471] [slurm-slehpc15-james-hpc-pg0-1:19579:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831490] [slurm-slehpc15-james-hpc-pg0-1:19606:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831485] [slurm-slehpc15-james-hpc-pg0-1:19586:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831453] [slurm-slehpc15-james-hpc-pg0-1:19572:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831488] [slurm-slehpc15-james-hpc-pg0-1:19587:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831504] [slurm-slehpc15-james-hpc-pg0-1:19605:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831501] [slurm-slehpc15-james-hpc-pg0-1:19589:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831548] [slurm-slehpc15-james-hpc-pg0-1:19574:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831633] [slurm-slehpc15-james-hpc-pg0-1:19574:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831546] [slurm-slehpc15-james-hpc-pg0-1:19584:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831632] [slurm-slehpc15-james-hpc-pg0-1:19584:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831550] [slurm-slehpc15-james-hpc-pg0-1:19593:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831543] [slurm-slehpc15-james-hpc-pg0-1:19576:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831579] [slurm-slehpc15-james-hpc-pg0-1:19606:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831580] [slurm-slehpc15-james-hpc-pg0-1:19577:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831656] [slurm-slehpc15-james-hpc-pg0-1:19577:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831570] [slurm-slehpc15-james-hpc-pg0-1:19586:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831591] [slurm-slehpc15-james-hpc-pg0-1:19612:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831672] [slurm-slehpc15-james-hpc-pg0-1:19612:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831541] [slurm-slehpc15-james-hpc-pg0-1:19572:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831525] [slurm-slehpc15-james-hpc-pg0-1:19590:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831615] [slurm-slehpc15-james-hpc-pg0-1:19603:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831587] [slurm-slehpc15-james-hpc-pg0-1:19589:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831631] [slurm-slehpc15-james-hpc-pg0-1:19613:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831715] [slurm-slehpc15-james-hpc-pg0-1:19613:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831594] [slurm-slehpc15-james-hpc-pg0-1:19599:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831674] [slurm-slehpc15-james-hpc-pg0-1:19599:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831702] [slurm-slehpc15-james-hpc-pg0-1:19603:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831672] [slurm-slehpc15-james-hpc-pg0-1:19593:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831585] [slurm-slehpc15-james-hpc-pg0-1:19583:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831664] [slurm-slehpc15-james-hpc-pg0-1:19583:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831681] [slurm-slehpc15-james-hpc-pg0-1:19608:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831588] [slurm-slehpc15-james-hpc-pg0-1:19611:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831674] [slurm-slehpc15-james-hpc-pg0-1:19611:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831655] [slurm-slehpc15-james-hpc-pg0-1:19597:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831619] [slurm-slehpc15-james-hpc-pg0-1:19575:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831706] [slurm-slehpc15-james-hpc-pg0-1:19575:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831767] [slurm-slehpc15-james-hpc-pg0-1:19608:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.831935] [slurm-slehpc15-james-hpc-pg0-1:19596:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832038] [slurm-slehpc15-james-hpc-pg0-1:19596:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832785] [slurm-slehpc15-james-hpc-pg0-1:19573:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832879] [slurm-slehpc15-james-hpc-pg0-1:19573:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800142] [slurm-slehpc15-james-hpc-pg0-6:19317:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800142] [slurm-slehpc15-james-hpc-pg0-6:19311:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800247] [slurm-slehpc15-james-hpc-pg0-6:19311:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832772] [slurm-slehpc15-james-hpc-pg0-5:19586:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800142] [slurm-slehpc15-james-hpc-pg0-6:19316:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800247] [slurm-slehpc15-james-hpc-pg0-6:19316:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-2:19210] [[29275,1],58] selected pml cm, but peer [[29275,1],0] on slurm-slehpc15-james-hpc-pg0-1 selected pml ucx | |
[1665113880.823776] [slurm-slehpc15-james-hpc-pg0-2:19203:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832621] [slurm-slehpc15-james-hpc-pg0-8:19794:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832729] [slurm-slehpc15-james-hpc-pg0-8:19794:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-4:18734] [[29275,1],145] selected pml cm, but peer [[29275,1],0] on slurm-slehpc15-james-hpc-pg0-1 selected pml ucx | |
[1665113880.800223] [slurm-slehpc15-james-hpc-pg0-6:19313:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800347] [slurm-slehpc15-james-hpc-pg0-6:19313:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833193] [slurm-slehpc15-james-hpc-pg0-4:18727:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832884] [slurm-slehpc15-james-hpc-pg0-11:18759:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833008] [slurm-slehpc15-james-hpc-pg0-11:18759:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832771] [slurm-slehpc15-james-hpc-pg0-5:19587:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832877] [slurm-slehpc15-james-hpc-pg0-5:19587:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832620] [slurm-slehpc15-james-hpc-pg0-8:19792:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832729] [slurm-slehpc15-james-hpc-pg0-8:19792:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823779] [slurm-slehpc15-james-hpc-pg0-2:19198:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833158] [slurm-slehpc15-james-hpc-pg0-10:18722:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833264] [slurm-slehpc15-james-hpc-pg0-10:18722:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800243] [slurm-slehpc15-james-hpc-pg0-6:19315:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800329] [slurm-slehpc15-james-hpc-pg0-6:19315:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833189] [slurm-slehpc15-james-hpc-pg0-4:18728:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833179] [slurm-slehpc15-james-hpc-pg0-7:19006:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832619] [slurm-slehpc15-james-hpc-pg0-8:19790:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832728] [slurm-slehpc15-james-hpc-pg0-8:19790:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832918] [slurm-slehpc15-james-hpc-pg0-11:18760:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833008] [slurm-slehpc15-james-hpc-pg0-11:18760:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833234] [slurm-slehpc15-james-hpc-pg0-3:18718:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833165] [slurm-slehpc15-james-hpc-pg0-10:18721:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833262] [slurm-slehpc15-james-hpc-pg0-10:18721:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833225] [slurm-slehpc15-james-hpc-pg0-3:18715:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833225] [slurm-slehpc15-james-hpc-pg0-3:18713:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800243] [slurm-slehpc15-james-hpc-pg0-6:19312:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800333] [slurm-slehpc15-james-hpc-pg0-6:19312:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832683] [slurm-slehpc15-james-hpc-pg0-12:19271:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823776] [slurm-slehpc15-james-hpc-pg0-2:19202:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833182] [slurm-slehpc15-james-hpc-pg0-7:19003:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832620] [slurm-slehpc15-james-hpc-pg0-8:19795:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832731] [slurm-slehpc15-james-hpc-pg0-8:19795:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832697] [slurm-slehpc15-james-hpc-pg0-12:19269:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832784] [slurm-slehpc15-james-hpc-pg0-12:19269:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833193] [slurm-slehpc15-james-hpc-pg0-7:18999:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800224] [slurm-slehpc15-james-hpc-pg0-6:19314:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800312] [slurm-slehpc15-james-hpc-pg0-6:19314:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833225] [slurm-slehpc15-james-hpc-pg0-3:18716:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832884] [slurm-slehpc15-james-hpc-pg0-11:18761:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832989] [slurm-slehpc15-james-hpc-pg0-11:18761:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832680] [slurm-slehpc15-james-hpc-pg0-12:19273:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832784] [slurm-slehpc15-james-hpc-pg0-12:19273:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833179] [slurm-slehpc15-james-hpc-pg0-7:19001:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832620] [slurm-slehpc15-james-hpc-pg0-8:19793:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832729] [slurm-slehpc15-james-hpc-pg0-8:19793:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832680] [slurm-slehpc15-james-hpc-pg0-12:19274:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833225] [slurm-slehpc15-james-hpc-pg0-3:18726:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823776] [slurm-slehpc15-james-hpc-pg0-2:19204:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823891] [slurm-slehpc15-james-hpc-pg0-2:19204:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832680] [slurm-slehpc15-james-hpc-pg0-12:19270:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832784] [slurm-slehpc15-james-hpc-pg0-12:19270:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832697] [slurm-slehpc15-james-hpc-pg0-12:19272:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832784] [slurm-slehpc15-james-hpc-pg0-12:19272:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800242] [slurm-slehpc15-james-hpc-pg0-6:19319:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800331] [slurm-slehpc15-james-hpc-pg0-6:19319:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832702] [slurm-slehpc15-james-hpc-pg0-12:19277:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832806] [slurm-slehpc15-james-hpc-pg0-12:19277:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823778] [slurm-slehpc15-james-hpc-pg0-2:19199:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823886] [slurm-slehpc15-james-hpc-pg0-2:19199:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833225] [slurm-slehpc15-james-hpc-pg0-3:18723:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832770] [slurm-slehpc15-james-hpc-pg0-12:19276:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832860] [slurm-slehpc15-james-hpc-pg0-12:19276:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823806] [slurm-slehpc15-james-hpc-pg0-2:19196:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823902] [slurm-slehpc15-james-hpc-pg0-2:19196:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833181] [slurm-slehpc15-james-hpc-pg0-7:19004:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833336] [slurm-slehpc15-james-hpc-pg0-3:18718:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832660] [slurm-slehpc15-james-hpc-pg0-8:19803:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832753] [slurm-slehpc15-james-hpc-pg0-8:19803:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832784] [slurm-slehpc15-james-hpc-pg0-12:19271:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833189] [slurm-slehpc15-james-hpc-pg0-4:18726:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823776] [slurm-slehpc15-james-hpc-pg0-2:19200:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823886] [slurm-slehpc15-james-hpc-pg0-2:19200:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833185] [slurm-slehpc15-james-hpc-pg0-7:19005:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833260] [slurm-slehpc15-james-hpc-pg0-3:18720:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833356] [slurm-slehpc15-james-hpc-pg0-3:18720:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832847] [slurm-slehpc15-james-hpc-pg0-12:19278:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823823] [slurm-slehpc15-james-hpc-pg0-2:19208:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833247] [slurm-slehpc15-james-hpc-pg0-7:19007:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833338] [slurm-slehpc15-james-hpc-pg0-7:19007:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833336] [slurm-slehpc15-james-hpc-pg0-3:18715:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800295] [slurm-slehpc15-james-hpc-pg0-6:19310:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832782] [slurm-slehpc15-james-hpc-pg0-12:19281:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832875] [slurm-slehpc15-james-hpc-pg0-12:19281:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823808] [slurm-slehpc15-james-hpc-pg0-2:19216:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823898] [slurm-slehpc15-james-hpc-pg0-2:19216:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833287] [slurm-slehpc15-james-hpc-pg0-7:19006:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833263] [slurm-slehpc15-james-hpc-pg0-3:18722:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833347] [slurm-slehpc15-james-hpc-pg0-3:18722:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832933] [slurm-slehpc15-james-hpc-pg0-11:18756:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833230] [slurm-slehpc15-james-hpc-pg0-9:18733:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832792] [slurm-slehpc15-james-hpc-pg0-12:19286:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832895] [slurm-slehpc15-james-hpc-pg0-12:19286:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832771] [slurm-slehpc15-james-hpc-pg0-5:19582:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832881] [slurm-slehpc15-james-hpc-pg0-5:19582:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823780] [slurm-slehpc15-james-hpc-pg0-2:19205:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823886] [slurm-slehpc15-james-hpc-pg0-2:19205:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833287] [slurm-slehpc15-james-hpc-pg0-7:19003:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833337] [slurm-slehpc15-james-hpc-pg0-3:18713:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832679] [slurm-slehpc15-james-hpc-pg0-8:19806:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833256] [slurm-slehpc15-james-hpc-pg0-9:18739:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832765] [slurm-slehpc15-james-hpc-pg0-12:19283:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832864] [slurm-slehpc15-james-hpc-pg0-12:19283:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832771] [slurm-slehpc15-james-hpc-pg0-5:19589:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832877] [slurm-slehpc15-james-hpc-pg0-5:19589:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823904] [slurm-slehpc15-james-hpc-pg0-2:19206:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833288] [slurm-slehpc15-james-hpc-pg0-7:18999:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833338] [slurm-slehpc15-james-hpc-pg0-3:18716:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800244] [slurm-slehpc15-james-hpc-pg0-6:19323:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800329] [slurm-slehpc15-james-hpc-pg0-6:19323:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832627] [slurm-slehpc15-james-hpc-pg0-8:19805:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832729] [slurm-slehpc15-james-hpc-pg0-8:19805:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833265] [slurm-slehpc15-james-hpc-pg0-9:18740:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832809] [slurm-slehpc15-james-hpc-pg0-12:19280:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832902] [slurm-slehpc15-james-hpc-pg0-12:19280:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832771] [slurm-slehpc15-james-hpc-pg0-5:19588:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832877] [slurm-slehpc15-james-hpc-pg0-5:19588:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823869] [slurm-slehpc15-james-hpc-pg0-2:19197:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823964] [slurm-slehpc15-james-hpc-pg0-2:19197:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833288] [slurm-slehpc15-james-hpc-pg0-7:19001:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833348] [slurm-slehpc15-james-hpc-pg0-3:18719:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800236] [slurm-slehpc15-james-hpc-pg0-6:19324:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800324] [slurm-slehpc15-james-hpc-pg0-6:19324:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832659] [slurm-slehpc15-james-hpc-pg0-8:19801:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832752] [slurm-slehpc15-james-hpc-pg0-8:19801:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833270] [slurm-slehpc15-james-hpc-pg0-9:18732:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832822] [slurm-slehpc15-james-hpc-pg0-12:19282:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832906] [slurm-slehpc15-james-hpc-pg0-12:19282:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832771] [slurm-slehpc15-james-hpc-pg0-5:19584:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832877] [slurm-slehpc15-james-hpc-pg0-5:19584:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823886] [slurm-slehpc15-james-hpc-pg0-2:19203:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833165] [slurm-slehpc15-james-hpc-pg0-10:18726:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833263] [slurm-slehpc15-james-hpc-pg0-10:18726:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833288] [slurm-slehpc15-james-hpc-pg0-7:19004:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833319] [slurm-slehpc15-james-hpc-pg0-3:18721:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833410] [slurm-slehpc15-james-hpc-pg0-3:18721:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800223] [slurm-slehpc15-james-hpc-pg0-6:19322:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800309] [slurm-slehpc15-james-hpc-pg0-6:19322:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832801] [slurm-slehpc15-james-hpc-pg0-8:19799:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833262] [slurm-slehpc15-james-hpc-pg0-9:18741:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832758] [slurm-slehpc15-james-hpc-pg0-12:19279:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832848] [slurm-slehpc15-james-hpc-pg0-12:19279:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832774] [slurm-slehpc15-james-hpc-pg0-5:19583:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832879] [slurm-slehpc15-james-hpc-pg0-5:19583:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823895] [slurm-slehpc15-james-hpc-pg0-2:19198:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833258] [slurm-slehpc15-james-hpc-pg0-7:19008:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833353] [slurm-slehpc15-james-hpc-pg0-7:19008:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833336] [slurm-slehpc15-james-hpc-pg0-3:18726:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800213] [slurm-slehpc15-james-hpc-pg0-6:19309:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800309] [slurm-slehpc15-james-hpc-pg0-6:19309:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832659] [slurm-slehpc15-james-hpc-pg0-8:19791:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832751] [slurm-slehpc15-james-hpc-pg0-8:19791:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833256] [slurm-slehpc15-james-hpc-pg0-9:18742:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833364] [slurm-slehpc15-james-hpc-pg0-9:18742:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832819] [slurm-slehpc15-james-hpc-pg0-12:19287:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832922] [slurm-slehpc15-james-hpc-pg0-12:19287:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832791] [slurm-slehpc15-james-hpc-pg0-5:19581:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832877] [slurm-slehpc15-james-hpc-pg0-5:19581:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833188] [slurm-slehpc15-james-hpc-pg0-4:18723:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823887] [slurm-slehpc15-james-hpc-pg0-2:19202:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833277] [slurm-slehpc15-james-hpc-pg0-7:19002:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833370] [slurm-slehpc15-james-hpc-pg0-7:19002:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833294] [slurm-slehpc15-james-hpc-pg0-3:18724:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833377] [slurm-slehpc15-james-hpc-pg0-3:18724:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800243] [slurm-slehpc15-james-hpc-pg0-6:19320:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800333] [slurm-slehpc15-james-hpc-pg0-6:19320:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832719] [slurm-slehpc15-james-hpc-pg0-8:19796:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832812] [slurm-slehpc15-james-hpc-pg0-8:19796:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833259] [slurm-slehpc15-james-hpc-pg0-9:18730:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833364] [slurm-slehpc15-james-hpc-pg0-9:18730:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832846] [slurm-slehpc15-james-hpc-pg0-12:19292:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832933] [slurm-slehpc15-james-hpc-pg0-12:19292:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832787] [slurm-slehpc15-james-hpc-pg0-5:19591:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832877] [slurm-slehpc15-james-hpc-pg0-5:19591:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823845] [slurm-slehpc15-james-hpc-pg0-2:19209:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823939] [slurm-slehpc15-james-hpc-pg0-2:19209:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823964] [slurm-slehpc15-james-hpc-pg0-2:19209:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.823980] [slurm-slehpc15-james-hpc-pg0-2:19209:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833284] [slurm-slehpc15-james-hpc-pg0-7:19012:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833373] [slurm-slehpc15-james-hpc-pg0-7:19012:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833290] [slurm-slehpc15-james-hpc-pg0-3:18727:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833385] [slurm-slehpc15-james-hpc-pg0-3:18727:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800307] [slurm-slehpc15-james-hpc-pg0-6:19326:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800398] [slurm-slehpc15-james-hpc-pg0-6:19326:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832699] [slurm-slehpc15-james-hpc-pg0-8:19808:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832813] [slurm-slehpc15-james-hpc-pg0-8:19808:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833260] [slurm-slehpc15-james-hpc-pg0-9:18731:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833363] [slurm-slehpc15-james-hpc-pg0-9:18731:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832955] [slurm-slehpc15-james-hpc-pg0-12:19288:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832842] [slurm-slehpc15-james-hpc-pg0-5:19590:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832956] [slurm-slehpc15-james-hpc-pg0-5:19590:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823953] [slurm-slehpc15-james-hpc-pg0-2:19208:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833296] [slurm-slehpc15-james-hpc-pg0-7:19011:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833378] [slurm-slehpc15-james-hpc-pg0-7:19011:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833336] [slurm-slehpc15-james-hpc-pg0-3:18723:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800284] [slurm-slehpc15-james-hpc-pg0-6:19327:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800373] [slurm-slehpc15-james-hpc-pg0-6:19327:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832926] [slurm-slehpc15-james-hpc-pg0-11:18765:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833011] [slurm-slehpc15-james-hpc-pg0-11:18765:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832721] [slurm-slehpc15-james-hpc-pg0-8:19798:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832808] [slurm-slehpc15-james-hpc-pg0-8:19798:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833357] [slurm-slehpc15-james-hpc-pg0-9:18745:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833444] [slurm-slehpc15-james-hpc-pg0-9:18745:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832891] [slurm-slehpc15-james-hpc-pg0-12:19285:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832976] [slurm-slehpc15-james-hpc-pg0-12:19285:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832860] [slurm-slehpc15-james-hpc-pg0-5:19594:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832998] [slurm-slehpc15-james-hpc-pg0-5:19594:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833189] [slurm-slehpc15-james-hpc-pg0-4:18724:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823900] [slurm-slehpc15-james-hpc-pg0-2:19215:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823991] [slurm-slehpc15-james-hpc-pg0-2:19215:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833269] [slurm-slehpc15-james-hpc-pg0-7:18998:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833390] [slurm-slehpc15-james-hpc-pg0-7:18998:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833409] [slurm-slehpc15-james-hpc-pg0-3:18730:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833496] [slurm-slehpc15-james-hpc-pg0-3:18730:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800324] [slurm-slehpc15-james-hpc-pg0-6:19329:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800412] [slurm-slehpc15-james-hpc-pg0-6:19329:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832785] [slurm-slehpc15-james-hpc-pg0-8:19789:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832870] [slurm-slehpc15-james-hpc-pg0-8:19789:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833278] [slurm-slehpc15-james-hpc-pg0-9:18728:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833365] [slurm-slehpc15-james-hpc-pg0-9:18728:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832930] [slurm-slehpc15-james-hpc-pg0-12:19278:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832826] [slurm-slehpc15-james-hpc-pg0-5:19593:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832942] [slurm-slehpc15-james-hpc-pg0-5:19593:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833294] [slurm-slehpc15-james-hpc-pg0-7:19013:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833380] [slurm-slehpc15-james-hpc-pg0-7:19013:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833411] [slurm-slehpc15-james-hpc-pg0-3:18729:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833498] [slurm-slehpc15-james-hpc-pg0-3:18729:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800417] [slurm-slehpc15-james-hpc-pg0-6:19310:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832702] [slurm-slehpc15-james-hpc-pg0-8:19807:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832787] [slurm-slehpc15-james-hpc-pg0-8:19807:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833316] [slurm-slehpc15-james-hpc-pg0-9:18746:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833409] [slurm-slehpc15-james-hpc-pg0-9:18746:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832961] [slurm-slehpc15-james-hpc-pg0-12:19290:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832848] [slurm-slehpc15-james-hpc-pg0-5:19592:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832953] [slurm-slehpc15-james-hpc-pg0-5:19592:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833233] [slurm-slehpc15-james-hpc-pg0-7:19010:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833338] [slurm-slehpc15-james-hpc-pg0-7:19010:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833442] [slurm-slehpc15-james-hpc-pg0-3:18719:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800378] [slurm-slehpc15-james-hpc-pg0-6:19328:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800463] [slurm-slehpc15-james-hpc-pg0-6:19328:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832796] [slurm-slehpc15-james-hpc-pg0-8:19802:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832892] [slurm-slehpc15-james-hpc-pg0-8:19802:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833406] [slurm-slehpc15-james-hpc-pg0-9:18736:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832961] [slurm-slehpc15-james-hpc-pg0-12:19291:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832832] [slurm-slehpc15-james-hpc-pg0-5:19597:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832987] [slurm-slehpc15-james-hpc-pg0-5:19597:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833235] [slurm-slehpc15-james-hpc-pg0-7:19009:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833318] [slurm-slehpc15-james-hpc-pg0-7:19009:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833450] [slurm-slehpc15-james-hpc-pg0-3:18731:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800376] [slurm-slehpc15-james-hpc-pg0-6:19321:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800459] [slurm-slehpc15-james-hpc-pg0-6:19321:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832936] [slurm-slehpc15-james-hpc-pg0-11:18762:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833023] [slurm-slehpc15-james-hpc-pg0-11:18762:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832686] [slurm-slehpc15-james-hpc-pg0-8:19800:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832773] [slurm-slehpc15-james-hpc-pg0-8:19800:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833357] [slurm-slehpc15-james-hpc-pg0-9:18744:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833445] [slurm-slehpc15-james-hpc-pg0-9:18744:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832960] [slurm-slehpc15-james-hpc-pg0-12:19295:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832862] [slurm-slehpc15-james-hpc-pg0-5:19595:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832974] [slurm-slehpc15-james-hpc-pg0-5:19595:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833297] [slurm-slehpc15-james-hpc-pg0-7:19016:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833381] [slurm-slehpc15-james-hpc-pg0-7:19016:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833381] [slurm-slehpc15-james-hpc-pg0-3:18728:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833464] [slurm-slehpc15-james-hpc-pg0-3:18728:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800370] [slurm-slehpc15-james-hpc-pg0-6:19325:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800456] [slurm-slehpc15-james-hpc-pg0-6:19325:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832946] [slurm-slehpc15-james-hpc-pg0-11:18763:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833045] [slurm-slehpc15-james-hpc-pg0-11:18763:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832804] [slurm-slehpc15-james-hpc-pg0-8:19804:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832888] [slurm-slehpc15-james-hpc-pg0-8:19804:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833303] [slurm-slehpc15-james-hpc-pg0-9:18727:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833391] [slurm-slehpc15-james-hpc-pg0-9:18727:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832981] [slurm-slehpc15-james-hpc-pg0-12:19289:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832844] [slurm-slehpc15-james-hpc-pg0-5:19596:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832952] [slurm-slehpc15-james-hpc-pg0-5:19596:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833189] [slurm-slehpc15-james-hpc-pg0-4:18722:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833165] [slurm-slehpc15-james-hpc-pg0-10:18720:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833262] [slurm-slehpc15-james-hpc-pg0-10:18720:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833291] [slurm-slehpc15-james-hpc-pg0-7:19014:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833378] [slurm-slehpc15-james-hpc-pg0-7:19014:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-2:19204] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833464] [slurm-slehpc15-james-hpc-pg0-3:18714:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800452] [slurm-slehpc15-james-hpc-pg0-6:19332:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832963] [slurm-slehpc15-james-hpc-pg0-11:18766:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833078] [slurm-slehpc15-james-hpc-pg0-11:18766:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832737] [slurm-slehpc15-james-hpc-pg0-8:19810:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832841] [slurm-slehpc15-james-hpc-pg0-8:19810:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833368] [slurm-slehpc15-james-hpc-pg0-9:18739:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19348] [[29275,1],260] selected pml cm, but peer [[29275,1],0] on slurm-slehpc15-james-hpc-pg0-1 selected pml ucx | |
[slurm-slehpc15-james-hpc-pg0-2:19205] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.832968] [slurm-slehpc15-james-hpc-pg0-12:19302:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832880] [slurm-slehpc15-james-hpc-pg0-5:19602:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833030] [slurm-slehpc15-james-hpc-pg0-5:19602:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833188] [slurm-slehpc15-james-hpc-pg0-4:18729:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833317] [slurm-slehpc15-james-hpc-pg0-7:19015:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833410] [slurm-slehpc15-james-hpc-pg0-7:19015:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833400] [slurm-slehpc15-james-hpc-pg0-3:18734:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833500] [slurm-slehpc15-james-hpc-pg0-3:18734:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-8:19822] [[29275,1],342] selected pml cm, but peer [[29275,1],0] on slurm-slehpc15-james-hpc-pg0-1 selected pml ucx | |
[1665113880.800393] [slurm-slehpc15-james-hpc-pg0-6:19335:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800490] [slurm-slehpc15-james-hpc-pg0-6:19335:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832946] [slurm-slehpc15-james-hpc-pg0-11:18764:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833033] [slurm-slehpc15-james-hpc-pg0-11:18764:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832901] [slurm-slehpc15-james-hpc-pg0-8:19799:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833363] [slurm-slehpc15-james-hpc-pg0-9:18740:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832961] [slurm-slehpc15-james-hpc-pg0-12:19294:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-2:19208] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833038] [slurm-slehpc15-james-hpc-pg0-5:19601:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833203] [slurm-slehpc15-james-hpc-pg0-4:18732:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833297] [slurm-slehpc15-james-hpc-pg0-7:19017:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833380] [slurm-slehpc15-james-hpc-pg0-7:19017:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833448] [slurm-slehpc15-james-hpc-pg0-3:18732:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800403] [slurm-slehpc15-james-hpc-pg0-6:19330:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800502] [slurm-slehpc15-james-hpc-pg0-6:19330:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-2:19209] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.832948] [slurm-slehpc15-james-hpc-pg0-11:18755:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833035] [slurm-slehpc15-james-hpc-pg0-11:18755:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832860] [slurm-slehpc15-james-hpc-pg0-8:19812:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832964] [slurm-slehpc15-james-hpc-pg0-8:19812:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833289] [slurm-slehpc15-james-hpc-pg0-9:18737:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833394] [slurm-slehpc15-james-hpc-pg0-9:18737:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832894] [slurm-slehpc15-james-hpc-pg0-12:19275:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832976] [slurm-slehpc15-james-hpc-pg0-12:19275:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832848] [slurm-slehpc15-james-hpc-pg0-5:19599:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832954] [slurm-slehpc15-james-hpc-pg0-5:19599:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833291] [slurm-slehpc15-james-hpc-pg0-4:18727:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833287] [slurm-slehpc15-james-hpc-pg0-7:19005:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833470] [slurm-slehpc15-james-hpc-pg0-3:18736:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800397] [slurm-slehpc15-james-hpc-pg0-6:19333:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800490] [slurm-slehpc15-james-hpc-pg0-6:19333:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832972] [slurm-slehpc15-james-hpc-pg0-11:18757:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833067] [slurm-slehpc15-james-hpc-pg0-11:18757:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832956] [slurm-slehpc15-james-hpc-pg0-8:19809:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833303] [slurm-slehpc15-james-hpc-pg0-9:18747:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833396] [slurm-slehpc15-james-hpc-pg0-9:18747:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832961] [slurm-slehpc15-james-hpc-pg0-12:19293:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832997] [slurm-slehpc15-james-hpc-pg0-5:19598:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833084] [slurm-slehpc15-james-hpc-pg0-5:19598:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18796] [[29275,1],482] selected pml cm, but peer [[29275,1],0] on slurm-slehpc15-james-hpc-pg0-1 selected pml ucx | |
--------------------------------------------------------------------------
MPI_INIT has failed because at least one MPI process is unreachable
from another. This *usually* means that an underlying communication
plugin -- such as a BTL or an MTL -- has either not loaded or not
allowed itself to be used. Your MPI job will now abort.
You may wish to try to narrow down the problem;
 * Check the output of ompi_info to see which BTL/MTL plugins are
   available.
 * Run your application with MPI_THREAD_SINGLE.
 * Set the MCA parameter btl_base_verbose to 100 (or mtl_base_verbose,
   if using MTL-based communications) to see exactly which
   communication plugins were considered and/or discarded.
--------------------------------------------------------------------------
[1665113880.833292] [slurm-slehpc15-james-hpc-pg0-4:18728:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823914] [slurm-slehpc15-james-hpc-pg0-2:19212:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824001] [slurm-slehpc15-james-hpc-pg0-2:19212:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833373] [slurm-slehpc15-james-hpc-pg0-7:19018:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833457] [slurm-slehpc15-james-hpc-pg0-7:19018:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833534] [slurm-slehpc15-james-hpc-pg0-3:18731:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800464] [slurm-slehpc15-james-hpc-pg0-6:19344:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800546] [slurm-slehpc15-james-hpc-pg0-6:19344:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832979] [slurm-slehpc15-james-hpc-pg0-11:18753:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833067] [slurm-slehpc15-james-hpc-pg0-11:18753:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832860] [slurm-slehpc15-james-hpc-pg0-8:19814:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832966] [slurm-slehpc15-james-hpc-pg0-8:19814:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833364] [slurm-slehpc15-james-hpc-pg0-9:18732:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832963] [slurm-slehpc15-james-hpc-pg0-12:19297:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833028] [slurm-slehpc15-james-hpc-pg0-5:19609:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833126] [slurm-slehpc15-james-hpc-pg0-5:19609:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833298] [slurm-slehpc15-james-hpc-pg0-4:18726:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823966] [slurm-slehpc15-james-hpc-pg0-2:19207:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833480] [slurm-slehpc15-james-hpc-pg0-7:19023:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833533] [slurm-slehpc15-james-hpc-pg0-3:18737:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800393] [slurm-slehpc15-james-hpc-pg0-6:19331:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800492] [slurm-slehpc15-james-hpc-pg0-6:19331:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833028] [slurm-slehpc15-james-hpc-pg0-11:18758:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833121] [slurm-slehpc15-james-hpc-pg0-11:18758:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19344] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18729] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.832861] [slurm-slehpc15-james-hpc-pg0-8:19813:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832968] [slurm-slehpc15-james-hpc-pg0-8:19813:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833355] [slurm-slehpc15-james-hpc-pg0-9:18743:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833490] [slurm-slehpc15-james-hpc-pg0-9:18743:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833047] [slurm-slehpc15-james-hpc-pg0-12:19288:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833023] [slurm-slehpc15-james-hpc-pg0-5:19603:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833142] [slurm-slehpc15-james-hpc-pg0-5:19603:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19346] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18730] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.833291] [slurm-slehpc15-james-hpc-pg0-4:18723:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823910] [slurm-slehpc15-james-hpc-pg0-2:19213:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823998] [slurm-slehpc15-james-hpc-pg0-2:19213:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833164] [slurm-slehpc15-james-hpc-pg0-10:18719:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833263] [slurm-slehpc15-james-hpc-pg0-10:18719:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833425] [slurm-slehpc15-james-hpc-pg0-7:19019:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833516] [slurm-slehpc15-james-hpc-pg0-7:19019:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-4:18731] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.833515] [slurm-slehpc15-james-hpc-pg0-3:18735:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800442] [slurm-slehpc15-james-hpc-pg0-6:19343:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800555] [slurm-slehpc15-james-hpc-pg0-6:19343:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833058] [slurm-slehpc15-james-hpc-pg0-11:18771:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833148] [slurm-slehpc15-james-hpc-pg0-11:18771:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832860] [slurm-slehpc15-james-hpc-pg0-8:19811:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832965] [slurm-slehpc15-james-hpc-pg0-8:19811:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
  PML add procs failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
[1665113880.833415] [slurm-slehpc15-james-hpc-pg0-9:18738:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833504] [slurm-slehpc15-james-hpc-pg0-9:18738:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833064] [slurm-slehpc15-james-hpc-pg0-12:19290:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833026] [slurm-slehpc15-james-hpc-pg0-5:19610:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833126] [slurm-slehpc15-james-hpc-pg0-5:19610:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833292] [slurm-slehpc15-james-hpc-pg0-4:18724:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823909] [slurm-slehpc15-james-hpc-pg0-2:19214:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824009] [slurm-slehpc15-james-hpc-pg0-2:19214:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833171] [slurm-slehpc15-james-hpc-pg0-10:18724:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833263] [slurm-slehpc15-james-hpc-pg0-10:18724:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833483] [slurm-slehpc15-james-hpc-pg0-7:19024:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833556] [slurm-slehpc15-james-hpc-pg0-3:18714:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800475] [slurm-slehpc15-james-hpc-pg0-6:19346:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800560] [slurm-slehpc15-james-hpc-pg0-6:19346:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832981] [slurm-slehpc15-james-hpc-pg0-11:18770:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833066] [slurm-slehpc15-james-hpc-pg0-11:18770:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-4:18732] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.832860] [slurm-slehpc15-james-hpc-pg0-8:19828:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832966] [slurm-slehpc15-james-hpc-pg0-8:19828:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833400] [slurm-slehpc15-james-hpc-pg0-9:18735:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833490] [slurm-slehpc15-james-hpc-pg0-9:18735:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833064] [slurm-slehpc15-james-hpc-pg0-12:19291:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833025] [slurm-slehpc15-james-hpc-pg0-5:19604:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833124] [slurm-slehpc15-james-hpc-pg0-5:19604:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-10:18746] [[29275,1],424] selected pml cm, but peer [[29275,1],0] on slurm-slehpc15-james-hpc-pg0-1 selected pml ucx | |
[1665113880.833292] [slurm-slehpc15-james-hpc-pg0-4:18722:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823925] [slurm-slehpc15-james-hpc-pg0-2:19211:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824025] [slurm-slehpc15-james-hpc-pg0-2:19211:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833158] [slurm-slehpc15-james-hpc-pg0-10:18723:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833493] [slurm-slehpc15-james-hpc-pg0-7:19021:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833535] [slurm-slehpc15-james-hpc-pg0-3:18733:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800482] [slurm-slehpc15-james-hpc-pg0-6:19350:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800567] [slurm-slehpc15-james-hpc-pg0-6:19350:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833070] [slurm-slehpc15-james-hpc-pg0-11:18769:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833153] [slurm-slehpc15-james-hpc-pg0-11:18769:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832969] [slurm-slehpc15-james-hpc-pg0-8:19826:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833374] [slurm-slehpc15-james-hpc-pg0-9:18734:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833459] [slurm-slehpc15-james-hpc-pg0-9:18734:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833065] [slurm-slehpc15-james-hpc-pg0-12:19295:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832995] [slurm-slehpc15-james-hpc-pg0-5:19600:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833082] [slurm-slehpc15-james-hpc-pg0-5:19600:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19340] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[1665113880.833277] [slurm-slehpc15-james-hpc-pg0-4:18731:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833369] [slurm-slehpc15-james-hpc-pg0-4:18731:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824045] [slurm-slehpc15-james-hpc-pg0-2:19205:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824056] [slurm-slehpc15-james-hpc-pg0-2:19205:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833158] [slurm-slehpc15-james-hpc-pg0-10:18729:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833262] [slurm-slehpc15-james-hpc-pg0-10:18729:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833480] [slurm-slehpc15-james-hpc-pg0-7:19022:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19342] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19198] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833546] [slurm-slehpc15-james-hpc-pg0-3:18732:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800457] [slurm-slehpc15-james-hpc-pg0-6:19336:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800541] [slurm-slehpc15-james-hpc-pg0-6:19336:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833022] [slurm-slehpc15-james-hpc-pg0-11:18768:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833105] [slurm-slehpc15-james-hpc-pg0-11:18768:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832920] [slurm-slehpc15-james-hpc-pg0-8:19821:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833028] [slurm-slehpc15-james-hpc-pg0-8:19821:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19351] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19199] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833364] [slurm-slehpc15-james-hpc-pg0-9:18741:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833069] [slurm-slehpc15-james-hpc-pg0-12:19289:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833023] [slurm-slehpc15-james-hpc-pg0-5:19611:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833124] [slurm-slehpc15-james-hpc-pg0-5:19611:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833219] [slurm-slehpc15-james-hpc-pg0-4:18725:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.823991] [slurm-slehpc15-james-hpc-pg0-2:19206:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824129] [slurm-slehpc15-james-hpc-pg0-2:19206:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[slurm-slehpc15-james-hpc-pg0-6:19345] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19200] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833199] [slurm-slehpc15-james-hpc-pg0-10:18727:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833297] [slurm-slehpc15-james-hpc-pg0-10:18727:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833480] [slurm-slehpc15-james-hpc-pg0-7:19030:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833527] [slurm-slehpc15-james-hpc-pg0-3:18740:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833628] [slurm-slehpc15-james-hpc-pg0-3:18740:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800482] [slurm-slehpc15-james-hpc-pg0-6:19337:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800564] [slurm-slehpc15-james-hpc-pg0-6:19337:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-2:19203] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833127] [slurm-slehpc15-james-hpc-pg0-11:18767:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833216] [slurm-slehpc15-james-hpc-pg0-11:18767:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832910] [slurm-slehpc15-james-hpc-pg0-8:19820:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833492] [slurm-slehpc15-james-hpc-pg0-9:18736:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833064] [slurm-slehpc15-james-hpc-pg0-12:19302:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833025] [slurm-slehpc15-james-hpc-pg0-5:19607:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833123] [slurm-slehpc15-james-hpc-pg0-5:19607:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833252] [slurm-slehpc15-james-hpc-pg0-4:18735:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833342] [slurm-slehpc15-james-hpc-pg0-4:18735:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824072] [slurm-slehpc15-james-hpc-pg0-2:19204:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824087] [slurm-slehpc15-james-hpc-pg0-2:19204:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833176] [slurm-slehpc15-james-hpc-pg0-10:18718:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833264] [slurm-slehpc15-james-hpc-pg0-10:18718:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833488] [slurm-slehpc15-james-hpc-pg0-7:19037:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833522] [slurm-slehpc15-james-hpc-pg0-3:18743:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833629] [slurm-slehpc15-james-hpc-pg0-3:18743:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800471] [slurm-slehpc15-james-hpc-pg0-6:19340:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800560] [slurm-slehpc15-james-hpc-pg0-6:19340:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833014] [slurm-slehpc15-james-hpc-pg0-11:18772:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833133] [slurm-slehpc15-james-hpc-pg0-11:18772:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832975] [slurm-slehpc15-james-hpc-pg0-8:19829:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-8:19814] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19202] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18736] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833484] [slurm-slehpc15-james-hpc-pg0-9:18759:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833585] [slurm-slehpc15-james-hpc-pg0-9:18759:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833066] [slurm-slehpc15-james-hpc-pg0-12:19308:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833252] [slurm-slehpc15-james-hpc-pg0-5:19585:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833292] [slurm-slehpc15-james-hpc-pg0-4:18729:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833166] [slurm-slehpc15-james-hpc-pg0-10:18728:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833262] [slurm-slehpc15-james-hpc-pg0-10:18728:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19336] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18791] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[1665113880.833480] [slurm-slehpc15-james-hpc-pg0-7:19038:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833522] [slurm-slehpc15-james-hpc-pg0-3:18739:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833629] [slurm-slehpc15-james-hpc-pg0-3:18739:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833163] [slurm-slehpc15-james-hpc-pg0-11:18781:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833272] [slurm-slehpc15-james-hpc-pg0-11:18781:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832940] [slurm-slehpc15-james-hpc-pg0-8:19815:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833056] [slurm-slehpc15-james-hpc-pg0-8:19815:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-8:19813] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18738] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833484] [slurm-slehpc15-james-hpc-pg0-9:18749:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833586] [slurm-slehpc15-james-hpc-pg0-9:18749:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833076] [slurm-slehpc15-james-hpc-pg0-12:19294:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833230] [slurm-slehpc15-james-hpc-pg0-5:19622:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833322] [slurm-slehpc15-james-hpc-pg0-5:19622:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833222] [slurm-slehpc15-james-hpc-pg0-4:18736:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833307] [slurm-slehpc15-james-hpc-pg0-4:18736:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19337] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19820] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.833179] [slurm-slehpc15-james-hpc-pg0-10:18733:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833269] [slurm-slehpc15-james-hpc-pg0-10:18733:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833481] [slurm-slehpc15-james-hpc-pg0-7:19020:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833532] [slurm-slehpc15-james-hpc-pg0-3:18741:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833629] [slurm-slehpc15-james-hpc-pg0-3:18741:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800471] [slurm-slehpc15-james-hpc-pg0-6:19342:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800558] [slurm-slehpc15-james-hpc-pg0-6:19342:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19343] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19819] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19206] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833177] [slurm-slehpc15-james-hpc-pg0-11:18780:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833271] [slurm-slehpc15-james-hpc-pg0-11:18780:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833481] [slurm-slehpc15-james-hpc-pg0-9:18748:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833589] [slurm-slehpc15-james-hpc-pg0-9:18748:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833109] [slurm-slehpc15-james-hpc-pg0-12:19306:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833139] [slurm-slehpc15-james-hpc-pg0-5:19601:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833228] [slurm-slehpc15-james-hpc-pg0-4:18730:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833316] [slurm-slehpc15-james-hpc-pg0-4:18730:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19347] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19818] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19207] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.824099] [slurm-slehpc15-james-hpc-pg0-2:19202:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824115] [slurm-slehpc15-james-hpc-pg0-2:19202:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833277] [slurm-slehpc15-james-hpc-pg0-10:18732:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833362] [slurm-slehpc15-james-hpc-pg0-10:18732:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833499] [slurm-slehpc15-james-hpc-pg0-7:19040:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833537] [slurm-slehpc15-james-hpc-pg0-3:18742:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833628] [slurm-slehpc15-james-hpc-pg0-3:18742:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-8:19817] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18741] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.800593] [slurm-slehpc15-james-hpc-pg0-6:19318:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800689] [slurm-slehpc15-james-hpc-pg0-6:19318:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833030] [slurm-slehpc15-james-hpc-pg0-11:18774:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833122] [slurm-slehpc15-james-hpc-pg0-11:18774:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832923] [slurm-slehpc15-james-hpc-pg0-8:19830:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833031] [slurm-slehpc15-james-hpc-pg0-8:19830:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833566] [slurm-slehpc15-james-hpc-pg0-9:18752:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833658] [slurm-slehpc15-james-hpc-pg0-9:18752:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-8:19825] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.833002] [slurm-slehpc15-james-hpc-pg0-12:19296:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833105] [slurm-slehpc15-james-hpc-pg0-12:19296:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833089] [slurm-slehpc15-james-hpc-pg0-5:19608:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833221] [slurm-slehpc15-james-hpc-pg0-5:19608:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833327] [slurm-slehpc15-james-hpc-pg0-4:18741:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833262] [slurm-slehpc15-james-hpc-pg0-10:18730:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833351] [slurm-slehpc15-james-hpc-pg0-10:18730:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-10:18742] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833513] [slurm-slehpc15-james-hpc-pg0-7:19028:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833582] [slurm-slehpc15-james-hpc-pg0-3:18745:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800608] [slurm-slehpc15-james-hpc-pg0-6:19347:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800687] [slurm-slehpc15-james-hpc-pg0-6:19347:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833177] [slurm-slehpc15-james-hpc-pg0-11:18777:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833272] [slurm-slehpc15-james-hpc-pg0-11:18777:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832952] [slurm-slehpc15-james-hpc-pg0-8:19819:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833063] [slurm-slehpc15-james-hpc-pg0-8:19819:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-8:19823] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.833523] [slurm-slehpc15-james-hpc-pg0-9:18751:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833606] [slurm-slehpc15-james-hpc-pg0-9:18751:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833044] [slurm-slehpc15-james-hpc-pg0-12:19303:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833138] [slurm-slehpc15-james-hpc-pg0-12:19303:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833085] [slurm-slehpc15-james-hpc-pg0-5:19618:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833183] [slurm-slehpc15-james-hpc-pg0-5:19618:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833312] [slurm-slehpc15-james-hpc-pg0-4:18740:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-10:18744] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833275] [slurm-slehpc15-james-hpc-pg0-10:18731:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833362] [slurm-slehpc15-james-hpc-pg0-10:18731:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833521] [slurm-slehpc15-james-hpc-pg0-7:19036:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833545] [slurm-slehpc15-james-hpc-pg0-3:18748:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833635] [slurm-slehpc15-james-hpc-pg0-3:18748:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800543] [slurm-slehpc15-james-hpc-pg0-6:19332:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833113] [slurm-slehpc15-james-hpc-pg0-11:18773:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833218] [slurm-slehpc15-james-hpc-pg0-11:18773:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19331] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[1665113880.832913] [slurm-slehpc15-james-hpc-pg0-8:19816:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833027] [slurm-slehpc15-james-hpc-pg0-8:19816:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833501] [slurm-slehpc15-james-hpc-pg0-9:18753:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833594] [slurm-slehpc15-james-hpc-pg0-9:18753:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833065] [slurm-slehpc15-james-hpc-pg0-12:19293:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833124] [slurm-slehpc15-james-hpc-pg0-5:19612:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833209] [slurm-slehpc15-james-hpc-pg0-5:19612:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833324] [slurm-slehpc15-james-hpc-pg0-4:18721:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824049] [slurm-slehpc15-james-hpc-pg0-2:19217:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824157] [slurm-slehpc15-james-hpc-pg0-2:19217:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833248] [slurm-slehpc15-james-hpc-pg0-10:18738:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833345] [slurm-slehpc15-james-hpc-pg0-10:18738:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833486] [slurm-slehpc15-james-hpc-pg0-7:19031:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833559] [slurm-slehpc15-james-hpc-pg0-3:18736:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19333] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18782] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18743] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.800664] [slurm-slehpc15-james-hpc-pg0-6:19344:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800676] [slurm-slehpc15-james-hpc-pg0-6:19344:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833177] [slurm-slehpc15-james-hpc-pg0-11:18782:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833281] [slurm-slehpc15-james-hpc-pg0-11:18782:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833073] [slurm-slehpc15-james-hpc-pg0-8:19797:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833163] [slurm-slehpc15-james-hpc-pg0-8:19797:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833569] [slurm-slehpc15-james-hpc-pg0-9:18765:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833652] [slurm-slehpc15-james-hpc-pg0-9:18765:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19335] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[1665113880.833112] [slurm-slehpc15-james-hpc-pg0-12:19299:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833070] [slurm-slehpc15-james-hpc-pg0-5:19619:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833158] [slurm-slehpc15-james-hpc-pg0-5:19619:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833292] [slurm-slehpc15-james-hpc-pg0-4:18732:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833369] [slurm-slehpc15-james-hpc-pg0-4:18732:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.833385] [slurm-slehpc15-james-hpc-pg0-4:18732:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.824043] [slurm-slehpc15-james-hpc-pg0-2:19208:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824059] [slurm-slehpc15-james-hpc-pg0-2:19208:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833256] [slurm-slehpc15-james-hpc-pg0-10:18734:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833346] [slurm-slehpc15-james-hpc-pg0-10:18734:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833616] [slurm-slehpc15-james-hpc-pg0-7:19033:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833696] [slurm-slehpc15-james-hpc-pg0-3:18717:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800626] [slurm-slehpc15-james-hpc-pg0-6:19339:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800711] [slurm-slehpc15-james-hpc-pg0-6:19339:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833096] [slurm-slehpc15-james-hpc-pg0-8:19825:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833182] [slurm-slehpc15-james-hpc-pg0-8:19825:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18784] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[1665113880.833513] [slurm-slehpc15-james-hpc-pg0-9:18764:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833610] [slurm-slehpc15-james-hpc-pg0-9:18764:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833027] [slurm-slehpc15-james-hpc-pg0-12:19309:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833119] [slurm-slehpc15-james-hpc-pg0-12:19309:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833070] [slurm-slehpc15-james-hpc-pg0-5:19605:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833169] [slurm-slehpc15-james-hpc-pg0-5:19605:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833348] [slurm-slehpc15-james-hpc-pg0-4:18737:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833443] [slurm-slehpc15-james-hpc-pg0-4:18737:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19338] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18783] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19800] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.824055] [slurm-slehpc15-james-hpc-pg0-2:19207:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824148] [slurm-slehpc15-james-hpc-pg0-2:19207:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824158] [slurm-slehpc15-james-hpc-pg0-2:19207:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833411] [slurm-slehpc15-james-hpc-pg0-10:18744:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833580] [slurm-slehpc15-james-hpc-pg0-7:19023:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833628] [slurm-slehpc15-james-hpc-pg0-3:18737:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800656] [slurm-slehpc15-james-hpc-pg0-6:19346:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800675] [slurm-slehpc15-james-hpc-pg0-6:19346:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-6:19341] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18790] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19801] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.833101] [slurm-slehpc15-james-hpc-pg0-8:19809:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833564] [slurm-slehpc15-james-hpc-pg0-9:18755:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833662] [slurm-slehpc15-james-hpc-pg0-9:18755:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833073] [slurm-slehpc15-james-hpc-pg0-12:19300:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833159] [slurm-slehpc15-james-hpc-pg0-12:19300:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833250] [slurm-slehpc15-james-hpc-pg0-5:19620:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833339] [slurm-slehpc15-james-hpc-pg0-5:19620:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18789] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19803] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.833415] [slurm-slehpc15-james-hpc-pg0-4:18721:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824018] [slurm-slehpc15-james-hpc-pg0-2:19218:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824114] [slurm-slehpc15-james-hpc-pg0-2:19218:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833273] [slurm-slehpc15-james-hpc-pg0-10:18736:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833364] [slurm-slehpc15-james-hpc-pg0-10:18736:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833581] [slurm-slehpc15-james-hpc-pg0-7:19024:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833637] [slurm-slehpc15-james-hpc-pg0-3:18735:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18788] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19804] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19196] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.800599] [slurm-slehpc15-james-hpc-pg0-6:19349:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800683] [slurm-slehpc15-james-hpc-pg0-6:19349:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833374] [slurm-slehpc15-james-hpc-pg0-11:18754:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833086] [slurm-slehpc15-james-hpc-pg0-8:19826:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833515] [slurm-slehpc15-james-hpc-pg0-9:18760:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833598] [slurm-slehpc15-james-hpc-pg0-9:18760:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833077] [slurm-slehpc15-james-hpc-pg0-12:19298:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833160] [slurm-slehpc15-james-hpc-pg0-12:19298:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19322] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18787] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18721] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19197] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833090] [slurm-slehpc15-james-hpc-pg0-5:19606:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833177] [slurm-slehpc15-james-hpc-pg0-5:19606:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833478] [slurm-slehpc15-james-hpc-pg0-4:18727:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.833495] [slurm-slehpc15-james-hpc-pg0-4:18727:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.824049] [slurm-slehpc15-james-hpc-pg0-2:19232:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824156] [slurm-slehpc15-james-hpc-pg0-2:19232:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833408] [slurm-slehpc15-james-hpc-pg0-10:18737:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833499] [slurm-slehpc15-james-hpc-pg0-10:18737:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18794] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19805] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18722] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.833582] [slurm-slehpc15-james-hpc-pg0-7:19021:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833628] [slurm-slehpc15-james-hpc-pg0-3:18733:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833393] [slurm-slehpc15-james-hpc-pg0-11:18785:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833121] [slurm-slehpc15-james-hpc-pg0-8:19831:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833545] [slurm-slehpc15-james-hpc-pg0-9:18769:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833644] [slurm-slehpc15-james-hpc-pg0-9:18769:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833064] [slurm-slehpc15-james-hpc-pg0-12:19297:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19323] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19807] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18723] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.833135] [slurm-slehpc15-james-hpc-pg0-5:19623:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833220] [slurm-slehpc15-james-hpc-pg0-5:19623:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824100] [slurm-slehpc15-james-hpc-pg0-2:19219:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833413] [slurm-slehpc15-james-hpc-pg0-10:18743:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833505] [slurm-slehpc15-james-hpc-pg0-10:18743:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833582] [slurm-slehpc15-james-hpc-pg0-7:19022:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833625] [slurm-slehpc15-james-hpc-pg0-3:18756:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833717] [slurm-slehpc15-james-hpc-pg0-3:18756:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-8:19808] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.833231] [slurm-slehpc15-james-hpc-pg0-11:18778:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833328] [slurm-slehpc15-james-hpc-pg0-11:18778:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833073] [slurm-slehpc15-james-hpc-pg0-8:19820:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833552] [slurm-slehpc15-james-hpc-pg0-9:18770:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833637] [slurm-slehpc15-james-hpc-pg0-9:18770:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833181] [slurm-slehpc15-james-hpc-pg0-12:19301:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833279] [slurm-slehpc15-james-hpc-pg0-12:19301:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19324] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18793] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18724] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19201] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833128] [slurm-slehpc15-james-hpc-pg0-5:19614:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833213] [slurm-slehpc15-james-hpc-pg0-5:19614:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824049] [slurm-slehpc15-james-hpc-pg0-2:19221:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833427] [slurm-slehpc15-james-hpc-pg0-10:18740:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833520] [slurm-slehpc15-james-hpc-pg0-10:18740:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833589] [slurm-slehpc15-james-hpc-pg0-7:19030:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833614] [slurm-slehpc15-james-hpc-pg0-3:18754:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833698] [slurm-slehpc15-james-hpc-pg0-3:18754:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-8:19810] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.800633] [slurm-slehpc15-james-hpc-pg0-6:19338:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800716] [slurm-slehpc15-james-hpc-pg0-6:19338:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833218] [slurm-slehpc15-james-hpc-pg0-11:18790:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833303] [slurm-slehpc15-james-hpc-pg0-11:18790:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833080] [slurm-slehpc15-james-hpc-pg0-8:19829:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833537] [slurm-slehpc15-james-hpc-pg0-9:18750:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833629] [slurm-slehpc15-james-hpc-pg0-9:18750:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19326] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18792] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[1665113880.833162] [slurm-slehpc15-james-hpc-pg0-12:19268:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833251] [slurm-slehpc15-james-hpc-pg0-12:19268:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833135] [slurm-slehpc15-james-hpc-pg0-5:19616:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833220] [slurm-slehpc15-james-hpc-pg0-5:19616:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824081] [slurm-slehpc15-james-hpc-pg0-2:19223:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824174] [slurm-slehpc15-james-hpc-pg0-2:19223:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833410] [slurm-slehpc15-james-hpc-pg0-10:18741:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833504] [slurm-slehpc15-james-hpc-pg0-10:18741:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19327] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19811] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18725] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.833628] [slurm-slehpc15-james-hpc-pg0-7:19039:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833615] [slurm-slehpc15-james-hpc-pg0-3:18749:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833698] [slurm-slehpc15-james-hpc-pg0-3:18749:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800622] [slurm-slehpc15-james-hpc-pg0-6:19352:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800715] [slurm-slehpc15-james-hpc-pg0-6:19352:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833243] [slurm-slehpc15-james-hpc-pg0-11:18776:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833334] [slurm-slehpc15-james-hpc-pg0-11:18776:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19328] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19232] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833045] [slurm-slehpc15-james-hpc-pg0-8:19818:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833137] [slurm-slehpc15-james-hpc-pg0-8:19818:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833530] [slurm-slehpc15-james-hpc-pg0-9:18754:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833622] [slurm-slehpc15-james-hpc-pg0-9:18754:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833199] [slurm-slehpc15-james-hpc-pg0-12:19284:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833259] [slurm-slehpc15-james-hpc-pg0-5:19624:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833354] [slurm-slehpc15-james-hpc-pg0-5:19624:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-8:19812] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18728] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.833461] [slurm-slehpc15-james-hpc-pg0-4:18731:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.833472] [slurm-slehpc15-james-hpc-pg0-4:18731:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.824066] [slurm-slehpc15-james-hpc-pg0-2:19224:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824159] [slurm-slehpc15-james-hpc-pg0-2:19224:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833354] [slurm-slehpc15-james-hpc-pg0-10:18735:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833457] [slurm-slehpc15-james-hpc-pg0-10:18735:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833583] [slurm-slehpc15-james-hpc-pg0-7:19035:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833668] [slurm-slehpc15-james-hpc-pg0-7:19035:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19329] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19815] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.833610] [slurm-slehpc15-james-hpc-pg0-3:18752:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833696] [slurm-slehpc15-james-hpc-pg0-3:18752:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800677] [slurm-slehpc15-james-hpc-pg0-6:19341:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833211] [slurm-slehpc15-james-hpc-pg0-11:18775:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833299] [slurm-slehpc15-james-hpc-pg0-11:18775:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833016] [slurm-slehpc15-james-hpc-pg0-8:19817:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833125] [slurm-slehpc15-james-hpc-pg0-8:19817:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19330] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19816] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18726] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19234] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833702] [slurm-slehpc15-james-hpc-pg0-9:18762:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833794] [slurm-slehpc15-james-hpc-pg0-9:18762:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833155] [slurm-slehpc15-james-hpc-pg0-12:19308:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833164] [slurm-slehpc15-james-hpc-pg0-5:19621:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833250] [slurm-slehpc15-james-hpc-pg0-5:19621:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824093] [slurm-slehpc15-james-hpc-pg0-2:19229:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833581] [slurm-slehpc15-james-hpc-pg0-7:19037:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19332] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19238] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833685] [slurm-slehpc15-james-hpc-pg0-3:18745:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800599] [slurm-slehpc15-james-hpc-pg0-6:19351:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800697] [slurm-slehpc15-james-hpc-pg0-6:19351:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833212] [slurm-slehpc15-james-hpc-pg0-11:18784:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833307] [slurm-slehpc15-james-hpc-pg0-11:18784:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833126] [slurm-slehpc15-james-hpc-pg0-8:19823:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833687] [slurm-slehpc15-james-hpc-pg0-9:18729:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833781] [slurm-slehpc15-james-hpc-pg0-9:18729:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-4:18727] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19237] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833193] [slurm-slehpc15-james-hpc-pg0-12:19306:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833262] [slurm-slehpc15-james-hpc-pg0-5:19617:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833348] [slurm-slehpc15-james-hpc-pg0-5:19617:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824051] [slurm-slehpc15-james-hpc-pg0-2:19222:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824156] [slurm-slehpc15-james-hpc-pg0-2:19222:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833411] [slurm-slehpc15-james-hpc-pg0-10:18753:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833504] [slurm-slehpc15-james-hpc-pg0-10:18753:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-8:19792] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.833582] [slurm-slehpc15-james-hpc-pg0-7:19029:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833673] [slurm-slehpc15-james-hpc-pg0-7:19029:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833653] [slurm-slehpc15-james-hpc-pg0-3:18744:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833781] [slurm-slehpc15-james-hpc-pg0-3:18744:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800592] [slurm-slehpc15-james-hpc-pg0-6:19345:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800676] [slurm-slehpc15-james-hpc-pg0-6:19345:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800696] [slurm-slehpc15-james-hpc-pg0-6:19345:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800705] [slurm-slehpc15-james-hpc-pg0-6:19345:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19793] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18733] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.833308] [slurm-slehpc15-james-hpc-pg0-11:18783:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833398] [slurm-slehpc15-james-hpc-pg0-11:18783:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833092] [slurm-slehpc15-james-hpc-pg0-8:19824:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833183] [slurm-slehpc15-james-hpc-pg0-8:19824:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833641] [slurm-slehpc15-james-hpc-pg0-9:18761:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833726] [slurm-slehpc15-james-hpc-pg0-9:18761:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833210] [slurm-slehpc15-james-hpc-pg0-12:19305:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-2:19224] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833136] [slurm-slehpc15-james-hpc-pg0-5:19615:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833222] [slurm-slehpc15-james-hpc-pg0-5:19615:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833382] [slurm-slehpc15-james-hpc-pg0-4:18725:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824064] [slurm-slehpc15-james-hpc-pg0-2:19234:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824156] [slurm-slehpc15-james-hpc-pg0-2:19234:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833581] [slurm-slehpc15-james-hpc-pg0-7:19038:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833636] [slurm-slehpc15-james-hpc-pg0-3:18751:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833721] [slurm-slehpc15-james-hpc-pg0-3:18751:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-8:19795] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19229] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.800767] [slurm-slehpc15-james-hpc-pg0-6:19342:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800781] [slurm-slehpc15-james-hpc-pg0-6:19342:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833211] [slurm-slehpc15-james-hpc-pg0-11:18794:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833301] [slurm-slehpc15-james-hpc-pg0-11:18794:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833420] [slurm-slehpc15-james-hpc-pg0-11:18794:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833438] [slurm-slehpc15-james-hpc-pg0-11:18794:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-6:19314] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[1665113880.833072] [slurm-slehpc15-james-hpc-pg0-8:19832:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833157] [slurm-slehpc15-james-hpc-pg0-8:19832:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833683] [slurm-slehpc15-james-hpc-pg0-9:18757:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833779] [slurm-slehpc15-james-hpc-pg0-9:18757:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833180] [slurm-slehpc15-james-hpc-pg0-12:19310:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833268] [slurm-slehpc15-james-hpc-pg0-12:19310:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833106] [slurm-slehpc15-james-hpc-pg0-5:19613:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833196] [slurm-slehpc15-james-hpc-pg0-5:19613:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19316] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19794] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19228] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833449] [slurm-slehpc15-james-hpc-pg0-4:18729:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.833459] [slurm-slehpc15-james-hpc-pg0-4:18729:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.824116] [slurm-slehpc15-james-hpc-pg0-2:19220:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824208] [slurm-slehpc15-james-hpc-pg0-2:19220:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833435] [slurm-slehpc15-james-hpc-pg0-10:18756:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833523] [slurm-slehpc15-james-hpc-pg0-10:18756:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833581] [slurm-slehpc15-james-hpc-pg0-7:19020:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833655] [slurm-slehpc15-james-hpc-pg0-3:18746:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833740] [slurm-slehpc15-james-hpc-pg0-3:18746:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833264] [slurm-slehpc15-james-hpc-pg0-11:18791:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833351] [slurm-slehpc15-james-hpc-pg0-11:18791:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833375] [slurm-slehpc15-james-hpc-pg0-11:18791:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833388] [slurm-slehpc15-james-hpc-pg0-11:18791:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833045] [slurm-slehpc15-james-hpc-pg0-8:19827:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833131] [slurm-slehpc15-james-hpc-pg0-8:19827:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-8:19796] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19233] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833654] [slurm-slehpc15-james-hpc-pg0-9:18768:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833739] [slurm-slehpc15-james-hpc-pg0-9:18768:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833153] [slurm-slehpc15-james-hpc-pg0-12:19311:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833239] [slurm-slehpc15-james-hpc-pg0-12:19311:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833343] [slurm-slehpc15-james-hpc-pg0-5:19585:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824222] [slurm-slehpc15-james-hpc-pg0-2:19201:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824318] [slurm-slehpc15-james-hpc-pg0-2:19201:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-2:19239] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833411] [slurm-slehpc15-james-hpc-pg0-10:18742:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833504] [slurm-slehpc15-james-hpc-pg0-10:18742:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833581] [slurm-slehpc15-james-hpc-pg0-7:19041:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833672] [slurm-slehpc15-james-hpc-pg0-7:19041:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833610] [slurm-slehpc15-james-hpc-pg0-3:18747:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833695] [slurm-slehpc15-james-hpc-pg0-3:18747:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833223] [slurm-slehpc15-james-hpc-pg0-11:18787:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833315] [slurm-slehpc15-james-hpc-pg0-11:18787:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833464] [slurm-slehpc15-james-hpc-pg0-11:18787:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833480] [slurm-slehpc15-james-hpc-pg0-11:18787:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-6:19319] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19798] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.833269] [slurm-slehpc15-james-hpc-pg0-8:19810:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833288] [slurm-slehpc15-james-hpc-pg0-8:19810:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833668] [slurm-slehpc15-james-hpc-pg0-9:18756:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833759] [slurm-slehpc15-james-hpc-pg0-9:18756:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833140] [slurm-slehpc15-james-hpc-pg0-12:19307:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833223] [slurm-slehpc15-james-hpc-pg0-12:19307:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833366] [slurm-slehpc15-james-hpc-pg0-5:19586:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19320] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[1665113880.833446] [slurm-slehpc15-james-hpc-pg0-4:18730:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.833459] [slurm-slehpc15-james-hpc-pg0-4:18730:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.833414] [slurm-slehpc15-james-hpc-pg0-10:18739:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833506] [slurm-slehpc15-james-hpc-pg0-10:18739:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833596] [slurm-slehpc15-james-hpc-pg0-7:19032:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833680] [slurm-slehpc15-james-hpc-pg0-7:19032:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833600] [slurm-slehpc15-james-hpc-pg0-3:18753:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833692] [slurm-slehpc15-james-hpc-pg0-3:18753:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19321] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19799] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.833260] [slurm-slehpc15-james-hpc-pg0-11:18779:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833365] [slurm-slehpc15-james-hpc-pg0-11:18779:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833270] [slurm-slehpc15-james-hpc-pg0-8:19808:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833288] [slurm-slehpc15-james-hpc-pg0-8:19808:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833713] [slurm-slehpc15-james-hpc-pg0-9:18766:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833808] [slurm-slehpc15-james-hpc-pg0-9:18766:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833194] [slurm-slehpc15-james-hpc-pg0-12:19299:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19325] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18761] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19215] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833414] [slurm-slehpc15-james-hpc-pg0-4:18741:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833587] [slurm-slehpc15-james-hpc-pg0-7:19025:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833670] [slurm-slehpc15-james-hpc-pg0-7:19025:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833784] [slurm-slehpc15-james-hpc-pg0-3:18750:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833878] [slurm-slehpc15-james-hpc-pg0-3:18750:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833204] [slurm-slehpc15-james-hpc-pg0-11:18789:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833311] [slurm-slehpc15-james-hpc-pg0-11:18789:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18762] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19216] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833269] [slurm-slehpc15-james-hpc-pg0-8:19807:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833288] [slurm-slehpc15-james-hpc-pg0-8:19807:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833653] [slurm-slehpc15-james-hpc-pg0-9:18767:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833748] [slurm-slehpc15-james-hpc-pg0-9:18767:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833333] [slurm-slehpc15-james-hpc-pg0-12:19284:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833403] [slurm-slehpc15-james-hpc-pg0-4:18740:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833585] [slurm-slehpc15-james-hpc-pg0-10:18725:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833677] [slurm-slehpc15-james-hpc-pg0-10:18725:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18763] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19802] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19222] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833582] [slurm-slehpc15-james-hpc-pg0-7:19040:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833790] [slurm-slehpc15-james-hpc-pg0-3:18717:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800793] [slurm-slehpc15-james-hpc-pg0-6:19340:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800808] [slurm-slehpc15-james-hpc-pg0-6:19340:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833280] [slurm-slehpc15-james-hpc-pg0-11:18795:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833367] [slurm-slehpc15-james-hpc-pg0-11:18795:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833277] [slurm-slehpc15-james-hpc-pg0-8:19805:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833293] [slurm-slehpc15-james-hpc-pg0-8:19805:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18764] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19220] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18718] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833689] [slurm-slehpc15-james-hpc-pg0-9:18758:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833773] [slurm-slehpc15-james-hpc-pg0-9:18758:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833310] [slurm-slehpc15-james-hpc-pg0-12:19305:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833354] [slurm-slehpc15-james-hpc-pg0-4:18738:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833441] [slurm-slehpc15-james-hpc-pg0-4:18738:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833650] [slurm-slehpc15-james-hpc-pg0-10:18733:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833683] [slurm-slehpc15-james-hpc-pg0-10:18733:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-6:19339] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19809] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18719] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833602] [slurm-slehpc15-james-hpc-pg0-7:19028:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833775] [slurm-slehpc15-james-hpc-pg0-3:18755:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833870] [slurm-slehpc15-james-hpc-pg0-3:18755:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800759] [slurm-slehpc15-james-hpc-pg0-6:19341:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833288] [slurm-slehpc15-james-hpc-pg0-11:18786:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833376] [slurm-slehpc15-james-hpc-pg0-11:18786:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833268] [slurm-slehpc15-james-hpc-pg0-8:19825:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833284] [slurm-slehpc15-james-hpc-pg0-8:19825:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-6:19309] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18765] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19227] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833685] [slurm-slehpc15-james-hpc-pg0-9:18763:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833769] [slurm-slehpc15-james-hpc-pg0-9:18763:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.834276] [slurm-slehpc15-james-hpc-pg0-12:19274:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824186] [slurm-slehpc15-james-hpc-pg0-2:19203:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824203] [slurm-slehpc15-james-hpc-pg0-2:19203:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833600] [slurm-slehpc15-james-hpc-pg0-10:18738:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833617] [slurm-slehpc15-james-hpc-pg0-10:18738:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-6:19311] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19231] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18720] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833611] [slurm-slehpc15-james-hpc-pg0-7:19036:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833842] [slurm-slehpc15-james-hpc-pg0-3:18738:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833940] [slurm-slehpc15-james-hpc-pg0-3:18738:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800756] [slurm-slehpc15-james-hpc-pg0-6:19351:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800772] [slurm-slehpc15-james-hpc-pg0-6:19351:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833255] [slurm-slehpc15-james-hpc-pg0-8:19814:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833270] [slurm-slehpc15-james-hpc-pg0-8:19814:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18766] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19790] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19230] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.834570] [slurm-slehpc15-james-hpc-pg0-9:18733:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.837047] [slurm-slehpc15-james-hpc-pg0-12:19304:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833354] [slurm-slehpc15-james-hpc-pg0-4:18739:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833438] [slurm-slehpc15-james-hpc-pg0-4:18739:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824233] [slurm-slehpc15-james-hpc-pg0-2:19198:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824244] [slurm-slehpc15-james-hpc-pg0-2:19198:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833581] [slurm-slehpc15-james-hpc-pg0-7:19031:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19312] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19791] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18721] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.834858] [slurm-slehpc15-james-hpc-pg0-3:18725:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800896] [slurm-slehpc15-james-hpc-pg0-6:19347:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800911] [slurm-slehpc15-james-hpc-pg0-6:19347:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833288] [slurm-slehpc15-james-hpc-pg0-11:18793:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833387] [slurm-slehpc15-james-hpc-pg0-11:18793:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833477] [slurm-slehpc15-james-hpc-pg0-11:18793:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833487] [slurm-slehpc15-james-hpc-pg0-11:18793:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-6:19313] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18768] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19235] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.837136] [slurm-slehpc15-james-hpc-pg0-12:19304:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833663] [slurm-slehpc15-james-hpc-pg0-7:19000:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833752] [slurm-slehpc15-james-hpc-pg0-7:19000:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.835047] [slurm-slehpc15-james-hpc-pg0-3:18725:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833202] [slurm-slehpc15-james-hpc-pg0-8:19813:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833222] [slurm-slehpc15-james-hpc-pg0-8:19813:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833354] [slurm-slehpc15-james-hpc-pg0-4:18743:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833440] [slurm-slehpc15-james-hpc-pg0-4:18743:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-10:18722] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.824144] [slurm-slehpc15-james-hpc-pg0-2:19206:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833527] [slurm-slehpc15-james-hpc-pg0-10:18744:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833621] [slurm-slehpc15-james-hpc-pg0-10:18744:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833636] [slurm-slehpc15-james-hpc-pg0-10:18744:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.833710] [slurm-slehpc15-james-hpc-pg0-7:19033:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800845] [slurm-slehpc15-james-hpc-pg0-6:19343:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800861] [slurm-slehpc15-james-hpc-pg0-6:19343:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-6:19315] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18769] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[1665113880.833306] [slurm-slehpc15-james-hpc-pg0-11:18792:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833395] [slurm-slehpc15-james-hpc-pg0-11:18792:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833454] [slurm-slehpc15-james-hpc-pg0-11:18792:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833473] [slurm-slehpc15-james-hpc-pg0-11:18792:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833450] [slurm-slehpc15-james-hpc-pg0-4:18742:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833553] [slurm-slehpc15-james-hpc-pg0-4:18742:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824213] [slurm-slehpc15-james-hpc-pg0-2:19199:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824232] [slurm-slehpc15-james-hpc-pg0-2:19199:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19789] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18762] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18724] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833735] [slurm-slehpc15-james-hpc-pg0-7:19039:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833309] [slurm-slehpc15-james-hpc-pg0-11:18788:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833395] [slurm-slehpc15-james-hpc-pg0-11:18788:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824303] [slurm-slehpc15-james-hpc-pg0-2:19196:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824318] [slurm-slehpc15-james-hpc-pg0-2:19196:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833657] [slurm-slehpc15-james-hpc-pg0-7:19027:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833747] [slurm-slehpc15-james-hpc-pg0-7:19027:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18770] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19797] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.833450] [slurm-slehpc15-james-hpc-pg0-4:18745:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833563] [slurm-slehpc15-james-hpc-pg0-4:18745:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833650] [slurm-slehpc15-james-hpc-pg0-10:18736:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833665] [slurm-slehpc15-james-hpc-pg0-10:18736:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.835102] [slurm-slehpc15-james-hpc-pg0-7:19034:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833450] [slurm-slehpc15-james-hpc-pg0-4:18751:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833553] [slurm-slehpc15-james-hpc-pg0-4:18751:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19310] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18725] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.835295] [slurm-slehpc15-james-hpc-pg0-7:19034:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800881] [slurm-slehpc15-james-hpc-pg0-6:19336:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800896] [slurm-slehpc15-james-hpc-pg0-6:19336:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.824185] [slurm-slehpc15-james-hpc-pg0-2:19200:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824202] [slurm-slehpc15-james-hpc-pg0-2:19200:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.837204] [slurm-slehpc15-james-hpc-pg0-7:19026:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18771] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19828] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18752] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.800869] [slurm-slehpc15-james-hpc-pg0-6:19337:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800886] [slurm-slehpc15-james-hpc-pg0-6:19337:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833450] [slurm-slehpc15-james-hpc-pg0-4:18750:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833552] [slurm-slehpc15-james-hpc-pg0-4:18750:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824192] [slurm-slehpc15-james-hpc-pg0-2:19219:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833568] [slurm-slehpc15-james-hpc-pg0-10:18743:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833586] [slurm-slehpc15-james-hpc-pg0-10:18743:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-6:19318] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19211] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18726] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.837292] [slurm-slehpc15-james-hpc-pg0-7:19026:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800903] [slurm-slehpc15-james-hpc-pg0-6:19341:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800920] [slurm-slehpc15-james-hpc-pg0-6:19341:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833211] [slurm-slehpc15-james-hpc-pg0-8:19831:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833450] [slurm-slehpc15-james-hpc-pg0-4:18748:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833552] [slurm-slehpc15-james-hpc-pg0-4:18748:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824236] [slurm-slehpc15-james-hpc-pg0-2:19221:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18772] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19829] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18750] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.801002] [slurm-slehpc15-james-hpc-pg0-6:19329:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801018] [slurm-slehpc15-james-hpc-pg0-6:19329:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833625] [slurm-slehpc15-james-hpc-pg0-11:18754:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833457] [slurm-slehpc15-james-hpc-pg0-4:18746:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833553] [slurm-slehpc15-james-hpc-pg0-4:18746:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824229] [slurm-slehpc15-james-hpc-pg0-2:19239:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824318] [slurm-slehpc15-james-hpc-pg0-2:19239:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18773] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19212] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18727] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.800923] [slurm-slehpc15-james-hpc-pg0-6:19335:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800935] [slurm-slehpc15-james-hpc-pg0-6:19335:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833476] [slurm-slehpc15-james-hpc-pg0-11:18785:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833253] [slurm-slehpc15-james-hpc-pg0-8:19820:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833264] [slurm-slehpc15-james-hpc-pg0-8:19820:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833453] [slurm-slehpc15-james-hpc-pg0-4:18747:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833552] [slurm-slehpc15-james-hpc-pg0-4:18747:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18774] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19213] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.824159] [slurm-slehpc15-james-hpc-pg0-2:19233:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824267] [slurm-slehpc15-james-hpc-pg0-2:19233:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833668] [slurm-slehpc15-james-hpc-pg0-11:18781:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833683] [slurm-slehpc15-james-hpc-pg0-11:18781:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833256] [slurm-slehpc15-james-hpc-pg0-8:19818:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833270] [slurm-slehpc15-james-hpc-pg0-8:19818:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-6:19349] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18751] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19214] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18728] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833450] [slurm-slehpc15-james-hpc-pg0-4:18752:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833552] [slurm-slehpc15-james-hpc-pg0-4:18752:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824147] [slurm-slehpc15-james-hpc-pg0-2:19238:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824233] [slurm-slehpc15-james-hpc-pg0-2:19238:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.801002] [slurm-slehpc15-james-hpc-pg0-6:19327:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801026] [slurm-slehpc15-james-hpc-pg0-6:19327:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833554] [slurm-slehpc15-james-hpc-pg0-11:18790:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833572] [slurm-slehpc15-james-hpc-pg0-11:18790:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-6:19350] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18776] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[1665113880.833221] [slurm-slehpc15-james-hpc-pg0-8:19817:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833235] [slurm-slehpc15-james-hpc-pg0-8:19817:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833450] [slurm-slehpc15-james-hpc-pg0-4:18744:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833552] [slurm-slehpc15-james-hpc-pg0-4:18744:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824283] [slurm-slehpc15-james-hpc-pg0-2:19225:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824379] [slurm-slehpc15-james-hpc-pg0-2:19225:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833659] [slurm-slehpc15-james-hpc-pg0-10:18741:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833670] [slurm-slehpc15-james-hpc-pg0-10:18741:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18757] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18729] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833218] [slurm-slehpc15-james-hpc-pg0-8:19823:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833244] [slurm-slehpc15-james-hpc-pg0-8:19823:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833261] [slurm-slehpc15-james-hpc-pg0-8:19823:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833453] [slurm-slehpc15-james-hpc-pg0-4:18749:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833552] [slurm-slehpc15-james-hpc-pg0-4:18749:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824229] [slurm-slehpc15-james-hpc-pg0-2:19231:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824319] [slurm-slehpc15-james-hpc-pg0-2:19231:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-6:19352] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19821] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19218] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833614] [slurm-slehpc15-james-hpc-pg0-11:18784:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833628] [slurm-slehpc15-james-hpc-pg0-11:18784:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833210] [slurm-slehpc15-james-hpc-pg0-8:19819:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833223] [slurm-slehpc15-james-hpc-pg0-8:19819:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.824202] [slurm-slehpc15-james-hpc-pg0-2:19237:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824308] [slurm-slehpc15-james-hpc-pg0-2:19237:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-4:18756] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19217] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18730] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833555] [slurm-slehpc15-james-hpc-pg0-10:18755:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833637] [slurm-slehpc15-james-hpc-pg0-10:18755:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800942] [slurm-slehpc15-james-hpc-pg0-6:19333:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800958] [slurm-slehpc15-james-hpc-pg0-6:19333:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833653] [slurm-slehpc15-james-hpc-pg0-11:18783:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833667] [slurm-slehpc15-james-hpc-pg0-11:18783:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-6:19317] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18775] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19830] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.833389] [slurm-slehpc15-james-hpc-pg0-8:19801:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833405] [slurm-slehpc15-james-hpc-pg0-8:19801:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.824219] [slurm-slehpc15-james-hpc-pg0-2:19229:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800976] [slurm-slehpc15-james-hpc-pg0-6:19331:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800987] [slurm-slehpc15-james-hpc-pg0-6:19331:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833301] [slurm-slehpc15-james-hpc-pg0-8:19816:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833317] [slurm-slehpc15-james-hpc-pg0-8:19816:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19827] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18754] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19219] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.824235] [slurm-slehpc15-james-hpc-pg0-2:19230:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824321] [slurm-slehpc15-james-hpc-pg0-2:19230:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824289] [slurm-slehpc15-james-hpc-pg0-2:19235:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824373] [slurm-slehpc15-james-hpc-pg0-2:19235:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833625] [slurm-slehpc15-james-hpc-pg0-10:18745:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833707] [slurm-slehpc15-james-hpc-pg0-10:18745:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833727] [slurm-slehpc15-james-hpc-pg0-10:18745:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833743] [slurm-slehpc15-james-hpc-pg0-10:18745:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19826] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18755] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18731] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.824138] [slurm-slehpc15-james-hpc-pg0-2:19227:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824227] [slurm-slehpc15-james-hpc-pg0-2:19227:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833545] [slurm-slehpc15-james-hpc-pg0-10:18757:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833630] [slurm-slehpc15-james-hpc-pg0-10:18757:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.800968] [slurm-slehpc15-james-hpc-pg0-6:19338:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.800985] [slurm-slehpc15-james-hpc-pg0-6:19338:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.824125] [slurm-slehpc15-james-hpc-pg0-2:19228:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824219] [slurm-slehpc15-james-hpc-pg0-2:19228:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-8:19824] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19223] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.801075] [slurm-slehpc15-james-hpc-pg0-6:19323:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801086] [slurm-slehpc15-james-hpc-pg0-6:19323:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833467] [slurm-slehpc15-james-hpc-pg0-10:18760:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833555] [slurm-slehpc15-james-hpc-pg0-10:18760:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833341] [slurm-slehpc15-james-hpc-pg0-8:19803:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833358] [slurm-slehpc15-james-hpc-pg0-8:19803:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-6:19334] pml_ucx.c:419 Error: ucp_ep_create(proc=260) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18777] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19832] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18761] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19221] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18732] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833661] [slurm-slehpc15-james-hpc-pg0-10:18748:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833769] [slurm-slehpc15-james-hpc-pg0-10:18748:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.801016] [slurm-slehpc15-james-hpc-pg0-6:19324:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801041] [slurm-slehpc15-james-hpc-pg0-6:19324:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.824546] [slurm-slehpc15-james-hpc-pg0-2:19201:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824560] [slurm-slehpc15-james-hpc-pg0-2:19201:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.824327] [slurm-slehpc15-james-hpc-pg0-2:19197:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824342] [slurm-slehpc15-james-hpc-pg0-2:19197:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833560] [slurm-slehpc15-james-hpc-pg0-10:18751:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833651] [slurm-slehpc15-james-hpc-pg0-10:18751:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.801076] [slurm-slehpc15-james-hpc-pg0-6:19322:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801090] [slurm-slehpc15-james-hpc-pg0-6:19322:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18778] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-8:19831] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18760] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18733] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833519] [slurm-slehpc15-james-hpc-pg0-11:18789:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833533] [slurm-slehpc15-james-hpc-pg0-11:18789:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833781] [slurm-slehpc15-james-hpc-pg0-4:18721:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.833798] [slurm-slehpc15-james-hpc-pg0-4:18721:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.824341] [slurm-slehpc15-james-hpc-pg0-2:19226:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.824433] [slurm-slehpc15-james-hpc-pg0-2:19226:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-2:19226] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.833438] [slurm-slehpc15-james-hpc-pg0-8:19800:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833453] [slurm-slehpc15-james-hpc-pg0-8:19800:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833531] [slurm-slehpc15-james-hpc-pg0-10:18747:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833618] [slurm-slehpc15-james-hpc-pg0-10:18747:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.801045] [slurm-slehpc15-james-hpc-pg0-6:19328:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801056] [slurm-slehpc15-james-hpc-pg0-6:19328:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18758] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19225] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18734] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833540] [slurm-slehpc15-james-hpc-pg0-4:18728:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.833557] [slurm-slehpc15-james-hpc-pg0-4:18728:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.824734] [slurm-slehpc15-james-hpc-pg0-2:19232:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824753] [slurm-slehpc15-james-hpc-pg0-2:19232:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833609] [slurm-slehpc15-james-hpc-pg0-10:18761:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833691] [slurm-slehpc15-james-hpc-pg0-10:18761:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-8:19806] pml_ucx.c:419 Error: ucp_ep_create(proc=342) failed: Shared memory error | |
[1665113880.833470] [slurm-slehpc15-james-hpc-pg0-8:19804:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833486] [slurm-slehpc15-james-hpc-pg0-8:19804:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.824753] [slurm-slehpc15-james-hpc-pg0-2:19238:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824769] [slurm-slehpc15-james-hpc-pg0-2:19238:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833554] [slurm-slehpc15-james-hpc-pg0-10:18759:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833639] [slurm-slehpc15-james-hpc-pg0-10:18759:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18779] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18759] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.833652] [slurm-slehpc15-james-hpc-pg0-11:18782:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833667] [slurm-slehpc15-james-hpc-pg0-11:18782:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.824762] [slurm-slehpc15-james-hpc-pg0-2:19237:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824790] [slurm-slehpc15-james-hpc-pg0-2:19237:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833495] [slurm-slehpc15-james-hpc-pg0-10:18749:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833608] [slurm-slehpc15-james-hpc-pg0-10:18749:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-10:18735] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833567] [slurm-slehpc15-james-hpc-pg0-11:18788:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833584] [slurm-slehpc15-james-hpc-pg0-11:18788:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833307] [slurm-slehpc15-james-hpc-pg0-8:19812:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833318] [slurm-slehpc15-james-hpc-pg0-8:19812:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833505] [slurm-slehpc15-james-hpc-pg0-4:18726:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.833521] [slurm-slehpc15-james-hpc-pg0-4:18726:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18764] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.833530] [slurm-slehpc15-james-hpc-pg0-10:18754:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833634] [slurm-slehpc15-james-hpc-pg0-10:18754:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.801042] [slurm-slehpc15-james-hpc-pg0-6:19326:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801054] [slurm-slehpc15-james-hpc-pg0-6:19326:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.824772] [slurm-slehpc15-james-hpc-pg0-2:19234:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824786] [slurm-slehpc15-james-hpc-pg0-2:19234:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833542] [slurm-slehpc15-james-hpc-pg0-10:18758:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833630] [slurm-slehpc15-james-hpc-pg0-10:18758:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833616] [slurm-slehpc15-james-hpc-pg0-4:18723:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.833633] [slurm-slehpc15-james-hpc-pg0-4:18723:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.824802] [slurm-slehpc15-james-hpc-pg0-2:19239:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824820] [slurm-slehpc15-james-hpc-pg0-2:19239:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833625] [slurm-slehpc15-james-hpc-pg0-10:18742:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833636] [slurm-slehpc15-james-hpc-pg0-10:18742:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.833560] [slurm-slehpc15-james-hpc-pg0-10:18752:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833652] [slurm-slehpc15-james-hpc-pg0-10:18752:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833333] [slurm-slehpc15-james-hpc-pg0-8:19811:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833350] [slurm-slehpc15-james-hpc-pg0-8:19811:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18780] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18737] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833589] [slurm-slehpc15-james-hpc-pg0-4:18724:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.833604] [slurm-slehpc15-james-hpc-pg0-4:18724:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.801049] [slurm-slehpc15-james-hpc-pg0-6:19330:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801064] [slurm-slehpc15-james-hpc-pg0-6:19330:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833354] [slurm-slehpc15-james-hpc-pg0-8:19815:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833365] [slurm-slehpc15-james-hpc-pg0-8:19815:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18763] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.833685] [slurm-slehpc15-james-hpc-pg0-10:18739:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833703] [slurm-slehpc15-james-hpc-pg0-10:18739:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.801069] [slurm-slehpc15-james-hpc-pg0-6:19332:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801085] [slurm-slehpc15-james-hpc-pg0-6:19332:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833642] [slurm-slehpc15-james-hpc-pg0-4:18722:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.833659] [slurm-slehpc15-james-hpc-pg0-4:18722:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18740] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.824871] [slurm-slehpc15-james-hpc-pg0-2:19228:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824887] [slurm-slehpc15-james-hpc-pg0-2:19228:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.801120] [slurm-slehpc15-james-hpc-pg0-6:19339:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801136] [slurm-slehpc15-james-hpc-pg0-6:19339:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.824837] [slurm-slehpc15-james-hpc-pg0-2:19233:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824855] [slurm-slehpc15-james-hpc-pg0-2:19233:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18739] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.801157] [slurm-slehpc15-james-hpc-pg0-6:19316:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801169] [slurm-slehpc15-james-hpc-pg0-6:19316:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833541] [slurm-slehpc15-james-hpc-pg0-8:19794:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833553] [slurm-slehpc15-james-hpc-pg0-8:19794:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833565] [slurm-slehpc15-james-hpc-pg0-4:18753:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833645] [slurm-slehpc15-james-hpc-pg0-4:18753:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18781] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[1665113880.801283] [slurm-slehpc15-james-hpc-pg0-6:19315:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801301] [slurm-slehpc15-james-hpc-pg0-6:19315:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833649] [slurm-slehpc15-james-hpc-pg0-4:18725:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.833660] [slurm-slehpc15-james-hpc-pg0-4:18725:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.833886] [slurm-slehpc15-james-hpc-pg0-11:18766:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833902] [slurm-slehpc15-james-hpc-pg0-11:18766:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833661] [slurm-slehpc15-james-hpc-pg0-8:19791:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833677] [slurm-slehpc15-james-hpc-pg0-8:19791:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833755] [slurm-slehpc15-james-hpc-pg0-4:18733:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833851] [slurm-slehpc15-james-hpc-pg0-4:18733:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833874] [slurm-slehpc15-james-hpc-pg0-4:18733:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.833886] [slurm-slehpc15-james-hpc-pg0-4:18733:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.824909] [slurm-slehpc15-james-hpc-pg0-2:19224:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824927] [slurm-slehpc15-james-hpc-pg0-2:19224:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833609] [slurm-slehpc15-james-hpc-pg0-4:18754:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833709] [slurm-slehpc15-james-hpc-pg0-4:18754:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.801305] [slurm-slehpc15-james-hpc-pg0-6:19311:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801316] [slurm-slehpc15-james-hpc-pg0-6:19311:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18786] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[1665113880.833900] [slurm-slehpc15-james-hpc-pg0-11:18764:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833912] [slurm-slehpc15-james-hpc-pg0-11:18764:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833623] [slurm-slehpc15-james-hpc-pg0-8:19792:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833634] [slurm-slehpc15-james-hpc-pg0-8:19792:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833557] [slurm-slehpc15-james-hpc-pg0-4:18760:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833640] [slurm-slehpc15-james-hpc-pg0-4:18760:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[slurm-slehpc15-james-hpc-pg0-11:18785] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18745] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.824897] [slurm-slehpc15-james-hpc-pg0-2:19229:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824914] [slurm-slehpc15-james-hpc-pg0-2:19229:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.801246] [slurm-slehpc15-james-hpc-pg0-6:19314:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801263] [slurm-slehpc15-james-hpc-pg0-6:19314:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833638] [slurm-slehpc15-james-hpc-pg0-4:18762:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833724] [slurm-slehpc15-james-hpc-pg0-4:18762:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.801165] [slurm-slehpc15-james-hpc-pg0-6:19319:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801182] [slurm-slehpc15-james-hpc-pg0-6:19319:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833969] [slurm-slehpc15-james-hpc-pg0-11:18760:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833985] [slurm-slehpc15-james-hpc-pg0-11:18760:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833568] [slurm-slehpc15-james-hpc-pg0-8:19795:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833584] [slurm-slehpc15-james-hpc-pg0-8:19795:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833558] [slurm-slehpc15-james-hpc-pg0-4:18756:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833646] [slurm-slehpc15-james-hpc-pg0-4:18756:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833536] [slurm-slehpc15-james-hpc-pg0-4:18755:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833621] [slurm-slehpc15-james-hpc-pg0-4:18755:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.825070] [slurm-slehpc15-james-hpc-pg0-2:19220:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825084] [slurm-slehpc15-james-hpc-pg0-2:19220:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833951] [slurm-slehpc15-james-hpc-pg0-11:18761:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833967] [slurm-slehpc15-james-hpc-pg0-11:18761:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18732] *** An error occurred in MPI_Init | |
[slurm-slehpc15-james-hpc-pg0-4:18732] *** reported by process [1918566401,143] | |
[slurm-slehpc15-james-hpc-pg0-4:18732] *** on a NULL communicator | |
[slurm-slehpc15-james-hpc-pg0-4:18732] *** Unknown error | |
[slurm-slehpc15-james-hpc-pg0-4:18732] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, | |
[slurm-slehpc15-james-hpc-pg0-4:18732] *** and potentially your MPI job) | |
[1665113880.833596] [slurm-slehpc15-james-hpc-pg0-8:19796:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833610] [slurm-slehpc15-james-hpc-pg0-8:19796:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833536] [slurm-slehpc15-james-hpc-pg0-4:18758:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833620] [slurm-slehpc15-james-hpc-pg0-4:18758:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.825002] [slurm-slehpc15-james-hpc-pg0-2:19216:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825021] [slurm-slehpc15-james-hpc-pg0-2:19216:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833558] [slurm-slehpc15-james-hpc-pg0-4:18759:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833641] [slurm-slehpc15-james-hpc-pg0-4:18759:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.825114] [slurm-slehpc15-james-hpc-pg0-2:19215:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825132] [slurm-slehpc15-james-hpc-pg0-2:19215:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833592] [slurm-slehpc15-james-hpc-pg0-8:19793:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833609] [slurm-slehpc15-james-hpc-pg0-8:19793:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833573] [slurm-slehpc15-james-hpc-pg0-4:18764:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833656] [slurm-slehpc15-james-hpc-pg0-4:18764:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.834144] [slurm-slehpc15-james-hpc-pg0-4:18764:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.801140] [slurm-slehpc15-james-hpc-pg0-6:19320:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801155] [slurm-slehpc15-james-hpc-pg0-6:19320:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833535] [slurm-slehpc15-james-hpc-pg0-8:19798:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833552] [slurm-slehpc15-james-hpc-pg0-8:19798:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833549] [slurm-slehpc15-james-hpc-pg0-4:18761:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833641] [slurm-slehpc15-james-hpc-pg0-4:18761:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.801252] [slurm-slehpc15-james-hpc-pg0-6:19321:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801264] [slurm-slehpc15-james-hpc-pg0-6:19321:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833549] [slurm-slehpc15-james-hpc-pg0-4:18757:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833643] [slurm-slehpc15-james-hpc-pg0-4:18757:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833684] [slurm-slehpc15-james-hpc-pg0-10:18750:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833766] [slurm-slehpc15-james-hpc-pg0-10:18750:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.801139] [slurm-slehpc15-james-hpc-pg0-6:19325:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801155] [slurm-slehpc15-james-hpc-pg0-6:19325:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833508] [slurm-slehpc15-james-hpc-pg0-8:19802:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833524] [slurm-slehpc15-james-hpc-pg0-8:19802:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833531] [slurm-slehpc15-james-hpc-pg0-4:18763:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833666] [slurm-slehpc15-james-hpc-pg0-4:18763:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.834144] [slurm-slehpc15-james-hpc-pg0-4:18763:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834160] [slurm-slehpc15-james-hpc-pg0-4:18763:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.833870] [slurm-slehpc15-james-hpc-pg0-11:18765:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833883] [slurm-slehpc15-james-hpc-pg0-11:18765:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.825157] [slurm-slehpc15-james-hpc-pg0-2:19223:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825173] [slurm-slehpc15-james-hpc-pg0-2:19223:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833606] [slurm-slehpc15-james-hpc-pg0-8:19799:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833622] [slurm-slehpc15-james-hpc-pg0-8:19799:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833936] [slurm-slehpc15-james-hpc-pg0-10:18722:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833954] [slurm-slehpc15-james-hpc-pg0-10:18722:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.833949] [slurm-slehpc15-james-hpc-pg0-11:18762:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833967] [slurm-slehpc15-james-hpc-pg0-11:18762:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833578] [slurm-slehpc15-james-hpc-pg0-8:19809:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833596] [slurm-slehpc15-james-hpc-pg0-8:19809:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.801342] [slurm-slehpc15-james-hpc-pg0-6:19313:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801360] [slurm-slehpc15-james-hpc-pg0-6:19313:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833968] [slurm-slehpc15-james-hpc-pg0-10:18721:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833986] [slurm-slehpc15-james-hpc-pg0-10:18721:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.833835] [slurm-slehpc15-james-hpc-pg0-11:18771:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833857] [slurm-slehpc15-james-hpc-pg0-11:18771:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.825054] [slurm-slehpc15-james-hpc-pg0-2:19231:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825071] [slurm-slehpc15-james-hpc-pg0-2:19231:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833686] [slurm-slehpc15-james-hpc-pg0-8:19790:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833702] [slurm-slehpc15-james-hpc-pg0-8:19790:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.833830] [slurm-slehpc15-james-hpc-pg0-10:18726:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833845] [slurm-slehpc15-james-hpc-pg0-10:18726:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.801356] [slurm-slehpc15-james-hpc-pg0-6:19312:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801374] [slurm-slehpc15-james-hpc-pg0-6:19312:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.833949] [slurm-slehpc15-james-hpc-pg0-11:18763:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833967] [slurm-slehpc15-james-hpc-pg0-11:18763:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833889] [slurm-slehpc15-james-hpc-pg0-8:19797:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833903] [slurm-slehpc15-james-hpc-pg0-8:19797:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18753] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[1665113880.825062] [slurm-slehpc15-james-hpc-pg0-2:19222:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825073] [slurm-slehpc15-james-hpc-pg0-2:19222:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833885] [slurm-slehpc15-james-hpc-pg0-8:19789:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.833903] [slurm-slehpc15-james-hpc-pg0-8:19789:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.825031] [slurm-slehpc15-james-hpc-pg0-2:19230:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825048] [slurm-slehpc15-james-hpc-pg0-2:19230:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18755] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[1665113880.833988] [slurm-slehpc15-james-hpc-pg0-10:18720:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834000] [slurm-slehpc15-james-hpc-pg0-10:18720:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.801390] [slurm-slehpc15-james-hpc-pg0-6:19317:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833763] [slurm-slehpc15-james-hpc-pg0-11:18770:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833788] [slurm-slehpc15-james-hpc-pg0-11:18770:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.824982] [slurm-slehpc15-james-hpc-pg0-2:19235:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.824999] [slurm-slehpc15-james-hpc-pg0-2:19235:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.801414] [slurm-slehpc15-james-hpc-pg0-6:19309:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801430] [slurm-slehpc15-james-hpc-pg0-6:19309:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.834291] [slurm-slehpc15-james-hpc-pg0-4:18751:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834309] [slurm-slehpc15-james-hpc-pg0-4:18751:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.825065] [slurm-slehpc15-james-hpc-pg0-2:19227:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825076] [slurm-slehpc15-james-hpc-pg0-2:19227:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18757] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[1665113880.834029] [slurm-slehpc15-james-hpc-pg0-10:18719:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834045] [slurm-slehpc15-james-hpc-pg0-10:18719:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.833868] [slurm-slehpc15-james-hpc-pg0-11:18769:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833885] [slurm-slehpc15-james-hpc-pg0-11:18769:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.834024] [slurm-slehpc15-james-hpc-pg0-8:19828:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.834042] [slurm-slehpc15-james-hpc-pg0-8:19828:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.834303] [slurm-slehpc15-james-hpc-pg0-4:18750:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834315] [slurm-slehpc15-james-hpc-pg0-4:18750:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.825235] [slurm-slehpc15-james-hpc-pg0-2:19211:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825254] [slurm-slehpc15-james-hpc-pg0-2:19211:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.801560] [slurm-slehpc15-james-hpc-pg0-6:19318:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801576] [slurm-slehpc15-james-hpc-pg0-6:19318:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18758] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[1665113880.825274] [slurm-slehpc15-james-hpc-pg0-2:19226:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825290] [slurm-slehpc15-james-hpc-pg0-2:19226:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833904] [slurm-slehpc15-james-hpc-pg0-10:18724:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833922] [slurm-slehpc15-james-hpc-pg0-10:18724:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.833863] [slurm-slehpc15-james-hpc-pg0-11:18768:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833885] [slurm-slehpc15-james-hpc-pg0-11:18768:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18759] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[1665113880.834082] [slurm-slehpc15-james-hpc-pg0-8:19829:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.834098] [slurm-slehpc15-james-hpc-pg0-8:19829:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.834376] [slurm-slehpc15-james-hpc-pg0-4:18747:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834393] [slurm-slehpc15-james-hpc-pg0-4:18747:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.801525] [slurm-slehpc15-james-hpc-pg0-6:19310:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801541] [slurm-slehpc15-james-hpc-pg0-6:19310:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18760] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18735] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18761] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.825198] [slurm-slehpc15-james-hpc-pg0-2:19217:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825216] [slurm-slehpc15-james-hpc-pg0-2:19217:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.834107] [slurm-slehpc15-james-hpc-pg0-10:18723:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.801817] [slurm-slehpc15-james-hpc-pg0-6:19350:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801836] [slurm-slehpc15-james-hpc-pg0-6:19350:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.834198] [slurm-slehpc15-james-hpc-pg0-8:19832:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.834213] [slurm-slehpc15-james-hpc-pg0-8:19832:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18795] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[1665113880.834346] [slurm-slehpc15-james-hpc-pg0-4:18752:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834359] [slurm-slehpc15-james-hpc-pg0-4:18752:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.801817] [slurm-slehpc15-james-hpc-pg0-6:19349:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801836] [slurm-slehpc15-james-hpc-pg0-6:19349:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.834248] [slurm-slehpc15-james-hpc-pg0-8:19827:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.834266] [slurm-slehpc15-james-hpc-pg0-8:19827:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18723] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833746] [slurm-slehpc15-james-hpc-pg0-10:18729:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833777] [slurm-slehpc15-james-hpc-pg0-10:18729:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.833779] [slurm-slehpc15-james-hpc-pg0-11:18772:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833803] [slurm-slehpc15-james-hpc-pg0-11:18772:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.834215] [slurm-slehpc15-james-hpc-pg0-8:19821:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.834228] [slurm-slehpc15-james-hpc-pg0-8:19821:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18736] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.834205] [slurm-slehpc15-james-hpc-pg0-8:19826:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.834220] [slurm-slehpc15-james-hpc-pg0-8:19826:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.834342] [slurm-slehpc15-james-hpc-pg0-4:18754:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834358] [slurm-slehpc15-james-hpc-pg0-4:18754:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.825208] [slurm-slehpc15-james-hpc-pg0-2:19212:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825226] [slurm-slehpc15-james-hpc-pg0-2:19212:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833829] [slurm-slehpc15-james-hpc-pg0-10:18727:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833846] [slurm-slehpc15-james-hpc-pg0-10:18727:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.801877] [slurm-slehpc15-james-hpc-pg0-6:19352:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.801895] [slurm-slehpc15-james-hpc-pg0-6:19352:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.834223] [slurm-slehpc15-james-hpc-pg0-8:19831:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.834240] [slurm-slehpc15-james-hpc-pg0-8:19831:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18737] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.825166] [slurm-slehpc15-james-hpc-pg0-2:19213:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825184] [slurm-slehpc15-james-hpc-pg0-2:19213:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833812] [slurm-slehpc15-james-hpc-pg0-11:18778:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833828] [slurm-slehpc15-james-hpc-pg0-11:18778:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.834206] [slurm-slehpc15-james-hpc-pg0-4:18760:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834222] [slurm-slehpc15-james-hpc-pg0-4:18760:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18767] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18738] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18748] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.834047] [slurm-slehpc15-james-hpc-pg0-10:18718:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834058] [slurm-slehpc15-james-hpc-pg0-10:18718:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.802326] [slurm-slehpc15-james-hpc-pg0-6:19317:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.802342] [slurm-slehpc15-james-hpc-pg0-6:19317:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.834198] [slurm-slehpc15-james-hpc-pg0-8:19830:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.834214] [slurm-slehpc15-james-hpc-pg0-8:19830:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18747] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.834148] [slurm-slehpc15-james-hpc-pg0-4:18762:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834161] [slurm-slehpc15-james-hpc-pg0-4:18762:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.825167] [slurm-slehpc15-james-hpc-pg0-2:19214:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825184] [slurm-slehpc15-james-hpc-pg0-2:19214:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.834255] [slurm-slehpc15-james-hpc-pg0-10:18725:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834270] [slurm-slehpc15-james-hpc-pg0-10:18725:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18739] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18749] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.804710] [slurm-slehpc15-james-hpc-pg0-6:19334:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.804801] [slurm-slehpc15-james-hpc-pg0-6:19334:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.833794] [slurm-slehpc15-james-hpc-pg0-11:18785:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833812] [slurm-slehpc15-james-hpc-pg0-11:18785:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.825166] [slurm-slehpc15-james-hpc-pg0-2:19218:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825184] [slurm-slehpc15-james-hpc-pg0-2:19218:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-2:19236] pml_ucx.c:419 Error: ucp_ep_create(proc=58) failed: Shared memory error | |
[1665113880.805182] [slurm-slehpc15-james-hpc-pg0-6:19334:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1e7f4109 flags=0x0) failed: No such file or directory | |
[1665113880.805196] [slurm-slehpc15-james-hpc-pg0-6:19334:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001e7f4109: Shared memory error | |
[1665113880.834308] [slurm-slehpc15-james-hpc-pg0-4:18756:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834320] [slurm-slehpc15-james-hpc-pg0-4:18756:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.833755] [slurm-slehpc15-james-hpc-pg0-10:18732:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833772] [slurm-slehpc15-james-hpc-pg0-10:18732:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18741] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18751] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.834276] [slurm-slehpc15-james-hpc-pg0-8:19824:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.834292] [slurm-slehpc15-james-hpc-pg0-8:19824:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.834223] [slurm-slehpc15-james-hpc-pg0-4:18755:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834235] [slurm-slehpc15-james-hpc-pg0-4:18755:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.825200] [slurm-slehpc15-james-hpc-pg0-2:19219:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825216] [slurm-slehpc15-james-hpc-pg0-2:19219:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833800] [slurm-slehpc15-james-hpc-pg0-10:18730:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833810] [slurm-slehpc15-james-hpc-pg0-10:18730:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.825203] [slurm-slehpc15-james-hpc-pg0-2:19221:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825217] [slurm-slehpc15-james-hpc-pg0-2:19221:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.834331] [slurm-slehpc15-james-hpc-pg0-8:19806:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.834147] [slurm-slehpc15-james-hpc-pg0-4:18758:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834160] [slurm-slehpc15-james-hpc-pg0-4:18758:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18740] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18753] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833787] [slurm-slehpc15-james-hpc-pg0-10:18728:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833799] [slurm-slehpc15-james-hpc-pg0-10:18728:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.833929] [slurm-slehpc15-james-hpc-pg0-11:18776:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833944] [slurm-slehpc15-james-hpc-pg0-11:18776:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.834872] [slurm-slehpc15-james-hpc-pg0-8:19806:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_5eba7378 flags=0x0) failed: No such file or directory | |
[1665113880.834887] [slurm-slehpc15-james-hpc-pg0-8:19806:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000005eba7378: Shared memory error | |
[1665113880.834149] [slurm-slehpc15-james-hpc-pg0-4:18759:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834162] [slurm-slehpc15-james-hpc-pg0-4:18759:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.825204] [slurm-slehpc15-james-hpc-pg0-2:19225:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.825217] [slurm-slehpc15-james-hpc-pg0-2:19225:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.833782] [slurm-slehpc15-james-hpc-pg0-10:18731:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833794] [slurm-slehpc15-james-hpc-pg0-10:18731:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18743] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18750] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.834178] [slurm-slehpc15-james-hpc-pg0-4:18764:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.833691] [slurm-slehpc15-james-hpc-pg0-11:18780:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833703] [slurm-slehpc15-james-hpc-pg0-11:18780:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833696] [slurm-slehpc15-james-hpc-pg0-10:18734:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833715] [slurm-slehpc15-james-hpc-pg0-10:18734:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.834224] [slurm-slehpc15-james-hpc-pg0-4:18761:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834237] [slurm-slehpc15-james-hpc-pg0-4:18761:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.833733] [slurm-slehpc15-james-hpc-pg0-11:18774:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833752] [slurm-slehpc15-james-hpc-pg0-11:18774:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.832124] [slurm-slehpc15-james-hpc-pg0-2:19236:0] ucp_worker.c:1777 UCX INFO ep_cfg[0]: tag(self/memory0 cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.832211] [slurm-slehpc15-james-hpc-pg0-2:19236:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.834334] [slurm-slehpc15-james-hpc-pg0-4:18757:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834349] [slurm-slehpc15-james-hpc-pg0-4:18757:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18742] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18752] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833760] [slurm-slehpc15-james-hpc-pg0-10:18737:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833773] [slurm-slehpc15-james-hpc-pg0-10:18737:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.833796] [slurm-slehpc15-james-hpc-pg0-11:18777:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833812] [slurm-slehpc15-james-hpc-pg0-11:18777:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833854] [slurm-slehpc15-james-hpc-pg0-11:18775:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833871] [slurm-slehpc15-james-hpc-pg0-11:18775:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833750] [slurm-slehpc15-james-hpc-pg0-10:18740:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833761] [slurm-slehpc15-james-hpc-pg0-10:18740:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.833775] [slurm-slehpc15-james-hpc-pg0-10:18735:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.833789] [slurm-slehpc15-james-hpc-pg0-10:18735:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.833836] [slurm-slehpc15-james-hpc-pg0-11:18773:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833848] [slurm-slehpc15-james-hpc-pg0-11:18773:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18745] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18754] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.833820] [slurm-slehpc15-james-hpc-pg0-11:18779:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833830] [slurm-slehpc15-james-hpc-pg0-11:18779:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.833710] [slurm-slehpc15-james-hpc-pg0-11:18786:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.833721] [slurm-slehpc15-james-hpc-pg0-11:18786:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.834439] [slurm-slehpc15-james-hpc-pg0-4:18749:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834456] [slurm-slehpc15-james-hpc-pg0-4:18749:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.834024] [slurm-slehpc15-james-hpc-pg0-11:18759:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.834040] [slurm-slehpc15-james-hpc-pg0-11:18759:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.834618] [slurm-slehpc15-james-hpc-pg0-4:18737:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834635] [slurm-slehpc15-james-hpc-pg0-4:18737:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.834577] [slurm-slehpc15-james-hpc-pg0-4:18735:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834590] [slurm-slehpc15-james-hpc-pg0-4:18735:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18744] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18755] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.834163] [slurm-slehpc15-james-hpc-pg0-11:18755:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.834180] [slurm-slehpc15-james-hpc-pg0-11:18755:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.834414] [slurm-slehpc15-james-hpc-pg0-4:18753:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834426] [slurm-slehpc15-james-hpc-pg0-4:18753:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.834837] [slurm-slehpc15-james-hpc-pg0-10:18750:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834854] [slurm-slehpc15-james-hpc-pg0-10:18750:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.834678] [slurm-slehpc15-james-hpc-pg0-10:18752:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834694] [slurm-slehpc15-james-hpc-pg0-10:18752:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.834116] [slurm-slehpc15-james-hpc-pg0-11:18757:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.834132] [slurm-slehpc15-james-hpc-pg0-11:18757:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.834550] [slurm-slehpc15-james-hpc-pg0-4:18736:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834567] [slurm-slehpc15-james-hpc-pg0-4:18736:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18748] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.834041] [slurm-slehpc15-james-hpc-pg0-11:18756:0] ucp_worker.c:1777 UCX INFO ep_cfg[1]: tag(posix/memory cma/memory dc_mlx5/mlx5_0:1); | |
[1665113880.834761] [slurm-slehpc15-james-hpc-pg0-10:18723:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834776] [slurm-slehpc15-james-hpc-pg0-10:18723:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.834217] [slurm-slehpc15-james-hpc-pg0-11:18753:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.834234] [slurm-slehpc15-james-hpc-pg0-11:18753:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.834589] [slurm-slehpc15-james-hpc-pg0-10:18753:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834606] [slurm-slehpc15-james-hpc-pg0-10:18753:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18749] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18756] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.834140] [slurm-slehpc15-james-hpc-pg0-11:18758:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.834155] [slurm-slehpc15-james-hpc-pg0-11:18758:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.834587] [slurm-slehpc15-james-hpc-pg0-10:18755:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834605] [slurm-slehpc15-james-hpc-pg0-10:18755:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.834025] [slurm-slehpc15-james-hpc-pg0-11:18767:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.834041] [slurm-slehpc15-james-hpc-pg0-11:18767:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18746] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.834613] [slurm-slehpc15-james-hpc-pg0-11:18795:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.834632] [slurm-slehpc15-james-hpc-pg0-11:18795:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.834552] [slurm-slehpc15-james-hpc-pg0-4:18741:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834568] [slurm-slehpc15-james-hpc-pg0-4:18741:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.834528] [slurm-slehpc15-james-hpc-pg0-10:18757:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834538] [slurm-slehpc15-james-hpc-pg0-10:18757:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.832645] [slurm-slehpc15-james-hpc-pg0-2:19236:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_6ae232e6 flags=0x0) failed: No such file or directory | |
[1665113880.832659] [slurm-slehpc15-james-hpc-pg0-2:19236:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000006ae232e6: Shared memory error | |
[1665113880.834552] [slurm-slehpc15-james-hpc-pg0-4:18740:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834567] [slurm-slehpc15-james-hpc-pg0-4:18740:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.834484] [slurm-slehpc15-james-hpc-pg0-10:18756:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834501] [slurm-slehpc15-james-hpc-pg0-10:18756:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18747] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18757] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.834619] [slurm-slehpc15-james-hpc-pg0-4:18738:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834635] [slurm-slehpc15-james-hpc-pg0-4:18738:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.834457] [slurm-slehpc15-james-hpc-pg0-10:18760:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834473] [slurm-slehpc15-james-hpc-pg0-10:18760:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.834575] [slurm-slehpc15-james-hpc-pg0-4:18739:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834588] [slurm-slehpc15-james-hpc-pg0-4:18739:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.834934] [slurm-slehpc15-james-hpc-pg0-10:18748:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834952] [slurm-slehpc15-james-hpc-pg0-10:18748:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.834519] [slurm-slehpc15-james-hpc-pg0-4:18743:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834537] [slurm-slehpc15-james-hpc-pg0-4:18743:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.834771] [slurm-slehpc15-james-hpc-pg0-10:18751:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834788] [slurm-slehpc15-james-hpc-pg0-10:18751:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-4:18753] pml_ucx.c:419 Error: ucp_ep_create(proc=145) failed: Shared memory error | |
[1665113880.834615] [slurm-slehpc15-james-hpc-pg0-4:18742:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834631] [slurm-slehpc15-james-hpc-pg0-4:18742:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.834772] [slurm-slehpc15-james-hpc-pg0-10:18747:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834789] [slurm-slehpc15-james-hpc-pg0-10:18747:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.834575] [slurm-slehpc15-james-hpc-pg0-4:18745:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834592] [slurm-slehpc15-james-hpc-pg0-4:18745:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.834483] [slurm-slehpc15-james-hpc-pg0-10:18761:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834501] [slurm-slehpc15-james-hpc-pg0-10:18761:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.834488] [slurm-slehpc15-james-hpc-pg0-10:18759:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834501] [slurm-slehpc15-james-hpc-pg0-10:18759:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.834472] [slurm-slehpc15-james-hpc-pg0-4:18748:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834489] [slurm-slehpc15-james-hpc-pg0-4:18748:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18758] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.834407] [slurm-slehpc15-james-hpc-pg0-4:18746:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834424] [slurm-slehpc15-james-hpc-pg0-4:18746:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.834775] [slurm-slehpc15-james-hpc-pg0-10:18749:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834789] [slurm-slehpc15-james-hpc-pg0-10:18749:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[1665113880.834533] [slurm-slehpc15-james-hpc-pg0-4:18744:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_1722d684 flags=0x0) failed: No such file or directory | |
[1665113880.834550] [slurm-slehpc15-james-hpc-pg0-4:18744:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000001722d684: Shared memory error | |
[1665113880.835171] [slurm-slehpc15-james-hpc-pg0-11:18756:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.835188] [slurm-slehpc15-james-hpc-pg0-11:18756:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[1665113880.834627] [slurm-slehpc15-james-hpc-pg0-10:18754:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834645] [slurm-slehpc15-james-hpc-pg0-10:18754:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18760] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.834508] [slurm-slehpc15-james-hpc-pg0-10:18758:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_2c22ca39 flags=0x0) failed: No such file or directory | |
[1665113880.834524] [slurm-slehpc15-james-hpc-pg0-10:18758:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000002c22ca39: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18756] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-10:18759] pml_ucx.c:419 Error: ucp_ep_create(proc=424) failed: Shared memory error | |
[1665113880.835505] [slurm-slehpc15-james-hpc-pg0-11:18754:0] mm_posix.c:207 UCX ERROR shm_open(file_name=/ucx_shm_posix_685d94d flags=0x0) failed: No such file or directory | |
[1665113880.835523] [slurm-slehpc15-james-hpc-pg0-11:18754:0] mm_ep.c:159 UCX ERROR mm ep failed to connect to remote FIFO id 0x400000000685d94d: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-11:18754] pml_ucx.c:419 Error: ucp_ep_create(proc=482) failed: Shared memory error | |
[slurm-slehpc15-james-hpc-pg0-1:19546] 527 more processes have sent help message help-mpi-btl-openib.txt / no device params found | |
[slurm-slehpc15-james-hpc-pg0-1:19546] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages | |
[slurm-slehpc15-james-hpc-pg0-1:19546] 527 more processes have sent help message help-mpi-btl-openib.txt / error in device init | |
[slurm-slehpc15-james-hpc-pg0-1:19546] 5 more processes have sent help message help-mpi-runtime.txt / mpi_init:startup:pml-add-procs-fail | |
[slurm-slehpc15-james-hpc-pg0-1:19546] 257 more processes have sent help message help-mpi-runtime.txt / mpi_init:startup:internal-failure | |
[slurm-slehpc15-james-hpc-pg0-1:19546] 263 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal unknown handle |