Created
July 10, 2024 19:03
(partial) EasyBuild log for failed build of /tmp/eb-tmp/eb-15xssewa/files_pr20358/t/TensorFlow/TensorFlow-2.15.1-foss-2023a-CUDA-12.1.1.eb (PR(s) #20358) (easyblock PR(s) #3303)
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'na2024-07-10 17:27:43.408395: E tensorflow/core/common_runtime/base_collective_executor.cc:249] BaseCollectiveExecutor::StartAbort INTERNAL: NCCL: unhandled cuda error (run with NCCL_DEBUG=INFO for details). Set NCCL_DEBUG=WARN for detail. | |
2024-07-10 17:27:43.408472: W tensorflow/core/framework/op_kernel.cc:1839] OP_REQUIRES failed at collective_ops.cc:320 : INTERNAL: Collective ops is aborted by: NCCL: unhandled cuda error (run with NCCL_DEBUG=INFO for details). Set NCCL_DEBUG=WARN for detail. | |
The error could be from a previous operation. Restart your program to reset. [type.googleapis.com/tensorflow.DerivedStatus=''] | |
[ FAILED ] CollectiveOpGPUTest.testNcclStress | |
INFO:tensorflow:time(__main__.CollectiveOpGPUTest.testNcclStress): 3.91s | |
I0710 17:27:43.410111 22563912242048 test_util.py:2574] time(__main__.CollectiveOpGPUTest.testNcclStress): 3.91s | |
[ RUN ] CollectiveOpGPUTest.test_session | |
[ SKIPPED ] CollectiveOpGPUTest.test_session | |
====================================================================== | |
ERROR: testNcclStress (__main__.CollectiveOpGPUTest.testNcclStress) | |
CollectiveOpGPUTest.testNcclStress | |
---------------------------------------------------------------------- | |
Traceback (most recent call last): | |
File "/tmp/eb-build/TensorFlow/2.15.1/foss-2023a-CUDA-12.1.1/TensorFlow/bazel-root/7e81cf2bbcceb6bbb72b8d4762f8a810/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/python/ops/collective_ops_gpu_test_gpu.runfiles/org_tensorflow/tensorflow/python/ops/collective_ops_gpu_test.py", line 279, in testNcclStress | |
collective_ops.all_reduce( | |
File "/tmp/eb-build/TensorFlow/2.15.1/foss-2023a-CUDA-12.1.1/TensorFlow/bazel-root/7e81cf2bbcceb6bbb72b8d4762f8a810/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/python/ops/collective_ops_gpu_test_gpu.runfiles/org_tensorflow/tensorflow/python/ops/collective_ops.py", line 59, in all_reduce | |
return gen_collective_ops.collective_reduce( | |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
File "/tmp/eb-build/TensorFlow/2.15.1/foss-2023a-CUDA-12.1.1/TensorFlow/bazel-root/7e81cf2bbcceb6bbb72b8d4762f8a810/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/python/ops/collective_ops_gpu_test_gpu.runfiles/org_tensorflow/tensorflow/python/ops/gen_collective_ops.py", line 998, in collective_reduce | |
return collective_reduce_eager_fallback( | |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
File "/tmp/eb-build/TensorFlow/2.15.1/foss-2023a-CUDA-12.1.1/TensorFlow/bazel-root/7e81cf2bbcceb6bbb72b8d4762f8a810/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/python/ops/collective_ops_gpu_test_gpu.runfiles/org_tensorflow/tensorflow/python/ops/gen_collective_ops.py", line 1088, in collective_reduce_eager_fallback | |
_result = _execute.execute(b"CollectiveReduce", 1, inputs=_inputs_flat, | |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
File "/tmp/eb-build/TensorFlow/2.15.1/foss-2023a-CUDA-12.1.1/TensorFlow/bazel-root/7e81cf2bbcceb6bbb72b8d4762f8a810/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/python/ops/collective_ops_gpu_test_gpu.runfiles/org_tensorflow/tensorflow/python/eager/execute.py", line 53, in quick_execute | |
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name, | |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
tensorflow.python.framework.errors_impl.InternalError: {{function_node __wrapped__CollectiveReduce_device_/job:localhost/replica:0/task:0/device:GPU:0}} Collective ops is aborted by: NCCL: unhandled cuda error (run with NCCL_DEBUG=INFO for details). Set NCCL_DEBUG=WARN for detail. | |
The error could be from a previous operation. Restart your program to reset. [Op:CollectiveReduce] | |
---------------------------------------------------------------------- | |
Ran 13 tests in 5.990s | |
FAILED (errors=1, skipped=12) | |
med symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn153:774168:775656 [0] NCCL INFO init.cc:1332 -> 1 | |
gcn153:774168:775656 [0] NCCL INFO group.cc:65 -> 1 [Async thread] | |
gcn153:774168:775644 [0] NCCL INFO group.cc:406 -> 1 | |
gcn153:774168:775644 [0] NCCL INFO group.cc:96 -> 1 | |
== 2024-07-10 19:33:02,270 build_log.py:171 ERROR EasyBuild crashed with an error (at easybuild/base/exceptions.py:126 in __init__): At least 1 gpu tests failed: | |
//tensorflow/python/ops:collective_ops_gpu_test_gpu (at easybuild/framework/easyblock.py:2287 in report_test_failure) | |
== 2024-07-10 19:33:02,272 build_log.py:267 INFO ... (took 2 hours 50 mins 1 secs) | |
== 2024-07-10 19:33:02,272 build_log.py:267 INFO ... (took 2 hours 51 mins 16 secs) | |
== 2024-07-10 19:33:02,272 filetools.py:2012 INFO Removing lock /scratch-nvme/1/casparl/generic/software/.locks/_scratch-nvme_1_casparl_generic_software_TensorFlow_2.15.1-foss-2023a-CUDA-12.1.1.lock... | |
== 2024-07-10 19:33:02,273 filetools.py:383 INFO Path /scratch-nvme/1/casparl/generic/software/.locks/_scratch-nvme_1_casparl_generic_software_TensorFlow_2.15.1-foss-2023a-CUDA-12.1.1.lock successfully removed. | |
== 2024-07-10 19:33:02,273 filetools.py:2016 INFO Lock removed: /scratch-nvme/1/casparl/generic/software/.locks/_scratch-nvme_1_casparl_generic_software_TensorFlow_2.15.1-foss-2023a-CUDA-12.1.1.lock | |
== 2024-07-10 19:33:02,273 easyblock.py:4283 WARNING build failed (first 300 chars): At least 1 gpu tests failed: | |
//tensorflow/python/ops:collective_ops_gpu_test_gpu | |
== 2024-07-10 19:33:02,273 easyblock.py:328 INFO Closing log for application name TensorFlow version 2.15.1 |