Created
July 18, 2024 14:59
(partial) EasyBuild log for failed build of /scratch-node/casparl.7053181/eb-tmp/eb-2v8kq72n/files_pr20358/t/TensorFlow/TensorFlow-2.15.1-foss-2023a-CUDA-12.1.1.eb (PR(s) #20358) (easyblock PR(s) #3303)
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue2024-07-18 13:35:54.553401: E tensorflow/core/common_runtime/base_collective_executor.cc:249] BaseCollectiveExecutor::StartAbort INTERNAL: NCCL: unhandled cuda error (run with NCCL_DEBUG=INFO for details). Set NCCL_DEBUG=WARN for detail. | |
2024-07-18 13:35:54.553470: W tensorflow/core/framework/op_kernel.cc:1839] OP_REQUIRES failed at collective_ops.cc:320 : INTERNAL: Collective ops is aborted by: NCCL: unhandled cuda error (run with NCCL_DEBUG=INFO for details). Set NCCL_DEBUG=WARN for detail. | |
The error could be from a previous operation. Restart your program to reset. [type.googleapis.com/tensorflow.DerivedStatus=''] | |
[ FAILED ] CollectiveOpGPUTest.testNcclStress | |
INFO:tensorflow:time(__main__.CollectiveOpGPUTest.testNcclStress): 1.73s | |
I0718 13:35:54.555011 22717969134464 test_util.py:2574] time(__main__.CollectiveOpGPUTest.testNcclStress): 1.73s | |
[ RUN ] CollectiveOpGPUTest.test_session | |
[ SKIPPED ] CollectiveOpGPUTest.test_session | |
====================================================================== | |
ERROR: testNcclStress (__main__.CollectiveOpGPUTest.testNcclStress) | |
CollectiveOpGPUTest.testNcclStress | |
---------------------------------------------------------------------- | |
Traceback (most recent call last): | |
File "/scratch-node/casparl.7053181/eb-build/TensorFlow/2.15.1/foss-2023a-CUDA-12.1.1/TensorFlow/bazel-root/90dfda158e6c36a7b501f9dc86aa7413/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/python/ops/collective_ops_gpu_test_gpu.runfiles/org_tensorflow/tensorflow/python/ops/collective_ops_gpu_test.py", line 279, in testNcclStress | |
collective_ops.all_reduce( | |
File "/scratch-node/casparl.7053181/eb-build/TensorFlow/2.15.1/foss-2023a-CUDA-12.1.1/TensorFlow/bazel-root/90dfda158e6c36a7b501f9dc86aa7413/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/python/ops/collective_ops_gpu_test_gpu.runfiles/org_tensorflow/tensorflow/python/ops/collective_ops.py", line 59, in all_reduce | |
return gen_collective_ops.collective_reduce( | |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
File "/scratch-node/casparl.7053181/eb-build/TensorFlow/2.15.1/foss-2023a-CUDA-12.1.1/TensorFlow/bazel-root/90dfda158e6c36a7b501f9dc86aa7413/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/python/ops/collective_ops_gpu_test_gpu.runfiles/org_tensorflow/tensorflow/python/ops/gen_collective_ops.py", line 998, in collective_reduce | |
return collective_reduce_eager_fallback( | |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
File "/scratch-node/casparl.7053181/eb-build/TensorFlow/2.15.1/foss-2023a-CUDA-12.1.1/TensorFlow/bazel-root/90dfda158e6c36a7b501f9dc86aa7413/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/python/ops/collective_ops_gpu_test_gpu.runfiles/org_tensorflow/tensorflow/python/ops/gen_collective_ops.py", line 1088, in collective_reduce_eager_fallback | |
_result = _execute.execute(b"CollectiveReduce", 1, inputs=_inputs_flat, | |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
File "/scratch-node/casparl.7053181/eb-build/TensorFlow/2.15.1/foss-2023a-CUDA-12.1.1/TensorFlow/bazel-root/90dfda158e6c36a7b501f9dc86aa7413/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/python/ops/collective_ops_gpu_test_gpu.runfiles/org_tensorflow/tensorflow/python/eager/execute.py", line 53, in quick_execute | |
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name, | |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
tensorflow.python.framework.errors_impl.InternalError: {{function_node __wrapped__CollectiveReduce_device_/job:localhost/replica:0/task:0/device:GPU:0}} Collective ops is aborted by: NCCL: unhandled cuda error (run with NCCL_DEBUG=INFO for details). Set NCCL_DEBUG=WARN for detail. | |
The error could be from a previous operation. Restart your program to reset. [Op:CollectiveReduce] | |
---------------------------------------------------------------------- | |
Ran 13 tests in 5.004s | |
FAILED (errors=1, skipped=12) | |
.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:115 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] enqueue.cc:128 NCCL WARN Cuda failure 'named symbol not found' | |
gcn74:1582888:1583878 [0] NCCL INFO init.cc:1332 -> 1 | |
gcn74:1582888:1583878 [0] NCCL INFO group.cc:65 -> 1 [Async thread] | |
gcn74:1582888:1583871 [0] NCCL INFO group.cc:406 -> 1 | |
gcn74:1582888:1583871 [0] NCCL INFO group.cc:96 -> 1 | |
== 2024-07-18 15:40:55,854 build_log.py:171 ERROR EasyBuild crashed with an error (at easybuild/base/exceptions.py:126 in __init__): At least 1 gpu tests failed: | |
//tensorflow/python/ops:collective_ops_gpu_test_gpu (at easybuild/framework/easyblock.py:2287 in report_test_failure) | |
== 2024-07-18 15:40:55,856 build_log.py:267 INFO ... (took 2 hours 23 mins 22 secs) | |
== 2024-07-18 15:40:55,856 build_log.py:267 INFO ... (took 2 hours 24 mins 38 secs) | |
== 2024-07-18 15:40:55,856 filetools.py:2012 INFO Removing lock /scratch-nvme/1/casparl/generic/software/.locks/_scratch-nvme_1_casparl_generic_software_TensorFlow_2.15.1-foss-2023a-CUDA-12.1.1.lock... | |
== 2024-07-18 15:40:55,859 filetools.py:383 INFO Path /scratch-nvme/1/casparl/generic/software/.locks/_scratch-nvme_1_casparl_generic_software_TensorFlow_2.15.1-foss-2023a-CUDA-12.1.1.lock successfully removed. | |
== 2024-07-18 15:40:55,859 filetools.py:2016 INFO Lock removed: /scratch-nvme/1/casparl/generic/software/.locks/_scratch-nvme_1_casparl_generic_software_TensorFlow_2.15.1-foss-2023a-CUDA-12.1.1.lock | |
== 2024-07-18 15:40:55,859 easyblock.py:4283 WARNING build failed (first 300 chars): At least 1 gpu tests failed: | |
//tensorflow/python/ops:collective_ops_gpu_test_gpu | |
== 2024-07-18 15:40:55,859 easyblock.py:328 INFO Closing log for application name TensorFlow version 2.15.1 |