@boegel
Created March 17, 2021 00:04
(partial) EasyBuild log for failed build of /tmp/eb-90382g2c/files_pr12347/p/PyTorch/PyTorch-1.8.0-foss-2020b.eb (PR #12347)
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 6 items
distributed/pipeline/sync/skip/test_inspect_skip_layout.py::test_no_skippables PASSED [ 16%]
distributed/pipeline/sync/skip/test_inspect_skip_layout.py::test_inner_partition PASSED [ 33%]
distributed/pipeline/sync/skip/test_inspect_skip_layout.py::test_adjoining_partitions PASSED [ 50%]
distributed/pipeline/sync/skip/test_inspect_skip_layout.py::test_far_partitions PASSED [ 66%]
distributed/pipeline/sync/skip/test_inspect_skip_layout.py::test_pop_2_from_different_partitions PASSED [ 83%]
distributed/pipeline/sync/skip/test_inspect_skip_layout.py::test_namespace PASSED [100%]
============================== 6 passed in 0.04s ===============================
Running distributed/pipeline/sync/skip/test_leak ... [2021-03-17 00:13:36.010639]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_leak.py', '-v'] ... [2021-03-17 00:13:36.010724]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 8 items
distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[always-train] PASSED [ 12%]
distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[always-eval] PASSED [ 25%]
distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[except_last-train] PASSED [ 37%]
distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[except_last-eval] PASSED [ 50%]
distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[never-train] PASSED [ 62%]
distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[never-eval] PASSED [ 75%]
distributed/pipeline/sync/skip/test_leak.py::test_no_portal_without_pipe[train] PASSED [ 87%]
distributed/pipeline/sync/skip/test_leak.py::test_no_portal_without_pipe[eval] PASSED [100%]
============================== 8 passed in 0.53s ===============================
Running distributed/pipeline/sync/skip/test_portal ... [2021-03-17 00:13:37.730496]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_portal.py', '-v'] ... [2021-03-17 00:13:37.730588]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 10 items
distributed/pipeline/sync/skip/test_portal.py::test_copy_returns_on_next_device SKIPPED [ 10%]
distributed/pipeline/sync/skip/test_portal.py::test_blue_orange PASSED [ 20%]
distributed/pipeline/sync/skip/test_portal.py::test_blue_orange_not_requires_grad PASSED [ 30%]
distributed/pipeline/sync/skip/test_portal.py::test_use_grad PASSED [ 40%]
distributed/pipeline/sync/skip/test_portal.py::TestTensorLife::test_tensor_life_0 PASSED [ 50%]
distributed/pipeline/sync/skip/test_portal.py::TestTensorLife::test_tensor_life_1 PASSED [ 60%]
distributed/pipeline/sync/skip/test_portal.py::TestTensorLife::test_tensor_life_2 PASSED [ 70%]
distributed/pipeline/sync/skip/test_portal.py::TestTensorLife::test_tensor_life_3 PASSED [ 80%]
distributed/pipeline/sync/skip/test_portal.py::TestTensorLife::test_tensor_life_4 PASSED [ 90%]
distributed/pipeline/sync/skip/test_portal.py::TestTensorLife::test_tensor_life_3_plus_1 PASSED [100%]
========================= 9 passed, 1 skipped in 0.05s =========================
Running distributed/pipeline/sync/skip/test_stash_pop ... [2021-03-17 00:13:38.967187]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_stash_pop.py', '-v'] ... [2021-03-17 00:13:38.967275]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 7 items
distributed/pipeline/sync/skip/test_stash_pop.py::test_stash PASSED [ 14%]
distributed/pipeline/sync/skip/test_stash_pop.py::test_pop PASSED [ 28%]
distributed/pipeline/sync/skip/test_stash_pop.py::test_declare_but_not_use PASSED [ 42%]
distributed/pipeline/sync/skip/test_stash_pop.py::test_stash_not_declared PASSED [ 57%]
distributed/pipeline/sync/skip/test_stash_pop.py::test_pop_not_declared PASSED [ 71%]
distributed/pipeline/sync/skip/test_stash_pop.py::test_pop_not_stashed PASSED [ 85%]
distributed/pipeline/sync/skip/test_stash_pop.py::test_stash_none PASSED [100%]
============================== 7 passed in 0.04s ===============================
Running distributed/pipeline/sync/skip/test_tracker ... [2021-03-17 00:13:40.195985]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_tracker.py', '-v'] ... [2021-03-17 00:13:40.196078]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 6 items
distributed/pipeline/sync/skip/test_tracker.py::test_default_skip_tracker PASSED [ 16%]
distributed/pipeline/sync/skip/test_tracker.py::test_default_skip_tracker_by_data_parallel SKIPPED [ 33%]
distributed/pipeline/sync/skip/test_tracker.py::test_reuse_portal PASSED [ 50%]
distributed/pipeline/sync/skip/test_tracker.py::test_no_copy_no_portal PASSED [ 66%]
distributed/pipeline/sync/skip/test_tracker.py::test_tensor_life_without_checkpointing PASSED [ 83%]
distributed/pipeline/sync/skip/test_tracker.py::test_tensor_life_with_checkpointing PASSED [100%]
========================= 5 passed, 1 skipped in 0.04s =========================
Running distributed/pipeline/sync/skip/test_verify_skippables ... [2021-03-17 00:13:41.412040]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_verify_skippables.py', '-v'] ... [2021-03-17 00:13:41.412132]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 9 items
distributed/pipeline/sync/skip/test_verify_skippables.py::test_matching PASSED [ 11%]
distributed/pipeline/sync/skip/test_verify_skippables.py::test_stash_not_pop PASSED [ 22%]
distributed/pipeline/sync/skip/test_verify_skippables.py::test_pop_unknown PASSED [ 33%]
distributed/pipeline/sync/skip/test_verify_skippables.py::test_stash_again PASSED [ 44%]
distributed/pipeline/sync/skip/test_verify_skippables.py::test_pop_again PASSED [ 55%]
distributed/pipeline/sync/skip/test_verify_skippables.py::test_stash_pop_together_different_names PASSED [ 66%]
distributed/pipeline/sync/skip/test_verify_skippables.py::test_stash_pop_together_same_name PASSED [ 77%]
distributed/pipeline/sync/skip/test_verify_skippables.py::test_double_stash_pop PASSED [ 88%]
distributed/pipeline/sync/skip/test_verify_skippables.py::test_double_stash_pop_but_isolated PASSED [100%]
============================== 9 passed in 0.04s ===============================
Running distributed/pipeline/sync/test_balance ... [2021-03-17 00:13:42.632554]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/test_balance.py', '-v'] ... [2021-03-17 00:13:42.632649]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 15 items
distributed/pipeline/sync/test_balance.py::test_blockpartition PASSED [ 6%]
distributed/pipeline/sync/test_balance.py::test_blockpartition_zeros PASSED [ 13%]
distributed/pipeline/sync/test_balance.py::test_blockpartition_non_positive_partitions PASSED [ 20%]
distributed/pipeline/sync/test_balance.py::test_blockpartition_short_sequence PASSED [ 26%]
distributed/pipeline/sync/test_balance.py::test_balance_by_time[cpu] SKIPPED [ 33%]
distributed/pipeline/sync/test_balance.py::test_balance_by_time_loop_resets_input PASSED [ 40%]
distributed/pipeline/sync/test_balance.py::test_balance_by_size_latent SKIPPED [ 46%]
distributed/pipeline/sync/test_balance.py::test_balance_by_size_param SKIPPED [ 53%]
distributed/pipeline/sync/test_balance.py::test_balance_by_size_param_scale SKIPPED [ 60%]
distributed/pipeline/sync/test_balance.py::test_layerwise_sandbox[cpu] PASSED [ 66%]
distributed/pipeline/sync/test_balance.py::test_sandbox_during_profiling[cpu] PASSED [ 73%]
distributed/pipeline/sync/test_balance.py::test_not_training PASSED [ 80%]
distributed/pipeline/sync/test_balance.py::test_balance_by_time_tuple PASSED [ 86%]
distributed/pipeline/sync/test_balance.py::test_balance_by_size_tuple SKIPPED [ 93%]
distributed/pipeline/sync/test_balance.py::test_already_has_grad PASSED [100%]
======================== 10 passed, 5 skipped in 4.06s =========================
Running distributed/pipeline/sync/test_bugs ... [2021-03-17 00:13:47.873251]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/test_bugs.py', '-v'] ... [2021-03-17 00:13:47.873341]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 4 items
distributed/pipeline/sync/test_bugs.py::test_python_autograd_function PASSED [ 25%]
distributed/pipeline/sync/test_bugs.py::test_exception_no_hang PASSED [ 50%]
distributed/pipeline/sync/test_bugs.py::test_tuple_wait SKIPPED [ 75%]
distributed/pipeline/sync/test_bugs.py::test_parallel_randoms PASSED [100%]
========================= 3 passed, 1 skipped in 0.39s =========================
Running distributed/pipeline/sync/test_checkpoint ... [2021-03-17 00:13:49.445821]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/test_checkpoint.py', '-v'] ... [2021-03-17 00:13:49.445907]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 7 items
distributed/pipeline/sync/test_checkpoint.py::test_serial_checkpoints[cpu] PASSED [ 14%]
distributed/pipeline/sync/test_checkpoint.py::test_not_requires_grad PASSED [ 28%]
distributed/pipeline/sync/test_checkpoint.py::test_not_requires_grad_with_parameter PASSED [ 42%]
distributed/pipeline/sync/test_checkpoint.py::test_random_in_checkpoint[cpu] PASSED [ 57%]
distributed/pipeline/sync/test_checkpoint.py::test_detect_checkpointing_recomputing PASSED [ 71%]
distributed/pipeline/sync/test_checkpoint.py::test_detect_checkpointing_recomputing_without_checkpoint PASSED [ 85%]
distributed/pipeline/sync/test_checkpoint.py::test_non_grad_output PASSED [100%]
============================== 7 passed in 0.04s ===============================
Running distributed/pipeline/sync/test_copy ... [2021-03-17 00:13:50.684755]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/test_copy.py', '-v'] ... [2021-03-17 00:13:50.684842]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 5 items
distributed/pipeline/sync/test_copy.py::test_copy_wait_cpu_cpu PASSED [ 20%]
distributed/pipeline/sync/test_copy.py::test_copy_wait_cpu_cuda SKIPPED [ 40%]
distributed/pipeline/sync/test_copy.py::test_copy_wait_cuda_cpu SKIPPED [ 60%]
distributed/pipeline/sync/test_copy.py::test_copy_wait_cuda_cuda SKIPPED [ 80%]
distributed/pipeline/sync/test_copy.py::test_wait_multiple_tensors PASSED [100%]
========================= 2 passed, 3 skipped in 0.03s =========================
Running distributed/pipeline/sync/test_deferred_batch_norm ... [2021-03-17 00:13:51.912225]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/test_deferred_batch_norm.py', '-v'] ... [2021-03-17 00:13:51.912311]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 11 items
distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[True-1] PASSED [ 9%]
distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[True-4] PASSED [ 18%]
distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[False-1] PASSED [ 27%]
distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[False-4] PASSED [ 36%]
distributed/pipeline/sync/test_deferred_batch_norm.py::test_running_stats[0.1] PASSED [ 45%]
distributed/pipeline/sync/test_deferred_batch_norm.py::test_running_stats[None] PASSED [ 54%]
distributed/pipeline/sync/test_deferred_batch_norm.py::test_convert_deferred_batch_norm PASSED [ 63%]
distributed/pipeline/sync/test_deferred_batch_norm.py::test_eval PASSED [ 72%]
distributed/pipeline/sync/test_deferred_batch_norm.py::test_optimize PASSED [ 81%]
distributed/pipeline/sync/test_deferred_batch_norm.py::test_conv_bn PASSED [ 90%]
distributed/pipeline/sync/test_deferred_batch_norm.py::test_input_requiring_grad PASSED [100%]
============================== 11 passed in 0.92s ==============================
Running distributed/pipeline/sync/test_dependency ... [2021-03-17 00:13:54.021209]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/test_dependency.py', '-v'] ... [2021-03-17 00:13:54.021300]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 6 items
distributed/pipeline/sync/test_dependency.py::test_fork_join SKIPPED [ 16%]
distributed/pipeline/sync/test_dependency.py::test_fork_join_enable_grad PASSED [ 33%]
distributed/pipeline/sync/test_dependency.py::test_fork_join_no_grad PASSED [ 50%]
distributed/pipeline/sync/test_dependency.py::test_fork_leak PASSED [ 66%]
distributed/pipeline/sync/test_dependency.py::test_join_when_fork_not_requires_grad PASSED [ 83%]
distributed/pipeline/sync/test_dependency.py::test_join_when_fork_requires_grad PASSED [100%]
========================= 5 passed, 1 skipped in 0.04s =========================
Running distributed/pipeline/sync/test_inplace ... [2021-03-17 00:13:55.234898]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/test_inplace.py', '-v'] ... [2021-03-17 00:13:55.234987]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 3 items
distributed/pipeline/sync/test_inplace.py::test_inplace_on_requires_grad PASSED [ 33%]
distributed/pipeline/sync/test_inplace.py::test_inplace_on_not_requires_grad XFAIL [ 66%]
distributed/pipeline/sync/test_inplace.py::test_inplace_incorrect_grad XFAIL [100%]
========================= 1 passed, 2 xfailed in 0.48s =========================
Running distributed/pipeline/sync/test_microbatch ... [2021-03-17 00:13:56.914657]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/test_microbatch.py', '-v'] ... [2021-03-17 00:13:56.914741]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 10 items
distributed/pipeline/sync/test_microbatch.py::test_batch_atomic PASSED [ 10%]
distributed/pipeline/sync/test_microbatch.py::test_batch_non_atomic PASSED [ 20%]
distributed/pipeline/sync/test_microbatch.py::test_batch_call PASSED [ 30%]
distributed/pipeline/sync/test_microbatch.py::test_batch_setitem_by_index PASSED [ 40%]
distributed/pipeline/sync/test_microbatch.py::test_batch_setitem_by_slice PASSED [ 50%]
distributed/pipeline/sync/test_microbatch.py::test_check PASSED [ 60%]
distributed/pipeline/sync/test_microbatch.py::test_gather_tensors PASSED [ 70%]
distributed/pipeline/sync/test_microbatch.py::test_gather_tuples PASSED [ 80%]
distributed/pipeline/sync/test_microbatch.py::test_scatter_tensor PASSED [ 90%]
distributed/pipeline/sync/test_microbatch.py::test_scatter_tuple PASSED [100%]
============================== 10 passed in 0.05s ==============================
Running distributed/pipeline/sync/test_phony ... [2021-03-17 00:13:58.142395]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/test_phony.py', '-v'] ... [2021-03-17 00:13:58.142488]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 4 items
distributed/pipeline/sync/test_phony.py::test_phony_size PASSED [ 25%]
distributed/pipeline/sync/test_phony.py::test_phony_requires_grad PASSED [ 50%]
distributed/pipeline/sync/test_phony.py::test_cached_phony PASSED [ 75%]
distributed/pipeline/sync/test_phony.py::test_phony_in_autograd_function PASSED [100%]
============================== 4 passed in 0.03s ===============================
Running distributed/pipeline/sync/test_pipe ... [2021-03-17 00:13:59.377725]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/test_pipe.py', '-v'] ... [2021-03-17 00:13:59.377811]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 36 items
distributed/pipeline/sync/test_pipe.py::test_parameters PASSED [ 2%]
distributed/pipeline/sync/test_pipe.py::test_public_attrs PASSED [ 5%]
distributed/pipeline/sync/test_pipe.py::test_sequential_like PASSED [ 8%]
distributed/pipeline/sync/test_pipe.py::test_chunks_less_than_1 PASSED [ 11%]
distributed/pipeline/sync/test_pipe.py::test_batch_size_indivisible PASSED [ 13%]
distributed/pipeline/sync/test_pipe.py::test_batch_size_small PASSED [ 16%]
distributed/pipeline/sync/test_pipe.py::test_checkpoint_mode PASSED [ 19%]
distributed/pipeline/sync/test_pipe.py::test_checkpoint_mode_invalid PASSED [ 22%]
distributed/pipeline/sync/test_pipe.py::test_checkpoint_mode_when_chunks_1 PASSED [ 25%]
distributed/pipeline/sync/test_pipe.py::test_checkpoint_eval PASSED [ 27%]
distributed/pipeline/sync/test_pipe.py::test_checkpoint_non_float_input PASSED [ 30%]
distributed/pipeline/sync/test_pipe.py::test_no_grad PASSED [ 33%]
distributed/pipeline/sync/test_pipe.py::test_exception PASSED [ 36%]
distributed/pipeline/sync/test_pipe.py::test_exception_early_stop_asap PASSED [ 38%]
distributed/pipeline/sync/test_pipe.py::test_nested_input PASSED [ 41%]
distributed/pipeline/sync/test_pipe.py::test_input_pair PASSED [ 44%]
distributed/pipeline/sync/test_pipe.py::test_input_singleton PASSED [ 47%]
distributed/pipeline/sync/test_pipe.py::test_input_varargs PASSED [ 50%]
distributed/pipeline/sync/test_pipe.py::test_non_tensor PASSED [ 52%]
distributed/pipeline/sync/test_pipe.py::test_non_tensor_sequence PASSED [ 55%]
distributed/pipeline/sync/test_pipe.py::test_deferred_batch_norm[never] PASSED [ 58%]
distributed/pipeline/sync/test_pipe.py::test_deferred_batch_norm[always] PASSED [ 61%]
distributed/pipeline/sync/test_pipe.py::test_deferred_batch_norm[except_last] PASSED [ 63%]
distributed/pipeline/sync/test_pipe.py::test_deferred_batch_norm_params[never] PASSED [ 66%]
distributed/pipeline/sync/test_pipe.py::test_deferred_batch_norm_params[always] PASSED [ 69%]
distributed/pipeline/sync/test_pipe.py::test_devices PASSED [ 72%]
distributed/pipeline/sync/test_pipe.py::test_partitions PASSED [ 75%]
distributed/pipeline/sync/test_pipe.py::test_deny_moving PASSED [ 77%]
distributed/pipeline/sync/test_pipe.py::test_empty_module PASSED [ 80%]
distributed/pipeline/sync/test_pipe.py::test_named_children PASSED [ 83%]
distributed/pipeline/sync/test_pipe.py::test_verify_module_non_sequential PASSED [ 86%]
distributed/pipeline/sync/test_pipe.py::test_verify_module_duplicate_children PASSED [ 88%]
distributed/pipeline/sync/test_pipe.py::test_verify_module_params_on_same_device SKIPPED [ 91%]
distributed/pipeline/sync/test_pipe.py::test_verify_nested_modules SKIPPED [ 94%]
distributed/pipeline/sync/test_pipe.py::test_verify_module_duplicate_parameters_on_same_device PASSED [ 97%]
distributed/pipeline/sync/test_pipe.py::test_forward_lockstep PASSED [100%]
======================== 34 passed, 2 skipped in 1.50s =========================
Running distributed/pipeline/sync/test_pipeline ... [2021-03-17 00:14:02.063616]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/test_pipeline.py', '-v'] ... [2021-03-17 00:14:02.063703]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 1 item
distributed/pipeline/sync/test_pipeline.py::test_clock_cycles PASSED [100%]
============================== 1 passed in 0.02s ===============================
Running distributed/pipeline/sync/test_stream ... [2021-03-17 00:14:03.259861]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/test_stream.py', '-v'] ... [2021-03-17 00:14:03.259945]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 19 items
distributed/pipeline/sync/test_stream.py::TestNewStream::test_new_stream_cpu PASSED [ 5%]
distributed/pipeline/sync/test_stream.py::TestNewStream::test_new_stream_cuda SKIPPED [ 10%]
distributed/pipeline/sync/test_stream.py::TestCurrentStream::test_current_stream_cpu PASSED [ 15%]
distributed/pipeline/sync/test_stream.py::TestCurrentStream::test_current_stream_cuda SKIPPED [ 21%]
distributed/pipeline/sync/test_stream.py::TestDefaultStream::test_default_stream_cpu PASSED [ 26%]
distributed/pipeline/sync/test_stream.py::TestDefaultStream::test_default_stream_cuda SKIPPED [ 31%]
distributed/pipeline/sync/test_stream.py::TestUseDevice::test_use_device_cpu PASSED [ 36%]
distributed/pipeline/sync/test_stream.py::TestUseDevice::test_use_device_cuda SKIPPED [ 42%]
distributed/pipeline/sync/test_stream.py::TestUseStream::test_use_stream_cpu PASSED [ 47%]
distributed/pipeline/sync/test_stream.py::TestUseStream::test_use_stream_cuda SKIPPED [ 52%]
distributed/pipeline/sync/test_stream.py::TestGetDevice::test_get_device_cpu PASSED [ 57%]
distributed/pipeline/sync/test_stream.py::TestGetDevice::test_get_device_cuda SKIPPED [ 63%]
distributed/pipeline/sync/test_stream.py::TestWaitStream::test_wait_stream_cpu_cpu PASSED [ 68%]
distributed/pipeline/sync/test_stream.py::TestWaitStream::test_wait_stream_cpu_cuda SKIPPED [ 73%]
distributed/pipeline/sync/test_stream.py::TestWaitStream::test_wait_stream_cuda_cpu SKIPPED [ 78%]
distributed/pipeline/sync/test_stream.py::TestWaitStream::test_wait_stream_cuda_cuda SKIPPED [ 84%]
distributed/pipeline/sync/test_stream.py::TestRecordStream::test_record_stream_cpu PASSED [ 89%]
distributed/pipeline/sync/test_stream.py::TestRecordStream::test_record_stream_cuda SKIPPED [ 94%]
distributed/pipeline/sync/test_stream.py::TestRecordStream::test_record_stream_shifted_view SKIPPED [100%]
======================== 8 passed, 11 skipped in 0.06s =========================
Running distributed/pipeline/sync/test_transparency ... [2021-03-17 00:14:04.502637]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/test_transparency.py', '-v'] ... [2021-03-17 00:14:04.502725]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 1 item
distributed/pipeline/sync/test_transparency.py::test_simple_linears PASSED [100%]
============================== 1 passed in 0.18s ===============================
Running distributed/pipeline/sync/test_worker ... [2021-03-17 00:14:05.873983]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', '-m', 'pytest', 'distributed/pipeline/sync/test_worker.py', '-v'] ... [2021-03-17 00:14:05.874071]
============================= test session starts ==============================
platform linux -- Python 3.8.6, pytest-6.1.1, py-1.9.0, pluggy-0.13.1 -- /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch/test/.hypothesis/examples')
torch: 1.8.0
rootdir: /tmp/vsc40023/easybuild_build/PyTorch/1.8.0/foss-2020b/pytorch
plugins: hypothesis-5.41.5
collecting ... collected 8 items
distributed/pipeline/sync/test_worker.py::test_join_running_workers PASSED [ 12%]
distributed/pipeline/sync/test_worker.py::test_join_running_workers_with_exception PASSED [ 25%]
distributed/pipeline/sync/test_worker.py::test_compute_multithreading PASSED [ 37%]
distributed/pipeline/sync/test_worker.py::test_compute_success PASSED [ 50%]
distributed/pipeline/sync/test_worker.py::test_compute_exception PASSED [ 62%]
distributed/pipeline/sync/test_worker.py::test_grad_mode[True] PASSED [ 75%]
distributed/pipeline/sync/test_worker.py::test_grad_mode[False] PASSED [ 87%]
distributed/pipeline/sync/test_worker.py::test_worker_per_device PASSED [100%]
============================== 8 passed in 0.25s ===============================
Running distributed/optim/test_zero_redundancy_optimizer ... [2021-03-17 00:14:07.329083]
Executing ['/user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python', 'distributed/optim/test_zero_redundancy_optimizer.py', '-v'] ... [2021-03-17 00:14:07.329182]
test_add_param_group (__main__.TestZeroRedundancyOptimizerDistributed)
Check that ZeroRedundancyOptimizer properly handles adding a new param_group a posteriori, ... ok
test_collect_shards (__main__.TestZeroRedundancyOptimizerDistributed)
Check the state consolidation mechanism, and the state dict exposed by ZeroRedundancyOptimizer ... ok
test_multiple_groups (__main__.TestZeroRedundancyOptimizerDistributed)
Check that the ZeroRedundancyOptimizer handles working with multiple process groups ... ok
test_pytorch_parity (__main__.TestZeroRedundancyOptimizerDistributed)
When combined with DDP, check that ZeroRedundancyOptimizer(optimizer) and the same monolithic optimizer ... skipped 'CUDA is not available.'
test_sharding (__main__.TestZeroRedundancyOptimizerDistributed)
Check the sharding at construction time ... ok
test_step (__main__.TestZeroRedundancyOptimizerDistributed)
Check that the ZeroRedundancyOptimizer wrapper properly exposes the `.step()` interface ... ok
test_step_with_closure (__main__.TestZeroRedundancyOptimizerDistributed)
Check that the ZeroRedundancyOptimizer wrapper properly exposes the `.step(closure)` interface ... ok
test_implicit_local_state_dict (__main__.TestZeroRedundancyOptimizerSingleRank)
Check that it's possible to pull a local state dict ... WARNING:root:Optimizer state has not been consolidated. Returning the local state
WARNING:root:Please call `consolidate_state_dict()` beforehand if you meant to save the global state
ok
test_local_state_dict (__main__.TestZeroRedundancyOptimizerSingleRank)
Check that it's possible to pull a local state dict ... ok
test_lr_scheduler (__main__.TestZeroRedundancyOptimizerSingleRank)
Check that a normal torch lr_scheduler is usable with ZeroRedundancyOptimizer ... ok
test_state_dict (__main__.TestZeroRedundancyOptimizerSingleRank)
Check that the ZeroRedundancyOptimizer exposes the expected state dict interface, ... ok
test_step_with_extra_inner_key (__main__.TestZeroRedundancyOptimizerSingleRank)
Check that an optimizer adding extra keys to the param_groups ... ok
test_step_with_kwargs (__main__.TestZeroRedundancyOptimizerSingleRank)
Check that the `step(**kwargs)` interface is properly exposed ... ok
test_step_without_closure (__main__.TestZeroRedundancyOptimizerSingleRank)
Check that the step() method (without closure) is handlded as expected ... ok
test_zero_grad (__main__.TestZeroRedundancyOptimizerSingleRank)
Check that the zero_grad attribute is properly handled ... Timing out after 3000 seconds and killing subprocesses.
ERROR
======================================================================
ERROR: test_zero_grad (__main__.TestZeroRedundancyOptimizerSingleRank)
Check that the zero_grad attribute is properly handled
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/eb-90382g2c/tmphcynp_ai/lib/python3.8/site-packages/torch/testing/_internal/common_distributed.py", line 282, in wrapper
    self._join_processes(fn)
  File "/tmp/eb-90382g2c/tmphcynp_ai/lib/python3.8/site-packages/torch/testing/_internal/common_distributed.py", line 399, in _join_processes
    self._check_return_codes(elapsed_time)
  File "/tmp/eb-90382g2c/tmphcynp_ai/lib/python3.8/site-packages/torch/testing/_internal/common_distributed.py", line 440, in _check_return_codes
    raise RuntimeError('Process {} terminated or timed out after {} seconds'.format(i, elapsed_time))
RuntimeError: Process 0 terminated or timed out after 3000.0502138137817 seconds
----------------------------------------------------------------------
Ran 15 tests in 3013.415s
FAILED (errors=1, skipped=1)
Traceback (most recent call last):
  File "run_test.py", line 926, in <module>
    main()
  File "run_test.py", line 905, in main
    raise RuntimeError(err_message)
RuntimeError: distributed/optim/test_zero_redundancy_optimizer failed!
(at easybuild/easybuild-framework/easybuild/tools/run.py:537 in parse_cmd_output)
== 2021-03-17 01:04:21,917 config.py:586 DEBUG software install path as specified by 'installpath' and 'subdir_software': /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software
== 2021-03-17 01:04:21,917 filetools.py:1785 INFO Removing lock /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/.locks/_user_gent_400_vsc40023_eb_arcaninescratch_CO7_skylake-ib_software_PyTorch_1.8.0-foss-2020b.lock...
== 2021-03-17 01:04:21,920 filetools.py:341 INFO Path /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/.locks/_user_gent_400_vsc40023_eb_arcaninescratch_CO7_skylake-ib_software_PyTorch_1.8.0-foss-2020b.lock successfully removed.
== 2021-03-17 01:04:21,920 filetools.py:1789 INFO Lock removed: /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/.locks/_user_gent_400_vsc40023_eb_arcaninescratch_CO7_skylake-ib_software_PyTorch_1.8.0-foss-2020b.lock
== 2021-03-17 01:04:21,920 easyblock.py:3389 WARNING build failed (first 300 chars): cmd "export PYTHONPATH=/tmp/eb-90382g2c/tmphcynp_ai/lib/python3.8/site-packages:$PYTHONPATH && cd test && PYTHONUNBUFFERED=1 /user/gent/400/vsc40023/eb_arcaninescratch/CO7/skylake-ib/software/Python/3.8.6-GCCcore-10.2.0/bin/python run_test.py --verbose -x distributed/rpc/test_process_group_agent te
== 2021-03-17 01:04:21,920 easyblock.py:298 INFO Closing log for application name PyTorch version 1.8.0