Gist: yukunlin/95a1036dba1c3a677f8f130e6cf23fbf
/usr/local/lib/python3.8/dist-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions
  warnings.warn(
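The warning above says torchrun sets `--use_env` by default, so a launched script should read its local rank from the environment rather than from an `--local_rank` argparse flag. A minimal sketch of that change (the fallback default of 0 is an assumption, not something from the log):

```python
import os

def get_local_rank(default=0):
    """Return the per-node worker rank that torchrun exports as LOCAL_RANK."""
    return int(os.environ.get("LOCAL_RANK", default))
```

Inside the training script this value is typically used to pick the CUDA device, e.g. `torch.cuda.set_device(get_local_rank())`.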
WARNING:torch.distributed.run:
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
INFO:torch.distributed.launcher.api:Starting elastic_operator with launch configs:
  entrypoint       : fairseq_train_wrapped
  min_nodes        : 2
  max_nodes        : 2
  nproc_per_node   : 8
  run_id           : none
  rdzv_backend     : static
  rdzv_endpoint    : 10.0.0.115:12347
  rdzv_configs     : {'rank': 0, 'timeout': 900}
  max_restarts     : 0
  monitor_interval : 5
  log_dir          : None
  metrics_cfg      : {}
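A launch command consistent with the configs above would look roughly like the following. This is a hypothetical reconstruction: the node rank, rendezvous endpoint, and process counts come from the log, but the actual arguments passed to `fairseq_train_wrapped` are not shown in the log and are left elided.

```shell
# Node 0 of 2; the other node uses --node_rank=1 with the same endpoint.
python -m torch.distributed.launch \
    --nnodes=2 --nproc_per_node=8 \
    --node_rank=0 \
    --master_addr=10.0.0.115 --master_port=12347 \
    fairseq_train_wrapped ...
```

Per the deprecation warning earlier, the same run could be expressed with `torchrun --nnodes=2 --nproc_per_node=8 --node_rank=0 --master_addr=10.0.0.115 --master_port=12347 ...`.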
INFO:torch.distributed.elastic.agent.server.local_elastic_agent:log directory set to: /tmp/torchelastic_mxzxvmjr/none_lwbywd8h
INFO:torch.distributed.elastic.agent.server.api:[default] starting workers for entrypoint: python
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:
  restart_count=0
  master_addr=10.0.0.115
  master_port=12347
  group_rank=0
  group_world_size=2
  local_ranks=[0, 1, 2, 3, 4, 5, 6, 7]
  role_ranks=[0, 1, 2, 3, 4, 5, 6, 7]
  global_ranks=[0, 1, 2, 3, 4, 5, 6, 7]
  role_world_sizes=[16, 16, 16, 16, 16, 16, 16, 16]
  global_world_sizes=[16, 16, 16, 16, 16, 16, 16, 16]
INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_mxzxvmjr/none_lwbywd8h/attempt_0/0/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker1 reply file to: /tmp/torchelastic_mxzxvmjr/none_lwbywd8h/attempt_0/1/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker2 reply file to: /tmp/torchelastic_mxzxvmjr/none_lwbywd8h/attempt_0/2/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker3 reply file to: /tmp/torchelastic_mxzxvmjr/none_lwbywd8h/attempt_0/3/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker4 reply file to: /tmp/torchelastic_mxzxvmjr/none_lwbywd8h/attempt_0/4/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker5 reply file to: /tmp/torchelastic_mxzxvmjr/none_lwbywd8h/attempt_0/5/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker6 reply file to: /tmp/torchelastic_mxzxvmjr/none_lwbywd8h/attempt_0/6/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker7 reply file to: /tmp/torchelastic_mxzxvmjr/none_lwbywd8h/attempt_0/7/error.json
2022-04-18 23:34:06 | WARNING | root | Pytorch pre-release version 1.10.0a0+git36449ea - assuming intent to test it
2022-04-18 23:34:06 | WARNING | root | Pytorch pre-release version 1.10.0a0+git36449ea - assuming intent to test it
2022-04-18 23:34:06 | WARNING | root | Pytorch pre-release version 1.10.0a0+git36449ea - assuming intent to test it
2022-04-18 23:34:06 | WARNING | root | Pytorch pre-release version 1.10.0a0+git36449ea - assuming intent to test it
2022-04-18 23:34:06 | WARNING | root | Pytorch pre-release version 1.10.0a0+git36449ea - assuming intent to test it
2022-04-18 23:34:06 | WARNING | root | Pytorch pre-release version 1.10.0a0+git36449ea - assuming intent to test it
2022-04-18 23:34:06 | WARNING | root | Pytorch pre-release version 1.10.0a0+git36449ea - assuming intent to test it
2022-04-18 23:34:06 | WARNING | root | Pytorch pre-release version 1.10.0a0+git36449ea - assuming intent to test it
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | distributed init (rank 5): env://
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 5
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | distributed init (rank 0): env://
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 0
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | distributed init (rank 6): env://
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 6
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | distributed init (rank 7): env://
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | distributed init (rank 4): env://
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 7
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 4
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | distributed init (rank 3): env://
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 3
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | distributed init (rank 1): env://
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | distributed init (rank 2): env://
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 1
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 2
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 16 nodes.
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Rank 6: Completed store-based barrier for key:store_based_barrier_key:1 with 16 nodes.
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 16 nodes.
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Rank 5: Completed store-based barrier for key:store_based_barrier_key:1 with 16 nodes.
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Rank 7: Completed store-based barrier for key:store_based_barrier_key:1 with 16 nodes.
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 16 nodes.
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Rank 4: Completed store-based barrier for key:store_based_barrier_key:1 with 16 nodes.
2022-04-18 23:34:08 | INFO | torch.distributed.distributed_c10d | Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 16 nodes.
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | initialized host ip-10-0-0-115 as rank 0
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | initialized host ip-10-0-0-115 as rank 7
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | initialized host ip-10-0-0-115 as rank 1
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | initialized host ip-10-0-0-115 as rank 3
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | initialized host ip-10-0-0-115 as rank 6
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | initialized host ip-10-0-0-115 as rank 4
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | initialized host ip-10-0-0-115 as rank 5
2022-04-18 23:34:08 | INFO | fairseq.distributed.utils | initialized host ip-10-0-0-115 as rank 2
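The "distributed init (rank N): env://" lines mean each worker builds its process group from the rendezvous variables torchrun exports (MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE). A minimal sketch of that handshake follows; the helper names are hypothetical, and the actual `init_process_group` call is kept in a separate function since it blocks until all peers join:

```python
import os

def read_dist_env():
    """Collect the rendezvous settings torchrun exports for env:// initialization."""
    return {
        "master_addr": os.environ["MASTER_ADDR"],
        "master_port": int(os.environ["MASTER_PORT"]),
        "rank": int(os.environ["RANK"]),
        "world_size": int(os.environ["WORLD_SIZE"]),
    }

def init_process_group_from_env():
    """Join the NCCL process group described by the environment (requires torch + peers)."""
    import torch.distributed as dist
    env = read_dist_env()
    dist.init_process_group(backend="nccl", init_method="env://")
    return env["rank"], env["world_size"]
```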
ip-10-0-0-115:73:73 [0] NCCL INFO Bootstrap : Using ens5:10.0.0.115<0>
ip-10-0-0-115:73:73 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v4 symbol.
ip-10-0-0-115:73:73 [0] NCCL INFO NET/OFI Using aws-ofi-nccl 1.2.0aws
ip-10-0-0-115:73:73 [0] NCCL INFO NET/OFI Setting FI_EFA_FORK_SAFE environment variable to 1
ip-10-0-0-115:73:73 [0] ofi_init:1157 NCCL WARN NET/OFI Only EFA provider is supported
ip-10-0-0-115:73:73 [0] ofi_init:1208 NCCL WARN NET/OFI aws-ofi-nccl initialization failed
ip-10-0-0-115:73:73 [0] NCCL INFO NET/IB : No device found.
ip-10-0-0-115:73:73 [0] NCCL INFO NET/Socket : Using [0]ens5:10.0.0.115<0>
ip-10-0-0-115:73:73 [0] NCCL INFO Using network Socket
NCCL version 2.10.3+cuda11.3
ip-10-0-0-115:80:80 [7] NCCL INFO Bootstrap : Using ens5:10.0.0.115<0>
ip-10-0-0-115:77:77 [4] NCCL INFO Bootstrap : Using ens5:10.0.0.115<0>
ip-10-0-0-115:74:74 [1] NCCL INFO Bootstrap : Using ens5:10.0.0.115<0>
ip-10-0-0-115:78:78 [5] NCCL INFO Bootstrap : Using ens5:10.0.0.115<0>
ip-10-0-0-115:74:74 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v4 symbol.
ip-10-0-0-115:80:80 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v4 symbol.
ip-10-0-0-115:77:77 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v4 symbol.
ip-10-0-0-115:74:74 [1] NCCL INFO NET/OFI Using aws-ofi-nccl 1.2.0aws
ip-10-0-0-115:80:80 [7] NCCL INFO NET/OFI Using aws-ofi-nccl 1.2.0aws
ip-10-0-0-115:77:77 [4] NCCL INFO NET/OFI Using aws-ofi-nccl 1.2.0aws
ip-10-0-0-115:77:77 [4] NCCL INFO NET/OFI Setting FI_EFA_FORK_SAFE environment variable to 1
ip-10-0-0-115:74:74 [1] NCCL INFO NET/OFI Setting FI_EFA_FORK_SAFE environment variable to 1
ip-10-0-0-115:78:78 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v4 symbol.
ip-10-0-0-115:80:80 [7] NCCL INFO NET/OFI Setting FI_EFA_FORK_SAFE environment variable to 1
ip-10-0-0-115:78:78 [5] NCCL INFO NET/OFI Using aws-ofi-nccl 1.2.0aws
ip-10-0-0-115:78:78 [5] NCCL INFO NET/OFI Setting FI_EFA_FORK_SAFE environment variable to 1
ip-10-0-0-115:79:79 [6] NCCL INFO Bootstrap : Using ens5:10.0.0.115<0>
ip-10-0-0-115:79:79 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v4 symbol.
ip-10-0-0-115:79:79 [6] NCCL INFO NET/OFI Using aws-ofi-nccl 1.2.0aws
ip-10-0-0-115:79:79 [6] NCCL INFO NET/OFI Setting FI_EFA_FORK_SAFE environment variable to 1
ip-10-0-0-115:76:76 [3] NCCL INFO Bootstrap : Using ens5:10.0.0.115<0>
ip-10-0-0-115:74:74 [1] ofi_init:1157 NCCL WARN NET/OFI Only EFA provider is supported
ip-10-0-0-115:77:77 [4] ofi_init:1157 NCCL WARN NET/OFI Only EFA provider is supported
ip-10-0-0-115:80:80 [7] ofi_init:1157 NCCL WARN NET/OFI Only EFA provider is supported
ip-10-0-0-115:78:78 [5] ofi_init:1157 NCCL WARN NET/OFI Only EFA provider is supported
ip-10-0-0-115:74:74 [1] ofi_init:1208 NCCL WARN NET/OFI aws-ofi-nccl initialization failed
ip-10-0-0-115:80:80 [7] ofi_init:1208 NCCL WARN NET/OFI aws-ofi-nccl initialization failed
ip-10-0-0-115:77:77 [4] ofi_init:1208 NCCL WARN NET/OFI aws-ofi-nccl initialization failed
ip-10-0-0-115:78:78 [5] ofi_init:1208 NCCL WARN NET/OFI aws-ofi-nccl initialization failed
ip-10-0-0-115:80:80 [7] NCCL INFO NET/IB : No device found.
ip-10-0-0-115:77:77 [4] NCCL INFO NET/IB : No device found.
ip-10-0-0-115:78:78 [5] NCCL INFO NET/IB : No device found.
ip-10-0-0-115:74:74 [1] NCCL INFO NET/IB : No device found.
ip-10-0-0-115:76:76 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v4 symbol.
ip-10-0-0-115:76:76 [3] NCCL INFO NET/OFI Using aws-ofi-nccl 1.2.0aws
ip-10-0-0-115:76:76 [3] NCCL INFO NET/OFI Setting FI_EFA_FORK_SAFE environment variable to 1
ip-10-0-0-115:80:80 [7] NCCL INFO NET/Socket : Using [0]ens5:10.0.0.115<0>
ip-10-0-0-115:74:74 [1] NCCL INFO NET/Socket : Using [0]ens5:10.0.0.115<0>
ip-10-0-0-115:78:78 [5] NCCL INFO NET/Socket : Using [0]ens5:10.0.0.115<0>
ip-10-0-0-115:80:80 [7] NCCL INFO Using network Socket
ip-10-0-0-115:77:77 [4] NCCL INFO NET/Socket : Using [0]ens5:10.0.0.115<0>
ip-10-0-0-115:74:74 [1] NCCL INFO Using network Socket
ip-10-0-0-115:78:78 [5] NCCL INFO Using network Socket
ip-10-0-0-115:77:77 [4] NCCL INFO Using network Socket
ip-10-0-0-115:79:79 [6] ofi_init:1157 NCCL WARN NET/OFI Only EFA provider is supported
ip-10-0-0-115:79:79 [6] ofi_init:1208 NCCL WARN NET/OFI aws-ofi-nccl initialization failed
ip-10-0-0-115:79:79 [6] NCCL INFO NET/IB : No device found.
ip-10-0-0-115:79:79 [6] NCCL INFO NET/Socket : Using [0]ens5:10.0.0.115<0>
ip-10-0-0-115:79:79 [6] NCCL INFO Using network Socket
ip-10-0-0-115:76:76 [3] ofi_init:1157 NCCL WARN NET/OFI Only EFA provider is supported
ip-10-0-0-115:76:76 [3] ofi_init:1208 NCCL WARN NET/OFI aws-ofi-nccl initialization failed
ip-10-0-0-115:76:76 [3] NCCL INFO NET/IB : No device found.
ip-10-0-0-115:76:76 [3] NCCL INFO NET/Socket : Using [0]ens5:10.0.0.115<0>
ip-10-0-0-115:76:76 [3] NCCL INFO Using network Socket
ip-10-0-0-115:75:75 [2] NCCL INFO Bootstrap : Using ens5:10.0.0.115<0>
ip-10-0-0-115:75:75 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v4 symbol.
ip-10-0-0-115:75:75 [2] NCCL INFO NET/OFI Using aws-ofi-nccl 1.2.0aws
ip-10-0-0-115:75:75 [2] NCCL INFO NET/OFI Setting FI_EFA_FORK_SAFE environment variable to 1
ip-10-0-0-115:75:75 [2] ofi_init:1157 NCCL WARN NET/OFI Only EFA provider is supported
ip-10-0-0-115:75:75 [2] ofi_init:1208 NCCL WARN NET/OFI aws-ofi-nccl initialization failed
ip-10-0-0-115:75:75 [2] NCCL INFO NET/IB : No device found.
ip-10-0-0-115:75:75 [2] NCCL INFO NET/Socket : Using [0]ens5:10.0.0.115<0>
ip-10-0-0-115:75:75 [2] NCCL INFO Using network Socket
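The `ofi_init` warnings and the "Using network Socket" lines above show aws-ofi-nccl failing to find a libfabric EFA provider, so NCCL fell back to plain TCP sockets over ens5 for inter-node traffic. A hypothetical way to confirm whether EFA is actually available on the instance (assumes libfabric's `fi_info` utility is installed, as it is in the aws-ofi-nccl setup docs):

```shell
# If this prints no "provider: efa" entries, aws-ofi-nccl cannot initialize
# and NCCL falls back to the slower socket transport, as seen in this log.
fi_info -p efa
```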
ip-10-0-0-115:73:130 [0] NCCL INFO Channel 00/02 : 0 3 2 1 5 6 7 4 8 11 10 9 13 14 15 12
ip-10-0-0-115:73:130 [0] NCCL INFO Channel 01/02 : 0 3 2 1 5 6 7 4 8 11 10 9 13 14 15 12
ip-10-0-0-115:73:130 [0] NCCL INFO Trees [0] 3/8/-1->0->-1 [1] 3/-1/-1->0->8
ip-10-0-0-115:74:132 [1] NCCL INFO Trees [0] 5/-1/-1->1->2 [1] 5/-1/-1->1->2
ip-10-0-0-115:75:137 [2] NCCL INFO Trees [0] 1/-1/-1->2->3 [1] 1/-1/-1->2->3
ip-10-0-0-115:76:136 [3] NCCL INFO Trees [0] 2/-1/-1->3->0 [1] 2/-1/-1->3->0
ip-10-0-0-115:77:133 [4] NCCL INFO Trees [0] -1/-1/-1->4->7 [1] -1/-1/-1->4->7
ip-10-0-0-115:78:134 [5] NCCL INFO Trees [0] 6/-1/-1->5->1 [1] 6/-1/-1->5->1
ip-10-0-0-115:79:135 [6] NCCL INFO Trees [0] 7/-1/-1->6->5 [1] 7/-1/-1->6->5
ip-10-0-0-115:80:131 [7] NCCL INFO Trees [0] 4/-1/-1->7->6 [1] 4/-1/-1->7->6
ip-10-0-0-115:73:130 [0] NCCL INFO Channel 00 : 0[160] -> 3[190] via P2P/IPC
ip-10-0-0-115:73:130 [0] NCCL INFO Channel 01 : 0[160] -> 3[190] via P2P/IPC
ip-10-0-0-115:74:132 [1] NCCL INFO Channel 00 : 1[170] -> 5[1b0] via P2P/IPC
ip-10-0-0-115:78:134 [5] NCCL INFO Channel 00 : 5[1b0] -> 6[1c0] via P2P/IPC
ip-10-0-0-115:74:132 [1] NCCL INFO Channel 01 : 1[170] -> 5[1b0] via P2P/IPC
ip-10-0-0-115:78:134 [5] NCCL INFO Channel 01 : 5[1b0] -> 6[1c0] via P2P/IPC
ip-10-0-0-115:75:137 [2] NCCL INFO Channel 00 : 2[180] -> 1[170] via P2P/IPC
ip-10-0-0-115:79:135 [6] NCCL INFO Channel 00 : 6[1c0] -> 7[1d0] via P2P/IPC
ip-10-0-0-115:75:137 [2] NCCL INFO Channel 01 : 2[180] -> 1[170] via P2P/IPC
ip-10-0-0-115:79:135 [6] NCCL INFO Channel 01 : 6[1c0] -> 7[1d0] via P2P/IPC
ip-10-0-0-115:76:136 [3] NCCL INFO Channel 00 : 3[190] -> 2[180] via P2P/IPC
ip-10-0-0-115:77:133 [4] NCCL INFO Channel 00 : 4[1a0] -> 8[160] [send] via NET/Socket/0
ip-10-0-0-115:76:136 [3] NCCL INFO Channel 01 : 3[190] -> 2[180] via P2P/IPC
ip-10-0-0-115:76:136 [3] NCCL INFO Connected all rings
ip-10-0-0-115:80:131 [7] NCCL INFO Channel 00 : 7[1d0] -> 4[1a0] via P2P/IPC
ip-10-0-0-115:79:135 [6] NCCL INFO Connected all rings
ip-10-0-0-115:80:131 [7] NCCL INFO Channel 01 : 7[1d0] -> 4[1a0] via P2P/IPC
ip-10-0-0-115:73:130 [0] NCCL INFO Channel 00 : 12[1a0] -> 0[160] [receive] via NET/Socket/0
ip-10-0-0-115:73:130 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
ip-10-0-0-115:78:134 [5] NCCL INFO Connected all rings
ip-10-0-0-115:77:133 [4] NCCL INFO Channel 01 : 4[1a0] -> 8[160] [send] via NET/Socket/0
ip-10-0-0-115:79:135 [6] NCCL INFO Channel 00 : 6[1c0] -> 5[1b0] via P2P/IPC
ip-10-0-0-115:78:134 [5] NCCL INFO Channel 00 : 5[1b0] -> 1[170] via P2P/IPC
ip-10-0-0-115:79:135 [6] NCCL INFO Channel 01 : 6[1c0] -> 5[1b0] via P2P/IPC
ip-10-0-0-115:78:134 [5] NCCL INFO Channel 01 : 5[1b0] -> 1[170] via P2P/IPC
ip-10-0-0-115:74:132 [1] NCCL INFO Connected all rings
ip-10-0-0-115:74:132 [1] NCCL INFO Channel 00 : 1[170] -> 2[180] via P2P/IPC
ip-10-0-0-115:75:137 [2] NCCL INFO Connected all rings
ip-10-0-0-115:74:132 [1] NCCL INFO Channel 01 : 1[170] -> 2[180] via P2P/IPC
ip-10-0-0-115:73:130 [0] NCCL INFO Channel 01 : 12[1a0] -> 0[160] [receive] via NET/Socket/0
ip-10-0-0-115:75:137 [2] NCCL INFO Channel 00 : 2[180] -> 3[190] via P2P/IPC
ip-10-0-0-115:73:130 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
ip-10-0-0-115:75:137 [2] NCCL INFO Channel 01 : 2[180] -> 3[190] via P2P/IPC
ip-10-0-0-115:76:136 [3] NCCL INFO Channel 00 : 3[190] -> 0[160] via P2P/IPC
ip-10-0-0-115:75:137 [2] NCCL INFO Connected all trees
ip-10-0-0-115:75:137 [2] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:75:137 [2] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:76:136 [3] NCCL INFO Channel 01 : 3[190] -> 0[160] via P2P/IPC
ip-10-0-0-115:75:137 [2] NCCL INFO Channel 00 : 2[180] -> 4[1a0] via P2P/indirect/0[160]
ip-10-0-0-115:74:132 [1] NCCL INFO Connected all trees
ip-10-0-0-115:74:132 [1] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:74:132 [1] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:73:130 [0] NCCL INFO Connected all rings
ip-10-0-0-115:74:132 [1] NCCL INFO Channel 01 : 1[170] -> 4[1a0] via P2P/indirect/0[160]
ip-10-0-0-115:78:134 [5] NCCL INFO Connected all trees
ip-10-0-0-115:78:134 [5] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:78:134 [5] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:77:133 [4] NCCL INFO Connected all rings
ip-10-0-0-115:80:131 [7] NCCL INFO Connected all rings
ip-10-0-0-115:77:133 [4] NCCL INFO Channel 00 : 4[1a0] -> 7[1d0] via P2P/IPC
ip-10-0-0-115:77:133 [4] NCCL INFO Channel 01 : 4[1a0] -> 7[1d0] via P2P/IPC
ip-10-0-0-115:77:133 [4] NCCL INFO Connected all trees
ip-10-0-0-115:77:133 [4] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:77:133 [4] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:80:131 [7] NCCL INFO Channel 00 : 7[1d0] -> 6[1c0] via P2P/IPC
ip-10-0-0-115:80:131 [7] NCCL INFO Channel 01 : 7[1d0] -> 6[1c0] via P2P/IPC
ip-10-0-0-115:73:130 [0] NCCL INFO Channel 00 : 8[160] -> 0[160] [receive] via NET/Socket/0
ip-10-0-0-115:73:130 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
ip-10-0-0-115:80:131 [7] NCCL INFO Connected all trees
ip-10-0-0-115:80:131 [7] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:80:131 [7] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:79:135 [6] NCCL INFO Connected all trees
ip-10-0-0-115:79:135 [6] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:79:135 [6] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:73:130 [0] NCCL INFO Channel 01 : 8[160] -> 0[160] [receive] via NET/Socket/0
ip-10-0-0-115:73:130 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
ip-10-0-0-115:73:130 [0] NCCL INFO Channel 00 : 0[160] -> 8[160] [send] via NET/Socket/0
ip-10-0-0-115:73:130 [0] NCCL INFO Channel 01 : 0[160] -> 8[160] [send] via NET/Socket/0
ip-10-0-0-115:73:130 [0] NCCL INFO Connected all trees
ip-10-0-0-115:73:130 [0] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:76:136 [3] NCCL INFO Connected all trees
ip-10-0-0-115:73:130 [0] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:76:136 [3] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:76:136 [3] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:76:136 [3] NCCL INFO Channel 01 : 3[190] -> 4[1a0] via P2P/indirect/0[160]
ip-10-0-0-115:73:130 [0] NCCL INFO Channel 01 : 0[160] -> 5[1b0] via P2P/indirect/1[170]
ip-10-0-0-115:76:136 [3] NCCL INFO Channel 00 : 3[190] -> 5[1b0] via P2P/indirect/1[170]
ip-10-0-0-115:76:136 [3] NCCL INFO Channel 01 : 3[190] -> 6[1c0] via P2P/indirect/7[1d0]
ip-10-0-0-115:75:137 [2] NCCL INFO Channel 01 : 2[180] -> 5[1b0] via P2P/indirect/1[170]
ip-10-0-0-115:75:137 [2] NCCL INFO Channel 01 : 2[180] -> 7[1d0] via P2P/indirect/6[1c0]
ip-10-0-0-115:77:133 [4] NCCL INFO Channel 01 : 4[1a0] -> 1[170] via P2P/indirect/5[1b0]
ip-10-0-0-115:74:132 [1] NCCL INFO Channel 01 : 1[170] -> 6[1c0] via P2P/indirect/5[1b0]
ip-10-0-0-115:73:130 [0] NCCL INFO Channel 00 : 0[160] -> 6[1c0] via P2P/indirect/4[1a0]
ip-10-0-0-115:74:132 [1] NCCL INFO Channel 00 : 1[170] -> 7[1d0] via P2P/indirect/3[190]
ip-10-0-0-115:78:134 [5] NCCL INFO Channel 01 : 5[1b0] -> 0[160] via P2P/indirect/4[1a0]
ip-10-0-0-115:79:135 [6] NCCL INFO Channel 00 : 6[1c0] -> 0[160] via P2P/indirect/4[1a0]
ip-10-0-0-115:73:130 [0] NCCL INFO Channel 01 : 0[160] -> 7[1d0] via P2P/indirect/4[1a0]
ip-10-0-0-115:80:131 [7] NCCL INFO Channel 01 : 7[1d0] -> 0[160] via P2P/indirect/4[1a0]
ip-10-0-0-115:80:131 [7] NCCL INFO Channel 00 : 7[1d0] -> 1[170] via P2P/indirect/5[1b0]
ip-10-0-0-115:79:135 [6] NCCL INFO Channel 01 : 6[1c0] -> 1[170] via P2P/indirect/5[1b0]
ip-10-0-0-115:80:131 [7] NCCL INFO Channel 01 : 7[1d0] -> 2[180] via P2P/indirect/3[190]
ip-10-0-0-115:79:135 [6] NCCL INFO Channel 01 : 6[1c0] -> 3[190] via P2P/indirect/2[180]
ip-10-0-0-115:78:134 [5] NCCL INFO Channel 01 : 5[1b0] -> 2[180] via P2P/indirect/1[170]
ip-10-0-0-115:77:133 [4] NCCL INFO Channel 00 : 4[1a0] -> 2[180] via P2P/indirect/6[1c0]
ip-10-0-0-115:78:134 [5] NCCL INFO Channel 00 : 5[1b0] -> 3[190] via P2P/indirect/1[170]
ip-10-0-0-115:77:133 [4] NCCL INFO Channel 01 : 4[1a0] -> 3[190] via P2P/indirect/0[160]
ip-10-0-0-115:77:133 [4] NCCL INFO comm 0x7effec002fb0 rank 4 nranks 16 cudaDev 4 busId 1a0 - Init COMPLETE
ip-10-0-0-115:73:130 [0] NCCL INFO comm 0x7f50b4002fb0 rank 0 nranks 16 cudaDev 0 busId 160 - Init COMPLETE
ip-10-0-0-115:79:135 [6] NCCL INFO comm 0x7ff9d0002fb0 rank 6 nranks 16 cudaDev 6 busId 1c0 - Init COMPLETE
ip-10-0-0-115:75:137 [2] NCCL INFO comm 0x7f5340002fb0 rank 2 nranks 16 cudaDev 2 busId 180 - Init COMPLETE
ip-10-0-0-115:78:134 [5] NCCL INFO comm 0x7fb09c002fb0 rank 5 nranks 16 cudaDev 5 busId 1b0 - Init COMPLETE
ip-10-0-0-115:76:136 [3] NCCL INFO comm 0x7ffacc002fb0 rank 3 nranks 16 cudaDev 3 busId 190 - Init COMPLETE
ip-10-0-0-115:80:131 [7] NCCL INFO comm 0x7fa574002fb0 rank 7 nranks 16 cudaDev 7 busId 1d0 - Init COMPLETE
ip-10-0-0-115:74:132 [1] NCCL INFO comm 0x7f5570002fb0 rank 1 nranks 16 cudaDev 1 busId 170 - Init COMPLETE
ip-10-0-0-115:73:73 [0] NCCL INFO Launch mode Parallel
2022-04-18 23:34:11 | INFO | fairseq_cli.train | {'_name': None, 'common': {'_name': None, 'no_progress_bar': False, 'log_interval': 100, 'log_format': None, 'log_file': None, 'tensorboard_logdir': None, 'wandb_project': None, 'azureml_logging': False, 'seed': 1, 'cpu': False, 'tpu': False, 'bf16': False, 'memory_efficient_bf16': False, 'fp16': False, 'memory_efficient_fp16': False, 'fp16_no_flatten_grads': False, 'fp16_init_scale': 128, 'fp16_scale_window': None, 'fp16_scale_tolerance': 0.0, 'on_cpu_convert_precision': False, 'min_loss_scale': 0.0001, 'threshold_loss_scale': None, 'amp': False, 'amp_batch_retries': 2, 'amp_init_scale': 128, 'amp_scale_window': None, 'user_dir': None, 'empty_cache_freq': 0, 'all_gather_list_size': 16384, 'model_parallel_size': 1, 'quantization_config_path': None, 'profile': False, 'reset_logging': False, 'suppress_crashes': False, 'use_plasma_view': False, 'plasma_path': '/tmp/plasma'}, 'common_eval': {'_name': None, 'path': None, 'post_process': None, 'quiet': False, 'model_overrides': '{}', 'results_path': None}, 'distributed_training': {'_name': None, 'distributed_world_size': 16, 'distributed_num_procs': 8, 'distributed_rank': 0, 'distributed_backend': 'nccl', 'distributed_init_method': 'env://', 'distributed_port': -1, 'device_id': 0, 'distributed_no_spawn': True, 'ddp_backend': 'pytorch_ddp', 'ddp_comm_hook': 'none', 'bucket_cap_mb': 25, 'fix_batches_to_gpus': False, 'find_unused_parameters': False, 'gradient_as_bucket_view': False, 'fast_stat_sync': False, 'heartbeat_timeout': -1, 'broadcast_buffers': False, 'slowmo_momentum': None, 'slowmo_base_algorithm': 'localsgd', 'localsgd_frequency': 3, 'nprocs_per_node': 8, 'pipeline_model_parallel': False, 'pipeline_balance': None, 'pipeline_devices': None, 'pipeline_chunks': 0, 'pipeline_encoder_balance': None, 'pipeline_encoder_devices': None, 'pipeline_decoder_balance': None, 'pipeline_decoder_devices': None, 'pipeline_checkpoint': 'never', 'zero_sharding': 'none', 'fp16': False, 
'memory_efficient_fp16': False, 'tpu': False, 'no_reshard_after_forward': False, 'fp32_reduce_scatter': False, 'cpu_offload': False, 'use_sharded_state': False, 'not_fsdp_flatten_parameters': False}, 'dataset': {'_name': None, 'num_workers': 1, 'skip_invalid_size_inputs_valid_test': False, 'max_tokens': 2048, 'batch_size': None, 'required_batch_size_multiple': 8, 'required_seq_len_multiple': 1, 'dataset_impl': None, 'data_buffer_size': 10, 'train_subset': 'train', 'valid_subset': 'valid', 'combine_valid_subsets': None, 'ignore_unused_valid_subsets': False, 'validate_interval': 1, 'validate_interval_updates': 0, 'validate_after_updates': 0, 'fixed_validation_seed': None, 'disable_validation': False, 'max_tokens_valid': 2048, 'batch_size_valid': None, 'max_valid_steps': None, 'curriculum': 0, 'gen_subset': 'test', 'num_shards': 1, 'shard_id': 0, 'grouped_shuffling': False, 'update_epoch_batch_itr': False, 'update_ordered_indices_seed': False}, 'optimization': {'_name': None, 'max_epoch': 0, 'max_update': 50000, 'stop_time_hours': 0.0, 'clip_norm': 0.0, 'sentence_avg': False, 'update_freq': [1], 'lr': [0.0005], 'stop_min_lr': -1.0, 'use_bmuf': False, 'skip_remainder_batch': False}, 'checkpoint': {'_name': None, 'save_dir': '/job/fairseq/checkpoints/transformer_wikitext-103_ubuntu', 'restore_file': 'checkpoint_last.pt', 'continue_once': None, 'finetune_from_model': None, 'reset_dataloader': False, 'reset_lr_scheduler': False, 'reset_meters': False, 'reset_optimizer': False, 'optimizer_overrides': '{}', 'save_interval': 1, 'save_interval_updates': 0, 'keep_interval_updates': -1, 'keep_interval_updates_pattern': -1, 'keep_last_epochs': -1, 'keep_best_checkpoints': -1, 'no_save': False, 'no_epoch_checkpoints': False, 'no_last_checkpoints': False, 'no_save_optimizer_state': False, 'best_checkpoint_metric': 'loss', 'maximize_best_checkpoint_metric': False, 'patience': -1, 'checkpoint_suffix': '', 'checkpoint_shard_count': 1, 'load_checkpoint_on_all_dp_ranks': False, 
'write_checkpoints_asynchronously': False, 'model_parallel_size': 1}, 'bmuf': {'_name': None, 'block_lr': 1.0, 'block_momentum': 0.875, 'global_sync_iter': 50, 'warmup_iterations': 500, 'use_nbm': False, 'average_sync': False, 'distributed_world_size': 8}, 'generation': {'_name': None, 'beam': 5, 'nbest': 1, 'max_len_a': 0.0, 'max_len_b': 200, 'min_len': 1, 'match_source_len': False, 'unnormalized': False, 'no_early_stop': False, 'no_beamable_mm': False, 'lenpen': 1.0, 'unkpen': 0.0, 'replace_unk': None, 'sacrebleu': False, 'score_reference': False, 'prefix_size': 0, 'no_repeat_ngram_size': 0, 'sampling': False, 'sampling_topk': -1, 'sampling_topp': -1.0, 'constraints': None, 'temperature': 1.0, 'diverse_beam_groups': -1, 'diverse_beam_strength': 0.5, 'diversity_rate': -1.0, 'print_alignment': None, 'print_step': False, 'lm_path': None, 'lm_weight': 0.0, 'iter_decode_eos_penalty': 0.0, 'iter_decode_max_iter': 10, 'iter_decode_force_max_iter': False, 'iter_decode_with_beam': 1, 'iter_decode_with_external_reranker': False, 'retain_iter_history': False, 'retain_dropout': False, 'retain_dropout_modules': None, 'decoding_format': None, 'no_seed_provided': False}, 'eval_lm': {'_name': None, 'output_word_probs': False, 'output_word_stats': False, 'context_window': 0, 'softmax_batch': 9223372036854775807}, 'interactive': {'_name': None, 'buffer_size': 0, 'input': '-'}, 'model': {'_name': 'transformer_lm', 'activation_fn': relu, 'dropout': 0.1, 'attention_dropout': 0.0, 'activation_dropout': 0.0, 'relu_dropout': 0.0, 'decoder_embed_dim': 512, 'decoder_output_dim': 512, 'decoder_input_dim': 512, 'decoder_ffn_embed_dim': 2048, 'decoder_layers': 6, 'decoder_attention_heads': 8, 'decoder_normalize_before': False, 'no_decoder_final_norm': False, 'adaptive_softmax_cutoff': None, 'adaptive_softmax_dropout': 0.0, 'adaptive_softmax_factor': 4.0, 'no_token_positional_embeddings': False, 'share_decoder_input_output_embed': True, 'character_embeddings': False, 'character_filters': 
'[(1, 64), (2, 128), (3, 192), (4, 256), (5, 256), (6, 256), (7, 256)]', 'character_embedding_dim': 4, 'char_embedder_highway_layers': 2, 'adaptive_input': False, 'adaptive_input_factor': 4.0, 'adaptive_input_cutoff': None, 'tie_adaptive_weights': False, 'tie_adaptive_proj': False, 'decoder_learned_pos': False, 'layernorm_embedding': False, 'no_scale_embedding': False, 'checkpoint_activations': False, 'offload_activations': False, 'decoder_layerdrop': 0.0, 'decoder_layers_to_keep': None, 'quant_noise_pq': 0.0, 'quant_noise_pq_block_size': 8, 'quant_noise_scalar': 0.0, 'min_params_to_wrap': 100000000, 'base_layers': 0, 'base_sublayers': 1, 'base_shuffle': 1, 'scale_fc': False, 'scale_attn': False, 'scale_heads': False, 'scale_resids': False, 'add_bos_token': False, 'tokens_per_sample': 512, 'max_target_positions': None, 'tpu': False}, 'task': {'_name': 'language_modeling', 'data': '/job/fairseq/data-bin/wikitext-103', 'sample_break_mode': none, 'tokens_per_sample': 512, 'output_dictionary_size': -1, 'self_target': False, 'future_target': False, 'past_target': False, 'add_bos_token': False, 'max_target_positions': None, 'shorten_method': none, 'shorten_data_split_list': '', 'pad_to_fixed_length': False, 'pad_to_fixed_bsz': False, 'seed': 1, 'batch_size': None, 'batch_size_valid': None, 'dataset_impl': None, 'data_buffer_size': 10, 'tpu': False, 'use_plasma_view': False, 'plasma_path': '/tmp/plasma'}, 'criterion': {'_name': 'cross_entropy', 'sentence_avg': False}, 'optimizer': {'_name': 'adam', 'adam_betas': '(0.9, 0.98)', 'adam_eps': 1e-08, 'weight_decay': 0.01, 'use_old_adam': False, 'fp16_adam_stats': False, 'tpu': False, 'lr': [0.0005]}, 'lr_scheduler': {'_name': 'inverse_sqrt', 'warmup_updates': 4000, 'warmup_init_lr': 1e-07, 'lr': [0.0005]}, 'scoring': {'_name': 'bleu', 'pad': 1, 'eos': 2, 'unk': 3}, 'bpe': None, 'tokenizer': None, 'ema': {'_name': None, 'store_ema': False, 'ema_decay': 0.9999, 'ema_start_update': 0, 'ema_seed_model': None, 'ema_update_freq': 1, 
'ema_fp32': False}}
2022-04-18 23:34:12 | INFO | fairseq.tasks.language_modeling | dictionary: 267744 types
2022-04-18 23:34:15 | INFO | fairseq_cli.train | TransformerLanguageModel(
  (decoder): TransformerDecoder(
    (dropout_module): FairseqDropout()
    (embed_tokens): Embedding(267744, 512, padding_idx=1)
    (embed_positions): SinusoidalPositionalEmbedding()
    (layers): ModuleList(
      (0): TransformerDecoderLayerBase(
        (dropout_module): FairseqDropout()
        (self_attn): MultiheadAttention(
          (dropout_module): FairseqDropout()
          (k_proj): Linear(in_features=512, out_features=512, bias=True)
          (v_proj): Linear(in_features=512, out_features=512, bias=True)
          (q_proj): Linear(in_features=512, out_features=512, bias=True)
          (out_proj): Linear(in_features=512, out_features=512, bias=True)
        )
        (activation_dropout_module): FairseqDropout()
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([512]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=512, out_features=2048, bias=True)
        (fc2): Linear(in_features=2048, out_features=512, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([512]), eps=1e-05, elementwise_affine=True)
      )
      (1): TransformerDecoderLayerBase(
        (dropout_module): FairseqDropout()
        (self_attn): MultiheadAttention(
          (dropout_module): FairseqDropout()
          (k_proj): Linear(in_features=512, out_features=512, bias=True)
          (v_proj): Linear(in_features=512, out_features=512, bias=True)
          (q_proj): Linear(in_features=512, out_features=512, bias=True)
          (out_proj): Linear(in_features=512, out_features=512, bias=True)
        )
        (activation_dropout_module): FairseqDropout()
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([512]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=512, out_features=2048, bias=True)
        (fc2): Linear(in_features=2048, out_features=512, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([512]), eps=1e-05, elementwise_affine=True)
      )
      (2): TransformerDecoderLayerBase(
        (dropout_module): FairseqDropout()
        (self_attn): MultiheadAttention(
          (dropout_module): FairseqDropout()
          (k_proj): Linear(in_features=512, out_features=512, bias=True)
          (v_proj): Linear(in_features=512, out_features=512, bias=True)
          (q_proj): Linear(in_features=512, out_features=512, bias=True)
          (out_proj): Linear(in_features=512, out_features=512, bias=True)
        )
        (activation_dropout_module): FairseqDropout()
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([512]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=512, out_features=2048, bias=True)
        (fc2): Linear(in_features=2048, out_features=512, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([512]), eps=1e-05, elementwise_affine=True)
      )
      (3): TransformerDecoderLayerBase(
        (dropout_module): FairseqDropout()
        (self_attn): MultiheadAttention(
          (dropout_module): FairseqDropout()
          (k_proj): Linear(in_features=512, out_features=512, bias=True)
          (v_proj): Linear(in_features=512, out_features=512, bias=True)
          (q_proj): Linear(in_features=512, out_features=512, bias=True)
          (out_proj): Linear(in_features=512, out_features=512, bias=True)
        )
        (activation_dropout_module): FairseqDropout()
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([512]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=512, out_features=2048, bias=True)
        (fc2): Linear(in_features=2048, out_features=512, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([512]), eps=1e-05, elementwise_affine=True)
      )
      (4): TransformerDecoderLayerBase(
        (dropout_module): FairseqDropout()
        (self_attn): MultiheadAttention(
          (dropout_module): FairseqDropout()
          (k_proj): Linear(in_features=512, out_features=512, bias=True)
          (v_proj): Linear(in_features=512, out_features=512, bias=True)
          (q_proj): Linear(in_features=512, out_features=512, bias=True)
          (out_proj): Linear(in_features=512, out_features=512, bias=True)
        )
        (activation_dropout_module): FairseqDropout()
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([512]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=512, out_features=2048, bias=True)
        (fc2): Linear(in_features=2048, out_features=512, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([512]), eps=1e-05, elementwise_affine=True)
      )
      (5): TransformerDecoderLayerBase(
        (dropout_module): FairseqDropout()
        (self_attn): MultiheadAttention(
          (dropout_module): FairseqDropout()
          (k_proj): Linear(in_features=512, out_features=512, bias=True)
          (v_proj): Linear(in_features=512, out_features=512, bias=True)
          (q_proj): Linear(in_features=512, out_features=512, bias=True)
          (out_proj): Linear(in_features=512, out_features=512, bias=True)
        )
        (activation_dropout_module): FairseqDropout()
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([512]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=512, out_features=2048, bias=True)
        (fc2): Linear(in_features=2048, out_features=512, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([512]), eps=1e-05, elementwise_affine=True)
      )
    )
    (output_projection): Linear(in_features=512, out_features=267744, bias=False)
  )
)
2022-04-18 23:34:15 | INFO | fairseq_cli.train | task: LanguageModelingTask
2022-04-18 23:34:15 | INFO | fairseq_cli.train | model: TransformerLanguageModel
2022-04-18 23:34:15 | INFO | fairseq_cli.train | criterion: CrossEntropyCriterion
2022-04-18 23:34:15 | INFO | fairseq_cli.train | num. shared model params: 155,999,232 (num. trained: 155,999,232)
2022-04-18 23:34:15 | INFO | fairseq_cli.train | num. expert model params: 0 (num. trained: 0)
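The reported shared-parameter count can be reproduced by hand from the module dump above: the tied 267,744 x 512 embedding (sinusoidal positional embeddings are parameter-free, and the output projection reuses the embedding weight, as the "detected shared parameter" line notes) plus six identical decoder layers account for exactly 155,999,232 parameters. A minimal arithmetic sketch (variable names are mine, not fairseq's):

```python
# Parameter count for the printed TransformerLanguageModel:
# decoder-only, 6 layers, d_model=512, d_ff=2048, vocab=267744.
vocab, d, d_ff, n_layers = 267_744, 512, 2048, 6

embed = vocab * d                          # tied with output_projection, counted once
attn = 4 * (d * d + d)                     # k/q/v/out projections, each with bias
ffn = (d * d_ff + d_ff) + (d_ff * d + d)   # fc1 + fc2
norms = 2 * (2 * d)                        # two LayerNorms, weight + bias each
per_layer = attn + ffn + norms

total = embed + n_layers * per_layer
print(f"{total:,}")  # 155,999,232 -- matches the "num. shared model params" line
```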
2022-04-18 23:34:15 | INFO | fairseq.data.data_utils | loaded 3,760 examples from: /job/fairseq/data-bin/wikitext-103/valid
2022-04-18 23:34:15 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:2 to store for rank: 0
2022-04-18 23:34:15 | INFO | torch.distributed.distributed_c10d | Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 16 nodes.
2022-04-18 23:34:15 | INFO | fairseq.trainer | detected shared parameter: decoder.embed_tokens.weight <- decoder.output_projection.weight
ip-10-0-0-115:73:174 [0] NCCL INFO Channel 00/02 : 0 3 2 1 5 6 7 4 8 11 10 9 13 14 15 12
ip-10-0-0-115:73:174 [0] NCCL INFO Channel 01/02 : 0 3 2 1 5 6 7 4 8 11 10 9 13 14 15 12
ip-10-0-0-115:73:174 [0] NCCL INFO Trees [0] 3/8/-1->0->-1 [1] 3/-1/-1->0->8
ip-10-0-0-115:74:179 [1] NCCL INFO Trees [0] 5/-1/-1->1->2 [1] 5/-1/-1->1->2
ip-10-0-0-115:75:180 [2] NCCL INFO Trees [0] 1/-1/-1->2->3 [1] 1/-1/-1->2->3
ip-10-0-0-115:77:181 [4] NCCL INFO Trees [0] -1/-1/-1->4->7 [1] -1/-1/-1->4->7
ip-10-0-0-115:76:177 [3] NCCL INFO Trees [0] 2/-1/-1->3->0 [1] 2/-1/-1->3->0
ip-10-0-0-115:78:178 [5] NCCL INFO Trees [0] 6/-1/-1->5->1 [1] 6/-1/-1->5->1
ip-10-0-0-115:80:176 [7] NCCL INFO Trees [0] 4/-1/-1->7->6 [1] 4/-1/-1->7->6
ip-10-0-0-115:79:175 [6] NCCL INFO Trees [0] 7/-1/-1->6->5 [1] 7/-1/-1->6->5
ip-10-0-0-115:74:179 [1] NCCL INFO Channel 00 : 1[170] -> 5[1b0] via P2P/IPC
ip-10-0-0-115:73:174 [0] NCCL INFO Channel 00 : 0[160] -> 3[190] via P2P/IPC
ip-10-0-0-115:78:178 [5] NCCL INFO Channel 00 : 5[1b0] -> 6[1c0] via P2P/IPC
ip-10-0-0-115:74:179 [1] NCCL INFO Channel 01 : 1[170] -> 5[1b0] via P2P/IPC
ip-10-0-0-115:73:174 [0] NCCL INFO Channel 01 : 0[160] -> 3[190] via P2P/IPC
ip-10-0-0-115:78:178 [5] NCCL INFO Channel 01 : 5[1b0] -> 6[1c0] via P2P/IPC
ip-10-0-0-115:79:175 [6] NCCL INFO Channel 00 : 6[1c0] -> 7[1d0] via P2P/IPC
ip-10-0-0-115:75:180 [2] NCCL INFO Channel 00 : 2[180] -> 1[170] via P2P/IPC
ip-10-0-0-115:79:175 [6] NCCL INFO Channel 01 : 6[1c0] -> 7[1d0] via P2P/IPC
ip-10-0-0-115:75:180 [2] NCCL INFO Channel 01 : 2[180] -> 1[170] via P2P/IPC
ip-10-0-0-115:76:177 [3] NCCL INFO Channel 00 : 3[190] -> 2[180] via P2P/IPC
ip-10-0-0-115:77:181 [4] NCCL INFO Channel 00 : 4[1a0] -> 8[160] [send] via NET/Socket/0
ip-10-0-0-115:76:177 [3] NCCL INFO Channel 01 : 3[190] -> 2[180] via P2P/IPC
ip-10-0-0-115:80:176 [7] NCCL INFO Channel 00 : 7[1d0] -> 4[1a0] via P2P/IPC
ip-10-0-0-115:79:175 [6] NCCL INFO Connected all rings
ip-10-0-0-115:76:177 [3] NCCL INFO Connected all rings
ip-10-0-0-115:80:176 [7] NCCL INFO Channel 01 : 7[1d0] -> 4[1a0] via P2P/IPC
ip-10-0-0-115:73:174 [0] NCCL INFO Channel 00 : 12[1a0] -> 0[160] [receive] via NET/Socket/0
ip-10-0-0-115:73:174 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
ip-10-0-0-115:78:178 [5] NCCL INFO Connected all rings
ip-10-0-0-115:78:178 [5] NCCL INFO Channel 00 : 5[1b0] -> 1[170] via P2P/IPC
ip-10-0-0-115:79:175 [6] NCCL INFO Channel 00 : 6[1c0] -> 5[1b0] via P2P/IPC
ip-10-0-0-115:77:181 [4] NCCL INFO Channel 01 : 4[1a0] -> 8[160] [send] via NET/Socket/0
ip-10-0-0-115:78:178 [5] NCCL INFO Channel 01 : 5[1b0] -> 1[170] via P2P/IPC
ip-10-0-0-115:79:175 [6] NCCL INFO Channel 01 : 6[1c0] -> 5[1b0] via P2P/IPC
ip-10-0-0-115:74:179 [1] NCCL INFO Connected all rings
ip-10-0-0-115:74:179 [1] NCCL INFO Channel 00 : 1[170] -> 2[180] via P2P/IPC
ip-10-0-0-115:75:180 [2] NCCL INFO Connected all rings
ip-10-0-0-115:74:179 [1] NCCL INFO Channel 01 : 1[170] -> 2[180] via P2P/IPC
ip-10-0-0-115:75:180 [2] NCCL INFO Channel 00 : 2[180] -> 3[190] via P2P/IPC
ip-10-0-0-115:75:180 [2] NCCL INFO Channel 01 : 2[180] -> 3[190] via P2P/IPC
ip-10-0-0-115:73:174 [0] NCCL INFO Channel 01 : 12[1a0] -> 0[160] [receive] via NET/Socket/0
ip-10-0-0-115:73:174 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
ip-10-0-0-115:76:177 [3] NCCL INFO Channel 00 : 3[190] -> 0[160] via P2P/IPC
ip-10-0-0-115:75:180 [2] NCCL INFO Connected all trees
ip-10-0-0-115:75:180 [2] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:75:180 [2] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:76:177 [3] NCCL INFO Channel 01 : 3[190] -> 0[160] via P2P/IPC
ip-10-0-0-115:75:180 [2] NCCL INFO Channel 00 : 2[180] -> 4[1a0] via P2P/indirect/0[160]
ip-10-0-0-115:74:179 [1] NCCL INFO Connected all trees
ip-10-0-0-115:74:179 [1] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:74:179 [1] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:74:179 [1] NCCL INFO Channel 01 : 1[170] -> 4[1a0] via P2P/indirect/0[160]
ip-10-0-0-115:73:174 [0] NCCL INFO Connected all rings
ip-10-0-0-115:78:178 [5] NCCL INFO Connected all trees
ip-10-0-0-115:78:178 [5] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:78:178 [5] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:77:181 [4] NCCL INFO Connected all rings
ip-10-0-0-115:80:176 [7] NCCL INFO Connected all rings
ip-10-0-0-115:77:181 [4] NCCL INFO Channel 00 : 4[1a0] -> 7[1d0] via P2P/IPC
ip-10-0-0-115:77:181 [4] NCCL INFO Channel 01 : 4[1a0] -> 7[1d0] via P2P/IPC
ip-10-0-0-115:77:181 [4] NCCL INFO Connected all trees
ip-10-0-0-115:77:181 [4] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:77:181 [4] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:80:176 [7] NCCL INFO Channel 00 : 7[1d0] -> 6[1c0] via P2P/IPC
ip-10-0-0-115:73:174 [0] NCCL INFO Channel 00 : 8[160] -> 0[160] [receive] via NET/Socket/0
ip-10-0-0-115:73:174 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
ip-10-0-0-115:80:176 [7] NCCL INFO Channel 01 : 7[1d0] -> 6[1c0] via P2P/IPC
ip-10-0-0-115:80:176 [7] NCCL INFO Connected all trees
ip-10-0-0-115:80:176 [7] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:80:176 [7] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:79:175 [6] NCCL INFO Connected all trees
ip-10-0-0-115:79:175 [6] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:79:175 [6] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:73:174 [0] NCCL INFO Channel 01 : 8[160] -> 0[160] [receive] via NET/Socket/0
ip-10-0-0-115:73:174 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
ip-10-0-0-115:73:174 [0] NCCL INFO Channel 00 : 0[160] -> 8[160] [send] via NET/Socket/0
ip-10-0-0-115:73:174 [0] NCCL INFO Channel 01 : 0[160] -> 8[160] [send] via NET/Socket/0
ip-10-0-0-115:73:174 [0] NCCL INFO Connected all trees
ip-10-0-0-115:73:174 [0] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:73:174 [0] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:76:177 [3] NCCL INFO Connected all trees
ip-10-0-0-115:76:177 [3] NCCL INFO threadThresholds 8/8/64 | 128/8/64 | 8/8/512
ip-10-0-0-115:76:177 [3] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
ip-10-0-0-115:73:174 [0] NCCL INFO Channel 01 : 0[160] -> 5[1b0] via P2P/indirect/1[170]
ip-10-0-0-115:76:177 [3] NCCL INFO Channel 01 : 3[190] -> 4[1a0] via P2P/indirect/0[160]
ip-10-0-0-115:76:177 [3] NCCL INFO Channel 00 : 3[190] -> 5[1b0] via P2P/indirect/1[170]
ip-10-0-0-115:76:177 [3] NCCL INFO Channel 01 : 3[190] -> 6[1c0] via P2P/indirect/7[1d0]
ip-10-0-0-115:75:180 [2] NCCL INFO Channel 01 : 2[180] -> 5[1b0] via P2P/indirect/1[170]
ip-10-0-0-115:75:180 [2] NCCL INFO Channel 01 : 2[180] -> 7[1d0] via P2P/indirect/6[1c0]
ip-10-0-0-115:77:181 [4] NCCL INFO Channel 01 : 4[1a0] -> 1[170] via P2P/indirect/5[1b0]
ip-10-0-0-115:74:179 [1] NCCL INFO Channel 01 : 1[170] -> 6[1c0] via P2P/indirect/5[1b0]
ip-10-0-0-115:73:174 [0] NCCL INFO Channel 00 : 0[160] -> 6[1c0] via P2P/indirect/4[1a0]
ip-10-0-0-115:78:178 [5] NCCL INFO Channel 01 : 5[1b0] -> 0[160] via P2P/indirect/4[1a0]
ip-10-0-0-115:74:179 [1] NCCL INFO Channel 00 : 1[170] -> 7[1d0] via P2P/indirect/3[190]
ip-10-0-0-115:79:175 [6] NCCL INFO Channel 00 : 6[1c0] -> 0[160] via P2P/indirect/4[1a0]
ip-10-0-0-115:73:174 [0] NCCL INFO Channel 01 : 0[160] -> 7[1d0] via P2P/indirect/4[1a0]
ip-10-0-0-115:80:176 [7] NCCL INFO Channel 01 : 7[1d0] -> 0[160] via P2P/indirect/4[1a0]
ip-10-0-0-115:80:176 [7] NCCL INFO Channel 00 : 7[1d0] -> 1[170] via P2P/indirect/5[1b0]
ip-10-0-0-115:79:175 [6] NCCL INFO Channel 01 : 6[1c0] -> 1[170] via P2P/indirect/5[1b0]
ip-10-0-0-115:80:176 [7] NCCL INFO Channel 01 : 7[1d0] -> 2[180] via P2P/indirect/3[190]
ip-10-0-0-115:78:178 [5] NCCL INFO Channel 01 : 5[1b0] -> 2[180] via P2P/indirect/1[170]
ip-10-0-0-115:79:175 [6] NCCL INFO Channel 01 : 6[1c0] -> 3[190] via P2P/indirect/2[180]
ip-10-0-0-115:78:178 [5] NCCL INFO Channel 00 : 5[1b0] -> 3[190] via P2P/indirect/1[170]
ip-10-0-0-115:77:181 [4] NCCL INFO Channel 00 : 4[1a0] -> 2[180] via P2P/indirect/6[1c0]
ip-10-0-0-115:77:181 [4] NCCL INFO Channel 01 : 4[1a0] -> 3[190] via P2P/indirect/0[160]
ip-10-0-0-115:76:177 [3] NCCL INFO comm 0x7ffa8c002fb0 rank 3 nranks 16 cudaDev 3 busId 190 - Init COMPLETE
ip-10-0-0-115:77:181 [4] NCCL INFO comm 0x7effa8002fb0 rank 4 nranks 16 cudaDev 4 busId 1a0 - Init COMPLETE
ip-10-0-0-115:78:178 [5] NCCL INFO comm 0x7fb064002fb0 rank 5 nranks 16 cudaDev 5 busId 1b0 - Init COMPLETE
ip-10-0-0-115:80:176 [7] NCCL INFO comm 0x7fa534002fb0 rank 7 nranks 16 cudaDev 7 busId 1d0 - Init COMPLETE
ip-10-0-0-115:79:175 [6] NCCL INFO comm 0x7ff990002fb0 rank 6 nranks 16 cudaDev 6 busId 1c0 - Init COMPLETE
ip-10-0-0-115:73:174 [0] NCCL INFO comm 0x7f5078002fb0 rank 0 nranks 16 cudaDev 0 busId 160 - Init COMPLETE
ip-10-0-0-115:75:180 [2] NCCL INFO comm 0x7f5304002fb0 rank 2 nranks 16 cudaDev 2 busId 180 - Init COMPLETE
ip-10-0-0-115:74:179 [1] NCCL INFO comm 0x7f5534002fb0 rank 1 nranks 16 cudaDev 1 busId 170 - Init COMPLETE
ip-10-0-0-115:73:73 [0] NCCL INFO Launch mode Parallel
2022-04-18 23:34:15 | INFO | fairseq.utils | ***********************CUDA enviroments for all 16 workers***********************
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 0: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 1: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 2: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 3: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 4: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 5: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 6: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 7: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 8: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 9: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 10: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 11: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 12: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 13: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 14: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | rank 15: capabilities = 7.0 ; total memory = 31.749 GB ; name = Tesla V100-SXM2-32GB
2022-04-18 23:34:15 | INFO | fairseq.utils | ***********************CUDA enviroments for all 16 workers***********************
2022-04-18 23:34:15 | INFO | fairseq_cli.train | training on 16 devices (GPUs/TPUs)
2022-04-18 23:34:15 | INFO | fairseq_cli.train | max tokens per device = 2048 and max sentences per device = None
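With 16 workers (2 nodes x 8 GPUs) and a 2,048-token budget per device, each optimizer step covers 32,768 tokens, and at 512 tokens per sample that is 64 samples per step. Both figures match the wpb and bsz values in the train_inner lines further down. A quick arithmetic check (variable names are mine):

```python
# Effective batch implied by the log: per-device token budget x world size.
max_tokens_per_device = 2048
world_size = 16            # 2 nodes x 8 GPUs each
tokens_per_sample = 512    # from the language_modeling task config above

wpb = max_tokens_per_device * world_size  # tokens per update
bsz = wpb // tokens_per_sample            # full-length samples per update

print(wpb, bsz)  # 32768 64 -- the wpb/bsz values logged during training
```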
2022-04-18 23:34:15 | INFO | fairseq.trainer | Preparing to load checkpoint /job/fairseq/checkpoints/transformer_wikitext-103_ubuntu/checkpoint_last.pt
2022-04-18 23:34:15 | INFO | fairseq.trainer | No existing checkpoint found /job/fairseq/checkpoints/transformer_wikitext-103_ubuntu/checkpoint_last.pt
2022-04-18 23:34:15 | INFO | fairseq.trainer | loading train data for epoch 1
2022-04-18 23:34:16 | INFO | fairseq.data.data_utils | loaded 1,801,350 examples from: /job/fairseq/data-bin/wikitext-103/train
2022-04-18 23:34:17 | INFO | fairseq.trainer | NOTE: your device may support faster training with --fp16 or --amp
2022-04-18 23:34:17 | INFO | fairseq.optim.adam | using FusedAdam
2022-04-18 23:34:17 | INFO | fairseq.data.iterators | grouped total_num_itrs = 3151
2022-04-18 23:34:17 | INFO | fairseq.trainer | begin training epoch 1
2022-04-18 23:34:17 | INFO | fairseq_cli.train | Start iterating over samples
2022-04-18 23:34:18 | INFO | root | Reducer buckets have been rebuilt in this iteration.
2022-04-18 23:34:48 | INFO | train_inner | epoch 001:    100 / 3151 loss=16.618, ppl=100595, wps=108288, ups=3.3, wpb=32768, bsz=64, num_updates=100, lr=1.25975e-05, gnorm=3.652, train_wall=31, gb_free=20.6, wall=33
2022-04-18 23:35:18 | INFO | train_inner | epoch 001:    200 / 3151 loss=14.226, ppl=19162.3, wps=108465, ups=3.31, wpb=32768, bsz=64, num_updates=200, lr=2.5095e-05, gnorm=1.593, train_wall=30, gb_free=20.6, wall=63
2022-04-18 23:35:49 | INFO | train_inner | epoch 001:    300 / 3151 loss=12.192, ppl=4678.22, wps=106993, ups=3.27, wpb=32764.3, bsz=64, num_updates=300, lr=3.75925e-05, gnorm=1.079, train_wall=30, gb_free=20.6, wall=94
2022-04-18 23:36:19 | INFO | train_inner | epoch 001:    400 / 3151 loss=10.734, ppl=1702.93, wps=107755, ups=3.29, wpb=32768, bsz=64, num_updates=400, lr=5.009e-05, gnorm=0.671, train_wall=30, gb_free=20.6, wall=124
2022-04-18 23:36:51 | INFO | train_inner | epoch 001:    500 / 3151 loss=10.124, ppl=1116, wps=105255, ups=3.21, wpb=32768, bsz=64, num_updates=500, lr=6.25875e-05, gnorm=0.558, train_wall=31, gb_free=20.6, wall=155
2022-04-18 23:37:22 | INFO | train_inner | epoch 001:    600 / 3151 loss=9.818, ppl=902.66, wps=104790, ups=3.2, wpb=32768, bsz=64, num_updates=600, lr=7.5085e-05, gnorm=0.643, train_wall=31, gb_free=20.6, wall=186
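Two internal consistency checks on the train_inner lines above: fairseq reports loss in base 2, so ppl should equal 2**loss, and under the inverse_sqrt schedule the learning rate ramps linearly from warmup_init_lr (1e-7) to the peak lr (5e-4) over the first 4,000 updates. A parsing sketch against the first logged line (the regex and the `stats` helper are my own, not part of fairseq):

```python
import math
import re

# One train_inner record copied from the log above.
line = ("epoch 001: 100 / 3151 loss=16.618, ppl=100595, wps=108288, ups=3.3, "
        "wpb=32768, bsz=64, num_updates=100, lr=1.25975e-05, gnorm=3.652, "
        "train_wall=31, gb_free=20.6, wall=33")

# Pull key=value pairs out of the record (hypothetical parsing helper).
stats = {k: float(v) for k, v in re.findall(r"(\w+)=([\d.e+-]+)", line)}

# fairseq logs base-2 loss, so perplexity is 2**loss.
assert math.isclose(2 ** stats["loss"], stats["ppl"], rel_tol=1e-3)

# inverse_sqrt warmup: lr interpolates linearly over warmup_updates=4000 steps.
expected_lr = 1e-7 + (5e-4 - 1e-7) * stats["num_updates"] / 4000
assert math.isclose(expected_lr, stats["lr"], rel_tol=1e-4)  # 1.25975e-05
```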