@taylanbil
Created September 27, 2019 21:53
[fairseq][transformer] Fresh run w/ 3 large shapes.
Epoch 1 begin 19:52:45
training/ 19:55:52, device xla:5, step 1, Rate=2.19, GlobalRate=2.19
training/ 19:55:52, device xla:4, step 1, Rate=2.19, GlobalRate=2.19
training/ 19:56:05, device xla:2, step 1, Rate=1.97, GlobalRate=1.97
training/ 19:56:05, device xla:1, step 1, Rate=1.97, GlobalRate=1.97
training/ 19:59:14, device xla:8, step 1, Rate=1.60, GlobalRate=1.60
training/ 19:59:14, device xla:3, step 1, Rate=1.60, GlobalRate=1.60
training/ 19:59:16, device xla:6, step 1, Rate=1.60, GlobalRate=1.60
training/ 20:12:01, device xla:7, step 1, Rate=0.94, GlobalRate=0.94
training/ 20:17:46, device xla:5, step 2, Rate=1.11, GlobalRate=0.54
training/ 20:18:17, device xla:7, step 2, Rate=1.20, GlobalRate=1.05
training/ 20:18:25, device xla:6, step 2, Rate=0.91, GlobalRate=0.70
training/ 20:18:25, device xla:3, step 2, Rate=0.91, GlobalRate=0.70
training/ 20:18:26, device xla:4, step 2, Rate=1.10, GlobalRate=0.52
training/ 20:18:26, device xla:2, step 2, Rate=1.02, GlobalRate=0.52
training/ 20:32:17, device xla:1, step 2, Rate=1.07, GlobalRate=0.56
training/ 20:32:20, device xla:8, step 2, Rate=0.95, GlobalRate=0.67
training/ 20:32:35, device xla:8, step 3, Rate=20.98, GlobalRate=0.88
training/ 20:32:35, device xla:1, step 3, Rate=34.96, GlobalRate=0.99
training/ 20:32:35, device xla:6, step 3, Rate=0.72, GlobalRate=0.66
training/ 20:34:39, device xla:2, step 3, Rate=0.56, GlobalRate=0.42
training/ 20:34:40, device xla:3, step 3, Rate=0.52, GlobalRate=0.52
training/ 20:34:40, device xla:7, step 3, Rate=0.63, GlobalRate=0.73
training/ 20:34:59, device xla:4, step 3, Rate=0.60, GlobalRate=0.42
training/ 20:34:59, device xla:5, step 3, Rate=0.59, GlobalRate=0.42
training/ 20:35:15, device xla:4, step 4, Rate=9.79, GlobalRate=0.52
training/ 20:35:15, device xla:3, step 4, Rate=8.88, GlobalRate=0.72
training/ 20:35:15, device xla:2, step 4, Rate=4.54, GlobalRate=0.52
training/ 20:35:15, device xla:5, step 4, Rate=38.37, GlobalRate=0.83
training/ 20:35:15, device xla:1, step 4, Rate=14.94, GlobalRate=1.03
training/ 20:37:37, device xla:8, step 4, Rate=8.90, GlobalRate=0.88
training/ 20:37:37, device xla:6, step 4, Rate=0.80, GlobalRate=0.68
training/ 20:40:22, device xla:7, step 4, Rate=1.15, GlobalRate=0.83
training/ 20:40:34, device xla:5, step 5, Rate=17.28, GlobalRate=1.10
training/ 20:40:34, device xla:2, step 5, Rate=2.30, GlobalRate=0.55
training/ 20:40:34, device xla:7, step 5, Rate=14.01, GlobalRate=0.91
training/ 20:40:34, device xla:8, step 5, Rate=5.29, GlobalRate=1.01
training/ 20:40:34, device xla:3, step 5, Rate=4.03, GlobalRate=0.73
training/ 20:40:34, device xla:1, step 5, Rate=6.94, GlobalRate=1.10