@ozancaglayan
Created November 13, 2015 14:29
repeat-bug-logs
--- c384-ce12288-pe1536-h256-bs256-lr0.03.log 2015-10-29 22:28:40.539747323 +0100
+++ c384-ce12288-pe1536-h256-bs256-lr0.03-repeatbug.log 2015-11-13 15:41:20.820063590 +0100
@@ -12,6 +12,7 @@
6: Tesla K40c with 15 CPUs x 192 threads running at 0.74 Ghz, 11519 MBytes of memory, use -arch=sm_35, utilization 0%
7: Tesla K40c with 15 CPUs x 192 threads running at 0.74 Ghz, 11519 MBytes of memory, use -arch=sm_35, utilization 0%
- using 8 devices in parallel: 0 1 2 3 4 5 6 7
+ - repeating these 1 machine(s) 32 times
#### GPU allocate local data for 8 GPU
#### GPU 0: use local data_in from MachSplit
#### GPU 1: allocate local data_in=0x2162a0000
@@ -21,39 +22,40 @@
#### GPU 5: allocate local data_in=0x228da0000
#### GPU 6: allocate local data_in=0x22d860000
#### GPU 7: allocate local data_in=0x232320000
+ - repeating these 1 machine(s) 32 times
#### CUDA set data_in for 8 GPU
### - mach 0 is on GPU 1 with input 0x2162a0000
-### - mach 1 is on GPU 2 with input 0x21ad60000
-### - mach 2 is on GPU 3 with input 0x21f820000
-### - mach 3 is on GPU 4 with input 0x2242e0000
-### - mach 4 is on GPU 5 with input 0x228da0000
-### - mach 5 is on GPU 6 with input 0x22d860000
-### - mach 6 is on GPU 7 with input 0x232320000
-### - mach 7 is on GPU 0 with input 0x208900000
+### - mach 1 is on GPU 1 with input 0x2162a0000
+### - mach 2 is on GPU 1 with input 0x2162a0000
+### - mach 3 is on GPU 1 with input 0x2162a0000
+### - mach 4 is on GPU 1 with input 0x2162a0000
+### - mach 5 is on GPU 1 with input 0x2162a0000
+### - mach 6 is on GPU 1 with input 0x2162a0000
+### - mach 7 is on GPU 1 with input 0x2162a0000
### - mach 8 is on GPU 1 with input 0x2162a0000
-### - mach 9 is on GPU 2 with input 0x21ad60000
-### - mach 10 is on GPU 3 with input 0x21f820000
-### - mach 11 is on GPU 4 with input 0x2242e0000
-### - mach 12 is on GPU 5 with input 0x228da0000
-### - mach 13 is on GPU 6 with input 0x22d860000
-### - mach 14 is on GPU 7 with input 0x232320000
-### - mach 15 is on GPU 0 with input 0x208900000
+### - mach 9 is on GPU 1 with input 0x2162a0000
+### - mach 10 is on GPU 1 with input 0x2162a0000
+### - mach 11 is on GPU 1 with input 0x2162a0000
+### - mach 12 is on GPU 1 with input 0x2162a0000
+### - mach 13 is on GPU 1 with input 0x2162a0000
+### - mach 14 is on GPU 1 with input 0x2162a0000
+### - mach 15 is on GPU 1 with input 0x2162a0000
### - mach 16 is on GPU 1 with input 0x2162a0000
-### - mach 17 is on GPU 2 with input 0x21ad60000
-### - mach 18 is on GPU 3 with input 0x21f820000
-### - mach 19 is on GPU 4 with input 0x2242e0000
-### - mach 20 is on GPU 5 with input 0x228da0000
-### - mach 21 is on GPU 6 with input 0x22d860000
-### - mach 22 is on GPU 7 with input 0x232320000
-### - mach 23 is on GPU 0 with input 0x208900000
+### - mach 17 is on GPU 1 with input 0x2162a0000
+### - mach 18 is on GPU 1 with input 0x2162a0000
+### - mach 19 is on GPU 1 with input 0x2162a0000
+### - mach 20 is on GPU 1 with input 0x2162a0000
+### - mach 21 is on GPU 1 with input 0x2162a0000
+### - mach 22 is on GPU 1 with input 0x2162a0000
+### - mach 23 is on GPU 1 with input 0x2162a0000
### - mach 24 is on GPU 1 with input 0x2162a0000
-### - mach 25 is on GPU 2 with input 0x21ad60000
-### - mach 26 is on GPU 3 with input 0x21f820000
-### - mach 27 is on GPU 4 with input 0x2242e0000
-### - mach 28 is on GPU 5 with input 0x228da0000
-### - mach 29 is on GPU 6 with input 0x22d860000
-### - mach 30 is on GPU 7 with input 0x232320000
-### - mach 31 is on GPU 0 with input 0x208900000
+### - mach 25 is on GPU 1 with input 0x2162a0000
+### - mach 26 is on GPU 1 with input 0x2162a0000
+### - mach 27 is on GPU 1 with input 0x2162a0000
+### - mach 28 is on GPU 1 with input 0x2162a0000
+### - mach 29 is on GPU 1 with input 0x2162a0000
+### - mach 30 is on GPU 1 with input 0x2162a0000
+### - mach 31 is on GPU 1 with input 0x2162a0000
WARNING: MachSplit::GetDataOut() has no output data for the whole machine
WARNING: MachSplit::GetGradOut() has no output gradient for the whole machine
- Sequential machine [3] 32- .. -640096, bs=256, passes=0/0
@@ -98,128 +100,128 @@
- MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 2
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 2
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 3
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 3
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 4
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 4
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 5
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 5
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 6
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 6
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 7
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 7
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 0
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 0
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
- MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 2
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 2
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 3
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 3
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 4
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 4
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 5
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 5
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 6
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 6
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 7
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 7
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 0
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 0
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
- MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 2
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 2
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 3
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 3
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 4
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 4
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 5
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 5
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 6
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 6
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 7
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 7
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 0
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 0
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
- MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 2
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 2
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 3
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 3
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 4
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 4
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 5
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 5
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 6
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 6
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 7
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 7
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- Sequential machine [2] 1536- .. -20003, bs=256, passes=0/0
- - MachTanh p-[1536]-256, bs=256, passes=0/0, on GPU 0
- - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 0
+ - MachTanh s-[1536]-256, bs=256, passes=0/0, on GPU 1
+ - MachSoftmaxStable 256-20003, bs=256, passes=0/0, on GPU 1
- total number of parameters: 5534243 (21 MBytes)
- total number of parameters: 177095776 (675 MBytes)
- total number of parameters: 203652832 (776 MBytes)
@@ -268,98 +270,44 @@
- loading all data into memory ... done (0m0s)
- shuffling data 10 times ... done (0m0s)
- this machine can predict up to 32 phrases, each with an output layer of dimension 20003
-Starting training on host nv12.clusterparole.univ-lemans.fr pid 2064
+Starting training on host nv12.clusterparole.univ-lemans.fr pid 30856
- training on train.df
- validation on dev.df
- stopping training at 100 epochs
- learning rate: 3.00e-02, multiplied by 5.00e-01 if the error increases on the development data
lower bound: 1.000000e-04, stopping after 2 iterations without improvement
- scaling learning rate by sqrt of batch size
- - saving best machine on validation data into file: machs/c384-ce12288-pe1536-h256-bs256-lr0.03.best.mach
-Starting epoch 1 at Thu Oct 29 22:09:39 2015
- - initial unscaled lrate=3.0000e-02, wdecay=5.0000e-04
- - all data is already loaded into memory
- - shuffling data 10 times ... done (0m0s)
- - training time: real=48.34s, cpu=48.27s forw: real=11.19s, cpu=11.17s grad: real=11.27s, cpu=11.25s backw: real=18.43s, cpu=18.40s
- - CSTM Train: log_sum: -16772503.00 including EOS, px=380.07, 167264 phrases, 2823491 target words
- - CSTM InfoPost: epoch finished, 2823491 target words in 167264 phrases (166314 (99.43%) short source, 165034 (98.67%) short target phrases)
- CSTM InfoPost: Average px: 380.07
- - starting validation ...
- - all data is already loaded into memory
- - shuffling data 10 times ... done (0m0s)
- - CSTM TestDev: 2882 phrases, 48387 target words, avr length src=15.1 tgt=16.8
- - CSTM TestDev: log_sum=-263318.34, px=230.89
- - saving current best machine into file 'machs/c384-ce12288-pe1536-h256-bs256-lr0.03.best.mach'
-Starting epoch 2 at Thu Oct 29 22:10:31 2015
- - initial unscaled lrate=3.0000e-02, wdecay=5.0000e-04
- - all data is already loaded into memory
- - shuffling data 10 times ... done (0m0s)
- - training time: real=48.35s, cpu=48.24s forw: real=11.18s, cpu=11.16s grad: real=11.27s, cpu=11.23s backw: real=18.43s, cpu=18.39s
- - CSTM Train: log_sum: -14680059.00 including EOS, px=181.14, 167264 phrases, 2823491 target words
- - CSTM InfoPost: epoch finished, 2823491 target words in 167264 phrases (166314 (99.43%) short source, 165034 (98.67%) short target phrases)
- CSTM InfoPost: Average px: 181.14
- - starting validation ...
- - all data is already loaded into memory
- - shuffling data 10 times ... done (0m0s)
- - CSTM TestDev: 2882 phrases, 48387 target words, avr length src=15.1 tgt=16.8
- - CSTM TestDev: log_sum=-253583.22, px=188.81
- - saving current best machine into file 'machs/c384-ce12288-pe1536-h256-bs256-lr0.03.best.mach'
-Starting epoch 3 at Thu Oct 29 22:11:24 2015
- - initial unscaled lrate=3.0000e-02, wdecay=5.0000e-04
- - all data is already loaded into memory
- - shuffling data 10 times ... done (0m0s)
- - training time: real=48.27s, cpu=48.18s forw: real=11.18s, cpu=11.16s grad: real=11.25s, cpu=11.23s backw: real=18.41s, cpu=18.38s
- - CSTM Train: log_sum: -13691704.00 including EOS, px=127.64, 167264 phrases, 2823491 target words
- - CSTM InfoPost: epoch finished, 2823491 target words in 167264 phrases (166314 (99.43%) short source, 165034 (98.67%) short target phrases)
- CSTM InfoPost: Average px: 127.64
- - starting validation ...
- - all data is already loaded into memory
- - shuffling data 10 times ... done (0m0s)
- - CSTM TestDev: 2882 phrases, 48387 target words, avr length src=15.1 tgt=16.8
- - CSTM TestDev: log_sum=-250169.62, px=175.95
- - saving current best machine into file 'machs/c384-ce12288-pe1536-h256-bs256-lr0.03.best.mach'
-Starting epoch 4 at Thu Oct 29 22:12:17 2015
+ - saving machine into file: machs/repeattest.mach
+ - saving best machine on validation data into file: machs/repeattest.best.mach
+Starting epoch 1 at Thu Oct 29 21:05:03 2015
- initial unscaled lrate=3.0000e-02, wdecay=5.0000e-04
- all data is already loaded into memory
- shuffling data 10 times ... done (0m1s)
- - training time: real=48.26s, cpu=48.12s forw: real=11.17s, cpu=11.14s grad: real=11.22s, cpu=11.18s backw: real=18.38s, cpu=18.33s
- - CSTM Train: log_sum: -12816231.00 including EOS, px=93.61, 167264 phrases, 2823491 target words
+ - training time: real=196.60s, cpu=196.41s forw: real=60.82s, cpu=60.77s grad: real=11.09s, cpu=11.06s backw: real=117.32s, cpu=117.22s
+ - CSTM Train: log_sum: -199213824.00 including EOS, px=4385492956373475817360956850176.00, 167264 phrases, 2823491 target words
- CSTM InfoPost: epoch finished, 2823491 target words in 167264 phrases (166314 (99.43%) short source, 165034 (98.67%) short target phrases)
- CSTM InfoPost: Average px: 93.61
+ CSTM InfoPost: Average px: 4385492956373475817360956850176.00
- starting validation ...
- all data is already loaded into memory
- shuffling data 10 times ... done (0m0s)
- CSTM TestDev: 2882 phrases, 48387 target words, avr length src=15.1 tgt=16.8
- - CSTM TestDev: log_sum=-249446.55, px=173.34
- - saving current best machine into file 'machs/c384-ce12288-pe1536-h256-bs256-lr0.03.best.mach'
-Starting epoch 5 at Thu Oct 29 22:13:09 2015
+ - CSTM TestDev: log_sum=-2958226.75, px=355942101753314446734786560.00
+ - saving current best machine into file 'machs/repeattest.best.mach'
+Starting epoch 2 at Thu Oct 29 21:08:22 2015
- initial unscaled lrate=3.0000e-02, wdecay=5.0000e-04
- all data is already loaded into memory
- - shuffling data 10 times ... done (0m1s)
- - training time: real=48.27s, cpu=48.18s forw: real=11.18s, cpu=11.16s grad: real=11.26s, cpu=11.23s backw: real=18.41s, cpu=18.38s
- - CSTM Train: log_sum: -11938231.00 including EOS, px=68.59, 167264 phrases, 2823491 target words
+ - shuffling data 10 times ... done (0m0s)
+ - training time: real=196.76s, cpu=196.49s forw: real=60.89s, cpu=60.81s grad: real=11.08s, cpu=11.05s backw: real=117.47s, cpu=117.32s
+ - CSTM Train: log_sum: -186254928.00 including EOS, px=44540062015643704503146905600.00, 167264 phrases, 2823491 target words
- CSTM InfoPost: epoch finished, 2823491 target words in 167264 phrases (166314 (99.43%) short source, 165034 (98.67%) short target phrases)
- CSTM InfoPost: Average px: 68.59
+ CSTM InfoPost: Average px: 44540062015643704503146905600.00
- starting validation ...
- all data is already loaded into memory
- shuffling data 10 times ... done (0m0s)
- CSTM TestDev: 2882 phrases, 48387 target words, avr length src=15.1 tgt=16.8
- - CSTM TestDev: log_sum=-251808.27, px=182.01
+ - CSTM TestDev: log_sum=-2974781.25, px=501143766161623663905865728.00
- multiplying learning rate by 5.000000e-01, new value is 1.500000e-02, 1 iterations without improvement
-Starting epoch 6 at Thu Oct 29 22:13:58 2015
+Starting epoch 3 at Thu Oct 29 21:11:40 2015
- initial unscaled lrate=1.5000e-02, wdecay=5.0000e-04
- all data is already loaded into memory
- shuffling data 10 times ... done (0m0s)
- - training time: real=48.25s, cpu=48.17s forw: real=11.18s, cpu=11.17s grad: real=11.26s, cpu=11.23s backw: real=18.39s, cpu=18.36s
- - CSTM Train: log_sum: -10596790.00 including EOS, px=42.65, 167264 phrases, 2823491 target words
- - CSTM InfoPost: epoch finished, 2823491 target words in 167264 phrases (166314 (99.43%) short source, 165034 (98.67%) short target phrases)
- CSTM InfoPost: Average px: 42.65
- - starting validation ...
- - all data is already loaded into memory
- - shuffling data 10 times ... done (0m0s)
- - CSTM TestDev: 2882 phrases, 48387 target words, avr length src=15.1 tgt=16.8
- - CSTM TestDev: log_sum=-254228.47, px=191.34
- - multiplying learning rate by 5.000000e-01, new value is 7.500000e-03, 2 iterations without improvement
- - no improvements after 2 iterations
-Training stopped, lowest validation error was 173.337 at epoch 4
-