Gist rachtsingh/e97b4be011f4b86c47956848725e8095, created December 30, 2016 03:25
Baseline log
Loading data from 'data/small-train.t7'...
* vocabulary size: source = 50004; target = 50004
* additional features: source = 0; target = 0
* maximum sequence length: source = 50; target = 51
* number of training sentences: 100000
* maximum batch size: 64
Building model...
* using input feeding
Initializing parameters...
* number of parameters: 84814004
Preparing memory optimization...
* sharing 69% of output/gradInput tensors memory between clones
Start training...
Epoch 1 ; Iteration 50/1587 ; Learning rate 1.0000 ; Source tokens/s 732 ; Perplexity 615210.38
Epoch 1 ; Iteration 100/1587 ; Learning rate 1.0000 ; Source tokens/s 1069 ; Perplexity 105114.37
Epoch 1 ; Iteration 150/1587 ; Learning rate 1.0000 ; Source tokens/s 1227 ; Perplexity 38544.59
Epoch 1 ; Iteration 200/1587 ; Learning rate 1.0000 ; Source tokens/s 1371 ; Perplexity 18725.93
Epoch 1 ; Iteration 250/1587 ; Learning rate 1.0000 ; Source tokens/s 1455 ; Perplexity 11406.14
Epoch 1 ; Iteration 300/1587 ; Learning rate 1.0000 ; Source tokens/s 1514 ; Perplexity 8083.89
Epoch 1 ; Iteration 350/1587 ; Learning rate 1.0000 ; Source tokens/s 1543 ; Perplexity 6289.71
Epoch 1 ; Iteration 400/1587 ; Learning rate 1.0000 ; Source tokens/s 1562 ; Perplexity 5138.64
Epoch 1 ; Iteration 450/1587 ; Learning rate 1.0000 ; Source tokens/s 1579 ; Perplexity 4310.08
Epoch 1 ; Iteration 500/1587 ; Learning rate 1.0000 ; Source tokens/s 1615 ; Perplexity 3626.10
Epoch 1 ; Iteration 550/1587 ; Learning rate 1.0000 ; Source tokens/s 1632 ; Perplexity 3157.90
Epoch 1 ; Iteration 600/1587 ; Learning rate 1.0000 ; Source tokens/s 1653 ; Perplexity 2777.50
Epoch 1 ; Iteration 650/1587 ; Learning rate 1.0000 ; Source tokens/s 1659 ; Perplexity 2503.01
Epoch 1 ; Iteration 700/1587 ; Learning rate 1.0000 ; Source tokens/s 1670 ; Perplexity 2271.96
Epoch 1 ; Iteration 750/1587 ; Learning rate 1.0000 ; Source tokens/s 1680 ; Perplexity 2080.22
Epoch 1 ; Iteration 800/1587 ; Learning rate 1.0000 ; Source tokens/s 1693 ; Perplexity 1912.76
Epoch 1 ; Iteration 850/1587 ; Learning rate 1.0000 ; Source tokens/s 1696 ; Perplexity 1781.79
Epoch 1 ; Iteration 900/1587 ; Learning rate 1.0000 ; Source tokens/s 1703 ; Perplexity 1661.91
Epoch 1 ; Iteration 950/1587 ; Learning rate 1.0000 ; Source tokens/s 1700 ; Perplexity 1567.93
Epoch 1 ; Iteration 1000/1587 ; Learning rate 1.0000 ; Source tokens/s 1712 ; Perplexity 1471.73
Epoch 1 ; Iteration 1050/1587 ; Learning rate 1.0000 ; Source tokens/s 1715 ; Perplexity 1389.92
Epoch 1 ; Iteration 1100/1587 ; Learning rate 1.0000 ; Source tokens/s 1723 ; Perplexity 1314.89
Epoch 1 ; Iteration 1150/1587 ; Learning rate 1.0000 ; Source tokens/s 1726 ; Perplexity 1250.81
Epoch 1 ; Iteration 1200/1587 ; Learning rate 1.0000 ; Source tokens/s 1729 ; Perplexity 1194.76
Epoch 1 ; Iteration 1250/1587 ; Learning rate 1.0000 ; Source tokens/s 1731 ; Perplexity 1144.58
Epoch 1 ; Iteration 1300/1587 ; Learning rate 1.0000 ; Source tokens/s 1740 ; Perplexity 1094.53
Epoch 1 ; Iteration 1350/1587 ; Learning rate 1.0000 ; Source tokens/s 1743 ; Perplexity 1048.14
Epoch 1 ; Iteration 1400/1587 ; Learning rate 1.0000 ; Source tokens/s 1745 ; Perplexity 1008.05
Epoch 1 ; Iteration 1450/1587 ; Learning rate 1.0000 ; Source tokens/s 1751 ; Perplexity 967.57
Epoch 1 ; Iteration 1500/1587 ; Learning rate 1.0000 ; Source tokens/s 1755 ; Perplexity 932.41
Epoch 1 ; Iteration 1550/1587 ; Learning rate 1.0000 ; Source tokens/s 1759 ; Perplexity 898.71
Validation perplexity: 275.10064334583
Saving checkpoint to 'models/baseline_epoch1_275.10.t7'...
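The checkpoint path above looks like it is built from a prefix, the epoch number, and the validation perplexity rounded to two decimals. A minimal Python sketch of that naming pattern follows; the function name `checkpoint_name` and the `prefix` argument are hypothetical and only mirror what the log shows, not the training tool's actual code.

```python
def checkpoint_name(prefix: str, epoch: int, valid_ppl: float) -> str:
    """Build a checkpoint path shaped like the ones printed in this log.

    Assumption: the perplexity is formatted with two decimals, which is
    what the logged filenames suggest.
    """
    return f"{prefix}_epoch{epoch}_{valid_ppl:.2f}.t7"


if __name__ == "__main__":
    # Validation perplexity taken from the epoch 1 line of the log.
    print(checkpoint_name("models/baseline", 1, 275.10064334583))
```

Encoding the validation perplexity in the filename makes it possible to pick the best checkpoint by sorting filenames, without reopening the log.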
Epoch 2 ; Iteration 50/1587 ; Learning rate 1.0000 ; Source tokens/s 1788 ; Perplexity 270.24
Epoch 2 ; Iteration 100/1587 ; Learning rate 1.0000 ; Source tokens/s 1855 ; Perplexity 274.90
Epoch 2 ; Iteration 150/1587 ; Learning rate 1.0000 ; Source tokens/s 1824 ; Perplexity 275.08
Epoch 2 ; Iteration 200/1587 ; Learning rate 1.0000 ; Source tokens/s 1805 ; Perplexity 271.69
Epoch 2 ; Iteration 250/1587 ; Learning rate 1.0000 ; Source tokens/s 1818 ; Perplexity 267.60
Epoch 2 ; Iteration 300/1587 ; Learning rate 1.0000 ; Source tokens/s 1822 ; Perplexity 264.24
Epoch 2 ; Iteration 350/1587 ; Learning rate 1.0000 ; Source tokens/s 1805 ; Perplexity 260.15
Epoch 2 ; Iteration 400/1587 ; Learning rate 1.0000 ; Source tokens/s 1797 ; Perplexity 255.19
Epoch 2 ; Iteration 450/1587 ; Learning rate 1.0000 ; Source tokens/s 1801 ; Perplexity 250.83
Epoch 2 ; Iteration 500/1587 ; Learning rate 1.0000 ; Source tokens/s 1803 ; Perplexity 246.35
Epoch 2 ; Iteration 550/1587 ; Learning rate 1.0000 ; Source tokens/s 1820 ; Perplexity 243.61
Epoch 2 ; Iteration 600/1587 ; Learning rate 1.0000 ; Source tokens/s 1820 ; Perplexity 240.35
Epoch 2 ; Iteration 650/1587 ; Learning rate 1.0000 ; Source tokens/s 1826 ; Perplexity 237.36
Epoch 2 ; Iteration 700/1587 ; Learning rate 1.0000 ; Source tokens/s 1821 ; Perplexity 233.45
Epoch 2 ; Iteration 750/1587 ; Learning rate 1.0000 ; Source tokens/s 1821 ; Perplexity 230.06
Epoch 2 ; Iteration 800/1587 ; Learning rate 1.0000 ; Source tokens/s 1826 ; Perplexity 226.20
Epoch 2 ; Iteration 850/1587 ; Learning rate 1.0000 ; Source tokens/s 1825 ; Perplexity 222.75
Epoch 2 ; Iteration 900/1587 ; Learning rate 1.0000 ; Source tokens/s 1836 ; Perplexity 220.01
Epoch 2 ; Iteration 950/1587 ; Learning rate 1.0000 ; Source tokens/s 1838 ; Perplexity 216.68
Epoch 2 ; Iteration 1000/1587 ; Learning rate 1.0000 ; Source tokens/s 1834 ; Perplexity 213.42
Epoch 2 ; Iteration 1050/1587 ; Learning rate 1.0000 ; Source tokens/s 1835 ; Perplexity 210.49
Epoch 2 ; Iteration 1100/1587 ; Learning rate 1.0000 ; Source tokens/s 1835 ; Perplexity 207.81
Epoch 2 ; Iteration 1150/1587 ; Learning rate 1.0000 ; Source tokens/s 1833 ; Perplexity 204.49
Epoch 2 ; Iteration 1200/1587 ; Learning rate 1.0000 ; Source tokens/s 1835 ; Perplexity 201.47
Epoch 2 ; Iteration 1250/1587 ; Learning rate 1.0000 ; Source tokens/s 1834 ; Perplexity 198.52
Epoch 2 ; Iteration 1300/1587 ; Learning rate 1.0000 ; Source tokens/s 1833 ; Perplexity 195.71
Epoch 2 ; Iteration 1350/1587 ; Learning rate 1.0000 ; Source tokens/s 1838 ; Perplexity 192.95
Epoch 2 ; Iteration 1400/1587 ; Learning rate 1.0000 ; Source tokens/s 1841 ; Perplexity 190.10
Epoch 2 ; Iteration 1450/1587 ; Learning rate 1.0000 ; Source tokens/s 1839 ; Perplexity 187.43
Epoch 2 ; Iteration 1500/1587 ; Learning rate 1.0000 ; Source tokens/s 1840 ; Perplexity 184.93
Epoch 2 ; Iteration 1550/1587 ; Learning rate 1.0000 ; Source tokens/s 1837 ; Perplexity 182.65
Validation perplexity: 94.961664704637
Saving checkpoint to 'models/baseline_epoch2_94.96.t7'...
Epoch 3 ; Iteration 50/1587 ; Learning rate 1.0000 ; Source tokens/s 1821 ; Perplexity 104.79
Epoch 3 ; Iteration 100/1587 ; Learning rate 1.0000 ; Source tokens/s 1850 ; Perplexity 105.00
Epoch 3 ; Iteration 150/1587 ; Learning rate 1.0000 ; Source tokens/s 1832 ; Perplexity 103.46
Epoch 3 ; Iteration 200/1587 ; Learning rate 1.0000 ; Source tokens/s 1847 ; Perplexity 102.35
Epoch 3 ; Iteration 250/1587 ; Learning rate 1.0000 ; Source tokens/s 1833 ; Perplexity 101.39
Epoch 3 ; Iteration 300/1587 ; Learning rate 1.0000 ; Source tokens/s 1842 ; Perplexity 101.58
Epoch 3 ; Iteration 350/1587 ; Learning rate 1.0000 ; Source tokens/s 1841 ; Perplexity 100.87
Epoch 3 ; Iteration 400/1587 ; Learning rate 1.0000 ; Source tokens/s 1835 ; Perplexity 100.89
Epoch 3 ; Iteration 450/1587 ; Learning rate 1.0000 ; Source tokens/s 1834 ; Perplexity 99.90
Epoch 3 ; Iteration 500/1587 ; Learning rate 1.0000 ; Source tokens/s 1836 ; Perplexity 99.40
Epoch 3 ; Iteration 550/1587 ; Learning rate 1.0000 ; Source tokens/s 1833 ; Perplexity 98.06
Epoch 3 ; Iteration 600/1587 ; Learning rate 1.0000 ; Source tokens/s 1834 ; Perplexity 97.24
Epoch 3 ; Iteration 650/1587 ; Learning rate 1.0000 ; Source tokens/s 1831 ; Perplexity 96.49
Epoch 3 ; Iteration 700/1587 ; Learning rate 1.0000 ; Source tokens/s 1830 ; Perplexity 95.83
Epoch 3 ; Iteration 750/1587 ; Learning rate 1.0000 ; Source tokens/s 1821 ; Perplexity 94.96
Epoch 3 ; Iteration 800/1587 ; Learning rate 1.0000 ; Source tokens/s 1826 ; Perplexity 94.36
Epoch 3 ; Iteration 850/1587 ; Learning rate 1.0000 ; Source tokens/s 1824 ; Perplexity 93.93
Epoch 3 ; Iteration 900/1587 ; Learning rate 1.0000 ; Source tokens/s 1823 ; Perplexity 93.00
Epoch 3 ; Iteration 950/1587 ; Learning rate 1.0000 ; Source tokens/s 1822 ; Perplexity 92.36
Epoch 3 ; Iteration 1000/1587 ; Learning rate 1.0000 ; Source tokens/s 1825 ; Perplexity 91.76
Epoch 3 ; Iteration 1050/1587 ; Learning rate 1.0000 ; Source tokens/s 1826 ; Perplexity 91.31
Epoch 3 ; Iteration 1100/1587 ; Learning rate 1.0000 ; Source tokens/s 1826 ; Perplexity 90.59
Epoch 3 ; Iteration 1150/1587 ; Learning rate 1.0000 ; Source tokens/s 1827 ; Perplexity 89.91
Epoch 3 ; Iteration 1200/1587 ; Learning rate 1.0000 ; Source tokens/s 1827 ; Perplexity 89.37
Epoch 3 ; Iteration 1250/1587 ; Learning rate 1.0000 ; Source tokens/s 1828 ; Perplexity 88.79
Epoch 3 ; Iteration 1300/1587 ; Learning rate 1.0000 ; Source tokens/s 1835 ; Perplexity 88.23
Epoch 3 ; Iteration 1350/1587 ; Learning rate 1.0000 ; Source tokens/s 1833 ; Perplexity 87.68
Epoch 3 ; Iteration 1400/1587 ; Learning rate 1.0000 ; Source tokens/s 1831 ; Perplexity 87.09
Epoch 3 ; Iteration 1450/1587 ; Learning rate 1.0000 ; Source tokens/s 1834 ; Perplexity 86.62
Epoch 3 ; Iteration 1500/1587 ; Learning rate 1.0000 ; Source tokens/s 1837 ; Perplexity 86.11
Epoch 3 ; Iteration 1550/1587 ; Learning rate 1.0000 ; Source tokens/s 1837 ; Perplexity 85.62
Validation perplexity: 55.471133808158
Saving checkpoint to 'models/baseline_epoch3_55.47.t7'...
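Every progress line in this log has the same semicolon-separated layout, so it can be read mechanically for plotting or for comparing runs. The Python sketch below is one way to do that; the names `PROGRESS`, `VALIDATION`, and `parse_line` and the dictionary keys are illustrative choices, not part of the training tool. The patterns use `search`, so stray characters after the perplexity value are ignored.

```python
import re

# Pattern for the periodic training lines, e.g.
# "Epoch 3 ; Iteration 1550/1587 ; Learning rate 1.0000 ; Source tokens/s 1837 ; Perplexity 85.62"
PROGRESS = re.compile(
    r"Epoch (\d+) ; Iteration (\d+)/(\d+) ; Learning rate ([\d.]+) ; "
    r"Source tokens/s (\d+) ; Perplexity ([\d.]+)"
)
# Pattern for the end-of-epoch validation lines.
VALIDATION = re.compile(r"Validation perplexity: ([\d.]+)")


def parse_line(line):
    """Return a dict for a training or validation line, or None otherwise."""
    m = PROGRESS.search(line)
    if m:
        epoch, iteration, total, lr, tokens, ppl = m.groups()
        return {
            "kind": "train",
            "epoch": int(epoch),
            "iteration": int(iteration),
            "iterations": int(total),
            "learning_rate": float(lr),
            "source_tokens_per_s": int(tokens),
            "perplexity": float(ppl),
        }
    m = VALIDATION.search(line)
    if m:
        return {"kind": "valid", "perplexity": float(m.group(1))}
    return None  # header lines, checkpoint messages, and so on


if __name__ == "__main__":
    sample = ("Epoch 3 ; Iteration 1550/1587 ; Learning rate 1.0000 ; "
              "Source tokens/s 1837 ; Perplexity 85.62")
    print(parse_line(sample))
```

Collecting the `valid` records in order gives one validation perplexity per epoch, which is the quickest way to see where the run starts to plateau.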
Epoch 4 ; Iteration 50/1587 ; Learning rate 1.0000 ; Source tokens/s 1928 ; Perplexity 60.31
Epoch 4 ; Iteration 100/1587 ; Learning rate 1.0000 ; Source tokens/s 1864 ; Perplexity 59.71
Epoch 4 ; Iteration 150/1587 ; Learning rate 1.0000 ; Source tokens/s 1869 ; Perplexity 60.24
Epoch 4 ; Iteration 200/1587 ; Learning rate 1.0000 ; Source tokens/s 1863 ; Perplexity 59.44
Epoch 4 ; Iteration 250/1587 ; Learning rate 1.0000 ; Source tokens/s 1863 ; Perplexity 59.33
Epoch 4 ; Iteration 300/1587 ; Learning rate 1.0000 ; Source tokens/s 1858 ; Perplexity 58.78
Epoch 4 ; Iteration 350/1587 ; Learning rate 1.0000 ; Source tokens/s 1784 ; Perplexity 58.58
Epoch 4 ; Iteration 400/1587 ; Learning rate 1.0000 ; Source tokens/s 1564 ; Perplexity 58.83
Epoch 4 ; Iteration 450/1587 ; Learning rate 1.0000 ; Source tokens/s 1415 ; Perplexity 58.76
Epoch 4 ; Iteration 500/1587 ; Learning rate 1.0000 ; Source tokens/s 1334 ; Perplexity 58.35
Epoch 4 ; Iteration 550/1587 ; Learning rate 1.0000 ; Source tokens/s 1264 ; Perplexity 58.25
Epoch 4 ; Iteration 600/1587 ; Learning rate 1.0000 ; Source tokens/s 1212 ; Perplexity 58.03
Epoch 4 ; Iteration 650/1587 ; Learning rate 1.0000 ; Source tokens/s 1169 ; Perplexity 57.93
Epoch 4 ; Iteration 700/1587 ; Learning rate 1.0000 ; Source tokens/s 1133 ; Perplexity 57.86
Epoch 4 ; Iteration 750/1587 ; Learning rate 1.0000 ; Source tokens/s 1108 ; Perplexity 57.66
Epoch 4 ; Iteration 800/1587 ; Learning rate 1.0000 ; Source tokens/s 1084 ; Perplexity 57.55
Epoch 4 ; Iteration 850/1587 ; Learning rate 1.0000 ; Source tokens/s 1064 ; Perplexity 57.44
Epoch 4 ; Iteration 900/1587 ; Learning rate 1.0000 ; Source tokens/s 1047 ; Perplexity 57.39
Epoch 4 ; Iteration 950/1587 ; Learning rate 1.0000 ; Source tokens/s 1034 ; Perplexity 57.12
Epoch 4 ; Iteration 1000/1587 ; Learning rate 1.0000 ; Source tokens/s 1021 ; Perplexity 57.00
Epoch 4 ; Iteration 1050/1587 ; Learning rate 1.0000 ; Source tokens/s 1012 ; Perplexity 56.93
Epoch 4 ; Iteration 1100/1587 ; Learning rate 1.0000 ; Source tokens/s 1002 ; Perplexity 56.60
Epoch 4 ; Iteration 1150/1587 ; Learning rate 1.0000 ; Source tokens/s 993 ; Perplexity 56.51
Epoch 4 ; Iteration 1200/1587 ; Learning rate 1.0000 ; Source tokens/s 985 ; Perplexity 56.18
Epoch 4 ; Iteration 1250/1587 ; Learning rate 1.0000 ; Source tokens/s 980 ; Perplexity 56.12
Epoch 4 ; Iteration 1300/1587 ; Learning rate 1.0000 ; Source tokens/s 973 ; Perplexity 55.96
Epoch 4 ; Iteration 1350/1587 ; Learning rate 1.0000 ; Source tokens/s 967 ; Perplexity 55.83
Epoch 4 ; Iteration 1400/1587 ; Learning rate 1.0000 ; Source tokens/s 962 ; Perplexity 55.71
Epoch 4 ; Iteration 1450/1587 ; Learning rate 1.0000 ; Source tokens/s 956 ; Perplexity 55.48
Epoch 4 ; Iteration 1500/1587 ; Learning rate 1.0000 ; Source tokens/s 953 ; Perplexity 55.32
Epoch 4 ; Iteration 1550/1587 ; Learning rate 1.0000 ; Source tokens/s 947 ; Perplexity 55.15
Validation perplexity: 43.754144572934
Saving checkpoint to 'models/baseline_epoch4_43.75.t7'...
Epoch 5 ; Iteration 50/1587 ; Learning rate 1.0000 ; Source tokens/s 874 ; Perplexity 42.76
Epoch 5 ; Iteration 100/1587 ; Learning rate 1.0000 ; Source tokens/s 850 ; Perplexity 41.09
Epoch 5 ; Iteration 150/1587 ; Learning rate 1.0000 ; Source tokens/s 824 ; Perplexity 41.11
Epoch 5 ; Iteration 200/1587 ; Learning rate 1.0000 ; Source tokens/s 831 ; Perplexity 40.90
Epoch 5 ; Iteration 250/1587 ; Learning rate 1.0000 ; Source tokens/s 831 ; Perplexity 41.27
Epoch 5 ; Iteration 300/1587 ; Learning rate 1.0000 ; Source tokens/s 836 ; Perplexity 41.87
Epoch 5 ; Iteration 350/1587 ; Learning rate 1.0000 ; Source tokens/s 838 ; Perplexity 41.56
Epoch 5 ; Iteration 400/1587 ; Learning rate 1.0000 ; Source tokens/s 836 ; Perplexity 41.56
Epoch 5 ; Iteration 450/1587 ; Learning rate 1.0000 ; Source tokens/s 829 ; Perplexity 41.35
Epoch 5 ; Iteration 500/1587 ; Learning rate 1.0000 ; Source tokens/s 832 ; Perplexity 41.30
Epoch 5 ; Iteration 550/1587 ; Learning rate 1.0000 ; Source tokens/s 828 ; Perplexity 41.23
Epoch 5 ; Iteration 600/1587 ; Learning rate 1.0000 ; Source tokens/s 830 ; Perplexity 41.34
Epoch 5 ; Iteration 650/1587 ; Learning rate 1.0000 ; Source tokens/s 831 ; Perplexity 41.51
Epoch 5 ; Iteration 700/1587 ; Learning rate 1.0000 ; Source tokens/s 832 ; Perplexity 41.60
Epoch 5 ; Iteration 750/1587 ; Learning rate 1.0000 ; Source tokens/s 831 ; Perplexity 41.63
Epoch 5 ; Iteration 800/1587 ; Learning rate 1.0000 ; Source tokens/s 832 ; Perplexity 41.76
Epoch 5 ; Iteration 850/1587 ; Learning rate 1.0000 ; Source tokens/s 834 ; Perplexity 41.86
Epoch 5 ; Iteration 900/1587 ; Learning rate 1.0000 ; Source tokens/s 834 ; Perplexity 41.73
Epoch 5 ; Iteration 950/1587 ; Learning rate 1.0000 ; Source tokens/s 835 ; Perplexity 41.66
Epoch 5 ; Iteration 1000/1587 ; Learning rate 1.0000 ; Source tokens/s 835 ; Perplexity 41.62
Epoch 5 ; Iteration 1050/1587 ; Learning rate 1.0000 ; Source tokens/s 839 ; Perplexity 41.62
Epoch 5 ; Iteration 1100/1587 ; Learning rate 1.0000 ; Source tokens/s 839 ; Perplexity 41.52
Epoch 5 ; Iteration 1150/1587 ; Learning rate 1.0000 ; Source tokens/s 837 ; Perplexity 41.48
Epoch 5 ; Iteration 1200/1587 ; Learning rate 1.0000 ; Source tokens/s 835 ; Perplexity 41.38
Epoch 5 ; Iteration 1250/1587 ; Learning rate 1.0000 ; Source tokens/s 837 ; Perplexity 41.30
Epoch 5 ; Iteration 1300/1587 ; Learning rate 1.0000 ; Source tokens/s 837 ; Perplexity 41.30
Epoch 5 ; Iteration 1350/1587 ; Learning rate 1.0000 ; Source tokens/s 838 ; Perplexity 41.20
Epoch 5 ; Iteration 1400/1587 ; Learning rate 1.0000 ; Source tokens/s 839 ; Perplexity 41.12
Epoch 5 ; Iteration 1450/1587 ; Learning rate 1.0000 ; Source tokens/s 842 ; Perplexity 41.14
Epoch 5 ; Iteration 1500/1587 ; Learning rate 1.0000 ; Source tokens/s 843 ; Perplexity 41.12
Epoch 5 ; Iteration 1550/1587 ; Learning rate 1.0000 ; Source tokens/s 840 ; Perplexity 41.07
Validation perplexity: 35.248375765707
Saving checkpoint to 'models/baseline_epoch5_35.25.t7'...
Epoch 6 ; Iteration 50/1587 ; Learning rate 1.0000 ; Source tokens/s 811 ; Perplexity 32.54
Epoch 6 ; Iteration 100/1587 ; Learning rate 1.0000 ; Source tokens/s 807 ; Perplexity 32.51
Epoch 6 ; Iteration 150/1587 ; Learning rate 1.0000 ; Source tokens/s 818 ; Perplexity 32.59
Epoch 6 ; Iteration 200/1587 ; Learning rate 1.0000 ; Source tokens/s 801 ; Perplexity 32.15
Epoch 6 ; Iteration 250/1587 ; Learning rate 1.0000 ; Source tokens/s 812 ; Perplexity 32.23
Epoch 6 ; Iteration 300/1587 ; Learning rate 1.0000 ; Source tokens/s 818 ; Perplexity 32.35
Epoch 6 ; Iteration 350/1587 ; Learning rate 1.0000 ; Source tokens/s 830 ; Perplexity 32.59
Epoch 6 ; Iteration 400/1587 ; Learning rate 1.0000 ; Source tokens/s 831 ; Perplexity 32.73
Epoch 6 ; Iteration 450/1587 ; Learning rate 1.0000 ; Source tokens/s 831 ; Perplexity 32.69
Epoch 6 ; Iteration 500/1587 ; Learning rate 1.0000 ; Source tokens/s 829 ; Perplexity 32.74
Epoch 6 ; Iteration 550/1587 ; Learning rate 1.0000 ; Source tokens/s 829 ; Perplexity 32.93
Epoch 6 ; Iteration 600/1587 ; Learning rate 1.0000 ; Source tokens/s 829 ; Perplexity 33.09
Epoch 6 ; Iteration 650/1587 ; Learning rate 1.0000 ; Source tokens/s 830 ; Perplexity 32.98
Epoch 6 ; Iteration 700/1587 ; Learning rate 1.0000 ; Source tokens/s 829 ; Perplexity 32.94
Epoch 6 ; Iteration 750/1587 ; Learning rate 1.0000 ; Source tokens/s 831 ; Perplexity 32.84
Epoch 6 ; Iteration 800/1587 ; Learning rate 1.0000 ; Source tokens/s 831 ; Perplexity 32.74
Epoch 6 ; Iteration 850/1587 ; Learning rate 1.0000 ; Source tokens/s 831 ; Perplexity 32.69
Epoch 6 ; Iteration 900/1587 ; Learning rate 1.0000 ; Source tokens/s 831 ; Perplexity 32.63
Epoch 6 ; Iteration 950/1587 ; Learning rate 1.0000 ; Source tokens/s 832 ; Perplexity 32.73
Epoch 6 ; Iteration 1000/1587 ; Learning rate 1.0000 ; Source tokens/s 832 ; Perplexity 32.67
Epoch 6 ; Iteration 1050/1587 ; Learning rate 1.0000 ; Source tokens/s 832 ; Perplexity 32.71
Epoch 6 ; Iteration 1100/1587 ; Learning rate 1.0000 ; Source tokens/s 831 ; Perplexity 32.67
Epoch 6 ; Iteration 1150/1587 ; Learning rate 1.0000 ; Source tokens/s 834 ; Perplexity 32.79
Epoch 6 ; Iteration 1200/1587 ; Learning rate 1.0000 ; Source tokens/s 834 ; Perplexity 32.82
Epoch 6 ; Iteration 1250/1587 ; Learning rate 1.0000 ; Source tokens/s 833 ; Perplexity 32.81
Epoch 6 ; Iteration 1300/1587 ; Learning rate 1.0000 ; Source tokens/s 832 ; Perplexity 32.84
Epoch 6 ; Iteration 1350/1587 ; Learning rate 1.0000 ; Source tokens/s 832 ; Perplexity 32.84
Epoch 6 ; Iteration 1400/1587 ; Learning rate 1.0000 ; Source tokens/s 833 ; Perplexity 32.88
Epoch 6 ; Iteration 1450/1587 ; Learning rate 1.0000 ; Source tokens/s 833 ; Perplexity 32.91
Epoch 6 ; Iteration 1500/1587 ; Learning rate 1.0000 ; Source tokens/s 834 ; Perplexity 32.91
Epoch 6 ; Iteration 1550/1587 ; Learning rate 1.0000 ; Source tokens/s 835 ; Perplexity 32.89
Validation perplexity: 31.98985843374
Saving checkpoint to 'models/baseline_epoch6_31.99.t7'...
Epoch 7 ; Iteration 50/1587 ; Learning rate 1.0000 ; Source tokens/s 846 ; Perplexity 27.00
Epoch 7 ; Iteration 100/1587 ; Learning rate 1.0000 ; Source tokens/s 864 ; Perplexity 26.83
Epoch 7 ; Iteration 150/1587 ; Learning rate 1.0000 ; Source tokens/s 847 ; Perplexity 26.66
Epoch 7 ; Iteration 200/1587 ; Learning rate 1.0000 ; Source tokens/s 839 ; Perplexity 26.58
Epoch 7 ; Iteration 250/1587 ; Learning rate 1.0000 ; Source tokens/s 846 ; Perplexity 26.90
Epoch 7 ; Iteration 300/1587 ; Learning rate 1.0000 ; Source tokens/s 841 ; Perplexity 26.73
Epoch 7 ; Iteration 350/1587 ; Learning rate 1.0000 ; Source tokens/s 844 ; Perplexity 26.69
Epoch 7 ; Iteration 400/1587 ; Learning rate 1.0000 ; Source tokens/s 842 ; Perplexity 26.64
Epoch 7 ; Iteration 450/1587 ; Learning rate 1.0000 ; Source tokens/s 843 ; Perplexity 26.70
Epoch 7 ; Iteration 500/1587 ; Learning rate 1.0000 ; Source tokens/s 843 ; Perplexity 26.71
Epoch 7 ; Iteration 550/1587 ; Learning rate 1.0000 ; Source tokens/s 843 ; Perplexity 26.69
Epoch 7 ; Iteration 600/1587 ; Learning rate 1.0000 ; Source tokens/s 838 ; Perplexity 26.82
Epoch 7 ; Iteration 650/1587 ; Learning rate 1.0000 ; Source tokens/s 838 ; Perplexity 27.01
Epoch 7 ; Iteration 700/1587 ; Learning rate 1.0000 ; Source tokens/s 834 ; Perplexity 26.99
Epoch 7 ; Iteration 750/1587 ; Learning rate 1.0000 ; Source tokens/s 836 ; Perplexity 27.11
Epoch 7 ; Iteration 800/1587 ; Learning rate 1.0000 ; Source tokens/s 834 ; Perplexity 27.15
Epoch 7 ; Iteration 850/1587 ; Learning rate 1.0000 ; Source tokens/s 836 ; Perplexity 27.09
Epoch 7 ; Iteration 900/1587 ; Learning rate 1.0000 ; Source tokens/s 840 ; Perplexity 27.18
Epoch 7 ; Iteration 950/1587 ; Learning rate 1.0000 ; Source tokens/s 838 ; Perplexity 27.17
Epoch 7 ; Iteration 1000/1587 ; Learning rate 1.0000 ; Source tokens/s 836 ; Perplexity 27.20
Epoch 7 ; Iteration 1050/1587 ; Learning rate 1.0000 ; Source tokens/s 836 ; Perplexity 27.30
Epoch 7 ; Iteration 1100/1587 ; Learning rate 1.0000 ; Source tokens/s 839 ; Perplexity 27.31
Epoch 7 ; Iteration 1150/1587 ; Learning rate 1.0000 ; Source tokens/s 839 ; Perplexity 27.35
Epoch 7 ; Iteration 1200/1587 ; Learning rate 1.0000 ; Source tokens/s 839 ; Perplexity 27.39
Epoch 7 ; Iteration 1250/1587 ; Learning rate 1.0000 ; Source tokens/s 840 ; Perplexity 27.40
Epoch 7 ; Iteration 1300/1587 ; Learning rate 1.0000 ; Source tokens/s 841 ; Perplexity 27.43
Epoch 7 ; Iteration 1350/1587 ; Learning rate 1.0000 ; Source tokens/s 840 ; Perplexity 27.42
Epoch 7 ; Iteration 1400/1587 ; Learning rate 1.0000 ; Source tokens/s 841 ; Perplexity 27.45
Epoch 7 ; Iteration 1450/1587 ; Learning rate 1.0000 ; Source tokens/s 840 ; Perplexity 27.48
Epoch 7 ; Iteration 1500/1587 ; Learning rate 1.0000 ; Source tokens/s 840 ; Perplexity 27.50
Epoch 7 ; Iteration 1550/1587 ; Learning rate 1.0000 ; Source tokens/s 840 ; Perplexity 27.50
Validation perplexity: 28.895515910504
Saving checkpoint to 'models/baseline_epoch7_28.90.t7'...
Epoch 8 ; Iteration 50/1587 ; Learning rate 1.0000 ; Source tokens/s 823 ; Perplexity 22.72
Epoch 8 ; Iteration 100/1587 ; Learning rate 1.0000 ; Source tokens/s 839 ; Perplexity 22.44
Epoch 8 ; Iteration 150/1587 ; Learning rate 1.0000 ; Source tokens/s 863 ; Perplexity 22.41
Epoch 8 ; Iteration 200/1587 ; Learning rate 1.0000 ; Source tokens/s 855 ; Perplexity 22.78
Epoch 8 ; Iteration 250/1587 ; Learning rate 1.0000 ; Source tokens/s 849 ; Perplexity 22.36
Epoch 8 ; Iteration 300/1587 ; Learning rate 1.0000 ; Source tokens/s 853 ; Perplexity 22.67
Epoch 8 ; Iteration 350/1587 ; Learning rate 1.0000 ; Source tokens/s 851 ; Perplexity 22.65
Epoch 8 ; Iteration 400/1587 ; Learning rate 1.0000 ; Source tokens/s 849 ; Perplexity 22.61
Epoch 8 ; Iteration 450/1587 ; Learning rate 1.0000 ; Source tokens/s 847 ; Perplexity 22.70
Epoch 8 ; Iteration 500/1587 ; Learning rate 1.0000 ; Source tokens/s 846 ; Perplexity 22.68
Epoch 8 ; Iteration 550/1587 ; Learning rate 1.0000 ; Source tokens/s 843 ; Perplexity 22.76
Epoch 8 ; Iteration 600/1587 ; Learning rate 1.0000 ; Source tokens/s 841 ; Perplexity 22.86
Epoch 8 ; Iteration 650/1587 ; Learning rate 1.0000 ; Source tokens/s 844 ; Perplexity 23.01
Epoch 8 ; Iteration 700/1587 ; Learning rate 1.0000 ; Source tokens/s 844 ; Perplexity 23.06
Epoch 8 ; Iteration 750/1587 ; Learning rate 1.0000 ; Source tokens/s 844 ; Perplexity 23.13
Epoch 8 ; Iteration 800/1587 ; Learning rate 1.0000 ; Source tokens/s 845 ; Perplexity 23.19
Epoch 8 ; Iteration 850/1587 ; Learning rate 1.0000 ; Source tokens/s 843 ; Perplexity 23.17
Epoch 8 ; Iteration 900/1587 ; Learning rate 1.0000 ; Source tokens/s 838 ; Perplexity 23.15
Epoch 8 ; Iteration 950/1587 ; Learning rate 1.0000 ; Source tokens/s 840 ; Perplexity 23.27
Epoch 8 ; Iteration 1000/1587 ; Learning rate 1.0000 ; Source tokens/s 838 ; Perplexity 23.28
Epoch 8 ; Iteration 1050/1587 ; Learning rate 1.0000 ; Source tokens/s 838 ; Perplexity 23.37
Epoch 8 ; Iteration 1100/1587 ; Learning rate 1.0000 ; Source tokens/s 838 ; Perplexity 23.48
Epoch 8 ; Iteration 1150/1587 ; Learning rate 1.0000 ; Source tokens/s 836 ; Perplexity 23.49
Epoch 8 ; Iteration 1200/1587 ; Learning rate 1.0000 ; Source tokens/s 835 ; Perplexity 23.47
Epoch 8 ; Iteration 1250/1587 ; Learning rate 1.0000 ; Source tokens/s 836 ; Perplexity 23.48
Epoch 8 ; Iteration 1300/1587 ; Learning rate 1.0000 ; Source tokens/s 835 ; Perplexity 23.48
Epoch 8 ; Iteration 1350/1587 ; Learning rate 1.0000 ; Source tokens/s 836 ; Perplexity 23.55
Epoch 8 ; Iteration 1400/1587 ; Learning rate 1.0000 ; Source tokens/s 837 ; Perplexity 23.55
Epoch 8 ; Iteration 1450/1587 ; Learning rate 1.0000 ; Source tokens/s 835 ; Perplexity 23.56
Epoch 8 ; Iteration 1500/1587 ; Learning rate 1.0000 ; Source tokens/s 837 ; Perplexity 23.62
Epoch 8 ; Iteration 1550/1587 ; Learning rate 1.0000 ; Source tokens/s 836 ; Perplexity 23.60
Validation perplexity: 27.403117648073
Saving checkpoint to 'models/baseline_epoch8_27.40.t7'...
Epoch 9 ; Iteration 50/1587 ; Learning rate 1.0000 ; Source tokens/s 856 ; Perplexity 19.06
Epoch 9 ; Iteration 100/1587 ; Learning rate 1.0000 ; Source tokens/s 832 ; Perplexity 19.52
Epoch 9 ; Iteration 150/1587 ; Learning rate 1.0000 ; Source tokens/s 822 ; Perplexity 19.22
Epoch 9 ; Iteration 200/1587 ; Learning rate 1.0000 ; Source tokens/s 814 ; Perplexity 19.31
Epoch 9 ; Iteration 250/1587 ; Learning rate 1.0000 ; Source tokens/s 837 ; Perplexity 19.79
Epoch 9 ; Iteration 300/1587 ; Learning rate 1.0000 ; Source tokens/s 850 ; Perplexity 19.65
Epoch 9 ; Iteration 350/1587 ; Learning rate 1.0000 ; Source tokens/s 842 ; Perplexity 19.66
Epoch 9 ; Iteration 400/1587 ; Learning rate 1.0000 ; Source tokens/s 836 ; Perplexity 19.62
Epoch 9 ; Iteration 450/1587 ; Learning rate 1.0000 ; Source tokens/s 837 ; Perplexity 19.76
Epoch 9 ; Iteration 500/1587 ; Learning rate 1.0000 ; Source tokens/s 844 ; Perplexity 19.93
Epoch 9 ; Iteration 550/1587 ; Learning rate 1.0000 ; Source tokens/s 841 ; Perplexity 19.95
Epoch 9 ; Iteration 600/1587 ; Learning rate 1.0000 ; Source tokens/s 839 ; Perplexity 19.93
Epoch 9 ; Iteration 650/1587 ; Learning rate 1.0000 ; Source tokens/s 837 ; Perplexity 19.93
Epoch 9 ; Iteration 700/1587 ; Learning rate 1.0000 ; Source tokens/s 840 ; Perplexity 20.04
Epoch 9 ; Iteration 750/1587 ; Learning rate 1.0000 ; Source tokens/s 839 ; Perplexity 20.09
Epoch 9 ; Iteration 800/1587 ; Learning rate 1.0000 ; Source tokens/s 843 ; Perplexity 20.21
Epoch 9 ; Iteration 850/1587 ; Learning rate 1.0000 ; Source tokens/s 838 ; Perplexity 20.21
Epoch 9 ; Iteration 900/1587 ; Learning rate 1.0000 ; Source tokens/s 841 ; Perplexity 20.33
Epoch 9 ; Iteration 950/1587 ; Learning rate 1.0000 ; Source tokens/s 843 ; Perplexity 20.33
Epoch 9 ; Iteration 1000/1587 ; Learning rate 1.0000 ; Source tokens/s 842 ; Perplexity 20.38
Epoch 9 ; Iteration 1050/1587 ; Learning rate 1.0000 ; Source tokens/s 841 ; Perplexity 20.37
Epoch 9 ; Iteration 1100/1587 ; Learning rate 1.0000 ; Source tokens/s 839 ; Perplexity 20.43
Epoch 9 ; Iteration 1150/1587 ; Learning rate 1.0000 ; Source tokens/s 840 ; Perplexity 20.47
Epoch 9 ; Iteration 1200/1587 ; Learning rate 1.0000 ; Source tokens/s 839 ; Perplexity 20.51
Epoch 9 ; Iteration 1250/1587 ; Learning rate 1.0000 ; Source tokens/s 837 ; Perplexity 20.48
Epoch 9 ; Iteration 1300/1587 ; Learning rate 1.0000 ; Source tokens/s 837 ; Perplexity 20.48
Epoch 9 ; Iteration 1350/1587 ; Learning rate 1.0000 ; Source tokens/s 837 ; Perplexity 20.52
Epoch 9 ; Iteration 1400/1587 ; Learning rate 1.0000 ; Source tokens/s 837 ; Perplexity 20.54
Epoch 9 ; Iteration 1450/1587 ; Learning rate 1.0000 ; Source tokens/s 837 ; Perplexity 20.56
Epoch 9 ; Iteration 1500/1587 ; Learning rate 1.0000 ; Source tokens/s 838 ; Perplexity 20.59
Epoch 9 ; Iteration 1550/1587 ; Learning rate 1.0000 ; Source tokens/s 839 ; Perplexity 20.64
Validation perplexity: 25.825762106436
Saving checkpoint to 'models/baseline_epoch9_25.83.t7'...
Epoch 10 ; Iteration 50/1587 ; Learning rate 0.5000 ; Source tokens/s 818 ; Perplexity 15.92
Epoch 10 ; Iteration 100/1587 ; Learning rate 0.5000 ; Source tokens/s 822 ; Perplexity 15.73
Epoch 10 ; Iteration 150/1587 ; Learning rate 0.5000 ; Source tokens/s 817 ; Perplexity 15.87
Epoch 10 ; Iteration 200/1587 ; Learning rate 0.5000 ; Source tokens/s 823 ; Perplexity 15.53
Epoch 10 ; Iteration 250/1587 ; Learning rate 0.5000 ; Source tokens/s 827 ; Perplexity 15.60
Epoch 10 ; Iteration 300/1587 ; Learning rate 0.5000 ; Source tokens/s 826 ; Perplexity 15.51
Epoch 10 ; Iteration 350/1587 ; Learning rate 0.5000 ; Source tokens/s 829 ; Perplexity 15.45
Epoch 10 ; Iteration 400/1587 ; Learning rate 0.5000 ; Source tokens/s 820 ; Perplexity 15.37
Epoch 10 ; Iteration 450/1587 ; Learning rate 0.5000 ; Source tokens/s 819 ; Perplexity 15.29
Epoch 10 ; Iteration 500/1587 ; Learning rate 0.5000 ; Source tokens/s 817 ; Perplexity 15.23
Epoch 10 ; Iteration 550/1587 ; Learning rate 0.5000 ; Source tokens/s 820 ; Perplexity 15.21
Epoch 10 ; Iteration 600/1587 ; Learning rate 0.5000 ; Source tokens/s 822 ; Perplexity 15.28
Epoch 10 ; Iteration 650/1587 ; Learning rate 0.5000 ; Source tokens/s 822 ; Perplexity 15.29
Epoch 10 ; Iteration 700/1587 ; Learning rate 0.5000 ; Source tokens/s 824 ; Perplexity 15.30
Epoch 10 ; Iteration 750/1587 ; Learning rate 0.5000 ; Source tokens/s 826 ; Perplexity 15.26
Epoch 10 ; Iteration 800/1587 ; Learning rate 0.5000 ; Source tokens/s 827 ; Perplexity 15.31
Epoch 10 ; Iteration 850/1587 ; Learning rate 0.5000 ; Source tokens/s 827 ; Perplexity 15.32
Epoch 10 ; Iteration 900/1587 ; Learning rate 0.5000 ; Source tokens/s 826 ; Perplexity 15.33
Epoch 10 ; Iteration 950/1587 ; Learning rate 0.5000 ; Source tokens/s 826 ; Perplexity 15.35
Epoch 10 ; Iteration 1000/1587 ; Learning rate 0.5000 ; Source tokens/s 828 ; Perplexity 15.41
Epoch 10 ; Iteration 1050/1587 ; Learning rate 0.5000 ; Source tokens/s 832 ; Perplexity 15.45
Epoch 10 ; Iteration 1100/1587 ; Learning rate 0.5000 ; Source tokens/s 832 ; Perplexity 15.48
Epoch 10 ; Iteration 1150/1587 ; Learning rate 0.5000 ; Source tokens/s 833 ; Perplexity 15.43
Epoch 10 ; Iteration 1200/1587 ; Learning rate 0.5000 ; Source tokens/s 832 ; Perplexity 15.42
Epoch 10 ; Iteration 1250/1587 ; Learning rate 0.5000 ; Source tokens/s 832 ; Perplexity 15.41
Epoch 10 ; Iteration 1300/1587 ; Learning rate 0.5000 ; Source tokens/s 833 ; Perplexity 15.44
Epoch 10 ; Iteration 1350/1587 ; Learning rate 0.5000 ; Source tokens/s 836 ; Perplexity 15.46
Epoch 10 ; Iteration 1400/1587 ; Learning rate 0.5000 ; Source tokens/s 839 ; Perplexity 15.49
Epoch 10 ; Iteration 1450/1587 ; Learning rate 0.5000 ; Source tokens/s 840 ; Perplexity 15.48
Epoch 10 ; Iteration 1500/1587 ; Learning rate 0.5000 ; Source tokens/s 841 ; Perplexity 15.49
Epoch 10 ; Iteration 1550/1587 ; Learning rate 0.5000 ; Source tokens/s 840 ; Perplexity 15.50
Validation perplexity: 23.989800570611
Saving checkpoint to 'models/baseline_epoch10_23.99.t7'...
Epoch 11 ; Iteration 50/1587 ; Learning rate 0.2500 ; Source tokens/s 780 ; Perplexity 12.87 | |
Epoch 11 ; Iteration 100/1587 ; Learning rate 0.2500 ; Source tokens/s 831 ; Perplexity 13.25 | |
Epoch 11 ; Iteration 150/1587 ; Learning rate 0.2500 ; Source tokens/s 839 ; Perplexity 13.02 | |
Epoch 11 ; Iteration 200/1587 ; Learning rate 0.2500 ; Source tokens/s 844 ; Perplexity 12.91 | |
Epoch 11 ; Iteration 250/1587 ; Learning rate 0.2500 ; Source tokens/s 847 ; Perplexity 13.01 | |
Epoch 11 ; Iteration 300/1587 ; Learning rate 0.2500 ; Source tokens/s 841 ; Perplexity 12.95 | |
Epoch 11 ; Iteration 350/1587 ; Learning rate 0.2500 ; Source tokens/s 839 ; Perplexity 12.90 | |
Epoch 11 ; Iteration 400/1587 ; Learning rate 0.2500 ; Source tokens/s 836 ; Perplexity 12.89 | |
Epoch 11 ; Iteration 450/1587 ; Learning rate 0.2500 ; Source tokens/s 833 ; Perplexity 12.87 | |
Epoch 11 ; Iteration 500/1587 ; Learning rate 0.2500 ; Source tokens/s 835 ; Perplexity 12.79 | |
Epoch 11 ; Iteration 550/1587 ; Learning rate 0.2500 ; Source tokens/s 834 ; Perplexity 12.77 | |
Epoch 11 ; Iteration 600/1587 ; Learning rate 0.2500 ; Source tokens/s 839 ; Perplexity 12.80 | |
Epoch 11 ; Iteration 650/1587 ; Learning rate 0.2500 ; Source tokens/s 843 ; Perplexity 12.86 | |
Epoch 11 ; Iteration 700/1587 ; Learning rate 0.2500 ; Source tokens/s 842 ; Perplexity 12.89 | |
Epoch 11 ; Iteration 750/1587 ; Learning rate 0.2500 ; Source tokens/s 842 ; Perplexity 12.87 | |
Epoch 11 ; Iteration 800/1587 ; Learning rate 0.2500 ; Source tokens/s 839 ; Perplexity 12.86 | |
Epoch 11 ; Iteration 850/1587 ; Learning rate 0.2500 ; Source tokens/s 841 ; Perplexity 12.89 | |
Epoch 11 ; Iteration 900/1587 ; Learning rate 0.2500 ; Source tokens/s 838 ; Perplexity 12.86 | |
Epoch 11 ; Iteration 950/1587 ; Learning rate 0.2500 ; Source tokens/s 837 ; Perplexity 12.83 | |
Epoch 11 ; Iteration 1000/1587 ; Learning rate 0.2500 ; Source tokens/s 839 ; Perplexity 12.85 | |
Epoch 11 ; Iteration 1050/1587 ; Learning rate 0.2500 ; Source tokens/s 837 ; Perplexity 12.80 | |
Epoch 11 ; Iteration 1100/1587 ; Learning rate 0.2500 ; Source tokens/s 837 ; Perplexity 12.80 | |
Epoch 11 ; Iteration 1150/1587 ; Learning rate 0.2500 ; Source tokens/s 838 ; Perplexity 12.85 | |
Epoch 11 ; Iteration 1200/1587 ; Learning rate 0.2500 ; Source tokens/s 838 ; Perplexity 12.84 | |
Epoch 11 ; Iteration 1250/1587 ; Learning rate 0.2500 ; Source tokens/s 837 ; Perplexity 12.83 | |
Epoch 11 ; Iteration 1300/1587 ; Learning rate 0.2500 ; Source tokens/s 838 ; Perplexity 12.84 | |
Epoch 11 ; Iteration 1350/1587 ; Learning rate 0.2500 ; Source tokens/s 837 ; Perplexity 12.86 | |
Epoch 11 ; Iteration 1400/1587 ; Learning rate 0.2500 ; Source tokens/s 837 ; Perplexity 12.86 | |
Epoch 11 ; Iteration 1450/1587 ; Learning rate 0.2500 ; Source tokens/s 837 ; Perplexity 12.87 | |
Epoch 11 ; Iteration 1500/1587 ; Learning rate 0.2500 ; Source tokens/s 837 ; Perplexity 12.86 | |
Epoch 11 ; Iteration 1550/1587 ; Learning rate 0.2500 ; Source tokens/s 835 ; Perplexity 12.83 | |
Validation perplexity: 22.960085550097 | |
Saving checkpoint to 'models/baseline_epoch11_22.96.t7'... | |
Epoch 12 ; Iteration 50/1587 ; Learning rate 0.1250 ; Source tokens/s 863 ; Perplexity 12.03 | |
Epoch 12 ; Iteration 100/1587 ; Learning rate 0.1250 ; Source tokens/s 835 ; Perplexity 11.52 | |
Epoch 12 ; Iteration 150/1587 ; Learning rate 0.1250 ; Source tokens/s 847 ; Perplexity 11.65 | |
Epoch 12 ; Iteration 200/1587 ; Learning rate 0.1250 ; Source tokens/s 845 ; Perplexity 11.59 | |
Epoch 12 ; Iteration 250/1587 ; Learning rate 0.1250 ; Source tokens/s 853 ; Perplexity 11.71 | |
Epoch 12 ; Iteration 300/1587 ; Learning rate 0.1250 ; Source tokens/s 852 ; Perplexity 11.88 | |
Epoch 12 ; Iteration 350/1587 ; Learning rate 0.1250 ; Source tokens/s 849 ; Perplexity 11.82 | |
Epoch 12 ; Iteration 400/1587 ; Learning rate 0.1250 ; Source tokens/s 841 ; Perplexity 11.70 | |
Epoch 12 ; Iteration 450/1587 ; Learning rate 0.1250 ; Source tokens/s 838 ; Perplexity 11.64 | |
Epoch 12 ; Iteration 500/1587 ; Learning rate 0.1250 ; Source tokens/s 839 ; Perplexity 11.69 | |
Epoch 12 ; Iteration 550/1587 ; Learning rate 0.1250 ; Source tokens/s 835 ; Perplexity 11.57 | |
Epoch 12 ; Iteration 600/1587 ; Learning rate 0.1250 ; Source tokens/s 836 ; Perplexity 11.58 | |
Epoch 12 ; Iteration 650/1587 ; Learning rate 0.1250 ; Source tokens/s 835 ; Perplexity 11.60 | |
Epoch 12 ; Iteration 700/1587 ; Learning rate 0.1250 ; Source tokens/s 835 ; Perplexity 11.57 | |
Epoch 12 ; Iteration 750/1587 ; Learning rate 0.1250 ; Source tokens/s 836 ; Perplexity 11.62 | |
Epoch 12 ; Iteration 800/1587 ; Learning rate 0.1250 ; Source tokens/s 841 ; Perplexity 11.63 | |
Epoch 12 ; Iteration 850/1587 ; Learning rate 0.1250 ; Source tokens/s 843 ; Perplexity 11.67 | |
Epoch 12 ; Iteration 900/1587 ; Learning rate 0.1250 ; Source tokens/s 841 ; Perplexity 11.65 | |
Epoch 12 ; Iteration 950/1587 ; Learning rate 0.1250 ; Source tokens/s 845 ; Perplexity 11.67 | |
Epoch 12 ; Iteration 1000/1587 ; Learning rate 0.1250 ; Source tokens/s 844 ; Perplexity 11.68 | |
Epoch 12 ; Iteration 1050/1587 ; Learning rate 0.1250 ; Source tokens/s 843 ; Perplexity 11.66 | |
Epoch 12 ; Iteration 1100/1587 ; Learning rate 0.1250 ; Source tokens/s 842 ; Perplexity 11.65 | |
Epoch 12 ; Iteration 1150/1587 ; Learning rate 0.1250 ; Source tokens/s 842 ; Perplexity 11.65 | |
Epoch 12 ; Iteration 1200/1587 ; Learning rate 0.1250 ; Source tokens/s 841 ; Perplexity 11.64 | |
Epoch 12 ; Iteration 1250/1587 ; Learning rate 0.1250 ; Source tokens/s 843 ; Perplexity 11.66 | |
Epoch 12 ; Iteration 1300/1587 ; Learning rate 0.1250 ; Source tokens/s 842 ; Perplexity 11.64 | |
Epoch 12 ; Iteration 1350/1587 ; Learning rate 0.1250 ; Source tokens/s 841 ; Perplexity 11.61 | |
Epoch 12 ; Iteration 1400/1587 ; Learning rate 0.1250 ; Source tokens/s 841 ; Perplexity 11.61 | |
Epoch 12 ; Iteration 1450/1587 ; Learning rate 0.1250 ; Source tokens/s 842 ; Perplexity 11.63 | |
Epoch 12 ; Iteration 1500/1587 ; Learning rate 0.1250 ; Source tokens/s 841 ; Perplexity 11.61 | |
Epoch 12 ; Iteration 1550/1587 ; Learning rate 0.1250 ; Source tokens/s 839 ; Perplexity 11.58 | |
Validation perplexity: 22.833532364382 | |
Saving checkpoint to 'models/baseline_epoch12_22.83.t7'... | |
Epoch 13 ; Iteration 50/1587 ; Learning rate 0.0625 ; Source tokens/s 815 ; Perplexity 10.74 | |
Epoch 13 ; Iteration 100/1587 ; Learning rate 0.0625 ; Source tokens/s 824 ; Perplexity 11.03 | |
Epoch 13 ; Iteration 150/1587 ; Learning rate 0.0625 ; Source tokens/s 828 ; Perplexity 11.10 | |
Epoch 13 ; Iteration 200/1587 ; Learning rate 0.0625 ; Source tokens/s 821 ; Perplexity 10.92 | |
Epoch 13 ; Iteration 250/1587 ; Learning rate 0.0625 ; Source tokens/s 828 ; Perplexity 10.89 | |
Epoch 13 ; Iteration 300/1587 ; Learning rate 0.0625 ; Source tokens/s 832 ; Perplexity 11.01 | |
Epoch 13 ; Iteration 350/1587 ; Learning rate 0.0625 ; Source tokens/s 830 ; Perplexity 10.93 | |
Epoch 13 ; Iteration 400/1587 ; Learning rate 0.0625 ; Source tokens/s 831 ; Perplexity 10.95 | |
Epoch 13 ; Iteration 450/1587 ; Learning rate 0.0625 ; Source tokens/s 829 ; Perplexity 10.93 | |
Epoch 13 ; Iteration 500/1587 ; Learning rate 0.0625 ; Source tokens/s 829 ; Perplexity 10.94 | |
Epoch 13 ; Iteration 550/1587 ; Learning rate 0.0625 ; Source tokens/s 828 ; Perplexity 10.90 | |
Epoch 13 ; Iteration 600/1587 ; Learning rate 0.0625 ; Source tokens/s 829 ; Perplexity 10.89 | |
Epoch 13 ; Iteration 650/1587 ; Learning rate 0.0625 ; Source tokens/s 843 ; Perplexity 10.89 | |
Epoch 13 ; Iteration 700/1587 ; Learning rate 0.0625 ; Source tokens/s 877 ; Perplexity 10.84 | |
Epoch 13 ; Iteration 750/1587 ; Learning rate 0.0625 ; Source tokens/s 906 ; Perplexity 10.85 | |
Epoch 13 ; Iteration 800/1587 ; Learning rate 0.0625 ; Source tokens/s 939 ; Perplexity 10.89 | |
Epoch 13 ; Iteration 850/1587 ; Learning rate 0.0625 ; Source tokens/s 930 ; Perplexity 10.88 | |
Epoch 13 ; Iteration 900/1587 ; Learning rate 0.0625 ; Source tokens/s 921 ; Perplexity 10.85 | |
Epoch 13 ; Iteration 950/1587 ; Learning rate 0.0625 ; Source tokens/s 916 ; Perplexity 10.82 | |
Epoch 13 ; Iteration 1000/1587 ; Learning rate 0.0625 ; Source tokens/s 910 ; Perplexity 10.85 | |
Epoch 13 ; Iteration 1050/1587 ; Learning rate 0.0625 ; Source tokens/s 904 ; Perplexity 10.82 | |
Epoch 13 ; Iteration 1100/1587 ; Learning rate 0.0625 ; Source tokens/s 899 ; Perplexity 10.81 | |
Epoch 13 ; Iteration 1150/1587 ; Learning rate 0.0625 ; Source tokens/s 897 ; Perplexity 10.83 | |
Epoch 13 ; Iteration 1200/1587 ; Learning rate 0.0625 ; Source tokens/s 896 ; Perplexity 10.89 | |
Epoch 13 ; Iteration 1250/1587 ; Learning rate 0.0625 ; Source tokens/s 894 ; Perplexity 10.93 | |
Epoch 13 ; Iteration 1300/1587 ; Learning rate 0.0625 ; Source tokens/s 891 ; Perplexity 10.91 | |
Epoch 13 ; Iteration 1350/1587 ; Learning rate 0.0625 ; Source tokens/s 890 ; Perplexity 10.95 | |
Epoch 13 ; Iteration 1400/1587 ; Learning rate 0.0625 ; Source tokens/s 888 ; Perplexity 10.96 | |
Epoch 13 ; Iteration 1450/1587 ; Learning rate 0.0625 ; Source tokens/s 886 ; Perplexity 10.95 | |
Epoch 13 ; Iteration 1500/1587 ; Learning rate 0.0625 ; Source tokens/s 884 ; Perplexity 10.95 | |
Epoch 13 ; Iteration 1550/1587 ; Learning rate 0.0625 ; Source tokens/s 883 ; Perplexity 10.94 | |
Validation perplexity: 22.913811798902 | |
Saving checkpoint to 'models/baseline_epoch13_22.91.t7'... |