Last active December 30, 2016 07:38
seq2seq with batch normalization enabled
Loading data from 'data/small-train.t7'... | |
* vocabulary size: source = 50004; target = 50004 | |
* additional features: source = 0; target = 0 | |
* maximum sequence length: source = 50; target = 51 | |
* number of training sentences: 100000 | |
* maximum batch size: 64 | |
Building model... | |
* using input feeding | |
Initializing parameters... | |
* number of parameters: 84834004 | |
Preparing memory optimization... | |
* sharing 79% of output/gradInput tensors memory between clones | |
Start training... | |
Epoch 1 ; Iteration 50/1587 ; Learning rate 1.0000 ; Source tokens/s 568 ; Perplexity 89976.49 | |
Epoch 1 ; Iteration 100/1587 ; Learning rate 1.0000 ; Source tokens/s 696 ; Perplexity 38050.13 | |
Epoch 1 ; Iteration 150/1587 ; Learning rate 1.0000 ; Source tokens/s 767 ; Perplexity 24768.86 | |
Epoch 1 ; Iteration 200/1587 ; Learning rate 1.0000 ; Source tokens/s 808 ; Perplexity 18641.86 | |
Epoch 1 ; Iteration 250/1587 ; Learning rate 1.0000 ; Source tokens/s 825 ; Perplexity 14317.86 | |
Epoch 1 ; Iteration 300/1587 ; Learning rate 1.0000 ; Source tokens/s 833 ; Perplexity 12003.24 | |
Epoch 1 ; Iteration 350/1587 ; Learning rate 1.0000 ; Source tokens/s 849 ; Perplexity 9978.76 | |
Epoch 1 ; Iteration 400/1587 ; Learning rate 1.0000 ; Source tokens/s 861 ; Perplexity 8682.25 | |
Epoch 1 ; Iteration 450/1587 ; Learning rate 1.0000 ; Source tokens/s 872 ; Perplexity 7726.82 | |
Epoch 1 ; Iteration 500/1587 ; Learning rate 1.0000 ; Source tokens/s 881 ; Perplexity 6839.93 | |
Epoch 1 ; Iteration 550/1587 ; Learning rate 1.0000 ; Source tokens/s 893 ; Perplexity 6090.45 | |
Epoch 1 ; Iteration 600/1587 ; Learning rate 1.0000 ; Source tokens/s 895 ; Perplexity 5571.32 | |
Epoch 1 ; Iteration 650/1587 ; Learning rate 1.0000 ; Source tokens/s 898 ; Perplexity 5135.74 | |
Epoch 1 ; Iteration 700/1587 ; Learning rate 1.0000 ; Source tokens/s 899 ; Perplexity 4752.13 | |
Epoch 1 ; Iteration 750/1587 ; Learning rate 1.0000 ; Source tokens/s 902 ; Perplexity 4420.40 | |
Epoch 1 ; Iteration 800/1587 ; Learning rate 1.0000 ; Source tokens/s 905 ; Perplexity 4134.08 | |
Epoch 1 ; Iteration 850/1587 ; Learning rate 1.0000 ; Source tokens/s 910 ; Perplexity 3876.58 | |
Epoch 1 ; Iteration 900/1587 ; Learning rate 1.0000 ; Source tokens/s 911 ; Perplexity 3688.81 | |
Epoch 1 ; Iteration 950/1587 ; Learning rate 1.0000 ; Source tokens/s 912 ; Perplexity 3499.98 | |
Epoch 1 ; Iteration 1000/1587 ; Learning rate 1.0000 ; Source tokens/s 915 ; Perplexity 3343.33 | |
Epoch 1 ; Iteration 1050/1587 ; Learning rate 1.0000 ; Source tokens/s 916 ; Perplexity 3201.50 | |
Epoch 1 ; Iteration 1100/1587 ; Learning rate 1.0000 ; Source tokens/s 917 ; Perplexity 3074.55 | |
Epoch 1 ; Iteration 1150/1587 ; Learning rate 1.0000 ; Source tokens/s 918 ; Perplexity 2962.87 | |
Epoch 1 ; Iteration 1200/1587 ; Learning rate 1.0000 ; Source tokens/s 918 ; Perplexity 2862.08 | |
Epoch 1 ; Iteration 1250/1587 ; Learning rate 1.0000 ; Source tokens/s 920 ; Perplexity 2769.04 | |
Epoch 1 ; Iteration 1300/1587 ; Learning rate 1.0000 ; Source tokens/s 921 ; Perplexity 2689.65 | |
Epoch 1 ; Iteration 1350/1587 ; Learning rate 1.0000 ; Source tokens/s 923 ; Perplexity 2615.62 | |
Epoch 1 ; Iteration 1400/1587 ; Learning rate 1.0000 ; Source tokens/s 924 ; Perplexity 2549.87 | |
Epoch 1 ; Iteration 1450/1587 ; Learning rate 1.0000 ; Source tokens/s 926 ; Perplexity 2486.84 | |
Epoch 1 ; Iteration 1500/1587 ; Learning rate 1.0000 ; Source tokens/s 925 ; Perplexity 2430.59 | |
Epoch 1 ; Iteration 1550/1587 ; Learning rate 1.0000 ; Source tokens/s 925 ; Perplexity 2380.55 | |
Validation perplexity: 950.45464907671 | |
Saving checkpoint to 'models/master2_c_epoch1_950.45.t7'... | |
Epoch 2 ; Iteration 50/1587 ; Learning rate 1.0000 ; Source tokens/s 968 ; Perplexity 1200.15 | |
Epoch 2 ; Iteration 100/1587 ; Learning rate 1.0000 ; Source tokens/s 967 ; Perplexity 1212.35 | |
Epoch 2 ; Iteration 150/1587 ; Learning rate 1.0000 ; Source tokens/s 948 ; Perplexity 1195.37 | |
Epoch 2 ; Iteration 200/1587 ; Learning rate 1.0000 ; Source tokens/s 961 ; Perplexity 1194.98 | |
Epoch 2 ; Iteration 250/1587 ; Learning rate 1.0000 ; Source tokens/s 958 ; Perplexity 1173.84 | |
Epoch 2 ; Iteration 300/1587 ; Learning rate 1.0000 ; Source tokens/s 951 ; Perplexity 1159.89 | |
Epoch 2 ; Iteration 350/1587 ; Learning rate 1.0000 ; Source tokens/s 953 ; Perplexity 1146.81 | |
Epoch 2 ; Iteration 400/1587 ; Learning rate 1.0000 ; Source tokens/s 955 ; Perplexity 1137.07 | |
Epoch 2 ; Iteration 450/1587 ; Learning rate 1.0000 ; Source tokens/s 949 ; Perplexity 1127.32 | |
Epoch 2 ; Iteration 500/1587 ; Learning rate 1.0000 ; Source tokens/s 947 ; Perplexity 1120.15 | |
Epoch 2 ; Iteration 550/1587 ; Learning rate 1.0000 ; Source tokens/s 946 ; Perplexity 1112.84 | |
Epoch 2 ; Iteration 600/1587 ; Learning rate 1.0000 ; Source tokens/s 950 ; Perplexity 1110.48 | |
Epoch 2 ; Iteration 650/1587 ; Learning rate 1.0000 ; Source tokens/s 948 ; Perplexity 1107.41 | |
Epoch 2 ; Iteration 700/1587 ; Learning rate 1.0000 ; Source tokens/s 949 ; Perplexity 1103.62 | |
Epoch 2 ; Iteration 750/1587 ; Learning rate 1.0000 ; Source tokens/s 948 ; Perplexity 1099.63 | |
Epoch 2 ; Iteration 800/1587 ; Learning rate 1.0000 ; Source tokens/s 947 ; Perplexity 1097.62 | |
Epoch 2 ; Iteration 850/1587 ; Learning rate 1.0000 ; Source tokens/s 945 ; Perplexity 1096.00 | |
Epoch 2 ; Iteration 900/1587 ; Learning rate 1.0000 ; Source tokens/s 948 ; Perplexity 1094.80 | |
Epoch 2 ; Iteration 950/1587 ; Learning rate 1.0000 ; Source tokens/s 945 ; Perplexity 1094.88 | |
Epoch 2 ; Iteration 1000/1587 ; Learning rate 1.0000 ; Source tokens/s 945 ; Perplexity 1094.81 | |
Epoch 2 ; Iteration 1050/1587 ; Learning rate 1.0000 ; Source tokens/s 944 ; Perplexity 1094.91 | |
Epoch 2 ; Iteration 1100/1587 ; Learning rate 1.0000 ; Source tokens/s 941 ; Perplexity 1094.17 | |
Epoch 2 ; Iteration 1150/1587 ; Learning rate 1.0000 ; Source tokens/s 942 ; Perplexity 1093.79 | |
Epoch 2 ; Iteration 1200/1587 ; Learning rate 1.0000 ; Source tokens/s 942 ; Perplexity 1093.59 | |
Epoch 2 ; Iteration 1250/1587 ; Learning rate 1.0000 ; Source tokens/s 942 ; Perplexity 1094.23 | |
Epoch 2 ; Iteration 1300/1587 ; Learning rate 1.0000 ; Source tokens/s 943 ; Perplexity 1094.88 | |
Epoch 2 ; Iteration 1350/1587 ; Learning rate 1.0000 ; Source tokens/s 944 ; Perplexity 1094.58 | |
Epoch 2 ; Iteration 1400/1587 ; Learning rate 1.0000 ; Source tokens/s 943 ; Perplexity 1093.94 | |
Epoch 2 ; Iteration 1450/1587 ; Learning rate 1.0000 ; Source tokens/s 944 ; Perplexity 1094.14 | |
Epoch 2 ; Iteration 1500/1587 ; Learning rate 1.0000 ; Source tokens/s 945 ; Perplexity 1094.49 | |
Epoch 2 ; Iteration 1550/1587 ; Learning rate 1.0000 ; Source tokens/s 945 ; Perplexity 1094.91 | |
Validation perplexity: 970.56549385217 | |
Saving checkpoint to 'models/master2_c_epoch2_970.57.t7'... | |
Epoch 3 ; Iteration 50/1587 ; Learning rate 0.5000 ; Source tokens/s 935 ; Perplexity 1071.35 | |
Epoch 3 ; Iteration 100/1587 ; Learning rate 0.5000 ; Source tokens/s 941 ; Perplexity 1090.61 | |
Epoch 3 ; Iteration 150/1587 ; Learning rate 0.5000 ; Source tokens/s 933 ; Perplexity 1084.42 | |
Epoch 3 ; Iteration 200/1587 ; Learning rate 0.5000 ; Source tokens/s 933 ; Perplexity 1091.91 | |
Epoch 3 ; Iteration 250/1587 ; Learning rate 0.5000 ; Source tokens/s 933 ; Perplexity 1089.04 | |
Epoch 3 ; Iteration 300/1587 ; Learning rate 0.5000 ; Source tokens/s 931 ; Perplexity 1089.90 | |
Epoch 3 ; Iteration 350/1587 ; Learning rate 0.5000 ; Source tokens/s 932 ; Perplexity 1090.33 | |
Epoch 3 ; Iteration 400/1587 ; Learning rate 0.5000 ; Source tokens/s 936 ; Perplexity 1090.96 | |
Epoch 3 ; Iteration 450/1587 ; Learning rate 0.5000 ; Source tokens/s 937 ; Perplexity 1090.25 | |
Epoch 3 ; Iteration 500/1587 ; Learning rate 0.5000 ; Source tokens/s 944 ; Perplexity 1088.60 | |
Epoch 3 ; Iteration 550/1587 ; Learning rate 0.5000 ; Source tokens/s 950 ; Perplexity 1089.68 | |
Epoch 3 ; Iteration 600/1587 ; Learning rate 0.5000 ; Source tokens/s 950 ; Perplexity 1089.66 | |
Epoch 3 ; Iteration 650/1587 ; Learning rate 0.5000 ; Source tokens/s 951 ; Perplexity 1089.48 | |
Epoch 3 ; Iteration 700/1587 ; Learning rate 0.5000 ; Source tokens/s 950 ; Perplexity 1088.60 | |
Epoch 3 ; Iteration 750/1587 ; Learning rate 0.5000 ; Source tokens/s 948 ; Perplexity 1088.10 | |
Epoch 3 ; Iteration 800/1587 ; Learning rate 0.5000 ; Source tokens/s 947 ; Perplexity 1086.87 | |
Epoch 3 ; Iteration 850/1587 ; Learning rate 0.5000 ; Source tokens/s 946 ; Perplexity 1086.54 | |
Epoch 3 ; Iteration 900/1587 ; Learning rate 0.5000 ; Source tokens/s 944 ; Perplexity 1086.92 | |
Epoch 3 ; Iteration 950/1587 ; Learning rate 0.5000 ; Source tokens/s 945 ; Perplexity 1086.27 | |
Epoch 3 ; Iteration 1000/1587 ; Learning rate 0.5000 ; Source tokens/s 946 ; Perplexity 1086.65 | |
Epoch 3 ; Iteration 1050/1587 ; Learning rate 0.5000 ; Source tokens/s 947 ; Perplexity 1087.17 | |
Epoch 3 ; Iteration 1100/1587 ; Learning rate 0.5000 ; Source tokens/s 947 ; Perplexity 1087.47 | |
Epoch 3 ; Iteration 1150/1587 ; Learning rate 0.5000 ; Source tokens/s 947 ; Perplexity 1087.76 | |
Epoch 3 ; Iteration 1200/1587 ; Learning rate 0.5000 ; Source tokens/s 947 ; Perplexity 1088.23 | |
Epoch 3 ; Iteration 1250/1587 ; Learning rate 0.5000 ; Source tokens/s 947 ; Perplexity 1088.20 | |
Epoch 3 ; Iteration 1300/1587 ; Learning rate 0.5000 ; Source tokens/s 946 ; Perplexity 1087.65 | |
Epoch 3 ; Iteration 1350/1587 ; Learning rate 0.5000 ; Source tokens/s 945 ; Perplexity 1088.28 | |
Epoch 3 ; Iteration 1400/1587 ; Learning rate 0.5000 ; Source tokens/s 945 ; Perplexity 1088.59 | |
Epoch 3 ; Iteration 1450/1587 ; Learning rate 0.5000 ; Source tokens/s 946 ; Perplexity 1089.03 | |
Epoch 3 ; Iteration 1500/1587 ; Learning rate 0.5000 ; Source tokens/s 946 ; Perplexity 1089.19 | |
Epoch 3 ; Iteration 1550/1587 ; Learning rate 0.5000 ; Source tokens/s 945 ; Perplexity 1089.26 | |
Validation perplexity: 962.52278471218 | |
Saving checkpoint to 'models/master2_c_epoch3_962.52.t7'... | |
Epoch 4 ; Iteration 50/1587 ; Learning rate 0.2500 ; Source tokens/s 981 ; Perplexity 1086.34 | |
Epoch 4 ; Iteration 100/1587 ; Learning rate 0.2500 ; Source tokens/s 977 ; Perplexity 1089.36 | |
Epoch 4 ; Iteration 150/1587 ; Learning rate 0.2500 ; Source tokens/s 965 ; Perplexity 1094.90 | |
Epoch 4 ; Iteration 200/1587 ; Learning rate 0.2500 ; Source tokens/s 964 ; Perplexity 1095.74 | |
Epoch 4 ; Iteration 250/1587 ; Learning rate 0.2500 ; Source tokens/s 956 ; Perplexity 1091.06 | |
Epoch 4 ; Iteration 300/1587 ; Learning rate 0.2500 ; Source tokens/s 954 ; Perplexity 1093.44 | |
Epoch 4 ; Iteration 350/1587 ; Learning rate 0.2500 ; Source tokens/s 949 ; Perplexity 1091.02 | |
Epoch 4 ; Iteration 400/1587 ; Learning rate 0.2500 ; Source tokens/s 947 ; Perplexity 1090.95 | |
Epoch 4 ; Iteration 450/1587 ; Learning rate 0.2500 ; Source tokens/s 948 ; Perplexity 1091.66 | |
Epoch 4 ; Iteration 500/1587 ; Learning rate 0.2500 ; Source tokens/s 946 ; Perplexity 1091.39 | |
Epoch 4 ; Iteration 550/1587 ; Learning rate 0.2500 ; Source tokens/s 948 ; Perplexity 1091.22 | |
Epoch 4 ; Iteration 600/1587 ; Learning rate 0.2500 ; Source tokens/s 950 ; Perplexity 1089.93 | |
Epoch 4 ; Iteration 650/1587 ; Learning rate 0.2500 ; Source tokens/s 950 ; Perplexity 1089.40 | |
Epoch 4 ; Iteration 700/1587 ; Learning rate 0.2500 ; Source tokens/s 949 ; Perplexity 1090.39 | |
Epoch 4 ; Iteration 750/1587 ; Learning rate 0.2500 ; Source tokens/s 947 ; Perplexity 1090.05 | |
Epoch 4 ; Iteration 800/1587 ; Learning rate 0.2500 ; Source tokens/s 948 ; Perplexity 1089.72 | |
Epoch 4 ; Iteration 850/1587 ; Learning rate 0.2500 ; Source tokens/s 946 ; Perplexity 1089.31 | |
Epoch 4 ; Iteration 900/1587 ; Learning rate 0.2500 ; Source tokens/s 948 ; Perplexity 1089.51 | |
Epoch 4 ; Iteration 950/1587 ; Learning rate 0.2500 ; Source tokens/s 947 ; Perplexity 1088.93 | |
Epoch 4 ; Iteration 1000/1587 ; Learning rate 0.2500 ; Source tokens/s 946 ; Perplexity 1089.00 | |
Epoch 4 ; Iteration 1050/1587 ; Learning rate 0.2500 ; Source tokens/s 946 ; Perplexity 1089.05 | |
Epoch 4 ; Iteration 1100/1587 ; Learning rate 0.2500 ; Source tokens/s 945 ; Perplexity 1088.49 | |
Epoch 4 ; Iteration 1150/1587 ; Learning rate 0.2500 ; Source tokens/s 945 ; Perplexity 1089.00 | |
Epoch 4 ; Iteration 1200/1587 ; Learning rate 0.2500 ; Source tokens/s 945 ; Perplexity 1089.06 | |
Epoch 4 ; Iteration 1250/1587 ; Learning rate 0.2500 ; Source tokens/s 944 ; Perplexity 1089.66 | |
Epoch 4 ; Iteration 1300/1587 ; Learning rate 0.2500 ; Source tokens/s 944 ; Perplexity 1089.17 | |
Epoch 4 ; Iteration 1350/1587 ; Learning rate 0.2500 ; Source tokens/s 945 ; Perplexity 1089.43 | |
Epoch 4 ; Iteration 1400/1587 ; Learning rate 0.2500 ; Source tokens/s 945 ; Perplexity 1089.70 | |
Epoch 4 ; Iteration 1450/1587 ; Learning rate 0.2500 ; Source tokens/s 946 ; Perplexity 1089.70 | |
Epoch 4 ; Iteration 1500/1587 ; Learning rate 0.2500 ; Source tokens/s 946 ; Perplexity 1089.65 | |
Epoch 4 ; Iteration 1550/1587 ; Learning rate 0.2500 ; Source tokens/s 945 ; Perplexity 1089.34 | |
Validation perplexity: 968.62977799903 | |
Saving checkpoint to 'models/master2_c_epoch4_968.63.t7'... | |
Epoch 5 ; Iteration 50/1587 ; Learning rate 0.1250 ; Source tokens/s 946 ; Perplexity 1056.67 | |
Epoch 5 ; Iteration 100/1587 ; Learning rate 0.1250 ; Source tokens/s 960 ; Perplexity 1073.13 | |
Epoch 5 ; Iteration 150/1587 ; Learning rate 0.1250 ; Source tokens/s 956 ; Perplexity 1073.70 | |
Epoch 5 ; Iteration 200/1587 ; Learning rate 0.1250 ; Source tokens/s 954 ; Perplexity 1073.78 | |
Epoch 5 ; Iteration 250/1587 ; Learning rate 0.1250 ; Source tokens/s 951 ; Perplexity 1082.66 | |
Epoch 5 ; Iteration 300/1587 ; Learning rate 0.1250 ; Source tokens/s 955 ; Perplexity 1085.42 | |
Epoch 5 ; Iteration 350/1587 ; Learning rate 0.1250 ; Source tokens/s 950 ; Perplexity 1083.44 | |
Epoch 5 ; Iteration 400/1587 ; Learning rate 0.1250 ; Source tokens/s 956 ; Perplexity 1082.75 | |
Epoch 5 ; Iteration 450/1587 ; Learning rate 0.1250 ; Source tokens/s 957 ; Perplexity 1086.22 | |
Epoch 5 ; Iteration 500/1587 ; Learning rate 0.1250 ; Source tokens/s 955 ; Perplexity 1085.63 | |
Epoch 5 ; Iteration 550/1587 ; Learning rate 0.1250 ; Source tokens/s 953 ; Perplexity 1084.17 | |
Epoch 5 ; Iteration 600/1587 ; Learning rate 0.1250 ; Source tokens/s 953 ; Perplexity 1085.38 | |
Epoch 5 ; Iteration 650/1587 ; Learning rate 0.1250 ; Source tokens/s 949 ; Perplexity 1084.69 | |
Epoch 5 ; Iteration 700/1587 ; Learning rate 0.1250 ; Source tokens/s 947 ; Perplexity 1085.55 | |
Epoch 5 ; Iteration 750/1587 ; Learning rate 0.1250 ; Source tokens/s 945 ; Perplexity 1086.10 | |
Epoch 5 ; Iteration 800/1587 ; Learning rate 0.1250 ; Source tokens/s 947 ; Perplexity 1087.42 | |
Epoch 5 ; Iteration 850/1587 ; Learning rate 0.1250 ; Source tokens/s 950 ; Perplexity 1088.29 | |
Epoch 5 ; Iteration 900/1587 ; Learning rate 0.1250 ; Source tokens/s 948 ; Perplexity 1089.14 | |
Epoch 5 ; Iteration 950/1587 ; Learning rate 0.1250 ; Source tokens/s 947 ; Perplexity 1088.31 | |
Epoch 5 ; Iteration 1000/1587 ; Learning rate 0.1250 ; Source tokens/s 946 ; Perplexity 1087.61 | |
Epoch 5 ; Iteration 1050/1587 ; Learning rate 0.1250 ; Source tokens/s 946 ; Perplexity 1087.26 | |
Epoch 5 ; Iteration 1100/1587 ; Learning rate 0.1250 ; Source tokens/s 947 ; Perplexity 1087.29 | |
Epoch 5 ; Iteration 1150/1587 ; Learning rate 0.1250 ; Source tokens/s 946 ; Perplexity 1086.35 | |
Epoch 5 ; Iteration 1200/1587 ; Learning rate 0.1250 ; Source tokens/s 946 ; Perplexity 1087.10 | |
Epoch 5 ; Iteration 1250/1587 ; Learning rate 0.1250 ; Source tokens/s 946 ; Perplexity 1086.96 | |
Epoch 5 ; Iteration 1300/1587 ; Learning rate 0.1250 ; Source tokens/s 946 ; Perplexity 1087.26 | |
Epoch 5 ; Iteration 1350/1587 ; Learning rate 0.1250 ; Source tokens/s 945 ; Perplexity 1087.63 | |
Epoch 5 ; Iteration 1400/1587 ; Learning rate 0.1250 ; Source tokens/s 944 ; Perplexity 1088.35 | |
Epoch 5 ; Iteration 1450/1587 ; Learning rate 0.1250 ; Source tokens/s 946 ; Perplexity 1088.85 | |
Epoch 5 ; Iteration 1500/1587 ; Learning rate 0.1250 ; Source tokens/s 946 ; Perplexity 1088.42 | |
Epoch 5 ; Iteration 1550/1587 ; Learning rate 0.1250 ; Source tokens/s 945 ; Perplexity 1088.10 | |
Validation perplexity: 965.09351712684 | |
Saving checkpoint to 'models/master2_c_epoch5_965.09.t7'... | |
Epoch 6 ; Iteration 50/1587 ; Learning rate 0.0625 ; Source tokens/s 971 ; Perplexity 1072.91 | |
Epoch 6 ; Iteration 100/1587 ; Learning rate 0.0625 ; Source tokens/s 971 ; Perplexity 1087.93 | |
Epoch 6 ; Iteration 150/1587 ; Learning rate 0.0625 ; Source tokens/s 974 ; Perplexity 1089.82 | |
Epoch 6 ; Iteration 200/1587 ; Learning rate 0.0625 ; Source tokens/s 964 ; Perplexity 1082.23 | |
Epoch 6 ; Iteration 250/1587 ; Learning rate 0.0625 ; Source tokens/s 966 ; Perplexity 1083.00 | |
Epoch 6 ; Iteration 300/1587 ; Learning rate 0.0625 ; Source tokens/s 973 ; Perplexity 1088.59 | |
Epoch 6 ; Iteration 350/1587 ; Learning rate 0.0625 ; Source tokens/s 966 ; Perplexity 1088.21 | |
Epoch 6 ; Iteration 400/1587 ; Learning rate 0.0625 ; Source tokens/s 962 ; Perplexity 1089.78 | |
Epoch 6 ; Iteration 450/1587 ; Learning rate 0.0625 ; Source tokens/s 962 ; Perplexity 1091.51 | |
Epoch 6 ; Iteration 500/1587 ; Learning rate 0.0625 ; Source tokens/s 962 ; Perplexity 1090.06 | |
Epoch 6 ; Iteration 550/1587 ; Learning rate 0.0625 ; Source tokens/s 959 ; Perplexity 1092.91 | |
Epoch 6 ; Iteration 600/1587 ; Learning rate 0.0625 ; Source tokens/s 958 ; Perplexity 1094.88 | |
Epoch 6 ; Iteration 650/1587 ; Learning rate 0.0625 ; Source tokens/s 958 ; Perplexity 1095.66 | |
Epoch 6 ; Iteration 700/1587 ; Learning rate 0.0625 ; Source tokens/s 957 ; Perplexity 1095.19 | |
Epoch 6 ; Iteration 750/1587 ; Learning rate 0.0625 ; Source tokens/s 958 ; Perplexity 1093.79 | |
Epoch 6 ; Iteration 800/1587 ; Learning rate 0.0625 ; Source tokens/s 956 ; Perplexity 1093.30 | |
Epoch 6 ; Iteration 850/1587 ; Learning rate 0.0625 ; Source tokens/s 950 ; Perplexity 1091.97 | |
Epoch 6 ; Iteration 900/1587 ; Learning rate 0.0625 ; Source tokens/s 949 ; Perplexity 1090.74 | |
Epoch 6 ; Iteration 950/1587 ; Learning rate 0.0625 ; Source tokens/s 947 ; Perplexity 1091.07 | |
Epoch 6 ; Iteration 1000/1587 ; Learning rate 0.0625 ; Source tokens/s 947 ; Perplexity 1091.13 | |
Epoch 6 ; Iteration 1050/1587 ; Learning rate 0.0625 ; Source tokens/s 946 ; Perplexity 1090.87 | |
Epoch 6 ; Iteration 1100/1587 ; Learning rate 0.0625 ; Source tokens/s 947 ; Perplexity 1090.72 | |
Epoch 6 ; Iteration 1150/1587 ; Learning rate 0.0625 ; Source tokens/s 946 ; Perplexity 1091.13 | |
Epoch 6 ; Iteration 1200/1587 ; Learning rate 0.0625 ; Source tokens/s 946 ; Perplexity 1091.02 | |
Epoch 6 ; Iteration 1250/1587 ; Learning rate 0.0625 ; Source tokens/s 947 ; Perplexity 1090.97 | |
Epoch 6 ; Iteration 1300/1587 ; Learning rate 0.0625 ; Source tokens/s 945 ; Perplexity 1090.11 | |
Epoch 6 ; Iteration 1350/1587 ; Learning rate 0.0625 ; Source tokens/s 945 ; Perplexity 1090.08 | |
Epoch 6 ; Iteration 1400/1587 ; Learning rate 0.0625 ; Source tokens/s 945 ; Perplexity 1089.14 | |
Epoch 6 ; Iteration 1450/1587 ; Learning rate 0.0625 ; Source tokens/s 945 ; Perplexity 1089.28 | |
Epoch 6 ; Iteration 1500/1587 ; Learning rate 0.0625 ; Source tokens/s 946 ; Perplexity 1089.73 | |
Epoch 6 ; Iteration 1550/1587 ; Learning rate 0.0625 ; Source tokens/s 945 ; Perplexity 1088.52 | |
Validation perplexity: 966.980620381 | |
Saving checkpoint to 'models/master2_c_epoch6_966.98.t7'... | |
Epoch 7 ; Iteration 50/1587 ; Learning rate 0.0312 ; Source tokens/s 950 ; Perplexity 1084.67 | |
Epoch 7 ; Iteration 100/1587 ; Learning rate 0.0312 ; Source tokens/s 951 ; Perplexity 1083.14 | |
Epoch 7 ; Iteration 150/1587 ; Learning rate 0.0312 ; Source tokens/s 936 ; Perplexity 1079.40 | |
Epoch 7 ; Iteration 200/1587 ; Learning rate 0.0312 ; Source tokens/s 927 ; Perplexity 1077.76 | |
Epoch 7 ; Iteration 250/1587 ; Learning rate 0.0312 ; Source tokens/s 926 ; Perplexity 1078.60 | |
Epoch 7 ; Iteration 300/1587 ; Learning rate 0.0312 ; Source tokens/s 930 ; Perplexity 1078.65 | |
Epoch 7 ; Iteration 350/1587 ; Learning rate 0.0312 ; Source tokens/s 934 ; Perplexity 1079.10 | |
Epoch 7 ; Iteration 400/1587 ; Learning rate 0.0312 ; Source tokens/s 928 ; Perplexity 1077.53 | |
Epoch 7 ; Iteration 450/1587 ; Learning rate 0.0312 ; Source tokens/s 931 ; Perplexity 1077.56 | |
Epoch 7 ; Iteration 500/1587 ; Learning rate 0.0312 ; Source tokens/s 934 ; Perplexity 1077.31 | |
Epoch 7 ; Iteration 550/1587 ; Learning rate 0.0312 ; Source tokens/s 933 ; Perplexity 1078.69 | |
Epoch 7 ; Iteration 600/1587 ; Learning rate 0.0312 ; Source tokens/s 937 ; Perplexity 1079.58 | |
Epoch 7 ; Iteration 650/1587 ; Learning rate 0.0312 ; Source tokens/s 938 ; Perplexity 1080.93 | |
Epoch 7 ; Iteration 700/1587 ; Learning rate 0.0312 ; Source tokens/s 941 ; Perplexity 1081.42 | |
Epoch 7 ; Iteration 750/1587 ; Learning rate 0.0312 ; Source tokens/s 940 ; Perplexity 1082.54 | |
Epoch 7 ; Iteration 800/1587 ; Learning rate 0.0312 ; Source tokens/s 941 ; Perplexity 1082.83 | |
Epoch 7 ; Iteration 850/1587 ; Learning rate 0.0312 ; Source tokens/s 940 ; Perplexity 1083.91 | |
Epoch 7 ; Iteration 900/1587 ; Learning rate 0.0312 ; Source tokens/s 942 ; Perplexity 1084.72 | |
Epoch 7 ; Iteration 950/1587 ; Learning rate 0.0312 ; Source tokens/s 940 ; Perplexity 1084.26 | |
Epoch 7 ; Iteration 1000/1587 ; Learning rate 0.0312 ; Source tokens/s 939 ; Perplexity 1084.69 | |
Epoch 7 ; Iteration 1050/1587 ; Learning rate 0.0312 ; Source tokens/s 940 ; Perplexity 1084.43 | |
Epoch 7 ; Iteration 1100/1587 ; Learning rate 0.0312 ; Source tokens/s 940 ; Perplexity 1085.22 | |
Epoch 7 ; Iteration 1150/1587 ; Learning rate 0.0312 ; Source tokens/s 940 ; Perplexity 1086.08 | |
Epoch 7 ; Iteration 1200/1587 ; Learning rate 0.0312 ; Source tokens/s 941 ; Perplexity 1087.03 | |
Epoch 7 ; Iteration 1250/1587 ; Learning rate 0.0312 ; Source tokens/s 941 ; Perplexity 1087.32 | |
Epoch 7 ; Iteration 1300/1587 ; Learning rate 0.0312 ; Source tokens/s 941 ; Perplexity 1087.39 | |
Epoch 7 ; Iteration 1350/1587 ; Learning rate 0.0312 ; Source tokens/s 942 ; Perplexity 1087.68 | |
Epoch 7 ; Iteration 1400/1587 ; Learning rate 0.0312 ; Source tokens/s 943 ; Perplexity 1088.23 | |
Epoch 7 ; Iteration 1450/1587 ; Learning rate 0.0312 ; Source tokens/s 943 ; Perplexity 1088.46 | |
Epoch 7 ; Iteration 1500/1587 ; Learning rate 0.0312 ; Source tokens/s 942 ; Perplexity 1087.71 | |
Epoch 7 ; Iteration 1550/1587 ; Learning rate 0.0312 ; Source tokens/s 944 ; Perplexity 1088.10 | |
Validation perplexity: 967.23047079327 | |
Saving checkpoint to 'models/master2_c_epoch7_967.23.t7'... | |
Epoch 8 ; Iteration 50/1587 ; Learning rate 0.0156 ; Source tokens/s 962 ; Perplexity 1076.28 | |
Epoch 8 ; Iteration 100/1587 ; Learning rate 0.0156 ; Source tokens/s 973 ; Perplexity 1075.29 | |
Epoch 8 ; Iteration 150/1587 ; Learning rate 0.0156 ; Source tokens/s 963 ; Perplexity 1083.62 | |
Epoch 8 ; Iteration 200/1587 ; Learning rate 0.0156 ; Source tokens/s 958 ; Perplexity 1083.32 | |
Epoch 8 ; Iteration 250/1587 ; Learning rate 0.0156 ; Source tokens/s 961 ; Perplexity 1087.49 | |
Epoch 8 ; Iteration 300/1587 ; Learning rate 0.0156 ; Source tokens/s 963 ; Perplexity 1090.44 | |
Epoch 8 ; Iteration 350/1587 ; Learning rate 0.0156 ; Source tokens/s 953 ; Perplexity 1086.70 | |
Epoch 8 ; Iteration 400/1587 ; Learning rate 0.0156 ; Source tokens/s 954 ; Perplexity 1087.24 | |
Epoch 8 ; Iteration 450/1587 ; Learning rate 0.0156 ; Source tokens/s 949 ; Perplexity 1085.94 | |
Epoch 8 ; Iteration 500/1587 ; Learning rate 0.0156 ; Source tokens/s 951 ; Perplexity 1085.05 | |
Epoch 8 ; Iteration 550/1587 ; Learning rate 0.0156 ; Source tokens/s 950 ; Perplexity 1085.28 | |
Epoch 8 ; Iteration 600/1587 ; Learning rate 0.0156 ; Source tokens/s 949 ; Perplexity 1082.99 | |
Epoch 8 ; Iteration 650/1587 ; Learning rate 0.0156 ; Source tokens/s 943 ; Perplexity 1082.97 | |
Epoch 8 ; Iteration 700/1587 ; Learning rate 0.0156 ; Source tokens/s 941 ; Perplexity 1082.53 | |
Epoch 8 ; Iteration 750/1587 ; Learning rate 0.0156 ; Source tokens/s 940 ; Perplexity 1081.47 | |
Epoch 8 ; Iteration 800/1587 ; Learning rate 0.0156 ; Source tokens/s 938 ; Perplexity 1081.44 | |
Epoch 8 ; Iteration 850/1587 ; Learning rate 0.0156 ; Source tokens/s 937 ; Perplexity 1081.63 | |
Epoch 8 ; Iteration 900/1587 ; Learning rate 0.0156 ; Source tokens/s 937 ; Perplexity 1082.12 | |
Epoch 8 ; Iteration 950/1587 ; Learning rate 0.0156 ; Source tokens/s 939 ; Perplexity 1083.17 | |
Epoch 8 ; Iteration 1000/1587 ; Learning rate 0.0156 ; Source tokens/s 939 ; Perplexity 1083.11 | |
Epoch 8 ; Iteration 1050/1587 ; Learning rate 0.0156 ; Source tokens/s 939 ; Perplexity 1083.51 |