Created
July 23, 2020 20:25
validation loss schedule comparison, mbart, full data
taylanbil@dlrm-gpu-8:~/kkissmart-fairseq/mbart$ paste <( grep valid fulldata-gpu.txt | grep loss | cut -d '|' -f3,6,7) tpu.loss.schedule.txt
fairseq_cli.train    fairseq_cli.train
valid | loss 5.001 | ppl 32.02    valid | loss 5.003 | ppl 32.08
valid_EN | loss 8.41 | ppl 340.13    valid_EN | loss 8.411 | ppl 340.26
valid_IMG | loss 4.621 | ppl 24.61    valid_IMG | loss 4.624 | ppl 24.65
valid | loss 4.767 | ppl 27.22    valid | loss 4.771 | ppl 27.31
valid_EN | loss 7.853 | ppl 231.15    valid_EN | loss 7.853 | ppl 231.17
valid_IMG | loss 4.423 | ppl 21.45    valid_IMG | loss 4.427 | ppl 21.51
valid | loss 4.609 | ppl 24.41    valid | loss 4.61 | ppl 24.42
valid_EN | loss 7.347 | ppl 162.79    valid_EN | loss 7.354 | ppl 163.54
valid_IMG | loss 4.304 | ppl 19.75    valid_IMG | loss 4.304 | ppl 19.76
valid | loss 4.51 | ppl 22.79    valid | loss 4.511 | ppl 22.8
valid_EN | loss 6.933 | ppl 122.22    valid_EN | loss 6.937 | ppl 122.54
valid_IMG | loss 4.24 | ppl 18.9    valid_IMG | loss 4.24 | ppl 18.9
valid | loss 4.447 | ppl 21.81    valid | loss 4.443 | ppl 21.75
valid_EN | loss 6.584 | ppl 95.92    valid_EN | loss 6.59 | ppl 96.34
valid_IMG | loss 4.209 | ppl 18.49    valid_IMG | loss 4.204 | ppl 18.43
valid | loss 4.385 | ppl 20.89    valid | loss 4.386 | ppl 20.91
valid_EN | loss 6.289 | ppl 78.17    valid_EN | loss 6.291 | ppl 78.32
valid_IMG | loss 4.173 | ppl 18.03    valid_IMG | loss 4.174 | ppl 18.05
valid | loss 4.339 | ppl 20.24    valid | loss 4.338 | ppl 20.22
valid_EN | loss 6.063 | ppl 66.88    valid_EN | loss 6.075 | ppl 67.44
valid_IMG | loss 4.147 | ppl 17.72    valid_IMG | loss 4.144 | ppl 17.68
valid | loss 4.301 | ppl 19.71    valid | loss 4.295 | ppl 19.63
valid_EN | loss 5.856 | ppl 57.93    valid_EN | loss 5.864 | ppl 58.23
valid_IMG | loss 4.128 | ppl 17.48    valid_IMG | loss 4.12 | ppl 17.38
valid | loss 4.24 | ppl 18.89    valid | loss 4.248 | ppl 19
valid_EN | loss 5.686 | ppl 51.49    valid_EN | loss 5.682 | ppl 51.33
valid_IMG | loss 4.078 | ppl 16.89    valid_IMG | loss 4.089 | ppl 17.01
valid | loss 4.096 | ppl 17.1    valid | loss 4.096 | ppl 17.1
valid_EN | loss 5.51 | ppl 45.58    valid_EN | loss 5.517 | ppl 45.8
valid_IMG | loss 3.937 | ppl 15.32    valid_IMG | loss 3.938 | ppl 15.32
valid | loss 3.967 | ppl 15.64    valid | loss 3.972 | ppl 15.69
valid_EN | loss 5.367 | ppl 41.26    valid_EN | loss 5.369 | ppl 41.33
valid_IMG | loss 3.811 | ppl 14.04    valid_IMG | loss 3.816 | ppl 14.08
valid | loss 3.874 | ppl 14.67    valid | loss 3.888 | ppl 14.81
valid_EN | loss 5.22 | ppl 37.27    valid_EN | loss 5.219 | ppl 37.25
valid_IMG | loss 3.724 | ppl 13.21    valid_IMG | loss 3.74 | ppl 13.36
valid | loss 3.811 | ppl 14.04    valid | loss 3.814 | ppl 14.07
valid_EN | loss 5.109 | ppl 34.5    valid_EN | loss 5.11 | ppl 34.53
valid_IMG | loss 3.667 | ppl 12.7    valid_IMG | loss 3.67 | ppl 12.73
valid | loss 3.764 | ppl 13.58    valid | loss 3.783 | ppl 13.76
valid_EN | loss 4.999 | ppl 31.98    valid_EN | loss 5.007 | ppl 32.16
valid_IMG | loss 3.626 | ppl 12.34    valid_IMG | loss 3.646 | ppl 12.52
valid | loss 3.715 | ppl 13.13    valid | loss 3.729 | ppl 13.26
valid_EN | loss 4.91 | ppl 30.07    valid_EN | loss 4.905 | ppl 29.95
valid_IMG | loss 3.581 | ppl 11.97    valid_IMG | loss 3.598 | ppl 12.11
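
To read the side-by-side output numerically, the pasted rows can be parsed and the per-row gap computed. The following is only a rough sketch: it assumes the left column comes from `fulldata-gpu.txt` (GPU) and the right column from `tpu.loss.schedule.txt` (TPU), as the `paste` command suggests. The names `ROW`, `parse_row`, and `sample` are made up for illustration, and `sample` holds just the first three rows of the output.

```python
import re

# Matches one "name | loss X | ppl Y" group; each pasted row holds two of them.
ROW = re.compile(r"(valid\w*) \| loss ([\d.]+) \| ppl ([\d.]+)")


def parse_row(line):
    """Return (subset, gpu_loss, tpu_loss), or None for non-data lines."""
    matches = ROW.findall(line)
    if len(matches) != 2:
        return None
    (name, gpu_loss, _), (_, tpu_loss, _) = matches
    return name, float(gpu_loss), float(tpu_loss)


# First three rows of the pasted output (assumed order: GPU left, TPU right).
sample = """\
valid | loss 5.001 | ppl 32.02 valid | loss 5.003 | ppl 32.08
valid_EN | loss 8.41 | ppl 340.13 valid_EN | loss 8.411 | ppl 340.26
valid_IMG | loss 4.621 | ppl 24.61 valid_IMG | loss 4.624 | ppl 24.65
"""

rows = [r for r in map(parse_row, sample.splitlines()) if r]
for name, gpu, tpu in rows:
    print(f"{name:10s} gpu={gpu:.3f} tpu={tpu:.3f} diff={tpu - gpu:+.3f}")

# Largest absolute GPU/TPU loss gap over the parsed rows.
max_gap = max(abs(tpu - gpu) for _, gpu, tpu in rows)
print(f"max |diff| = {max_gap:.3f}")
```

Feeding the full output through the same loop would give the gap at every validation step; lines such as the `fairseq_cli.train` header are skipped because they do not contain two loss groups.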
on commit 1f8ccaa as before.
TPU command
python ../tpu_fairseq/train.py image-text-data-bin --encoder-normalize-before --decoder-normalize-before --arch mbart_base --layernorm-embedding --task multilingual_denoising --criterion cross_entropy --dataset-impl mmap --optimizer adam --adam-eps 1e-06 --adam-betas '(0.9, 0.98)' --lr-scheduler polynomial_decay --lr 1e-04 --min-lr -1 --warmup-updates 0 --total-num-update 500000 --dropout 0.0 --attention-dropout 0.0 --weight-decay 0.0 --max-tokens 4104 --seed 2 --log-format simple --log-interval 100 --add-lang-token --no-whole-word-mask-langs IMG --mask 0.35 --permute-sentences 1.0 --mask-length span-poisson --replace-length 1 --rotate 0.0 --max-source-positions 1026 --max-target-positions 1026 --tokens-per-sample 1026 --sample-break-mode complete --save-interval-updates 500 --skip-invalid-size-inputs-valid-test --langs EN,IMG --no-bos --no-input-eos --multilang-sampling-alpha 0.5 --tpu --max-sentences 4 --no-save --num-buckets 1 --distributed-world-size 8
GPU command
python /home/taylanbil/kkissmart-fairseq/tpu_fairseq/train.py ../image-text-data-bin --encoder-normalize-before --decoder-normalize-before --arch mbart_base --layernorm-embedding --task multilingual_denoising --criterion cross_entropy --dataset-impl mmap --optimizer adam --adam-eps 1e-06 --adam-betas '(0.9, 0.98)' --lr-scheduler polynomial_decay --lr 1e-04 --min-lr -1 --warmup-updates 0 --total-num-update 500000 --dropout 0.0 --attention-dropout 0.0 --weight-decay 0.0 --max-tokens 4104 --seed 2 --log-format simple --log-interval 100 --add-lang-token --no-whole-word-mask-langs IMG --mask 0.35 --permute-sentences 1.0 --mask-length span-poisson --replace-length 1 --rotate 0.0 --max-source-positions 1026 --max-target-positions 1026 --tokens-per-sample 1026 --sample-break-mode complete --save-interval-updates 500 --skip-invalid-size-inputs-valid-test --langs EN,IMG --no-bos --no-input-eos --multilang-sampling-alpha 0.5 --max-sentences 4 --no-save --fp16 --num-buckets 1
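
Because the two commands are long, a small flag diff can make the differences easier to spot. This is a minimal sketch, not part of the original runs: `parse_flags` and `diff_flags` are hypothetical helpers, and `tpu_cmd` / `gpu_cmd` are shortened excerpts of the commands above (script and data paths simplified), so paste in the full commands to compare everything.

```python
import shlex


def parse_flags(command):
    """Map each --flag to the list of values that follow it."""
    flags = {}
    current = None
    for token in shlex.split(command):
        if token.startswith("--"):
            current = token
            flags[current] = []
        elif current is not None:
            # Values such as "-1" start with a single dash, so they land here.
            flags[current].append(token)
    return flags


def diff_flags(cmd_a, cmd_b):
    """Return (flags only in a, flags only in b, flags whose values differ)."""
    a, b = parse_flags(cmd_a), parse_flags(cmd_b)
    only_a = {k: v for k, v in a.items() if k not in b}
    only_b = {k: v for k, v in b.items() if k not in a}
    changed = {k: (a[k], b[k]) for k in a if k in b and a[k] != b[k]}
    return only_a, only_b, changed


# Shortened excerpts of the two commands (assumption: paths simplified).
tpu_cmd = (
    "python train.py data --lr 1e-04 --min-lr -1 --adam-betas '(0.9, 0.98)' "
    "--tpu --max-sentences 4 --no-save --num-buckets 1 --distributed-world-size 8"
)
gpu_cmd = (
    "python train.py data --lr 1e-04 --min-lr -1 --adam-betas '(0.9, 0.98)' "
    "--max-sentences 4 --no-save --fp16 --num-buckets 1"
)

only_tpu, only_gpu, changed = diff_flags(tpu_cmd, gpu_cmd)
print("TPU only:", only_tpu)
print("GPU only:", only_gpu)
print("changed: ", changed)
```

On these excerpts the TPU side carries `--tpu` and `--distributed-world-size 8`, while the GPU side carries `--fp16`. Positional arguments before the first flag (the script and data paths) are ignored by this sketch.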