@taylanbil
Created July 23, 2020 20:25
Validation loss schedule comparison, mBART, full data (left column: GPU log, right column: TPU log)
taylanbil@dlrm-gpu-8:~/kkissmart-fairseq/mbart$ paste <( grep valid fulldata-gpu.txt | grep loss | cut -d '|' -f3,6,7) tpu.loss.schedule.txt
fairseq_cli.train fairseq_cli.train
valid | loss 5.001 | ppl 32.02 valid | loss 5.003 | ppl 32.08
valid_EN | loss 8.41 | ppl 340.13 valid_EN | loss 8.411 | ppl 340.26
valid_IMG | loss 4.621 | ppl 24.61 valid_IMG | loss 4.624 | ppl 24.65
valid | loss 4.767 | ppl 27.22 valid | loss 4.771 | ppl 27.31
valid_EN | loss 7.853 | ppl 231.15 valid_EN | loss 7.853 | ppl 231.17
valid_IMG | loss 4.423 | ppl 21.45 valid_IMG | loss 4.427 | ppl 21.51
valid | loss 4.609 | ppl 24.41 valid | loss 4.61 | ppl 24.42
valid_EN | loss 7.347 | ppl 162.79 valid_EN | loss 7.354 | ppl 163.54
valid_IMG | loss 4.304 | ppl 19.75 valid_IMG | loss 4.304 | ppl 19.76
valid | loss 4.51 | ppl 22.79 valid | loss 4.511 | ppl 22.8
valid_EN | loss 6.933 | ppl 122.22 valid_EN | loss 6.937 | ppl 122.54
valid_IMG | loss 4.24 | ppl 18.9 valid_IMG | loss 4.24 | ppl 18.9
valid | loss 4.447 | ppl 21.81 valid | loss 4.443 | ppl 21.75
valid_EN | loss 6.584 | ppl 95.92 valid_EN | loss 6.59 | ppl 96.34
valid_IMG | loss 4.209 | ppl 18.49 valid_IMG | loss 4.204 | ppl 18.43
valid | loss 4.385 | ppl 20.89 valid | loss 4.386 | ppl 20.91
valid_EN | loss 6.289 | ppl 78.17 valid_EN | loss 6.291 | ppl 78.32
valid_IMG | loss 4.173 | ppl 18.03 valid_IMG | loss 4.174 | ppl 18.05
valid | loss 4.339 | ppl 20.24 valid | loss 4.338 | ppl 20.22
valid_EN | loss 6.063 | ppl 66.88 valid_EN | loss 6.075 | ppl 67.44
valid_IMG | loss 4.147 | ppl 17.72 valid_IMG | loss 4.144 | ppl 17.68
valid | loss 4.301 | ppl 19.71 valid | loss 4.295 | ppl 19.63
valid_EN | loss 5.856 | ppl 57.93 valid_EN | loss 5.864 | ppl 58.23
valid_IMG | loss 4.128 | ppl 17.48 valid_IMG | loss 4.12 | ppl 17.38
valid | loss 4.24 | ppl 18.89 valid | loss 4.248 | ppl 19
valid_EN | loss 5.686 | ppl 51.49 valid_EN | loss 5.682 | ppl 51.33
valid_IMG | loss 4.078 | ppl 16.89 valid_IMG | loss 4.089 | ppl 17.01
valid | loss 4.096 | ppl 17.1 valid | loss 4.096 | ppl 17.1
valid_EN | loss 5.51 | ppl 45.58 valid_EN | loss 5.517 | ppl 45.8
valid_IMG | loss 3.937 | ppl 15.32 valid_IMG | loss 3.938 | ppl 15.32
valid | loss 3.967 | ppl 15.64 valid | loss 3.972 | ppl 15.69
valid_EN | loss 5.367 | ppl 41.26 valid_EN | loss 5.369 | ppl 41.33
valid_IMG | loss 3.811 | ppl 14.04 valid_IMG | loss 3.816 | ppl 14.08
valid | loss 3.874 | ppl 14.67 valid | loss 3.888 | ppl 14.81
valid_EN | loss 5.22 | ppl 37.27 valid_EN | loss 5.219 | ppl 37.25
valid_IMG | loss 3.724 | ppl 13.21 valid_IMG | loss 3.74 | ppl 13.36
valid | loss 3.811 | ppl 14.04 valid | loss 3.814 | ppl 14.07
valid_EN | loss 5.109 | ppl 34.5 valid_EN | loss 5.11 | ppl 34.53
valid_IMG | loss 3.667 | ppl 12.7 valid_IMG | loss 3.67 | ppl 12.73
valid | loss 3.764 | ppl 13.58 valid | loss 3.783 | ppl 13.76
valid_EN | loss 4.999 | ppl 31.98 valid_EN | loss 5.007 | ppl 32.16
valid_IMG | loss 3.626 | ppl 12.34 valid_IMG | loss 3.646 | ppl 12.52
valid | loss 3.715 | ppl 13.13 valid | loss 3.729 | ppl 13.26
valid_EN | loss 4.91 | ppl 30.07 valid_EN | loss 4.905 | ppl 29.95
valid_IMG | loss 3.581 | ppl 11.97 valid_IMG | loss 3.598 | ppl 12.11
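
The pasted rows above put the GPU log on the left and the TPU log on the right, so the per-checkpoint gap can be read off mechanically. Below is a minimal sketch of doing that, assuming each row has exactly the `name | loss X | ppl Y` shape shown, repeated twice. The names `ROWS`, `parse_row`, and `compare` are hypothetical helpers, not part of fairseq, and only a few rows are copied in for brevity. The logged values also look consistent with `ppl = 2**loss` up to rounding, which the last loop checks.

```python
import re

# A few rows copied verbatim from the comparison above (GPU first, TPU second).
ROWS = """\
valid | loss 5.001 | ppl 32.02 valid | loss 5.003 | ppl 32.08
valid_EN | loss 8.41 | ppl 340.13 valid_EN | loss 8.411 | ppl 340.26
valid_IMG | loss 4.621 | ppl 24.61 valid_IMG | loss 4.624 | ppl 24.65
valid | loss 3.715 | ppl 13.13 valid | loss 3.729 | ppl 13.26
"""

# One "subset | loss X | ppl Y" group; each pasted row contains two of them.
PAIR = re.compile(r"(valid\w*) \| loss ([\d.]+) \| ppl ([\d.]+)")


def parse_row(line):
    """Return (subset, gpu_loss, tpu_loss, gpu_ppl, tpu_ppl) for one pasted row."""
    matches = PAIR.findall(line)
    if len(matches) != 2 or matches[0][0] != matches[1][0]:
        raise ValueError(f"unexpected row shape: {line!r}")
    (name, gpu_loss, gpu_ppl), (_, tpu_loss, tpu_ppl) = matches
    return name, float(gpu_loss), float(tpu_loss), float(gpu_ppl), float(tpu_ppl)


def compare(text):
    """Return (subset, gpu_loss, tpu_loss, tpu_minus_gpu) for every non-empty row."""
    result = []
    for line in text.splitlines():
        if not line.strip():
            continue
        name, gpu_loss, tpu_loss, _, _ = parse_row(line)
        result.append((name, gpu_loss, tpu_loss, round(tpu_loss - gpu_loss, 3)))
    return result


deltas = compare(ROWS)
for name, gpu_loss, tpu_loss, delta in deltas:
    print(f"{name:10s} gpu={gpu_loss:.3f} tpu={tpu_loss:.3f} delta={delta:+.3f}")

# Largest absolute TPU-vs-GPU gap among the rows copied in above.
max_gap = max(abs(d[3]) for d in deltas)
print("max |delta| =", max_gap)

# Sanity check: reported perplexity versus 2**loss, as a relative error.
for line in ROWS.splitlines():
    _, gpu_loss, tpu_loss, gpu_ppl, tpu_ppl = parse_row(line)
    for loss, ppl in ((gpu_loss, gpu_ppl), (tpu_loss, tpu_ppl)):
        assert abs(2 ** loss - ppl) / ppl < 1e-2
```

To run it over the full schedule, replace `ROWS` with the complete `paste` output minus the `fairseq_cli.train` header line.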
@taylanbil

Run on commit 1f8ccaa, as before.

TPU command

python ../tpu_fairseq/train.py image-text-data-bin --encoder-normalize-before --decoder-normalize-before --arch mbart_base --layernorm-embedding --task multilingual_denoising --criterion cross_entropy --dataset-impl mmap --optimizer adam --adam-eps 1e-06 --adam-betas '(0.9, 0.98)' --lr-scheduler polynomial_decay --lr 1e-04 --min-lr -1 --warmup-updates 0 --total-num-update 500000 --dropout 0.0 --attention-dropout 0.0 --weight-decay 0.0 --max-tokens 4104 --seed 2 --log-format simple --log-interval 100 --add-lang-token --no-whole-word-mask-langs IMG --mask 0.35 --permute-sentences 1.0 --mask-length span-poisson --replace-length 1 --rotate 0.0 --max-source-positions 1026 --max-target-positions 1026 --tokens-per-sample 1026 --sample-break-mode complete --save-interval-updates 500 --skip-invalid-size-inputs-valid-test --langs EN,IMG --no-bos --no-input-eos --multilang-sampling-alpha 0.5 --tpu --max-sentences 4 --no-save --num-buckets 1 --distributed-world-size 8

GPU command

python /home/taylanbil/kkissmart-fairseq/tpu_fairseq/train.py ../image-text-data-bin --encoder-normalize-before --decoder-normalize-before --arch mbart_base --layernorm-embedding --task multilingual_denoising --criterion cross_entropy --dataset-impl mmap --optimizer adam --adam-eps 1e-06 --adam-betas '(0.9, 0.98)' --lr-scheduler polynomial_decay --lr 1e-04 --min-lr -1 --warmup-updates 0 --total-num-update 500000 --dropout 0.0 --attention-dropout 0.0 --weight-decay 0.0 --max-tokens 4104 --seed 2 --log-format simple --log-interval 100 --add-lang-token --no-whole-word-mask-langs IMG --mask 0.35 --permute-sentences 1.0 --mask-length span-poisson --replace-length 1 --rotate 0.0 --max-source-positions 1026 --max-target-positions 1026 --tokens-per-sample 1026 --sample-break-mode complete --save-interval-updates 500 --skip-invalid-size-inputs-valid-test --langs EN,IMG --no-bos --no-input-eos --multilang-sampling-alpha 0.5 --max-sentences 4 --no-save --fp16 --num-buckets 1
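
The two commands are long enough that the differing flags are easy to miss. Here is a small sketch for diffing them, assuming every option starts with `--` and any tokens that follow belong to it as values. `parse_flags` and `diff_flags` are hypothetical helpers, and the two strings are abbreviated excerpts of the commands above, not the full invocations.

```python
import shlex


def parse_flags(command):
    """Map each --flag to its space-joined values ('' for boolean switches)."""
    flags = {}
    key = None
    for token in shlex.split(command):
        if token.startswith("--"):
            key = token
            flags[key] = []
        elif key is not None:
            # Values such as "-1" or "(0.9, 0.98)" attach to the preceding flag.
            flags[key].append(token)
        # Tokens before the first flag (interpreter, script, data dir) are skipped.
    return {k: " ".join(v) for k, v in flags.items()}


def diff_flags(cmd_a, cmd_b):
    """Return (flags only in a, flags only in b, shared flags with different values)."""
    a, b = parse_flags(cmd_a), parse_flags(cmd_b)
    only_a = {k: a[k] for k in a.keys() - b.keys()}
    only_b = {k: b[k] for k in b.keys() - a.keys()}
    changed = {k: (a[k], b[k]) for k in a.keys() & b.keys() if a[k] != b[k]}
    return only_a, only_b, changed


# Abbreviated excerpts of the TPU and GPU commands above.
TPU_CMD = (
    "python ../tpu_fairseq/train.py image-text-data-bin --arch mbart_base "
    "--adam-betas '(0.9, 0.98)' --lr 1e-04 --min-lr -1 --tpu --max-sentences 4 "
    "--no-save --num-buckets 1 --distributed-world-size 8"
)
GPU_CMD = (
    "python /home/taylanbil/kkissmart-fairseq/tpu_fairseq/train.py ../image-text-data-bin "
    "--arch mbart_base --adam-betas '(0.9, 0.98)' --lr 1e-04 --min-lr -1 "
    "--max-sentences 4 --no-save --fp16 --num-buckets 1"
)

only_tpu, only_gpu, changed = diff_flags(TPU_CMD, GPU_CMD)
print("TPU only:", only_tpu)
print("GPU only:", only_gpu)
print("changed :", changed)
```

On these excerpts the differences are `--tpu` and `--distributed-world-size 8` on the TPU side and `--fp16` on the GPU side. The script and data paths also differ but are ignored here because they are positional.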
