@sshleifer
Last active September 2, 2020 15:13
Experiment results from distilling mbart-large-en-ro and finetuning mbart-large-cc25.
# Print each */*bleu.json path followed by its first non-empty line (the score).
for file in */*bleu.json
do
   echo "$file:"
   sed -n '/^\s*$/!{p;q}' "$file"
   echo "------"
done
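
A rough Python equivalent of the loop above, in case the scores need to be collected programmatically. It assumes each `*bleu.json` file is a small JSON dict with a `"bleu"` key; that schema is an assumption, not something shown in this gist.

```python
import glob
import json

# Gather the BLEU entry (or the raw contents) from every */*bleu.json.
for path in sorted(glob.glob("*/*bleu.json")):
    with open(path) as f:
        scores = json.load(f)
    # "bleu" key is an assumed schema; fall back to printing whatever the file holds.
    score = scores.get("bleu", scores) if isinstance(scores, dict) else scores
    print(f"{path}: {score}")
```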

enro test BLEU (distil-mbart unless otherwise specified; all scores are before post-processing).

12_3_no_teacher: 22.65
12_3: 23.0
12_3_bru: 23.56

12_4_no_teacher: 20.72
12_4_bru_ld: 24.416
12_4_teacher: 25.21 (menro_12_4_bru)

12_6_no_teacher_halfgen: 23.79
12_6_no_teacher_leps: 24.95
12_6_v2: 25.4267  **UPLOADED** as `sshleifer/distill-mbart-12-6` (see the loading sketch after this list)

~12_9_no_teacher: 17.56
12_9: 26.03 (menro_12_9_bru_v2; menro_12_9_bru was close)

teacher: 26.457

finetune from cc25: 25.8 - 26.1
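
A minimal sketch of loading the uploaded student with the `transformers` Auto classes. The input sentence and beam size are illustrative only; they are not the settings used to produce the scores above.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "sshleifer/distill-mbart-12-6"  # the 12_6_v2 student uploaded above
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

# Illustrative en->ro translation; the beam size here is an arbitrary choice.
batch = tokenizer(["UN Chief Says There Is No Military Solution in Syria"], return_tensors="pt")
generated = model.generate(**batch, num_beams=5)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```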


CC25 FT (finetuning mbart-large-cc25 on en-ro)

mbart-large-enro, val BLEU:

  • before fix: 26.46
  • after fix: 26.80
  • batch_parity: 26.42 (the 0.4 delta vs pradhy could be the force_bos_token fix OR the batch-parity fix)
  • with decoder_start_token_id = eos_token_id: 21.23 (see the sketch below)
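
A minimal sketch of the `decoder_start_token_id` comparison, using the public `facebook/mbart-large-en-ro` checkpoint. The exact evaluation script behind the numbers above is not part of this gist, so treat the snippet as illustrative; for this checkpoint the config's default decoder start token is the `ro_RO` language code.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "facebook/mbart-large-en-ro"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

batch = tokenizer(["UN Chief Says There Is No Military Solution in Syria"], return_tensors="pt")

# Default: decoding starts from the config's decoder_start_token_id (the ro_RO language code).
good = model.generate(**batch, num_beams=5)

# Forcing decoding to start from </s> instead corresponds to the 21.23 BLEU row noted above.
bad = model.generate(**batch, num_beams=5, decoder_start_token_id=tokenizer.eos_token_id)

print(tokenizer.batch_decode(good, skip_special_tokens=True))
print(tokenizer.batch_decode(bad, skip_special_tokens=True))
```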

(all numbers below are before the fix):

  • cc25_dynb_cont: 25.4 (15h)
  • pradhy: 26.01
  • bru_pl85_long_master: 25.98
  • bru_baseline_pl81: 25.99
  • bru_pl85_long/test_bleu.json: 26.27
  • bru_pl85_layerdrop/test_bleu.json: 26.34

fairseq released evals

Test:

  • Generate test with beam=5: BLEU = 26.83 57.1/33.2/20.7/13.2 (BP = 1.000 ratio = 1.007 hyp_len = 49294 ref_len = 48945)

Valid:

  • Generate valid with beam=5: BLEU = 28.18 58.7/34.7/22.0/14.3 (BP = 0.995 ratio = 0.995 hyp_len = 51041 ref_len = 51300)
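
For comparison against these released numbers, BLEU on new generations can be recomputed with `sacrebleu`; the file paths below are placeholders, and this does not reproduce fairseq's exact tokenization or the en-ro post-processing.

```python
import sacrebleu

# Placeholder paths: one detokenized hypothesis / reference sentence per line.
with open("test_generations.txt") as f:
    hyps = [line.strip() for line in f]
with open("test.target") as f:
    refs = [line.strip() for line in f]

print(round(sacrebleu.corpus_bleu(hyps, [refs]).score, 2))
```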