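for file in */*bleu.json  # glob directly; the stray `ls` was being iterated as a literal word
do
  echo "$file:"
  sed -n '/^\s*$/!{p;q}' "$file"  # print the first non-blank line, then quit (\s needs GNU sed)
  echo "------"
done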
12_3_no_teacher 22.65
12_3/ 23.0
12_3_bru: 23.56
12_4_no_teacher 20.72
12_4_bru_ld 24.416
12_4_teacher 25.21 (menro_12_4_bru)
12_6_no_teacher_halfgen 23.79
12_6_no_teacher_leps 24.95
12_6_v2/ 25.4267 **UPLOADED** as `sshleifer/distill-mbart-12-6`
~12_9_no_teacher/ 17.56
12_9 26.03 (menro_12_9_bru_v2) (menro_12_9_bru was close)
teacher: 26.457
finetune from cc25: 25.8 - 26.1
mbart-large-enro: 26.46 before fix, val
after fix: 26.80
batch_parity: 26.42 (delta of 0.4 vs pradhy could be the force_bos_token fix OR the batch parity fix)
with decoder_start_token_id = eos_token_id: 21.23
All numbers below are from before the fix:
- cc25_dynb_cont: 25.4 15h
- pradhy: 26.01
- bru_pl85_long_master/ 25.98
- bru_baseline_pl81: 25.99
- bru_pl85_long/test_bleu.json 26.27
- bru_pl85_layerdrop/test_bleu.json 26.34
Test:
- Generate test with beam=5: BLEU = 26.83 57.1/33.2/20.7/13.2 (BP = 1.000 ratio = 1.007 hyp_len = 49294 ref_len = 48945)
Valid:
- Generate valid with beam=5: BLEU = 28.18 58.7/34.7/22.0/14.3 (BP = 0.995 ratio = 0.995 hyp_len = 51041 ref_len = 51300)
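
As a sanity check on the two lines above, the `BP` and `ratio` fields can be recomputed from `hyp_len` and `ref_len`. This is a minimal sketch that assumes the standard BLEU brevity penalty (1 when the hypothesis is at least as long as the reference, otherwise `exp(1 - ref_len / hyp_len)`); the helper name `bleu_bp` is hypothetical and not part of any tool used here.

```shell
# Hypothetical helper: recompute BP and length ratio from corpus lengths.
# Assumes the standard BLEU brevity penalty definition.
bleu_bp() {
  # $1 = hyp_len, $2 = ref_len
  awk -v h="$1" -v r="$2" 'BEGIN {
    ratio = h / r
    # No penalty when the hypothesis is at least as long as the reference.
    bp = (h >= r) ? 1 : exp(1 - r / h)
    printf "BP = %.3f ratio = %.3f\n", bp, ratio
  }'
}

bleu_bp 49294 48945   # test set lengths
bleu_bp 51041 51300   # valid set lengths
```

Both calls should reproduce the `BP` and `ratio` values logged above, which suggests the valid-set hypotheses are only slightly shorter than the references and the penalty has little effect on the score.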