@emjotde
Last active June 8, 2019 12:30
WMT 2018 hyperparameters
#!/bin/bash -v
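# Workspace memory to preallocate, in MB (passed to Marian via -w)
WORKSPACE=19000
# Random seed for training
SEED=0
# Cluster-specific locations; the PHILLY_* variables are expected to be set by the job environment
HDFS=/hdfs/$PHILLY_VC/marcinjd
MARIAN=$HDFS/bins/marian
DATA_DIR=$HDFS/WMT.paracrawl
LOG_DIR=$PHILLY_LOG_DIR
MODEL_DIR=$PHILLY_MODEL_DIR
# Number of GPUs; expanded into the device list 0..GPUS-1 below
GPUS=$PHILLY_GPU_COUNT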
# train model
$MARIAN/marian \
--model $MODEL_DIR/model.npz --type transformer \
--train-sets $DATA_DIR/data/all.paracrawl.8M.bpe.en $DATA_DIR/data/all.paracrawl.8M.bpe.de \
--max-length 100 \
--vocabs $DATA_DIR/model/vocab.ende.yml $DATA_DIR/model/vocab.ende.yml \
--mini-batch-fit -w $WORKSPACE --mini-batch 1000 --maxi-batch 1000 \
--valid-freq 5000 --save-freq 5000 --disp-freq 500 \
--valid-metrics ce-mean-words perplexity translation \
--valid-sets $DATA_DIR/data/valid.bpe.en $DATA_DIR/data/valid.bpe.de \
--valid-script-path $DATA_DIR/scripts/validate.sh \
--valid-translation-output $MODEL_DIR/valid.bpe.en.output --quiet-translation \
--beam-size 6 --normalize=0.6 \
--valid-mini-batch 16 \
--overwrite --keep-best \
--early-stopping 5 --cost-type=ce-mean-words \
--log $LOG_DIR/logrank.0.log --valid-log $LOG_DIR/valid.log \
--enc-depth 6 --dec-depth 6 \
--transformer-preprocess n --transformer-postprocess da \
--tied-embeddings-all --dim-emb 1024 --transformer-dim-ffn 4096 \
--transformer-dropout 0.1 --transformer-dropout-attention 0.1 --transformer-dropout-ffn 0.1 --label-smoothing 0.1 \
--learn-rate 0.0001 --lr-warmup 8000 --lr-decay-inv-sqrt 8000 --lr-report \
--optimizer-params 0.9 0.98 1e-09 --clip-norm 5 \
--devices $(seq 0 $(($GPUS - 1))) --sync-sgd --seed $SEED \
--exponential-smoothing
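
The learning-rate flags above (`--learn-rate 0.0001 --lr-warmup 8000 --lr-decay-inv-sqrt 8000`) interact, so a small sketch of the resulting schedule may help. It assumes a linear warm-up from 0 to the base rate over the first 8000 updates, followed by decay proportional to the inverse square root of the update count; that is my reading of the flags rather than something stated in the script. The helper `lr_at` is hypothetical and not part of Marian.

```shell
#!/bin/bash
# Sketch of the learning-rate schedule implied by
#   --learn-rate 0.0001 --lr-warmup 8000 --lr-decay-inv-sqrt 8000
# Assumption: effective rate = base * min(1, step/warmup) * min(1, sqrt(start/step)).
# lr_at is a hypothetical helper for illustration; it expects step >= 1.
lr_at() {
  awk -v step="$1" -v base=0.0001 -v warmup=8000 -v start=8000 'BEGIN {
    w = step / warmup;       if (w > 1) w = 1   # linear warm-up factor
    d = sqrt(start / step);  if (d > 1) d = 1   # inverse-sqrt decay factor
    printf "%.3e\n", base * w * d
  }'
}

# Print the effective rate at a few update counts
for step in 1000 4000 8000 32000 128000; do
  printf "%7d  %s\n" "$step" "$(lr_at "$step")"
done
```

Under these assumptions the rate peaks at the base value of 1e-4 at update 8000, then falls to half of that by update 32000 and to a quarter by update 128000.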