# How to run BIG model
# First I tried this, but it failed with an out-of-memory (OOM) error
mkdir -p ./data/v1_bigtest
mkdir -p ./train/v1_bigtest
t2t-trainer \
--t2t_usr_dir=./data_generators \
--generate_data \
--tmp_dir=/e/challenger_nmt/t2t_temp_dir \
--data_dir=./data/v1_bigtest \
--problems=challenger_enzh_v1 \
--model=transformer \
--hparams_set=transformer_big_single_gpu \
--output_dir=./train/v1_bigtest
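
# A hedged aside (not part of the original runs): when diagnosing OOM, it
# can help to check how much GPU memory is free before retrying. This
# assumes an NVIDIA GPU with the nvidia-smi tool from the driver installed;
# the command prints once and exits.
nvidia-smi --query-gpu=memory.used,memory.total --format=csv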
# Then, since the data had already been generated by the first run, there
# was no need to do it again: I removed "--generate_data" and set the
# batch size explicitly.
mkdir -p ./data/v1_bigtest
mkdir -p ./train/v1_bigtest
t2t-trainer \
--t2t_usr_dir=./data_generators \
--tmp_dir=/e/challenger_nmt/t2t_temp_dir \
--data_dir=./data/v1_bigtest \
--problems=challenger_enzh_v1 \
--model=transformer \
--hparams_set=transformer_big_single_gpu \
--output_dir=./train/v1_bigtest \
--hparams='batch_size=2048'  # <-- added this
# Note: my GPU was able to handle batch_size=2048. Training will be faster
# with 2048 than with 1024, so could you please try these settings?
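
# A hedged sketch, not from the original gist: --hparams accepts multiple
# comma-separated name=value overrides, so if batch_size=2048 still OOMs on
# your GPU you can combine a smaller batch with a shorter maximum sequence
# length in the same flag. The max_length hparam is an assumption about
# your tensor2tensor version; verify the name before relying on it.
t2t-trainer \
--t2t_usr_dir=./data_generators \
--tmp_dir=/e/challenger_nmt/t2t_temp_dir \
--data_dir=./data/v1_bigtest \
--problems=challenger_enzh_v1 \
--model=transformer \
--hparams_set=transformer_big_single_gpu \
--output_dir=./train/v1_bigtest \
--hparams='batch_size=1024,max_length=256'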