Trainium vs V100
Benchmark commands for Hugging Face Transformers example scripts. In each section, the first command is the single-GPU (V100) run with torch.compile and fp16 enabled; the second is the 32-worker torchrun launch (Trainium).
LANGUAGE PRETRAINING
python run_clm.py \
--model_name_or_path gpt2 \
--dataset_name wikitext \
--dataset_config_name wikitext-103-raw-v1 \
--num_train_epochs 10 \
--per_device_train_batch_size 8 \
--per_device_eval_batch_size 8 \
--do_train \
--do_eval \
--output_dir /tmp/test-clm \
--torch_compile True --torch_compile_mode default --fp16 True --save_strategy no
torchrun --nproc_per_node 32 run_clm.py \
--model_name_or_path gpt2 \
--dataset_name wikitext \
--dataset_config_name wikitext-103-raw-v1 \
--num_train_epochs 10 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 8 \
--do_train \
--do_eval \
--output_dir /tmp/test-clm \
--overwrite_output_dir --save_strategy no
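Note that the two pretraining runs differ in effective global batch size, since each optimizer step consumes per-device batch size times the number of workers. A minimal sketch of that arithmetic (plain Python; the device counts — 1 process for the single-GPU run, 32 workers for the torchrun launch — are taken from the commands above):

```python
# Effective global batch size = per-device batch size * number of workers.
# Batch sizes come from the commands above; worker counts are 1 (single
# process) and 32 (torchrun --nproc_per_node 32).
def global_batch(per_device: int, num_workers: int) -> int:
    return per_device * num_workers

v100_batch = global_batch(8, 1)       # single V100 process, batch 8
trainium_batch = global_batch(1, 32)  # 32 workers, batch 1 each

print(v100_batch, trainium_batch)
```

So the V100 run steps on 8 examples at a time while the distributed run steps on 32, which matters when comparing throughput or loss curves per step.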
TOKEN CLASSIFICATION
python run_ner.py \
--model_name_or_path bert-large-uncased \
--dataset_name conll2003 \
--output_dir /tmp/test-ner \
--do_train \
--do_eval \
--num_train_epochs 10 \
--max_seq_length 512 \
--torch_compile True --torch_compile_mode default --fp16 True \
--overwrite_output_dir --save_strategy no
torchrun --nproc_per_node 32 run_ner.py \
--model_name_or_path bert-large-uncased \
--dataset_name conll2003 \
--output_dir /tmp/test-ner \
--do_train \
--do_eval \
--max_seq_length 512 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 8 \
--overwrite_output_dir --save_strategy no
IMAGE CLASSIFICATION
python run_image_classification.py \
--dataset_name food101 \
--output_dir ./food101_outputs/ \
--do_train \
--do_eval \
--learning_rate 2e-5 \
--num_train_epochs 10 \
--per_device_train_batch_size 192 \
--per_device_eval_batch_size 64 \
--remove_unused_columns False \
--torch_compile True --torch_compile_mode default --fp16 True \
--overwrite_output_dir --save_strategy no
torchrun --nproc_per_node 32 run_image_classification.py \
--dataset_name food101 \
--output_dir ./food101_outputs/ \
--do_train \
--do_eval \
--learning_rate 2e-5 \
--num_train_epochs 10 \
--per_device_train_batch_size 16 \
--per_device_eval_batch_size 32 \
--remove_unused_columns False \
--overwrite_output_dir --save_strategy no
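As a rough sanity check on run length, steps per epoch can be estimated from the batch sizes in the commands above. This sketch assumes food101's standard training split of 75,750 images (verify against your copy of the dataset):

```python
import math

TRAIN_IMAGES = 75_750  # food101 standard training split (assumption; check your copy)

def steps_per_epoch(dataset_size: int, per_device_batch: int, num_workers: int) -> int:
    # Each optimizer step consumes per_device_batch * num_workers examples,
    # so an epoch needs ceil(dataset_size / global_batch) steps.
    return math.ceil(dataset_size / (per_device_batch * num_workers))

print(steps_per_epoch(TRAIN_IMAGES, 192, 1))   # single-GPU run, batch 192
print(steps_per_epoch(TRAIN_IMAGES, 16, 32))   # 32-worker torchrun launch, batch 16 each
```

The distributed launch has a larger global batch (16 x 32 = 512 vs 192), so it takes fewer optimizer steps per epoch.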