Last active: January 11, 2021 07:51
Comparison of deepspeed and gluon-nlp in SQuAD1.1
Dataset: SQuAD1.1
GPUs: p3.8xlarge with four Tesla V100s
Batch size: 3
Epochs: 2
Max_seq_len: 384
Model: google_en_uncased_bert_wwm_large

Results:
deepspeed: 93.11/87.03, time cost: 0.71 hours
  log: https://gluon-nlp-log.s3.amazonaws.com/squad_training_log/2021.1.11/deepspeed/deepspeed_finetune.log
gluon-nlp: 93.07/87.52, time cost: 1.43 hours
  log: https://gluon-nlp-log.s3.amazonaws.com/squad_training_log/2021.1.11/gluon-nlp/finetune_squad1.1.log
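For reference, the wall-clock numbers above work out to roughly a 2x speedup for deepspeed. A quick check (the hours are taken directly from the runs above):

```python
# Training times reported in the results above, in hours.
deepspeed_hours = 0.71
gluon_nlp_hours = 1.43

# Relative speedup of deepspeed over gluon-nlp (~2.01x).
speedup = gluon_nlp_hours / deepspeed_hours
print(f"deepspeed speedup: {speedup:.2f}x")
```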
Commands:

1. deepspeed
source ~/env/nlp/bin/activate | |
git clone --recursive https://github.com/sxjscience/DeepSpeedExamples.git | |
cd DeepSpeedExamples/BingBertSquad | |
nlp_data prepare_squad --version 1.1 --save-path squad | |
cd ckpt | |
wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-pytorch_model.bin -O bert-large-uncased-whole-word-masking-pytorch_model.bin | |
wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-config.json -O bert-large-uncased-whole-word-masking-config.json | |
cd .. | |
bash run_squad_hf_deepspeed.sh 4 ckpt/bert-large-uncased-whole-word-masking-pytorch_model.bin squad squad_out ckpt/bert-large-uncased-whole-word-masking-config.json | |
2. gluon-nlp
source ~/env/nlp/bin/activate | |
cd gluon-nlp/scripts/question_answering/commands/ | |
# If "run_squad2_uncased_bert_wwm_large.sh" does not exist, modify
# "generate_commands.py": define a "uncased_bert_wwm_large_cfg()" with:
#     cfg = uncased_bert_base_cfg()
#     cfg.model_name = 'google_en_uncased_bert_wwm_large'
#     cfg.batch_size = 3
#     cfg.epochs = 2
#     cfg.max_seq_length = 384
# then add "uncased_bert_wwm_large_cfg" to the config list in the main function.
# After saving, generate the command:
python generate_commands.py | |
# run the generated script
bash run_squad2_uncased_bert_wwm_large.sh 1 1.1 float16
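The edit to generate_commands.py described in the comments above might look like the following sketch. Only the overrides (model name, batch size, epochs, max sequence length) come from the steps above; the base-config function here is a placeholder stand-in, since the real `uncased_bert_base_cfg()` lives in gluon-nlp:

```python
from types import SimpleNamespace

def uncased_bert_base_cfg():
    # Placeholder for the existing base config in generate_commands.py;
    # the real function in gluon-nlp fills in many more fields.
    return SimpleNamespace()

def uncased_bert_wwm_large_cfg():
    # New config described in the comments above: start from the base
    # config and set the wwm-large hyperparameters used in this comparison.
    cfg = uncased_bert_base_cfg()
    cfg.model_name = 'google_en_uncased_bert_wwm_large'
    cfg.batch_size = 3
    cfg.epochs = 2
    cfg.max_seq_length = 384
    return cfg
```

After adding `uncased_bert_wwm_large_cfg` to the list in the main function, running `python generate_commands.py` emits the `run_squad2_uncased_bert_wwm_large.sh` script used above.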