airoboros-m-7b-3.1.2.md

Trained on 10x A6000 GPUs on runpod.io.
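The accelerate config used for the multi-GPU launch isn't included in the gist; a minimal sketch of a plausible one-time setup, assuming the stock `accelerate config default` generator and bf16 (matching the `--bf16` flag below):

```bash
# Hypothetical one-time setup; the actual accelerate config for this run
# is not part of the gist.
pip install -U accelerate
accelerate config default --mixed_precision bf16
# Sanity-check that all ten A6000s are visible before launching.
nvidia-smi --query-gpu=index,name,memory.total --format=csv
```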

I ran many fine-tunes, including multiple full fine-tunes, fp16 LoRAs, and QLoRAs, and the QLoRA below actually did best in my testing.

dataset: https://hf.co/datasets/jondurbin/airoboros-3.1 (plus a few unpublished de-censoring instructions)
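The published portion of the data can be fetched from the Hub; a sketch, assuming a recent `huggingface_hub` CLI and the `$BASE_DIR` export from the command below. Because of the unpublished de-censoring instructions, the exact `conversations.json` used for the run can't be rebuilt from this alone:

```bash
# Fetch the public dataset files only; the training run also used
# unpublished instructions, so this is an approximation.
pip install -U huggingface_hub
huggingface-cli download jondurbin/airoboros-3.1 --repo-type dataset --local-dir $BASE_DIR/airoboros-3.1
```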

training script: https://github.com/jondurbin/qlora, specifically commit 8cd269bf9bd7753c92164934269019e12f23314f
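To pin the same code, something like the following; note that `mistralai/Mistral-7B-v0.1` as the source of the `$BASE_DIR/mistral-7b` weights is my assumption (the gist only shows the local path):

```bash
# Check out the exact training-script commit referenced above.
git clone https://github.com/jondurbin/qlora.git
(cd qlora && git checkout 8cd269bf9bd7753c92164934269019e12f23314f)
pip install -r qlora/requirements.txt  # assumes the repo ships a requirements file
# Assumption: the local base weights are Mistral-7B-v0.1.
huggingface-cli download mistralai/Mistral-7B-v0.1 --local-dir $BASE_DIR/mistral-7b
```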

```bash
export BASE_DIR=/workspace
export WANDB_API_KEY=[redacted]
export WANDB_PROJECT=airoboros-m-7b-3.1.2

accelerate launch qlora/train.py \
  --model_name_or_path $BASE_DIR/mistral-7b \
  --output_dir $BASE_DIR/$WANDB_PROJECT \
  --working_dir $BASE_DIR/$WANDB_PROJECT-checkpoints \
  --num_train_epochs 3 \
  --logging_steps 1 \
  --save_strategy steps \
  --save_steps 25 \
  --save_total_limit 3 \
  --data_seed 42 \
  --evaluation_strategy steps \
  --eval_dataset_size 0.01 \
  --eval_steps 25 \
  --max_new_tokens 4096 \
  --dataloader_num_workers 3 \
  --logging_strategy steps \
  --remove_unused_columns False \
  --do_train \
  --lora_r 16 \
  --lora_alpha 32 \
  --lora_modules all \
  --bf16 \
  --bits 4 \
  --lr_scheduler_type cosine \
  --dataset $BASE_DIR/conversations.json \
  --dataset_format airoboros_chat \
  --model_max_len 4096 \
  --per_device_train_batch_size 2 \
  --learning_rate 0.00022 \
  --warmup_ratio 0.015 \
  --adam_beta2 0.999 \
  --max_grad_norm 0.3 \
  --lora_dropout 0.03 \
  --weight_decay 0.0 \
  --seed 42 \
  --report_to wandb \
  --gradient_checkpointing True \
  --expand_conversations \
  --skip_excess_length False \
  --ddp_find_unused_parameters False \
  --trust_remote_code
```
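Given `--save_steps 25` and `--save_total_limit 3`, the run keeps a rolling window of three checkpoints under the working dir; a quick way to inspect what a run produced:

```bash
# The three most recent rolling checkpoints kept by --save_total_limit.
ls -dt $BASE_DIR/$WANDB_PROJECT-checkpoints/checkpoint-* | head -n 3
# The final adapter/output lands in --output_dir.
ls $BASE_DIR/$WANDB_PROJECT
```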