jondurbin/airoboros-33b-gpt4-2.0-tuning.md

## airoboros-33b-gpt4-2.0-tuning.md

      
### Overview

This was a QLoRA fine-tune of llama-30b-hf on the airoboros-gpt4-2.0 dataset: https://huggingface.co/datasets/jondurbin/airoboros-gpt4-2.0
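
Before training, it can help to sanity-check the dataset file. The sketch below assumes the dataset has been downloaded as a JSON Lines file (the fine-tune command reads `/data/instructions.jsonl`) and that every line is a JSON object. The helper name `summarize_jsonl` is hypothetical, and it makes no assumption about which fields the airoboros format uses; it only reports what it finds.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_jsonl(path):
    """Return (record_count, Counter of top-level keys) for a JSON Lines file.

    Hypothetical helper: assumes one JSON object per non-blank line.
    """
    key_counts = Counter()
    records = 0
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            records += 1
            key_counts.update(record.keys())
    return records, key_counts


# Example usage (path taken from the fine-tune command):
# count, keys = summarize_jsonl("/data/instructions.jsonl")
# print(count, dict(keys))
```
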

### QLoRA fork

I used my fork of qlora, which adds support for the airoboros dataset format, an updated prompt format, etc.: https://github.com/jondurbin/qlora

### Base model

I used https://hf.co/decapoda-research/llama-30b-hf
I then replaced tokenizer_config.json and special_tokens_map.json in the base model with the versions found in my qlora fork: https://github.com/jondurbin/qlora
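
The file replacement can be scripted. This is a minimal sketch, not the exact procedure used here: the function name `replace_tokenizer_files`, the `.orig` backup suffix, and the `fork_dir` argument (whichever directory of the fork checkout holds the two files) are assumptions for illustration. Only the two file names and the `/data/llama-30b-hf` path come from this write-up.

```python
import shutil
from pathlib import Path

# The two files named in the text above.
TOKENIZER_FILES = ("tokenizer_config.json", "special_tokens_map.json")


def replace_tokenizer_files(fork_dir, model_dir, backup=True):
    """Copy the tokenizer files from fork_dir over those in model_dir.

    Hypothetical helper: keeps a '<name>.orig' copy of each overwritten
    file when backup is True. Returns the list of written paths.
    """
    fork_dir, model_dir = Path(fork_dir), Path(model_dir)
    copied = []
    for name in TOKENIZER_FILES:
        source = fork_dir / name
        target = model_dir / name
        if not source.is_file():
            raise FileNotFoundError(f"{source} not found")
        if backup and target.exists():
            # Preserve the base model's original file next to the new one.
            shutil.copy2(target, target.with_name(target.name + ".orig"))
        shutil.copy2(source, target)
        copied.append(target)
    return copied


# Example usage (the first path is a placeholder for the fork checkout):
# replace_tokenizer_files("/path/to/qlora", "/data/llama-30b-hf")
```
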

### Hardware

This was done on a single 40GB A100.
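
As a rough back-of-the-envelope check on why this fits on one 40GB card, the sketch below estimates the memory taken by the 4-bit quantized base weights alone. It assumes about 32.5 billion parameters for this LLaMA size, a quantization block size of 64, and a second-level block size of 256 for double quantization (the values described in the QLoRA paper). It treats every parameter as quantized and ignores LoRA adapters, optimizer state, and activations, so it is an approximation of the weights only, not a full memory budget.

```python
GIB = 1024 ** 3


def quantized_weight_gib(n_params, bits=4, block_size=64,
                         double_quant=True, second_block_size=256):
    """Approximate memory for blockwise-quantized weights, in GiB.

    Assumed scheme: one quantization constant per block of weights.
    Without double quantization the constant is 32-bit; with it, the
    constant is 8-bit plus one 32-bit constant per group of blocks.
    """
    if double_quant:
        constant_bits = 8 / block_size + 32 / (block_size * second_block_size)
    else:
        constant_bits = 32 / block_size
    return n_params * (bits + constant_bits) / 8 / GIB


N_PARAMS = 32.5e9  # assumed approximate parameter count

print(f"with double quant:    {quantized_weight_gib(N_PARAMS):.1f} GiB")
print(f"without double quant: "
      f"{quantized_weight_gib(N_PARAMS, double_quant=False):.1f} GiB")
```

Under these assumptions the weights come to roughly 15.6 GiB with double quantization, which leaves headroom on a 40GB A100 for adapters, optimizer state, and activations.
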

### Fine-tune

```bash
export WANDB_API_KEY=[redacted]
export WANDB_PROJECT=airoboros-33b-gpt4-2.0

python qlora.py \
    --model_name_or_path /data/llama-30b-hf \
    --output_dir /data/$WANDB_PROJECT-checkpoints \
    --num_train_epochs 3 \
    --logging_steps 1 \
    --save_strategy steps \
    --data_seed 11422 \
    --save_steps 75 \
    --save_total_limit 3 \
    --evaluation_strategy "no" \
    --eval_dataset_size 2 \
    --max_new_tokens 1800 \
    --dataloader_num_workers 3 \
    --logging_strategy steps \
    --remove_unused_columns False \
    --do_train \
    --lora_r 64 \
    --lora_alpha 16 \
    --lora_modules all \
    --double_quant \
    --quant_type nf4 \
    --bf16 \
    --bits 4 \
    --warmup_ratio 0.03 \
    --lr_scheduler_type constant \
    --gradient_checkpointing \
    --dataset /data/instructions.jsonl \
    --dataset_format airoboros \
    --model_max_len 2048 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 16 \
    --learning_rate 0.0001 \
    --adam_beta2 0.999 \
    --max_grad_norm 0.3 \
    --lora_dropout 0.05 \
    --weight_decay 0.0 \
    --seed 11422 \
    --report_to wandb
```
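
A few derived quantities follow from the flags above. This sketch works them out, assuming a single GPU (as stated under Hardware) and the standard LoRA scaling of `lora_alpha / lora_r`. The function names are illustrative and are not part of qlora.py.

```python
def effective_batch_size(per_device, grad_accum, n_devices=1):
    """Sequences contributing to each optimizer step."""
    return per_device * grad_accum * n_devices


def lora_scaling(alpha, r):
    """Factor applied to the low-rank update in standard LoRA."""
    return alpha / r


def examples_per_checkpoint(save_steps, batch):
    """Training examples seen between two saved checkpoints."""
    return save_steps * batch


batch = effective_batch_size(per_device=2, grad_accum=16)
print("effective batch size:", batch)                      # 2 * 16 * 1 = 32
print("LoRA scaling:", lora_scaling(alpha=16, r=64))       # 16 / 64 = 0.25
print("examples per checkpoint:",
      examples_per_checkpoint(save_steps=75, batch=batch)) # 75 * 32 = 2400
```

So each optimizer step averages gradients over 32 sequences, and a checkpoint is written about every 2400 examples, with only the 3 most recent kept (`--save_total_limit 3`). One thing worth checking: in Hugging Face Transformers the `constant` schedule does not apply warmup, so `--warmup_ratio 0.03` likely has no effect with `--lr_scheduler_type constant`; `constant_with_warmup` is the variant that uses it.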