@bigsnarfdude
Created April 17, 2024 01:43
ft.py
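# nanoGPT-style config for fine-tuning a pretrained GPT-2 on OpenWebText
# (assumed usage, following nanoGPT conventions: python train.py ft.py)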
import time
out_dir = 'out-owt'
eval_interval = 50 # estimate train/val loss every 50 iters
eval_iters = 100 # average the loss estimate over 100 batches
wandb_log = True # log metrics to Weights & Biases
wandb_project = 'owt'
wandb_run_name = 'ft-' + str(time.time())
dataset = 'openwebtext' # expects data/openwebtext/train.bin and val.bin from nanoGPT's prepare.py
init_from = 'gpt2' # the smallest (124M-parameter) GPT-2; 'gpt2-medium', 'gpt2-large', and 'gpt2-xl' also work
# only save checkpoints if the validation loss improves
always_save_checkpoint = False
# the number of tokens per iter:
# 1 batch_size * 32 grad_accum * 1024 tokens = 32,768 tokens/iter
# openwebtext's train split has roughly 9B tokens, so 10,000 iters (~328M tokens) is only a few percent of one epoch
batch_size = 1
gradient_accumulation_steps = 32
max_iters = 10000
# finetune at a constant learning rate: decay_lr = False disables nanoGPT's warmup and cosine decay
learning_rate = 3e-5
decay_lr = False
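
For reference, here is a minimal sketch (not part of the gist) that checks the token-budget arithmetic in the comments above; block_size = 1024 is an assumption taken from GPT-2's context length, which nanoGPT uses by default.

# sanity check of the token budget implied by this config
batch_size = 1
gradient_accumulation_steps = 32
block_size = 1024  # assumed: GPT-2's 1024-token context, nanoGPT's default
max_iters = 10000

tokens_per_iter = batch_size * gradient_accumulation_steps * block_size
total_tokens = tokens_per_iter * max_iters
print(f"{tokens_per_iter:,} tokens/iter, {total_tokens:,} tokens total")
# -> 32,768 tokens/iter, 327,680,000 tokens total (a small slice of OpenWebText's ~9B)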