Laeeth / long_gpt.py
Created April 13, 2023 04:17 — forked from NaxAlpha/long_gpt.py
Training script for LongGPT; Fine-tunes GPT-2 (335M) on The Pile Dataset with a context size of 8k tokens. (requires > 16GB RAM)
import time
from contextlib import suppress
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torch.backends.cuda as cuda
from torch.utils.data import DataLoader, IterableDataset