matthewdouglas / convert.py
Created June 8, 2024 17:07
Convert llm.c GPT-2 checkpoint to HF safetensors
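
As background (not part of the gist itself): llm.c serializes its GPT-2 checkpoints as a little-endian binary header followed by the raw parameter data, and a structured NumPy dtype like the one defined in the script below lets the header be unpacked by field name. A minimal, self-contained sketch of that pattern follows, using a hypothetical checkpoint file name and a reduced set of header fields for illustration only:

import numpy as np

# Hypothetical, reduced header layout for illustration;
# the gist defines the full LLMC_HEADER_DTYPE below.
_DEMO_HEADER = np.dtype([
    ("magic", "<i4"),
    ("version", "<i4"),
    ("max_seq_len", "<i4"),
    ("vocab_size", "<i4"),
])

with open("gpt2_124M.bin", "rb") as f:  # file name is an assumption
    header = np.fromfile(f, dtype=_DEMO_HEADER, count=1)[0]
    assert header["magic"] == 20240326, "not an llm.c checkpoint"
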
from pathlib import Path
import numpy as np
from safetensors import serialize_file
from transformers import GPT2Config, AutoTokenizer

LLMC_HEADER_DTYPE = np.dtype([
("magic", "<i4"), # Little-endian, int32 magic number: 20240326
("version", "<i4"), # Little-endian, int32. fp32 = 3, bf16 = 5.
("max_seq_len", "<i4"),
("vocab_size", "<i4"),