
Setup

pip install transformers torch
git clone https://huggingface.co/EleutherAI/gpt-j-6b # depends on git-lfs 

Run the following as a Python script

from transformers import AutoTokenizer, pipeline

model_dir = "<path to cloned model>"  # e.g. the gpt-j-6b directory from the git clone above
tokenizer = AutoTokenizer.from_pretrained(model_dir)
generator_pipe = pipeline('text-generation', model=model_dir, tokenizer=tokenizer)
output = generator_pipe("I love the Avengers", max_length=30, num_return_sequences=1)
print(output)

For large models, the PyTorch checkpoint is split (sharded) into multiple files during cloning. Running a pipeline with it will consolidate them into a single pytorch_model.bin.
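
A quick way to check whether the clone is sharded is to look for the numbered shard files and the index file that maps tensors to shards (a minimal sketch; the placeholder path and the pytorch_model.bin.index.json convention are assumptions based on standard Hugging Face checkpoints):

import os
from glob import glob

model_dir = "<path to cloned model>"

# sharded checkpoints ship numbered files, e.g. pytorch_model-00001-of-00002.bin,
# plus an index json that maps each tensor name to its shard file
print(sorted(glob(f"{model_dir}/pytorch_model*.bin")))
print(os.path.exists(os.path.join(model_dir, "pytorch_model.bin.index.json")))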

Loading weights (Python)

import torch

sd = torch.load("/path/to/pytorch_model.bin", map_location="cpu")  # load the full state dict on CPU

# Number of parameters by layer
for k, v in sd.items():
    print(f'{v.numel():10} | {k}')
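
As a quick sanity check, the per-layer counts can be summed to get the total size (a small follow-up sketch reusing the sd loaded above; note it counts every tensor in the state dict, including any buffers):

total = sum(v.numel() for v in sd.values())
print(f'{total:,} total elements in the state dict')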

Setup (CPU-only PyTorch)

pip install transformers
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

Snippet to retrieve SmoothQuant-ed Weights

import torch
from collections import OrderedDict
from glob import glob

model_dir = "/data1/vchua/hf-model/opt-13b-smoothquant"

sd = OrderedDict()

# large models ship the checkpoint as multiple sharded pytorch_model*.bin files
ptbins = sorted(glob(f"{model_dir}/pytorch_model*.bin"))
for ptbin in ptbins:
    sd.update(torch.load(ptbin, map_location=torch.device('cpu')))

for k, tensor in sd.items():
    if "weight" in k and "layer_norm" not in k:
        print(k) 
        # tensor is weight tensor of interest
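
To see what was actually retrieved, each selected tensor's shape and dtype can be printed as well (a minimal follow-up sketch reusing the sd built above; it makes no assumption about which dtype the SmoothQuant checkpoints store):

for k, tensor in sd.items():
    if "weight" in k and "layer_norm" not in k:
        print(f"{k}: shape={tuple(tensor.shape)}, dtype={tensor.dtype}")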

SmoothQuant Models

https://huggingface.co/mit-han-lab
├── opt-13b-smoothquant
├── opt-30b-smoothquant
├── opt-66b-smoothquant
└── opt-6.7b-smoothquant
