@HDCharles
Created June 4, 2024 20:21
doing lm_eval's work
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from lm_eval.models.huggingface import HFLM
from lm_eval.evaluator import evaluate
from lm_eval.tasks import get_task_dict

path_to_hf_checkpoint = "/home/cdhernandez/local/gpt-fast/checkpoints/meta-llama/Meta-Llama-3-8B"
task_list = ["wikitext"]
device = "cuda"
precision = torch.bfloat16

tokenizer = AutoTokenizer.from_pretrained(path_to_hf_checkpoint)
model = AutoModelForCausalLM.from_pretrained(path_to_hf_checkpoint).to(device=device, dtype=precision)

from torchao.quantization.quant_api import change_linear_weights_to_int4_woqtensors
# your API here: swaps nn.Linear weights in-place for int4 weight-only quantized tensors
change_linear_weights_to_int4_woqtensors(model)

# wrap the quantized model in lm_eval's HF adapter and run the harness
with torch.no_grad():
    result = evaluate(
        HFLM(pretrained=model, tokenizer=tokenizer),
        get_task_dict(task_list),
        limit=10,  # only the first 10 documents per task, for a quick smoke test
    )

for task, res in result["results"].items():
    print(f"{task}: {res}")
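The "your API here" hook can take other torchao quantization entry points in place of the int4 call. The names below are assumptions based on the quant_api module of that era and are version-dependent, so check your installed torchao release; this continues from the `model` defined above.

from torchao.quantization.quant_api import (
    change_linear_weights_to_int8_woqtensors,  # int8 weight-only (assumed API, version-dependent)
    change_linear_weights_to_int8_dqtensors,   # int8 dynamic activation + weight quant (assumed API)
)

# e.g. int8 weight-only quantization instead of the int4 call above
change_linear_weights_to_int8_woqtensors(model)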
@andrewor14

  • Remove the hardcoded checkpoint path: we can download the model directly from the Hugging Face Hub by model ID (see the sketch after this list)
  • Explicitly list out the public APIs that people can try (it seems like there are only 4)
  • Put this in the docs/ folder in ao, titled "how to run an eval", or create a new folder called scripts/ and call this eval.py
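For the first bullet, a minimal sketch of loading by Hub model ID instead of a local path; this assumes you have access to the gated meta-llama repo and are logged in to the Hub (e.g. via `huggingface-cli login`).

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# download directly from the Hugging Face Hub instead of a local checkpoint
model_id = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).to(device="cuda", dtype=torch.bfloat16)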
