Skip to content

Instantly share code, notes, and snippets.

View lewtun's full-sized avatar
🤫
LLM whispering

lewtun

🤫
LLM whispering
View GitHub Profile
@lewtun
lewtun / dbrx-instruct-ifeval.json
Created April 24, 2024 21:48
DBRX-Instruct IFEval scores from LightEval
{
"config_general": {
"lighteval_sha": "?",
"num_fewshot_seeds": 1,
"override_batch_size": 4,
"max_samples": null,
"job_id": "",
"start_time": 1163608.425196265,
"end_time": 1173616.769654949,
"total_evaluation_time_secondes": "10008.34445868386",
@lewtun
lewtun / view_details.py
Last active February 28, 2024 11:28
View LightEval predictions
"""
First install: pip install datasets pandas rich transformers
Usage:
# Loglikelihood evals
python view_details.py --filepath path/to/parquet/details
# Generative evals
python view_details.py --filepath path/to/parquet/details --is_generative
@lewtun
lewtun / sft_trainer.py
Last active April 27, 2024 21:28
Fine-tuning Mistral 7B with TRL & DeepSpeed ZeRO-3
# This is a modified version of TRL's `SFTTrainer` example (https://github.com/huggingface/trl/blob/main/examples/scripts/sft_trainer.py),
# adapted to run with DeepSpeed ZeRO-3 and Mistral-7B-V1.0. The settings below were run on 1 node of 8 x A100 (80GB) GPUs.
#
# Usage:
# - Install the latest transformers & accelerate versions: `pip install -U transformers accelerate`
# - Install deepspeed: `pip install deepspeed==0.9.5`
# - Install TRL from main: pip install git+https://github.com/huggingface/trl.git
# - Clone the repo: git clone github.com/huggingface/trl.git
# - Copy this Gist into trl/examples/scripts
# - Run from root of trl repo with: accelerate launch --config_file=examples/accelerate_configs/deepspeed_zero3.yaml --gradient_accumulation_steps 8 examples/scripts/sft_trainer.py
@lewtun
lewtun / sentiment_tuning.py
Created August 1, 2023 16:56
TRL Sentiment Tuning with DeepSpeed ZeRO-3
# coding=utf-8
# Copyright 2023 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@lewtun
lewtun / dialogue_template.py
Last active October 6, 2023 12:57
Dialogue template
# coding=utf-8
# Copyright 2023 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
@lewtun
lewtun / m4_inference.py
Created July 4, 2023 15:28
M4 inference
import torch
from m4.training.packing import image_attention_mask_for_packed_input_ids, incremental_to_binary_attention_mask
from m4.training.utils import build_image_transform
from io import BytesIO
from PIL import Image
import requests
from transformers import AutoTokenizer, AutoModelForCausalLM
MAX_SEQ_LEN=2048
@lewtun
lewtun / hf-endpoints-inference.ipynb
Created June 10, 2023 07:53
Demo of synchronous and token streaming text generation
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@lewtun
lewtun / dataset-sharding.ipynb
Created February 12, 2023 15:22
Sharding dataset subsets
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@lewtun
lewtun / metrics.jsonl
Created December 2, 2022 09:47
metric.jsonl
{"id":"628dfaf7554de818ab126e2d","dataset":{"name":"glue","type":"glue","config":"sst2","split":"validation"},"metric":{"type":"accuracy","value":0.8967889908256881,"name":"Accuracy"}}
{"id":"628dfaf7554de818ab126e2d","dataset":{"name":"glue","type":"glue","config":"sst2","split":"validation"},"metric":{"type":"precision","value":0.8898678414096917,"name":"Precision"}}
{"id":"628dfaf7554de818ab126e2d","dataset":{"name":"glue","type":"glue","config":"sst2","split":"validation"},"metric":{"type":"recall","value":0.9099099099099099,"name":"Recall"}}
{"id":"628dfaf7554de818ab126e2d","dataset":{"name":"glue","type":"glue","config":"sst2","split":"validation"},"metric":{"type":"auc","value":0.9672186789593331,"name":"AUC"}}
{"id":"628dfaf7554de818ab126e2d","dataset":{"name":"glue","type":"glue","config":"sst2","split":"validation"},"metric":{"type":"f1","value":0.8997772828507795,"name":"F1"}}
{"id":"628dfaf7554de818ab126e2d","dataset":{"name":"glue","type":"glue","config":"sst2","split":"validation"},"metric":{"ty