Tim Dettmers (TimDettmers)

@TimDettmers
TimDettmers / gist:385014b37f998c7857b15a3ea60b4cae
Last active August 14, 2023 23:07
Chip2 data evaluated with ROUGE-L and BERTScore for 4-, 8-, and 16-bit LoRA and full finetuning
Config: max_steps: 17500, lora_r: 64, lr: 0.0002, bf16: False, lora_modules: all, bits: 4, full_finetune: False, lora_dropout: 0.0, warmup_steps: 100, compress_statistics: True, dataset: NaN, gradient_accumulation_steps: NaN, learning_rate: NaN, quant_type: fp4, adam_beta2: 0.999, update_freq: 6
eval_bert_f1 mean (SE): 64.8716 (21.8331). 95% CI (22.079, 107.664). Sample size: 4
eval_rougeL mean (SE): 33.1083 (19.1162). 95% CI (-4.359, 70.576). Sample size: 4
================================================================================
Config: max_steps: 17500, lora_r: 64, lr: 0.0002, bf16: False, lora_modules: all, bits: 4, full_finetune: False, lora_dropout: 0.0, warmup_steps: 100, compress_statistics: False, dataset: NaN, gradient_accumulation_steps: NaN, learning_rate: NaN, quant_type: fp4, adam_beta2: 0.999, update_freq: 6
eval_bert_f1 mean (SE): 67.0044 (22.3593). 95% CI (23.180, 110.829).
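The interval columns are consistent with a normal-approximation CI of mean +/- 1.96 * SE; a quick check in Python (the gist itself does not include the aggregation code):

def ci95(mean, se):
    # Normal-approximation 95% interval: mean +/- 1.96 * SE.
    return (mean - 1.96 * se, mean + 1.96 * se)

print(ci95(64.8716, 21.8331))  # (22.079, 107.664), matching the eval_bert_f1 row above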
@TimDettmers
TimDettmers / gist:a96b0d948a97583f8ab0599fa888c35c
Created August 10, 2023 04:57
Hyperparameter grid search for LLaMA models on Alpaca dataset for QLoRA finetuning
This table contains data from multiple software versions. Some hyperparameter values are "NaN", meaning that hyperparameter did not exist in that software version. The best 7B result is 40.08 MMLU.
================================================================================
Config: learning_rate: 0.005, adam_beta2: 0.999, lora_dropout: 0.0, max_grad_norm: 1.0, max_steps: 7320, lr_scheduler_type: cosine, weight_decay: 0.0, base_model: /gscratch/zlab/llama/7B, quant_type: nf4, gradient_accumulation_steps: 6, per_device_train_batch_size: 2
acc mean (SE): 0.2290 (nan). 95% CI (nan, nan). Sample size: 1
================================================================================
Config: learning_rate: 0.0002, adam_beta2: 0.999, lora_dropout: 0.0, max_grad_norm: 0.3, max_steps: 9750, lr_scheduler_type: constant, weight_decay: 0.0, base_model: NaN, quant_type: nf4, gradient_accumulation_steps: 2, per_device_train_batch_size: 8
acc mean (SE): 0.2290 (0.0
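Sweeps like the one above enumerate a Cartesian product of hyperparameter values; a minimal sketch of such a driver, where run_qlora_eval is a hypothetical stand-in for the actual training-and-evaluation call (not shown in the gist), and the grid values are taken from the config blocks above:

from itertools import product

grid = {
    'learning_rate': [0.005, 0.0002],
    'lr_scheduler_type': ['cosine', 'constant'],
    'max_grad_norm': [1.0, 0.3],
    'max_steps': [7320, 9750],
}
for values in product(*grid.values()):
    config = dict(zip(grid.keys(), values))
    # acc = run_qlora_eval(base_model='/gscratch/zlab/llama/7B', quant_type='nf4', **config)  # hypothetical
    print('Config:', config)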
@TimDettmers
TimDettmers / inference_hf_8bit.py
Created October 11, 2022 15:32
Minimal example of 8-bit inference for LLMs via Hugging Face transformers + accelerate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MAX_NEW_TOKENS = 128
model_name = 'facebook/opt-6.7b'
text = """Hello, I am a prompt. Who are you?"""

tokenizer = AutoTokenizer.from_pretrained(model_name)
input_ids = tokenizer(text, return_tensors="pt").input_ids

# Load the weights quantized to 8-bit via bitsandbytes; device_map='auto'
# lets accelerate place the layers across the available devices.
model = AutoModelForCausalLM.from_pretrained(model_name, device_map='auto', load_in_8bit=True)
generated_ids = model.generate(input_ids.to(model.device), max_new_tokens=MAX_NEW_TOKENS)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
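With 8-bit weights each parameter takes one byte, so the 6.7B-parameter OPT model needs roughly 7 GB for its weights, about half the fp16 footprint; that is what makes single-GPU inference of a model this size practical.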
@TimDettmers
TimDettmers / find_huffman_ratio.py
Last active September 30, 2021 15:45
Calculate Huffman compression ratio with bitsandbytes
import torch
import bitsandbytes as bnb
from heapq import heappush, heappop, heapify

a = torch.normal(0, 0.5, size=(1024, 1024), device='cuda')

def get_compression(x: torch.Tensor) -> float:
    """Yields the compression rate of Huffman Coding"""
    assert x.device.type == 'cuda'
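The preview ends inside get_compression; a minimal sketch of how the body could continue, assuming bnb.functional.quantize_blockwise as the 8-bit symbol source (the gist's actual implementation is not shown):

def get_compression(x: torch.Tensor) -> float:
    """Yields the compression rate of Huffman Coding"""
    assert x.device.type == 'cuda'
    # Assumption: quantize_blockwise returns the uint8 symbol tensor first;
    # those discrete symbols are what a Huffman code would compress.
    symbols = bnb.functional.quantize_blockwise(x)[0]
    counts = torch.bincount(symbols.flatten().long(), minlength=256).tolist()
    # Classic heap-based Huffman construction: entries are
    # [weight, [symbol, code_length], ...]; merging the two lightest
    # subtrees pushes all their symbols one level (one bit) deeper.
    heap = [[c, [sym, 0]] for sym, c in enumerate(counts) if c > 0]
    heapify(heap)
    while len(heap) > 1:
        lo, hi = heappop(heap), heappop(heap)
        for pair in lo[1:] + hi[1:]:
            pair[1] += 1
        heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    # Average code length in bits per symbol (assumes >1 distinct symbol).
    avg_bits = sum(counts[sym] * depth for sym, depth in heap[0][1:]) / x.numel()
    return 8.0 / avg_bits  # ratio vs. storing each 8-bit symbol directly

print(get_compression(a))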