@sekstini
sekstini / weightcomp.ipynb
Created July 19, 2023 21:58
LLaMA-2 7b weight comparison between original (bf16) and huggingface (fp16)
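The notebook itself does not render here, so the following is only a minimal, standard-library sketch of the kind of check the title describes. It is not the author's code. The helper names `to_bf16` and `to_fp16` are hypothetical, and the sample weights are made up for illustration. The sketch relies on one fact: bf16 keeps 8 significant bits and fp16 keeps 11, so a bf16 value converts to fp16 exactly unless its magnitude falls outside the range fp16 can represent.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to IEEE 754 half precision and back (finite, in-range inputs only)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_bf16(x: float) -> float:
    """Round x to bfloat16 and back (finite inputs only).

    bfloat16 is the upper 16 bits of a float32, so we round the float32
    bit pattern to nearest-even on the 16 bits that get discarded.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round to nearest, ties to even
    bits &= 0xFFFF0000                   # keep sign, exponent, top 7 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Made-up sample values standing in for model weights (assumption).
weights = [0.0123, -0.5, 3.0e-3, 1.0e-8]

bf16 = [to_bf16(w) for w in weights]   # "original" checkpoint precision
fp16 = [to_fp16(w) for w in bf16]      # the same values after an fp16 cast
diffs = [abs(a - b) for a, b in zip(bf16, fp16)]

for w, a, b, d in zip(weights, bf16, fp16, diffs):
    print(f"{w:>10.3e}  bf16={a:.6e}  fp16={b:.6e}  |diff|={d:.3e}")
```

In this sketch the first three values survive the cast unchanged, while the last one is below the smallest fp16 subnormal and underflows to zero. A real comparison would run the same idea over the actual checkpoint tensors.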
# %%
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
# %%
import bitsandbytes as bnb # type: ignore
def dump_4bit_weight(m: bnb.nn.modules.Linear4bit, path: str):
    assert isinstance(m, bnb.nn.modules.Linear4bit), "Only Linear4bit is supported"
    obj = {
# %%
import torch
# %%
# == Load 4bit weights and original weights ==
lin_4bit_dump = torch.load("lin_4bit_dump.pt", map_location="cuda")
lin_orig_weight = torch.load("lin_orig_weight.pt", map_location="cuda")
tmp = torch.load("input_and_outputs.pt", map_location="cuda")
x, output_4bit, output_orig = tmp["input"], tmp["output_4bit"], tmp["output_orig"]
@sekstini
sekstini / convert_llama_weights_to_hf.py
Created September 27, 2023 10:44
Convert Mistral Llama 7B to Huggingface format
# Copyright 2022 EleutherAI and The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
@sekstini
sekstini / Residual_FSQ_Example.ipynb
Last active April 23, 2024 07:41
Residual FSQ MNIST Example
@sekstini
sekstini / MiniFormer_CompileTest.ipynb
Created November 1, 2023 20:46
Testing MiniFormer memory consumption with and without torch.compile
@sekstini
sekstini / table.csv
Created November 7, 2023 03:18
gpt-4 vs gpt-4-turbo on PyLLMs benchmark
Model,Output,Tokens,Cost ($),Latency (s),Speed (tokens/sec),Evaluation
OpenAIProvider('gpt-4-1106-preview'),"If you see the word 'push' in reverse through a glass door, it means you are looking at the text from the back side. The text is likely intended for people on the other side of the door, instructing them to push when they approach the door from that side.
Since you are seeing it in reverse, you are on the opposite side of the door from where the instruction applies. Therefore, you should do the opposite of what the reversed text says, which means you should pull the door to open it.",105,0.00257,2.75,38.18,3
OpenAIProvider('gpt-4-1106-preview'),"To solve the quadratic equation x^2 - 5x + 6 = 0, we can use factoring, the quadratic formula, or completing the square. The simplest method here is factoring because the equation is already in a form that suggests it might factor nicely.
We are looking for two numbers that multiply to give the constant term (6) and add up to give the coefficient of the x ter
@sekstini
sekstini / KVCacheFun.ipynb
Last active November 28, 2023 10:12
Answering what happens when you give the KV-Cache of an instruct model to its base model. (update: better version in the comments)
@sekstini
sekstini / KVCacheFunv2.ipynb
Last active November 28, 2023 06:49
KV Cache Fun, but using the proper instruct template, and comparing outputs