Birch-san

@Birch-san
Birch-san / arb.py
Created July 27, 2023 23:07
Computing aspect ratio buckets
import numpy as np
import math
from numpy.typing import NDArray
# we are trying to make buckets of varying aspect ratios,
# all with about the same area (equivalent to a 512x512 square)
square_side = 512
buckets = 8
widest_aspect: float = math.atan2(1, 2) # stored as an angle in radians; tan(angle) = 1/2 = 0.5 aspect ratio
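The excerpt above stops before any buckets are computed. A minimal sketch of one way to finish the idea follows, using only the standard library. It assumes the aspect angles are spaced evenly between `atan2(1, 2)` and `atan2(2, 1)`, and that sides are rounded to a multiple of 8; both choices, along with the names `aspect_buckets` and `multiple`, are illustrative assumptions and not taken from the gist.

```python
import math

def aspect_buckets(square_side=512, buckets=8, multiple=8):
    """Return (width, height) pairs with roughly square_side**2 area.

    Assumptions (not from the original gist): angles are linearly
    interpolated, and sides are rounded to the nearest `multiple`.
    """
    widest = math.atan2(1, 2)   # angle whose tangent (height/width) is 0.5
    tallest = math.atan2(2, 1)  # angle whose tangent (height/width) is 2.0
    area = square_side ** 2
    sizes = []
    for i in range(buckets):
        theta = widest + (tallest - widest) * i / (buckets - 1)
        ratio = math.tan(theta)            # height / width
        width = math.sqrt(area / ratio)    # solve width * (width * ratio) = area
        height = width * ratio
        sizes.append((round(width / multiple) * multiple,
                      round(height / multiple) * multiple))
    return sizes

bucket_sizes = aspect_buckets()
for w, h in bucket_sizes:
    print(f"{w}x{h}  area={w * h}")
```

Rounding means each bucket only approximates the target area, which matches the "about the same area" wording in the comments.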
@Birch-san
Birch-san / flash_attn_processor.py
Created July 21, 2023 17:41
diffusers flash_attn AttnProcessors for qkvpacked self-attn and regular cross-attn
import torch
from typing import Optional
from flash_attn import flash_attn_func, flash_attn_qkvpacked_func
from diffusers.models.attention import Attention
class FlashAttnProcessor:
    r"""
    Processor for implementing memory efficient attention using flash_attn.
    """
@Birch-san
Birch-san / flash_attn_processor.py
Last active December 19, 2023 22:07
FlashAttnProcessor
import torch
from typing import Optional
from flash_attn import flash_attn_func
from diffusers.models.attention import Attention
class FlashAttnProcessor:
    r"""
    Processor for implementing memory efficient attention using flash_attn.
    """
@Birch-san
Birch-san / bnb-correctness-test.md
Last active July 10, 2023 17:43
Correctness-testing bitsandbytes `0.40.0`

correctness-testing 0.40.0

Here we've ramped up the bnb_4bit_compute_dtype to float32, in the hope of making the model stay on-topic,
since we were concerned by the responses measured with bnb_4bit_compute_dtype=bfloat16.

llama 7b

`I was under the effect of a counterspell, so none of the superpower-wielding monsters could see me anyway. My eyes had begun to change as a result of my battle with Melvin. The transformation was complete. I was in the true look of my chosen form. As you can see, a true-blue beauty. There was only one of me, though, so I would have to make sure that this was the end. I went to catch the culprit. He was in the same clothes he was wearing when he committed the first murder. I did not recognize the man from that time, nor did he from me, but his face was twisted with an evil grin. He had the same shaved head. However, his hair seemed to change color. It was dark brown when I met him, but it turned to

@Birch-san
Birch-san / bnb-perf-test.md
Last active July 10, 2023 17:05
Perf-testing bitsandbytes `0.39.1` vs `0.40.0`

perf-testing bitsandbytes 0.39.1 vs 0.40.0

4090 on CUDA 12.1

seed=64

Evaluated using evaluate.py:

python -m evaluate --model_name_or_path huggyllama/llama-7b --tokenizer_model_name_or_path huggyllama/llama-7b --bf16 --overrun_countermeasures False --prompt_style bare
[14.614647 14.526281 14.438574 14.351521 14.265114 14.179349
14.094221 14.009725 13.925854 13.842604 13.759968 13.677942
13.596522 13.515701 13.435474 13.355838 13.276786 13.198313
13.120416 13.043088 12.966325 12.890123 12.814478 12.739382
12.664833 12.590827 12.517358 12.444422 12.372013 12.300129
12.2287655 12.157917 12.087579 12.017748 11.94842 11.879589
11.8112545 11.743409 11.67605 11.609174 11.542775 11.4768505
11.411397 11.346409 11.281884 11.217818 11.154207 11.091047
11.028336 10.966067 10.90424 10.842849 10.781891 10.721362
10.661261 10.601581 10.542321 10.483477 10.425045 10.367022
@Birch-san
Birch-san / falcon-40b-spqr.md
Created June 9, 2023 22:43
Run Falcon-40B with 3.35-bit quantization via SpQR

Instructions are a work-in-progress (I haven't managed it yet, just writing what I do as I go along).

@Birch-san
Birch-san / code-assist.md
Last active March 4, 2024 19:32
Local VSCode AI code assistance via starcoder + 4-bit quantization in ~11GB VRAM

Install the HF Code Autocomplete VSCode plugin.

We are not going to set an API token. We are going to specify an API endpoint.
We will try to deploy that API ourselves, to use our own GPU to provide the code assistance.

We will use bigcode/starcoder, a 15.5B param model.
We will use NF4 4-bit quantization to fit this into 10787MiB VRAM.
It would require 23767MiB VRAM unquantized (which still fits on a 4090, which has 24564MiB!).
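To make the headroom concrete, here is a small worked calculation using only the three MiB figures quoted above. The function name `vram_summary` and the dictionary keys are illustrative; nothing here measures a real GPU.

```python
def vram_summary(quantized_mib=10787, unquantized_mib=23767, card_mib=24564):
    """Compare the VRAM figures quoted in the text (all values in MiB)."""
    return {
        # free VRAM left on the card in each configuration
        "headroom_unquantized": card_mib - unquantized_mib,
        "headroom_quantized": card_mib - quantized_mib,
        # memory saved by NF4 quantization
        "saving": unquantized_mib - quantized_mib,
        # quantized footprint as a fraction of the unquantized one
        "ratio": round(quantized_mib / unquantized_mib, 2),
    }

summary = vram_summary()
print(summary)
```

By these numbers the unquantized model leaves under 1GiB free on the card, while the quantized one leaves more than half the card available.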

Setup API

@Birch-san
Birch-san / data_collator.py
Created June 3, 2023 21:45
DataCollatorForCriticLM
from dataclasses import dataclass
from typing import TypedDict

import transformers

class ExtractedCriticSample(TypedDict):
    prompt: str
    continuation: str
    rating: int

@dataclass
class DataCollatorForCriticLM(object):
    tokenizer: transformers.PreTrainedTokenizer
    prompt_max_len: int
    continuation_max_len: int
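The excerpt ends before the collator's `__call__` is shown, so its real behaviour is not known here. As a rough illustration of what the two length fields could be for, the toy sketch below truncates the prompt and continuation separately, then pads the batch. `ToyCriticCollator`, its `tokenize` callable (a stand-in for a real tokenizer), `pad_id`, and the output keys are all hypothetical and not taken from the gist.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, TypedDict

class ExtractedCriticSample(TypedDict):
    prompt: str
    continuation: str
    rating: int

@dataclass
class ToyCriticCollator:
    # hypothetical stand-in for a real tokenizer: maps text to token ids
    tokenize: Callable[[str], List[int]]
    prompt_max_len: int
    continuation_max_len: int
    pad_id: int = 0

    def __call__(self, samples: List[ExtractedCriticSample]) -> Dict[str, list]:
        rows = []
        for s in samples:
            # truncate each field to its own budget before concatenating
            prompt_ids = self.tokenize(s["prompt"])[: self.prompt_max_len]
            cont_ids = self.tokenize(s["continuation"])[: self.continuation_max_len]
            rows.append(prompt_ids + cont_ids)
        longest = max(len(r) for r in rows)
        return {
            # right-pad every row to the longest row in the batch
            "input_ids": [r + [self.pad_id] * (longest - len(r)) for r in rows],
            "attention_mask": [[1] * len(r) + [0] * (longest - len(r)) for r in rows],
            "ratings": [s["rating"] for s in samples],
        }

# toy tokenizer: one "token id" per word, equal to the word's length
collator = ToyCriticCollator(
    tokenize=lambda text: [len(word) for word in text.split()],
    prompt_max_len=2,
    continuation_max_len=1,
)
batch = collator([
    {"prompt": "a bb ccc", "continuation": "dddd ee", "rating": 1},
    {"prompt": "a", "continuation": "bb", "rating": -1},
])
print(batch)
```

Keeping separate limits for prompt and continuation means a long prompt cannot crowd out the continuation being rated.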
@Birch-san
Birch-san / pahse1_train_sample_1.json
Created June 2, 2023 10:20
Excerpt from OpenAI's PRM800K process supervision dataset
{
  "labeler": "e90a38f3-3135-4465-87af-3e6322e3d772",
  "timestamp": "2022-07-17T16:56:51.323252",
  "generation": null,
  "is_quality_control_question": false,
  "is_initial_screening_question": false,
  "question":
  {
    "problem": "How many positive two-digit integers leave a remainder of 2 when divided by 8?",
    "ground_truth_answer": "12"