@Birch-san
Birch-san / .gitconfig
Created April 2, 2024 15:10
Using fine-grained access token to access your organisation's private GitHub repositories
[url "https://oauth2:github_pat_REDACTED@github.com/"]
    insteadOf = https://github.com/
[url "https://oauth2:github_pat_REDACTED@github.com/MYCOOLORG/"]
    insteadOf = git@github.com:MYCOOLORG/
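Equivalently, the same rewrite rules can be written from the command line with `git config` (a sketch; the token and org name are the same redacted placeholders as above, not real values):

```shell
# Write the URL-rewrite rules into ~/.gitconfig via the CLI.
# github_pat_REDACTED and MYCOOLORG are placeholders.
git config --global url."https://oauth2:github_pat_REDACTED@github.com/".insteadOf "https://github.com/"
git config --global url."https://oauth2:github_pat_REDACTED@github.com/MYCOOLORG/".insteadOf "git@github.com:MYCOOLORG/"
```

After this, any `git clone https://github.com/...` (or `git@github.com:MYCOOLORG/...`) will transparently authenticate with the fine-grained token.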
Birch-san / img-folder-chunking.md
Last active February 3, 2024 18:30
Chunking a folder of pngs into .tar files

Uploading a folder of many files to HF, by chunking it into .tars

So you generated 50,000 images for computing FID or whatever, and now you want to upload those samples to HF.
You try, but one of the file transfers fails, and you lose all your progress.
I mean, it'd be nice if HF could just… fix this… like, put retries into huggingface-cli upload instead of just discarding tens of gigabytes of progress… but we live in the world in which we live.

So let's make it easier: instead of 50k small files, let's upload 50 big files. Collate 'em into .tars.

I'm not sure this makes a valid WebDataset (WDS), but it's close; I think you'd need to rename the files to 000000.img.png if you wanted that.
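The collation step above can be sketched in Python with the stdlib `tarfile` module (names like `chunk_to_tars`, `src`, `dst`, and `chunk_size` are illustrative, not from the original gist):

```python
# Sketch: pack a flat folder of .png samples into fixed-size .tar shards,
# e.g. 50k pngs at chunk_size=1000 -> 50 tars named 00000.tar, 00001.tar, ...
import tarfile
from pathlib import Path

def chunk_to_tars(src: Path, dst: Path, chunk_size: int = 1000) -> int:
    """Collate every .png under src into dst/00000.tar, 00001.tar, ...
    Returns the number of shards written."""
    dst.mkdir(parents=True, exist_ok=True)
    pngs = sorted(src.glob('*.png'))
    for shard, start in enumerate(range(0, len(pngs), chunk_size)):
        with tarfile.open(dst / f'{shard:05d}.tar', 'w') as tar:
            for png in pngs[start:start + chunk_size]:
                # arcname strips the source directory from the stored path
                tar.add(png, arcname=png.name)
    return (len(pngs) + chunk_size - 1) // chunk_size
```

Upload failures then cost you at most one shard of progress rather than the whole transfer.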

Birch-san / installing-python-proxy.md
Last active January 29, 2024 10:36
Installing Python when behind a corporate proxy

Behind a corporate proxy? Can't add PPAs to your apt listings?

A typical HTTP proxy URL may look like:
http://proxy.mycoolproxy.com:8080

Let's configure all our tools to use this proxy.

apt

sudo nano /etc/apt/apt.conf.d/00proxy.conf
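A minimal 00proxy.conf might look like this (a sketch using the placeholder proxy URL above; substitute your real proxy):

```
Acquire::http::Proxy "http://proxy.mycoolproxy.com:8080";
Acquire::https::Proxy "http://proxy.mycoolproxy.com:8080";
```

apt will then route both http and https fetches through the proxy.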
Birch-san / install-bpftrace-on-wsl2.md
Last active January 29, 2024 10:02
Installing bpftrace on WSL2
wsl --update --web-download
wsl --install -d Ubuntu-22.04 --web-download
wsl --setdefault Ubuntu-22.04
sudo apt-get install -y bpftrace bpftrace-dbgsym linux-headers-generic libc6-dev
Birch-san / 8bit_adam_memory_usage.md
Last active October 3, 2023 18:20
Unexplained memory usage of 8-bit AdamW (paged vs unpaged)

Some weird memory usage (VRAM) is reported (by torch and by NVML) when using 8-bit AdamW, paged or unpaged.

Here we train llama 2 on 4096-token sequences, using either --optim adamw_8bit or --optim paged_adamw_8bit.
We do a full finetune using qlora.py --full-finetune, with our qlora.py fork, stepwise branch, commit 9a1045d.
We print the memory usage using HF transformers trainer's on_step_end callback. This is after optimizer.step(); model.zero_grad().

One would expect memory usage at the end of step 1 to be the same as at the end of step 2.
Yet for the unpaged optimizer, memory usage leaps by 11.2GiB: end of step 1 = 70.4GiB, end of step 2 = 81.6GiB.
This appears to be a leap in PyTorch reserved memory only (32.6GiB -> 43.9GiB).

Birch-san / t5-small-weight-inits.py
Created October 1, 2023 15:04
google/t5-v1_1-small t5-small weight initializations
import torch
from transformers import T5ForConditionalGeneration

model: T5ForConditionalGeneration = T5ForConditionalGeneration.from_pretrained('google/t5-v1_1-small')
with torch.inference_mode():
    print(model.shared.weight.std())  # tensor(11.6375)
Birch-san / local-copilot.md
Last active March 12, 2024 15:14
Running GitHub Copilot against local Code Llama model

Running GitHub Copilot VSCode extension against local Code Llama model

Tested on NVIDIA RTX 4090, but these instructions also cover AMD and Mac in case you wanna try those.
This guide assumes you are running Linux (I ran this on Ubuntu).

Before you get excited:

Birch-san / mask-test.ipynb
Created September 2, 2023 15:32
Tester for neighbourhood_mask, perimeter_mask
Birch-san / mask_test.py
Created September 2, 2023 15:31
Tester for neighbourhood_mask, perimeter_mask
from typing import Optional, NamedTuple

import torch
from torch import BoolTensor, arange, meshgrid, clamp


class Dimensions(NamedTuple):
    height: int
    width: int


def make_neighbourhood_mask(size: Dimensions, size_orig: Dimensions, device='cpu') -> BoolTensor:
    h, w = size
Birch-san / llama_flash.py
Last active January 22, 2024 06:05
Loading llama with Flash Attention
from transformers import (
    AutoConfig,
    AutoTokenizer,
    BitsAndBytesConfig,
    GenerationConfig,
    AutoModelForCausalLM,
    LlamaTokenizerFast,
    PreTrainedModel,
    TextIteratorStreamer,
    StoppingCriteria,