@fastdaima
fastdaima / static_kv_cache.py
Created February 29, 2024 13:27 — forked from ArthurZucker/static_kv_cache.py
simple static kv cache script
from transformers import AutoModelForCausalLM, AutoTokenizer, StaticCache
import torch
from typing import Optional
device = "cuda"
# Copied from the gpt-fast repo
def multinomial_sample_one_no_sync(probs_sort):  # Does multinomial sampling without a CUDA synchronization
    q = torch.empty_like(probs_sort).exponential_(1)
    return torch.argmax(probs_sort / q, dim=-1, keepdim=True).to(dtype=torch.int)
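
The sampling helper above divides the probabilities by independent Exponential(1) noise and takes the argmax, which draws an index in proportion to its probability. The following is a minimal pure-Python sketch of that idea, not the gist's code: the names `sample_one_exponential_race` and `empirical_frequencies` are hypothetical, and it uses the standard-library `random` module instead of `torch` so it can run on a CPU-only machine.

```python
import random
import sys


def sample_one_exponential_race(probs, rng):
    """Draw one index i with probability proportional to probs[i].

    Hypothetical helper: mirrors argmax(probs / q) with q_i ~ Exponential(1).
    """
    best_index, best_score = -1, -1.0
    for i, p in enumerate(probs):
        # Guard against the (extremely unlikely) exact 0.0 draw to avoid division by zero.
        q = max(rng.expovariate(1.0), sys.float_info.min)
        score = p / q
        if score > best_score:
            best_index, best_score = i, score
    return best_index


def empirical_frequencies(probs, n_draws, seed=0):
    """Repeat the draw n_draws times and return the observed frequency of each index."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    for _ in range(n_draws):
        counts[sample_one_exponential_race(probs, rng)] += 1
    return [c / n_draws for c in counts]


if __name__ == "__main__":
    target = [0.1, 0.2, 0.7]
    print(empirical_frequencies(target, 20000))
```

With enough draws the printed frequencies should sit close to the target probabilities. The trick works because `q_i / p_i` is exponentially distributed with rate `p_i`, and the smallest of several independent exponentials falls on index `i` with probability `p_i / sum(p)`.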

Ubuntu 20.04 for Deep Learning

In the name of God

This gist contains steps to set up Ubuntu 20.04 for deep learning.


Install Ubuntu 20.04:

Ubuntu 22.04 for Deep Learning

In the name of God

This gist contains steps to set up Ubuntu 22.04 for deep learning.


Install Ubuntu 22.04

@fastdaima
fastdaima / pull-all.sh
Created September 19, 2022 18:38 — forked from jph00/pull-all.sh
Update in parallel all repos listed in ~/git/repos, and print status of any that are dirty
#!/usr/bin/env bash
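# Start a background `git pull` in every repo listed in ~/git/repos
for f in $(<~/git/repos); do
  cd ~/git/"$f" || continue  # skip entries whose directory is missing
  git pull > /dev/null &
  cd - > /dev/null
done
# Block until all background pulls have finished
wait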
for f in $(<~/git/repos); do