Kerem Turgutlu KeremTurgutlu

@KeremTurgutlu
KeremTurgutlu / tinygemm_vs_bitblas.py
Last active July 11, 2024 20:40
HQQ Tinygemm vs BitBlas Benchmark
import torch
import numpy as np
from hqq.core.quantize import HQQLinear, BaseQuantizeConfig, Quantizer, HQQBackend
from hqq.backends.torchao import HQQLinearTorchWeightOnlynt4, patch_hqq_to_aoint4
# from unpack_int4.ops import unpack_int4_packed
import torchao
import bitblas
# unpack_cuda_compiled = torch.compile(torchao.ops.unpack_int4_to_int, mode="default", fullgraph=True)
from bitblas.cache import global_operator_cache, get_database_path
@KeremTurgutlu
KeremTurgutlu / test_triton_mm.ipynb
Last active May 24, 2024 13:51
test_triton_mm.ipynb
@KeremTurgutlu
KeremTurgutlu / exp.ipynb
Last active January 10, 2024 14:29
QLORA Memory Experiments
@KeremTurgutlu
KeremTurgutlu / gpt_eval_templates.py
Created October 28, 2023 05:03
GPT-Eval Templates
gpt_eval_template_coherence = """
You will be given title: [TITLE] and description: [DESC] written from a set of information of a real estate listing in Turkish.
Your task is to rate the title and description on one metric.
Please make sure you read and understand these instructions carefully. Please keep this
document open while reviewing, and refer to it as needed.
Evaluation Criteria:
Coherence (1-5) - the collective quality of all sentences. We align this dimension with
@KeremTurgutlu
KeremTurgutlu / multipack_sampler_flash_attn.py
Last active October 7, 2023 04:44
Multipack Sampler x Flash Attention
"""
Testing flash attn with multipacking which essentially packs sequences using https://github.com/imoneoi/multipack_sampler,
and passes a single sequence of `1 x (bs x seqlen)` to the model to avoid padding.
An alternative is to use block diagonal attention as attention bias, but the following uses flash attention 2 which
is much faster.
Multipacking can be used to speed up both pretraining and finetuning.
"""
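The packing idea described in the docstring above can be sketched in plain Python. This is a minimal illustration, not the gist's code: `pack_sequences` and `position_ids` are hypothetical helper names, and the example assumes that a variable-length attention kernel receives cumulative sequence boundaries (commonly called `cu_seqlens`) so that tokens only attend within their own sequence.

```python
from itertools import accumulate


def pack_sequences(seqs):
    """Concatenate variable-length token lists into one padding-free row.

    Returns the flat token list, the cumulative boundaries (cu_seqlens),
    and the length of the longest sequence.
    """
    lengths = [len(s) for s in seqs]
    flat = [tok for s in seqs for tok in s]
    # cu_seqlens[i]:cu_seqlens[i + 1] is the slice that holds sequence i
    cu_seqlens = [0] + list(accumulate(lengths))
    return flat, cu_seqlens, max(lengths)


def position_ids(cu_seqlens):
    """Restart position ids at 0 for every packed sequence."""
    return [p
            for start, end in zip(cu_seqlens, cu_seqlens[1:])
            for p in range(end - start)]


batch = [[11, 12, 13], [21, 22], [31, 32, 33, 34]]
flat, cu_seqlens, max_len = pack_sequences(batch)
```

Here the three sequences become a single row of nine tokens with no padding, and the boundaries are what keep attention from leaking across neighbouring sequences.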
@KeremTurgutlu
KeremTurgutlu / ddp_batch_all_gather.py
Last active September 20, 2023 00:57
Debugging: Distributed InfoNCE Loss
# CLIP contrastive loss is calculated over all the negative batch samples from all the GPUs
# How to implement that?
# For more info: https://github.com/openai/CLIP/issues/29
import os
import sys
import tempfile
import torch
import torch.distributed as dist
import torch.nn as nn
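One way to picture the question in the comments above is a single-process simulation. The sketch below is an assumption-laden illustration rather than the gist's implementation: list concatenation stands in for the all-gather step, `local_infonce` is a hypothetical helper, and only the image-to-text direction is shown. The key detail is the label offset: on a given rank, the positive for local sample `i` sits at index `rank * local_bs + i` in the gathered batch.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_entropy(logits, target):
    # numerically stable log-sum-exp minus the target logit
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target]


def local_infonce(local_img, all_txt, rank, temperature=0.07):
    """Image-to-text InfoNCE for one rank, with negatives from every rank."""
    local_bs = len(local_img)
    losses = []
    for i, img in enumerate(local_img):
        logits = [dot(img, txt) / temperature for txt in all_txt]
        # index of this sample's own text inside the gathered batch
        target = rank * local_bs + i
        losses.append(cross_entropy(logits, target))
    return sum(losses) / local_bs


# Simulate 2 ranks with a local batch of 2; one-hot features make every
# matching image/text pair perfectly aligned.
txt_per_rank = [
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]],
]
img_per_rank = txt_per_rank
all_txt = [t for chunk in txt_per_rank for t in chunk]  # stand-in for all_gather
losses = [local_infonce(img_per_rank[r], all_txt, r) for r in range(2)]
```

In a real multi-GPU run the gather would go through `torch.distributed`; tensors returned by a plain `all_gather` typically do not carry gradients, so implementations often put the local tensor back into its slot before computing the logits.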
@KeremTurgutlu
KeremTurgutlu / nn_interpolate.py
Last active May 22, 2023 18:19
Nearest Neighbor Interpolation in Numpy
from collections import Counter
def nn_interpolate(A, new_size):
    """
    Nearest Neighbor Interpolation, Step by Step
    """
    # get sizes
    old_size = A.shape
    # calculate row and column ratios
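The preview above is cut off after the ratio step. As a rough, self-contained illustration of the same technique (not the gist's NumPy code), the sketch below works on nested lists. `nn_interpolate_sketch` is a hypothetical name, and the floor-based index mapping is an assumption; other rounding conventions are equally valid.

```python
def nn_interpolate_sketch(grid, new_size):
    """Resize a 2-D list of lists to new_size = (rows, cols) by nearest neighbor."""
    old_rows, old_cols = len(grid), len(grid[0])
    new_rows, new_cols = new_size
    # ratio of source size to target size along each axis
    row_ratio = old_rows / new_rows
    col_ratio = old_cols / new_cols
    # map every output index back to a source index (floor convention, clamped)
    row_idx = [min(int(r * row_ratio), old_rows - 1) for r in range(new_rows)]
    col_idx = [min(int(c * col_ratio), old_cols - 1) for c in range(new_cols)]
    return [[grid[r][c] for c in col_idx] for r in row_idx]


small = [[1, 2], [3, 4]]
big = nn_interpolate_sketch(small, (4, 4))
```

Upsampling the 2x2 grid to 4x4 repeats each source value in a 2x2 block; the same function also downsamples when `new_size` is smaller than the input.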
TsvHttpData-1.0
https://files.pushshift.io/reddit/comments/RC_2005-12.zst
@KeremTurgutlu
KeremTurgutlu / ema_swa.py
Last active July 26, 2022 03:10
EMA and SWA callbacks for different model averaging techniques
from fastai.vision.all import *
__all__ = ["EMA", "SWA"]
class EMA(Callback):
    "https://fastai.github.io/timmdocs/training_modelEMA"
    order,run_valid=5,False
    def __init__(self, decay=0.9999):
        super().__init__()
        self.decay = decay
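The callback body is not shown in this preview. For orientation, the exponential moving average rule that a `decay` parameter like this usually controls can be sketched without any framework. `ema_update` and the dictionary-of-floats parameter format are hypothetical stand-ins, not the gist's code.

```python
def ema_update(ema_params, model_params, decay=0.9999):
    """One EMA step, in place: ema = decay * ema + (1 - decay) * param."""
    for name, value in model_params.items():
        ema_params[name] = decay * ema_params[name] + (1.0 - decay) * value
    return ema_params


# With decay=0.5 the average moves halfway toward the current weight each step.
ema = {"w": 0.0}
for _ in range(3):
    ema_update(ema, {"w": 1.0}, decay=0.5)
```

A decay close to 1, such as the 0.9999 default above, makes the averaged weights change very slowly, which is why EMA weights are normally evaluated only after many optimizer steps.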
from fastai.vision.all import *
from torch.cuda.amp import autocast, GradScaler
from torch.cuda.amp.grad_scaler import _refresh_per_optimizer_state
from sam import SAM
class FastaiSched:
    def __init__(self, optimizer, max_lr):
        self.optimizer = optimizer
        self.lr_sched = combine_scheds([0.1,0.9], [SchedLin(1e-8,max_lr), SchedCos(max_lr,1e-8)])
        self.update(0)
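The combined schedule above reads as a linear warmup over the first 10% of training followed by cosine annealing over the remaining 90%. The standalone sketch below reproduces that shape under the assumption that `SchedLin` and `SchedCos` interpolate as written here; `warmup_cosine_lr` is a hypothetical helper and does not call fastai.

```python
import math


def sched_lin(start, end, pos):
    """Linear interpolation from start to end as pos goes from 0 to 1."""
    return start + pos * (end - start)


def sched_cos(start, end, pos):
    """Cosine interpolation from start to end as pos goes from 0 to 1."""
    return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2


def warmup_cosine_lr(pos, max_lr, warmup_pct=0.1, min_lr=1e-8):
    """Learning rate at training progress pos in [0, 1]."""
    if pos < warmup_pct:
        # rescale progress so the warmup phase sees its own 0..1 range
        return sched_lin(min_lr, max_lr, pos / warmup_pct)
    return sched_cos(max_lr, min_lr, (pos - warmup_pct) / (1 - warmup_pct))


lrs = [warmup_cosine_lr(step / 10, max_lr=1e-3) for step in range(11)]
```

The learning rate starts at `min_lr`, peaks at `max_lr` once 10% of training has elapsed, and decays back to `min_lr` by the end.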