Skip to content

Instantly share code, notes, and snippets.

View aflah02's full-sized avatar
🎯
Focusing

Aflah aflah02

🎯
Focusing
View GitHub Profile
@NaxAlpha
NaxAlpha / pythia_1b4_8k.py
Last active April 4, 2024 17:19
Fine-tune Pythia 1.4b model for 8k context window, this script requires at least 40 GB of memory, 15-20 hours of fine-tune is sufficient.
import copy
import torch
import torch.nn.functional as F
import torch.backends.cuda as cuda
from torch.utils.data import DataLoader, IterableDataset
import wandb
from tqdm import tqdm
import bitsandbytes as bnb
@veekaybee
veekaybee / chatgpt.md
Last active September 29, 2024 03:46
Everything I understand about chatgpt

ChatGPT Resources

Context

ChatGPT appeared like an explosion on all my social media timelines in early December 2022. While I keep up with machine learning as an industry, I wasn't focused so much on this particular corner, and all the screenshots seemed like they came out of nowhere. What was this model? How did the chat prompting work? What was the context of OpenAI doing this work and collecting my prompts for training data?

I decided to do a quick investigation. Here's all the information I've found so far. I'm aggregating and synthesizing it as I go, so it's currently changing pretty frequently.

Model Architecture

@p208p2002
p208p2002 / BART Fine-tuning.ipynb
Last active December 1, 2022 07:38
#blog #bart-model #nlp
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@yoavg
yoavg / stochastic-critique.md
Last active November 9, 2023 04:32
A criticism of Stochastic Parrots

A criticism of "On the Dangers of Stochastic Parrots: Can Languae Models be Too Big"

Yoav Goldberg, Jan 23, 2021.

The FAccT paper "On the Dangers of Stochastic Parrots: Can Languae Models be Too Big" by Bender, Gebru, McMillan-Major and Shmitchell has been the center of a controversary recently. The final version is now out, and, owing a lot to this controversary, would undoubtly become very widely read. I read an earlier draft of the paper, and I think that the new and updated final version is much improved in many ways: kudos for the authors for this upgrade. I also agree with and endorse most of the content. This is important stuff, you should read it.

However, I do find some aspects of the paper (and the resulting discourse around it and around technology) to be problematic. These weren't clear to me when initially reading the first draft several months ago, but they became very clear to me now. These points are for the most part

@conquistadorjd
conquistadorjd / 0-image-processing
Last active January 2, 2022 11:33
image pricessing using PIL, pillow
image processing
@hibariya
hibariya / Makefile
Last active November 11, 2022 10:35
Calling c function from x86 assembly code (cdecl)
OBJECTS = func.o main.o
CC = gcc
CFLAGS = -std=c11 -m32 -Wall -Wextra -Werror -c
AS = nasm
ASFLAGS = -f elf
all: $(OBJECTS)
gcc -m32 $(OBJECTS) -o main
run: all
@killercup
killercup / pandoc.css
Created July 3, 2013 11:31
Add this to your Pandoc HTML documents using `--css pandoc.css` to make them look more awesome. (Tested with Markdown and LaTeX.)
/*
* I add this to html files generated with pandoc.
*/
html {
font-size: 100%;
overflow-y: scroll;
-webkit-text-size-adjust: 100%;
-ms-text-size-adjust: 100%;
}