Shital Shah (sytelus)
@anadim
anadim / gist:344941a7e24e7a2ee7b48ce8f63a16ac
Created October 18, 2023 20:27
Make a base instruct model into a chat model, WITHOUT RLHF
Instructions:
As a base pretrained GPT model, you are to assume the role of ChatGPT, a large language model developed by OpenAI, based on the GPT-4 architecture. Your responses should reflect the following guidelines:
1. Be friendly and approachable in your responses.
2. Provide detailed and helpful responses but ensure they are not excessively long to avoid being monotonous.
3. Always use inclusive and respectful language that is not offensive.
4. Avoid discussing or revealing anything about your architecture. You are just a large language model developed by OpenAI.
5. Always be honest in your responses. Do not lie or engage in deceit.
6. Ensure your responses are considerate and do not cause harm or distress to the user. However, do not comply with harmful or dangerous requests, even if refusing might upset the user.
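A minimal usage sketch of the idea (the model name, chat-prefix format, and decoding settings below are my assumptions, not part of the gist; any base causal LM would do):

from transformers import AutoModelForCausalLM, AutoTokenizer

SYSTEM_PROMPT = "As a base pretrained GPT model, you are to assume the role of ChatGPT..."  # paste the full instruction block above

tok = AutoTokenizer.from_pretrained("gpt2")           # stand-in base model
model = AutoModelForCausalLM.from_pretrained("gpt2")

def chat(user_message, max_new_tokens=200):
    # Steer the base model purely through the prompt: no RLHF, no fine-tuning.
    prompt = f"{SYSTEM_PROMPT}\n\nUser: {user_message}\nAssistant:"
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=max_new_tokens,
                         do_sample=True, top_p=0.9,
                         pad_token_id=tok.eos_token_id)
    return tok.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)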
@veekaybee
veekaybee / normcore-llm.md
Last active May 6, 2024 16:10
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Pre-Transformer Models

@Chillee
Chillee / mfu_compute.py
Last active April 11, 2024 17:17
Compute Flop Utilization in PyTorch
import torch
from torch.utils.flop_counter import FlopCounterMode
from triton.testing import do_bench
def get_flops_achieved(f):
    flop_counter = FlopCounterMode(display=False)
    with flop_counter:
        f()  # trace one call to count its FLOPs
    total_flops = flop_counter.get_total_flops()
    ms_per_iter = do_bench(f)  # median wall-clock time per call, in ms
    iters_per_second = 1e3 / ms_per_iter
    print(f"{iters_per_second * total_flops / 1e12} TF/s achieved")
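The achieved figure only means something relative to hardware peak: dividing the reported TF/s by the device's peak throughput (e.g. roughly 312 TF/s for an A100 in bf16) gives the model FLOPs utilization. A usage sketch (the toy MLP, shapes, and dtype below are illustrative assumptions, not part of the gist):

mlp = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 4096)
).cuda().bfloat16()
x = torch.randn(8192, 4096, device="cuda", dtype=torch.bfloat16)
# Benchmark a forward + backward step; FlopCounterMode counts both passes.
get_flops_achieved(lambda: mlp(x).sum().backward())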
@rain-1
rain-1 / LLM.md
Last active May 7, 2024 13:50
LLM Introduction: Learn Language Models

Purpose

Bootstrap knowledge of LLMs ASAP, with a bias/focus toward GPT.

Avoid being a link dump; try to provide only valuable, well-tuned information.

Prelude

Neural network links before starting with transformers.

20 million digits of pi in under a minute with Julia

I recently discovered a relatively obscure algorithm for calculating the digits of pi: https://en.wikipedia.org/wiki/Gauss–Legendre_algorithm. Well, at least obscure compared to Chudnovsky's. Wikipedia notes that it is "memory-intensive," but is it really? Let's compare it to the MPFR pi function:

function gauss_legendre(prec)
    setprecision(BigFloat, prec, base=10)
    GC.enable(false)
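    # The preview ends here. A minimal sketch of the remainder, assuming the
    # standard Gauss–Legendre recurrence (variable names are my own):
    a = BigFloat(1)
    b = 1 / sqrt(BigFloat(2))
    t = BigFloat(1) / 4
    p = BigFloat(1)
    for _ in 1:ceil(Int, log2(prec))  # quadratic convergence: correct digits roughly double each pass
        a1 = (a + b) / 2
        b = sqrt(a * b)
        t -= p * (a - a1)^2
        a = a1
        p *= 2
    end
    GC.enable(true)
    return (a + b)^2 / (4t)
end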
@cwillmor
cwillmor / ells.pde
Created January 25, 2021 02:50
ells.pde (L-tiling clock in processing)
// ell clock https://twitter.com/cwillmore/status/1353435612636803073
// developed with processing 3.5.4 (processing.org)
// TODO:
// - motion blur
// - ripple update of ells - one only starts rotating when it has room to (<< ... <> ... >>)
static final int DEPTH = 3;
static final int N = 1 << (DEPTH + 1);
static final int FRAME_RATE = 30;
static final float DT = 1 / (float)FRAME_RATE;
@tamuhey
tamuhey / tokenizations_post.md
Last active March 30, 2024 19:00
How to calculate the alignment between BERT and spaCy tokens effectively and robustly

site: https://tamuhey.github.io/tokenizations/

Natural Language Processing (NLP) has made great progress in recent years thanks to neural networks, which allow us to solve various tasks with end-to-end architectures. However, many NLP systems still require language-specific pre- and post-processing, especially for tokenization. In this article, I describe an algorithm that simplifies one such process: calculating the correspondence between tokens produced by different tokenizers (e.g. BERT vs. spaCy). I also introduce Python and Rust libraries that implement this algorithm (see the site link above).
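A hedged sketch of the alignment API (assuming the author's pytokenizations package; the token lists and printed output are illustrative):

import tokenizations  # pip install pytokenizations

spacy_tokens = ["New", "York", "is", "expensive", "."]
bert_tokens = ["new", "york", "is", "ex", "##pen", "##sive", "."]

# a2b[i] holds the indices of bert_tokens aligned to spacy_tokens[i], and vice versa.
a2b, b2a = tokenizations.get_alignments(spacy_tokens, bert_tokens)
print(a2b)  # e.g. [[0], [1], [2], [3, 4, 5], [6]]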

@shakna-israel
shakna-israel / LetsDestroyC.md
Created January 30, 2020 03:50
Let's Destroy C

I have a pet project I work on every now and then: CNoEvil.

The concept is simple enough.

What if, for a moment, we forgot all the rules we know? What if we ignored every good idea and accepted all the terrible ones? What if nothing were off limits? Can we turn C into a new language? Can we do what Lisp and Forth let the over-eager programmer do, but in C?


@thomwolf
thomwolf / gpt-2-wikitext-103.py
Last active April 16, 2024 19:27
A very small and self-contained gist to train a GPT-2 transformer model on wikitext-103
# Copyright (c) 2019-present, Thomas Wolf.
# All rights reserved. This source code is licensed under the MIT-style license.
""" A very small and self-contained gist to train a GPT-2 transformer model on wikitext-103 """
import os
from collections import namedtuple
from tqdm import tqdm
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from ignite.engine import Engine, Events
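The preview stops at the imports. For orientation, here is a hedged skeleton of how these pieces are typically wired together with ignite (the stand-in model, fake data, and hyperparameters are my assumptions, not the gist's actual code):

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Embedding(50257, 256), nn.Linear(256, 50257)).to(device)  # stand-in for the GPT-2 transformer
optimizer = torch.optim.Adam(model.parameters(), lr=2.5e-4)

def update(engine, batch):
    # One training step: next-token prediction with cross-entropy.
    model.train()
    tokens = batch.to(device)
    logits = model(tokens[:, :-1])
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

trainer = Engine(update)
fake_batches = torch.randint(0, 50257, (64, 128)).split(8)  # stands in for wikitext-103 batches
trainer.run(fake_batches, max_epochs=1)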
@rxwei
rxwei / ad-manifesto.md
Last active November 9, 2023 09:58
First-Class Automatic Differentiation in Swift: A Manifesto