~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
{"seed": 374894, "temp": 0.7, "top_p": 0.0, "top_k": 40, "repetition_penalty": 1.1764705882352942, "max_seq_len": 512, "max_gen_len": 511}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Loading
Loaded in 8.72 seconds
============== sample 1 =================
I believe the meaning of life is to grow, learn and give.
@shawwn
shawwn / llama_sizes.txt
Created March 5, 2023 18:07
The size of each file distributed with LLaMA, for reference. See https://github.com/shawwn/llama-dl
./tokenizer_checklist.chk 50
./tokenizer.model 499723
./7B/checklist.chk 100
./7B/consolidated.00.pth 13476939516
./7B/params.json 101
./13B/checklist.chk 154
./13B/consolidated.00.pth 13016334699
./13B/consolidated.01.pth 13016334699
./13B/params.json 101
./30B/checklist.chk 262
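A listing like this can be used to sanity-check a local download. The sketch below is one possible way to do that, not part of the gist: it assumes the numbers are sizes in bytes, and `LISTING`, `parse_listing`, and `check_sizes` are hypothetical names introduced for illustration. Only a few of the small entries above are reproduced.

```python
import os

# A few entries copied from the listing above (assumed to be "path size_in_bytes").
LISTING = """\
./tokenizer_checklist.chk 50
./tokenizer.model 499723
./7B/checklist.chk 100
./7B/params.json 101
"""

def parse_listing(text):
    """Parse 'path size' lines into a {path: size} dict."""
    sizes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last run of whitespace so paths containing spaces still parse.
        path, size = line.rsplit(None, 1)
        sizes[path] = int(size)
    return sizes

def check_sizes(expected, root="."):
    """Return {path: (expected_size, actual_size)} for every mismatch.

    actual_size is None when the file does not exist under root.
    """
    problems = {}
    for path, size in expected.items():
        full = os.path.join(root, path)
        actual = os.path.getsize(full) if os.path.isfile(full) else None
        if actual != size:
            problems[path] = (size, actual)
    return problems

if __name__ == "__main__":
    for path, (want, got) in check_sizes(parse_listing(LISTING)).items():
        print(f"{path}: expected {want} bytes, found {got}")
```

A matching size does not prove the file is intact; it only catches truncated or missing downloads cheaply before running a full checksum.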
@shawwn
shawwn / adamsp.py
Created February 9, 2023 18:44
AdamSP optimizer
def lerp(a, b, t):
    return (b - a) * t + a
@optimizer
def adamsp(step_size=1e-1, b1=0.5):
    """Construct optimizer triple for AdamSP.
    Args:
        step_size: positive scalar, or a callable representing a step size schedule
            that maps the iteration index to a positive scalar (default 1e-1).
@shawwn
shawwn / adam.py
Last active February 15, 2023 19:48
Reformulating Adam optimizer to gain an intuition about what it's doing.
def lerp(a, b, t):
    return (b - a) * t + a
def bias(i, x, beta):
    return 1 - jnp.asarray(beta, x.dtype) ** (i + 1)
@optimizer
def adam(step_size, b1=0.9, b2=0.999, eps=1e-8) -> OptimizerResult:
    """Construct optimizer triple for Adam.
@shawwn
shawwn / hn_ignorance.js
Created January 31, 2023 23:47 — forked from sillysaurus/hn_ignorance.js
HN Ignorance Is Bliss
// ==UserScript==
// @name HN Ignorance Is Bliss
// @description Hide your comment scores and karma counters. See https://news.ycombinator.com/item?id=14456203
// @author sillysaurus3
// @version 1.0
// @match *://news.ycombinator.com/*
// @grant none
// @downloadURL https://gist.githubusercontent.com/sillysaurus/4d917e925548e4c7ec6f6bb96c94ef5c/raw
// @updateURL https://gist.githubusercontent.com/sillysaurus/4d917e925548e4c7ec6f6bb96c94ef5c/raw
// ==/UserScript==
@shawwn
shawwn / hn_ignorance.js
Last active February 1, 2023 00:05 — forked from sillysaurus/hn_ignorance.js
HN Ignorance Is Bliss
// ==UserScript==
// @name HN Ignorance Is Bliss
// @description Hide your comment scores and karma counters. Installation instructions at https://news.ycombinator.com/item?id=14456203
// @author sillysaurus3
// @version 1.0
// @match *://news.ycombinator.com/*
// @grant none
// @downloadURL https://gist.githubusercontent.com/sillysaurus/4d917e925548e4c7ec6f6bb96c94ef5c/raw
// @updateURL https://gist.githubusercontent.com/sillysaurus/4d917e925548e4c7ec6f6bb96c94ef5c/raw
// ==/UserScript==
#!/bin/bash
# wget https://gist.githubusercontent.com/shawwn/88f64f7294c5a2e5e009d277a429ff2e/raw/tpu_setup.sh
# bash tpu_setup.sh
set -x
pip3 install --upgrade pip
# upgrade to nightly jax.
pip3 install --force-reinstall --pre -U -f https://storage.googleapis.com/jax-releases/libtpu_releases.html 'jax[tpu]' 'jaxlib'
pip3 install rich
#pragma once
#include "maybe.hpp"
#include <concepts>
#include <numbers>
#include <limits>
#include <thread>
#include <sstream>
#include <string>
@shawwn
shawwn / demangle.cpp
Created May 29, 2022 22:30
Code for printing out C++ stack traces
/* Copyright 2017 The TensorFlow Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
Process: clangd [61497]
Path: /Applications/CLion.app/Contents/bin/clang/mac/clangd
Identifier: clangd
Version: 0
Code Type: ARM-64 (Native)
Parent Process: clion [21849]
Responsible: clion [21849]
User ID: 501
Date/Time: 2022-05-10 11:06:21.935 +0500