Skip to content

Instantly share code, notes, and snippets.

View marclove's full-sized avatar

Marc Love marclove

View GitHub Profile
@ChrisHayduk
ChrisHayduk / merge_qlora_with_quantized_model.py
Last active April 10, 2024 00:36
Merging QLoRA weights with quantized model
"""
The code below combines approaches published by both @eugene-yh and @jinyongyoo on Github.
Thanks for the contributions guys!
"""
import torch
import peft
@the-crypt-keeper
the-crypt-keeper / test.py
Last active September 27, 2023 01:55
llama2 chat prompt format reverse engineering
#
# this is adapted from https://github.com/facebookresearch/llama/blob/main/llama/generation.py#L213
# the tokenizer is replaced with ord() to make it easier to see whats actually happening
from typing_extensions import TypedDict, Literal
from typing import List, Optional
Role = Literal["system", "user", "assistant"]
class Message(TypedDict):
@jaretburkett
jaretburkett / Humans v1 - Token Counts
Created June 27, 2023 02:28
Humans v1 - Token Counts
This file has been truncated, but you can view the full file.
smiling mouth revealing white straight teeth - 24426
anxious expression with biting lower lip - 17012
shallow depth of field - 16806
early childhood age - 14067
social worker - 12566
smiling mouth revealing slightly crooked teeth - 12329
broad grin revealing straight white teeth - 11336
pediatrician - 11212
preschooler age - 10873
headshot - 10462

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and followup large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology "instruction fine tuning", learning to immitate human written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argumment which not only supports the case of RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

@moyix
moyix / CodeGen_GPTJ_Conversion.md
Last active January 5, 2024 12:50
How to convert the SalesForce CodeGen models to GPT-J

Using Linear Algebra to Convert a Large Code Model

Background

The SalesForce CodeGen models are a family of large language models trained on a large amount of natural language data and then fine-tuned on specialized datasets of code. Models of size 350M, 2B, 6B, and 16B parameters are provided in three flavors:

  • nl, the base model trained on The Pile, a large natural language dataset compiled by EleutherAI
  • multi, which is fine-tuned from the nl model on a dataset of code in multiple languages, scraped from GitHub, and
  • mono, which is fine-tuned from the multi model on Python code only.
@tomholford
tomholford / install_pg_gem.md
Last active April 18, 2024 20:37
Install postgresql gem `pg` on macOS

Installing pg gem on macOS

If you're trying to install the postgresql gem pg and it is failing with the following error message:

Installing pg 1.2.3 with native extensions
Gem::Ext::BuildError: ERROR: Failed to build gem native extension.

    current directory: ~/.rbenv/versions/3.0.0/lib/ruby/gems/3.0.0/gems/pg-1.2.3/ext
~/.rbenv/versions/3.0.0/bin/ruby -I ~/.rbenv/versions/3.0.0/lib/ruby/3.0.0 -r ./siteconf20210125-97201-pycpo.rb extconf.rb
@raorao
raorao / pr-comment-emojis.md
Last active March 27, 2024 23:21
PR Comment Emojis

Any top-level comment on pull request ought be tagged with one of four emojis:

  • for a non-blocking comment that asks for clarification. The pull request author must answer the question before the pull request is merged, but does not have to wait for the comment author to re-review before merging.

  • 🎨 for a non-blocking comment that proposes a refactor or cleanup. The pull request author does not have to address the comment for the pull request to merge.

  • ⚠️ for a blocking comment that must be addressed before the pull request can merge. The comment's author should leave a Request Changes review, and is responsible for re-reviewing once the pull request author has addressed the issue.

  • 😻 for a comment that compliments the author for their work.

@laser
laser / 0a.hs
Last active April 13, 2023 14:06
An Introduction to ADTs and Structural Pattern Matching in TypeScript
data Failable t e = Success t | Failure e
@augustj
augustj / gist:4f2e0db660aec78150fa
Created November 25, 2014 23:42
A script to build, sign and upload to TestFlight
#!/bin/bash
#
# testflightapp.com tokens
# This one is per user (and is August's)
API_TOKEN="XXX"
TEAM_TOKEN="XXX"
PRODUCT_NAME="myAppName"