Skip to content

Instantly share code, notes, and snippets.

@jameshfisher
jameshfisher / trie.go
Created May 15, 2010 19:05
A trie structure, storing []byte elements, implemented in Go
package trie
import (
"container/vector"
"sort"
)
// A 'set' structure, that can hold []byte objects.
// For any one []byte instance, it is either in the set or not.
type Trie struct {
@spicycode
spicycode / tmux.conf
Created September 20, 2011 16:43
The best and greatest tmux.conf ever
# 0 is too far from ` ;)
set -g base-index 1
# Automatically set window title
set-window-option -g automatic-rename on
set-option -g set-titles on
#set -g default-terminal screen-256color
set -g status-keys vi
set -g history-limit 10000
@erikreagan
erikreagan / mac-apps.md
Created August 4, 2012 19:18
Mac developer must-haves

Mac web developer apps

This gist's comment stream is a collection of webdev apps for OS X. Feel free to add links to apps you like, just make sure you add some context to what it does — either from the creator's website or your own thoughts.

— Erik

@naholyr
naholyr / _service.md
Created December 13, 2012 09:39
Sample /etc/init.d script

Sample service script for debianoids

Look at LSB init scripts for more information.

Usage

Copy to /etc/init.d:

# replace "$YOUR_SERVICE_NAME" with your service's name (whenever it's not enough obvious)
@rokirx
rokirx / fp_resources.md
Last active December 27, 2015 09:19
Functional Programming Learning Resources
@bjorgvino
bjorgvino / yosemite ntfs read+write.txt
Last active December 2, 2021 03:58
osxfuse + ntfs-3g + Yosemite = NTFS R/W
Remove osxfuse if installed via homebrew:
> brew uninstall osxfuse
Install osxfuse binary and choose to install the MacFUSE compatibility layer:
http://sourceforge.net/projects/osxfuse/files/latest/download?source=files
Reboot (optional but recommended by osxfuse)
Install ntfs-3g via homebrew:
> brew update && brew install ntfs-3g
@JeffBelback
JeffBelback / docker-destroy-all.sh
Last active May 25, 2024 20:19
Destroy all Docker Containers and Images
#!/bin/bash
# Stop all containers
containers=`docker ps -a -q`
if [ -n "$containers" ] ; then
docker stop $containers
fi
# Delete all containers
containers=`docker ps -a -q`
if [ -n "$containers" ]; then
docker rm -f -v $containers

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and followup large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology "instruction fine tuning", learning to immitate human written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argumment which not only supports the case of RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much