Skip to content

Instantly share code, notes, and snippets.

View malcolmgreaves's full-sized avatar

Malcolm Greaves malcolmgreaves

View GitHub Profile

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and followup large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology "instruction fine tuning", learning to immitate human written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argumment which not only supports the case of RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

Recovering deleted files in Ubuntu with ext4 filesystem

Recently, I deleted some files by mistake in a Ubuntu machine with an ext4 fs. These notes document the steps I took to get them back.

Important

  • this procedure assumes that the partition that contained the deleted files is different from the root partition, as that was the scenario with which I had to deal (deleted files were in my home dir). The procedure needs that the partition that contained the files is unmounted, so if the deleted files were in the root partition, the process would be a bit different (e.g. storing the fs journal in a USB stick, using a live CD/USB to boot and issue the commands, etc.)
  • if something is not clear, you need more information, etc. check the sources below

With that out the way, let's begin.

@tsiege
tsiege / The Technical Interview Cheat Sheet.md
Last active April 16, 2024 23:05
This is my technical interview cheat sheet. Feel free to fork it or do whatever you want with it. PLEASE let me know if there are any errors or if anything crucial is missing. I will add more links soon.

ANNOUNCEMENT

I have moved this over to the Tech Interview Cheat Sheet Repo and has been expanded and even has code challenges you can run and practice against!






\

@andreyvit
andreyvit / tmux.md
Created June 13, 2012 03:41
tmux cheatsheet

tmux cheat sheet

(C-x means ctrl+x, M-x means alt+x)

Prefix key

The default prefix is C-b. If you (or your muscle memory) prefer C-a, you need to add this to ~/.tmux.conf:

remap prefix to Control + a

@santisbon
santisbon / Search my gists.md
Last active April 14, 2024 17:24
How to #search gists

Enter this in the search box along with your search terms:

Get all gists from the user santisbon.
user:santisbon

Find all gists with a .yml extension.
extension:yml

Find all gists with HTML files.
language:html

@rob-murray
rob-murray / add_intellij_launcer
Last active April 10, 2024 15:42
Add Intellij launcher shortcut and icon for ubuntu
// create file:
sudo vim /usr/share/applications/intellij.desktop
// add the following
[Desktop Entry]
Version=13.0
Type=Application
Terminal=false
Icon[en_US]=/home/rob/.intellij-13/bin/idea.png
Name[en_US]=IntelliJ
@yossorion
yossorion / what-i-wish-id-known-about-equity-before-joining-a-unicorn.md
Last active April 7, 2024 22:55
What I Wish I'd Known About Equity Before Joining A Unicorn

What I Wish I'd Known About Equity Before Joining A Unicorn

Disclaimer: This piece is written anonymously. The names of a few particular companies are mentioned, but as common examples only.

This is a short write-up on things that I wish I'd known and considered before joining a private company (aka startup, aka unicorn in some cases). I'm not trying to make the case that you should never join a private company, but the power imbalance between founder and employee is extreme, and that potential candidates would

@LeCoupa
LeCoupa / bash-cheatsheet.sh
Last active March 31, 2024 11:57
Bash CheatSheet for UNIX Systems --> UPDATED VERSION --> https://github.com/LeCoupa/awesome-cheatsheets
#!/bin/bash
#####################################################
# Name: Bash CheatSheet for Mac OSX
#
# A little overlook of the Bash basics
#
# Usage:
#
# Author: J. Le Coupanec
# Date: 2014/11/04
@Chandler
Chandler / slack_history.py
Last active March 26, 2024 14:35
Download Slack Channel/PrivateChannel/DirectMessage History
print("UPDATE AUG 2023: this script is beyond old and broken")
print("You may find interesting and more up to date resources in the comments of the gist")
exit()
from slacker import Slacker
import json
import argparse
import os
# This script finds all channels, private channels and direct messages
@gruber
gruber / Liberal Regex Pattern for All URLs
Last active March 20, 2024 20:28
Liberal, Accurate Regex Pattern for Matching All URLs
The regex patterns in this gist are intended to match any URLs,
including "mailto:foo@example.com", "x-whatever://foo", etc. For a
pattern that attempts only to match web URLs (http, https), see:
https://gist.github.com/gruber/8891611
# Single-line version of pattern:
(?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))