Allan Jie allanj

@allanj
allanj / demo_sft_with_accelerate.py
Last active January 11, 2024 13:05
demo_sft_script
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, PreTrainedTokenizerFast, set_seed, AutoModelForCausalLM, AutoConfig
from tqdm import tqdm
import argparse
import torch
import torch.nn as nn
import logging
from typing import Dict, Tuple
from accelerate import Accelerator, DistributedDataParallelKwargs
from accelerate.logging import get_logger
@CharlyWargnier
CharlyWargnier / inspirational_quotes_app.py
Last active March 24, 2024 19:17
A Streamlit app for navigating through inspirational quotes with "Next" and "Previous" buttons.
import streamlit as st
if 'count' not in st.session_state:
    st.session_state.count = 0
if 'quotes' not in st.session_state:
    st.session_state.quotes = [
"Life is what happens when you're busy making other plans. — John Lennon",
"Get busy living or get busy dying. — Stephen King",
"You only live once, but if you do it right, once is enough. — Mae West",

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and follow-up large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology, "instruction fine-tuning", learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case for RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

@allanj
allanj / command.md
Created August 15, 2022 02:36
Useful Command in Linux

Kill processes whose command line contains a certain string

For example, to kill every process whose command line contains python3 -u experiment_main.py:

kill $(ps aux | grep '[p]ython3 -u experiment_main.py' | awk '{print $2}')

Hadoop: list files sorted by date

hdfs dfs -ls / | sort -k6,7
@Mason-McGough
Mason-McGough / pointer_network.py
Last active October 2, 2023 09:55
Pointer network attention architecture in PyTorch
class PointerNetwork(nn.Module):
    """
    From "Pointer Networks" by Vinyals et al. (2017)
    Adapted from pointer-networks-pytorch by ast0414:
    https://github.com/ast0414/pointer-networks-pytorch
    Args:
        n_hidden: The number of features to expect in the inputs.
    """
new_adj = torch.triu(adj, diagonal=1)
# print(new_adj[-1, :10, :10])
new_adj1 = torch.bmm(new_adj, new_adj)
# print(new_adj1[-1, :10, :10])
new_adj_or = torch.clamp((new_adj + new_adj1), max=1)
# print('new_adj_or', new_adj_or[-1, :10, :10])
loop = 1
while not torch.equal(torch.bmm(new_adj_or, new_adj_or), new_adj1):
    new_adj1 = torch.bmm(new_adj_or, new_adj_or)
    new_adj_or = torch.clamp((new_adj_or + new_adj1), max=1)
@ines
ines / Install
Last active September 21, 2023 17:14
Streamlit + spaCy
pip install streamlit
pip install spacy
python -m spacy download en_core_web_sm
python -m spacy download en_core_web_md
python -m spacy download de_core_news_sm
@allanj
allanj / iob1toiob2_funct.py
Last active March 29, 2021 14:37
Convert the tags from IOB1 to IOB2 tagging scheme
"""
IOB1: O I I B I
IOB2: O B I B I
"""
from typing import List
def iob2(tags: List[str]):
    """
    Check that tags have a valid IOB format.
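The preview above is cut off before the conversion logic. As a rough illustration of the IOB1-to-IOB2 conversion it describes, a minimal sketch might look like the following. The function name `iob1_to_iob2` and the handling of typed tags such as `I-PER` are assumptions made for this example, not the gist author's implementation.

```python
from typing import List


def iob1_to_iob2(tags: List[str]) -> List[str]:
    """Return a copy of `tags` converted from IOB1 to IOB2.

    Hypothetical helper for illustration only.
    Accepts bare tags ("O", "I", "B") or typed tags ("I-PER", "B-LOC").
    """
    converted = []
    prev_prefix, prev_type = "O", ""
    for tag in tags:
        if tag == "O":
            converted.append(tag)
            prev_prefix, prev_type = "O", ""
            continue
        prefix, _, ent_type = tag.partition("-")
        if prefix not in ("I", "B"):
            raise ValueError(f"Invalid IOB tag: {tag!r}")
        # In IOB2 every chunk starts with B, so an I that follows O
        # or a chunk of a different type is rewritten as B.
        if prefix == "I" and (prev_prefix == "O" or prev_type != ent_type):
            prefix = "B"
        converted.append(prefix + ("-" + ent_type if ent_type else ""))
        prev_prefix, prev_type = prefix, ent_type
    return converted


print(iob1_to_iob2(["O", "I", "I", "B", "I"]))  # prints ['O', 'B', 'I', 'B', 'I']
```

The printed result matches the IOB1/IOB2 pair given in the gist's module docstring. Tags that are already valid IOB2 pass through unchanged, so the function can be applied to data whose scheme is unknown.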
@allanj
allanj / word2vec_bin2txt.py
Created March 12, 2019 05:51
Convert the word2vec bin file to txt
#
# @author: Allan
#
def convert(input, output):
    from gensim.models.keyedvectors import KeyedVectors
    embedding = KeyedVectors.load_word2vec_format(input, binary=True)
    f = open(output, 'w', encoding='utf-8')
@amit-chahar
amit-chahar / download-script.sh
Last active February 20, 2023 12:57
Script to download files from Google Drive using curl (a detailed explanation is available here: https://stackoverflow.com/a/49444877/4043524)
#!/bin/bash
fileid="FILEIDENTIFIER"
filename="FILENAME"
curl -c ./cookie -s -L "https://drive.google.com/uc?export=download&id=${fileid}" > /dev/null
curl -Lb ./cookie "https://drive.google.com/uc?export=download&confirm=$(awk '/download/ {print $NF}' ./cookie)&id=${fileid}" -o "${filename}"