Malcolm Greaves malcolmgreaves

## rl-for-llms.md

      
yoavg / rl-for-llms.md (last active April 11, 2024 21:25)

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of ChatGPT and the follow-up large language models (LLMs), there was a lot of discussion about the importance of "RLHF training", that is, "reinforcement learning from human feedback".
I was puzzled for a while as to why RL (reinforcement learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language-model terminology, "instruction fine-tuning": learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case for RL training but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

  
## deploy_ide_via_notebook_api.sh
export CONTAINER_URI="gcr.io/deeplearning-platform-release/experimental.theia.1-7"
export INSTANCE_NAME=...
export PROJECT_NAME=...
export IMAGE_PROJECT="deeplearning-platform-release"
export IMAGE_FAMILY="theia-container-experimental"
export MACHINE_TYPE=... #"n1-standard-4"
export ZONE=.... #"us-central1-a"
gcloud notebooks instances create "${INSTANCE_NAME}" \
        --project="${PROJECT_NAME}" \
        --location="${ZONE}" \

## start_ide.sh
export CONTAINER_URI="gcr.io/deeplearning-platform-release/experimental.theia.1-7"
export INSTANCE_NAME=...
export PROJECT_NAME=...
export IMAGE_PROJECT="deeplearning-platform-release"
export IMAGE_FAMILY="theia-container-experimental"
export MACHINE_TYPE=... #"n1-standard-4"
export ZONE=... #"us-central1-a"
gcloud compute instances create "${INSTANCE_NAME}" \
        --project="${PROJECT_NAME}" \
        --zone="${ZONE}" \

## run_ner.py
from __future__ import absolute_import, division, print_function

import argparse
import glob
import logging
import os
import random

import numpy as np
import torch

## gist:7f876c6ad4e4adcd36caea98b159b6f6
import torch
from torch_geometric.data import InMemoryDataset


class MyOwnDataset(InMemoryDataset):
    def __init__(self, root, transform=None, pre_transform=None):
        super(MyOwnDataset, self).__init__(root, transform, pre_transform)
        self.data, self.slices = torch.load(self.processed_paths[0])

    @property

## sshfs-gcp-instance-osx.md

      
mollymerp / sshfs-gcp-instance-osx.md (last active January 10, 2024 14:52)

How to mount a GCP compute instance filesystem locally using `sshfs` on macOS

This guide assumes that:

- you already have an instance set up on GCP that you want to mount locally
- the GCP CLI (`gcloud`) is installed on your local machine
- you have authenticated locally to your Google account with `gcloud auth login`

Make sure your `gcloud` config is correct for the instance you're trying to access:


## mode-presto-linter.js
// ==UserScript==
// @name         PrestoDB Linter v0.1.3
// @namespace    http://tampermonkey.net/
// @version      0.1
// @description  try to take over the world!
// @author       Steven Hao
// @match        https://modeanalytics.com/editor/*
// @grant        none
// ==/UserScript==

## pad_packed_demo.py
import torch
from torch import LongTensor
from torch.nn import Embedding, LSTM
from torch.autograd import Variable
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

## We want to run LSTM on a batch of 3 character sequences ['long_str', 'tiny', 'medium']
#
#     Step 1: Construct Vocabulary
#     Step 2: Load indexed data (list of instances, where each instance is list of character indices)
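
The preview above stops after naming the first two steps. As a rough illustration only, Steps 1 and 2 could look like the following pure-Python sketch. The helper names (`build_vocab`, `index_sequences`) and the choice to reserve index 0 for a `<pad>` symbol are assumptions made for this example, not taken from the original script.

```python
# Hypothetical sketch of Step 1 and Step 2; all names here are illustrative.
seqs = ['long_str', 'tiny', 'medium']

PAD = '<pad>'  # assumed padding symbol, reserved at index 0


def build_vocab(sequences):
    """Step 1: collect every distinct character; index 0 is kept for padding."""
    chars = sorted(set(''.join(sequences)))
    return [PAD] + chars


def index_sequences(sequences, vocab):
    """Step 2: convert each string into a list of character indices."""
    char_to_idx = {ch: i for i, ch in enumerate(vocab)}
    return [[char_to_idx[ch] for ch in seq] for seq in sequences]


vocab = build_vocab(seqs)
indexed = index_sequences(seqs, vocab)

# Sequence lengths are what a later packing step would need alongside the padded batch.
lengths = [len(instance) for instance in indexed]
```

Reserving index 0 for padding keeps real characters distinguishable from filler once the shorter sequences are padded to the length of the longest one.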

## download.sh
#!/bin/bash

# Bash function that downloads a file with wget, shows a progress bar, and
# allows re-downloading if interrupted. The filename is determined
# automatically from the supplied URL unless overridden on the command line.
# First argument is the URL.
# Second (optional) argument is the filename.
download () {
    local URL="$1"
    local FI="$2"
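
The preview is cut off before the body of `download`, so how the script actually derives the filename is not shown. As a hedged illustration of the "filename from URL, unless overridden" logic described in the comments, here is a small Python sketch. The function name `resolve_filename` and the decision to raise an error when the URL has no final path component are assumptions for this example only.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def resolve_filename(url, override=None):
    """Return the override if given, else the last path component of the URL.

    Hypothetical helper: mirrors the URL / optional-filename arguments
    described in the script's comments, not its actual implementation.
    """
    if override:
        return override
    # Drop the query string and fragment, then take the final path segment.
    name = posixpath.basename(unquote(urlsplit(url).path))
    if not name:
        raise ValueError(f"cannot infer a filename from {url!r}")
    return name
```

For example, `resolve_filename("https://example.com/data/archive.tar.gz?x=1")` yields the final path segment with the query string ignored, while passing a second argument always wins over the URL.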

## SelfAttention.py
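import torch
from torch import nn
from torch.nn import Parameter


class SelfAttention(nn.Module):
    def __init__(self, attention_size, batch_first=False, non_linearity="tanh"):
        super(SelfAttention, self).__init__()

        self.batch_first = batch_first
        # Learnable attention vector of length attention_size (allocated uninitialized here)
        self.attention_weights = Parameter(torch.FloatTensor(attention_size))
        # Normalize attention scores over the last dimension
        self.softmax = nn.Softmax(dim=-1)

        if non_linearity == "relu":
            self.non_linearity = nn.ReLU()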