OmaymaS

## RLHF.md

      
JoaoLages / RLHF.md
Last active December 11, 2024 04:17 (1 file, 8 forks, 39 comments, 120 stars)
Reinforcement Learning from Human Feedback (RLHF) - a simplified explanation

Maybe you've heard about this technique but you haven't completely understood it, especially the PPO part. This explanation might help.
We will focus on text-to-text language models 📝, such as GPT-3, BLOOM, and T5. Models like BERT, which are encoder-only, are not addressed.
Reinforcement Learning from Human Feedback (RLHF) has been successfully applied in ChatGPT, hence its major increase in popularity. 📈
RLHF is especially useful in two scenarios 🌟:

- You can’t create a good loss function
  - Example: how do you calculate a metric to measure if the model’s output was funny?
- You want to train with production data, but you can’t easily label your production data


## R_datasets.md

      
zross / R_datasets.md
Last active March 30, 2020 05:50 (1 file, 2 forks, 0 comments, 22 stars)
Easy sample data available in R packages (and related)

Eurostat, World Bank and others: https://ikashnitsky.github.io/2017/data-acquisition-two/
Star Wars data: in dplyr, http://dplyr.tidyverse.org/reference/starwars.html
Baby names data: babynames package, https://cran.r-project.org/web/packages/babynames/index.html
Movies data: ggplot2movies package, https://cran.r-project.org/web/packages/ggplot2movies/index.html
Game of Thrones screen time: https://github.com/Preetish/GoT_screen_time
Open Bike Data: https://github.com/ropensci/bikedata
Tons of data through 538: https://cran.r-project.org/web/packages/fivethirtyeight/vignettes/fivethirtyeight.html
Public health data England: fingertipsR, https://cran.r-project.org/web/packages/fingertipsR/
Financial data via Quandl: https://www.quandl.com/tools/r
Cyclones: https://github.com/ropensci/rrricanesdat


## gist:91a829ea21550a7a7d9469220a7c2f73
[core]
	excludesfile = ~/.gitignore_global
	pager = diff-so-fancy | less --tabs=4 -RFX

[difftool "sourcetree"]
	cmd = opendiff \"$LOCAL\" \"$REMOTE\"
	path =

[alias]
    aa = add --all

## online_google_auth.r
## GUIDE TO OAuth2 Authentication in R Shiny (or other online apps)
##
## Mark Edmondson 2015-02-16 - @HoloMarkeD | http://markedmondson.me
##
## v 0.1
##
##
## Go to the Google API console and activate the APIs you need: https://code.google.com/apis/console/?pli=1
## Get your client ID and client secret for use below, and add your app's URL to the redirect URIs
##  e.g. I put in https://mark.shinyapps.io/ga-effect/ for the GA Effect app,

## introrx.md

      
staltz / introrx.md
Last active December 20, 2024 15:49 (7 files, 2514 forks, 474 comments, 21991 stars)

The introduction to Reactive Programming you've been missing

(by @andrestaltz)

This tutorial as a series of videos

If you prefer to watch video tutorials with live-coding, then check out this series I recorded with the same contents as in this article: Egghead.io - Introduction to Reactive Programming.


## knitr_defaults.R
# My preferred defaults (may be changed in individual chunks)
library(knitr)  # provides opts_chunk
opts_chunk$set(tidy=FALSE, warning=FALSE, message=FALSE, cache=TRUE,
               comment=NA, verbose=TRUE, fig.width=6, fig.height=4)

# Name the cache path and fig.path based on filename...
# The dot is escaped and the pattern anchored so only a trailing ".Rmd" is stripped.
# knit_concord is a knitr internal (hence `:::`) and may change between versions.
input_name <- gsub("\\.Rmd$", "", knitr:::knit_concord$get('infile'))
opts_chunk$set(fig.path = paste0("figure/", input_name, "-"),
               cache.path = paste0(input_name, "/"))