Hadley Wickham hadley

hadley /
Last active Oct 13, 2021
A lighthearted and overly flattering comparison of R to other programming languages: what other comparisons have I missed?

R is like:

  • Clojure, because most objects are immutable
  • Scala, because it combines functional and OO techniques
  • Node.js, because the interpreter is single threaded
  • PHP, because it favours pragmatism over purity
  • Lisp, because it's homoiconic
  • Perl, because OO is (mostly) implemented using the language itself
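
Two of these comparisons are easy to see at the console. The snippet below is a small illustrative sketch, not part of the original list; the variable names `x`, `y`, and `expr` are made up for the example. It shows copy-on-modify behaviour (the practical meaning of "most objects are immutable") and code being treated as data (homoiconicity).

```r
# "Most objects are immutable": modifying y does not change x,
# because R copies the vector when it is modified.
x <- c(1, 2, 3)
y <- x
y[1] <- 100
x
y

# "Homoiconic": a quoted expression is an ordinary R object
# that can be inspected and rewritten like a list.
expr <- quote(1 + 2)
class(expr)                  # a "call" object
as.list(expr)                # the function name followed by its arguments
expr[[1]] <- as.name("*")    # swap `+` for `*`
result <- eval(expr)         # evaluates 1 * 2
result
```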
hadley / shiny-oauth.r
Last active Oct 6, 2021
Sketch of shiny + oauth
View shiny-oauth.r
# OAuth setup --------------------------------------------------------
# Most OAuth applications require that you redirect to a fixed and known
# set of URLs. Many only allow you to redirect to a single URL: if this
# is the case for you, you'll need to create an app for testing with a localhost
# url, and an app for your deployed app.
View lazy.frame.R
# lazy data frame
# like idata.frame, but exploits lazy evaluation + macro code generation
# instead of active bindings
lazy.frame <- function(df, enclos=parent.frame(), ...) UseMethod("lazy.frame")
lazy.frame.lazy.frame <- function(df, ...) df
View dplyr-summarise.R
# What's the most natural way to express this code in base R?
library(dplyr, warn.conflicts = FALSE)
mtcars %>%
  group_by(cyl) %>%
  summarise(mean = mean(disp), n = n())
#> # A tibble: 3 x 3
#> cyl mean n
#> <dbl> <dbl> <int>
#> 1 4 105. 11
#> 2 6 183. 7
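
One possible base R answer to the question, offered as a sketch rather than as the author's preferred solution: split `disp` by `cyl`, then summarise each piece. The names `by_cyl` and `res` are introduced here for illustration only.

```r
# Split the displacement values into one vector per cylinder count.
by_cyl <- split(mtcars$disp, mtcars$cyl)

# Build one row per group: the group key, its mean, and its size.
res <- data.frame(
  cyl  = as.numeric(names(by_cyl)),
  mean = vapply(by_cyl, mean, numeric(1)),
  n    = lengths(by_cyl),
  row.names = NULL
)
res
```

`aggregate()` or `tapply()` would also work; `split()` plus `vapply()` is used here because it keeps the mean and the count in a single pass over the groups and makes the return types explicit.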
View tidy-cocktails.R
# Code for quick exploration of
# Video at
cocktails <- readr::read_csv("boston_cocktails.csv")
# Are name and row_id equivalent? -----------------------------------------
View git-path.r
git_prompt <- function() {
  git <- find_git()
  if (is.null(git)) stop("Git not installed", call. = FALSE)
  if (!in_git_repo(git)) stop("Not in git repo", call. = FALSE)
  update <- function(...) update_prompt(git)
hadley /
Created Feb 13, 2015
Advice for teaching an R workshop

I think the two most important messages that people can get from a short course are:

a) the material is important and worthwhile to learn (even if it's challenging), and b) it's possible to learn it!

For those reasons, I usually start by diving as quickly as possible into visualisation. I think it's a bad idea to start by explicitly teaching programming concepts (like data structures), because the payoff isn't obvious. If you start with visualisation, the payoff is really obvious and people are more motivated to push past any initial teething problems. In stat405, I used to start with some very basic templates that got people up and running with scatterplots and histograms - they wouldn't necessarily understand the code, but they'd know which bits could be varied for different effects.

Apart from visualisation, I think the two most important topics to cover are tidy data (i.e. + tidyr) and data manipulation (dplyr). These are both important for when people go off and apply

hadley /
Created May 31, 2018
Walk through of R's condition handler C code

Registering handlers

The key C function that powers both tryCatch() and withCallingHandlers() is do_addCondHands(). It creates handler objects with mkHandlerEntry() and then stores them in the handler stack for the current frame. (More precisely, it writes to R_HandlerStack, a global variable that is an alias for c->handlerstack.)

The five R arguments to do_addCondHands() are classes, handlers, parentenv, target, and calling. These are combined with a result object (a list of length 4, returned by the exiting handler to doTryCatch()) to create the handler objects which have five components:

  • The class, accessed with ENTRY_CLASS(e). A string giving a class name; the handler will match all conditions that contain this component in their class vector.
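
Seen from the R side, the class matching and the exiting/calling distinction described above look roughly like this. This is a minimal sketch: the condition class `my_condition` and the variables `cond`, `exiting_result`, `calling_result`, and `seen` are made up for illustration and do not appear in the C code.

```r
# A custom condition whose class vector contains two entries.
cond <- structure(
  list(message = "hello", call = NULL),
  class = c("my_condition", "condition")
)

# Exiting handler (tryCatch): matched on "my_condition"; control unwinds
# to tryCatch(), so the rest of the body never runs.
exiting_result <- tryCatch(
  {
    signalCondition(cond)
    "not reached"
  },
  my_condition = function(c) "caught by tryCatch"
)

# Calling handler (withCallingHandlers): matched on "condition", which is
# also in the class vector. The handler runs, then the body continues.
seen <- character()
calling_result <- withCallingHandlers(
  {
    signalCondition(cond)
    "body finished"
  },
  condition = function(c) seen <<- c(seen, conditionMessage(c))
)

exiting_result
calling_result
seen
```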
View clustergram-had.r
library(plyr)  # provides ldply()
ks.default <- function(rows) seq(2, max(3, rows %/% 4))
many_kmeans <- function(x, ks = ks.default(nrow(x)), ...) {
  # Run kmeans once per value of k and stack the cluster assignments
  ldply(seq_along(ks), function(i) {
    cl <- kmeans(x, centers = ks[i], ...)
    data.frame(obs = seq_len(nrow(x)), i = i, k = ks[i], cluster = cl$cluster)
  })
}
all_hclust <- function(x, ks = ks.default(nrow(x)), point.dist = "euclidean", cluster.dist = "ward") {
hadley /
Created Sep 27, 2013
My first stab at a basic R programming curriculum. I think teaching just these topics without overall motivating examples would be extremely boring, but if you're a self-taught R user, this might be useful to help spot your gaps.


  • I've tried to break it up into separate pieces, but it's not always possible: e.g. knowledge of data structures and subsetting are tightly intertwined.

  • Level of Bloom's taxonomy listed in square brackets, e.g. Few categories currently assess components higher in the taxonomy.

Programming R curriculum

Data structures