@johnjosephhorton
johnjosephhorton / parseHTMLwithR.R
Created November 11, 2010 16:30
How to parse HTML tables using R
# from: http://learnr.wordpress.com/2010/01/21/ggplot2-crayola-crayon-colours/
library(XML)
library(ggplot2)
theurl <- "http://en.wikipedia.org/wiki/List_of_Crayola_crayon_colors"
html <- htmlParse(theurl)
# Parse every table on the page once, then pick the one holding the colour data
tables <- readHTMLTable(html, stringsAsFactors = FALSE)
# The table index depends on the page layout at the time; inspect names(tables[[2]]) before relying on it
crayola <- tables[[2]]
crayola <- crayola[, c("Hex Code", "Issued", "Retired")]
names(crayola) <- c("colour", "issued", "retired")
johnjosephhorton / toJSONdf.R
Created November 12, 2010 16:28
R function for turning a data frame into JSON suitable for use in Protovis
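# Wrap a value in double quotes so it can be emitted as a JSON string
# (embedded quotes and backslashes are not escaped)
entity.write <- function(x){
  out = as.character(x)
  out = paste("\"", out, "\"", sep="")
  out
}
toJSONdf <- function(x){
  str = "["
  # seq_len() avoids iterating over c(1, 0) when the data frame has no rows
  for(row in seq_len(nrow(x))){
    first_entry = TRUE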
johnjosephhorton / randomization_in_HIT.html
Created November 14, 2010 23:35
Some code for doing simple A/B-type randomization for locally-hosted HITs on Amazon Mechanical Turk (MTurk)
<script>
// Hide one of the two elements at random (roughly 50/50) so each worker sees a single condition
function rand(){
  if (Math.random() > 0.5){
    document.getElementById('boss').style.visibility="hidden";
  } else {
    document.getElementById('requester').style.visibility="hidden";
  }
}
</script>
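The coin flip in this gist can be separated from the DOM so the assignment rule is easy to check on its own. The following is a minimal sketch under stated assumptions: `chooseHidden` and its injectable `random` parameter are hypothetical names that do not appear in the gist, and the `load` listener is only one possible way to trigger the randomization (the gist does not show how `rand()` is called). The element ids `boss` and `requester` are taken from the gist.

```javascript
// Hypothetical helper (not in the original gist): returns the id of the
// element to hide. `random` is injectable so the split can be checked
// without relying on Math.random.
function chooseHidden(random = Math.random) {
  // Same rule as rand(): a draw above 0.5 hides 'boss', otherwise 'requester'.
  return random() > 0.5 ? 'boss' : 'requester';
}

// In a browser, apply the choice once the page has loaded.
// Assumes elements with both ids exist on the page.
if (typeof document !== 'undefined') {
  window.addEventListener('load', function () {
    document.getElementById(chooseHidden()).style.visibility = 'hidden';
  });
}
```

Passing a fixed function, for example `chooseHidden(() => 0.9)`, makes the outcome deterministic, which is convenient when previewing each condition of the HIT.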
johnjosephhorton / mturk_distro.R
Created November 17, 2010 16:04
Code for plotting distribution of HITs/worker on MTurk
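library(ggplot2)
library(reldist)
data <- read.csv("User_3144_workers.csv")
u <- nrow(data)  # number of rows in the worker file
png("mturk_distro.png")
# scale_x_log2() was removed from later ggplot2 releases; a log2 transform on the continuous scale is the equivalent
qplot(Number.of.HITs.approved.or.rejected, data = data) + scale_x_continuous(trans = "log2") +
  xlab("Number of HITs Approved or Rejected \n (John Horton's Account)") +
  ylab("Count of MTurk Workers") +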
johnjosephhorton / jsonp.html
Created November 19, 2010 17:07
JSONP Cats Example
<script src="http://code.jquery.com/jquery-1.4.4.js"></script>
<script>
$.getJSON("http://api.flickr.com/services/feeds/photos_public.gne?jsoncallback=?",
{
tags: "cat",
tagmode: "any",
format: "json"
},
function(data) {
$.each(data.items, function(i,item){
$("<img/>").attr("src", item.media.m).appendTo("#images");
johnjosephhorton / parsing_MTurk_datetime_in_R.R
Created December 1, 2010 21:53
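This is the correct strptime format for parsing the datetime string that MTurk generates
# Name the format argument explicitly; as.POSIXct's second positional argument is tz, not format
accept_time <- sapply(results$AcceptTime,
  function(t){as.numeric(as.POSIXct(as.character(t), format="%a %b %d %H:%M:%S GMT %Y", tz="UTC"))}
)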
johnjosephhorton / internal_HIT.html
Created December 3, 2010 04:26
HTML and JS for an internal HIT posted on MTurk
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>
<script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jqueryui/1.8.6/jquery-ui.min.js"></script>
<p><style type="text/css">
/*
* jQuery UI CSS Framework 1.8.6
*
* Copyright 2010, AUTHORS.txt (http://jqueryui.com/about)
* Dual licensed under the MIT or GPL Version 2 licenses.
* http://jquery.org/license
*
* http://docs.jquery.com/UI/Theming/API
johnjosephhorton / get_rank.R
Created May 21, 2011 21:23
Functional-programming way to extract the within-subject order in which a time-stamped event occurred
# accept_time is in seconds since the Unix epoch
# Note: tied accept times within a worker match several positions, so mapply then returns a list
results$rank <- mapply(
  function(time, worker_id){
    which(time == sort(subset(results, WorkerId == worker_id)$accept_time))
  }, results$accept_time, results$WorkerId)
johnjosephhorton / cases.R
Created May 21, 2011 21:27
Functional-programming way to implement a case (lookup) statement in R
# keys and values map 1:1
# e.g.,
keys <- c(1,2,3)
values <- c('a','b','c')
list_of_keys <- c(1,2,3,1,1,2)
# Named mapped_values rather than letters, which would mask R's built-in letters constant
mapped_values <- sapply(list_of_keys, function(x){values[which(x==keys)]})
johnjosephhorton / unix2POSIXct.R
Created July 20, 2011 16:40
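Convert a Unix epoch timestamp into R's POSIXct date-time class
# time: seconds since 1970-01-01 00:00:00 UTC; the class order c("POSIXct", "POSIXt") follows R >= 2.12.0
unix2POSIXct <- function (time) structure(time, class = c("POSIXct", "POSIXt"))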