Skip to content

Instantly share code, notes, and snippets.

View mrdwab's full-sized avatar

Ananda Mahto mrdwab

View GitHub Profile
@mrdwab
mrdwab / print.cintanotes.R
Created February 26, 2013 09:48
Reads the sqlite database created by CintaNotes.
print.cintanotes <- function(x, ...) {
for (i in 1:nrow(x)) {
cat("#", x[i, 2], "\n\n")
cat(paste("Posted on *", x[i, 4], "*\n\n", sep = ""))
cat(x[i, 3], "\n\n")
if (x[i, 5] != "") {
cat("Links: ", x[i, 5], "\n\n")
}
if (!is.na(x[i, 6])) {
cat("Tags: ", x[i, 6], "\n\n")
@mrdwab
mrdwab / data.table.message.R
Last active December 14, 2015 11:38
Error message from using data.table on a very wide dataset
rm(dtt) ## Make sure `dtt` isn't there to begin with
set.seed(1)
LLLL <- apply(expand.grid(LETTERS, LETTERS[10:15], LETTERS[1:20], LETTERS[1:5], stringsAsFactors=FALSE), 1, paste0, collapse="")
size <- 1000
dateSamples <- 1500
startDate <- as.Date("1980-01-01")
@mrdwab
mrdwab / myAgg.R
Created March 9, 2013 11:25
Version of `aggregate` where the function name is appended to the aggregated variable's name.
myAgg <- function (formula, data, FUN, ..., subset, na.action = na.omit)
{
if (missing(formula) || !inherits(formula, "formula"))
stop("'formula' missing or incorrect")
if (length(formula) != 3L)
stop("'formula' must have both left and right hand sides")
m <- match.call(expand.dots = FALSE)
if (is.matrix(eval(m$data, parent.frame())))
m$data <- as.data.frame(data)
m$... <- m$FUN <- NULL
@mrdwab
mrdwab / read.so.R
Last active December 14, 2015 18:19
Read pasted output at Stack Overflow into R
read.so <- function(sep = "", header = TRUE, out = "mydf") {
OS <- ifelse(Sys.info()["sysname"] == "Darwin", "Mac", "others")
temp <- switch(
OS,
Mac = {
suppressWarnings(
read.table(text = gsub("^#", "", pipe("pbpaste")), header = header,
stringsAsFactors = FALSE, sep = sep))
},
others = {
@mrdwab
mrdwab / list.depth.R
Created March 13, 2013 10:22
Recursively determine how deep a list is nested in R
list.depth <- function(this, thisdepth = 0) {
# http://stackoverflow.com/a/13433689/1270695
if(!is.list(this)) {
return(thisdepth)
} else {
return(max(unlist(lapply(this, list.depth, thisdepth = thisdepth+1))))
}
}
@mrdwab
mrdwab / parse_so_rep_page.R
Created March 25, 2013 08:27
Copy the data at stackoverflow.com/reputation and read it into R to get a nice data.frame from which you can do fun stuff.
parse_so_rep_page <- function(rep_file) {
# Authors: Paul Hiemstra, Ananda Mahto
all_data <- readLines(rep_file)
all_data <- all_data[-1]
## Deal with bonuses
all_data <- gsub("-- bonuses\\s+(.*)", " 99 NA \\1", all_data)
date_entries <- grep("^-", all_data)
actions_per_day <- c(date_entries[1], diff(date_entries)) - 1
@mrdwab
mrdwab / SOPageLoaderHelper.R
Created March 30, 2013 04:55
Functions to get question and answer sets from Stack Overflow, and to extract the code blocks from those questions.
#'Download a Stack Overflow question page as a character vector
#'
#'Helper function for other Stack Overflow related functions. Downloads the
#'page as a character vector and extracts the portion that is needed for other
#'functions.
#'
#'@param qid The numeric question ID.
#'@return A character vector
#'
#'@author Ananda Mahto
@mrdwab
mrdwab / readClip.R
Last active December 16, 2015 13:39
Rough concept of `readSOfwf`
readClip <- function(){
OS <- Sys.info()["sysname"]
cliptext <- switch(OS,
Darwin = {
con <- pipe("pbpaste")
text <- readLines(con)
close(con)
text
},
uReshape <- function(data, id.vars, var.stubs, sep) {
vGrep <- Vectorize(grep, "pattern", SIMPLIFY = FALSE)
temp <- names(data)[names(data) %in%
unlist(vGrep(var.stubs, names(data),
value = TRUE))]
if (sep == "NoSep") {
x <- NoSep(temp, ...)
} else {
x <- do.call(rbind.data.frame,
@mrdwab
mrdwab / uReshape.R
Last active March 3, 2020 06:10
`reshape()` for "unbalanced" datasets.
uReshape <- function(data, id.vars, var.stubs, sep) {
# vectorized version of grep
vGrep <- Vectorize(grep, "pattern", SIMPLIFY = FALSE)
# Isolate the columns starting with the var.stubs
temp <- names(data)[names(data) %in% unlist(vGrep(var.stubs, names(data), value = TRUE))]
# Split the vector and reasemble into a data.frame
x <- do.call(rbind.data.frame, strsplit(temp, split = sep))
names(x) <- c("VAR", paste(".time", 1:(length(x)-1), sep = "_"))