Jared Knowles jknowles

@thomasp85
thomasp85 / trim_model.R
Created October 24, 2017 07:26
Trim all unnecessary data from model objects
library(future)
trim_model <- function(model, predictor = predict, ..., ignore_warnings = TRUE) {
# Cache the correct output
true_pred <- predictor(model, ...)
# Treat prediction warnings as errors?
if (!ignore_warnings) {
old_ops <- options(warn = 2)
on.exit(options(old_ops))
}
@alexhanna
alexhanna / social-science-programming.md
Last active March 14, 2024 11:05
Notes on social science programming principles
  1. Code and Data for the Social Sciences: A Practitioner’s Guide, Gentzkow and Shapiro.
  2. Good enough practices in scientific computing, Wilson et al.
  3. Best Practices for Scientific Computing, Wilson et al.
  4. Principled Data Processing, Patrick Ball.
  5. The Plain Person’s Guide to Plain Text Social Science, Healy.
  6. Avoiding technical debt in social science research, Toor.
@kdkorthauer
kdkorthauer / RstudioServerSetup.sh
Created October 7, 2016 15:04
Bash script to set up R, install a few R packages, and get Rstudio Server running on ubuntu.
sudo sh -c 'echo "deb http://cran.rstudio.com/bin/linux/ubuntu trusty/" >> /etc/apt/sources.list'
gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
gpg -a --export E084DAB9 | sudo apt-key add -
sudo apt-get update
sudo apt-get -y install r-base libapparmor1 libcurl4-gnutls-dev libxml2-dev libssl-dev gdebi-core
sudo apt-get -y install libcairo2-dev
sudo apt-get -y install libxt-dev
sudo apt-get -y install git-core
sudo /bin/dd if=/dev/zero of=/var/swap.1 bs=1M count=1024
@lecy
lecy / datausa_census_api.md
Last active August 29, 2022 14:54
Building Census Dataset in R Using datausa.io API

Using the dataUSA.io API for Census Data in R

This gist contains some notes on constructing a query for census and economic data from the DataUSA.io site. This is a quick-start guide to their API; for in-depth documentation check out their API documentation.

A great way to learn how to structure a query is to visit a specific datausa.io page and click on the "Options" button on top of any graph, then select "API" to see the query syntax that created the graph.

Analytics

Example Use

@carlbfrederick
carlbfrederick / survProbs.coxme.R
Last active September 2, 2015 14:58
I wrote these functions to calculate survival probabilities at various levels of the random effect estimate from a coxme.object. The code is heavily adapted from survfit.coxph(). It should work, but all errors and inelegant hacks I have introduced are certainly my own doing. Comments/improvements welcome, enjoy!
#Internal Functions
MYagsurv <- function(y, x, wt, risk, survtype=3, vartype=3) {
nvar <- ncol(as.matrix(x))
status <- y[, ncol(y)]
dtime <- y[, ncol(y) - 1]
death <- (status == 1)
time <- sort(unique(dtime))
nevent <- as.vector(rowsum(wt * death, dtime))
ncens <- as.vector(rowsum(wt * (!death), dtime))
wrisk <- c(wt * risk) #Had to add c() to remove the dimnames so that the multiplication later would work
@ryanwitt
ryanwitt / gist:2911560
Created June 11, 2012 17:46
Confusion matrix for a logistic glm model in R. Helpful for comparing glm to randomForests.
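confusion.glm <- function(data, model) {
  # Classify as TRUE when the predicted probability exceeds 0.5
  prediction <- predict(model, data, type = 'response') > 0.5
  # Rows are predictions, columns are observed outcomes; model$y is the
  # training response, so `data` must be the data the model was fit on
  confusion <- table(prediction, as.logical(model$y))
  # Append the share of each observed class that was misclassified
  confusion <- cbind(confusion, c(1 - confusion[1,1]/(confusion[1,1]+confusion[2,1]), 1 - confusion[2,2]/(confusion[2,2]+confusion[1,2])))
  confusion <- as.data.frame(confusion)
  names(confusion) <- c('FALSE', 'TRUE', 'class.error')
  confusion
}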
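
The function above indexes a 2 x 2 table directly, so it stops with a subscript error when every prediction falls in a single class. The sketch below shows one way to keep the table well-defined in that case by fixing the factor levels up front. The helper name `confusion_from_probs`, its `threshold` argument, and the toy vectors are hypothetical and not part of the gist.

```r
# Hypothetical helper (not from the original gist): builds the confusion
# counts from explicit vectors of predicted probabilities and observed outcomes.
confusion_from_probs <- function(prob, actual, threshold = 0.5) {
  # Fixed levels keep the table 2 x 2 even if one class is never predicted
  predicted <- factor(prob > threshold, levels = c("FALSE", "TRUE"))
  observed  <- factor(as.logical(actual), levels = c("FALSE", "TRUE"))
  counts <- table(predicted = predicted, observed = observed)
  # Share of each observed class that was misclassified (column-wise error)
  class.error <- 1 - diag(counts) / colSums(counts)
  list(counts = counts, class.error = class.error)
}

# Toy data, made up for illustration only
prob   <- c(0.1, 0.4, 0.6, 0.9, 0.2, 0.8)
actual <- c(0, 1, 1, 1, 0, 0)
result <- confusion_from_probs(prob, actual)
print(result$counts)
print(result$class.error)
```

For a fitted logistic model, `prob` could come from `predict(model, data, type = 'response')` and `actual` from the response column of the same `data`, which avoids relying on `model$y` matching the rows being predicted.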