@inkhorn
inkhorn / casino.geo.r
Last active October 30, 2023 12:44
not in my backyard casino analysis
library(ff)
library(ggthemes)
ffload(file="casino", overwrite=TRUE)
casino.orig$Outside.of.Toronto = as.ff(ifelse(casino.orig[,"City"] == "Toronto",0,1))
casino.in.toronto = glm(casino.orig[,"Q6"] == "City of Toronto" ~ Outside.of.Toronto, data=casino.orig, family=binomial(logit))
casino.outside.toronto = glm(casino.orig[,"Q6"] == "Adjacent Municipality" ~ Outside.of.Toronto, data=casino.orig, family=binomial(logit))
summary(casino.in.toronto)
@inkhorn
inkhorn / enron processing.py
Last active February 5, 2017 18:12
Script to read, filter, and output all enron emails into many files in one directory
docs = []
from os import listdir, chdir
import re
# Here's my attempt at coming up with regular expressions to filter out
# parts of the enron emails that I deem as useless.
email_pat = re.compile(".+@.+")
to_pat = re.compile("To:.+\n")
@inkhorn
inkhorn / ltep.r
Created December 13, 2013 04:24
LTEP Survey Analysis
ltep = read.csv("ltep-survey-results-all.csv")
library(likert)
library(ggthemes)
# Here I flip the scoring
ltep[,13:19] = sapply(ltep[,13:19], function (x) 8 - x)
deal.w.esources = likert(ltep[,13:19])
summary(deal.w.esources)
plot(deal.w.esources, text.size=6, text.color="black") + theme(axis.text.x=element_text(colour="black", face="bold", size=14), axis.text.y=element_text(colour="black", face="bold", size=14), axis.title.x=element_text(colour="black", face="bold", size=14), plot.title=element_text(size=18, face="bold")) + ggtitle("What guidelines should Ontario use\n for its future mix of energy sources?")
@inkhorn
inkhorn / enron corpus processing v2.py
Created November 5, 2013 03:21
Enron Corpus Processing, version 2
docs = []
from os import listdir, chdir
import re
# Here's the section where I try to filter useless stuff out.
# Notice near the end all of the regex patterns where I've called
# "re.DOTALL". This is pretty key here. What it means is that the
# .+ I have referenced within the regex pattern should be able to
# pick up alphanumeric characters, in addition to newline characters
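The effect of re.DOTALL described in the comment above can be sketched on a small example. This is only an illustration: the sample text and pattern names below are hypothetical and are not taken from the gist or from the Enron corpus. By default "." matches any character except a newline; with re.DOTALL it matches newlines too, so a pattern can span a header that wraps across several lines.

```python
import re

# Hypothetical sample text (an assumption for illustration, not Enron data).
sample = "To: alice@example.com,\n\tbob@example.com\nSubject: hello\nBody text\n"

# Without re.DOTALL, "." does not match "\n", so the pattern cannot
# get from "To:" to "Subject:" when the recipient list wraps.
to_single_line = re.compile(r"To:.+?Subject:")

# With re.DOTALL, "." also matches "\n", so the pattern can span lines.
to_multi_line = re.compile(r"To:.+?Subject:", re.DOTALL)

no_match = to_single_line.search(sample)
match = to_multi_line.search(sample)

# Drop the whole multi-line recipient block, keeping the subject label.
stripped = to_multi_line.sub("Subject:", sample)
print(no_match)
print(repr(stripped))
```

The non-greedy ".+?" matters once re.DOTALL is on: a greedy ".+" would run to the last "Subject:" in the text rather than the first.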
@inkhorn
inkhorn / enron corpus processing.r
Last active December 27, 2015 03:18
Enron Corpus Processing
library(stringr)
library(plyr)
library(tm)
library(tm.plugin.mail)
library(SnowballC)
library(topicmodels)
# At this point, the python script should have been run,
# creating about 126 thousand txt files. I was very much afraid
# to import that many txt files into the tm package in R (my computer only
@inkhorn
inkhorn / daycares.R
Created October 17, 2013 02:18
Daycare Analysis
library(ff)
library(ffbase)
library(RgoogleMaps)
library(plyr)
addTrans <- function(color,trans)
{
# This function adds transparency to a color.
# Define transparency with an integer between 0 and 255
# 0 being fully transparent and 255 being fully visible
@inkhorn
inkhorn / ebike.r
Created September 13, 2013 00:31
E-bike Survey Analysis
library(rpart)
library(plyr)
library(rpart.plot)
ebike = read.csv("E-Bike_Survey_Responses.csv")
# This next part is strictly to change any blank responses into NAs
ebike[,2:10][ebike[,2:10] == ''] = NA
# In this section we use mapvalues from the plyr package to get rid of blanks, but also
@inkhorn
inkhorn / estimate_age.R
Last active December 20, 2015 09:19
Estimate Age from First Name in R
library(stringr)
library(plyr)
# We're assuming you've downloaded the SSA files into your R project directory.
file_listing = list.files()[3:135]
for (f in file_listing) {
year = str_extract(f, "[0-9]{4}")
if (year == "1880") { # Initializing the very long dataframe
name_data = read.csv(f, header=FALSE)
@inkhorn
inkhorn / neither.casino.glm.r
Created May 17, 2013 19:00
neither casino glm
Call:
glm(formula = casino$Q6 == "Neither" ~ GoBigorGoHome + TechnicalDetails +
    Soc.Env.Issues, family = binomial(logit), data = casino)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.4090  -0.7344  -0.3934   0.8966   2.7194
Coefficients:
Estimate Std. Error z value Pr(>|z|)
@inkhorn
inkhorn / adj.mun.cacsino.glm.r
Created May 17, 2013 18:59
adjacent municipality casino glm
Call:
glm(formula = casino$Q6 == "Adjacent Municipality" ~ GoBigorGoHome +
    TechnicalDetails + Soc.Env.Issues, family = binomial(logit),
    data = casino)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.0633  -0.7248  -0.5722  -0.3264   2.7136
Coefficients: