Christopher Peters statwonk

View instructions and how-to
=================================================================
SETTING UP SSHD AS A SERVICE FOR RUNNING HADOOP DAEMONS ON WINDOWS 7
=================================================================
Steps:
1. Download 'setup.exe' from Cygwin website
2. Right-click on 'setup.exe'
3. Leave the settings as they are and click through until you reach the plugin selection window
3.1 - Make sure that the installation directory is 'C:\cygwin'
statwonk / gist:5975418
Created Jul 11, 2013
Modularization of monthly statistics scripts.
View gist:5975418
module MonthlyStats
  class MonthTime
    def initialize
      @begin_time = '2013-06-01'
      @end_time = '2013-06-30'
    end
    def count_code_challenges
      puts CodeChallenges.count
    end
View Extract player data
install.packages("RCurl"); install.packages("rjson")
library(RCurl); library(rjson)
df <- fromJSON(getURL("https://raw.github.com/BurntSushi/nflgame/master/nflgame/players.json"))
dataframeFromJSON <- function(l) {
  l1 <- lapply(l, function(x) {
    x[sapply(x, is.null)] <- NA
    unlist(x)
  })
statwonk / gist:9448932
Last active Aug 29, 2015
Reading the FBI data in.
View gist:9448932
df <- read.csv("data.csv",
               header = TRUE,
               stringsAsFactors = FALSE)
df$date_put_on_list <- as.POSIXct(df$date_put_on_list, format = "%m/%d/%Y", tz = "EST")
df$follow_up_date <- as.POSIXct(df$follow_up_date, format = "%m/%d/%Y", tz = "EST")
# Blank follow-up dates parse to NA above, so is.na() alone identifies rows with no follow-up.
df$days_to_capture <- ifelse(is.na(df$follow_up_date),
                             difftime(Sys.time(), df$date_put_on_list, units = "days"),
                             difftime(df$follow_up_date, df$date_put_on_list, units = "days"))
# censor = 1 when a follow-up date exists (event observed), 0 otherwise.
df$censor <- ifelse(!is.na(df$follow_up_date), 1, 0)
View gist:9449792
difftime(Sys.time(), min(df$date_put_on_list[df$censor == 0]), units = "days")
Time difference of 15892.45 days
df$days_to_capture <- ifelse(df$id == 313, 15892.45, df$days_to_capture)
mean(df$days_to_capture[df$days_to_capture > 0 & (df$censor == 1 | df$id == 313)], na.rm = T)
# 435.1818 days
View gist:9450725
library(muhaz)
fit2 <- muhaz(df$days_to_capture,
              df$censor,
              bw.method = "knn")
plot(fit2, xlim = c(0, 365.25),
     ylab = "Hazard Rate",
     xlab = "Days from being added to the list")
View gist:e7e7eed21ded77b4aedb
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 parallel grid splines stats graphics grDevices
[8] utils datasets methods base
statwonk / Catting hosts
Last active Aug 29, 2015
re: connecting workers to master
View Catting hosts
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
View redis_key_sizes.sh
#!/usr/bin/env bash
# This script prints out all of your Redis keys and their size in a human readable format
# Copyright 2013 Brent O'Connor
# License: http://www.apache.org/licenses/LICENSE-2.0
human_size() {
awk -v sum="$1" ' BEGIN {hum[1024^3]="Gb"; hum[1024^2]="Mb"; hum[1024]="Kb"; for (x=1024^3; x>=1024; x/=1024) { if (sum>=x) { printf "%.2f %s\n",sum/x,hum[x]; break; } } if (sum<1024) print "1kb"; } '
}
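As a rough usage sketch for the `human_size` helper above, assuming its argument is a size in bytes (the sample values below are illustrative and not taken from a real Redis instance), the function can be exercised on its own:

```shell
#!/usr/bin/env bash
# human_size is copied from the script above so this sketch runs standalone.
human_size() {
    awk -v sum="$1" ' BEGIN {hum[1024^3]="Gb"; hum[1024^2]="Mb"; hum[1024]="Kb"; for (x=1024^3; x>=1024; x/=1024) { if (sum>=x) { printf "%.2f %s\n",sum/x,hum[x]; break; } } if (sum<1024) print "1kb"; } '
}

# Hypothetical byte counts chosen for illustration.
human_size 500          # prints "1kb"
human_size 2048         # prints "2.00 Kb"
human_size 1048576      # prints "1.00 Mb"
human_size 1610612736   # prints "1.50 Gb"
```

Note that every value below 1024 is reported as "1kb", so the helper does not distinguish between very small sizes.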
View Dockerfile
FROM ubuntu:14.04
MAINTAINER Winston Chang "winston@rstudio.com"
# =====================================================================
# R
# =====================================================================
# Need this to add R repo
RUN apt-get update && apt-get install -y software-properties-common