Devin Riley devmacrile
@devmacrile
devmacrile / definitions.py
Created October 11, 2017 20:03
Curious about how Python handles simultaneous local definitions compared to a Lisp example
# (let ((a 1))
#   (define (f x)
#     (define b (+ a x))
#     (define a 5)
#     (+ a b))
#   (f 10))
a = 1
def f(x):
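The preview above cuts off at the function header. As a minimal sketch, assuming the body mirrors the Lisp definitions line by line (that body is a guess, not the gist's actual code), Python treats `a` as local to `f` for the whole function because it is assigned there, so reading it before the assignment raises `UnboundLocalError`:

```python
a = 1


def f(x):
    # Assumed body: a direct transliteration of the Lisp example.
    b = a + x  # `a` is local to f (it is assigned below), so this read fails
    a = 5
    return a + b


try:
    result = f(10)
except UnboundLocalError as exc:
    # Python never falls back to the global `a` here.
    result = exc

print(type(result).__name__, result)
```

Python decides the scope of each name when the function is compiled, so the outer `a = 1` is never consulted inside `f`.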
devmacrile / twenty_sided.py
Created March 18, 2017 18:11
Twenty-sided die sum/difference simulation
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
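# np.random.random_integers is deprecated; randint excludes its upper bound,
# so 21 yields rolls from 1 to 20 inclusive.
di1 = np.random.randint(1, 21, 100000)
di2 = np.random.randint(1, 21, 100000)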
values = (di1 + di2) - (np.absolute(di1 - di2))
sample_mean = np.nanmean(values)
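The quantity simulated above, the sum of two rolls minus their absolute difference, is always twice the smaller roll. As a rough check on the sample mean, the exact expectation can be enumerated over all 400 outcomes; the names `SIDES` and `exact_mean` are illustrative and not part of the gist:

```python
from fractions import Fraction

SIDES = 20  # illustrative constant, matching the twenty-sided dice above

total = Fraction(0)
for d1 in range(1, SIDES + 1):
    for d2 in range(1, SIDES + 1):
        value = (d1 + d2) - abs(d1 - d2)
        # The sum minus the absolute difference is twice the smaller roll.
        assert value == 2 * min(d1, d2)
        total += value

# Every ordered pair of rolls is equally likely.
exact_mean = total / SIDES**2
print(exact_mean, float(exact_mean))
```

The simulated `sample_mean` should land close to this exact value for a sample of 100,000 rolls.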

Keybase proof

I hereby claim:

  • I am devmacrile on github.
  • I am devmacrile (https://keybase.io/devmacrile) on keybase.
  • I have a public key whose fingerprint is 08D8 CCCC 5D01 1F96 285C 2606 7709 364D 9CCA 6F14

To claim this, I am signing this object:

devmacrile / writeTDE.r
Created February 26, 2016 03:43
Write R data.frame to a Tableau data extract file (.tde)
# Write R data.frame to a Tableau data extract file (.tde) by building and executing
# a python script which utilizes the Tableau data extract API (a hack, yes).
#
# This, naturally, has a hard dependency on the TDE API, so is only available for
# Windows and Linux systems (unfortunately)
#
# Devin Riley
# October, 2014
devmacrile / map1.py
Created February 5, 2015 16:29
Modifiable map-reduce code for running TF-IDF via Hadoop Streaming jobs.
#!/usr/bin/python
import sys
import re
import nltk
from nltk.corpus import stopwords
stop_words = stopwords.words('english')
#input comes from standard input
for line in sys.stdin:
#separate incident id from text
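The preview stops before the mapper body. A minimal sketch of what a term-count mapper for TF-IDF might look like follows, under several assumptions that are not confirmed by the gist: each input line is `incident_id<TAB>text`, the output key is `word,incident_id`, and a small hard-coded `STOP_WORDS` set stands in for the NLTK stop word list so the example runs without downloading corpora.

```python
import io
import re
import sys

# Hypothetical stand-in for nltk's stopwords.words('english').
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in"}


def map_line(line):
    """Turn one 'incident_id<TAB>text' line into 'word,incident_id<TAB>1' records."""
    doc_id, _, text = line.rstrip("\n").partition("\t")
    tokens = re.findall(r"[a-z]+", text.lower())
    return ["%s,%s\t1" % (tok, doc_id) for tok in tokens if tok not in STOP_WORDS]


def run(stream, out):
    """Apply map_line to every line of stream and write the records to out."""
    for line in stream:
        for record in map_line(line):
            out.write(record + "\n")


# Demonstration on an in-memory sample instead of real standard input.
sample = io.StringIO("42\tThe engine failed in the cold\n")
run(sample, sys.stdout)
```

Under Hadoop Streaming the script would call `run(sys.stdin, sys.stdout)` instead, and a reducer would sum the `1`s per key to get term counts per document.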
devmacrile / wrong-cv-example.r
Last active August 29, 2015 14:14
Sloppy implementation of a simulation exemplifying a common error in performing cross-validation (from The Elements of Statistical Learning, 7.10.2)
# Example of a common cross-validation mistake
# Described in The Elements of Statistical Learning, 7.10.2
# http://statweb.stanford.edu/~tibs/ElemStatLearn/
#
# Consider a scenario with
# N = 50 samples in two equal-sized classes, and p = 5000 quantitative
# predictors (standard Gaussian) that are independent of the class labels.
# The true (test) error rate of any classifier is 50%. We carried out the above
# recipe, choosing in step (1) the 100 predictors having highest correlation
# with the class labels, and then using a 1-nearest neighbor classifier, based
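The scenario quoted above can be sketched in Python with NumPy. This is an illustration under stated assumptions rather than a port of the R gist: predictors are ranked by absolute correlation, 5-fold cross-validation is used, and all names (`top_correlated`, `one_nn_error`, `cv_error`) are made up for the example. It compares selecting predictors once on the full data (the mistake) with selecting them inside each training fold.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, K, FOLDS = 50, 5000, 100, 5

# Standard Gaussian predictors that are independent of the class labels.
X = rng.standard_normal((N, P))
y = rng.permutation(np.repeat([0, 1], N // 2))


def top_correlated(X, y, k):
    """Indices of the k columns with the largest absolute correlation with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc.T @ yc) / (np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum()))
    return np.argsort(-np.abs(corr))[:k]


def one_nn_error(X_train, y_train, X_test, y_test):
    """Misclassification rate of a 1-nearest-neighbor classifier."""
    dists = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    return float(np.mean(y_train[dists.argmin(axis=1)] != y_test))


def cv_error(X, y, select_inside_folds):
    """Cross-validated 1-NN error with selection outside or inside the folds."""
    all_cols = top_correlated(X, y, K)  # uses every label, test folds included
    errors = []
    for test_idx in np.array_split(np.arange(N), FOLDS):
        train_idx = np.setdiff1d(np.arange(N), test_idx)
        if select_inside_folds:
            cols = top_correlated(X[train_idx], y[train_idx], K)
        else:
            cols = all_cols
        errors.append(one_nn_error(X[train_idx][:, cols], y[train_idx],
                                   X[test_idx][:, cols], y[test_idx]))
    return float(np.mean(errors))


wrong_err = cv_error(X, y, select_inside_folds=False)
right_err = cv_error(X, y, select_inside_folds=True)
print("selection outside CV:", wrong_err)
print("selection inside CV: ", right_err)
```

Because the labels are pure noise, an honest estimate should sit near 50%; selecting predictors with the held-out labels in view typically drives the reported error far below that.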
devmacrile / sentiment140.r
Last active August 29, 2015 14:14
Simple R wrapper function for the Sentiment140 API
# Wrapper function for the Sentiment140 API
# An API for a maximum entropy model trained on ~1.5M tweets
# The server will time out if the job takes > 60 seconds,
# so if the tweet count is relatively high, the function
# will split the data into chunks of 2500 (fairly arbitrary choice)
# http://help.sentiment140.com/api
Sentiment140 <- function(sentences){
# Load required packages
library(plyr)
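The chunking step described in the header comment can be sketched in Python. The helper name `chunked` is hypothetical and the example leaves out the API request itself; it only shows how a long list of sentences might be split into batches of at most 2500 before each batch is sent.

```python
def chunked(items, size=2500):
    """Yield consecutive slices of items, each with at most size elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Illustrative input: 6000 placeholder sentences.
sentences = ["tweet %d" % i for i in range(6000)]
batch_sizes = [len(batch) for batch in chunked(sentences)]
print(batch_sizes)
```

Each batch would then be posted separately and the responses concatenated, which keeps every request comfortably inside the 60-second limit mentioned above.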