@datalove
datalove / kaggle.aquire.scores.20140611
Last active August 29, 2015 14:02
Kaggle Acquire Competition - Distribution of Scores 11 June 2014
library(XML)
library(ggplot2)
url <- "http://www.kaggle.com/c/acquire-valued-shoppers-challenge/leaderboard"
# parse into an internal document so readHTMLTable() can process it
tree <- htmlTreeParse(url, useInternalNodes = TRUE)
tbl <- readHTMLTable(tree, stringsAsFactors = FALSE)[[1]]
colnames(tbl) <- gsub("[^a-zA-Z0-9#]","", colnames(tbl))
tbl$Score <- as.numeric(tbl$Score)
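The snippet stops after cleaning the `Score` column. A minimal sketch of how the score distribution might then be plotted with ggplot2 follows. The data frame here is synthetic (random values standing in for the scraped leaderboard), and the bin count and labels are assumptions, not taken from the gist.

```r
library(ggplot2)

# Synthetic stand-in for the scraped leaderboard (assumed values, not real scores)
set.seed(1)
tbl <- data.frame(Score = as.character(round(runif(200, 0.50, 0.62), 5)),
                  stringsAsFactors = FALSE)
tbl$Score <- as.numeric(tbl$Score)

# Histogram of leaderboard scores; 30 bins is an arbitrary choice
p <- ggplot(tbl, aes(x = Score)) +
  geom_histogram(bins = 30) +
  labs(title = "Distribution of leaderboard scores", x = "Score", y = "Teams")

# call print(p) to render the plot
```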
datalove / odds.p.from.ci
Last active August 29, 2015 14:03
Calculate P-value for an odds ratio using CIs
# calculation as per:
# http://www.bmj.com/content/343/bmj.d2304
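get_odds_p <- function(est, cil, ciu) {
  # standard error of log(OR), recovered from the 95% CI limits
  se <- (log(ciu) - log(cil)) / (2 * 1.96)
  # z statistic on the log scale
  z <- abs(log(est) / se)
  # approximate two-sided P value
  p <- exp(-0.717 * z - 0.416 * z^2)
  p
}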
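As a quick check of the function above, the sketch below applies it to a made-up ratio of 0.81 with a 95% CI of 0.70 to 0.94 and compares the approximation with the two-sided P value taken directly from the normal distribution. The input numbers are illustrative assumptions, and the comparison presumes the interval is a 95% CI that is symmetric on the log scale.

```r
# Same function as above, repeated so the example is self-contained
get_odds_p <- function(est, cil, ciu) {
  se <- (log(ciu) - log(cil)) / (2 * 1.96)
  z <- abs(log(est) / se)
  exp(-0.717 * z - 0.416 * z^2)
}

# Illustrative inputs (assumed, not from the gist)
p_approx <- get_odds_p(est = 0.81, cil = 0.70, ciu = 0.94)

# Reference value: two-sided P from the normal distribution for the same z
z <- abs(log(0.81) / ((log(0.94) - log(0.70)) / (2 * 1.96)))
p_exact <- 2 * pnorm(-z)

print(round(c(approx = p_approx, exact = p_exact), 3))
```

For these inputs both values come out at about 0.005, so the exponential approximation tracks the normal-distribution result closely here.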
datalove / evaluate_a_string.r
Last active August 29, 2015 14:05
How to get R to evaluate arbitrary code in a string
rm(list=ls())
rcode <- "x <- 1+1; y <- 2+2"
# `eval` runs the parsed `rcode` in the calling environment
# (here the global environment), so `x` and `y` are created there
eval(parse(text = rcode))
print(x)
print(y)
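If the evaluated string should not create variables in the global environment, one option is to pass a dedicated environment to `eval`. This is a minimal sketch; the name `sandbox` is an assumption introduced for illustration.

```r
rcode <- "x <- 1+1; y <- 2+2"

# Dedicated environment to hold whatever the string defines
sandbox <- new.env()

# Evaluate every expression in the string inside `sandbox`
eval(parse(text = rcode), envir = sandbox)

print(sandbox$x)
print(sandbox$y)
print(ls(sandbox))
```

The variables are then read back with `sandbox$x` or `get("x", envir = sandbox)`, and the calling environment is left unchanged.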
# Use autodetected proxy settings
setInternet2(TRUE)
# Get the SpotfireSPK package from the Spotfire Stats Server
install.packages("SpotfireSPK", repos = "http://MySpotfireServer:8080/SplusServer/update/TERR")
# Get a package you want to deploy from CRAN
install.packages("nortest", repos = "http://cran.us.r-project.org")
# Create a Debian Control File
datalove / terr_RinR.r
Created August 14, 2014 01:06
How to use TERR's RinR package inside TERR/Spotfire
# Download: http://tap.tibco.com/ -> "Samples" tab -> RinR -> "Try Now" button
# Docs: http://docs.tibco.com/pub/enterprise-runtime-for-R/2.5.0/doc/html/RinR/RinR-package.html
########################################
# Setup
########################################
library(RinR)
library(Sdatasets)
datalove / devtols_install_github_behind_proxy.R
Last active July 30, 2021 08:48
How to get devtools::install_github() working behind a proxy that messes with the SSL certs
library(httr)
library(devtools)
# make httr set CURL to ignore SSL verification problems
# (needed if the SSL proxy replaces certs with its own)
set_config(config(ssl.verifypeer = 0L))
# set proxy details
set_config(use_proxy("10.10.10.10",8080))
datalove / TERR_Expression_Function_Handling_Input_Columns.r
Last active August 29, 2015 14:08
TERR Script to be used in a Spotfire 'Expression Function'
#######################################################
#
# Expression Functions in Spotfire can take arbitrarily
# many columns as input. Columns will be passed to TERR
# in order as 'input1', 'input2', etc.
#
# This shows how to capture an arbitrary number of
# columns and to put them into a data frame.
#
#######################################################
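The header above describes capturing however many `inputN` columns are present and combining them into a data frame. A minimal sketch of one way to do that follows. The three `input` vectors are stand-ins for what Spotfire would pass, and the reordering step is an assumption added because `ls()` sorts names alphabetically, which would place `input10` before `input2`.

```r
# Stand-ins for the columns Spotfire would pass in (assumed for illustration)
input1 <- c(1.2, 3.4, 5.6)
input2 <- c("a", "b", "c")
input10 <- c(TRUE, FALSE, TRUE)

# Find every object named input<N> in the current environment
inputs <- grep("^input[0-9]+$", ls(), value = TRUE)

# Restore the original column order by sorting on the numeric suffix
inputs <- inputs[order(as.integer(sub("^input", "", inputs)))]

# Collect the columns into a data frame, one column per input
df <- as.data.frame(mget(inputs, envir = environment()),
                    stringsAsFactors = FALSE)
str(df)
```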
datalove / TERR_Expression_Function_Mahalanobis_Distance.r
Last active August 29, 2015 14:08
Finds the Mahalanobis Distance for a set of columns
###################################################################
# Takes an arbitrarily long list of input columns and returns a
# boolean indicating whether or not each row is an outlier.
###################################################################
# create vector of inputs
inputs <- grep("^input[0-9]+$",ls(), value = TRUE)
# capture columns as a matrix
x <- sapply(inputs, function(y) {eval(parse(text = y))})
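With the columns collected in a numeric matrix `x`, the distance itself could be computed with the base `stats::mahalanobis()` function, as in the sketch below. The matrix is simulated here as a stand-in for the captured inputs, and using the sample mean and sample covariance of `x` as centre and scale is an assumption about the intended calculation.

```r
# Simulated stand-in for the matrix `x` built from the input columns
set.seed(42)
x <- cbind(input1 = rnorm(50), input2 = rnorm(50), input3 = rnorm(50))

# mahalanobis() returns the SQUARED distance of each row from the centre
d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))

# Take the square root to get the distance itself
d <- sqrt(d2)
head(d)
```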
datalove / TERR_Expression_Function_Mahalanobis_Outlier.r
Last active August 29, 2015 14:08
Find multivariate outliers using Mahalanobis Distances
########################################################
# Takes an arbitrarily long list of input columns and
# returns a boolean indicating whether or not each row
# is an outlier.
#
# The function uses the critical value for Mahalanobis
# Distance calculated from an upper tailed ChiSq
# distribution with p=0.001.
########################################################
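The rule described in the header can be sketched as follows: compute the squared Mahalanobis distance of each row, then flag rows that exceed the upper-tail chi-square critical value at p = 0.001, with degrees of freedom equal to the number of columns. The data are simulated with one planted extreme row, and the variable names are assumptions introduced for illustration.

```r
# Simulated stand-in for the input columns, with one planted extreme row
set.seed(42)
x <- cbind(input1 = rnorm(100), input2 = rnorm(100))
x[1, ] <- c(8, -8)

# Squared Mahalanobis distance of each row from the column means
d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))

# Upper-tail chi-square critical value, p = 0.001, df = number of columns
crit <- qchisq(0.001, df = ncol(x), lower.tail = FALSE)

# TRUE where the row is flagged as a multivariate outlier
is_outlier <- d2 > crit
which(is_outlier)
```

The comparison uses the squared distance because that is the quantity approximately following a chi-square distribution under multivariate normality.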