Mark Edmondson MarkEdmondson1234
# Determine number of clusters
## run kmeans for varying number of clusters 1 to 15
wss <- (nrow(comp)-1)*sum(apply(comp,2,var))
for (i in 2:15) wss[i] <- sum(kmeans(comp,
                                     centers=i)$withinss)
plot(1:15, wss, type="b", xlab="Number of Clusters",
     ylab="Within groups sum of squares")
# From scree plot elbow occurs at k = 4-6
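The elbow search above can be tried end to end without the original data. The following is a minimal sketch, assuming the built-in `iris` measurements as a stand-in for `comp` (the real `comp` is not shown in this preview); `max_k`, `nstart` and `iter.max` are illustrative choices, not values from the original gist.

```r
## Sketch only: `iris` stands in for the unseen `comp` data
set.seed(42)                      # kmeans starts from random centres
comp <- scale(iris[, 1:4])        # numeric columns, standardised

max_k <- 15
wss <- numeric(max_k)
## k = 1: total sum of squares around the overall column means
wss[1] <- (nrow(comp) - 1) * sum(apply(comp, 2, var))
for (k in 2:max_k) {
  ## tot.withinss is the sum of the per-cluster withinss values
  wss[k] <- kmeans(comp, centers = k, nstart = 10, iter.max = 50)$tot.withinss
}

plot(1:max_k, wss, type = "b",
     xlab = "Number of Clusters",
     ylab = "Within groups sum of squares")
```

Using `nstart` greater than 1 reruns each fit from several random starts, which tends to make the curve smoother and the elbow easier to read.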
kResults <- data.frame(k_data, cluster = k$cluster)
## Transform data for columns of cluster, rows of Sku with value of mean total for each
rl <- as.data.frame(lapply(1:4, function(x){
  r3 <- kResults[kResults$cluster == x,
                 setdiff(names(kResults), 'cluster')]
  r4 <- colSums(r3) / nrow(r3)
  r4
}))
names(rl) <- paste("cluster",1:4)
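To make the reshaping step concrete, here is a small sketch of the same idea (one column per cluster, one row per SKU, each cell holding the cluster mean) on a made-up `kResults`. The column names `sku_a` and `sku_b` and all values are hypothetical, and `colMeans()` is used in place of `colSums(r3) / nrow(r3)`, which gives the same result.

```r
## Sketch only: toy data, the SKU names and values are invented
kResults <- data.frame(
  sku_a   = c(1, 3, 10, 12),
  sku_b   = c(2, 4, 20, 22),
  cluster = c(1, 1, 2, 2)
)

value_cols <- setdiff(names(kResults), "cluster")
clusters   <- sort(unique(kResults$cluster))

## One column of per-SKU means for each cluster
rl <- as.data.frame(sapply(clusters, function(cl) {
  colMeans(kResults[kResults$cluster == cl, value_cols, drop = FALSE])
}))
names(rl) <- paste("cluster", clusters)

print(rl)
```

Deriving `clusters` from the data avoids hard-coding `1:4` if the chosen number of clusters changes.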
## want: 30049 x 187
## userId, product1_view, product2_view, ...., productN_view, productBought
pv <- reshape2::recast(product_views,
                       dimension1 ~ productSku + variable,
                       fun.aggregate=sum)
library(dplyr)
## if a user buys more than once, the row will be duplicated
pt <- product_trans %>% select(productSku, dimension1)
library(randomForest)
## warning - can take a long time (30mins)
rf <- randomForest(x = predictors, y = response)
## once model done, we run it using test data and compare results to reality
predictor_test <- test[,which(!names(test) %in% c("dimension1","boughtSku"))]
response_test <- as.factor(test[,"boughtSku"])
## check result on test set
prediction <- predict(rf, predictor_test)
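One common way to compare predictions with reality is a confusion table plus overall accuracy. The sketch below uses base R only and hand-written factors in place of the real `prediction` and `response_test` objects, so the SKU labels and values are hypothetical; it is not taken from the original gist.

```r
## Sketch only: invented labels stand in for the model output
sku_levels    <- c("A", "B", "C")
response_test <- factor(c("A", "B", "A", "C", "B"), levels = sku_levels)
prediction    <- factor(c("A", "B", "C", "C", "A"), levels = sku_levels)

## Rows are predicted SKUs, columns are the SKUs actually bought
confusion <- table(predicted = prediction, actual = response_test)
print(confusion)

## Share of test rows where the predicted SKU matches the bought SKU
accuracy <- sum(diag(confusion)) / sum(confusion)
print(accuracy)
```

Giving both factors the same `levels` keeps the table square, so `diag()` lines up predicted and actual classes even when a class is never predicted.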
MarkEdmondson1234 / animate.R
Created February 8, 2016 22:34 — forked from thomasp85/animate.R
Animating graph over time
library(ggraph)
library(gganimate)
library(igraph)
# Data from http://konect.uni-koblenz.de/networks/sociopatterns-infectious
infect <- read.table('out.sociopatterns-infectious', skip = 2, sep = ' ', stringsAsFactors = FALSE)
infect$V3 <- NULL
names(infect) <- c('from', 'to', 'time')
infect$timebins <- as.numeric(cut(infect$time, breaks = 100))
# We want that nice fading effect so we need to add extra data for the trailing
MarkEdmondson1234 / r-installation-debain-wheezy
Last active February 10, 2016 14:43
Installation of R, RStudio, and OpenCPU on a GCE Debian Wheezy image; details are in the blog post http://markedmondson.me/run-r-rstudio-and-opencpu-on-google-compute-engine-free-vm-image
### r-installation-debain-wheezy
#### Commands to set up R, RStudio and OpenCPU on a Google Compute Engine Wheezy instance
#### Mark Edmondson 29 June 2014
##
## No original work, all taken from these sources:
## https://github.com/jeroenooms/opencpu-deb/blob/master/build-on-debian.md
## https://support.rstudio.com/hc/communities/public/questions/200651456-RStudio-server-not-installable-on-Debian-Wheezy-just-released-this-week-
## http://cran.r-project.org/bin/linux/debian/README.html
# all commands below need to be run with sudo
MarkEdmondson1234 / costdata.gs
Created February 15, 2016 14:15 — forked from chipoglesby/costdata.gs
Cost Data Upload via the Google Analytics Management API with Google Sheets
function uploadData() {
  var accountId = "xxxxxxxx";
  var webPropertyId = "UA-xxxxxxxx-x";
  var customDataSourceId = "xxxxxxxx";
  var ss = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  var maxRows = ss.getLastRow();
  var maxColumns = ss.getLastColumn();
  var data = [];
  for (var i = 1; i < maxRows; i++) {
    // getRange(row, column, numRows, numColumns) expects a numeric row index
    data.push(ss.getRange(i, 1, 1, maxColumns).getValues());
library(idbr) # devtools::install_github('walkerke/idbr')
library(ggplot2)
library(animation)
library(dplyr)
library(ggthemes)
idb_api_key("Your Census API key goes here")
male <- idb1('JA', 2010:2050, sex = 'male') %>%
mutate(POP = POP * -1,
MarkEdmondson1234 / downloadSearchAnalytics.R
Last active March 2, 2016 13:49
Demo of how to download and archive search analytics data using searchConsoleR
## A script to download and archive Google search analytics
##
## Demo of searchConsoleR R package.
##
## Version 1 - 10th August 2015
##
## Mark Edmondson (http://markedmondson.me)
## load the required libraries
## (Download them with install.packages("googleAuthR") and install.packages("searchConsoleR") if necessary)
MarkEdmondson1234 / RMessages.sh
Created March 21, 2016 16:04
Write R messages (the message/stderr stream) to a file
Rscript -e "setwd('/srv/shiny-server/xxxxx/'); zz<-file('rscript.log', open='wt');sink(zz, type = 'm'); rmarkdown::render('getData.Rmd')"
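The one-liner relies on `sink()` with `type = "message"`, which diverts R's message stream (messages, warnings and errors) to a connection. A minimal sketch of that mechanism inside an R session follows, assuming a temporary log file rather than the `rscript.log` path used above; `log_path` is a hypothetical name.

```r
## Sketch only: a temp file stands in for rscript.log
log_path <- tempfile(fileext = ".log")
zz <- file(log_path, open = "wt")

sink(zz, type = "message")      # divert the message stream to the file
message("starting render")      # written to the log, not the console
sink(type = "message")          # restore the message stream
close(zz)

print(readLines(log_path))
```

Restoring the stream before closing the connection matters in a longer session; in the `Rscript -e` one-liner the process exits straight after rendering, so the reset is skipped there.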