
Andy Teucher ateucher

@ateucher
ateucher / cb-friendly_palettes.r
Last active December 22, 2015 22:59
Colour-blind friendly palettes
## Colour-blind friendly palettes
# The palette with grey:
cbPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
# The palette with black:
cbbPalette <- c("#000000", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
@ateucher
ateucher / get_colClasses.r
Last active December 23, 2015 18:49
Handy way to get colClasses for a large text file dataset
initial <- read.csv("large_dataset.csv", nrows=100)
classes <- sapply(initial, class)
data <- read.csv("large_dataset.csv", colClasses=classes)
@ateucher
ateucher / ggplot_guides.R
Last active May 30, 2020 19:41
Manipulating ggplot2 legend with `override.aes` so points only for points, lines only for lines etc.
require(ggplot2)
set.seed(42)
df <- data.frame(x=1:50, y=rnorm(50, 10, 2), int=rbinom(50,1,0.3))
ggplot(df, aes(x=x, y=y)) +
  geom_ribbon(aes(ymin=0, ymax=y, fill='#1E90FF'), alpha=0.3) +
  geom_abline(intercept=10, slope=-0.1,
              aes(colour='Linear')) +
@ateucher
ateucher / lsa_fix.R
Created January 31, 2014 21:49
Fix to lsa function
x <- 1:26
y <- LETTERS
z <- data.frame(x,y)
unrowname <- function(x) {
  rownames(x) <- NULL
  return(x)
}
lsa <- function ()
@ateucher
ateucher / foldText.R
Created May 8, 2014 22:43
Insert linebreaks in a string (for wrapping labels)
foldText <- function(x, n) {
  x <- gsub(paste0('([^\n]{1,',n,'})(\\s|$)'), '\\1\n', x)
  ## Remove line-breaks (one or more) at first/last positions
  x <- gsub('^(\n)+|(\n)+$', '', x)
  x
}
## Examples
txt <- c("Hello my name is Andy",
"Oh when the Saints go marching in",
@ateucher
ateucher / aic_table.R
Last active August 29, 2015 14:04
Generating AIC table with model and variable names
require(survival)
require(kimisc) # has the nlist function to create a named list
require(AICcmodavg) # has the aictab function
require(dplyr)
require(ggplot2)
require(reshape2)
dat <- read.csv("data/obs_matched_pr.csv")
dat$strat_var <- factor(dat$strat_var)
@ateucher
ateucher / cars.R
Last active August 29, 2015 14:05
Compare fuel economy of cars
library(fueleconomy)
library(dplyr)
library(tidyr)
library(ggplot2)
comp_vehicles <- vehicles %>%
  filter(cyl == 4,
         grepl("Utility", class),
         make %in% c("Honda", "Toyota", "Subaru",
                     "Hyundai", "Kia", "Mazda",
@ateucher
ateucher / geo_json.R
Last active August 29, 2015 14:08
Testing to_geo_json and to_geojson
devtools::install_github("ropensci/togeojson@reworkapi")
library('togeojson')
library('maps')
data(us.cities)
to_geo_json(us.cities[1:2,], lat='lat', lon='long')
to_geojson(us.cities[1:2,], lat='lat', lon='long')
@ateucher
ateucher / foo.geojson
Last active August 29, 2015 14:08
Mixing geometry types in geojson
@ateucher
ateucher / R_introduction.md
Last active February 26, 2016 00:24
Motivational Pitch for R

I'm going to start off by describing a pretty common data analysis scenario, and then talk about how using R can help:

  • You have a lot of individual spreadsheet files containing your data, and you need it all together, so you copy and paste each one into a master file.
  • Next you do a bunch of data cleaning in the master spreadsheet - fixing date formats, unit conversions, transformations, etc.
  • You then import the data into your favourite statistics program, run your analysis, and
  • copy the outputs back into a spreadsheet or other graphing program to plot your results.
  • You give the results to a colleague to review and she comes back with some concerns that something doesn't look quite right with the results. She also suggests that a different modelling technique would be more appropriate.
  • You comb through the original data and realize that in some of the files one column was misaligned, and so in copying and pasting these into the master dataset this error was compounded over many rows.
  • In ad