Tom Hopper tomhopper

  • Michigan, United States
@tomhopper
tomhopper / xmrplot.R
Last active August 29, 2015 13:56
Plot XmR charts (individuals and moving range) in R using the qcc library.
library(qcc)
#' The data, from sample published by Donald Wheeler
my.xmr.raw <- c(5045,4350,4350,3975,4290,4430,4485,4285,3980,3925,3645,3760,3300,3685,3463,5200)
#' Create the individuals chart and qcc object
my.xmr.x <- qcc(my.xmr.raw, type = "xbar.one", plot = TRUE)
#' Create the moving range chart and qcc object. qcc takes a two-column matrix
#' that is used to calculate the moving range.
#' Pair each value with its successor: column 1 holds x[1..n-1], column 2 holds x[2..n].
my.xmr.raw.r <- cbind(my.xmr.raw[-length(my.xmr.raw)], my.xmr.raw[-1])
my.xmr.mr <- qcc(my.xmr.raw.r, type="R", plot = TRUE)
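As a cross-check on what the two charts compute, here is a minimal base-R sketch that derives the moving ranges and limits by hand. It assumes the conventional XmR scaling constants 2.66 and 3.268, which are not taken from the gist, and the list name `limits` is only illustrative. The results should be close to what qcc reports, but that has not been verified here.

```r
# Same data as in the gist
x <- c(5045,4350,4350,3975,4290,4430,4485,4285,3980,3925,3645,3760,3300,3685,3463,5200)

# Moving ranges: absolute differences between successive observations
mr <- abs(diff(x))

x_bar  <- mean(x)    # centre line of the individuals chart
mr_bar <- mean(mr)   # centre line of the moving range chart

# Assumed constants: 2.66 for the individuals limits, 3.268 for the mR upper limit
limits <- list(
  x_center  = x_bar,
  x_lcl     = x_bar - 2.66 * mr_bar,
  x_ucl     = x_bar + 2.66 * mr_bar,
  mr_center = mr_bar,
  mr_ucl    = 3.268 * mr_bar
)
print(limits)
```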
@tomhopper
tomhopper / annotate_outside_plot.R
Created February 14, 2014 13:01
Demonstration of integration of plotting individuals chart using qcc and ggplot2, with annotation outside of the plot.
library(qcc)
library(ggplot2)
library(grid)
#' The data, from sample data provided by Donald Wheeler
my.xmr.raw <- c(5045,4350,4350,3975,4290,4430,4485,4285,3980,3925,3645,3760,3300,3685,3463,5200)
#' Create the individuals qcc object
my.xmr.x <- qcc(my.xmr.raw, type = "xbar.one", plot = FALSE)
@tomhopper
tomhopper / ggplot2_axis_ranges.R
Last active October 2, 2020 10:36
Get the actual x- and y-axis ranges from a ggplot object. Works on ggplot2 >= 0.8.9 (tested on 0.9.3.1). Code from http://stackoverflow.com/questions/7705345/how-can-i-extract-plot-axes-ranges-for-a-ggplot2-object
library(ggplot2)
#' create a data frame with test data.
my.df <- data.frame(index = 1:10, value = rnorm(10))
#' create the ggplot object
my.ggp <- ggplot(data = my.df, aes(x = index, y = value)) + geom_point() + geom_line()
#' get the x- and y-axis ranges actually used in the graph
# This worked in early versions of ggplot2 (probably <2.2)
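For more recent releases, a sketch along the following lines may work. It assumes ggplot2 3.x, where the built plot stores per-panel ranges under `layout$panel_params`; the accessor has moved between versions, so treat the exact path as an assumption to check against your installed release.

```r
library(ggplot2)

set.seed(1)
my.df  <- data.frame(index = 1:10, value = rnorm(10))
my.ggp <- ggplot(data = my.df, aes(x = index, y = value)) + geom_point() + geom_line()

# Build the plot so that scales are trained and expanded
built <- ggplot_build(my.ggp)

# Ranges of the first (here: only) panel, including the default axis expansion
x.range <- built$layout$panel_params[[1]]$x.range
y.range <- built$layout$panel_params[[1]]$y.range
print(x.range)
print(y.range)
```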
@tomhopper
tomhopper / ordered_dotplot_errorbars.R
Last active January 4, 2022 03:05
An ordered dot plot with error bars. Demonstrates creating summarized data (e.g. means), ordering by second variable, error bars on dot plots. Based on the bar chart example at http://martinsbioblogg.wordpress.com/2014/03/19/using-r-barplot-with-ggplot2/
#' Example Cleveland-style dot plot.
#' Two variables plotted on a faceted plot.
#' Ordering of the data in the first facet.
#' Error bars plotted with each dot.
library(ggplot2)
library(reshape2)
library(plyr)
#' Create some data for the demonstration
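The preview stops before the data is created. The summarise, order, and plot steps can be sketched as follows. This is a simplified illustration rather than the gist's code: it uses base R instead of plyr and reshape2, a single panel instead of facets, and made-up data; the names `raw` and `stats` are hypothetical.

```r
library(ggplot2)

# Made-up demonstration data: five groups with different means
set.seed(42)
raw <- data.frame(
  group = rep(LETTERS[1:5], each = 20),
  value = rnorm(100, mean = rep(c(10, 14, 8, 12, 11), each = 20), sd = 2)
)

# Summarise each group: mean and standard error of the mean
means <- tapply(raw$value, raw$group, mean)
ses   <- tapply(raw$value, raw$group, function(v) sd(v) / sqrt(length(v)))
stats <- data.frame(group = names(means), mean = as.vector(means), se = as.vector(ses))

# Order the groups by their mean so the dots appear sorted on the axis
stats$group <- factor(stats$group, levels = stats$group[order(stats$mean)])

# Dot plot with error bars of +/- one standard error, flipped so groups run down the y axis
p <- ggplot(stats, aes(x = group, y = mean)) +
  geom_point(size = 3) +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = 0.2) +
  coord_flip()
print(p)
```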
@tomhopper
tomhopper / ggplot_density_plot.r
Created April 7, 2014 17:49
Plotting multiple probability density functions in ggplot2 using different colors
ggplot(NULL, aes(x = x, colour = distribution)) +
  stat_function(fun = dnorm, data = data.frame(x = c(-6, 6), distribution = factor(1)), size = 1) +
  stat_function(fun = dt, args = list(df = 20), data = data.frame(x = c(-6, 6), distribution = factor(2)), linetype = "dashed", size = 1) +
  scale_colour_manual(values = c("blue", "red"), labels = c("Normal", "T-Distribution")) +
  theme(text = element_text(size = 12),
        legend.position = c(0.85, 0.75)) +
  xlim(-4, 4) +
  xlab(NULL) +
  ylab(NULL)
@tomhopper
tomhopper / PRESS.R
Last active November 6, 2022 00:46
Functions that return the PRESS statistic (predictive residual sum of squares) and predictive r-squared for a linear model (class lm) in R
#' @title PRESS
#' @author Thomas Hopper
#' @description Returns the PRESS statistic (predictive residual sum of squares).
#' Useful for evaluating predictive power of regression models.
#' @param linear.model A linear regression model (class 'lm'). Required.
#'
PRESS <- function(linear.model) {
#' calculate the predictive residuals
pr <- residuals(linear.model)/(1-lm.influence(linear.model)$hat)
#' calculate the PRESS
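The preview cuts off here. A self-contained sketch of both statistics, using only base R, might look like this. The function names `press_stat` and `pred_r_squared` are hypothetical and chosen to avoid clashing with the gist's own definitions; the `mtcars` model is just an illustration.

```r
# PRESS: sum of squared leave-one-out prediction errors, obtained from the
# ordinary residuals and the hat values without refitting the model
press_stat <- function(linear.model) {
  pr <- residuals(linear.model) / (1 - lm.influence(linear.model)$hat)
  sum(pr^2)
}

# Predictive R-squared: 1 - PRESS / total sum of squares
pred_r_squared <- function(linear.model) {
  y   <- model.response(model.frame(linear.model))
  tss <- sum((y - mean(y))^2)
  1 - press_stat(linear.model) / tss
}

fit <- lm(mpg ~ wt + hp, data = mtcars)
print(press_stat(fit))
print(pred_r_squared(fit))
print(summary(fit)$r.squared)   # ordinary R-squared, for comparison
```

Predictive R-squared is never larger than the ordinary R-squared, so a wide gap between the two hints at overfitting.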
@tomhopper
tomhopper / dt_merge_nodups.R
Last active February 10, 2017 02:58
Merge two data.tables and eliminate duplicated rows
library(data.table)
# See \link{http://stackoverflow.com/questions/11792527/filtering-out-duplicated-non-unique-rows-in-data-table}
# for a discussion of how to eliminate duplicate rows.
# The problem is that the \code{unique()} function will use a key, if it exists. We need to
# eliminate the key.
# Create one column of data
temp1 <- data.table(sample(letters,size = 15, replace = FALSE))
temp2 <- data.table(sample(letters,size = 15, replace = FALSE))
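The rest of the gist is not shown. One way to finish the job, assuming that "merge" here means stacking the rows of both tables, is sketched below. Note that since data.table 1.9.8 `unique()` considers all columns by default rather than the key, so the key issue described above mainly affects older versions. The column name `V1` and the names `combined` and `deduped` are illustrative.

```r
library(data.table)

set.seed(1)
temp1 <- data.table(V1 = sample(letters, size = 15, replace = FALSE))
temp2 <- data.table(V1 = sample(letters, size = 15, replace = FALSE))

# Stack the two tables; rbindlist() returns an unkeyed data.table
combined <- rbindlist(list(temp1, temp2))

# Drop duplicated rows, comparing on every column explicitly
deduped <- unique(combined, by = names(combined))
print(nrow(combined))
print(nrow(deduped))
```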
@tomhopper
tomhopper / plot_aligned_series.R
Last active June 25, 2023 17:36
Align multiple ggplot2 graphs with a common x axis and different y axes, each with different y-axis labels.
#' When plotting multiple data series that share a common x axis but different y axes,
#' we can just plot each graph separately. This suffers from the drawback that the shared axis will typically
#' not align across graphs due to different plot margins.
#' One easy solution is to reshape2::melt() the data and use ggplot2's facet_grid() mapping. However, there is
#' no way to label individual y axes.
#' facet_grid() and facet_wrap() were designed to plot small multiples, where both x- and y-axis ranges are
#' shared across all plots in the faceting. While the facet_ calls allow us to use different scales with
#' the \code{scales = "free"} argument, they should not be used this way.
#' A more robust approach is to use the grid package's grid.draw() with rbind() and ggplotGrob() to create a grid of
#' individual plots where the plot axes are properly aligned within the grid.
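The approach described in these comments can be sketched as follows. The data and the column names (`day`, `temperature`, `pressure`) are made up for illustration, and `size = "last"` is one common choice for reconciling the column widths of the two grobs, not necessarily the one the gist uses.

```r
library(ggplot2)
library(grid)

# Made-up series sharing an x axis but with very different y scales
set.seed(1)
df <- data.frame(
  day         = 1:30,
  temperature = 20 + cumsum(rnorm(30)),
  pressure    = 1000 + cumsum(rnorm(30, sd = 50))
)

# Top plot: hide the x axis text and title, since the bottom plot carries them
p1 <- ggplot(df, aes(x = day, y = temperature)) +
  geom_line() +
  ylab("Temperature (C)") +
  theme(axis.title.x = element_blank(), axis.text.x = element_blank())

# Bottom plot: wider y-axis labels, which would misalign the panels if drawn separately
p2 <- ggplot(df, aes(x = day, y = pressure)) +
  geom_line() +
  ylab("Pressure (hPa)")

# Convert both plots to gtables and stack them; the shared column widths align the panels
g <- rbind(ggplotGrob(p1), ggplotGrob(p2), size = "last")

grid.newpage()
grid.draw(g)
```

Each plot keeps its own y-axis label, which is the part that facet_grid() cannot provide.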
@tomhopper
tomhopper / facet_labelling.R
Last active August 29, 2015 14:06
Custom labels for ggplot2 facets.
#' Data frame column names are rarely human-readable, concise and clear, but are usually meaningful. Rather
#' than trying to modify the data, we can provide custom labels for facets.
library(data.table)
library(lubridate)
library(reshape2)
library(ggplot2)
#' Download raw data from "Weather Data" at \link{http://datamonitoring.marec.gvsu.edu/DataDownload.aspx},
#' rename the file to "Marec_weather.csv" and save it to /data/ in the current working directory.
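Since the weather file has to be downloaded separately, here is a small stand-in sketch of the labelling idea itself. It assumes ggplot2 2.0 or later for `as_labeller()`; the column names `temp_c` and `wind_spd` and the data are hypothetical, not taken from the MAREC file.

```r
library(ggplot2)

# Made-up long-format data with terse variable names, as melt() would produce
set.seed(1)
df <- data.frame(
  index    = rep(1:10, times = 2),
  variable = rep(c("temp_c", "wind_spd"), each = 10),
  value    = c(rnorm(10, mean = 15, sd = 3), rnorm(10, mean = 5, sd = 1))
)

# Lookup from column-style names to human-readable facet labels
facet.labels <- c(temp_c = "Air temperature (C)", wind_spd = "Wind speed (m/s)")

p <- ggplot(df, aes(x = index, y = value)) +
  geom_line() +
  facet_grid(variable ~ ., scales = "free_y", labeller = as_labeller(facet.labels))
print(p)
```

The data frame is left untouched; only the strip text changes.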
@tomhopper
tomhopper / rnorm.r
Last active August 18, 2023 03:25
Functions to create normally distributed data between two values minimum and maximum. One function pegs the minimum and maximum; the other uses a 99.7% tolerance interval.
#' @title Returns a normally distributed vector within the 99.7% tolerance interval defined by minimum and maximum
#' @param n (required) The number of random numbers to generate
#' @param minimum (optional) The lower 99.7% tolerance limit
#' @param maximum (optional) The upper 99.7% tolerance limit
#' @return numeric vector with n elements randomly distributed so that approximately 99.7% of values will fall between minimum and maximum
#' @examples
#' rnorm.within(10)
#' rnorm.within(10, 10, 20)
#' summary(rnorm.within(10000, 10, 20))
rnorm.within <- function(n, minimum=0, maximum=1)
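The function body is not shown in the preview. A minimal sketch of one way to meet the documented behaviour follows, on the assumption that the interval from minimum to maximum should span six standard deviations (plus or minus three around the midpoint). The name `rnorm_within` is hypothetical, to keep it distinct from the gist's `rnorm.within`.

```r
# Draw n normal values so that about 99.7% fall between minimum and maximum
rnorm_within <- function(n, minimum = 0, maximum = 1) {
  stopifnot(maximum > minimum)
  center <- (minimum + maximum) / 2   # midpoint becomes the mean
  spread <- (maximum - minimum) / 6   # +/- 3 sd covers roughly 99.7%
  rnorm(n, mean = center, sd = spread)
}

set.seed(123)
x <- rnorm_within(10000, 10, 20)
print(summary(x))

# Share of values inside [10, 20]; expected to be near 0.997
print(mean(x >= 10 & x <= 20))
```

Values outside the interval are still possible with this approach; pegging the minimum and maximum exactly, as the gist's other function does, would need rescaling or truncation instead.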