Tom Hopper tomhopper

  • Michigan, United States
@tomhopper
tomhopper / find_and_delete.sh
Last active November 10, 2022 02:55
Use the Mac OS X terminal (UNIX command line) to find and delete all files matching a pattern
find . -name '.filename' -print -exec rm -r {} \;
# . = start in the current directory and search its subdirectories too
# -name = file name (or shell-style pattern) to match
# -print = print each match's path to standard output
# -exec = execute the following command once for each match
# {} = placeholder that find replaces with the path of the current match
# \; = semicolon to terminate the -exec command; the backslash escapes
# it so that the shell passes it to find instead of treating it as a
# command separator (used for stringing together multiple commands).
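Because the one-liner deletes as it goes, it can help to preview the matches first. The sketch below is only an illustration under stated assumptions: it runs in a throwaway directory created with mktemp, and '.DS_Store' is a hypothetical stand-in for '.filename'. It lists the matches, then deletes them with the '{} +' form of -exec, which hands many paths to a single rm call.

```shell
# Hypothetical demo tree, created in a temp directory so nothing real is deleted.
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
touch "$demo/.DS_Store" "$demo/a/.DS_Store" "$demo/a/b/.DS_Store" "$demo/a/keep.txt"

# Step 1: dry run. Print the matching files without deleting anything.
find "$demo" -name '.DS_Store' -type f -print

# Step 2: delete the matches. '{} +' batches the paths into as few rm
# calls as possible, instead of one rm per file as with '{} \;'.
find "$demo" -name '.DS_Store' -type f -exec rm {} +

# Step 3: count the matches that are left (tr strips the padding that
# some wc implementations add).
remaining=$(find "$demo" -name '.DS_Store' | wc -l | tr -d ' ')
echo "remaining matches: $remaining"
```

Adding -type f restricts the match to regular files, so rm no longer needs -r; drop it if directories with the same name should be removed as well.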
@tomhopper
tomhopper / ggplot2_xkcd_Humor_Sans.R
Created March 18, 2015 15:58
Use the font Humor Sans instead of font xkcd with theme_xkcd()
# The xkcd font used by the package xkcd (which provides a theme for ggplot2)
# is missing many characters and some characters don't seem to display correctly.
# An alternate xkcd-style font is Humor Sans, available free from
# \url{http://antiyawn.com/uploads/humorsans.html}
# The code below forces the use of Humor Sans instead of xkcd.
# The xkcd and ggplot2 packages are available from CRAN.
library(ggplot2)
library(xkcd)
# Create 2 replicates of 5 "words" generated from random characters.
# Each word's length is 5 plus a Poisson(lambda = 3) draw, so most
# "words" are 5 - 15 characters long.
rep(replicate(5, paste(sample(letters, rpois(1, lambda = 3) + 5, replace = FALSE), collapse = "")), 2)
# Sample output:
# [1] "rfexnwyjst" "vwtadhjnly" "ztfgvldo" "tmerol" "mcqhosap" "rfexnwyjst" "vwtadhjnly" "ztfgvldo" "tmerol"
#[10] "mcqhosap"
@tomhopper
tomhopper / .Rprofile
Last active June 13, 2019 19:08
Rprofile file
## For original file showing use of .env to add functions invisibly, see
## \link{http://gettinggeneticsdone.blogspot.com/2013/06/customize-rprofile.html}
## Load packages
#library(BiocInstaller)
## Don't show those silly significance stars
#options(show.signif.stars=FALSE)
## Do you want to automatically convert strings to factor variables in a data.frame?
@tomhopper
tomhopper / sort_factors.R
Created June 16, 2015 19:29
Several methods of sorting a factor in a data frame by a numeric variable so that it plots in ascending (or descending) order using ggplot2
#' @title Sorting data frame factor levels for ggplot2
#' @description Sorting a factor variable by a numeric variable.
#' In one case, each factor level is matched to one numeric value.
#' In the other case, each factor level is repeated across a second
#' grouping factor variable, and we want to sort only the
library(dplyr)
library(tidyr)
library(ggplot2)
# Sort a factor variable by a numeric variable
@tomhopper
tomhopper / align_common_baseline.R
Last active November 5, 2016 17:47
Examples of aligning against a common baseline, using Cleveland-style dot plots
# Response to a post at Storytelling with Data:
# \url{http://www.storytellingwithdata.com/blog/orytellingwithdata.com/2015/07/align-against-common-baseline.html}
# Demonstrates
# * Cleveland-style dot plots (improvement over pie and bar charts)
# * Sorting categorical data by a numerical variable with more than one grouping variable
# * Highlighting differences between groups graphically
library(ggplot2)
library(scales)
@tomhopper
tomhopper / dplyr_filter_ungroup.R
Created January 29, 2016 20:31 — forked from jhofman/dplyr_filter_ungroup.R
careful when filtering with many groups in dplyr
library(dplyr)
# create a dummy dataframe with 100,000 groups and 1,000,000 rows
# and partition by group_id
df <- data.frame(group_id = sample(1:1e5, 1e6, replace = TRUE),
                 val = sample(1:100, 1e6, replace = TRUE)) %>%
  group_by(group_id)
# filter rows with a value of 1 naively
system.time(df %>% filter(val == 1))
@tomhopper
tomhopper / median_hourly_earnings.R
Created July 2, 2016 15:35
makeover: convert from two groups of side-by-side vertical bar charts to a more readable dot plot
# from Conrad Hackett
# Median hourly earnings
# \url{https://twitter.com/conradhackett/status/748884076493475840}
# makeover: convert from two groups of side-by-side vertical bar charts to a more readable dot plot
# Demonstrates:
# Use of in ggplot2
# Creating dot plots
# Combining color and shape in a single legend
# Sorting a dataframe so that categorical data in one column is ordered by a second numerical column
# Note: resulting graph displays best at about 450 pixels x 150 pixels
@tomhopper
tomhopper / nlsLM_s-curve.R
Last active August 29, 2023 16:46
Example of using nlsLM to fit an s-curve to data
# Based on a post at \url{http://www.walkingrandomly.com/?p=5254}
library(dplyr)
library(ggplot2)
library(minpack.lm)
# The data to fit
# tibble() replaces the deprecated data_frame(); dplyr re-exports it
my_df <- tibble(x = c(0,15,45,75,105,135,165,195,225,255,285,315),
                y = c(0,0,0,4.5,19.7,39.5,59.2,77.1,93.6,98.7,100,100))
# EDA to see the trend
@tomhopper
tomhopper / addNewData.R
Created October 9, 2016 00:12 — forked from dfalster/addNewData.R
The function addNewData.R modifies a data frame with a lookup table. This is useful where you want to supplement data loaded from file with other data, e.g. to add details, change treatment names, or similar. The function readNewData is also included. This function runs some checks on the new table to ensure it has correct variable names and val…
##' Modifies 'data' by adding new values supplied in newDataFileName
##'
##' newDataFileName is expected to have columns
##' c(lookupVariable,lookupValue,newVariable,newValue,source)
##'
##' Within the column 'newVariable', replace values that
##' match 'lookupValue' within column 'lookupVariable' with the value
##' 'newValue'. If 'lookupVariable' is NA, then replace *all* elements
##' of 'newVariable' with the value 'newValue'.
##'