Matt Parker mmparker

library(dplyr)
df_list <- list(
  mutate(mtcars[1:5, ], jurisdiction = 'New Guernsey'),
  mutate(iris[1:5, ], mean = mean(Sepal.Length)),
  mutate(beaver1[1:5, ], jurisdiction = 'New Guernsey', mean = mean(temp))
)
library(dplyr)
library(rvest)
# Get the list of files to work on
files_to_process <- list.files('r:/shared documents/',
                               pattern = 'html',
                               full.names = TRUE)
mmparker / test_script.py
Created July 7, 2017 21:51
Trying to understand logging in Python with structlog...
# Two main things I can't figure out:
# 1. How do I get the name of each function to prefix its log entry?
# e.g., INFO:testmodule:outsideFun:{msg} instead of just INFO:testmodule:{msg}
# 2. How do I get bound values to persist through subsequent functions?
# e.g., user:matt.parker and outsideArg:foo should appear in both
# outsideFun and insideFun log messages
import logging
mmparker / nested_modules.R
Last active February 10, 2017 17:39
Trying to nest a plot-generating module inside a second module that creates several plots.
options(stringsAsFactors = FALSE,
        scipen = 9999)
library(shiny)
library(ggplot2)
library(dplyr)
mmparker / if_else_bug.R
Last active November 2, 2016 16:43
When working with a SQLite tbl, if_else() won't accept named arguments, but it *will* accept unnamed ones.
library(dplyr)
# Set up a temp sqlite database
db <- src_sqlite(tempfile(), create = TRUE)
iris2 <- copy_to(db, iris)
# if_else with named true and false on a data.frame -> no problem
iris %>% mutate(Sepal.Size = if_else(Sepal.Length > 5,
true = "big",
mmparker / multi_event_dates.R
Created October 4, 2016 22:06
An attempt to put date labels on an integer-sequenced x-axis, when there are multiple sequential points per date.
# Always
options(stringsAsFactors = FALSE)
library(tidyverse)
# Setting the ggplot2 theme
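
One possible way to approach the labeling problem in the description, sketched in base R under stated assumptions: the `events` data frame and its column names are hypothetical, and this is not necessarily the approach the gist takes. The idea is to number the points in sequence, then label only the first point of each date.

```r
# Hypothetical data: several sequential events can share one date
events <- data.frame(
  event_date = as.Date(c("2016-10-01", "2016-10-01", "2016-10-02",
                         "2016-10-04", "2016-10-04", "2016-10-04"))
)

# Integer sequence used as the x-axis position of each event
events$seq <- seq_len(nrow(events))

# Mark the first event of each date; only those positions get a label
first_of_date <- !duplicated(events$event_date)

axis_breaks <- events$seq[first_of_date]
axis_labels <- format(events$event_date[first_of_date], "%Y-%m-%d")

print(data.frame(axis_breaks, axis_labels))
```

With ggplot2 loaded, these two vectors could be passed to `scale_x_continuous(breaks = axis_breaks, labels = axis_labels)` so that each date appears once, at the position of its first point.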
mmparker / sleep_time.R
Last active June 26, 2016 03:43
A workflow for creating a plot that shows sleep times and wake times as a bar for each day. Reference plot: https://pbs.twimg.com/media/Cl1Bk-JUgAAZrEX.jpg
# Setup
options(stringsAsFactors = FALSE)
library(lubridate)
library(ggplot2)
mmparker / apply_over_xdf.R
Created February 8, 2016 18:08
Snippet for lagging within groups (or applying any other transformFunc) using RRE (Revolution R Enterprise)
# This is an example of how to apply the lagging function from this
# StackOverflow answer: http://stackoverflow.com/a/30874772/143319
# to grouped data. In short:
# 1. Use rxSplit() to put each group in its own XDF file
# 2. Use lapply() to iterate over the list of XDF files
# Pick a sample dataset
xdfPath <- file.path(rxGetOption("sampleDataDir"), "DJIAdaily.xdf")
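
The split-then-apply pattern in the two steps above can be illustrated without RRE. The following is a minimal base R sketch, not the gist's code: `split()` stands in for `rxSplit()`, in-memory data frames stand in for XDF files, and the `df` data and `lag_one()` helper are hypothetical.

```r
# Hypothetical example data: three ordered observations in each of two groups
df <- data.frame(
  group = rep(c("a", "b"), each = 3),
  day   = rep(1:3, times = 2),
  value = c(10, 11, 12, 20, 21, 22)
)

# Hypothetical helper: shift a vector down one position, padding with NA
lag_one <- function(x) c(NA, head(x, -1))

# 1. Put each group in its own piece (stand-in for rxSplit())
pieces <- split(df, df$group)

# 2. Iterate over the pieces, lagging within each one
lagged <- lapply(pieces, function(piece) {
  piece$value_lag <- lag_one(piece$value)
  piece
})

# Recombine the pieces into a single data frame
result <- do.call(rbind, lagged)
rownames(result) <- NULL
print(result)
```

Because each group is lagged separately, the first row of every group gets `NA` rather than the last value of the previous group. The sketch assumes rows are already sorted by `day` within each group.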
mmparker / statistical_spanish.md
Last active August 29, 2015 14:28
Spanish Vocabulary for Statistical Computing


This is a little cheatsheet for anyone interested in writing about or teaching statistics in Spanish. Please feel free to expand and correct.

Statistics (La Estadística)

Values: valores

mmparker / dynamic_query_date.r
Created March 10, 2015 22:31
Dynamic dates in SQL queries in R
# Pick some dynamic date - here's the start of the current month
start_date <- format(Sys.Date(), "%Y-%m-01")
# Then use paste0() to construct a query that includes it
# Here's an example to get all the positive QFTs this month
# (I have probably misremembered the field names).
# I'm pretty sure you have to put a # on each side of the
# date in order for Access to recognize it as a date.
test_query <- sqlQuery(tbdb,
paste0("SELECT person_id, test_date
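
A minimal, self-contained sketch of the pattern the comments describe: building the query text with `paste0()` and wrapping the date in `#` delimiters for Access. The table name `qft_results` and the `result` field are hypothetical placeholders, not the gist's actual schema.

```r
# Start of the current month, as in the snippet above
start_date <- format(Sys.Date(), "%Y-%m-01")

# Build the query text; Access date literals are wrapped in # characters
query_text <- paste0(
  "SELECT person_id, test_date ",
  "FROM qft_results ",             # hypothetical table name
  "WHERE result = 'Positive' ",    # hypothetical field and value
  "AND test_date >= #", start_date, "#"
)

# Inspect the finished SQL before sending it to the database
cat(query_text, "\n")
```

The resulting string could then be passed to RODBC as `sqlQuery(tbdb, query_text)`, where `tbdb` is the open connection used in the gist. Printing the query first makes it easy to confirm that the date landed between the `#` delimiters.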