Jared Knowles jknowles

@jknowles
jknowles / barplots.R
Created February 20, 2020 19:33
How to order bar charts in ggplot2
# Load ggplot2
library(ggplot2)
# Load example data
data(mtcars)
# Create a character vector of car names
mtcars$name <- row.names(mtcars)
# Plot car names by mpg
ggplot(mtcars, aes(x = name, y = mpg)) +
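The preview above stops after the first `ggplot()` line, so the gist's own approach is not visible here. As a minimal sketch of one common way to order bars, assuming `reorder()` and `geom_col()` (neither appears in the preview), the car names can be turned into a factor sorted by mpg:

```r
library(ggplot2)
data(mtcars)
mtcars$name <- row.names(mtcars)
# Assumption: reorder() is one option among several; it makes name a factor
# whose levels are sorted by ascending mpg, which sets the axis order
mtcars$name_ordered <- reorder(mtcars$name, mtcars$mpg)
p <- ggplot(mtcars, aes(x = name_ordered, y = mpg)) +
  geom_col() +
  coord_flip() +  # horizontal bars keep long car names readable
  labs(x = NULL, y = "Miles per gallon")
```

Calling `print(p)` draws the chart. Wrapping the second argument as `-mtcars$mpg` would reverse the order.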
jknowles / sdp_reg_exercise_1_example_2019.R
Created June 17, 2019 15:50
Exploring student composition effects on test score growth of school average scores. Example for SDP 2019 Regression course.
###################################################################################################
## Title: Regression Module Assignment 1
## Exploring Student Composition Effects on Test Score Growth
## Author: Jared E. Knowles, Civilytics Consulting
## Date: 6/12/2019
## Last Updated: 6/17/2019
###################################################################################################
# ----------------------------------------------------------------------------
# Load the data
jknowles / cpe_functions.R
Last active December 4, 2018 22:46
Functions to Support CPE and Census Data Alignment
################################################################################
# Functions to find the data
################################################################################
# Finders
# Simple functions that take the ID code from CPE (e.g. 49-00039) and look up
# the respective data for it in the structure provided in the competition
find_police_shape <- function(dept_id, kaggle_kernel = FALSE) {
if(kaggle_kernel == TRUE) {
prefix = "../input/cpe-data/"
###############################################################################
## SDP Fall Workshop Predictive Analytics
## Advanced / Additional Code Snippets for Working with PA Data and Models
## Author: Jared E. Knowles
## Date: 09/14/2018
## You do not need to use all or even any of this code. The code does not need to
## be run together. This is just a survey of some additional techniques/tricks you
## can do in R to make explaining predictive models and complex data easier.
## As always - your needs and approaches may differ.
################################################################################
// Additional tools for machine learning and predictive analytics in stata
/*
Author: Jared Knowles
Date: 09/12/2018
Purpose: Survey of some additional code helpful in conducting and explaining
or demonstrating predictive analytics to stakeholders.
You do not need to run all of this code - this is a survey of commands that
tackle different techniques. Pick and choose what might be most useful to you.
*/
jknowles / datausa_census_api.rmd
Created May 30, 2018 22:46 — forked from lecy/datausa_census_api.md
Building Census Dataset in R Using datausa.io API
# Using the dataUSA.io API for Census Data in R
This gist contains some notes on constructing a query for census and economic data from the [DataUSA.io](http://datausa.io/) site. This is a quick-start guide to their API; for in-depth documentation check out their [API documentation](https://github.com/DataUSA/datausa-api/wiki/Overview).
A great way to learn how to structure a query is to visit a specific datausa.io page and click on the "Options" button on top of any graph, then select "API" to see the query syntax that created the graph.
## Example Use
jknowles / robust_predict.lm.R
Created March 6, 2018 20:40
Robust Prediction Intervals for LM
predict.robust <- function(model, data, robust_vcov = NULL, level = 0.95,
interval = "prediction"){
# adapted from
# https://stackoverflow.com/questions/38109501/how-does-predict-lm-compute-confidence-interval-and-prediction-interval
# model is an lm object from R
# data is the dataset to predict from
# robust_vcov must be a robust vcov matrix created by V <- sandwich::vcovHC(model, ...)
# level = the % of the confidence interval, default is 95%
# interval = either "prediction" or "confidence" - a prediction interval adds the residual variance of a new observation to the uncertainty in the fitted mean
if(is.null(robust_vcov)){
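The preview ends before the function body. As a rough sketch of the general technique the comments describe (not the gist's actual implementation), interval standard errors can be built from a robust coefficient covariance matrix V as sqrt(diag(X V X')), with the residual variance added for prediction intervals. All variable names below are illustrative, and the fallback to `vcov()` is an assumption made only so the sketch runs without the sandwich package installed:

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
# Robust covariance of the coefficients; falls back to the classical
# estimate if sandwich is not installed (assumption, for portability)
V <- if (requireNamespace("sandwich", quietly = TRUE)) {
  sandwich::vcovHC(fit)
} else {
  vcov(fit)
}
X <- model.matrix(fit)                    # design matrix of the rows to predict
yhat <- drop(X %*% coef(fit))             # fitted means
se_mean <- sqrt(rowSums((X %*% V) * X))   # sqrt(diag(X V X'))
sigma2 <- sum(residuals(fit)^2) / fit$df.residual
se_pred <- sqrt(se_mean^2 + sigma2)       # adds residual variance
level <- 0.95
crit <- qt(1 - (1 - level) / 2, df = fit$df.residual)
conf_int <- cbind(fit = yhat, lwr = yhat - crit * se_mean, upr = yhat + crit * se_mean)
pred_int <- cbind(fit = yhat, lwr = yhat - crit * se_pred, upr = yhat + crit * se_pred)
```

Because `se_pred` is never smaller than `se_mean`, the prediction interval is always at least as wide as the confidence interval for the same row.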
jknowles / helper_funcs.R
Last active September 10, 2017 04:30
R Helper functions for the Philadelphia SDP Cohort 8 Predictive Analytics Workshop
# Calculate the AUC of a GLM model easily
# Jared Knowles
# model = a fitted glm in R
# newdata = an optional data.frame of new fitted values
auc.glm <- function(model, newdata = NULL){
if(missing(newdata)){
resp <- model$y
# if(class(resp) == "numeric"){
# resp <- factor(resp)
# }
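The rest of `auc.glm` is cut off in this preview. As an illustration only, AUC can be computed from observed 0/1 responses and predicted probabilities with the rank-based (Mann-Whitney) formula. `auc_from_scores` below is a hypothetical helper, not the gist's function, and the original may well rely on a package instead:

```r
# Hypothetical helper: resp is a 0/1 vector, pred the predicted probabilities
auc_from_scores <- function(resp, pred) {
  n_pos <- sum(resp == 1)
  n_neg <- sum(resp == 0)
  r <- rank(pred)  # tied predictions receive average ranks
  # Mann-Whitney U for the positive class, scaled to [0, 1]
  (sum(r[resp == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# A fitted glm stores its response in $y by default
fit <- glm(am ~ wt, data = mtcars, family = binomial)
auc_in_sample <- auc_from_scores(fit$y, fitted(fit))
```

For out-of-sample use, the same helper would take the held-out responses and `predict(fit, newdata, type = "response")`.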
jknowles / 0_reuse_code.js
Created April 27, 2014 02:40
Here are some things you can do with Gists in GistBox.
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console
jknowles / s_dplyr.R
Last active August 29, 2015 13:59 — forked from skranz/s_dplyr
# Helper functions that allow string arguments for dplyr's data modification functions like arrange, select etc.
# Author: Sebastian Kranz
# Examples are below
#' Modified version of dplyr's filter that uses string arguments
#' @export
s_filter = function(.data, ...) {
eval.string.dplyr(.data,"filter", ...)
}
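`eval.string.dplyr` is defined elsewhere in the gist and is not shown in this preview. For comparison, here is a sketch of the same idea using rlang's `parse_expr()` and the `!!!` splice operator. `filter_str` is a hypothetical name, and this is a different technique from the gist's helper:

```r
library(dplyr)

# Hypothetical helper: each string in ... is parsed into an expression
# and spliced into dplyr::filter() as a separate condition
filter_str <- function(.data, ...) {
  conds <- lapply(c(...), rlang::parse_expr)
  dplyr::filter(.data, !!!conds)
}

fast <- filter_str(mtcars, "mpg > 30", "cyl == 4")
```

As with any string-evaluation approach, the condition strings are executed as code, so they should come from trusted input only.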