Matt Dray matt-dray

library(dplyr)
library(Rcrawler)
library(purrr)
library(httr)
x <- c(
  "https://rostrum.blog",
  "https://therostrumblog.wordpress.com/"
)
matt-dray / embed_iframe.R
Created September 17, 2018 13:50
Embed slides into blogdown
knitr::include_url("https://matt-dray.github.io/earl18-presentation/")
matt-dray / join-data-two-databases.R
Last active September 19, 2018 13:09
Use odbc and DBI to connect to two SQL databases, query them and join the outputs
# Connecting R to SQL (with DBI and odbc)
# Original: 18 July 2018
# Matt Dray
# Goal: join two datasets from different databases
# Basic approach:
# 1. Connect to database A, perform query, disconnect
# 2. As above for database B
# 3. Join the two dataframes returned from steps 1 and 2
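The three steps above can be sketched as follows. This is a minimal illustration, not the author's script: in-memory SQLite databases (via the RSQLite package) stand in for the two real SQL servers, and the table names, column names and values are made up. With odbc you would pass odbc::odbc() and your own connection details to DBI::dbConnect() instead.

```r
library(DBI)

# Step 1: connect to database A, perform query, disconnect
# (assumption: an in-memory SQLite database stands in for the real server)
con_a <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con_a, "schools", data.frame(id = c(1, 2, 3), name = c("A", "B", "C")))
schools <- dbGetQuery(con_a, "SELECT id, name FROM schools")
dbDisconnect(con_a)

# Step 2: as above for database B (again a stand-in with made-up rows)
con_b <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con_b, "counts", data.frame(id = c(1, 2, 4), n_pupils = c(100, 250, 80)))
counts <- dbGetQuery(con_b, "SELECT id, n_pupils FROM counts")
dbDisconnect(con_b)

# Step 3: join the two data frames returned from steps 1 and 2
# merge() keeps only ids present in both by default (an inner join)
joined <- merge(schools, counts, by = "id")
```

Because each query result is an ordinary data frame, the join happens entirely in R, so the two databases never need to know about each other.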
matt-dray / get_gias.R
Last active September 19, 2018 13:16
Download today's latest Get Information About Schools data
# This code downloads the GIAS CSV from the web
# The URL is in the form [standard path][creation date][.csv], so we can just
# change the creation date to today's date using Sys.Date()
gias <- data.table::fread(
  paste0(
    "http://ea-edubase-api-prod.azurewebsites.net/edubase/edubasealldata",
    stringr::str_replace_all(Sys.Date(), "-", ""),
    ".csv"
  )
)
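The URL construction can be checked on its own before any download is attempted. The sketch below wraps it in a helper; the function name build_gias_url is hypothetical, and format() is used here as a base R alternative to the stringr call. Whether the service still responds at this address is not something the snippet verifies.

```r
# Hypothetical helper: build the GIAS download URL for a given date
build_gias_url <- function(date = Sys.Date()) {
  paste0(
    "http://ea-edubase-api-prod.azurewebsites.net/edubase/edubasealldata",
    format(date, "%Y%m%d"),  # e.g. 2018-09-19 becomes "20180919"
    ".csv"
  )
}

# Build the URL for a fixed date so the result is reproducible
example_url <- build_gias_url(as.Date("2018-09-19"))
```

Passing the date as an argument makes it easy to request an earlier day's file if today's has not been published yet.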
matt-dray / extract_tweets_function.R
Last active September 19, 2018 13:25
Function to automate generation of RDS, simple CSV and plot using rtweet
# Function to automate generation of RDS, simple CSV and plot using rtweet
# Matt Dray
# March 2018
# Purpose: create an RDS, a simplified CSV and a plot of tweets containing a
# search term from the rtweet::get_tweets function, and save them to a folder
# with a unique descriptive name related to the search term. Assumes you have
# an 'output' folder in your home directory to store these files. Assumes
# you've already sorted out a Twitter token as per
# http://rtweet.info/articles/auth.html
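The naming step described above could look something like the sketch below. The function body is not shown in this preview, so everything here is an assumption: the helper name make_output_paths, the timestamp format and the choice of a .png plot file are all hypothetical. It uses base R only and does not call rtweet.

```r
# Hypothetical helper: build unique, descriptive output paths for a search term
make_output_paths <- function(term,
                              dir = file.path("~", "output"),
                              stamp = format(Sys.time(), "%Y%m%d-%H%M%S")) {
  # Lower-case the term and collapse anything that is not a letter or digit
  slug <- gsub("[^A-Za-z0-9]+", "-", tolower(term))
  slug <- gsub("^-|-$", "", slug)  # trim a leading or trailing hyphen
  stem <- file.path(dir, paste0(slug, "_", stamp))
  c(
    rds  = paste0(stem, ".rds"),
    csv  = paste0(stem, ".csv"),
    plot = paste0(stem, ".png")
  )
}

# Fixed directory and timestamp so the example is reproducible
paths <- make_output_paths("#rstats", dir = "output", stamp = "20180319-120000")
```

The returned paths could then be handed to saveRDS(), write.csv() and a plotting device in turn, with the tweets object being whatever the rtweet call returns.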
matt-dray / preferred_rmd_yaml.yaml
Last active September 19, 2018 13:26
R Markdown YAML header for current date, nice theme/highlighting/table of contents
---
title: "Title"
subtitle: "Subtitle"
author: "Name"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  html_document:
    theme: cerulean
    highlight: tango
    number_sections: yes
matt-dray / googlesheets_test.R
Last active September 19, 2018 13:27
Basic functions from the googlesheets package for R
# Basic googlesheets functions
# Jan 2018
# https://github.com/jennybc/googlesheets
# Browser sign-in required on first function execution
# 1. Load package ----
# install.packages("googlesheets")
library(googlesheets)
matt-dray / sparklyr-test.R
Created September 21, 2018 13:56
Testing the sparklyr package for Coffee & Coding
# Testing sparklyr for Coffee & Coding
# Matt Dray
# 29 Nov 2017
# Following https://spark.rstudio.com/
# R 3.4.2 & RStudio 1.1.383
# What? -------------------------------------------------------------------
matt-dray / dbi-odbc-join.R
Created September 21, 2018 14:00
Example of getting data from two different SQL databases and joining them in R
# Coffee & Coding: connecting R to SQL (with DBI and odbc)
# DfE-specific example
# 18 July 2018
# Session: Cathy
# This script: Matt
# Basic approach:
# 1. Connect to database A, perform query, disconnect
# 2. As above for database B
# 3. Join the two dataframes returned from steps 1 and 2
matt-dray / gac-xml-df-csv.R
Last active October 3, 2018 13:22
Read XML files to dataframes in a list and then save each as a CSV
# FIREBREAK Q2 2018
# Government Art Collection
# Matt Dray
# 2 October 2018
# Purpose: Wrangle XML files output from GAC database to CSV
# Call packages -----------------------------------------------------------
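The rest of the script is not shown in this preview, so the following is only a sketch of the general pattern in the description: read each XML file into a data frame, hold the data frames in a named list, then write one CSV per list element. It assumes the xml2 package, a made-up file layout in which each row is a <record> element with simple child fields, and that every record has the same fields. The helper name xml_to_df is hypothetical.

```r
library(xml2)

# Hypothetical helper: one XML file becomes one data frame, one row per record
xml_to_df <- function(path, record_xpath = "//record") {
  records <- xml_find_all(read_xml(path), record_xpath)
  rows <- lapply(records, function(rec) {
    fields <- xml_children(rec)
    # Child element names become column names, their text becomes the values
    as.data.frame(
      as.list(setNames(xml_text(fields), xml_name(fields))),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Made-up example file written to a temporary folder
in_dir <- file.path(tempdir(), "gac_xml")
dir.create(in_dir, showWarnings = FALSE)
writeLines(
  paste0(
    "<records>",
    "<record><id>1</id><title>Portrait</title></record>",
    "<record><id>2</id><title>Landscape</title></record>",
    "</records>"
  ),
  file.path(in_dir, "works.xml")
)

# Read every XML file into a list of data frames, named after the files
xml_files <- list.files(in_dir, pattern = "\\.xml$", full.names = TRUE)
dfs <- lapply(xml_files, xml_to_df)
names(dfs) <- tools::file_path_sans_ext(basename(xml_files))

# Save each data frame as a CSV alongside its source file
for (nm in names(dfs)) {
  write.csv(dfs[[nm]], file.path(in_dir, paste0(nm, ".csv")), row.names = FALSE)
}
```

Naming the list elements after the source files keeps the link between each CSV and the XML it came from.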