James Clawson jmclawson
---
title: "Development Indicators by Continent"
author: "Gapminder Analytics Group"
format: dashboard
---
# Charts
```{r}
#| label: setup
jmclawson / linear-models-slides.qmd
Last active November 16, 2023 19:34
Understanding and using linear models in R
---
title: "Understanding and using linear models in R"
subtitle: "A 10-slide guide to the dark side"
format:
revealjs:
slide-number: true
embed-resources: true
warning: false
message: false
theme: dark
jmclawson / corpus_micusp.R
Last active November 4, 2023 16:36
Functions for retrieving metadata and texts from the Michigan Corpus of Upper-Level Student Papers (https://elicorpora.info/main)
# helper function get_if_needed for downloading online documents exactly once: https://gist.github.com/jmclawson/65899e2de6bfee692b08141a98422240
source("https://gist.githubusercontent.com/jmclawson/65899e2de6bfee692b08141a98422240/raw/7c5590377332e427691f2331b69abd58be2141ec/get_if_needed.R")
get_micusp_metadata <- function(micusp_dir = "micusp"){
get_if_needed("https://elicorpora.info/browse?mode=download&start=1&sort=dept&direction=desc",
filename = "micusp_metadata.csv",
destdir = micusp_dir)
# read from the same directory the file was downloaded to
readr::read_csv(file.path(micusp_dir, "micusp_metadata.csv"), show_col_types = FALSE) |>
janitor::clean_names()
jmclawson / move_state.R
Last active September 19, 2023 13:20
Shift or rotate states when mapping (good for simpler US maps)
# This process is adapted from https://sesync-ci.github.io/blog/transform-Alaska-Hawaii.html
# Here, it's offered as a function to simplify trial-and-error.
move_state <- function(
df, # spatial dataframe
choice, # value in "state" column
rotation, # eg -39 * pi/180
right, # amount to move eastward; use negative for left/west
up # amount to move northward; use negative for down/south
){
jmclawson / collapse_rows.R
Last active September 4, 2023 16:19
something like "collapse rows" for gt
collapse_rows <- function(df_g, col, lookleft = TRUE){
col_num <- grep(deparse(substitute(col)), colnames(df_g$`_data`))
collapse_style <- css(visibility = "hidden",
border_top = "0px")
test_rows <- function(x) ifelse(is.na(x == lag(x)), FALSE, x == lag(x))
if(col_num > 1 && lookleft) {
col_left <- as.name(colnames(df_g$`_data`)[col_num - 1])
---
title: "Special cases of data importing in R"
---
The usual import functions make it straightforward to read a single CSV file, but data is often prepared in some other form. Common scenarios include a folder of smaller CSV files and data prepared for use with SAS.
## Importing many CSV files
A common task is to read in multiple CSV files and combine the resulting data frames. Once there are more than a few files, the process should be automated.
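As a minimal sketch of that automation, the chunk below lists every CSV file in a folder, reads each one, and stacks the results into a single data frame. It uses base R only. The folder `csv_parts` and the files `part_a.csv` and `part_b.csv` are hypothetical and are created here just so the example runs on its own; they do not come from the original document.

```r
# Hypothetical folder and files, written to a temporary directory
# only so this sketch is self-contained.
csv_dir <- file.path(tempdir(), "csv_parts")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, value = c(10, 20)),
          file.path(csv_dir, "part_a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, value = c(30, 40)),
          file.path(csv_dir, "part_b.csv"), row.names = FALSE)

# List every CSV file in the folder, keeping the full path to each.
csv_files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Read each file into its own data frame, then stack them row-wise.
# This assumes all files share the same column names.
combined <- do.call(rbind, lapply(csv_files, read.csv))
```

The same pattern is often written in the tidyverse with `purrr::map()` and `dplyr::bind_rows()`, which also tolerates files whose columns do not match exactly.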
jmclawson / sf_example.rmd
Last active July 1, 2023 04:56
using sf to map from external shapefiles and from packages
---
title: "Mapping with R"
output:
html_document:
df_print: paged
toc: true
toc_float: true
date: "2023-06-30"
---
jmclawson / haven_example.rmd
Created June 12, 2023 22:07
from SAS to R
---
title: "Importing SAS data into R"
---
## Get the data
```{r}
# Just download it once
if(!file.exists("medical.sas7bdat")) {
download.file("http://www.principlesofeconometrics.com/sas/medical.sas7bdat",
jmclawson / topic_model.R
Created May 4, 2023 13:15
Functions for building a topic model and exploring it. Visualizations include document-level distributions (static and interactive), word distributions per topic, and topic word clouds.
library(wordcloud)
library(topicmodels)
library(plotly)
# Moves a table of texts through the necessary
# steps of preparation before building a topic
# model. The function applies these steps:
# 1. identifies text divisions by the `doc_id`
# column
# 2. divides each of the texts into same-sized
jmclawson / unnest_without_caps.R
Last active November 4, 2023 16:31
Applies tidytext's unnest_tokens() function but also filters out any word that appears in the text only with a capital letter. In English texts, this should be a quick way to remove all proper nouns.
unnest_without_caps <- function(
df,
column = "text") {
full <- df |>
tidytext::unnest_tokens(word, {{column}}, to_lower = FALSE)
big <- full |>
dplyr::filter(stringr::str_detect(word, "^[A-Z]")) |>
dplyr::pull(word)