James Clawson jmclawson

@jmclawson
jmclawson / convertSPSS.R
Created July 12, 2016 18:44
Convert SPSS data files to CSV and Stata using R
library(foreign)
mytest.alldata <- list()
mytest.csvfile <- c()
mytest.dtafile <- c()
mytest.filelist <- list.files(path="data",pattern="\\.sav$",recursive=FALSE,full.names=TRUE)
for (number in seq_along(mytest.filelist)) {
mytest.alldata[[number]] <- read.spss(mytest.filelist[number],to.data.frame = TRUE)
mytest.csvfile[number] <- gsub("^data/|\\.sav$","",mytest.filelist[number])
mytest.csvfile[number] <- paste(mytest.csvfile[number],".csv",sep="")
mytest.dtafile[number] <- sub("\\.csv$",".dta",mytest.csvfile[number])
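The file-name handling in this preview can also be sketched with base R path helpers instead of pattern substitution. This is a minimal sketch under stated assumptions, not the gist's own code: the input paths are made up, and `basename()` with `tools::file_path_sans_ext()` is an assumed alternative to the `gsub()` calls.

```r
# Hypothetical input paths; the gist builds its list with list.files().
sav_files <- c("data/survey_2014.sav", "data/csv_export.sav")

# Drop the directory and the extension to get a bare stem for each file.
stems <- tools::file_path_sans_ext(basename(sav_files))

# Build both output names from the stem rather than from each other.
csv_files <- paste0(stems, ".csv")
dta_files <- paste0(stems, ".dta")

print(csv_files)
print(dta_files)
```

Working from the stem avoids touching a `csv` or `sav` that happens to appear elsewhere in a file name.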
@jmclawson
jmclawson / nounsplitter.R
Last active July 12, 2016 18:53
Strip out all but common nouns in a given text file, using part-of-speech tagging
library(NLP)
library(openNLP)
# credit due to http://stackoverflow.com/questions/30995232/how-to-use-opennlp-to-get-pos-tags-in-r
## SET "data.dir" to a directory containing text files. It's not designed for nested directories.
data.dir <- "data/texts/"
## SET "saveas.file" to the destination directory. Resulting files have an "n" prepended to the file name, but it may be useful to save to a separate directory.
saveas.file <- "data/texts-n/"
@jmclawson
jmclawson / wiki_corpus.R
Last active August 6, 2019 16:15
building a corpus of titles from Wikipedia
base_beg <- "https://en.wikipedia.org/wiki/Category:"
base_end <- "th-century_novels"
get_cat_pages <- function(){
categories <<- data.frame(century=c(),
nation=c(),
url=c(),
stringsAsFactors = FALSE)
for (century in centuries){
cat_url <- paste0(base_beg,century,base_end)
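The URL construction in this preview can be tried on its own. A minimal sketch, assuming a small vector for `centuries` (the preview does not show where the gist defines it):

```r
base_beg <- "https://en.wikipedia.org/wiki/Category:"
base_end <- "th-century_novels"

# Assumed values for illustration; the gist defines `centuries` elsewhere.
centuries <- 18:20

# paste0() is vectorized, so all category URLs can be built in one call.
cat_urls <- paste0(base_beg, centuries, base_end)
print(cat_urls)
```

Because the suffix is a fixed "th", this pattern only suits centuries whose ordinal ends in "th".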
remotes::install_github("kjhealy/covdata")
library(covdata)
library(dplyr)
library(ggplot2)
library(ggrepel)
library(tidyr)
covus_wide <- covus %>%
select(date, state, measure, count) %>%
pivot_wider(id_cols = c(date, state),
library(stylo)
library(ggplot2)
library(dendextend)
# Load with this line: devtools::source_gist("79cd732fb9e6a4f1dd14f1caaac4ee2d")
# Use df <- stylo() to save frequency results
# Then use stylo2gg(df) to visualize principal components
# Use stylo2gg(df, viz="hc") to show hierarchical clusters without rerunning stylo
stylo2gg <- function(df,
library(tidyverse)
library(tidytext)
library(reshape2)
library(wordVectors)
##### Modeling a Corpus #####
# This process for preparing and modeling the corpus is adapted from Women Writers Project's template_word2vec.Rmd
# These adaptations should allow for preservation of modeling settings to aid in replicability.
# After training the model, recall its setting parameters by exploring the object's attributes.
@jmclawson
jmclawson / mkcomprangezero.tex
Last active June 14, 2021 23:53
Provides an alternative to \mkcomprange for printing a leading zero in a compressed range of numbers less than 10.
\newrobustcmd*{\mkcomprangezero}{%
\begingroup
\@ifstar
{\blx@range@aux\blx@comprange@ii}
{\blx@range@aux\blx@comprange@i}}
\def\blx@comprange@i[#1][#2]#3{%
\let\blx@tempa\@empty
\protected\def\blx@range@out@value{\appto\blx@tempa}%
\def\blx@range@out@item@process{#2}%
\let\blx@range@out@delim\blx@range@out@value
@jmclawson
jmclawson / prepare_bib.R
Last active June 16, 2021 13:51
Pre-process a bib file for clean use in documentation
# To prepare a .bib file for use in documentation, import it, strip BibDesk's extra fields and additions, and enclose each entry with code compatible with LaTeX's {listings} package.
library(dplyr)
library(stringr)
library(readr)
# 0. Set relative file path for the bibfile
# setwd()
# 1. read the bib file as a vector of lines
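The first steps described above (read the file as lines, then strip BibDesk's additions) might look roughly like the sketch below. It is not the gist's code: the sample entry is made up, base R stands in for the `readr` and `stringr` calls to keep the sketch dependency-free, and the field prefixes (`bdsk-`, `date-added`, `date-modified`) are an assumption about which fields count as BibDesk's extras.

```r
# Write a tiny, made-up .bib file to a temporary location.
bib_path <- tempfile(fileext = ".bib")
writeLines(c(
  "@book{doe2020,",
  "  author = {Doe, Jane},",
  "  title = {An Example},",
  "  date-added = {2021-06-01 10:00:00 -0500},",
  "  bdsk-url-1 = {https://example.org}",
  "}"
), bib_path)

# Read the file as a character vector, one element per line.
bib_lines <- readLines(bib_path)

# Flag lines that start with a field assumed to be added by BibDesk.
is_bibdesk <- grepl("^[[:space:]]*(bdsk-|date-added|date-modified)",
                    bib_lines, ignore.case = TRUE)

# Keep everything else.
clean_lines <- bib_lines[!is_bibdesk]
print(clean_lines)
```

The sample puts the closing brace on its own line for simplicity; a line-based filter like this would need extra care if a removed field also carried the entry's closing brace.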
@jmclawson
jmclawson / bibtex_documentation.sty
Last active June 16, 2021 16:03
Introduce .bib-file data directly into LaTeX documentation using \bibcitem{citekey}
\usepackage{listings}
\usepackage{xcolor}
\let\oldaddbibresource\addbibresource
\renewcommand{\addbibresource}[1]{%
\oldaddbibresource{#1}%
\expandafter\newcommand\csname thebibfile\endcsname{#1}%
}
% \makeatletter
@jmclawson
jmclawson / recreationthursday_2021-07-15.R
Last active July 16, 2021 21:15
Code for #RecreationThursday for July 15
library(tidyverse)
# make a pinwheel: first set up directions. The blades are drawn in different orders for clockwise and counterclockwise
clockwise_t <- c(2, 1, 3, 4)
clockwise_f <- c(4, 3, 1, 2)
direction <- list(clockwise_t, clockwise_f)
# create a 4-color pinwheel with 4 blades facing the same direction
get_pinwheel <-
function(