Shawn Graham shawngraham

View grab-identifiers-dpla.R
install.packages("devtools")
devtools::install_github("ropensci/rdpla")
library('rdpla')
mykey = "PUT YOUR KEY FROM DP.LA HERE"
# do a query; here we want ids which we can feed to wget
itemlist = items(key=mykey, q="science", date_before=1900, page_size=100, fields=c("id"))
# this writes the ids to a csv; open it in a spreadsheet and remove the first row if it's not an id
write.csv(itemlist$data, "itemlist.csv", row.names=FALSE)
# save the csv as txt (UTF-8), then you can pass it to wget as in Exercise 4 at
# https://github.com/hist3907b-winter2015/module2-findingdata/blob/master/m2-exercises.md
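The spreadsheet step described in the comments above could also be done in R. The following is a minimal sketch, not part of the original gist: `write_ids` is a hypothetical helper name, the output file name is arbitrary, and it assumes the ids arrive as a single vector or column (for example `itemlist$data$id`, assuming the result has a column named `id`). Stand-in ids are used so that it runs without a DPLA key.

```r
# Hypothetical helper (not from the gist): write one id per line,
# with no header row and no quotes, as UTF-8 text.
write_ids <- function(ids, path = "itemlist.txt") {
  ids <- as.character(ids)
  # drop missing or empty entries so no blank lines end up in the file
  ids <- ids[!is.na(ids) & nzchar(ids)]
  writeLines(enc2utf8(ids), path, useBytes = TRUE)
  invisible(path)
}

# Stand-in data; with a real query you might pass itemlist$data$id instead
# (assumption: the column is called "id").
example_ids <- data.frame(id = c("abc123", "def456"), stringsAsFactors = FALSE)
write_ids(example_ids$id, file.path(tempdir(), "itemlist.txt"))
```

The resulting text file has one id per line and no header, so there is no first row to remove before handing it on to the wget exercise.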
shawngraham / getting a History Machine
Last active Aug 29, 2015
setting up a history research machine. Follow the instructions.
View getting a History Machine
# By Ben Marwick, from: https://gist.github.com/benmarwick/11204658 with modifications by S. Graham
Short instructions to set up a Lubuntu Virtual Machine with
R & RStudio:
1. Download these:
http://lubuntu.net/ (Intel x86 desktop cd)
https://www.virtualbox.org/wiki/Downloads (Oracle VM VirtualBox)
2. Install Oracle VM VirtualBox, open it (if using windows,
View CND-topic-model.r
setwd("desktop/beals-new")
# give yourself as much memory as you've got
options(java.parameters = "-Xmx5120m")
library(rJava)
## from http://cran.r-project.org/web/packages/mallet/mallet.pdf
library(mallet)
# The CND XML file was transformed in the browser into a CSV table, then copied and pasted into Excel and saved as CSV. Cut the column headers and paste them in the line below:
library(RCurl)
shawngraham / archaeology-geolocatedtweets.geojson
Created Feb 17, 2015
twarc scrape of "archaeology", geolocated tweets
View archaeology-geolocatedtweets.geojson
shawngraham / gist:7efd64c08a94c39a593f
Last active Aug 29, 2015
CND-topic-model-with-guidance.rmd
View gist:7efd64c08a94c39a593f
---
title: "Topic Modeling the Colonial Newspaper Database"
author: "Shawn Graham"
date: "February 17, 2015"
output: html_document
---
In [Module 3](https://github.com/hist3907b-winter2015/module3-wranglingdata), we used TEI to mark up primary documents. Melodee Beals has been using TEI to mark up newspaper articles, creating the [Colonial Newspapers Database](https://github.com/mhbeals/Colonial-Newspaper-Database) (which she shared on GitHub). We then used GitHub Pages and an XSLT stylesheet to convert that database into a table of comma-separated values <https://raw.githubusercontent.com/shawngraham/exercise/gh-pages/CND.csv>. We are now going to topic model the text of those newspaper articles, to see what patterns of discourse may lie within.
# Getting Started
View geolooting.geojson
shawngraham / id.txt
Created Feb 19, 2015
list of 'looting' tweets by id - use the twarc hydrate command to get the original tweets again (thus complying with the Twitter terms of service)
View id.txt
id
568407121496137000
568407114378395000
568407104077193000
568407096242253000
568407089673957000
568407069016981000
568407057214234000
568406964599791000
568406941086527000
shawngraham / geolooting-russian.geojson
Created Feb 19, 2015
geolocated tweets with russian 'мародерство' ('looting')
View geolooting-russian.geojson
shawngraham / antiquities.geojson
Created Feb 19, 2015
antiquities via twarc, geotagged
View antiquities.geojson
shawngraham / geolootedtweets.json
Created Feb 19, 2015
'looted' search on twitter, geolocated tweets
View geolootedtweets.json