View grab-identifiers-dpla.R
install.packages("devtools")
devtools::install_github("ropensci/rdpla")
library('rdpla')
mykey = "PUT YOUR KEY FROM DP.LA HERE"
# do a query; here we want ids which we can feed to wget
itemlist = items(key=mykey, q="science", date_before=1900, page_size=100, fields=c("id"))
# this will write the ids to a csv file; you'll need to open it in a spreadsheet and remove the first row if it's not an id
write.csv(itemlist$data, "itemlist.csv", row.names=FALSE)
# save the csv as txt (UTF-8), then you can pass it to wget as in Exercise 4 at
# https://github.com/hist3907b-winter2015/module2-findingdata/blob/master/m2-exercises.md
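The spreadsheet step in the comments above (open the csv, drop the header row, save as UTF-8 text) can probably be skipped by writing the ids straight to a text file. The following is a minimal sketch, not part of the original gist: `write_ids` is a hypothetical helper, and `ids_df` is a made-up stand-in for `itemlist$data`, assumed to be a data frame with an `id` column.

```r
# Stand-in for itemlist$data; the real object comes from the rdpla query.
# These ids are invented for illustration.
ids_df <- data.frame(id = c("abc123", "def456", "ghi789"),
                     stringsAsFactors = FALSE)

# Hypothetical helper: write one id per line, no header row, UTF-8 encoded.
write_ids <- function(df, path) {
  con <- file(path, open = "w", encoding = "UTF-8")
  on.exit(close(con))
  writeLines(as.character(df$id), con)
  invisible(path)
}

# Written to a temporary directory here; use "itemlist.txt" in practice.
out <- write_ids(ids_df, file.path(tempdir(), "itemlist.txt"))
readLines(out)
```

The resulting file has no header row to delete, so it should be usable as a list of ids for the wget exercise without a round trip through a spreadsheet.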
View getting a History Machine
# By Ben Marwick, from: https://gist.github.com/benmarwick/11204658 with modifications by S. Graham
Short instructions to set up a Lubuntu Virtual Machine with
R & RStudio:
1. Download these:
http://lubuntu.net/ (Intel x86 desktop cd)
https://www.virtualbox.org/wiki/Downloads (Oracle VM VirtualBox)
2. Install Oracle VM VirtualBox, open it (if using Windows,
View CND-topic-model.r
setwd("desktop/beals-new")
# give yourself as much memory as you've got
options(java.parameters = "-Xmx5120m")
library(rJava)
## from http://cran.r-project.org/web/packages/mallet/mallet.pdf
library(mallet)
#CND xml file transformed in browser into csv table. copy & paste into excel, saved as csv. Cut the column headers and paste them in the line below:
library(RCurl)
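A note on the memory setting in the script above: rJava reads `java.parameters` when it starts the Java virtual machine, which is why the `options()` call comes before `library(rJava)`. The sketch below wraps that setting in a hypothetical helper, `set_java_heap`, which is not part of the original script; it only builds the `-Xmx` flag and warns if rJava is already loaded.

```r
# Hypothetical helper: build the -Xmx flag from a heap size in megabytes
# and store it where rJava looks for it at JVM start-up.
set_java_heap <- function(mb) {
  if ("rJava" %in% loadedNamespaces()) {
    warning("rJava is already loaded; the new heap size may not take effect until R is restarted")
  }
  flag <- paste0("-Xmx", as.integer(mb), "m")
  options(java.parameters = flag)
  invisible(flag)
}

# Same value as the script above: 5120 MB.
set_java_heap(5120)
getOption("java.parameters")
```

Pick a value that fits the machine; asking for more heap than the available RAM is unlikely to help.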
View archaeology-geolocatedtweets.geojson
View gist:7efd64c08a94c39a593f
---
title: "Topic Modeling the Colonial Newspaper Database"
author: "Shawn Graham"
date: "February 17, 2015"
output: html_document
---
In [Module 3](https://github.com/hist3907b-winter2015/module3-wranglingdata), we used TEI to mark up primary documents. Melodee Beals has been using TEI to mark up newspaper articles, creating the [Colonial Newspapers Database](https://github.com/mhbeals/Colonial-Newspaper-Database) (which she shared on GitHub). We then used GitHub Pages and an XSLT stylesheet to convert that database into a table of comma-separated values <https://raw.githubusercontent.com/shawngraham/exercise/gh-pages/CND.csv>. We are now going to topic model the text of those newspaper articles, to see what patterns of discourse may lie within.
# Getting Started
View geolooting.geojson
View id.txt
id
568407121496137000
568407114378395000
568407104077193000
568407096242253000
568407089673957000
568407069016981000
568407057214234000
568406964599791000
568406941086527000
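The ids listed above all end in 000. That is consistent with 18-digit ids having passed through a spreadsheet or a numeric column, both of which keep only about 15 significant digits, although this is an inference and not something the file itself states. A minimal sketch of one way to avoid that kind of loss is to read the id column as text. The ids below are invented for illustration and are not taken from id.txt.

```r
# Hypothetical ids for illustration only.
csv_text <- "id\n568407121496137216\n568407114378395649"

# colClasses = "character" stops read.csv from converting the column to numeric.
ids <- read.csv(text = csv_text, colClasses = "character")

# One id per line, no header row, written to a temporary file.
out_path <- file.path(tempdir(), "id.txt")
writeLines(ids$id, out_path)
readLines(out_path)
```

Keeping ids as character strings from the first read to the final write means every digit survives, whatever tool produced the original csv.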
View geolooting-russian.geojson
View antiquities.geojson
View geolootedtweets.json