agoldst

## woolf-bennett.R
# the metadata.R script (for read.citations()) is part of
# this git repository:
# http://github.com/agoldst/dfr-analysis
# So change this path as needed
source("~/Developer/dfr-analysis/metadata.R")

bennett.df <- read.citations("bennett.csv")
woolf.df <- read.citations("woolf.csv")

# Now bind the two together, using columns to flag AB and VW hits
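The preview ends before the binding step is shown, so what follows is only a minimal sketch of one way to do it, not the author's code. It uses toy data frames in place of the output of `read.citations()`; the `id` and `title` columns and the name `both.df` are assumptions made for illustration.

```r
# Toy stand-ins for the data frames returned by read.citations();
# the `id` and `title` columns are assumptions for illustration only.
bennett.df <- data.frame(id = c("a1", "a2", "a3"),
                         title = c("One", "Two", "Three"),
                         stringsAsFactors = FALSE)
woolf.df <- data.frame(id = c("a2", "a3", "a4"),
                       title = c("Two", "Three", "Four"),
                       stringsAsFactors = FALSE)

# Flag each set of hits, then do a full outer join on the shared columns
bennett.df$AB <- TRUE
woolf.df$VW <- TRUE
both.df <- merge(bennett.df, woolf.df, by = c("id", "title"), all = TRUE)

# Rows found in only one search have NA in the other flag; make those FALSE
both.df$AB[is.na(both.df$AB)] <- FALSE
both.df$VW[is.na(both.df$VW)] <- FALSE
```

With `all = TRUE`, an item that turns up in both searches appears once with both flags set, rather than twice as it would after a plain `rbind()`.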

## threepercent-post.Rmd
As long promised, here are some links to the data on U.S. literary translation that I showed in a table during our discussion of Casanova.
By kind permission of Chad Post, I can make available an aggregate data file of all the literature translations catalogued by Three Percent. I've decided to put the data file, together with some scripts and information about the munging, in a [github repository](http://github.com/agoldst/threepercent). The data consists of a single CSV file with one line for each title: [all_titles.csv](https://github.com/agoldst/threepercent/blob/master/all_titles.csv) ([Wikipedia on CSV format](http://en.wikipedia.org/wiki/Comma-separated_values)).

I have produced this by exporting the first "sheet" of each of the five yearly spreadsheets available at [the Three Percent Translation Database](http://www.rochester.edu/College/translation/threepercent/index.php?s=database) and then combining the files. According to Chad Post, updated data will be available soon, at which point I can reprodu
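The combining step described above might look roughly like this in R. This is a sketch under stated assumptions, not the script in the repository: the file names, the `Title` and `Language` columns, and the added `Year` column are all hypothetical, and the two small files are written to a temporary directory only so that the example is self-contained.

```r
# Hypothetical yearly exports: two tiny CSV files in a temporary directory.
# File names and column names are assumptions, not the real data layout.
export_dir <- tempfile("threepercent")
dir.create(export_dir)
write.csv(data.frame(Title = c("A", "B"), Language = c("French", "Spanish")),
          file.path(export_dir, "2008.csv"), row.names = FALSE)
write.csv(data.frame(Title = "C", Language = "German"),
          file.path(export_dir, "2009.csv"), row.names = FALSE)

# Read every yearly file, tag rows with the year taken from the file name,
# and stack the results into one data frame with one row per title
files <- list.files(export_dir, pattern = "\\.csv$", full.names = TRUE)
yearly <- lapply(files, function(f) {
    frm <- read.csv(f, stringsAsFactors = FALSE)
    frm$Year <- as.integer(sub("\\.csv$", "", basename(f)))
    frm
})
all_titles <- do.call(rbind, yearly)
```

Stacking with `rbind()` requires every yearly file to have the same columns, so any differences between years would have to be reconciled before this step.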

## html_clean.hs

import Text.Pandoc

{-
This script uses the Pandoc library to do two transformations
needed on the way from my mixed markdown/LaTeX syllabus sources to a
single HTML file:

1. Transform the slightly garbled html produced by tex4ht from LaTeX
source containing a biblatex bibliography by getting rid of definition

## dfr_check.R
# for this file, clone http://github.com/agoldst/dfr-analysis
source("~/Developer/dfr-analysis/metadata.R")
library(plyr)
library(stringr)

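# Read a two-column CSV of word counts (one header line, then word,weight rows)
# and return the counts as an integer vector named by word
wordcounts_v <- function (f) {
    frm <- scan(f, what=list(word=character(), weight=integer()),
                sep=",", skip=1, quiet=TRUE)
    result <- frm$weight
    names(result) <- frm$word
    result
}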

## emp2.md

      
Markdown etc. demo: source for a blogpost on…markdown.
Last active April 21, 2020 01:59

Empowerment Part II

The actual "empowerment" (modest but real) comes in getting a more detailed understanding of the way the systems we already use handle text, and in learning more ways to manipulate that text, beyond the confines of any single program. The business of plain-text-slinging, a minor craft on its own, nonetheless forms a natural starting point for thinking more deeply about analyzing digitized texts, expressing yourself in "code" of various kinds, and composing in the digital medium.
Downloads

In order to do the workshop on your own, first install Pandoc and LaTeX (links above). Komodo Edit is optional; any text editor will do, though I'll occasionally refer to details in Komodo (menu items, etc.) that may be slightly different in other editors. See below for text editor suggestions.
The handout from the workshop (PDF)

  
## emp2-handout.md

      
Markdown source for a handout on markdown, HTML, and LaTeX, typesettable with pandoc + xelatex.
Created November 24, 2013 00:20

% DH@RU Workshop: Empowerment Part II
% Andrew Goldstone (andrew.goldstone@rutgers.edu)
% November 20, 2013
Markdown

Text conventions

*emphasis* or _emphasis_; **strong emphasis**


## laureates.R
library("httr")

r_lits <- GET("http://api.nobelprize.org/v1/prize.json",query=list(category="literature"))

laureates <- content(r_lits,"parsed")$prizes # JSON

ids <- sapply(laureates,function (psn) {
    psn$laureates[[1]]$id
})

## jekyll-pandoc.rb
require 'jekyll'
require 'pandoc-ruby' # add pandoc-ruby to your Gemfile

# Plugin for using pandoc as Jekyll markdown processor
# http://jekyllrb.com/docs/extras/ q.v.
# install in jekyll _plugins/ folder
# or Octopress plugins/

# In _config.yml, specify
# markdown: Pandoc      # capital P

## dh2014-slides.R


opts_chunk$set(echo=F,warning=F,prompt=F,comment="",
               autodep=T,cache=T,dev="tikz",
               fig.width=4.5,fig.height=3,size ='footnotesize',
               dev.args=list(pointsize=12))
options(width=70)
options(tikzDefaultEngine="xetex")
options(tikzXelatexPackages=c(
    "\\usepackage{tikz}\n",

## mallet-inference.R
# mallet-inference.R
#
# functions for using MALLET's topic-inference functionality: given an
# existing topic model, estimate topic proportions for new documents
#
# source() this file
#
# Workflow
# --------
#