Chris Forster c-forster

c-forster / noindent.py
Created May 15, 2015 18:51
pandoc filter to create a div class which removes indentation (also requires unprovided CSS for HTML output to work)
#!/usr/bin/python
from pandocfilters import toJSONFilter, RawInline, Para, Space, walk
def latex(s):
    return RawInline('latex', s)

def html(s):
    return RawInline('html', s)
def deindentParas(key, value, format, meta):
c-forster / drama_metadata-amended_summary.csv
Last active September 11, 2015 02:33
Tallies of Gender Detected in HathiTrust Dataset for Fiction, Poetry, and Drama
year  missing  undetected  men  women  norm.missing  norm.undetected     norm.men            norm.women
1704  1        0           0    0      1.0           0.0                 0.0                 0.0
1705  0        1           0    0      0.0           1.0                 0.0                 0.0
1706  0        0           1    0      0.0           0.0                 1.0                 0.0
1709  0        1           0    0      0.0           1.0                 0.0                 0.0
1710  0        1           0    0      0.0           1.0                 0.0                 0.0
1713  1        0           1    0      0.5           0.0                 0.5                 0.0
1714  0        1           2    0      0.0           0.3333333333333333  0.6666666666666666  0.0
1715  0        1           1    0      0.0           0.5                 0.5                 0.0
1716  0        2           2    0      0.0           0.5                 0.5                 0.0
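The norm.* columns in this preview look like each count divided by the row's total across the four categories: for 1714, the counts 0, 1, 2, 0 sum to 3. Here is a minimal Python 3 sketch of that normalization, assuming that reading is correct. The function name `normalize_counts` is hypothetical and does not come from the gist.

```python
def normalize_counts(missing, undetected, men, women):
    """Return each count as a fraction of the row total.

    Assumption: the norm.* columns are count / (sum of the four counts).
    """
    counts = (missing, undetected, men, women)
    total = sum(counts)
    if total == 0:
        # No works recorded for the year: avoid dividing by zero.
        return (0.0, 0.0, 0.0, 0.0)
    return tuple(c / total for c in counts)


# Row for 1714: 0 missing, 1 undetected, 2 men, 0 women.
print(normalize_counts(0, 1, 2, 0))
# → (0.0, 0.3333333333333333, 0.6666666666666666, 0.0)
```

The sketch relies on Python 3 true division; under Python 2 the counts would need converting to floats first.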
c-forster / extractComparisonSetForBSPF.py
Last active September 4, 2015 18:30
Reduces (Amended) HathiTrust Fiction Metadata to the Parameters of the BSPF Data
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Extract and summarize the gender breakdown for data comparable to
# that reported by Raven et al. in *The English Novel 1770-1829: A
# Bibliographical Survey of Prose Fiction Published in the British Isles*
# As a practical matter, this means:
# - works published between 1770 and 1830
# - published in England or Scotland or Ireland
# - individual works (remove duplicates, and count multivol works only once)
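The three criteria listed in the comments above could be sketched roughly as follows. This is not the gist's own code, which is cut off in this preview. The record layout (dicts with `year`, `place`, `author`, and `title` keys), the inclusive year bounds, and the author/title rule for spotting duplicates are all assumptions made for illustration; the real HathiTrust metadata fields are likely named differently.

```python
def comparison_set(records):
    """Filter records down to the criteria listed in the comments.

    Assumption: each record is a dict with the hypothetical keys
    'year' (int), 'place' (str), 'author' (str) and 'title' (str).
    """
    places = {"England", "Scotland", "Ireland"}
    seen = set()
    kept = []
    for rec in records:
        # Criterion 1: published between 1770 and 1830 (bounds assumed inclusive).
        if not 1770 <= rec["year"] <= 1830:
            continue
        # Criterion 2: published in England, Scotland or Ireland.
        if rec["place"] not in places:
            continue
        # Criterion 3: treat the same author and title as one work, so
        # duplicates and extra volumes of a multivolume work count once.
        key = (rec["author"].strip().lower(), rec["title"].strip().lower())
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept


sample = [
    {"year": 1778, "place": "England", "author": "A. Author", "title": "A Novel"},
    {"year": 1778, "place": "England", "author": "A. Author", "title": "A Novel"},
    {"year": 1850, "place": "England", "author": "B. Author", "title": "Too Late"},
    {"year": 1790, "place": "France", "author": "C. Author", "title": "Elsewhere"},
]
print(len(comparison_set(sample)))  # → 1
```

Matching on a normalized author and title pair is only one way to collapse volumes; real catalogue data would probably need fuzzier matching.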
c-forster / ht-comparison-set.csv
Created September 2, 2015 19:01
A Summary of Gender in a Subset of HathiTrust Data, Designed to Mimic that of the BSPF
year  totalWorks  male  female  undetected  namemissing
1770  8           4     0       4           0
1771  9           4     1       3           1
1772  5           2     1       2           0
1773  2           1     0       1           0
1774  12          5     0       5           2
1775  6           4     0       1           1
1776  4           0     0       3           1
1777  11          3     2       2           4
1778  5           2     1       1           1
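In every row of this preview, totalWorks equals the sum of the four category counts (for 1771, 4 + 1 + 3 + 1 = 9). Below is a small Python 3 sketch of that consistency check. It assumes the file is comma-separated with the column names shown; the function name `check_totals` is hypothetical, and the sample retypes the first three rows.

```python
import csv
import io

# First three rows of the table, retyped as comma-separated text
# (the delimiter is an assumption; the preview does not show it).
SAMPLE = """year,totalWorks,male,female,undetected,namemissing
1770,8,4,0,4,0
1771,9,4,1,3,1
1772,5,2,1,2,0
"""

CATEGORIES = ("male", "female", "undetected", "namemissing")


def check_totals(text):
    """Return the years whose category counts do not add up to totalWorks."""
    bad_years = []
    for row in csv.DictReader(io.StringIO(text)):
        parts = sum(int(row[name]) for name in CATEGORIES)
        if parts != int(row["totalWorks"]):
            bad_years.append(int(row["year"]))
    return bad_years


print(check_totals(SAMPLE))  # → []
```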
c-forster / src.Text.Pandoc.Writers.TEI.hs
Last active December 24, 2015 13:32
Trying to Write New Tests for a New Pandoc Writer
{-# LANGUAGE OverloadedStrings #-}
{-
Copyright (C) 2006-2015 John MacFarlane <jgm@berkeley.edu>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,