W. Caleb McDaniel (wcaleb)

wcaleb / gist:6354288
Last active December 21, 2015 19:29
Fix for Chicago fullnote CSL to put commas after author's name. Replace lines 970-979 of chicago-fullnote-bibliography-no-ibid.csl with these lines.
<group delimiter=", ">
  <group delimiter=" ">
    <group suffix=", ">
      <text macro="contributors-note"/>
    </group>
    <group delimiter=", ">
      <text macro="title-note"/>
    </group>
    <text macro="description-note"/>
  </group>
wcaleb / ocrpdf.sh
Created November 6, 2013 14:41
Take a PDF, OCR it, and add OCR Text as background layer to original PDF to make it searchable
#!/bin/sh
# Take a PDF, OCR it, and add OCR Text as background layer to original PDF to make it searchable.
# Hacked together using tips from these websites:
# http://www.jlaundry.com/2012/ocr-a-scanned-pdf-with-tesseract/
# http://askubuntu.com/questions/27097/how-to-print-a-regular-file-to-pdf-from-command-line
# Dependencies: pdftk, tesseract, imagemagick, enscript, ps2pdf
# Would be nice to use hocr2pdf instead so that the text lines up with the PDF image.
# http://www.exactcode.com/site/open_source/exactimage/hocr2pdf/
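The preview shows only the script's header comments. As an illustration of the pipeline those comments describe, here is a hedged Python sketch (not from the gist) that builds the command sequence; the flags shown are standard for these tools, but may need adjusting for your versions, and actually running them requires the dependencies listed above to be installed.

```python
import subprocess

def build_ocr_pipeline(pdf, base="scan"):
    """Return the command sequence for the OCR-and-layer workflow.

    Assumes imagemagick, tesseract, enscript, ps2pdf, and pdftk are on
    PATH; file names here are illustrative.
    """
    return [
        # 1. Rasterize the PDF so tesseract can read it.
        ["convert", "-density", "300", pdf, base + ".tif"],
        # 2. OCR the image; tesseract writes base.txt.
        ["tesseract", base + ".tif", base],
        # 3. Turn the recognized text back into a PDF via PostScript.
        ["enscript", "-p", base + ".ps", base + ".txt"],
        ["ps2pdf", base + ".ps", base + "-text.pdf"],
        # 4. Layer the text PDF behind the original to make it searchable.
        ["pdftk", pdf, "multibackground", base + "-text.pdf",
         "output", base + "-searchable.pdf"],
    ]

def run_pipeline(pdf):
    for cmd in build_ocr_pipeline(pdf):
        subprocess.run(cmd, check=True)
```

As the script's own comments note, `hocr2pdf` would align the text layer with the page image more precisely than this enscript/ps2pdf detour.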
wcaleb / Pong.py
Created December 29, 2013 20:20
Pong
from scene import *
import random
BALL_RADIUS = 20
POINT_RADIUS = 5
GUTTER = 120
PAD_WIDTH = 100
PAD_HEIGHT = 20
HALF_PAD_WIDTH = PAD_WIDTH / 2
HALF_PAD_HEIGHT = PAD_HEIGHT / 2
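The preview cuts off at the constants. For illustration, a minimal sketch (not part of the gist; function names are hypothetical) of how constants like these typically drive collision handling in a Pong loop:

```python
# Constants repeated from the gist's preview so the sketch is self-contained.
BALL_RADIUS = 20
PAD_WIDTH = 100
PAD_HEIGHT = 20
HALF_PAD_WIDTH = PAD_WIDTH / 2
HALF_PAD_HEIGHT = PAD_HEIGHT / 2

def ball_hits_paddle(ball_x, ball_y, pad_x, pad_y):
    """True when the ball's bounding circle overlaps the paddle's box."""
    return (abs(ball_x - pad_x) <= HALF_PAD_WIDTH + BALL_RADIUS and
            abs(ball_y - pad_y) <= HALF_PAD_HEIGHT + BALL_RADIUS)

def bounce(velocity_y):
    """Reflect the vertical velocity after a paddle hit."""
    return -velocity_y
```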
wcaleb / tweets318.json
Created January 14, 2014 16:30
Sample JSON for HIST 318
{"tweet1":
  {
    "username": "wcaleb",
    "date_sent": "January 14, 2014",
    "text": "I heart Cheerios.",
    "hashtags": [],
    "coordinates": null,
    "has_image": false
  },
 "tweet2":
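Data in this shape can be read with Python's standard `json` module; a short sketch using an abridged copy of the sample above:

```python
import json

raw = """{"tweet1": {"username": "wcaleb",
                     "date_sent": "January 14, 2014",
                     "text": "I heart Cheerios.",
                     "hashtags": [],
                     "coordinates": null,
                     "has_image": false}}"""

tweets = json.loads(raw)
# JSON null and false become Python None and False.
print(tweets["tweet1"]["username"])  # prints wcaleb
```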
wcaleb / json-exhibits.json
Created January 15, 2014 21:42
JSON validity exercises
### Sample JSON
#### Exhibit A
{"search":{
  "field": null,
  "hits": 1901,
  "sort_order": null,
  "do_facets": true,
  "focus_item": null,
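One way to check exhibits like these mechanically: `json.loads` raises `ValueError` on invalid JSON. A small sketch (the strings here are illustrative, not the gist's full exhibits):

```python
import json

def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False

assert is_valid_json('{"hits": 1901, "field": null}')
assert not is_valid_json("{'hits': 1901}")  # single quotes are invalid JSON
```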
wcaleb / txrunawayads.md
Last active August 29, 2015 13:57
TxRunawayAds

The @TxRunawayAds account tweets excerpts from advertisements related to runaway slaves in nineteenth-century Texas newspapers, along with links to the page images of the ad in the Portal to Texas History.

The tweeted excerpts come from ads identified and transcribed in the spring of 2014 by students in two digital history courses at Rice University and the University of North Texas, taught respectively by Caleb McDaniel and Andrew Torget.

The excerpts and links are composed and tweeted automatically with Python scripts written by Caleb McDaniel. Once a day, the feed posts a random ad from our data set. Occasionally, the feed also posts an ad that appeared "on this day" in history.

For more information about this project, please visit o
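The selection logic described above (a daily random ad, occasionally one that ran "on this day") can be sketched as follows; this is a hedged illustration, not the project's actual code, and the field names and sample ads are hypothetical.

```python
import random
from datetime import date

# Hypothetical stand-in for the project's data set of transcribed ads.
ads = [
    {"excerpt": "RANAWAY from the subscriber...", "month": 3, "day": 15},
    {"excerpt": "$50 REWARD...", "month": 7, "day": 4},
]

def pick_ad(today=None, rng=random):
    """Prefer an ad that ran on this calendar day; else pick at random."""
    today = today or date.today()
    on_this_day = [a for a in ads
                   if (a["month"], a["day"]) == (today.month, today.day)]
    return rng.choice(on_this_day or ads)
```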

wcaleb / chicago-wcm.csl
Created April 26, 2014 19:51
my modified Chicago CSL file
<?xml version="1.0" encoding="utf-8"?>
<style xmlns="http://purl.org/net/xbiblio/csl" class="note" version="1.0" demote-non-dropping-particle="sort-only">
  <info>
    <title>Chicago Manual of Style (full note, no Ibid.)</title>
    <id>http://www.zotero.org/styles/chicago-fullnote-bibliography-no-ibid</id>
    <link href="http://www.zotero.org/styles/chicago-fullnote-bibliography-no-ibid" rel="self"/>
    <link href="http://www.chicagomanualofstyle.org/tools_citationguide.html" rel="documentation"/>
    <author>
      <name>Julian Onions</name>
      <email>julian.onions@gmail.com</email>
wcaleb / getbibs.py
Last active April 4, 2022 15:36
The Pandoc filter and shell script I use to make a bibliography file from my BibTeX note files. See http://wcm1.web.rice.edu/plain-text-citations.html
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Pandoc filter that grabs the BibTeX code block from each note file
# and then uses bibtexparser to add a "short title" entry in the "note" field,
# appending finished BibTeX entry to a bibliography file.
from pandocfilters import toJSONFilter, CodeBlock
# https://github.com/sciunto/python-bibtexparser
import bibtexparser
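The preview ends at the imports. As a simplified, stdlib-only sketch of the filter's first step — walking a Pandoc JSON AST and collecting each code block's text — the following illustrates the idea; the real gist uses `pandocfilters` and `bibtexparser`, and additionally injects a short title into each entry's `note` field.

```python
import json

def collect_code_blocks(ast):
    """Return the text of every CodeBlock element in a Pandoc JSON AST."""
    blocks = []

    def walk(node):
        if isinstance(node, dict):
            if node.get("t") == "CodeBlock":
                # A CodeBlock's content is [attributes, text].
                blocks.append(node["c"][1])
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(ast)
    return blocks

# Typical use: doc = json.load(sys.stdin) in a `pandoc note.md -t json | ...` pipe.
```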
wcaleb / rename.py
Created July 8, 2014 21:08
Renaming PH2 lesson files after wget download
import os
from bs4 import BeautifulSoup

files = os.listdir('.')
for file in files:
    html = open(file, 'r').read()
    # Name the parser explicitly; bare BeautifulSoup(html) warns on newer bs4.
    soup = BeautifulSoup(html, 'html.parser')
    url = soup.find(rel='canonical')['href']
    open(url.split('/')[-1] + '.html', 'w').write(html)
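The same canonical-link lookup can be done without BeautifulSoup using the stdlib's `html.parser`; a sketch, with a made-up sample page (the URL is illustrative):

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "canonical":
            self.href = attrs.get("href")

finder = CanonicalFinder()
finder.feed('<html><head><link rel="canonical" '
            'href="http://programminghistorian.org/lessons/intro"></head></html>')
```

Note that the gist itself predates Python 3's ubiquity; this sketch assumes Python 3.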

"""
This script ingests a CSV exported from Library Thing and
returns the percentage of author last names that begin with
each letter of the alphabet.
Based on original script by Andrew Pendleton for analyzing
U.S. Census data: https://gist.github.com/apendleton/2638865
"""