Ed Summers edsu

import string
import sys
import requests
import whois
from nltk import tokenize
BOOKFILE = sys.argv[1]
OUTPUTFILE = BOOKFILE + '.possible-domains.txt'
edsu / notebook.ipynb
Created July 20, 2018 19:13 — forked from trevormunoz/rekognition-labels-analysis.ipynb
Quick test of Amazon rekognition output

Some Friendly Advice for Data Curators

From Dorothea Salo at the Digital Humanities Winter Institute Data Curation Seminar, shamelessly quoted (hopefully not too far out of context) by a student, Ed Summers.

  • Pick software last.
  • Don't chase the shiny.
  • Know where the exits are (especially in the cloud).
  • Keep your options open.
  • What problems can you make someone else's problem?
edsu /
Created October 10, 2012 14:01 — forked from acdha/horrible-beta-markup.html
Experiment adding HTML5 microdata to a WDL item page and processing it with rdflib-microdata
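#!/usr/bin/env python
# you'll need to pip install microdata
import urllib
import rdflib
import microdata

# Parse the microdata items embedded in the HTML page; the context
# manager closes the file once parsing is done.
with open("horrible-beta-markup.html") as html:
    items = microdata.get_items(html)

# Serialize the first item as JSON and write it out.
with open("horrible-beta-markup.json", "w") as out:
    out.write(items[0].json())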