Ed Summers edsu

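import string
import sys

import requests
import whois
from nltk import tokenize

# The book to scan is given as the first command-line argument;
# results are written to a file named after it.
BOOKFILE = sys.argv[1]
OUTPUTFILE = BOOKFILE + '.possible-domains.txt'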
edsu / notebook.ipynb
Created July 20, 2018 19:13 — forked from trevormunoz/rekognition-labels-analysis.ipynb
Quick test of Amazon rekognition output

Some Friendly Advice for Data Curators

From Dorothea Salo at the Digital Humanities Winter Institute Data Curation Seminar ... shamelessly quoted, hopefully not too far out of context, by a student (Ed Summers).

  • Pick software last.
  • Don't chase the shiny.
  • Know where the exits are (especially in the cloud).
  • Keep your options open.
  • What problems can you make someone else's problem?
edsu / convert.py
Created October 10, 2012 14:01 — forked from acdha/horrible-beta-markup.html
Experiment adding HTML5 microdata following schema.org to a WDL item page and processing with rdflib-microdata
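#!/usr/bin/env python
# you'll need to pip install microdata
import urllib
import rdflib
import microdata

# Read the page as bytes so the HTML parser can detect the encoding itself.
with open("horrible-beta-markup.html", "rb") as html:
    items = microdata.get_items(html)

# Write the first microdata item out as JSON.
with open("horrible-beta-markup.json", "w") as out:
    out.write(items[0].json())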
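To check what ended up in the JSON file, something like the following sketch could be used. It relies only on the standard library and assumes the output is a single object with a `type` list and a `properties` mapping of names to lists of values; that shape is an assumption about the library's output, not something shown in the gist. The `SAMPLE` data and the `summarize_item` helper are hypothetical, made up for illustration.

```python
import json

# Hypothetical stand-in for horrible-beta-markup.json; the real file's
# contents are not shown in the gist.
SAMPLE = """
{
  "type": ["http://schema.org/Book"],
  "properties": {
    "name": ["Example title"],
    "inLanguage": ["en"]
  }
}
"""


def summarize_item(item):
    """Return (types, {property name: first value}) for one item dict.

    Assumes the shape described above: a "type" entry and a "properties"
    mapping whose values are lists.
    """
    types = item.get("type", [])
    if isinstance(types, str):
        # Tolerate a single type given as a bare string.
        types = [types]
    props = {
        name: values[0]
        for name, values in item.get("properties", {}).items()
        if values
    }
    return types, props


types, props = summarize_item(json.loads(SAMPLE))
print(types)
print(props)
```

To run it against the real output, replace `json.loads(SAMPLE)` with `json.load(open("horrible-beta-markup.json"))` and adjust the helper if the actual structure differs.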