Created March 14, 2015 19:06
# flatten the hierarchy of a Gutenberg CD
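# -exec copies one file at a time, so paths with spaces and very long file lists are handled
find iso -name "*.txt" -exec cp {} lt/corpus/ \;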
# parse the master list of books in a Gutenberg CD
import csv

with open("master_list.csv") as f:
    # keep rows whose language field is empty and whose text filename is present
    meta = [
        [(title + ": " + subtitle) if subtitle else title, (ln, fn), txt]
        for title, subtitle, fn, ln, txt, html, catmo, catyear, lang in csv.reader(f)
        if lang == "" and txt != ""
    ]