@brianray
Created December 26, 2014 16:43
Generate Confluence Wiki table from NLTK Corpus
from nltk.downloader import Downloader

dl = Downloader()


def format_row(item, fields, type_name, delim):
    """Build one wiki table row for an NLTK package or collection."""
    attrs = item.__dict__
    out = delim
    for col in fields.split(delim):
        if col == 'type':
            out += type_name
        else:
            # Replace the delimiter inside values so it cannot break the row.
            out += str(attrs.get(col, "N/A")).replace(delim, ";")
        out += delim
    return out


headers = 'name languages license author contact type'
delim = "|"
headers = delim.join(headers.split())
out = [delim + headers + delim]
out += [format_row(x, headers, 'models', delim) for x in dl.models()]
out += [format_row(x, headers, 'packages', delim) for x in dl.packages()]
out += [format_row(x, headers, 'collections', delim) for x in dl.collections()]
# Confluence marks header cells with a doubled delimiter.
out[0] = out[0].replace("|", "||")
print("\n".join(out))