@lettergram
Created December 27, 2018 05:50
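import os
from bs4 import BeautifulSoup  # features="xml" also requires lxml

# Maps utterance text to its label ("command", "statement", or "question").
# Assumption: defined here so the snippet runs on its own; drop this line
# if tagged_comments already exists earlier in your script.
tagged_comments = {}

# Pulls all the data from the manually generated imperatives dataset
# (assumes one imperative per line in the CSV)
with open('data/imperatives.csv', 'r') as imperative_file:
    for row in imperative_file:
        imperative = row.strip()
        if imperative:
            tagged_comments[imperative] = "command"

# Pulls all data from the SPAADIA dataset, adds to our dataset
for doc in os.listdir('data/SPAADIA'):
    with open('data/SPAADIA/' + doc, 'r') as handle:
        conversations = BeautifulSoup(handle, features="xml")
        for imperative in conversations.find_all("imp"):
            tagged_comments[imperative.get_text()] = "command"
        for declarative in conversations.find_all("decl"):
            tagged_comments[declarative.get_text()] = "statement"
        # Yes/no and wh- questions both map to the "question" label
        for question in conversations.find_all("q-yn"):
            tagged_comments[question.get_text()] = "question"
        for question in conversations.find_all("q-wh"):
            tagged_comments[question.get_text()] = "question"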