@orasik
Last active December 19, 2017 22:09
NLTK
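import nltk

# One-time setup: the tokenizer and tagger need their data packages, e.g.
# nltk.download('punkt') and nltk.download('averaged_perceptron_tagger').
# Resource names vary by NLTK version; a LookupError names the missing one.

sentence = """At eight o'clock on Thursday morning
Arthur didn't feel very good."""

# Split the sentence into word and punctuation tokens.
tokens = nltk.word_tokenize(sentence)
print(tokens)
# ['At', 'eight', "o'clock", 'on', 'Thursday', 'morning',
#  'Arthur', 'did', "n't", 'feel', 'very', 'good', '.']

# Tag each token with its part of speech (Penn Treebank tags).
tagged = nltk.pos_tag(tokens)
print(tagged[0:6])
# [('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'), ('on', 'IN'),
#  ('Thursday', 'NNP'), ('morning', 'NN')]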
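A possible next step, sketched here without calling NLTK at all: the tagger returns plain (word, tag) tuples, so ordinary Python is enough to work with them. The sample list is copied from the tagged[0:6] output above; `group_by_tag` is a hypothetical helper name, not part of NLTK.

```python
from collections import defaultdict

# Sample (word, tag) pairs copied from the tagged[0:6] output above.
tagged = [('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'), ('on', 'IN'),
          ('Thursday', 'NNP'), ('morning', 'NN')]


def group_by_tag(pairs):
    """Hypothetical helper: map each POS tag to the words carrying it."""
    groups = defaultdict(list)
    for word, tag in pairs:
        groups[tag].append(word)
    return dict(groups)


by_tag = group_by_tag(tagged)

# Penn Treebank noun tags all start with 'NN' (NN, NNS, NNP, NNPS).
nouns = [word for word, tag in tagged if tag.startswith('NN')]

print(by_tag)
print(nouns)
```

Filtering on the tag prefix rather than an exact tag keeps proper nouns such as 'Thursday' alongside common nouns such as 'morning'.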