amnrzv/nltk_tokenize.py

Last active Nov 1, 2017
A little example of NLTK's word and sentence tokenization. Output here: https://gist.github.com/amnrzv/2cbaad89e016acc0db410ec79a5ff40f
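# Requires NLTK's Punkt tokenizer data, e.g. nltk.download('punkt')
from nltk.tokenize import word_tokenize, sent_tokenize

text = "Hello, Mr. Jacobs. Nice to meet you!"
# Split into sentences; the abbreviation "Mr." does not end a sentence
sentences = sent_tokenize(text)
# Split into word and punctuation tokens
words = word_tokenize(text)
print(sentences)
print(words)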
Output:
['Hello, Mr. Jacobs.', 'Nice to meet you!']
['Hello', ',', 'Mr.', 'Jacobs', '.', 'Nice', 'to', 'meet', 'you', '!']
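For contrast, here is a rough sketch of what a naive splitter does with the same text. The regex rule below is an assumption made up for illustration (split wherever `.`, `!` or `?` is followed by whitespace); it is not how NLTK works internally. It uses only the standard library and shows why the "Mr." handling in the output above matters.

```python
import re

text = "Hello, Mr. Jacobs. Nice to meet you!"

# Naive rule (illustrative assumption, not NLTK's algorithm):
# break after '.', '!' or '?' whenever whitespace follows.
naive_sentences = re.split(r"(?<=[.!?])\s+", text)

print(naive_sentences)
# ['Hello, Mr.', 'Jacobs.', 'Nice to meet you!']
```

The naive rule produces three pieces because it treats the period in "Mr." as a sentence end, while `sent_tokenize` keeps "Hello, Mr. Jacobs." together.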