@natematias
Created September 7, 2016 04:14
Find sentences with 140 characters or fewer
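import nltk
import sys

# Load NLTK's pre-trained English Punkt sentence tokenizer (requires the "punkt" data).
sent_detector = nltk.data.load('tokenizers/punkt/english.pickle')
# Read the file named by the first command-line argument.
with open(sys.argv[1], "r") as f:
    fulltext = f.read()
# Print each sentence of at most 140 characters.
for sentence in sent_detector.tokenize(fulltext.strip()):
    if len(sentence) <= 140:
        print(sentence)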
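
The same length filter can be tried without the NLTK data installed. The sketch below is only an illustration: it swaps the Punkt tokenizer for a naive regex splitter, which will mis-split abbreviations such as "Dr." that Punkt handles. The names `split_sentences`, `short_sentences`, and `limit` are hypothetical and not part of the script above.

```python
import re


def split_sentences(text):
    """Naive stand-in for Punkt: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def short_sentences(text, limit=140):
    """Return the sentences whose length is at most `limit` characters."""
    return [s for s in split_sentences(text) if len(s) <= limit]


if __name__ == "__main__":
    # One long sentence (over 140 characters) between two short ones.
    sample = "Short one. " + "word " * 40 + "end. Another short one!"
    for sentence in short_sentences(sample):
        print(sentence)
```

Wrapping the filter in a function with a `limit` parameter makes the 140-character threshold easy to change, and the splitter can be replaced by `sent_detector.tokenize` when NLTK is available.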