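import nltk

# Download the Punkt sentence tokenizer data (newer NLTK releases may need 'punkt_tab' instead)
nltk.download('punkt')
# Split each text in empty_lst into sentences; empty_lst is assumed to be a list of strings defined elsewhere
nested_sent_token = [nltk.sent_tokenize(lst) for lst in empty_lst]
# flatten list, len: 3241
flat_sent_token = [item for sublist in nested_sent_token for item in sublist]
print("Flatten sentence token: ", len(flat_sent_token))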