@cassiasamp
Last active August 20, 2020 02:39
# Read the whole file into memory as a single string.
with open('texto.txt', encoding='utf8') as file:
    text = file.read()
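# Strip punctuation, then split on any run of whitespace (including newlines)
# so that no empty strings end up in the word list.
cleaned_words = text.replace('.', ' ').replace('?', '').replace('(', '').replace(')', '').replace(',', ' ').split()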
dedup_words = list(set(cleaned_words))
sorted_words = sorted(dedup_words, key=len, reverse=True)
first_ten_words = sorted_words[:10]
print('Ten longest words in file:', first_ten_words)
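
The chained `replace` calls above handle only a few punctuation marks, and because a `set` has no defined order, words of equal length can come out in a different order from run to run. A minimal sketch of one alternative, assuming a regular expression is acceptable for tokenizing; the function name `longest_words` and its parameter `n` are hypothetical and not part of the original gist:

```python
import re


def longest_words(text, n=10):
    """Return the n longest distinct words in text, longest first."""
    # \w+ matches runs of letters, digits, and underscores, so any
    # punctuation or whitespace acts as a separator.
    words = set(re.findall(r"\w+", text))
    # Sort by descending length, then alphabetically so ties are deterministic.
    return sorted(words, key=lambda w: (-len(w), w))[:n]


if __name__ == "__main__":
    sample = "Is this (really) a test? Yes, it is a short test."
    print(longest_words(sample, n=3))
```

Note that `\w` also matches digits and underscores, so this treats tokens such as `2020` as words; whether that is desirable depends on the input text.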