@fmasanori
Last active September 6, 2022 12:05
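from string import punctuation

# Read the whole book in lowercase so that "Alice" and "alice" count as one word.
texto = open('alice.txt').read().lower()
# Replace every punctuation character with a space before splitting into words.
for c in punctuation:
    texto = texto.replace(c, ' ')
texto = texto.split()
# Count how many times each word occurs.
dic = {}
for p in texto:
    if p not in dic:
        dic[p] = 1
    else:
        dic[p] += 1
print(f'{dic["alice"]} vezes')  # "vezes" is Portuguese for "times"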

voyeg3r commented Jan 25, 2017

I have a lightly modified version here: https://github.com/voyeg3r/dotfiles/blob/master/bin/countwords.py
It opens the file with `with open(file) as f` and prints the result using the f-string formatting introduced in Python 3.6.
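
For readers who cannot open the link, here is a minimal sketch of what those two changes could look like. It is not the linked script; the function names `count_words` and `count_words_in_file` and the sample sentence are assumptions made for illustration.

```python
from string import punctuation


def count_words(text):
    """Return a dict mapping each lowercase word in `text` to its count."""
    text = text.lower()
    for c in punctuation:
        text = text.replace(c, ' ')
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def count_words_in_file(path):
    # The with-block closes the file even if reading raises an exception.
    with open(path) as f:
        return count_words(f.read())


# Small in-memory sample so the sketch runs without alice.txt.
sample = "Alice was tired. 'Alice!' cried the Queen; alice ran."
counts = count_words(sample)
print(f'{counts["alice"]} times')  # prints "3 times"
```

With the book on disk, `count_words_in_file('alice.txt')["alice"]` would give the same figure as the gist.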


voyeg3r commented Jan 25, 2017

Fabiovilela - Here on Linux I checked the encoding of the input file alice.txt and it came up as utf-8. If you download the text and paste it into, for example, Notepad, it will save it as iso-8859-1 by default, which may be what is causing the error.


voyeg3r commented Jan 26, 2017

One more thing. How could I build this with a dict comprehension instead of using a loop in this structure?
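
One possible answer, offered as a sketch rather than the gist author's method: a dict comprehension over the set of distinct words works, but `list.count` rescans the whole list for every distinct word, so `collections.Counter` is usually the more practical choice. The helper names `tokenize`, `count_with_comprehension` and `count_with_counter` are hypothetical.

```python
from collections import Counter
from string import punctuation


def tokenize(text):
    """Lowercase the text, turn punctuation into spaces, split into words."""
    table = str.maketrans(punctuation, ' ' * len(punctuation))
    return text.lower().translate(table).split()


def count_with_comprehension(words):
    # One entry per distinct word; list.count walks the full list each time,
    # so this is slow on a whole book.
    return {w: words.count(w) for w in set(words)}


def count_with_counter(words):
    # Counter makes a single pass over the words.
    return Counter(words)


words = tokenize("Alice! Said Alice, and the Rabbit; alice...")
print(count_with_comprehension(words)["alice"])  # prints 3
print(count_with_counter(words)["alice"])        # prints 3
```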

@marcelo-reis

Fabiovilela, try adding the "encoding" argument to the open call:
file = open(filename, encoding="utf8")
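
If the encoding of the file is not known in advance, one option is to try utf-8 first and fall back to iso-8859-1. This is only a sketch under that assumption: `read_text` is a hypothetical helper, and the temporary file stands in for a copy of alice.txt saved in iso-8859-1.

```python
import os
import tempfile


def read_text(path, encodings=('utf-8', 'iso-8859-1')):
    """Try each encoding in order; return (text, encoding_that_worked)."""
    for enc in encodings:
        try:
            with open(path, encoding=enc) as f:
                return f.read(), enc
        except UnicodeDecodeError:
            continue
    # Not reachable with the defaults, since iso-8859-1 accepts any byte.
    raise ValueError(f'could not decode {path!r} with any of {encodings}')


# Demo: write a small iso-8859-1 file, then read it back.
with tempfile.NamedTemporaryFile(delete=False, suffix='.txt') as tmp:
    tmp.write('codificação'.encode('iso-8859-1'))
demo_path = tmp.name

text, used = read_text(demo_path)
os.remove(demo_path)
print(used)  # prints "iso-8859-1"
```

Because iso-8859-1 maps every byte to some character, the fallback never raises; it can silently produce wrong characters if the file is actually in another encoding, so passing the known encoding explicitly remains the safer route.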
