@tarsisazevedo
Forked from gabrielacaesar/limpeza-temer.py
Last active May 4, 2018 19:14
limpeza-temer
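import re

import pandas as pd

# Titles to strip from each person's name.
titles = ['Deputado Federal ', 'General ', 'Ex-presidente ', 'Senadora ', 'Senador ', 'do Exército', 'Tenente-Brigadeiro']
title_pattern = re.compile('|'.join(titles))

# agenda2016_limpa_final is assumed to be an existing DataFrame with the columns used below.
novas_linhas = []
for _, row in agenda2016_limpa_final.iterrows():
    nova_linha = [row["oque"], row["onde"], row["ano"], row["mes"], row["dia"], row["hora"]]
    # Entries in "oque" are separated by '; '; the name is the text before the first ', '.
    pessoas = [title_pattern.sub('', i.split(', ')[0]).strip() for i in row["oque"].split('; ')]
    for pessoa in pessoas:
        linha_com_pessoa = nova_linha[:]  # lista[:] makes a copy of the list
        linha_com_pessoa.append(pessoa)
        novas_linhas.append(linha_com_pessoa)

# One output row per person, repeating the original event columns.
novo_df = pd.DataFrame(novas_linhas, columns=["oque", "onde", "ano", "mes", "dia", "hora", "pessoa"])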
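
The loop above produces one row per person. The same result can probably be reached without `iterrows()`; the following is only a sketch, assuming pandas 0.25 or newer (for `DataFrame.explode`). The sample DataFrame `agenda` and the helper `extrai_pessoas` are hypothetical stand-ins, not part of the original gist.

```python
import re

import pandas as pd

# Hypothetical sample rows standing in for agenda2016_limpa_final.
agenda = pd.DataFrame({
    "oque": [
        "Senador Fulano de Tal, PMDB; Deputado Federal Beltrano, PSDB",
        "General Sicrano",
    ],
    "onde": ["Palácio do Planalto", "Palácio do Jaburu"],
})

titles = ['Deputado Federal ', 'General ', 'Ex-presidente ', 'Senadora ',
          'Senador ', 'do Exército', 'Tenente-Brigadeiro']
# re.escape guards against titles that contain regex metacharacters.
title_pattern = re.compile('|'.join(re.escape(t) for t in titles))


def extrai_pessoas(oque):
    """Return the cleaned names found in one "oque" cell (hypothetical helper)."""
    return [title_pattern.sub('', item.split(', ')[0]).strip()
            for item in oque.split('; ')]


# Build a list of names per row, then explode it into one row per name.
novo_df = (
    agenda.assign(pessoa=agenda["oque"].apply(extrai_pessoas))
    .explode("pessoa")
    .reset_index(drop=True)
)
```

With this approach the remaining columns are repeated automatically for each person, so there is no need to copy a list for every row.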