@turicas
Created March 22, 2018 15:53
import io
import re

import requests
import rows


def extrai_tabela(url):
    """Download the PDF at `url` and return the table extracted from it."""
    response = requests.get(url)
    response.raise_for_status()  # fail early on HTTP errors
    return rows.import_from_pdf(
        io.BytesIO(response.content),
        # Stop before the text matching "* Variação em ..."
        ends_before=re.compile(r'\* ?Variação em .*'),
    )


arquivos = ['16032018194928.pdf', '18082017185431.pdf']
for arquivo in arquivos:
    url = f'http://www.imea.com.br/upload/publicacoes/arquivos/{arquivo}'
    print(f'Baixando {url}')  # "Downloading {url}"
    table = extrai_tabela(url)
    print(rows.export_to_txt(table))
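
To see which lines the `ends_before` pattern above would recognize, here is a small sketch using only the standard library. The sample strings are hypothetical (they are not taken from the PDFs), and whether `rows` applies the pattern with `match` or `search` is not shown in the gist, so `match` is an assumption here.

```python
import re

# Same pattern as in the gist: a literal "*", an optional space,
# then "Variação em " followed by anything.
ends_before = re.compile(r'\* ?Variação em .*')

# Hypothetical sample lines, for illustration only.
samples = [
    '* Variação em relação à semana anterior',
    '*Variação em relação ao mês anterior',
    'Variação em relação à semana anterior',
]

# match() anchors at the start of the string, so the leading "*" is required.
matches = [bool(ends_before.match(sample)) for sample in samples]
for sample, matched in zip(samples, matches):
    print(f'{matched!s:5} {sample}')
```

The space after the asterisk is optional in the pattern, so both of the first two samples are recognized, while a line without the leading asterisk is not.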
@dukejeffrie

In version 0.3.1 I can't find import_from_pdf. Is it perhaps something you're still working on? 🤔
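
A quick way to check whether an installed library exposes a given function is sketched below. The helper name `module_has` is hypothetical, not part of `rows`; it only reports whether the attribute exists and says nothing about which release added it.

```python
import importlib


def module_has(module_name, attribute):
    """Return True if `module_name` can be imported and exposes `attribute`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module is not installed (or cannot be imported) at all.
        return False
    return hasattr(module, attribute)


# Prints False if rows is missing or the installed version lacks the function.
print(module_has("rows", "import_from_pdf"))
```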

@Danielydsm

When I tried to run this code, I got the following error (posted as an image):

Any idea why?

@emidioandre

@Danielydsm try opening the file plugin_csv.py and changing the line that sets unicodecsv.field_size_limit to unicodecsv.field_size_limit(16777216).
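
For reference, here is a minimal sketch of what raising the field size limit does, using the standard library's `csv.field_size_limit` rather than `unicodecsv` (the assumption being that both are governed by the same kind of limit). The helper `parse_with_limit` and the sample data are hypothetical; the original error was posted only as an image, so this shows the general mechanism, not necessarily the exact failure.

```python
import csv
import io


def parse_with_limit(text, limit):
    """Parse CSV text with a temporary field size limit, then restore the old one."""
    previous = csv.field_size_limit(limit)  # returns the limit that was in effect
    try:
        return list(csv.reader(io.StringIO(text)))
    finally:
        csv.field_size_limit(previous)


# Hypothetical data: one row whose second field is 200,000 characters long.
big_field = "x" * 200_000
data = f"a,{big_field}\n"

try:
    parse_with_limit(data, 131072)  # too small for this field
except csv.Error as error:
    print(f"Failed with the small limit: {error}")

rows_parsed = parse_with_limit(data, 16777216)  # the value suggested above
print(len(rows_parsed[0][1]))
```

Setting the limit from your own script before calling the importer may be preferable to editing plugin_csv.py inside the installed package, since changes there are lost on reinstall.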
