@ecarreras
Created July 30, 2014 06:57
import os

# Download the retailers ("comercializadores") PDF.
print("Downloading retailers PDF...")
os.system('curl "http://simel.simel.ree.es/sep/PubServlet2?operacion=AccInfor&fichero=117_Comercializadores.pdf" > comers.pdf')
# pdftotext writes its output next to the input file, as comers.txt.
print("Converting to TXT...")
os.system('pdftotext comers.pdf')
print("Building dictionary...")
with open('comers.txt', 'r') as f:
    c = f.read()
# Keep only the non-empty lines.
comers_in = [x for x in c.split('\n') if x]
comers_out = {}
for idx, field in enumerate(comers_in):
    # Show unusually long lines.
    if len(field) > 67:
        print(field)
    # A line of exactly four digits is treated as a retailer code; the next line is its name.
    if len(field) == 4 and field.isdigit() and idx + 1 < len(comers_in):
        # Cap the name at 69 characters.
        comers_out[field] = comers_in[idx + 1][:69]
print(comers_out)
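
The parsing loop above can be sketched as a standalone function, which makes it possible to check the logic without downloading the PDF. This is only a sketch: `parse_comers` and `max_name_len` are hypothetical names, the sample lines are made up, and it assumes (as the script does) that each four-digit code sits on its own line, immediately followed by the line holding the name.

```python
def parse_comers(lines, max_name_len=69):
    """Map each four-digit code line to the (truncated) line that follows it."""
    lines = [line for line in lines if line]  # drop empty lines
    comers = {}
    # Stop one line early so that lines[idx + 1] always exists.
    for idx, field in enumerate(lines[:-1]):
        if len(field) == 4 and field.isdigit():
            comers[field] = lines[idx + 1][:max_name_len]
    return comers


# Made-up sample input, not real data from the PDF.
sample = ["0001", "EXAMPLE RETAILER A", "", "header text", "0002", "EXAMPLE RETAILER B"]
result = parse_comers(sample)
print(result)
```

Keeping the parsing separate from the `curl` and `pdftotext` calls means a change in the PDF layout can be diagnosed by feeding the function a few lines of the converted text by hand.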