spider receita
"""
Simple script to get a citizen's name from their CPF (Brazil's SSN)
"""
from BeautifulSoup import BeautifulSoup as bs
import requests
import urllib2

with requests.session() as session:
    url = 'http://www.receita.fazenda.gov.br/aplicacoes/atcta/cpf/ConsultaPublica.asp'
    response = session.get(url)
    element = bs(response.content)
    image_url = 'http://www.receita.fazenda.gov.br/scripts/srf/intercepta/captcha.aspx?opt=image&v=123'
    # reuse the session cookies for the captcha download
    # NOTE: relies on response.request.cookiejar, which newer requests releases removed
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(response.request.cookiejar))
    urllib2.install_opener(opener)
    # download the captcha image
    url_imagem = urllib2.urlopen(image_url)
    # save the image to disk
    image_file = open('imagem.gif', 'wb')
    image_file.write(url_imagem.read())
    image_file.close()
    # ask the user for the captcha text and the CPF
    captcha = raw_input('Digite o captcha:')
    cpf = raw_input('Digite o cpf:')
    # send the request with the user's data
    dados = {'txtCpf': cpf, 'idLetra': captcha}
    response = session.post(url, data=dados)
    element = bs(response.content)
    # the name is in the second 'clConteudoDados' span, after the colon
    nome = element.findAll('span', {'class': 'clConteudoDados'})[1].string.split(':')[1].strip()
    print(nome)
requests==0.6.2
BeautifulSoup==3.2.0
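
The last step of the script (take the second `clConteudoDados` span and keep what follows the colon) can be tried without the network. The following is only a sketch, assuming Python 3 and swapping BeautifulSoup for the standard-library `html.parser`; `SpanCollector`, `extract_name`, and the sample HTML are made up for illustration and are not taken from the real page. Unlike the original `split(':')[1]`, it keeps everything after the first colon.

```python
from html.parser import HTMLParser


class SpanCollector(HTMLParser):
    """Collect the text of every <span> whose class attribute matches."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.texts = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == 'span' and dict(attrs).get('class') == self.css_class:
            self._inside = True
            self.texts.append('')

    def handle_endtag(self, tag):
        if tag == 'span':
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.texts[-1] += data


def extract_name(html):
    """Return the text after the first colon in the second matching span."""
    collector = SpanCollector('clConteudoDados')
    collector.feed(html)
    return collector.texts[1].split(':', 1)[1].strip()


# Hypothetical markup, only shaped like what the script expects.
sample = (
    '<span class="clConteudoDados">CPF: 000.000.000-00</span>'
    '<span class="clConteudoDados">Nome: FULANO DE TAL</span>'
)
print(extract_name(sample))
```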
@gilsondev commented Oct 24, 2011

Which modules need to be installed to use this script?

@herberthamaral (Owner, Author) commented Oct 24, 2011

@gilsondev commented Oct 24, 2011

It raised an error:

Traceback (most recent call last):
  File "gistfile1.py", line 15, in <module>
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(response.request.cookiejar))
AttributeError: 'Request' object has no attribute 'cookiejar'

@herberthamaral (Owner, Author) commented Oct 24, 2011

I tested it on Python 2.7. Which version are you using?

@gilsondev commented Oct 24, 2011

Python 2.7 =/

@herberthamaral (Owner, Author) commented Oct 24, 2011

I added requirements.txt and made some improvements. See if it works for you now.

@herberthamaral (Owner, Author) commented Oct 24, 2011

Ah, I found the problem. My requests is at version 0.6.4. Yours is probably newer, at 0.7.x, which has a change that removes cookiejar from the Request object.

I'll try to update on my side and push shortly.

@gilsondev commented Oct 24, 2011

Ok ;)

@herberthamaral (Owner, Author) commented Oct 24, 2011

It looks like a problem in requests 0.7.3. I opened an issue for it: psf/requests#222

@gilsondev commented Oct 24, 2011

I don't see the captcha to type in. Does the CPF have to be digits only, or in the standard format (000.000.000-00)?

@herberthamaral (Owner, Author) commented Oct 24, 2011

The captcha is saved to a file named imagem.gif in the script's directory, and the CPF must be digits only.
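
Since the form wants digits only, input typed in the standard format can be normalized before it is sent. This is a minimal sketch, assuming Python 3; `normalize_cpf` is a hypothetical helper, not part of the gist. It only strips punctuation and checks the length (a CPF has 11 digits); it does not validate the check digits.

```python
import re


def normalize_cpf(raw):
    """Drop every non-digit character and require exactly 11 digits."""
    digits = re.sub(r'[^0-9]', '', raw)
    if len(digits) != 11:
        raise ValueError('a CPF has 11 digits, got %d' % len(digits))
    return digits


print(normalize_cpf('000.000.000-00'))
```

The result could then be passed as the `txtCpf` field in place of the raw input.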

@gilsondev commented Oct 24, 2011

Now it works! Very nice!! Congratulations!! =D

@djcapelli commented Jul 10, 2013

I'm having the same problem as gilsondev:
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(response.request.cookiejar))
AttributeError: 'PreparedRequest' object has no attribute 'cookiejar'
Is there any way to handle the cookie other than through the CookieJar?
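
One possible way around the missing `cookiejar` attribute, sketched under assumptions rather than tested against the site: download the captcha through the same requests session, so the session carries the cookies itself and `urllib2` is not needed. This assumes Python 3 and a requests release that provides `requests.Session`. The helper names `save_captcha`, `submit`, and `main` are made up; the URLs and form fields are copied from the script above, and whether the site still serves this flow is not verified.

```python
CONSULTA_URL = 'http://www.receita.fazenda.gov.br/aplicacoes/atcta/cpf/ConsultaPublica.asp'
CAPTCHA_URL = 'http://www.receita.fazenda.gov.br/scripts/srf/intercepta/captcha.aspx?opt=image&v=123'


def save_captcha(session, path='imagem.gif'):
    """Fetch the form page, then the captcha, through the same session."""
    session.get(CONSULTA_URL)            # first visit lets the site set its cookies
    response = session.get(CAPTCHA_URL)  # same session, so the cookies are sent back
    with open(path, 'wb') as image_file:
        image_file.write(response.content)
    return path


def submit(session, cpf, captcha):
    """Post the two form fields used by the original script."""
    return session.post(CONSULTA_URL, data={'txtCpf': cpf, 'idLetra': captcha})


def main():
    # Imported here so the helpers above can be exercised without requests installed.
    import requests

    with requests.Session() as session:
        save_captcha(session)
        captcha = input('Digite o captcha:')
        cpf = input('Digite o cpf:')
        response = submit(session, cpf, captcha)
        print(response.status_code)
```

Calling `main()` runs the interactive flow; the helpers accept any object with `get` and `post` methods, which makes them easy to try with a stub session.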

@fndiaz commented Feb 20, 2014

I'm having the same problem. Can anyone help me solve it?

  File "cpf.py", line 14, in <module>
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(response.request.cookiejar))
AttributeError: 'PreparedRequest' object has no attribute 'cookiejar'

@licensed commented Dec 20, 2016

I have the same problem as the commenter above:
AttributeError: 'PreparedRequest' object has no attribute 'cookiejar'
