@nikhilkumarsingh
Created August 10, 2018 16:59
import requests
from bs4 import BeautifulSoup

headers = {
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'
}

login_data = {
    'name': '<username>',
    'pass': '<password>',
    'form_id': 'new_login_form',
    'op': 'Login'
}

with requests.Session() as s:
    url = 'https://www.codechef.com/'
    r = s.get(url, headers=headers)
    soup = BeautifulSoup(r.content, 'html5lib')
    login_data['form_build_id'] = soup.find('input', attrs={'name': 'form_build_id'})['value']
    r = s.post(url, data=login_data, headers=headers)
    print(r.content)

@ukulsh

ukulsh commented Dec 6, 2018

I'm getting an error in line 18,
Couldn't find a tree builder with the features you requested: html5lib. Do you need to install a parser library?
can you help me out??

@weisshufer

I'm getting an error in line 18,
Couldn't find a tree builder with the features you requested: html5lib. Do you need to install a parser library?
can you help me out??

Hi ukulsh,

Try installing this module with pip:

pip install html5lib

That should fix it.

@luminostech1089

luminostech1089 commented Apr 11, 2019

Hi Nikhil,
I was trying to log in to a website internal to my organisation. Even though I passed the correct credentials, the POST request returns the error below:

Server Error

401 - Unauthorized: Access is denied due to invalid credentials.

You do not have permission to view this directory or page using the credentials that you supplied.

So I tried a GET request for the same URL, but got the same response from the server. Is the server doing something to block bots, or am I missing something? Interestingly, if I replace my URL with the CodeChef URL, I get the appropriate response. Here is my code snippet:

import requests

headers_simple = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'}
url = r'http://private_url'
with requests.Session() as s:
    r = s.get(url=url, headers=headers_simple)
    print(r.content)

@panagiotisTB

I'm getting an error in line 18,
Couldn't find a tree builder with the features you requested: html5lib. Do you need to install a parser library?
can you help me out??

Hi ukulsh,

Try installing this module with pip:

pip install html5lib

That should fix it.

Hey guys,

just use 'html.parser'. Shouldn't need an extra install.
Hope this helps
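
The parser can also be chosen at run time instead of being hard-coded. A minimal sketch, assuming you would rather fall back than fail: `pick_parser` is a hypothetical helper (not part of the gist) that uses the standard library's `importlib.util.find_spec` to check which parser packages are installed and otherwise returns the built-in `'html.parser'`.

```python
import importlib.util


def pick_parser(preferred=("html5lib", "lxml")):
    """Return the first installed parser name, else the built-in 'html.parser'.

    Hypothetical helper for illustration; it is not part of the gist.
    """
    for name in preferred:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is not None:
            return name
    return "html.parser"  # ships with Python, no extra install needed


parser = pick_parser()
print(parser)
```

In the gist, the result would be passed as the second argument: `BeautifulSoup(r.content, parser)`. Note that different parsers can build slightly different trees from malformed HTML, so results may vary between machines.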

@hivo

hivo commented Jul 18, 2019

NameError: name 'BeautifulSoup' is not defined line 18

@walidmax

I get this error. Any help?
Traceback (most recent call last):
  File ".\login2.py", line 18, in <module>
    login_data['form_build_id'] = soup.find('input', attrs={'name': 'form_build_id'}) ['value']
TypeError: 'NoneType' object is not subscriptable
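
This `TypeError` means `soup.find(...)` returned `None`: the HTML that came back contains no `<input>` named `form_build_id`. Why that happened (changed markup, a different page being served, and so on) is not something the thread establishes. A minimal sketch of a guard that fails with a clearer message: `hidden_value` is a hypothetical helper that takes whatever `soup.find` returned.

```python
def hidden_value(tag, field_name):
    """Return the 'value' attribute of a tag found by soup.find(), or raise.

    `tag` is whatever soup.find(...) returned: a bs4 Tag, or None when
    nothing matched. Hypothetical helper, not part of the gist.
    """
    if tag is None:
        raise LookupError(
            f"no <input name={field_name!r}> in the page; "
            "inspect the HTML you actually received"
        )
    value = tag.get("value")  # Tag.get works like dict.get for attributes
    if value is None:
        raise LookupError(f"<input name={field_name!r}> has no value attribute")
    return value


# Usage in the gist (assumes `soup` and `login_data` already exist):
# tag = soup.find('input', attrs={'name': 'form_build_id'})
# login_data['form_build_id'] = hidden_value(tag, 'form_build_id')
```

Printing `r.status_code` and the start of `r.text` before parsing is a quick way to see which page the server actually sent.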

@luk0y

luk0y commented Aug 18, 2019

Have you checked Stack Overflow regarding this issue?

@jankobananko12

hey i get error in first line it says "no module named requests"

@mikephilippstock

hey i get error in first line it says "no module named requests"

https://stackoverflow.com/questions/17309288/importerror-no-module-named-requests

@ourad

ourad commented Dec 18, 2019

hey i get error in first line it says "no module named requests"

You just have to type this in your terminal: "pip3 install requests" or "pip install requests".

@elunesarrow54

elunesarrow54 commented Jan 8, 2020

Hi, thanks for your post, but I have a problem.

I can't get into https://bionluk.com/login. The script always prints "[]" for this site. How can I solve this?

import requests
from bs4 import BeautifulSoup

adres = "www.bionluk.com/login"
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36',
}
url = "http://" + adres

# Collect the data from the given address (with the requests module)
response = requests.get(url, headers=headers)
icerik = response.content
soup = BeautifulSoup(icerik, "lxml")
print(soup.find_all("div"))
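
An empty list from `find_all("div")` means the parsed document contained no `<div>` elements at all. One possible cause, which is an assumption here because the thread never confirms it, is that the server sends a nearly empty page whose content is filled in later by JavaScript, or sends an error page. A rough diagnostic sketch using only the standard library: `tag_counts` is a hypothetical helper that tallies the start tags in whatever HTML came back, and `sample` is made-up markup.

```python
from collections import Counter
from html.parser import HTMLParser


class _TagCounter(HTMLParser):
    """Counts every start tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def tag_counts(html):
    """Return a Counter mapping tag name to number of occurrences."""
    parser = _TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Made-up example of a page that only loads a script:
sample = "<html><body><script src='app.js'></script></body></html>"
counts = tag_counts(sample)
print(counts["div"], counts["script"])
```

With the code above you would call `tag_counts(response.text)` and also print `response.status_code`. Many `script` tags and no `div` would be consistent with a JavaScript-rendered page, which `requests` alone cannot execute.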

@ourad

ourad commented Jan 10, 2020 via email

@elunesarrow54

I tried, but it doesn't work. I can't get the HTML. Could someone please help me?

url: www.bionluk.com/login

@vigkan

vigkan commented Feb 29, 2020

Hi, I tried the same code (just filled in my username and password), but I am not able to log in to the website.

b'\n\n\n\n

Could you help me out?

@ourad

ourad commented Feb 29, 2020 via email

@Sameuh216

Hi Nikhil, thank you for this video. I tried it with my own parameters and website, but the only response was:
Users\Sami\AppData\Local\Programs\Python\Python37-32\python.exe: can't find '__main__' module in ''
Could you please tell me what the problem is?
Best regards,
Sami

@uddeshyy

uddeshyy commented Oct 3, 2020

Hi Nikhil, this code is excellent and works perfectly.
Can we check the status_code of the POST request to know whether the login succeeded (for logging in to multiple accounts)?
I don't want the content of the login page; I just need to know whether the login was successful (maybe using .status_code).
The BeautifulSoup find call might not work for all websites.
Please help.

@sachin1152207

I'm getting an error in line 18,
Couldn't find a tree builder with the features you requested: html5lib. Do you need to install a parser library?
can you help me out??

Try this command in a terminal: pip install html5lib

@luk0y

luk0y commented Aug 15, 2021

@uddeshyy You are better off using the cookies in the response. You can use status codes, but they depend entirely on how the website responds. If the website returns 200 for both valid and invalid logins, the status code tells you nothing. If it returns 400 for an invalid login and 200 for a valid one, you can use it as the indicator of a successful login.
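
The checks described here can be combined into one heuristic. This is only a sketch under stated assumptions: `login_succeeded` is a hypothetical helper, and the cookie name `SESSID` and the text `Invalid username` in the usage line are placeholders that have to be replaced with whatever the target site really uses.

```python
def login_succeeded(status_code, cookie_names, body, session_cookie, failure_text):
    """Heuristic login check; every signal here is site-specific.

    status_code    -- HTTP status of the POST response
    cookie_names   -- names of the cookies held by the session after the POST
    body           -- response text
    session_cookie -- cookie name the site sets only after login (assumption)
    failure_text   -- text the site shows on a failed login (assumption)
    """
    if status_code >= 400:
        return False
    if failure_text and failure_text in body:
        return False
    return session_cookie in cookie_names


# Usage with the gist's session `s` and response `r` (placeholder values):
# ok = login_succeeded(r.status_code, s.cookies.keys(), r.text,
#                      "SESSID", "Invalid username")
```

Passing the session's cookies rather than the response's is deliberate: if the login POST redirects, the cookie may have been set on an earlier response in the chain.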

@luk0y

luk0y commented Aug 15, 2021

@luminostech1089 You need to send the request the same way the browser sends it to the server. Use Fiddler or another HTTP debugging tool to see how the website sends its requests to the server.
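
Before reaching for a network debugger, the 401 response itself often says what the server expects: a `WWW-Authenticate` header naming one or more authentication schemes (for example `Basic`, `NTLM`, or `Negotiate`). Whether the internal site in question sends one is an assumption, since the thread does not show its headers. A rough sketch: `offered_auth_schemes` is a hypothetical helper, and its comma-splitting is a simple heuristic that can misread quoted parameters containing commas.

```python
def offered_auth_schemes(headers):
    """Return lower-cased scheme names from a WWW-Authenticate header.

    `headers` is any mapping with a .get method, such as r.headers from
    requests. Hypothetical helper with deliberately simple parsing.
    """
    raw = headers.get("WWW-Authenticate", "")
    schemes = []
    for part in raw.split(","):
        words = part.strip().split(None, 1)
        if not words:
            continue
        token = words[0]
        # Skip auth parameters such as nonce="..." that follow a scheme.
        if "=" in token:
            continue
        schemes.append(token.lower())
    return schemes


# Made-up header value for illustration:
print(offered_auth_schemes({"WWW-Authenticate": "Negotiate, NTLM"}))
```

If the list contains `basic`, `requests` can send the credentials itself via `s.get(url, auth=(username, password))`. Other schemes generally need an additional authentication library, and posting form fields as in the gist will not satisfy them.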
