import requests
from bs4 import BeautifulSoup

headers = {
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'
}

login_data = {
    'name': '<username>',
    'pass': '<password>',
    'form_id': 'new_login_form',
    'op': 'Login'
}

with requests.Session() as s:
    url = 'https://www.codechef.com/'
    r = s.get(url, headers=headers)
    soup = BeautifulSoup(r.content, 'html5lib')
    # The login form carries a hidden one-time token that must be sent back with the POST
    login_data['form_build_id'] = soup.find('input', attrs={'name': 'form_build_id'})['value']
    r = s.post(url, data=login_data, headers=headers)
    print(r.content)
I'm getting an error in line 18:
Couldn't find a tree builder with the features you requested: html5lib. Do you need to install a parser library?
Can you help me out?

Hi @ukulsh, try installing the module with pip:
pip install html5lib
and it will work.
Hey guys,
just use 'html.parser'. Shouldn't need an extra install.
Hope this helps
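For reference, the only change needed is the parser argument on line 18; `'html.parser'` ships with Python's standard library, so no extra package is required (shown here with a stand-in HTML string):

```python
from bs4 import BeautifulSoup

# Stand-in for the page content fetched by requests
html = "<input name='form_build_id' value='abc123'>"

# 'html.parser' is built in; 'html5lib' and 'lxml' need a pip install
soup = BeautifulSoup(html, 'html.parser')
print(soup.find('input', attrs={'name': 'form_build_id'})['value'])  # abc123
```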
NameError: name 'BeautifulSoup' is not defined at line 18

I have this error, any help?
Traceback (most recent call last):
  File ".\login2.py", line 18, in <module>
    login_data['form_build_id'] = soup.find('input', attrs={'name': 'form_build_id'})['value']
TypeError: 'NoneType' object is not subscriptable
Have you checked Stack Overflow regarding this issue?
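In case it helps: that TypeError means `soup.find` returned `None`, i.e. the page had no input named `form_build_id` (often because the form is rendered by JavaScript, or the site changed its layout). A small defensive sketch, using a stand-in HTML string:

```python
from bs4 import BeautifulSoup

# Stand-in for a page that does NOT contain the hidden field
soup = BeautifulSoup("<form></form>", "html.parser")

token = soup.find('input', attrs={'name': 'form_build_id'})
if token is None:
    # Field missing: inspect r.content to see what the server actually returned
    print("form_build_id not found in page")
else:
    print(token['value'])
```

Checking for `None` before indexing turns the cryptic TypeError into a message you can act on.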
hey i get error in first line it says "no module named requests"
https://stackoverflow.com/questions/17309288/importerror-no-module-named-requests
You just have to type in your terminal: "pip3 install requests" or "pip install requests"
Hi, thanks for your post, but I have a problem...
https://bionluk.com/login — I can't get into the website.
This site always returns just "[ ]". How can I solve this?
import requests
from bs4 import BeautifulSoup

adres = "www.bionluk.com/login"
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36',
}
url = "http://" + adres
# Collect the data from the given address (with the requests module)
response = requests.get(url, headers=headers)
icerik = response.content
soup = BeautifulSoup(icerik, "lxml")
print(soup.find_all("div"))
I tried, but it doesn't work... I can't reach the HTML code. Please, someone help me.
Hi, I tried the same code (just put in my own username and password), but I am not able to log in to the website.
All I get back is b'\n\n\n\n
Could you help me out?
Hi Nikhil, thank you for this video. I've tried it with my own parameters and website, but the only response was:
Users\Sami\AppData\Local\Programs\Python\Python37-32\python.exe: can't find '__main__' module in ''
Could you please tell me what the matter is?
Best regards,
Sami
Hi Nikhil, this code is excellent and works perfectly.
Can we check the status_code of the POST request to know whether the login was successful or not (for logging in with multiple accounts)?
I don't want the content of the login page, I just need to know if the login succeeded (maybe using .status_code).
This Beautiful Soup find function might not work for all websites.
Please help.
I'm getting an error in line 18:
Couldn't find a tree builder with the features you requested: html5lib. Do you need to install a parser library?
Can you help me out?
Try this command in the terminal: pip install html5lib
@uddeshyy You're better off using the cookies in the response. You can use status codes, but status codes depend entirely on how the website responds. If the website gives a 200 response for both valid and invalid logins, then you can't use them. If it gives 400 for an invalid login and 200 for a valid one, then there is a chance to use the status code as the key for a successful login.
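A minimal sketch of that idea, combining the status code with a marker in the response body (the `'logout'` marker is a placeholder; pick something that only appears on your site when logged in):

```python
def login_ok(status_code: int, body: str) -> bool:
    """Heuristic login check: status code plus a logged-in-only marker."""
    # Many sites return 200 even for a failed login, so the status code
    # alone is not enough; also look for content only a logged-in user sees.
    return status_code == 200 and 'logout' in body.lower()

# After r = s.post(url, data=login_data, headers=headers):
#   login_ok(r.status_code, r.text)
print(login_ok(200, '<a href="/logout">Logout</a>'))  # True
print(login_ok(401, ''))                              # False
```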
@uminostech1089 You need to send the request the same way the browser sends it to the server. Use Fiddler or any HTTP network debugging tool to see how the website sends its requests to the server.
Hi Nikhil,
I was trying to log in to a website internal to my organisation. However, even though I passed correct credentials, the POST request throws the error below:
Server Error
401 - Unauthorized: Access is denied due to invalid credentials.
You do not have permission to view this directory or page using the credentials that you supplied.
So I tried a GET request for the same URL, but got the same response from the server. Is it something the server might have done to block bots, or am I missing something? Interestingly, if I replace my URL with the codechef URL, I get the appropriate response. Here is my code snippet:
import requests

headers_simple = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'}
url = r'http://private_url'
with requests.Session() as s:
    r = s.get(url=url, headers=headers_simple)
    print(r.content)
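For what it's worth, a 401 on a plain GET often means the server expects HTTP authentication (Basic/NTLM) rather than a form login. A sketch of what requests would send with built-in Basic auth; the URL and credentials are placeholders, and the request is only prepared here, not sent:

```python
import requests
from requests.auth import HTTPBasicAuth

# Placeholder URL and credentials; prepare() applies the auth without
# actually contacting the server, so we can inspect the header it builds.
req = requests.Request('GET', 'http://private_url',
                       auth=HTTPBasicAuth('user', 'pass')).prepare()
print(req.headers['Authorization'])  # Basic dXNlcjpwYXNz
```

If the internal site uses Windows auth, a package like requests-ntlm would be needed instead.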