@lordzuko
Last active October 26, 2023 13:30
Automating login using BeautifulSoup & Requests module in Python over SSO
{
"ssousername":"example@oracle.com",
"password":"put your password here"
}
import sys
import json

import requests
from bs4 import BeautifulSoup


def mprint(x):
    # Write a progress message followed by a newline
    sys.stdout.write(x)
    print()


headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux i686; rv:7.0.1) Gecko/20100101 Firefox/7.0.1'}

mprint('[-] Initialization...')
s = requests.session()
s.headers.update(headers)
print('done')

mprint('[-] Gathering JSESSIONID..')
# This should redirect us to the login page.
# Looking at the page source, the login form submits 6 values
# (at least at the time this script was written).
# We extract those values with Beautiful Soup and POST them to
# https://login.oracle.com/mysso/signon.jsp
# The response reports that more data was sent than necessary and
# leads to the form whose data has to be submitted to
# https://login.oracle.com/oam/server/sso/auth_cred_submit
# Once that is done we are signed in, and s.get(url) returns the page we want.
r = s.get("company's local url- a link which requires authentication")
if r.status_code != requests.codes.ok:
    print('error')
    sys.exit(1)
print('done')

# Collect every hidden input of the login form
c = r.content
soup = BeautifulSoup(c, 'lxml')
svars = {}
for var in soup.find_all('input', type="hidden"):
    svars[var['name']] = var['value']

# Keep using the same session here. Creating a new one at this point
# (s = requests.session()) discards the cookies set during the redirect
# and makes the login fail.
r = s.post('https://login.oracle.com/mysso/signon.jsp', data=svars)

mprint('[-] Trying to submit credentials...')
with open('credentials.json', 'r') as inputRaw:
    login = json.load(inputRaw)

data = {
    'v': svars['v'],
    'OAM_REQ': svars['OAM_REQ'],
    'site2pstoretoken': svars['site2pstoretoken'],
    'locale': svars['locale'],
    'ssousername': login['ssousername'],
    'password': login['password'],
}
r = s.post('https://login.oracle.com/oam/server/sso/auth_cred_submit', data=data)
r = s.get("company's local url- a link which requires authentication")

# Dump the HTML page to a file (binary mode, because r.content is bytes)
with open('test.html', 'wb') as f:
    f.write(r.content)
@eggonzal

I get a 200 response for line 64, but if I print the content of that response I can see

<p class="loginFailed">System error. Please re-try your action. If you continue to get this error, please contact the Administrator.</p>

I even added the request_id to the data dictionary.

Have you tried this recently?
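
The response quoted above shows that a 200 status does not by itself mean the login worked, because the failure is reported inside the page body. Below is a minimal sketch of a check for that marker, using only the standard library. The helper name login_failed is hypothetical, and the loginFailed class name is taken from the snippet above; other SSO pages may signal failure differently.

```python
from html.parser import HTMLParser


class _LoginFailedFinder(HTMLParser):
    """Sets .found when any tag carries the CSS class 'loginFailed'."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # The class attribute may hold several space-separated names
        classes = (dict(attrs).get("class") or "").split()
        if "loginFailed" in classes:
            self.found = True


def login_failed(html_text):
    """Return True if the page contains an element with class 'loginFailed'."""
    finder = _LoginFailedFinder()
    finder.feed(html_text)
    return finder.found


page = '<p class="loginFailed">System error. Please re-try your action.</p>'
print(login_failed(page))
```

In the script this could be called as login_failed(r.text) after the final s.get(...), instead of relying on the status code alone.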

@eggonzal

Actually I found the issue.

At line 47 you create a new requests.session(), which causes the login to fail because it is as if you never got redirected to the login page. I commented out that line and was able to log in.

@lordzuko
Author

@eggonzal, apologies for the late reply; I didn't see the notification earlier. Glad it worked for you.

@essentialols

@lordzuko I'm struggling to find the appropriate POST URL for my purpose. The login page looks like it is the same as the POST page. Is there any foolproof solution to finding the POST URL?

@eggonzal

eggonzal commented Dec 5, 2020

I was able to do it by parsing the HTML form and reading the form's action attribute.
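
A rough sketch of that approach, using only the standard library so it runs without extra packages (with BeautifulSoup the lookup would be soup.find('form') and its 'action' attribute). The helper name find_post_url is hypothetical. The sample form tag and URLs are the ones quoted elsewhere in this thread. A form without an action attribute posts back to the page's own URL, which would explain a login page that appears to be its own POST target.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _FormActionFinder(HTMLParser):
    """Records the action attribute of the first <form> on the page."""

    def __init__(self):
        super().__init__()
        self.seen_form = False
        self.action = None

    def handle_starttag(self, tag, attrs):
        if tag == "form" and not self.seen_form:
            self.seen_form = True
            self.action = dict(attrs).get("action")


def find_post_url(page_url, html_text):
    """Resolve the first form's action against the URL the page came from."""
    finder = _FormActionFinder()
    finder.feed(html_text)
    # A missing or empty action means the form submits to the page itself
    return urljoin(page_url, finder.action or "")


login_page = "https://login.oracle.com/mysso/signon.jsp"
html_text = '<form method="post" action="/oam/server/sso/auth_cred_submit" name="LoginForm"></form>'
print(find_post_url(login_page, html_text))
```

This only covers forms present in the served HTML; a form built or submitted by JavaScript would need a look at the browser's network tab instead.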

@DharveshAtish

@lordzuko thanks for the effort in providing this piece of code. Unfortunately, I'm having a hard time making it work on my side: I'm hitting the ADF loopback script (see below). What I'm trying to achieve is to get the content of an Oracle document (https://support.oracle.com/rs?type=doc&id=2796575.1 for instance), which in turn requires logging in at https://login.oracle.com/mysso/signon.jsp by submitting the login/password. I do not have a "company's local url- a link which requires authentication"; instead I'm using https://support.oracle.com/rs?type=doc&id=2796575.1 for the first GET request. My login/password is valid, since it works through a web browser.

Basically, what I'm trying to achieve is similar to this post, but I had no luck with that either: https://stackoverflow.com/questions/57209347/how-to-login-and-web-scrape-support-oracle-com-using-python3-requests

After the POST, r = s.post('https://login.oracle.com/mysso/signon.jsp', data=svars), I'm getting the following hidden inputs, but site2pstoretoken is missing:

form method="post" action="/oam/server/sso/auth_cred_submit" name="LoginForm" autocomplete="off"
input type="hidden" name="v" value="v1.4"
input type="hidden" name="request_id" value="669602750230009589868"
input type="hidden" name="OAM_REQ" value="VERSION_4~LwYVl9zVjDiVV1IdzIFLCjQEb5%....."
input type="hidden" name="locale" value=""

Maybe I'm getting it wrong; I would really appreciate your help on this.
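
One way to avoid depending on a fixed list of field names such as site2pstoretoken is to build the payload from whatever hidden inputs the form actually contains and then add the credentials. The sketch below assumes that approach; it is not confirmed anywhere in this thread that the server accepts such a payload. The helper name build_login_payload is hypothetical, the sample form reuses the fields listed above (OAM_REQ is left out because its value is truncated there), and the credential values are placeholders.

```python
from html.parser import HTMLParser


class _HiddenInputCollector(HTMLParser):
    """Collects name/value pairs of all <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        is_hidden = (a.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and a.get("name"):
            # Inputs without a value attribute are sent as empty strings
            self.fields[a["name"]] = a.get("value") or ""


def build_login_payload(html_text, username, password):
    """Hidden form fields plus the two credential fields used by the script."""
    collector = _HiddenInputCollector()
    collector.feed(html_text)
    payload = dict(collector.fields)
    payload["ssousername"] = username
    payload["password"] = password
    return payload


form_html = """
<form method="post" action="/oam/server/sso/auth_cred_submit" name="LoginForm">
  <input type="hidden" name="v" value="v1.4">
  <input type="hidden" name="request_id" value="669602750230009589868">
  <input type="hidden" name="locale" value="">
</form>
"""
payload = build_login_payload(form_html, "example@oracle.com", "placeholder-password")
print(sorted(payload))
```

The resulting dictionary would be passed as data= to the s.post(...) call that submits the credentials.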

Below is part of the loopback script that is returned by r = s.post('https://login.oracle.com/oam/server/sso/auth_cred_submit', data=data):

This is the loopback script to process the url before the real page loads. It introduces
a separate round trip. During this first roundtrip, we currently do two things:
- check the url hash portion, this is for the PPR Navigation.
- do the new window detection
the above two are both controled by parameters in web.xml
Since it's very lightweight, so the network latency is the only impact.
here are the list of will-pass-in parameters (these will replace the param in this whole
pattern:
viewIdLength view Id length (characters),
loopbackIdParam loopback Id param name,
loopbackId loopback Id,
windowModeIdParam window mode param name,
clientWindowIdParam client window Id param name,
windowId window Id,
initPageLaunch initPageLaunch,
enableNewWindowDetect whether we want to enable new window detection
jsessionId session Id that needs to be appended to the redirect URL
adfHashMarker A string udentifying the start of PPR Navigation hash
enablePPRNav whether we want to enable PPR Navigation
internalParamsObj an object whose keys are the names of the internal parameters and whose values evaluate as true
noLoopbackViewId View Id used where the page should be redirected when the session cannot be established due to the
browser with disabled cookies accessing a server with URL rewriting disabled

thanks.

@DharveshAtish

I was finally able to fix it using the ADF Faces parameter org.apache.myfaces.trinidad.agent.email=true, which outputs the page in a simplified mode, either for printing or for emailing, and that is more convenient when scraping the site. Anyone having the same issue, check out https://docs.oracle.com/middleware/1213/adf/develop-faces/adf-faces-outputmodes.htm
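
A small sketch of how that parameter might be attached to a request URL. It assumes the parameter is appended as an ordinary query parameter to the document URL mentioned earlier in this thread; the comment above does not say which request it was added to. The helper name with_email_mode is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

EMAIL_MODE_PARAM = "org.apache.myfaces.trinidad.agent.email"


def with_email_mode(url):
    """Return url with the ADF Faces email-mode parameter set to true."""
    parts = urlsplit(url)
    # Keep the existing query parameters, dropping any earlier copy of ours
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != EMAIL_MODE_PARAM
    ]
    query.append((EMAIL_MODE_PARAM, "true"))
    return urlunsplit(parts._replace(query=urlencode(query)))


doc_url = "https://support.oracle.com/rs?type=doc&id=2796575.1"
print(with_email_mode(doc_url))
```

With an authenticated session from the script, the page would then be fetched with s.get(with_email_mode(doc_url)).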
