@kroger
Created February 13, 2012 03:34
Download web page in Python with authentication
import urllib
import urllib2

"""
Usage example:

import json, os
secret = json.load(open(os.path.expanduser('~/.secret')))
opener = authenticate('http://mixergy.com/wp-login.php',
                      secret['mixergy']['username'],
                      secret['mixergy']['password'])
html = get_page('http://mixergy.com/ali-brown-info-business/', opener)
"""
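
def authenticate(login_url, username, password, login_string='log', password_string='pwd'):
    """Log in at login_url and return an opener that keeps the session cookies.

    The variables login_string and password_string are the names
    used in the input form. In WordPress they are usually equal to
    'log' and 'pwd' respectively.
    """
    # The cookie processor stores any cookies set by the login response.
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    # Also make this opener the default for plain urllib2.urlopen() calls.
    urllib2.install_opener(opener)
    p = urllib.urlencode({login_string: username, password_string: password})
    # Passing a data argument turns the request into a POST.
    f = opener.open(login_url, p)
    f.read()
    f.close()
    return opener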

def get_page(page, opener):
    """Fetch page with the authenticated opener and return the response body."""
    return opener.open(page).read()
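
The gist targets Python 2, where `urllib2` and `urllib.urlencode` exist. As a rough sketch only, and not part of the original gist, the same cookie-based login might look like this on Python 3, where those calls live in `urllib.request` and `urllib.parse`. The helpers `make_opener` and `encode_credentials` are hypothetical names introduced here so the pieces can be checked without a network connection; `authenticate` and `get_page` keep the gist's signatures.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_opener(user_agent="Mozilla/5.0"):
    """Hypothetical helper: build an opener that remembers cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-agent", user_agent)]
    return opener, jar


def encode_credentials(username, password,
                       login_string="log", password_string="pwd"):
    """Hypothetical helper: form-encode the login fields as bytes.

    Python 3 requires the POST body to be bytes, not str.
    """
    fields = {login_string: username, password_string: password}
    return urllib.parse.urlencode(fields).encode("ascii")


def authenticate(login_url, username, password,
                 login_string="log", password_string="pwd"):
    """POST the credentials and return the opener holding the session cookies."""
    opener, _jar = make_opener()
    data = encode_credentials(username, password, login_string, password_string)
    # Supplying data makes this a POST; the response body itself is not needed.
    with opener.open(login_url, data) as response:
        response.read()
    return opener


def get_page(page, opener):
    """Fetch page with the authenticated opener; returns the body as bytes."""
    with opener.open(page) as response:
        return response.read()
```

Unlike the gist, this sketch does not call `install_opener`, so the cookies stay local to the returned opener rather than becoming the process-wide default. Note that `get_page` returns bytes here; decoding (for example with `.decode('utf-8')`) is left to the caller, since the page encoding is not known in advance.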