
@cclauss
Last active December 23, 2015 14:29
Create a BeautifulSoup by reading in a webpage.
#!/usr/bin/env python
# create a BeautifulSoup by reading in a webpage
# Current version at https://gist.github.com/cclauss/6648735
# usage:
#     from soupFromURL import soupFromURL
#     theSoup = soupFromURL('http://www.python.org')

import bs4

useRequests = False      # "Python HTTP: When in doubt, or when
try:                     #  not in doubt, use Requests. Beautiful,
    import requests      #  simple, Pythonic." -- Kenny Meyers
    useRequests = True
except ImportError:      # Requests is not installed, so fall back to urlopen
    from contextlib import closing
    try:
        from urllib2 import urlopen          # Python 2
    except ImportError:
        from urllib.request import urlopen   # Python 3

def soupFromURL(inURL):
    # Name the parser explicitly so bs4 does not have to guess one
    if useRequests:
        return bs4.BeautifulSoup(requests.get(inURL).text, 'html.parser')
    else:
        with closing(urlopen(inURL)) as webPageSource:
            #print('Successfully opened URL: ' + inURL)
            return bs4.BeautifulSoup(webPageSource.read(), 'html.parser')

if __name__ == '__main__':
    theURL = 'http://www.python.org'
    theSoup = soupFromURL(theURL)
    print(theSoup.prettify())
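
The script stops at `prettify()`, so here is a minimal sketch of what one might do with the returned soup. It parses an inline HTML string instead of a live page so that it runs without network access. `SAMPLE_HTML` and `summarize` are hypothetical names introduced for this illustration and are not part of the gist.

```python
import bs4

# Hypothetical stand-in for a downloaded page; soupFromURL would hand back
# the same kind of BeautifulSoup object for a real URL.
SAMPLE_HTML = """
<html>
  <head><title>Example page</title></head>
  <body>
    <a href="/about">About</a>
    <a href="https://www.python.org/downloads/">Downloads</a>
    <a name="no-href">Anchor without a link</a>
  </body>
</html>
"""

def summarize(soup):
    """Return the page title and the href of every <a> tag that has one."""
    title = soup.title.string if soup.title else None
    # href=True skips anchors that carry no href attribute
    links = [a['href'] for a in soup.find_all('a', href=True)]
    return title, links

theSoup = bs4.BeautifulSoup(SAMPLE_HTML, 'html.parser')
title, links = summarize(theSoup)
print(title)
for link in links:
    print(link)
```

With the gist's function, the same helper would be called as `summarize(soupFromURL('http://www.python.org'))`.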
cclauss commented Sep 24, 2013

Updated to use Requests (HTTP for Humans) if that module is available; otherwise the script falls back to urllib2.
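
The "use Requests if available" step could also check the HTTP status and set a timeout, neither of which the gist does. The following is a sketch under assumptions, not the author's code: `fetch_html` and its `timeout` parameter are hypothetical, and the fallback targets Python 3's `urllib.request` only.

```python
try:
    import requests  # optional third-party dependency
except ImportError:
    requests = None  # signal that the standard-library path should be used

from urllib.request import urlopen

def fetch_html(url, timeout=10):
    """Return the body of `url` as text, raising an exception on HTTP errors."""
    if requests is not None:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx responses into exceptions
        return response.text
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses by itself
    with urlopen(url, timeout=timeout) as response:
        # Use the charset from the Content-Type header, assuming UTF-8 if absent
        charset = response.headers.get_content_charset() or 'utf-8'
        return response.read().decode(charset, errors='replace')
```

The returned text can then be passed to the parser, for example `bs4.BeautifulSoup(fetch_html('http://www.python.org'), 'html.parser')`.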
