@morishjs
Created July 28, 2016 11:02
using BeautifulSoup
# Python 2 script: sum the integers found in the page's <span> tags
import urllib2
from bs4 import BeautifulSoup

# Fetch the page and parse it with Python's built-in HTML parser
page = urllib2.urlopen('http://python-data.dr-chuck.net/comments_300262.html')
soup = BeautifulSoup(page, 'html.parser')
# print(soup.prettify())  # uncomment to inspect the parsed HTML

# Each <span> holds one integer as text; convert each and accumulate
spans = soup.find_all('span')
total = 0
for span in spans:
    total += int(span.string)
print total
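
The script above needs Python 2, the bs4 package, and network access. As a rough offline sketch of the same idea (add up the integer text inside every span tag), the Python 3 version below swaps BeautifulSoup for the standard library's html.parser so it runs with no third-party install. SpanSummer, sum_spans, and SAMPLE_HTML are hypothetical names introduced here for illustration, and the sample markup is made up rather than copied from the real page.

```python
from html.parser import HTMLParser


class SpanSummer(HTMLParser):
    """Adds up the integer text found inside <span> tags."""

    def __init__(self):
        super().__init__()
        self.in_span = False  # True while the parser is inside a <span>
        self.total = 0

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.in_span = True

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_span = False

    def handle_data(self, data):
        text = data.strip()
        # Only count non-empty text that sits inside a <span>
        if self.in_span and text:
            self.total += int(text)


def sum_spans(html):
    """Return the sum of the integers inside all <span> tags of html."""
    parser = SpanSummer()
    parser.feed(html)
    parser.close()
    return parser.total


# Made-up sample markup (assumption), not the contents of the real page
SAMPLE_HTML = """
<table>
<tr><td>Alice</td><td><span class="comments">97</span></td></tr>
<tr><td>Bob</td><td><span class="comments">3</span></td></tr>
</table>
"""

print(sum_spans(SAMPLE_HTML))
```

Like the original, this sketch assumes every span contains only an integer; any other span text would make int() raise a ValueError.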