@ayubmetah
Created January 11, 2021 23:27
Write a Python program that uses urllib to read HTML from a data file at a URL, parses the data to extract the numbers, and computes the sum of the numbers in the file.
from urllib.request import urlopen
from bs4 import BeautifulSoup
import ssl
# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
url = 'http://py4e-data.dr-chuck.net/comments_1070212.html'
markup = urlopen(url, context=ctx).read()
soup = BeautifulSoup(markup, "html.parser")
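total = 0  # named "total" so the built-in sum() is not shadowed
# Each <span> holds one integer; add it to the running total
for tag in soup('span'):
    p = int(tag.text)
    total = total + p
    print(str(p) + ' : ' + str(total))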
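
The script above needs network access and the third-party bs4 package. As a rough offline sketch of the same span-summing step, the version below swaps BeautifulSoup for the standard library's html.parser and runs on an inline sample. The names SpanNumberParser, sum_span_numbers, and SAMPLE, and the sample markup itself, are illustrative assumptions and not part of the original gist or of the real data file.

```python
from html.parser import HTMLParser


class SpanNumberParser(HTMLParser):
    """Collect the integer found inside each <span> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_span = False
        self.numbers = []

    def handle_starttag(self, tag, attrs):
        # Remember that we are inside a <span> so handle_data knows to read it
        if tag == "span":
            self.in_span = True

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_span = False

    def handle_data(self, data):
        text = data.strip()
        # Only text inside a <span> is treated as a number;
        # int() raises ValueError if that text is not an integer
        if self.in_span and text:
            self.numbers.append(int(text))


def sum_span_numbers(markup):
    """Return the sum of the integers found in <span> tags of an HTML string."""
    parser = SpanNumberParser()
    parser.feed(markup)
    parser.close()
    return sum(parser.numbers)


# Made-up sample markup, assumed to resemble the layout of the data file
SAMPLE = """
<table>
<tr><td>Alice</td><td><span class="comments">97</span></td></tr>
<tr><td>Bob</td><td><span class="comments">3</span></td></tr>
<tr><td>Carol</td><td><span class="comments">42</span></td></tr>
</table>
"""

print(sum_span_numbers(SAMPLE))  # 97 + 3 + 42 = 142
```

To run this against the real page, the bytes returned by urlopen(url, context=ctx).read() would need to be decoded to a string (for example with .decode()) before being passed to sum_span_numbers.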