@mjam03
Created February 22, 2021 11:10
test_9
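# imports used below
import io
import zipfile
from urllib.request import Request, urlopen
import requests
from bs4 import BeautifulSoup
# `hdr` is not defined in this snippet; a browser-like User-Agent header is assumed here
hdr = {'User-Agent': 'Mozilla/5.0'}
# name the urls
NRS_ROOT = 'https://www.nrscotland.gov.uk'
NRS_DEATH_STRING = 'statistics/covid19/covid-deaths-21'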
# request website, parse and identify from the html only the url link elements
req = Request(NRS_ROOT+'/covid19stats', headers=hdr)
html_page = urlopen(req)
soup = BeautifulSoup(html_page, "lxml")
links = []
for link in soup.find_all('a'):
    href = link.get('href')
    if href is not None:
        links.append(href)
# take the first link that contains the deaths path and points at a zip file
NRS_COD_ZIP = NRS_ROOT + [x for x in links if NRS_DEATH_STRING in x and '.zip' in x][0]
print("Scottish COD data at: {}".format(NRS_COD_ZIP))
# request the zip file and display the contents
r = requests.get(NRS_COD_ZIP)
zips = zipfile.ZipFile(io.BytesIO(r.content))
print("Files contained within the zip:")
print(zips.namelist()[:5])
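The link-filtering step above can be tried offline, without hitting the NRS site. The sketch below is a minimal illustration, not the gist author's method: it swaps BeautifulSoup for the standard library's `html.parser` so it runs with no third-party packages, and the names `LinkCollector`, `find_zip_url` and the `sample_html` fragment are hypothetical, made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def find_zip_url(html, root, needle):
    """Return the absolute URL of the first link containing `needle` and '.zip', or None."""
    collector = LinkCollector()
    collector.feed(html)
    for href in collector.links:
        if needle in href and ".zip" in href:
            return urljoin(root, href)
    return None


# Hypothetical page fragment, invented for illustration only
sample_html = """
<a href="/files/statistics/covid19/covid-deaths-21-report.pdf">Report</a>
<a>no href here</a>
<a href="/files/statistics/covid19/covid-deaths-21-data.zip">Data</a>
"""
zip_url = find_zip_url(
    sample_html,
    "https://www.nrscotland.gov.uk",
    "statistics/covid19/covid-deaths-21",
)
print(zip_url)
```

Returning `None` when nothing matches avoids the `IndexError` that indexing an empty list with `[0]` would raise, and `urljoin` handles hrefs with or without a leading slash.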