@robincamille
Last active June 15, 2017 18:04
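#!/usr/bin/env python
# -*- coding: utf-8 -*-
# From a saved Wikipedia page, plucks out the link text of the links
# at indices 100-199 (the slice [100:200] excludes index 200)
from bs4 import BeautifulSoup

with open('Royal_British_Columbia_Museum.html') as doc:
    soup = BeautifulSoup(doc, "html.parser")

links = soup.find_all('a')
for link in links[100:200]:
    title = link.get_text()
    if len(title) < 1:
        continue  # skip links with no visible text (e.g. image-only links)
    elif title[0] == '[':
        continue  # skip footnote markers such as [1]
    elif title == 'edit':
        continue  # skip section "edit" links
    else:
        print(title)  # single-argument print() works on Python 2 and 3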
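
The three skip rules in the script above can also be pulled out into a small predicate, which makes them easy to check without a saved page. This is only a sketch: the name `is_content_link` and the sample strings are made up for illustration and are not part of the original gist.

```python
def is_content_link(text):
    """Return True if a link's text passes the script's three skip rules."""
    if len(text) < 1:
        return False  # no visible text
    if text[0] == '[':
        return False  # footnote marker such as [1]
    if text == 'edit':
        return False  # section "edit" link
    return True


# Hypothetical link texts, not taken from the real page.
texts = ["Victoria", "[1]", "edit", "", "Museum"]
kept = [t for t in texts if is_content_link(t)]
print(kept)
```

With the script's own `links` list, the same predicate could be applied as `[a.get_text() for a in links[100:200] if is_content_link(a.get_text())]`.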