@mozillazg
Created September 5, 2012 14:16
cs101-unit2
def get_next_target(page):
    # Find the next '<a href=' tag; return (None, 0) if there is none.
    start_link = page.find('<a href=')
    if start_link == -1:
        return None, 0
    # The URL sits between the first pair of double quotes after the tag.
    start_quote = page.find('"', start_link)
    end_quote = page.find('"', start_quote + 1)
    url = page[start_quote + 1:end_quote]
    return url, end_quote


def print_all_links(page):
    # Print each link target, then keep searching from its closing quote.
    while True:
        url, endpos = get_next_target(page)
        if url:
            print(url)
            page = page[endpos:]
        else:
            break


print_all_links('this <a href="test1">link 1</a> is <a href="test2">link 2</a> a <a href="test3">link 3</a>')
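The same scan can return its results instead of printing them, which makes it easier to check. The following is a minimal sketch, not part of the original gist: `get_all_links` is a hypothetical name, and it tests `url is None` rather than the truthiness of `url`, on the assumption that an empty `href=""` should not end the scan early.

```python
def get_next_target(page):
    """Return (url, end_quote) for the first link in page, or (None, 0)."""
    start_link = page.find('<a href=')
    if start_link == -1:
        return None, 0
    start_quote = page.find('"', start_link)
    end_quote = page.find('"', start_quote + 1)
    return page[start_quote + 1:end_quote], end_quote


def get_all_links(page):
    """Hypothetical helper: collect every link target in page into a list."""
    links = []
    while True:
        url, endpos = get_next_target(page)
        if url is None:  # no further '<a href=' tag in the remaining text
            break
        links.append(url)
        page = page[endpos:]  # resume scanning at the closing quote
    return links


sample = 'this <a href="test1">link 1</a> is <a href="test2">link 2</a> a <a href="test3">link 3</a>'
print(get_all_links(sample))
```

For the sample string used in the gist, this should produce the three targets `test1`, `test2`, and `test3` in order.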