@ericmoritz
Created June 17, 2011 20:25
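import time

from pyquery import PyQuery as pq

# Follow the first link in each reddit comment until the chain loops or ends.
seen = set()
start = "http://www.reddit.com/r/pics/comments/i24rp/every_time_i_see_the_vancouver_kiss_submitted/c209bus"
url = start  # renamed from `next` to avoid shadowing the built-in
while True:
    # A URL we have already visited means the chain of links has looped.
    if url in seen:
        print("Recursion detected, abort!")
        break
    seen.add(url)
    d = pq(url=url)  # fetch and parse the page
    # First link in the comment body matched by the selector.
    match = d(".usertext.border .usertext-body a:first")
    if match:
        print("Going deeper: %s" % (url,))
        url = match.attr("href")
    else:
        print("No match here %s, abort" % (url,))
        break
    time.sleep(0.1)  # brief pause between requests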
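
The loop above can only be exercised against the live site. As a rough sketch (not part of the original gist), the same visited-set logic can be pulled into a function that takes the "find the next link" step as a callback, so it can be tried offline. The names `follow_links`, `get_next`, `max_hops`, and `fake_links` are hypothetical and introduced only for illustration.

```python
def follow_links(start, get_next, max_hops=1000):
    """Follow a chain of links from `start` until it ends or loops.

    get_next(url) is a hypothetical callback returning the next URL or None.
    Returns (path, reason); reason is "cycle", "dead_end", or "max_hops".
    """
    seen = set()
    path = []
    url = start
    while len(path) < max_hops:
        if url in seen:
            # Already visited: the chain has looped back on itself.
            return path, "cycle"
        seen.add(url)
        path.append(url)
        nxt = get_next(url)
        if nxt is None:
            # No further link on this page.
            return path, "dead_end"
        url = nxt
    return path, "max_hops"


# Offline stand-in for the network: each page maps to its first link.
fake_links = {"a": "b", "b": "c", "c": "a"}
path, reason = follow_links("a", fake_links.get)
print(path, reason)
```

For real pages, `get_next` could wrap the pyquery selector used in the script and return `match.attr("href")` when a link is found, or `None` otherwise.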