@Slater-Victoroff
Created May 18, 2015 19:42
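import requests
from lxml import etree
from lxml.cssselect import CSSSelector

def get_voter_links(outfile="voter_info.txt"):
    # Page ids are walked in order; the id range is taken from the original snippet.
    start_urls = ("http://usavoters.directory/complete.php?id=%s" % i for i in range(128555, 214545))
    link_selector = CSSSelector('tr>td>a')  # build the selector once, not once per page
    with open(outfile, 'a') as sink:
        for url in start_urls:
            document = etree.HTML(requests.get(url).content)
            # Skip the first 14 matches, as in the original (presumably non-person links).
            person_links = link_selector(document)[14:]
            # End every link with a newline so links from consecutive pages do not run together.
            sink.writelines(link.get('href') + '\n' for link in person_links if link.get('href'))
            print(url)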