@kaja47
Created February 13, 2012 09:59
czechiatwitter.com crawler
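<?php
// Crawl every listing page of czechiatwitter.com and collect the user names.
$users = array();
foreach (range(1, 897) as $page) { // the page count is hard-coded
    $dom = new DOMDocument();
    // "@" suppresses the warnings libxml emits for malformed real-world HTML.
    if (!@$dom->loadHTMLFile("http://www.czechiatwitter.com/?page=$page")) {
        continue; // skip pages that could not be fetched or parsed
    }
    $xpath = new DOMXPath($dom);
    // Each user name is the text of a link nested directly inside a <strong> element.
    foreach ($xpath->query('//strong/a') as $node) {
        $users[] = $node->nodeValue;
    }
}
// Write the collected names to the file, one per line.
file_put_contents('twitter_users', implode("\n", $users));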