@berlinbrown
Created March 15, 2013 08:29
Web crawler fun, some popular seeds
Here are some popular seeds for basic web crawling.
www.realclearreligion.org
www.michigan.com
www.instapaper.com
itunes.apple.com
www.detroitnews.com
shop.npr.org
washington.cbslocal.com
www.marco.org
wordpress.org
www.deadline.com
www.pmc.com
www.realclearpolitics.com
www.huffingtonpost.com
npr.org
www.theverge.com
bloomberg.com
reuters.com
edition.cnn.com
www.hooktheory.com
www.wired.com
thehill.com
singularityhub.com
www.bloomberg.com
ocw.mit.edu
cnn.com
www.nytimes.com
www.newscientist.com
web.mit.edu
news.ycombinator.com
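
The list above gives bare hostnames, so a crawler has to attach a scheme before it can fetch anything. As a minimal sketch, assuming HTTPS and assuming that a `www.` host and its bare domain count as the same site (neither is stated in the gist), the seeds could be loaded into a frontier queue roughly like this. The names `SEED_HOSTS`, `to_seed_url`, and `build_frontier` are hypothetical, and only a handful of hosts are copied in for brevity.

```python
from collections import deque
from urllib.parse import urlsplit

# A few hosts copied from the list above; the rest would be added the same way.
SEED_HOSTS = [
    "www.bloomberg.com",
    "bloomberg.com",
    "npr.org",
    "news.ycombinator.com",
]


def to_seed_url(host, scheme="https"):
    """Turn a bare hostname into a fetchable URL (the scheme is an assumption)."""
    return f"{scheme}://{host.strip().lower()}/"


def build_frontier(hosts):
    """Return a FIFO queue of unique seed URLs, preserving list order."""
    seen = set()
    frontier = deque()
    for host in hosts:
        url = to_seed_url(host)
        name = urlsplit(url).hostname
        # Assumption: "www.example.com" and "example.com" are the same site.
        key = name[4:] if name.startswith("www.") else name
        if key not in seen:
            seen.add(key)
            frontier.append(url)
    return frontier


if __name__ == "__main__":
    for url in build_frontier(SEED_HOSTS):
        print(url)
```

A real crawl would pop URLs from the left of the queue, fetch them, and append newly discovered links; this sketch stops at building the initial frontier and makes no network requests.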