Skip to content

Instantly share code, notes, and snippets.

@PaulSec
Created September 24, 2015 11:34
Show Gist options
  • Select an option

  • Save PaulSec/112c2bc975328692efe8 to your computer and use it in GitHub Desktop.

Select an option

Save PaulSec/112c2bc975328692efe8 to your computer and use it in GitHub Desktop.
Dump top 500 Alexa sites from the website
import requests
from bs4 import BeautifulSoup
for x in range(0, 20):
req = requests.get("http://www.alexa.com/topsites/global;%s" % x)
soup = BeautifulSoup(req.content)
for paragraph in soup.findAll('p', attrs={'class': 'desc-paragraph'}):
print paragraph.find('a').text
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment