Last active
April 15, 2017 11:37
psiupuxa.com
# -*- coding: utf-8 -*-
'''
$ mkvirtualenv psiupuxa.com
$ workon psiupuxa.com
$ pip install grequests
$ pip install scrapy
$ python psiupuxa.com.py > psiupuxa.com.txt
$ xargs -n 1 -P 64 wget < psiupuxa.com.txt
'''
# Import grequests first so gevent can patch the standard library early.
import grequests
from scrapy.selector import Selector

# Listing pages 2 through 8 (page + 1 for page in range(1, 8)), as before.
urls = [
    'http://psiupuxa.com/pages/{page}'.format(page=page + 1)
    for page in range(1, 8)
]

# grequests.map sends the requests concurrently; a failed request yields None.
for response in grequests.map(grequests.get(url) for url in urls):
    if response is None:
        continue
    # Print every link that points at a desktop-sized wallpaper.
    for url in Selector(text=response.text).xpath('//a/@href').extract():
        if '_desktop.' in url:
            print(url)