@mahendrakalkura
Last active April 15, 2017 11:37
psiupuxa.com
# -*- coding: utf-8 -*-
'''
$ mkvirtualenv psiupuxa.com
$ workon psiupuxa.com
$ pip install grequests
$ pip install scrapy
$ python psiupuxa.com.py > psiupuxa.com.txt
$ xargs -P 64 wget < psiupuxa.com.txt
'''
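import grequests
from scrapy.selector import Selector

# Note: page + 1 over range(1, 8) requests pages 2 through 8.
urls = ['http://psiupuxa.com/pages/{page}'.format(page=page + 1) for page in range(1, 8)]

# grequests.map sends the requests concurrently and returns the responses
# in order; a request that failed comes back as None.
for response in grequests.map(grequests.get(url) for url in urls):
    if response is None:
        continue
    # Print every link whose href contains '_desktop.'.
    for url in Selector(text=response.text).xpath('//a/@href').extract():
        if '_desktop.' in url:
            print(url)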
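
The link filter in the script can be tried offline, without grequests or scrapy. What follows is only a sketch under stated assumptions: it swaps scrapy's `Selector` for the standard library's `html.parser` (Python 3), and the names `HrefCollector`, `desktop_links`, and the `example.com` sample markup are hypothetical, not part of the original gist.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value is not None:
                    self.hrefs.append(value)


def desktop_links(html):
    """Return the hrefs found in `html` that contain '_desktop.'."""
    collector = HrefCollector()
    collector.feed(html)
    collector.close()
    return [href for href in collector.hrefs if '_desktop.' in href]


# Hypothetical sample markup, used only to exercise the filter.
sample = (
    '<a href="http://example.com/a_desktop.jpg">desktop</a>'
    '<a href="http://example.com/a_mobile.jpg">mobile</a>'
)
print(desktop_links(sample))
```

Keeping the filter in its own function means it can be checked against a saved page before pointing the script at the live site.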