@rshyam1
Created May 24, 2016 01:23
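from scrapy import Spider, Request
from scrapy.selector import Selector
from demo.items import DemoItem


class DemoSpider(Spider):
    name = 'demo'
    # Scrapy reads `allowed_domains` (bare domain names, no scheme);
    # an attribute named `allowed_urls` is silently ignored.
    allowed_domains = ['newyork.craigslist.org']
    start_urls = ['https://newyork.craigslist.org/search/stn/cto']

    def parse(self, response):
        # Queue 26 result pages: the `s` query value runs 0, 100, ..., 2500.
        for i in range(26):
            page_url = response.urljoin('?s=' + str(100 * i))
            # parse_main_page is not included in this snippet;
            # it must be defined on the spider before this will run.
            yield Request(page_url, callback=self.parse_main_page)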
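
The pagination logic in `parse` can be checked without running a crawl. The sketch below assumes that `response.urljoin` resolves a relative reference against the response URL in the same way as the standard library's `urljoin`. The helper name `page_urls` and its parameters are hypothetical and not part of the original spider.

```python
from urllib.parse import urljoin


def page_urls(base_url, pages=26, page_size=100):
    """Build the paginated search URLs that the spider's parse() would request.

    A reference made up of only a query string ('?s=100') keeps the base
    path and replaces the query, which is what the spider relies on.
    """
    return [urljoin(base_url, '?s=' + str(page_size * i)) for i in range(pages)]


urls = page_urls('https://newyork.craigslist.org/search/stn/cto')
print(urls[0])
print(urls[-1])
print(len(urls))
```

Printing the first and last entries is a quick way to confirm the offset range before pointing the spider at the live site.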