@Parth-Vader
Created July 10, 2017 13:47
A program to run multiple crawls with the PyPy version of Scrapy in the same process.
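from twisted.internet import reactor
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from scrapy.utils.project import get_project_settings

# CrawlerRunner does not set up logging on its own, so configure it first.
configure_logging()

# Pass the project settings so the runner can look up spiders by name.
runner = CrawlerRunner(get_project_settings())

# 'followall' is the name of one of the spiders of the project.
# Schedule it twice; both crawls run concurrently in this process.
runner.crawl('followall')
runner.crawl('followall')

# join() returns a Deferred that fires once every scheduled crawl has finished.
d = runner.join()
d.addBoth(lambda _: reactor.stop())

reactor.run()  # the script will block here until the crawling is finished

# Alternative: CrawlerProcess(get_project_settings()) with
# process.crawl('followall') and process.start() manages the reactor itself.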
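
The script above schedules both crawls at once, so they run concurrently. If the runs should happen one after another instead, a minimal sketch along the following lines may work. It assumes a Scrapy version where `CrawlerRunner` works with the default Twisted reactor (newer releases may require installing a reactor explicitly), and the function name `run_sequentially` and its `runs` parameter are hypothetical, introduced only for illustration.

```python
def run_sequentially(spider_name, runs=2):
    """Run the named spider `runs` times, one crawl after another.

    Hypothetical helper, not part of Scrapy. Must be called from inside
    a Scrapy project so that the spider name can be resolved.
    """
    # With zero runs the Deferred below would fire before the reactor
    # starts, and stopping a reactor that is not running raises an error.
    if runs < 1:
        raise ValueError("runs must be at least 1")

    # Imported here so the module can be loaded without touching the reactor.
    from twisted.internet import defer, reactor
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.log import configure_logging
    from scrapy.utils.project import get_project_settings

    configure_logging()
    runner = CrawlerRunner(get_project_settings())

    @defer.inlineCallbacks
    def crawl():
        # Each yield waits for the previous crawl to finish.
        for _ in range(runs):
            yield runner.crawl(spider_name)

    d = crawl()
    # Stop the reactor whether the crawls succeeded or failed.
    d.addBoth(lambda _: reactor.stop())
    reactor.run()  # blocks until all runs are finished


# Example (inside a Scrapy project): run_sequentially('followall', runs=2)
```

Chaining the crawls this way trades total wall-clock time for isolation between runs, which can make per-run timings easier to compare.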