@ischurov
Created May 2, 2022 19:47
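import scrapy


# Run with:
#   scrapy runspider scrappytest.py --set=DEPTH_LIMIT=1 -O table.csv
# Set DEPTH_LIMIT to the number of link hops to follow from the start page.
class BlogSpider(scrapy.Spider):
    name = 'wikispider'
    start_urls = ['https://ru.wikipedia.org/wiki/Африканский_пушистый_погоныш']

    def parse(self, response):
        # Title of the page currently being parsed.
        my_title = response.css('#firstHeading::text').get()
        # Links that are direct children of paragraphs.
        for link in response.css("p > a"):
            # One edge of the link graph: current page -> link text.
            yield {'from': my_title,
                   'to': link.css("::text").get()}
            # Queue the linked page for crawling with the same callback.
            yield response.follow(link, self.parse)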
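
The run command above writes one row per link to table.csv, with the columns `from` and `to`. A minimal sketch of reading that file back into an adjacency map, using only the standard library, might look like the following. The function name `load_edges` and the sample rows are hypothetical and are not part of the original gist.

```python
import csv
import io
from collections import defaultdict


def load_edges(csv_file):
    """Read rows with 'from' and 'to' columns into an adjacency map."""
    graph = defaultdict(set)
    for row in csv.DictReader(csv_file):
        # Links without visible text give an empty 'to' field; skip them.
        if row["from"] and row["to"]:
            graph[row["from"]].add(row["to"])
    return graph


# Hypothetical sample in the same shape as table.csv (not real crawl output).
sample = io.StringIO("from,to\nA,B\nA,C\nB,C\nB,\n")
graph = load_edges(sample)
```

For a real crawl, the same function could be called on the exported file, for example `load_edges(open("table.csv", newline="", encoding="utf-8"))`.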