Last active
July 13, 2019 15:16
Rewritten parse method from Scrapy Chapter 6
def parse(self, response):
    # Each <li> under the news list holds one article link.
    for block in response.xpath('//ul[@id="newslistul"]//li'):
        href = block.xpath('.//a[contains(@class, "tit")]/@href').extract_first()
        if href:  # skip items without a matching link; follow() raises on None
            # Crawl the news article body
            yield response.follow(url=href, callback=self.parse_content)
    a_next = response.xpath('//a[contains(@class, "p_next")]/@href').extract_first()
    if a_next:
        # Crawl the next page
        yield response.follow(a_next, callback=self.parse)
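Both response.follow calls in parse receive href values exactly as they appear in the page, and those values may be relative. Scrapy resolves them against the URL of the current response. The sketch below approximates that resolution with the standard library's urljoin so it can be tried without a running spider. The helper name resolve_links, the page URL, and the href values are all hypothetical and are not taken from the site being crawled. Scrapy's own resolution may also honor a <base> tag in the page, which this sketch ignores.

```python
from urllib.parse import urljoin


def resolve_links(page_url, hrefs):
    """Resolve each href against page_url, skipping missing (None or empty) values."""
    return [urljoin(page_url, href) for href in hrefs if href]


# Hypothetical listing page and extracted href values, for illustration only.
page_url = "https://news.example.com/list/index.html?page=1"
hrefs = ["/article/123.html", "detail/456.html", None, "?page=2"]

resolved = resolve_links(page_url, hrefs)
for url in resolved:
    print(url)
```

A root-relative href replaces the whole path, a plain relative href replaces only the last path segment, and a query-only href such as a next-page link keeps the path and swaps the query string. Missing values are dropped here because extract_first() returns None when the XPath matches nothing.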