@ssteuteville
Created June 26, 2015 03:44
basic scrapyz example
from scrapyz.spiders.core import GenericSpider, Target


class RedditSpider(GenericSpider):
    name = "reddit"
    start_urls = ["https://www.reddit.com/"]

    class Meta:
        elements = ".thing"
        targets = [
            Target("rank", ".rank::text"),
            Target("upvoted", ".upvoted::text"),
            Target("dislikes", ".dislikes::text"),
            Target("likes", ".likes::text"),
            Target("title", "a.title::text"),
            Target("domain", ".domain > a::text"),
            Target("tagline", ".tagline > time::attr(datetime)"),
            Target("author", ".tagline > .author::text"),
            Target("subreddit", ".tagline > .subreddit::text"),
            Target("comments", ".comments::text"),
        ]
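
The spider above is declarative: one selector picks out the repeated elements, and each target names a field and the path to pull it from. The sketch below illustrates that idea in plain Python, with no dependencies. It is not scrapyz's implementation. `SketchTarget`, `build_items`, and the `select` callback are hypothetical names introduced here, and the "elements" are plain dicts standing in for a real selector engine.

```python
from collections import namedtuple

# Hypothetical stand-in for a target: a field name plus a CSS path.
# This is NOT scrapyz's Target class, only an illustration of the idea.
SketchTarget = namedtuple("SketchTarget", ["name", "path"])


def build_items(elements, targets, select):
    """Return one dict per element, with one key per target.

    `select(element, path)` is supplied by the caller and resolves a path
    against an element; it stands in for a real CSS selector engine.
    """
    items = []
    for element in elements:
        item = {t.name: select(element, t.path) for t in targets}
        items.append(item)
    return items


# Fake "elements": each dict maps a CSS path to the text it would yield.
fake_elements = [
    {".rank::text": "1", "a.title::text": "First post"},
    {".rank::text": "2", "a.title::text": "Second post"},
]

targets = [
    SketchTarget("rank", ".rank::text"),
    SketchTarget("title", "a.title::text"),
]

# Look each path up in the fake element; missing paths become None.
items = build_items(fake_elements, targets, lambda el, path: el.get(path))
```

Each entry in `items` is a dict keyed by target name, which is roughly the shape one would expect a scraped item to take.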