@barraponto
Created April 6, 2012 15:23
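# BaseSpider was renamed to Spider in later Scrapy releases.
import scrapy
from lxml.html import fromstring


class KwnewsSpider(scrapy.Spider):
    name = 'kwnews'
    allowed_domains = ['www.independent.co.uk']
    start_urls = ['http://www.independent.co.uk/']

    def parse(self, response):
        # lxml.html.parse() expects a filename, URL, or file-like object,
        # so markup already held in memory is parsed with fromstring().
        doc = fromstring(response.body)
        print(doc)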