@BrongoObenge
Last active August 29, 2015 14:07
Process specs crawler
`scrapy shell 'http://tweakers.net/pricewatch/416125/msi-geforce-gtx-970-gaming-4g/specificaties/'` # opens a Scrapy shell with the page loaded as `response`
`a = response.xpath('//table[@class="spec-detail"]/tr')` # a: the rows (`tr`) of the table with class "spec-detail"
`b = a.xpath("//tr")` # b: table rows; pick one with b[n]. Note that the leading `//` is absolute, so this matches every `tr` in the document, not only those under a
`c = b[5].xpath("td[@class='spec-index-column']")` # c: the option-name (category name) cell of row b[5]
`testa = b[5].xpath("td[@class='spec-column first']")` # testa: the value cell of the same row
`d = testa.xpath("span[@itemprop='mpn']")[0]` # d: the value for that category (here the span with itemprop "mpn")
c gets the key
d gets the value
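The key/value steps above can be repeated for every row to build a dictionary of specs. The following is a minimal sketch, not the crawler itself: it swaps Scrapy for the standard library's `xml.etree.ElementTree` so it runs without a network connection, and `SAMPLE`, `parse_specs`, and the cell contents are made-up stand-ins that assume the page uses the same class names as in the commands above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the spec page markup (values are made up).
SAMPLE = """
<html><body>
<table class="spec-detail">
  <tr>
    <td class="spec-index-column">Brand</td>
    <td class="spec-column first"><span itemprop="brand">ExampleBrand</span></td>
  </tr>
  <tr>
    <td class="spec-index-column">Part number</td>
    <td class="spec-column first"><span itemprop="mpn">EXAMPLE-123</span></td>
  </tr>
</table>
</body></html>
"""

def parse_specs(markup):
    """Return {key: value} for each row of the "spec-detail" table."""
    root = ET.fromstring(markup.strip())
    specs = {}
    for row in root.findall(".//table[@class='spec-detail']/tr"):
        # Paths without a leading slash are relative to the current row.
        key = row.find("td[@class='spec-index-column']")
        value = row.find("td[@class='spec-column first']")
        if key is None or value is None:
            continue  # skip rows that are not key/value pairs
        specs["".join(key.itertext()).strip()] = "".join(value.itertext()).strip()
    return specs

specs = parse_specs(SAMPLE)
print(specs)
```

In a Scrapy shell the same loop would presumably run over `response.xpath(...)` with relative `td[...]` paths per row; that variant is untested here.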
OLD
####################
Get xpath in Chromium
`response.xpath('//*[@id="tab:specificaties"]/table/tr[1]/td[2]')` # tr[n] selects the n-th row (1-based); td[1] holds the key, td[2] holds the value
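As a rough illustration of the positional form, the sketch below indexes rows and cells by number. It again uses the standard library's `xml.etree.ElementTree` rather than Scrapy, and `TABLE`, `cell`, and the cell contents are hypothetical; only the `tab:specificaties` id and the `table/tr[n]/td[m]` shape come from the XPath above.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-row table; the real page has many more rows.
TABLE = ET.fromstring(
    "<div id='tab:specificaties'><table>"
    "<tr><td>Brand</td><td>ExampleBrand</td></tr>"
    "<tr><td>Memory</td><td>4 GB</td></tr>"
    "</table></div>"
)

def cell(row_number, column_number):
    # XPath positions are 1-based: tr[1] is the first row, td[2] the second cell.
    return TABLE.find(f"table/tr[{row_number}]/td[{column_number}]").text

first_key = cell(1, 1)
first_value = cell(1, 2)
print(first_key, first_value)
```

Positional paths are short to write but break as soon as a row is added or reordered, which may be why the class-based selectors replaced this approach.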