@yejianye
Created June 30, 2013 16:58
lxml cssselect example
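import lxml.html  # a bare `import lxml` does not load the lxml.html submodule
filename = 'page.html'  # hypothetical path; point this at your own HTML file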
root = lxml.html.parse(filename).getroot()
# if non-ASCII text comes out garbled, force the input encoding (UTF-8 assumed here)
parser = lxml.html.HTMLParser(encoding='utf-8')
root = lxml.html.parse(filename, parser=parser).getroot()
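# get matched elements using a CSS selector (requires the separate cssselect package)
els = root.cssselect('div.shop-info div.info-name h2 a')
# cssselect() returns a list; els[0] raises IndexError if nothing matched
el_text = els[0].text
el_href = els[0].attrib['href']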