@ino46
Created April 7, 2011 00:52
lxml.etree & XPath
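#!/usr/bin/env python3
from io import StringIO

from lxml import etree

str_html = """<p>
<em>aa</em>bb<em>cc</em>
</p>"""
root = etree.parse(StringIO(str_html), etree.HTMLParser())
em1 = root.xpath('//em[1]')  # every <em> that is the first <em> child of its parent
# encoding='unicode' makes tostring() return str rather than bytes on Python 3
print(em1[0].text)  # aa (text inside the element only)
print(etree.tostring(em1[0], method='text', encoding='unicode'))  # aabb (tail text included)
print(etree.tostring(em1[0], encoding='unicode'))  # <em>aa</em>bb
print(etree.tostring(em1[0], method='text', with_tail=False, encoding='unicode'))  # aa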
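The "bb" that shows up in the serialized output is the element's tail: the text between its closing tag and the next sibling. A minimal sketch of reading that text directly, assuming Python 3 and the same fragment written on one line; the variable names are illustrative and not part of the original snippet:

```python
from lxml import etree

# Same markup as the snippet, without the newlines inside <p>.
root = etree.fromstring("<p><em>aa</em>bb<em>cc</em></p>", etree.HTMLParser())
first_em = root.xpath("//em[1]")[0]

text = first_em.text                       # text inside <em>...</em>
tail = first_em.tail                       # text after </em>, up to the next sibling
xpath_string = first_em.xpath("string()")  # XPath string value; tail is not part of it
paragraph_text = "".join(first_em.getparent().itertext())  # all text under <p>

print(text, tail, xpath_string, paragraph_text)
```

Reading `.tail` explicitly, or passing `with_tail=False` to `tostring()`, avoids picking up trailing text by accident when serializing a single element.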