@arantius
Created October 16, 2011 01:50
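Demonstration of why .iterdescendants() is wrong for Stack Overflow question 6123351: it yields nested elements a second time, so their markup is duplicated in the joined output, while .iterchildren() serializes each subtree once.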
>>> import lxml.html
>>> h = lxml.html.fromstring('<html><body><p>one <b>two</b></p><P>three <b>four</b></p></body></html>')
>>> lxml.html.tostring(h)
'<html><body><p>one <b>two</b></p><p>three <b>four</b></p></body></html>'
>>> ''.join([lxml.html.tostring(c) for c in h.body.iterdescendants()])
'<p>one <b>two</b></p><b>two</b><p>three <b>four</b></p><b>four</b>'
>>> ''.join([lxml.html.tostring(c) for c in h.body.iterchildren()])
'<p>one <b>two</b></p><p>three <b>four</b></p>'
>>>