@gelim
Created October 21, 2010 17:26
lxml.etree & StringIO strangeness
import StringIO, lxml.etree

xmldata = "<root>data</root>"

# Part 1: the buffer is initialised with the data, so its position starts at 0.
buffer = StringIO.StringIO(xmldata)
for event, element in lxml.etree.iterparse(buffer):
    print "%s %s %s" % (event, element.tag, element.text)
buffer.close()

# Part 2: the buffer is filled via .write(), which leaves the position at the end.
buffer = StringIO.StringIO()
buffer.write(xmldata)
buffer.seek(0) # YAY! thanks nosklo (#python@freenode)
for event, element in lxml.etree.iterparse(buffer):
    print "%s %s %s" % (event, element.tag, element.text)
buffer.close()

gelim commented Oct 21, 2010

In the second part, filling the StringIO via .write() (without the buffer.seek(0) call) makes lxml.etree.iterparse throw an exception. Why?

Traceback (most recent call last):
  File "lxml-1.py", line 10, in <module>
    for event, element in lxml.etree.iterparse(buffer):
  File "iterparse.pxi", line 515, in lxml.etree.iterparse.__next__ (src/lxml/lxml.etree.c:87076)
  File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64521)
lxml.etree.XMLSyntaxError: Extra content at the end of the document, line 1, column 1

nosklo commented Oct 21, 2010

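Your second snippet doesn't seek back to the beginning, so iterparse starts reading at the end of the buffer and finds nothing to parse. Call .seek(0) after writing.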
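
The position behaviour described above can be checked directly. The following is a minimal sketch, not the gist's original code: it assumes Python 3, uses `io.BytesIO` in place of the Python 2 `StringIO.StringIO`, and uses the standard library's `xml.etree.ElementTree.iterparse` as a stand-in for `lxml.etree.iterparse`. The variable names are illustrative only.

```python
import io
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree (assumption)

xmldata = b"<root>data</root>"

# A buffer initialised with data starts with its position at 0.
buf = io.BytesIO(xmldata)
pos_initialised = buf.tell()
buf.close()

# A buffer filled via write() is left with its position at the end,
# so a reader sees no data until the position is rewound.
buf = io.BytesIO()
buf.write(xmldata)
pos_after_write = buf.tell()
leftover = buf.read()  # nothing left to read from the end

# Rewind before handing the buffer to the parser.
buf.seek(0)
events = [(event, elem.tag, elem.text) for event, elem in ET.iterparse(buf)]
buf.close()

print(pos_initialised, pos_after_write, leftover, events)
```

Checking `tell()` before parsing is a quick way to confirm whether a buffer needs rewinding.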
