@karlcow

karlcow/error.md

Created Feb 10, 2021
lxml.etree.XMLSyntaxError: Char 0x0 out of allowed range

With lxml 4.5.0

❯ python
Python 3.9.1 (default, Feb  5 2021, 17:04:50) 
[Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> from io import StringIO
>>> etree.parse(StringIO('<h2>👺</h2>'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "src/lxml/etree.pyx", line 3519, in lxml.etree.parse
  File "src/lxml/parser.pxi", line 1856, in lxml.etree._parseDocument
  File "src/lxml/parser.pxi", line 1876, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1757, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1068, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 711, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 640, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Char 0x0 out of allowed range, line 1, column 2
>>> 

With lxml 4.6.1

❯ python
Python 3.9.1 (default, Feb  5 2021, 17:04:50) 
[Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> from io import StringIO
>>> etree.parse(StringIO('<h2>👺</h2>'))
<lxml.etree._ElementTree object at 0x10ea66f40>
>>>

This is bug 1902364
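
If upgrading lxml is not an option, one possible workaround is to hand the parser UTF-8 bytes instead of a `str`, so that the document is decoded from bytes rather than from a Python unicode string. This is only a sketch: it assumes the failure above is limited to the `str` input path, which this gist does not verify, and `parse_markup` is a hypothetical helper name, not part of lxml.

```python
from io import BytesIO

from lxml import etree


def parse_markup(markup: str):
    """Parse markup by encoding it to UTF-8 bytes first.

    Hypothetical helper: it sidesteps passing a str to the parser,
    on the assumption that only the str input path is affected.
    """
    return etree.parse(BytesIO(markup.encode("utf-8")))


tree = parse_markup("<h2>👺</h2>")
root = tree.getroot()
print(root.tag, root.text)
```

On a fixed version such as 4.6.1 this should behave the same as parsing the `StringIO` directly, so the helper can stay in place after an upgrade.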
