@karlcow
Created February 10, 2021 13:50
lxml.etree.XMLSyntaxError: Char 0x0 out of allowed range

With lxml 4.5.0

❯ python
Python 3.9.1 (default, Feb  5 2021, 17:04:50) 
[Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> from io import StringIO
>>> etree.parse(StringIO('<h2>👺</h2>'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "src/lxml/etree.pyx", line 3519, in lxml.etree.parse
  File "src/lxml/parser.pxi", line 1856, in lxml.etree._parseDocument
  File "src/lxml/parser.pxi", line 1876, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1757, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1068, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 711, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 640, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Char 0x0 out of allowed range, line 1, column 2
>>> 

With lxml 4.6.1

❯ python                         
Python 3.9.1 (default, Feb  5 2021, 17:04:50) 
[Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> from io import StringIO
>>> etree.parse(StringIO('<h2>👺</h2>'))
<lxml.etree._ElementTree object at 0x10ea66f40>
>>>

This is bug 1902364
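
A possible workaround on an affected install, offered as a sketch rather than a confirmed fix: the traceback above passes through `_parseUnicodeDoc`, so handing lxml UTF-8 bytes instead of a `str` may avoid that code path. The helper name `parse_as_bytes` is hypothetical, and whether this sidesteps the bug on a given setup is an assumption.

```python
from io import BytesIO


def parse_as_bytes(xml_text):
    """Parse an XML string by first encoding it to UTF-8 bytes."""
    # Imported here so the helper can be defined even if lxml is missing.
    from lxml import etree

    # Bytes input is handled by lxml's byte parser rather than the
    # unicode-string parser seen in the traceback (assumption: that
    # unicode path is where the error originates).
    return etree.parse(BytesIO(xml_text.encode("utf-8")))
```

On a working install, `parse_as_bytes('<h2>👺</h2>').getroot().text` should give back the emoji.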

@reagle

reagle commented Aug 5, 2021

Thank you for that. My xmllint and xsltproc on the command line are the same and work correctly. I suspect the problem is within Python:

libxml used/compiled         : (2, 9, 4)
libxslt used/compiled        : (1, 1, 29)
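
For reference, numbers like these can be read from lxml itself through the version constants that `lxml.etree` exposes. A small sketch; the function names `lxml_versions` and `format_versions` are hypothetical, and `lxml_versions` returns `None` when lxml cannot be imported.

```python
def lxml_versions():
    """Return the library versions lxml reports, or None if lxml is absent."""
    try:
        from lxml import etree
    except ImportError:
        return None
    return {
        "lxml": etree.LXML_VERSION,
        # "used" is the library loaded at runtime; "compiled" is the one
        # lxml was built against. A mismatch hints at mixed installs.
        "libxml used": etree.LIBXML_VERSION,
        "libxml compiled": etree.LIBXML_COMPILED_VERSION,
        "libxslt used": etree.LIBXSLT_VERSION,
        "libxslt compiled": etree.LIBXSLT_COMPILED_VERSION,
    }


def format_versions(versions):
    """Format the dict as aligned 'name : value' lines."""
    return "\n".join(f"{name:<20}: {value}" for name, value in versions.items())
```

Printing `format_versions(lxml_versions())` under each Python on the machine shows which libxml2 each copy of lxml actually loads.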

I'm not sure how to untangle that yet, given my use of Homebrew and pip3. Perhaps it's related to this:

╰─➤  brin libxml2
libxml2: stable 2.9.12 (bottled), HEAD [keg-only]
GNOME XML library
http://xmlsoft.org/
/opt/homebrew/Cellar/libxml2/2.9.12 (282 files, 11.3MB)
  Built from source on 2021-07-06 at 11:42:44
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/libxml2.rb
License: MIT
==> Dependencies
Build: python@3.9 ✔
Required: readline ✔
==> Options
--HEAD
	Install HEAD version
==> Caveats
libxml2 is keg-only, which means it was not symlinked into /opt/homebrew,
because macOS already provides this software and installing another version in
parallel can cause all kinds of trouble.

If you need to have libxml2 first in your PATH, run:
  echo 'export PATH="/opt/homebrew/opt/libxml2/bin:$PATH"' >> ~/.zshrc

For compilers to find libxml2 you may need to set:
  export LDFLAGS="-L/opt/homebrew/opt/libxml2/lib"
  export CPPFLAGS="-I/opt/homebrew/opt/libxml2/include"

For pkg-config to find libxml2 you may need to set:
  export PKG_CONFIG_PATH="/opt/homebrew/opt/libxml2/lib/pkgconfig"

@karlcow
Author

karlcow commented Aug 5, 2021

❯ brew info libxml2
libxml2: stable 2.9.12 (bottled), HEAD [keg-only]
GNOME XML library
http://xmlsoft.org/
Not installed
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/libxml2.rb
License: MIT
==> Dependencies
Build: python@3.9 ✘
Required: readline ✔
==> Options
--HEAD
	Install HEAD version
==> Caveats
libxml2 is keg-only, which means it was not symlinked into /usr/local,
because macOS already provides this software and installing another version in
parallel can cause all kinds of trouble.

==> Analytics
install: 51,320 (30 days), 175,981 (90 days), 637,723 (365 days)
install-on-request: 30,204 (30 days), 99,449 (90 days), 373,165 (365 days)
build-error: 0 (30 days)

Ah, indeed, there is a difference here.

You: Build: python@3.9 ✔
Me: Build: python@3.9 ✘

@karlcow
Author

karlcow commented Aug 5, 2021

❯ which python3
/usr/local/bin/python3

❯ which -a python3
/usr/local/bin/python3
/Library/Frameworks/Python.framework/Versions/3.8/bin/python3
/usr/local/bin/python3
/usr/bin/python3

A bit of a mess too. :)

❯ /usr/local/bin/python3 -V
Python 3.9.1
❯ /Library/Frameworks/Python.framework/Versions/3.8/bin/python3 -V
Python 3.8.3
❯ /usr/bin/python3 -V
Python 3.8.2
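
With several interpreters on the PATH, it can help to ask the running Python which binary it is and where it would import lxml from. A minimal sketch using only the standard library; `describe_environment` is a hypothetical helper name.

```python
import importlib.util
import sys


def describe_environment(package="lxml"):
    """Report the running interpreter and where `package` would load from."""
    spec = importlib.util.find_spec(package)
    return {
        "executable": sys.executable,
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        # None means this interpreter cannot see the package at all.
        "package_origin": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running the same file with each of the three interpreters listed above would show whether they share one lxml install or each have their own.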
