@whosaysni
Created July 4, 2013 05:46
Forcing ElementTree to serialize extra namespaces
# coding: utf-8
"""
To parse namespaced XML with ElementTree and keep the original prefixes on output, register each prefix with register_namespace. The examples below target Python 2.7.
>>> from xml.etree.ElementTree import register_namespace, fromstring, tostring
>>> base_ns = {
... 'http://example.com/article': 'article',
... 'http://example.com/report': 'report',
... 'http://example.com/book': 'book',
... }
>>> for uri, prefix in base_ns.items():
...     register_namespace(prefix, uri)
>>>
Now you can parse XML and serialize it back to a string with the registered prefixes intact.
>>> src = '<docs xmlns:article=\"http://example.com/article\" xmlns:report=\"http://example.com/report\" xmlns:book=\"http://example.com/book\"><article><article:section>foo</article:section></article><report><report:chapter>bar</report:chapter></report></docs>'
>>> elem = fromstring(src)
>>> tostring(elem)
'<docs xmlns:article="http://example.com/article" xmlns:report="http://example.com/report">...</docs>'
>>>
There is a caveat, however: even when the source XML declares every namespace, ElementTree drops any declaration whose namespace is not used by a node in the tree. Above, xmlns:book is omitted because no node carries the "book:" prefix.
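To force unused xmlns declarations to be serialized, extend the namespace map yourself and call _serialize_xml directly. Both _namespaces and _serialize_xml are private helpers, so their signatures may differ between Python versions.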
>>> from xml.etree.ElementTree import _serialize_xml, _namespaces
>>> qnames, namespaces = _namespaces(elem, 'utf-8') # ... this contains the "used" namespaces.
>>> sorted(namespaces.items())  # sorted, so the doctest does not depend on dict ordering
[('http://example.com/article', 'article'), ('http://example.com/report', 'report')]
>>> namespaces.update(base_ns)
>>> from StringIO import StringIO
>>> buf = StringIO()
>>> _serialize_xml(buf.write, elem, 'utf-8', qnames, namespaces)
>>> buf.getvalue()
'<docs xmlns:article="http://example.com/article" xmlns:book="http://example.com/book" xmlns:report="http://example.com/report">...</docs>'
"""
if __name__ == '__main__':
    from doctest import testmod, ELLIPSIS
    testmod(optionflags=ELLIPSIS)
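The doctest above relies on the Python 2.7 signatures of the private helpers. As a rough sketch of the same trick on Python 3, assuming CPython's current internals (where _namespaces takes no encoding argument and _serialize_xml takes a short_empty_elements argument instead), something like the following may work. The function name tostring_with_namespaces is made up for this example, and because both helpers are undocumented they can change without notice.

```python
import io
import xml.etree.ElementTree as ET

BASE_NS = {
    "http://example.com/article": "article",
    "http://example.com/report": "report",
    "http://example.com/book": "book",
}


def tostring_with_namespaces(elem, extra_ns):
    """Serialize elem, declaring every URI in extra_ns on the root element.

    Hypothetical helper: it depends on private ElementTree functions
    (assumed CPython 3 signatures), not on a documented API.
    """
    # Collect the qualified names and the namespaces actually used in the tree.
    qnames, namespaces = ET._namespaces(elem)
    # Add the declarations that ElementTree would otherwise leave out.
    namespaces.update(extra_ns)
    buf = io.StringIO()
    ET._serialize_xml(buf.write, elem, qnames, namespaces,
                      short_empty_elements=True)
    return buf.getvalue()


# Register the prefixes so they are reused instead of ns0, ns1, ...
for uri, prefix in BASE_NS.items():
    ET.register_namespace(prefix, uri)

SRC = (
    '<docs xmlns:article="http://example.com/article" '
    'xmlns:report="http://example.com/report" '
    'xmlns:book="http://example.com/book">'
    '<article><article:section>foo</article:section></article>'
    '<report><report:chapter>bar</report:chapter></report>'
    '</docs>'
)

root = ET.fromstring(SRC)
plain = ET.tostring(root, encoding="unicode")       # xmlns:book is dropped
forced = tostring_with_namespaces(root, BASE_NS)    # xmlns:book is kept
print(plain)
print(forced)
```

Passing the full BASE_NS mapping rather than only the missing entry keeps the call simple: entries already present in the used-namespace map are overwritten with identical values.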