Extracting the text node of an XML element containing a double-escaped entity.
>>> import xml.etree.ElementTree as ETree
>>> ETree.fromstring("<test>&amp;rdquo;</test>").text
'&rdquo;'

da2x commented Jul 13, 2017

>>> import html
>>> html.unescape("&rdquo;")
'”'

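The XML parser only undoes one level of escaping, so the text node still contains the literal string `&rdquo;`, and it takes a second pass with `html.unescape` to get the actual character. Most feed readers won’t perform that extra step. Named HTML entities are quite rare in XML, especially double-escaped ones.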
