Extracting the text node of an XML element containing a double-escaped entity.
>>> import xml.etree.ElementTree as ETree
>>> ETree.fromstring("<test>&amp;rdquo;</test>").text
'&rdquo;'

da2x commented Jul 13, 2017

>>> import html
>>> html.unescape("&rdquo;")
'”'

Most feed reader implementations won’t know to apply that extra `html.unescape` step after parsing the XML. Named HTML entities are quite rare in XML, especially double-escaped ones.
