da2x/double-escape.py

Created July 13, 2017 12:28


Extracting the text node of an XML element containing a double-escaped entity.

double-escape.py

	>>> import xml.etree.ElementTree as ETree
	>>> ETree.fromstring("<test>&amp;rdquo;</test>").text
	'&rdquo;'

da2x commented Jul 13, 2017

>>> import html
>>> html.unescape("&rdquo;")
'”'

Most feed reader implementations won’t know to do that extra unescaping step at the end. Named HTML entities are quite rare in XML, especially double-escaped ones.
