@tofi86
Last active June 28, 2018 13:27
In XSLT 2.0, get a string that contains every character used in an XML input document exactly once (no duplicates)
<?xml version="1.0" encoding="UTF-8"?>
<root>
<element>This is some sample text string with <bold>inline markup</bold> &amp; some special characters.</element>
<element>Also, named (&lt;), decimal (&#62;) and hexadecimal (&#x00A2;) entities should be respected.</element>
</root>
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:pa="http://www.pagina-online.de/" version="2.0">
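<!-- Returns a string that holds each character of $textString exactly once. -->
<!-- Note: the order of items returned by distinct-values() is implementation-dependent. -->
<xsl:function name="pa:listAllCharacters">
<!-- $textString: a sequence of strings or text nodes; the items are concatenated first -->
<xsl:param name="textString"/>
<xsl:value-of select="
codepoints-to-string(
distinct-values(
string-to-codepoints(
string-join($textString, '')
)
)
)"/>
</xsl:function>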
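<xsl:template match="/">
<!-- //text() selects every text node, including whitespace-only ones such as line breaks between elements -->
<xsl:value-of select="pa:listAllCharacters(//text())"/>
</xsl:template>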
</xsl:stylesheet>
# Escaped:
This omeapltxrngwku&amp;c.A,d(&lt;)&gt;¢b
# Not escaped:
This omeapltxrngwku&c.A,d(<)>¢b
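The gist does not say how the two output variants were produced. One plausible explanation is the serialization method: the default XML output method escapes markup characters such as `&` and `<`, while the text output method writes them literally. The following is a minimal sketch under that assumption; the `xsl:output` declaration is not part of the original stylesheet, and the rest is the same function with the expression on a single line.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:pa="http://www.pagina-online.de/" version="2.0">
<!-- Assumption: serializing as plain text is what yields the "Not escaped" variant -->
<xsl:output method="text" encoding="UTF-8"/>
<!-- Same function as in the original stylesheet -->
<xsl:function name="pa:listAllCharacters">
<xsl:param name="textString"/>
<xsl:value-of select="codepoints-to-string(distinct-values(string-to-codepoints(string-join($textString, ''))))"/>
</xsl:function>
<xsl:template match="/">
<xsl:value-of select="pa:listAllCharacters(//text())"/>
</xsl:template>
</xsl:stylesheet>
```

Removing the `xsl:output` declaration, or setting `method="xml"`, should give the escaped form instead, since the serializer then has to escape `&` and `<`.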