@kennypete
Last active January 8, 2024 21:05

XSLT examples using Saxon

The PowerShell script eg.ps1 runs the examples.

Common files are:

  1. eg.xml (the XML source); it contains data on Unicode code points (and a few two-code-point sequences) that have named character references (sometimes also known as entities).

  2. eg.xslt (the XSLT); it transforms the XML to a .csv file using some non-trivial transformations. It may not be the most efficient XSLT ever, but it demonstrates features beyond outputting plain element content.

  3. eg.dtd (the DTD); use it if you want to validate the XML, though validation is not needed for the purposes of the example. XML Copy Editor is a good free option for checking well-formedness and validity while editing.
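
To make the shape of the data concrete, here is a minimal Python sketch (standard library only) that flattens two records in the eg.xml layout into CSV. It is not the author's eg.xslt, whose logic is not reproduced here; the column choice, the `&name;` formatting, and the function name `to_csv` are assumptions made for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Two records copied from eg.xml (DOCTYPE omitted so no DTD is needed).
SAMPLE = """<unicode unicode="15.1">
<character id="U00021" dec="33" mode="text" type="punctuation">
<unicodedata category="Po"/>
<entity id="excl" set="8879-isonum"/>
<entity id="excl" set="9573-2003-isonum"/>
<description unicode="1.1">EXCLAMATION MARK</description>
</character>
<character id="U0003C-020D2" dec="60-8402" type="other" mode="unknown">
<unicodedata/>
<entity id="nvlt" set="9573-1991-isoamsn"/>
<entity id="nvlt" set="9573-2003-isoamsn"/>
<description unicode="combination">LESS-THAN SIGN with vertical line</description>
</character>
</unicode>"""


def to_csv(xml_text: str) -> str:
    """Write one row per character: id, decimal value, distinct entity names, description."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "dec", "entities", "description"])
    for ch in root.findall("character"):
        # dict.fromkeys keeps first-seen order while dropping names repeated across sets.
        names = list(dict.fromkeys(e.get("id") for e in ch.findall("entity")))
        writer.writerow([
            ch.get("id"),
            ch.get("dec"),
            " ".join(f"&{n};" for n in names),
            ch.findtext("description", default=""),
        ])
    return out.getvalue()


print(to_csv(SAMPLE))
```

The same entity name usually appears once per entity set, which is why the sketch de-duplicates names before writing them.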

Using Saxon-HE

  • I used Java™ SE Runtime Environment (build 12.0.2+10) for this, which is fairly old, and it worked fine.

  • The folder containing java.exe needs to be on your PATH; otherwise, change the java call in eg.ps1 to the full path to your Java executable.

  • This uses Saxon-HE to transform, and it:

    • Needs to be unzipped to a location of your choice; eg.ps1 refers to the .jar by its full path, so adjust that path to match. It does not need “installation” as such.

    • For reference, Saxon-HE is the home edition, which is free.

  • The output is eg_saxon-he.csv.

Using xslt3

  • I used Node.js v21.1.0 and it worked fine.

  • The folder containing node.exe needs to be on your PATH.

  • This uses xslt3, the free command-line interface to SaxonJS (Saxon for JavaScript/Node.js). NB: There is also a companion saxon-js package, which is the library that runs a compiled SEF file (so is much faster). SaxonJS 1 needed the paid Saxon-EE to produce the SEF file; with SaxonJS 2, xslt3 can export one itself using its -export option.

  • Install xslt3 with npm install xslt3 -g and adjust the path in eg.ps1 as required. Running node against $env:USERPROFILE\AppData\Roaming\npm\node_modules\xslt3\xslt3.js (as eg.ps1 does) seems better than running $env:USERPROFILE\AppData\Roaming\npm\xslt3 (that is, the .ps1 or .cmd wrapper), because the latter runs in a new window that disappears after completion.

  • The output is eg_xslt3.csv.
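
Both runs depend on java and node being found on the PATH. As a quick pre-flight check, the following minimal Python sketch (standard library only, not part of the gist; the function name `missing_tools` is hypothetical) lists any of the named executables that cannot be found:

```python
import shutil


def missing_tools(names):
    """Return the executables from `names` that are not found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(["java", "node"])
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("java and node were both found.")
```

If either tool is reported as missing, add its folder to PATH or use the full path in eg.ps1.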

Observations regarding performance

The JavaScript xslt3 version executes approximately twice as fast as the Saxon-HE version. Stylesheet compilation time showed a similar, though less pronounced, difference. The only area in which the Saxon-HE version performed better was memory: it used only about 80% of what the xslt3 transformation used.
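
To repeat a comparison like this on your own machine, one option is to time each command line as a whole. The sketch below is a minimal Python example (standard library only, not part of the gist; `time_command` is a hypothetical helper). It measures wall-clock time for the entire process, including JVM or Node.js startup, so its figures will not match the timings printed by the -t option.

```python
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Return the best wall-clock time in seconds over `runs` executions of argv."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace it with the
    # java or node argument list from eg.ps1 to time a real transformation.
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"{elapsed:.3f} s")
```

Taking the best of several runs reduces the effect of disk caching and other background noise on a single measurement.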

eg.dtd

<!ELEMENT unicode (character+)>
<!ATTLIST unicode
unicode CDATA #REQUIRED>
<!ELEMENT character (unicodedata?, entity+, description)>
<!ATTLIST character
dec CDATA #REQUIRED
id CDATA #REQUIRED
mode CDATA #IMPLIED
type CDATA #IMPLIED>
<!ELEMENT unicodedata EMPTY>
<!ATTLIST unicodedata
category CDATA #IMPLIED>
<!ELEMENT entity (#PCDATA)>
<!ATTLIST entity
id CDATA #REQUIRED
optional-semi CDATA #IMPLIED
set CDATA #REQUIRED>
<!ELEMENT description (#PCDATA)>
<!ATTLIST description
unicode CDATA #IMPLIED>
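
eg.ps1

# Runs the same transformation with both processors.
# -t prints version and timing information for each run.
echo " "
echo "----------------------------"
echo "using java saxon-he-12.4.jar"
echo "----------------------------"
# Adjust the -cp path to wherever saxon-he-12.4.jar was unzipped.
java -cp "D:\Program Files (Portable)\SaxonHE12-4J\saxon-he-12.4.jar" net.sf.saxon.Transform -s:eg.xml -xsl:eg.xslt -o:eg_saxon-he.csv -t
echo " "
echo "-------------------"
echo "using node xslt3.js"
echo "-------------------"
# Adjust this path if npm's global folder is elsewhere; it is quoted in case the profile path contains spaces.
node "$env:USERPROFILE\AppData\Roaming\npm\node_modules\xslt3\xslt3.js" -xsl:eg.xslt -s:eg.xml -o:eg_xslt3.csv -t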

eg.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE unicode SYSTEM "./eg.dtd">
<unicode unicode="15.1">
<character id="U00009" dec="9" mode="text" type="punctuation">
<unicodedata category="Cc"/>
<entity id="Tab" set="mmlextra"/>
<description unicode="1.1">CHARACTER TABULATION</description>
</character>
<character id="U0000A" dec="10" mode="text" type="other">
<unicodedata category="Cc"/>
<entity id="NewLine" set="mmlextra"/>
<description unicode="1.1">LINE FEED (LF)</description>
</character>
<character id="U00021" dec="33" mode="text" type="punctuation">
<unicodedata category="Po"/>
<entity id="excl" set="8879-isonum"/>
<entity id="excl" set="9573-2003-isonum"/>
<description unicode="1.1">EXCLAMATION MARK</description>
</character>
<character id="U00022" dec="34" mode="text" type="other">
<unicodedata category="Po"/>
<entity id="quot" set="predefined" optional-semi="yes"/>
<entity id="quot" set="xhtml1-special"/>
<entity id="quot" set="8879-isonum"/>
<entity id="quot" set="9573-2003-isonum"/>
<entity id="QUOT" set="html5-uppercase" optional-semi="yes"/>
<description unicode="1.1">QUOTATION MARK</description>
</character>
<character id="U00023" dec="35" mode="text" type="normal">
<unicodedata category="Po"/>
<entity id="num" set="8879-isonum"/>
<entity id="num" set="9573-2003-isonum"/>
<description unicode="1.1">NUMBER SIGN</description>
</character>
<character id="U00024" dec="36" mode="mixed" type="normal">
<unicodedata category="Sc"/>
<entity id="dollar" set="8879-isonum"/>
<entity id="dollar" set="9573-2003-isonum"/>
<description unicode="1.1">DOLLAR SIGN</description>
</character>
<character id="U00025" dec="37" mode="text" type="normal">
<unicodedata category="Po"/>
<entity id="percnt" set="8879-isonum"/>
<entity id="percnt" set="9573-2003-isonum"/>
<description unicode="1.1">PERCENT SIGN</description>
</character>
<character id="U00026" dec="38" mode="text" type="other">
<unicodedata category="Po"/>
<entity id="amp" set="predefined" optional-semi="yes"/>
<entity id="AMP" set="html5-uppercase" optional-semi="yes"/>
<entity id="amp" set="8879-isonum"/>
<entity id="amp" set="9573-2003-isonum"/>
<description unicode="1.1">AMPERSAND</description>
</character>
<character id="U00027" dec="39" mode="text" type="other">
<unicodedata category="Po"/>
<entity id="apos" set="predefined"/>
<entity id="apos" set="8879-isonum"/>
<entity id="apos" set="9573-2003-isonum"/>
<description unicode="1.1">APOSTROPHE</description>
</character>
<character id="U00028" dec="40" mode="text" type="opening">
<unicodedata category="Ps"/>
<entity id="lpar" set="8879-isonum"/>
<entity id="lpar" set="9573-2003-isonum"/>
<description unicode="1.1">LEFT PARENTHESIS</description>
</character>
<character id="U00029" dec="41" mode="text" type="closing">
<unicodedata category="Pe"/>
<entity id="rpar" set="8879-isonum"/>
<entity id="rpar" set="9573-2003-isonum"/>
<description unicode="1.1">RIGHT PARENTHESIS</description>
</character>
<character id="U0002A" dec="42" mode="math" type="other">
<unicodedata category="Po"/>
<entity id="ast" set="8879-isonum"/>
<entity id="ast" set="9573-2003-isonum"/>
<entity id="midast" set="9573-1991-isoamsb"/>
<entity id="midast" set="9573-2003-isoamsb"/>
<description unicode="1.1">ASTERISK</description>
</character>
<character id="U0002B" dec="43" mode="math" type="binaryop">
<unicodedata category="Sm"/>
<entity id="plus" set="8879-isonum"/>
<entity id="plus" set="9573-2003-isonum"/>
<description unicode="1.1">PLUS SIGN</description>
</character>
<character id="U0002C" dec="44" mode="text" type="punctuation">
<unicodedata category="Po"/>
<entity id="comma" set="8879-isonum"/>
<entity id="comma" set="9573-2003-isonum"/>
<description unicode="1.1">COMMA</description>
</character>
<character id="U0002E" dec="46" mode="text" type="punctuation">
<unicodedata category="Po"/>
<entity id="period" set="8879-isonum"/>
<entity id="period" set="9573-2003-isonum"/>
<description unicode="1.1">FULL STOP</description>
</character>
<character id="U0002F" dec="47" mode="text" type="other">
<unicodedata category="Po"/>
<entity id="sol" set="8879-isonum"/>
<entity id="sol" set="9573-2003-isonum"/>
<description unicode="1.1">SOLIDUS</description>
</character>
<character id="U0003A" dec="58" mode="text" type="punctuation">
<unicodedata category="Po"/>
<entity id="colon" set="8879-isonum"/>
<entity id="colon" set="9573-2003-isonum"/>
<description unicode="1.1">COLON</description>
</character>
<character id="U0003B" dec="59" mode="math" type="punctuation">
<unicodedata category="Po"/>
<entity id="semi" set="8879-isonum"/>
<entity id="semi" set="9573-2003-isonum"/>
<description unicode="1.1">SEMICOLON</description>
</character>
<character id="U0003C" dec="60" mode="math" type="relation">
<unicodedata category="Sm"/>
<entity id="lt" set="predefined" optional-semi="yes"/>
<entity id="lt" set="xhtml1-special"/>
<entity id="lt" set="8879-isonum"/>
<entity id="lt" set="9573-2003-isonum"/>
<entity id="LT" set="html5-uppercase" optional-semi="yes"/>
<description unicode="1.1">LESS-THAN SIGN</description>
</character>
<character id="U0003C-020D2" dec="60-8402" type="other" mode="unknown">
<unicodedata/>
<entity id="nvlt" set="9573-1991-isoamsn"/>
<entity id="nvlt" set="9573-2003-isoamsn"/>
<description unicode="combination">LESS-THAN SIGN with vertical line</description>
</character>
<character id="U0003D" dec="61" mode="text" type="relation">
<unicodedata category="Sm"/>
<entity id="equals" set="8879-isonum"/>
<entity id="equals" set="9573-2003-isonum"/>
<description unicode="1.1">EQUALS SIGN</description>
</character>
<character id="U0003D-020E5" dec="61-8421" mode="math" type="relation">
<unicodedata/>
<entity id="bne" set="STIX"/>
<entity id="bne" set="9573-1991-isotech"/>
<entity id="bne" set="9573-2003-isotech"/>
<description unicode="combination">EQUALS SIGN with reverse slash</description>
</character>
<character id="U0003E" dec="62" mode="math" type="relation">
<unicodedata category="Sm"/>
<entity id="gt" set="predefined" optional-semi="yes"/>
<entity id="gt" set="xhtml1-special"/>
<entity id="gt" set="8879-isonum"/>
<entity id="gt" set="9573-2003-isonum"/>
<entity id="GT" set="html5-uppercase" optional-semi="yes"/>
<description unicode="1.1">GREATER-THAN SIGN</description>
</character>
<character id="U0003E-020D2" dec="62-8402" type="other" mode="unknown">
<unicodedata/>
<entity id="nvgt" set="9573-1991-isoamsn"/>
<entity id="nvgt" set="9573-2003-isoamsn"/>
<description unicode="combination">GREATER-THAN SIGN with vertical line</description>
</character>
<character id="U0003F" dec="63" mode="text" type="punctuation">
<unicodedata category="Po"/>
<entity id="quest" set="8879-isonum"/>
<entity id="quest" set="9573-2003-isonum"/>
<description unicode="1.1">QUESTION MARK</description>
</character>
<character id="U00040" dec="64" mode="text" type="normal">
<unicodedata category="Po"/>
<entity id="commat" set="8879-isonum"/>
<entity id="commat" set="9573-2003-isonum"/>
<description unicode="1.1">COMMERCIAL AT</description>
</character>
<character id="U0005B" dec="91" mode="text" type="opening">
<unicodedata category="Ps"/>
<entity id="lsqb" set="8879-isonum"/>
<entity id="lsqb" set="9573-2003-isonum"/>
<entity id="lbrack" set="mmlalias"/>
<description unicode="1.1">LEFT SQUARE BRACKET</description>
</character>
<character id="U0005C" dec="92" mode="mixed" type="normal">
<unicodedata category="Po"/>
<entity id="bsol" set="8879-isonum"/>
<entity id="bsol" set="9573-2003-isonum"/>
<description unicode="1.1">REVERSE SOLIDUS</description>
</character>
<character id="U0005D" dec="93" mode="text" type="closing">
<unicodedata category="Pe"/>
<entity id="rsqb" set="8879-isonum"/>
<entity id="rsqb" set="9573-2003-isonum"/>
<entity id="rbrack" set="mmlalias"/>
<description unicode="1.1">RIGHT SQUARE BRACKET</description>
</character>
<character id="U0005E" dec="94" mode="text" type="other">
<unicodedata category="Sk"/>
<entity id="Hat" set="mmlextra"/>
<description unicode="1.1">CIRCUMFLEX ACCENT</description>
</character>
<character id="U0005F" dec="95" mode="math" type="other">
<unicodedata category="Pc"/>
<entity id="lowbar" set="8879-isonum"/>
<entity id="lowbar" set="9573-2003-isonum"/>
<entity id="UnderBar" set="mmlextra"/>
<description unicode="1.1">LOW LINE</description>
</character>
<character id="U00060" dec="96" mode="text" type="other">
<unicodedata category="Sk"/>
<entity id="grave" set="8879-isodia"/>
<entity id="grave" set="9573-2003-isodia"/>
<entity id="DiacriticalGrave" set="mmlalias"/>
<description unicode="1.1">GRAVE ACCENT</description>
</character>
<character id="U00066-0006A" dec="102-106" mode="text" type="other">
<entity id="fjlig" set="8879-isopub"/>
<entity id="fjlig" set="9573-2003-isopub"/>
<description unicode="1.1">fj ligature</description>
</character>
<character id="U0007B" dec="123" mode="math" type="opening">
<unicodedata category="Ps"/>
<entity id="lcub" set="8879-isonum"/>
<entity id="lcub" set="9573-2003-isonum"/>
<entity id="lbrace" set="mmlalias"/>
<description unicode="1.1">LEFT CURLY BRACKET</description>
</character>
<character id="U0007C" dec="124" mode="math" type="other">
<unicodedata category="Sm"/>
<entity id="verbar" set="8879-isonum"/>
<entity id="verbar" set="9573-2003-isonum"/>
<entity id="vert" set="mmlalias"/>
<entity id="VerticalLine" set="mmlextra"/>
<description unicode="1.1">VERTICAL LINE</description>
</character>
<character id="U0007D" dec="125" mode="math" type="closing">
<unicodedata category="Pe"/>
<entity id="rcub" set="8879-isonum"/>
<entity id="rcub" set="9573-2003-isonum"/>
<entity id="rbrace" set="mmlalias"/>
<description unicode="1.1">RIGHT CURLY BRACKET</description>
</character>
<character id="U000A0" dec="160" mode="math" type="other">
<unicodedata category="Zs"/>
<entity id="nbsp" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="nbsp" set="8879-isonum"/>
<entity id="nbsp" set="9573-2003-isonum"/>
<entity id="NonBreakingSpace" set="mmlalias"/>
<description unicode="1.1">NO-BREAK SPACE</description>
</character>
<character id="U000A1" dec="161" mode="text" type="punctuation">
<unicodedata category="Po"/>
<entity id="iexcl" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="iexcl" set="8879-isonum"/>
<entity id="iexcl" set="9573-2003-isonum"/>
<description unicode="1.1">INVERTED EXCLAMATION MARK</description>
</character>
<character id="U000A2" dec="162" mode="mixed" type="normal">
<unicodedata category="Sc"/>
<entity id="cent" set="xhtml1-lat1"/>
<entity id="cent" set="8879-isonum"/>
<entity id="cent" set="9573-2003-isonum"/>
<description unicode="1.1">CENT SIGN</description>
</character>
<character id="U000A3" dec="163" mode="mixed" type="normal">
<unicodedata category="Sc"/>
<entity id="pound" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="pound" set="8879-isonum"/>
<entity id="pound" set="9573-2003-isonum"/>
<description unicode="1.1">POUND SIGN</description>
</character>
<character id="U000A4" dec="164" mode="mixed" type="normal">
<unicodedata category="Sc"/>
<entity id="curren" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="curren" set="8879-isonum"/>
<entity id="curren" set="9573-2003-isonum"/>
<description unicode="1.1">CURRENCY SIGN</description>
</character>
<character id="U000A5" dec="165" mode="mixed" type="normal">
<unicodedata category="Sc"/>
<entity id="yen" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="yen" set="8879-isonum"/>
<entity id="yen" set="9573-2003-isonum"/>
<description unicode="1.1">YEN SIGN</description>
</character>
<character id="U000A6" dec="166" mode="text" type="normal">
<unicodedata category="So"/>
<entity id="brvbar" set="xhtml1-lat1"/>
<entity id="brvbar" set="8879-isonum"/>
<entity id="brvbar" set="9573-2003-isonum"/>
<description unicode="1.1">BROKEN BAR</description>
</character>
<character id="U000A7" dec="167" mode="mixed" type="normal">
<unicodedata category="So"/>
<entity id="sect" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="sect" set="8879-isonum"/>
<entity id="sect" set="9573-2003-isonum"/>
<description unicode="1.1">SECTION SIGN</description>
</character>
<character id="U000A8" dec="168" mode="text" type="other">
<unicodedata category="Sk"/>
<entity id="Dot" set="8879-isotech"/>
<entity id="Dot" set="9573-1991-isotech"/>
<entity id="Dot" set="9573-2003-isotech"/>
<entity id="die" set="8879-isodia"/>
<entity id="die" set="9573-2003-isodia"/>
<entity id="DoubleDot" set="mmlalias"/>
<entity id="uml" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="uml" set="8879-isodia"/>
<entity id="uml" set="9573-2003-isodia"/>
<description unicode="1.1">DIAERESIS</description>
</character>
<character id="U000A9" dec="169" mode="mixed" type="normal">
<unicodedata category="So"/>
<entity id="copy" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="copy" set="8879-isonum"/>
<entity id="copy" set="9573-2003-isonum"/>
<entity id="COPY" set="html5-uppercase" optional-semi="yes"/>
<description unicode="1.1">COPYRIGHT SIGN</description>
</character>
<character id="U000AA" dec="170" mode="text" type="other">
<unicodedata category="Ll"/>
<entity id="ordf" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="ordf" set="8879-isonum"/>
<entity id="ordf" set="9573-2003-isonum"/>
<description unicode="1.1">FEMININE ORDINAL INDICATOR</description>
</character>
<character id="U000AB" dec="171" mode="mixed" type="opening">
<unicodedata category="Pi"/>
<entity id="laquo" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="laquo" set="8879-isonum"/>
<entity id="laquo" set="9573-2003-isonum"/>
<description unicode="1.1">LEFT-POINTING DOUBLE ANGLE QUOTATION MARK</description>
</character>
<character id="U000AC" dec="172" mode="math" type="normal">
<unicodedata category="Sm"/>
<entity id="not" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="not" set="8879-isonum"/>
<entity id="not" set="9573-2003-isonum"/>
<description unicode="1.1">NOT SIGN</description>
</character>
<character id="U000AD" dec="173" mode="math" type="other">
<unicodedata category="Cf"/>
<entity id="shy" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="shy" set="8879-isonum"/>
<entity id="shy" set="9573-2003-isonum"/>
<description unicode="1.1">SOFT HYPHEN</description>
</character>
<character id="U000AE" dec="174" mode="mixed" type="normal">
<unicodedata category="So"/>
<entity id="reg" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="reg" set="8879-isonum"/>
<entity id="reg" set="9573-2003-isonum"/>
<entity id="circledR" set="mmlalias"/>
<entity id="REG" set="html5-uppercase" optional-semi="yes"/>
<description unicode="1.1">REGISTERED SIGN</description>
</character>
<character id="U000AF" dec="175" mode="text" type="other">
<unicodedata category="Sk"/>
<entity id="macr" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="macr" set="8879-isodia"/>
<entity id="macr" set="9573-2003-isodia"/>
<entity id="strns" set="9573-1991-isotech"/>
<entity id="strns" set="9573-2003-isotech"/>
<description unicode="1.1">MACRON</description>
</character>
<character id="U000B0" dec="176" mode="mixed" type="other">
<unicodedata category="So"/>
<entity id="deg" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="deg" set="8879-isonum"/>
<entity id="deg" set="9573-2003-isonum"/>
<description unicode="1.1">DEGREE SIGN</description>
</character>
<character id="U000B1" dec="177" mode="math" type="binaryop">
<unicodedata category="Sm"/>
<entity id="plusmn" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="plusmn" set="8879-isonum"/>
<entity id="plusmn" set="9573-2003-isonum"/>
<entity id="pm" set="mmlalias"/>
<entity id="PlusMinus" set="mmlalias"/>
<description unicode="1.1">PLUS-MINUS SIGN</description>
</character>
<character id="U000B2" dec="178" mode="math" type="other">
<unicodedata category="No"/>
<entity id="sup2" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="sup2" set="8879-isonum"/>
<entity id="sup2" set="9573-2003-isonum"/>
<description unicode="1.1">SUPERSCRIPT TWO</description>
</character>
<character id="U000B3" dec="179" mode="math" type="other">
<unicodedata category="No"/>
<entity id="sup3" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="sup3" set="8879-isonum"/>
<entity id="sup3" set="9573-2003-isonum"/>
<description unicode="1.1">SUPERSCRIPT THREE</description>
</character>
<character id="U000B4" dec="180" mode="text" type="other">
<unicodedata category="Sk"/>
<entity id="acute" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="acute" set="8879-isodia"/>
<entity id="acute" set="9573-2003-isodia"/>
<entity id="DiacriticalAcute" set="mmlalias"/>
<description unicode="1.1">ACUTE ACCENT</description>
</character>
<character id="U000B5" dec="181" mode="math" type="normal">
<unicodedata category="Ll"/>
<entity id="micro" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="micro" set="8879-isonum"/>
<entity id="micro" set="9573-2003-isonum"/>
<description unicode="1.1">MICRO SIGN</description>
</character>
<character id="U000B6" dec="182" mode="mixed" type="normal">
<unicodedata category="So"/>
<entity id="para" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="para" set="8879-isonum"/>
<entity id="para" set="9573-2003-isonum"/>
<description unicode="1.1">PILCROW SIGN</description>
</character>
<character id="U000B7" dec="183" mode="math" type="binaryop">
<unicodedata category="Po"/>
<entity id="middot" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="middot" set="8879-isonum"/>
<entity id="middot" set="9573-2003-isonum"/>
<entity id="centerdot" set="mmlalias"/>
<entity id="CenterDot" set="mmlalias"/>
<description unicode="1.1">MIDDLE DOT</description>
</character>
<character id="U000B8" dec="184" mode="mixed" type="other">
<unicodedata category="Sk"/>
<entity id="cedil" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="cedil" set="8879-isodia"/>
<entity id="cedil" set="9573-2003-isodia"/>
<entity id="Cedilla" set="mmlalias"/>
<description unicode="1.1">CEDILLA</description>
</character>
<character id="U000B9" dec="185" mode="math" type="other">
<unicodedata category="No"/>
<entity id="sup1" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="sup1" set="8879-isonum"/>
<entity id="sup1" set="9573-2003-isonum"/>
<description unicode="1.1">SUPERSCRIPT ONE</description>
</character>
<character id="U000BA" dec="186" mode="text" type="other">
<unicodedata category="Ll"/>
<entity id="ordm" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="ordm" set="8879-isonum"/>
<entity id="ordm" set="9573-2003-isonum"/>
<description unicode="1.1">MASCULINE ORDINAL INDICATOR</description>
</character>
<character id="U000BB" dec="187" mode="mixed" type="closing">
<unicodedata category="Pf"/>
<entity id="raquo" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="raquo" set="8879-isonum"/>
<entity id="raquo" set="9573-2003-isonum"/>
<description unicode="1.1">RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK</description>
</character>
<character id="U000BC" dec="188" mode="text" type="other">
<unicodedata category="No"/>
<entity id="frac14" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="frac14" set="8879-isonum"/>
<entity id="frac14" set="9573-2003-isonum"/>
<description unicode="1.1">VULGAR FRACTION ONE QUARTER</description>
</character>
<character id="U000BD" dec="189" mode="text" type="other">
<unicodedata category="No"/>
<entity id="frac12" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="frac12" set="8879-isonum"/>
<entity id="frac12" set="9573-2003-isonum"/>
<entity id="half" set="8879-isonum"/>
<entity id="half" set="9573-2003-isonum"/>
<description unicode="1.1">VULGAR FRACTION ONE HALF</description>
</character>
<character id="U000BE" dec="190" mode="text" type="other">
<unicodedata category="No"/>
<entity id="frac34" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="frac34" set="8879-isonum"/>
<entity id="frac34" set="9573-2003-isonum"/>
<description unicode="1.1">VULGAR FRACTION THREE QUARTERS</description>
</character>
<character id="U000BF" dec="191" mode="text" type="punctuation">
<unicodedata category="Po"/>
<entity id="iquest" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="iquest" set="8879-isonum"/>
<entity id="iquest" set="9573-2003-isonum"/>
<description unicode="1.1">INVERTED QUESTION MARK</description>
</character>
<character id="U000C0" dec="192" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Agrave" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Agrave" set="8879-isolat1"/>
<entity id="Agrave" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER A WITH GRAVE</description>
</character>
<character id="U000C1" dec="193" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Aacute" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Aacute" set="8879-isolat1"/>
<entity id="Aacute" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER A WITH ACUTE</description>
</character>
<character id="U000C2" dec="194" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Acirc" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Acirc" set="8879-isolat1"/>
<entity id="Acirc" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER A WITH CIRCUMFLEX</description>
</character>
<character id="U000C3" dec="195" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Atilde" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Atilde" set="8879-isolat1"/>
<entity id="Atilde" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER A WITH TILDE</description>
</character>
<character id="U000C4" dec="196" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Auml" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Auml" set="8879-isolat1"/>
<entity id="Auml" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER A WITH DIAERESIS</description>
</character>
<character id="U000C5" dec="197" mode="text" type="other">
<unicodedata category="Lu"/>
<entity id="Aring" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Aring" set="8879-isolat1"/>
<entity id="Aring" set="9573-2003-isolat1"/>
<entity id="angst" set="8879-isotech"/>
<entity id="angst" set="9573-1991-isotech"/>
<entity id="angst" set="9573-2003-isotech"/>
<description unicode="1.1">LATIN CAPITAL LETTER A WITH RING ABOVE</description>
</character>
<character id="U000C6" dec="198" mode="text" type="alphabetic">
<unicodedata category="Lu"/>
<entity id="AElig" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="AElig" set="8879-isolat1"/>
<entity id="AElig" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER AE</description>
</character>
<character id="U000C7" dec="199" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Ccedil" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Ccedil" set="8879-isolat1"/>
<entity id="Ccedil" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER C WITH CEDILLA</description>
</character>
<character id="U000C8" dec="200" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Egrave" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Egrave" set="8879-isolat1"/>
<entity id="Egrave" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER E WITH GRAVE</description>
</character>
<character id="U000C9" dec="201" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Eacute" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Eacute" set="8879-isolat1"/>
<entity id="Eacute" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER E WITH ACUTE</description>
</character>
<character id="U000CA" dec="202" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Ecirc" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Ecirc" set="8879-isolat1"/>
<entity id="Ecirc" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER E WITH CIRCUMFLEX</description>
</character>
<character id="U000CB" dec="203" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Euml" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Euml" set="8879-isolat1"/>
<entity id="Euml" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER E WITH DIAERESIS</description>
</character>
<character id="U000CC" dec="204" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Igrave" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Igrave" set="8879-isolat1"/>
<entity id="Igrave" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER I WITH GRAVE</description>
</character>
<character id="U000CD" dec="205" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Iacute" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Iacute" set="8879-isolat1"/>
<entity id="Iacute" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER I WITH ACUTE</description>
</character>
<character id="U000CE" dec="206" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Icirc" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Icirc" set="8879-isolat1"/>
<entity id="Icirc" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER I WITH CIRCUMFLEX</description>
</character>
<character id="U000CF" dec="207" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Iuml" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Iuml" set="8879-isolat1"/>
<entity id="Iuml" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER I WITH DIAERESIS</description>
</character>
<character id="U000D0" dec="208" mode="text" type="alphabetic">
<unicodedata category="Lu"/>
<entity id="ETH" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="ETH" set="8879-isolat1"/>
<entity id="ETH" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER ETH</description>
</character>
<character id="U000D1" dec="209" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Ntilde" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Ntilde" set="8879-isolat1"/>
<entity id="Ntilde" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER N WITH TILDE</description>
</character>
<character id="U000D2" dec="210" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Ograve" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Ograve" set="8879-isolat1"/>
<entity id="Ograve" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER O WITH GRAVE</description>
</character>
<character id="U000D3" dec="211" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Oacute" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Oacute" set="8879-isolat1"/>
<entity id="Oacute" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER O WITH ACUTE</description>
</character>
<character id="U000D4" dec="212" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Ocirc" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Ocirc" set="8879-isolat1"/>
<entity id="Ocirc" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER O WITH CIRCUMFLEX</description>
</character>
<character id="U000D5" dec="213" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Otilde" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Otilde" set="8879-isolat1"/>
<entity id="Otilde" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER O WITH TILDE</description>
</character>
<character id="U000D6" dec="214" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Ouml" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Ouml" set="8879-isolat1"/>
<entity id="Ouml" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER O WITH DIAERESIS</description>
</character>
<character id="U000D7" dec="215" mode="mixed" type="binaryop">
<unicodedata category="Sm"/>
<entity id="times" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="times" set="8879-isonum"/>
<entity id="times" set="9573-2003-isonum"/>
<description unicode="1.1">MULTIPLICATION SIGN</description>
</character>
<character id="U000D8" dec="216" mode="text" type="alphabetic">
<unicodedata category="Lu"/>
<entity id="Oslash" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Oslash" set="8879-isolat1"/>
<entity id="Oslash" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER O WITH STROKE</description>
</character>
<character id="U000D9" dec="217" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Ugrave" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Ugrave" set="8879-isolat1"/>
<entity id="Ugrave" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER U WITH GRAVE</description>
</character>
<character id="U000DA" dec="218" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Uacute" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Uacute" set="8879-isolat1"/>
<entity id="Uacute" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER U WITH ACUTE</description>
</character>
<character id="U000DB" dec="219" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Ucirc" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Ucirc" set="8879-isolat1"/>
<entity id="Ucirc" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER U WITH CIRCUMFLEX</description>
</character>
<character id="U000DC" dec="220" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Uuml" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Uuml" set="8879-isolat1"/>
<entity id="Uuml" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER U WITH DIAERESIS</description>
</character>
<character id="U000DD" dec="221" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Yacute" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="Yacute" set="8879-isolat1"/>
<entity id="Yacute" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER Y WITH ACUTE</description>
</character>
<character id="U000DE" dec="222" mode="text" type="alphabetic">
<unicodedata category="Lu"/>
<entity id="THORN" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="THORN" set="8879-isolat1"/>
<entity id="THORN" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN CAPITAL LETTER THORN</description>
</character>
<character id="U000DF" dec="223" mode="text" type="alphabetic">
<unicodedata category="Ll"/>
<entity id="szlig" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="szlig" set="8879-isolat1"/>
<entity id="szlig" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER SHARP S</description>
</character>
<character id="U000E0" dec="224" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="agrave" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="agrave" set="8879-isolat1"/>
<entity id="agrave" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER A WITH GRAVE</description>
</character>
<character id="U000E1" dec="225" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="aacute" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="aacute" set="8879-isolat1"/>
<entity id="aacute" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER A WITH ACUTE</description>
</character>
<character id="U000E2" dec="226" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="acirc" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="acirc" set="8879-isolat1"/>
<entity id="acirc" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER A WITH CIRCUMFLEX</description>
</character>
<character id="U000E3" dec="227" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="atilde" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="atilde" set="8879-isolat1"/>
<entity id="atilde" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER A WITH TILDE</description>
</character>
<character id="U000E4" dec="228" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="auml" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="auml" set="8879-isolat1"/>
<entity id="auml" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER A WITH DIAERESIS</description>
</character>
<character id="U000E5" dec="229" mode="text" type="other">
<unicodedata category="Ll"/>
<entity id="aring" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="aring" set="8879-isolat1"/>
<entity id="aring" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER A WITH RING ABOVE</description>
</character>
<character id="U000E6" dec="230" mode="text" type="alphabetic">
<unicodedata category="Ll"/>
<entity id="aelig" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="aelig" set="8879-isolat1"/>
<entity id="aelig" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER AE</description>
</character>
<character id="U000E7" dec="231" mode="mixed" type="alphabetic">
<unicodedata category="Ll"/>
<entity id="ccedil" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="ccedil" set="8879-isolat1"/>
<entity id="ccedil" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER C WITH CEDILLA</description>
</character>
<character id="U000E8" dec="232" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="egrave" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="egrave" set="8879-isolat1"/>
<entity id="egrave" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER E WITH GRAVE</description>
</character>
<character id="U000E9" dec="233" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="eacute" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="eacute" set="8879-isolat1"/>
<entity id="eacute" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER E WITH ACUTE</description>
</character>
<character id="U000EA" dec="234" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="ecirc" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="ecirc" set="8879-isolat1"/>
<entity id="ecirc" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER E WITH CIRCUMFLEX</description>
</character>
<character id="U000EB" dec="235" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="euml" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="euml" set="8879-isolat1"/>
<entity id="euml" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER E WITH DIAERESIS</description>
</character>
<character id="U000EC" dec="236" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="igrave" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="igrave" set="8879-isolat1"/>
<entity id="igrave" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER I WITH GRAVE</description>
</character>
<character id="U000ED" dec="237" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="iacute" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="iacute" set="8879-isolat1"/>
<entity id="iacute" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER I WITH ACUTE</description>
</character>
<character id="U000EE" dec="238" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="icirc" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="icirc" set="8879-isolat1"/>
<entity id="icirc" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER I WITH CIRCUMFLEX</description>
</character>
<character id="U000EF" dec="239" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="iuml" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="iuml" set="8879-isolat1"/>
<entity id="iuml" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER I WITH DIAERESIS</description>
</character>
<character id="U000F0" dec="240" mode="text" type="alphabetic">
<unicodedata category="Ll"/>
<entity id="eth" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="eth" set="8879-isolat1"/>
<entity id="eth" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER ETH</description>
</character>
<character id="U000F1" dec="241" mode="mixed" type="alphabetic">
<unicodedata category="Ll"/>
<entity id="ntilde" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="ntilde" set="8879-isolat1"/>
<entity id="ntilde" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER N WITH TILDE</description>
</character>
<character id="U000F2" dec="242" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="ograve" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="ograve" set="8879-isolat1"/>
<entity id="ograve" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER O WITH GRAVE</description>
</character>
<character id="U000F3" dec="243" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="oacute" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="oacute" set="8879-isolat1"/>
<entity id="oacute" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER O WITH ACUTE</description>
</character>
<character id="U000F4" dec="244" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="ocirc" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="ocirc" set="8879-isolat1"/>
<entity id="ocirc" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER O WITH CIRCUMFLEX</description>
</character>
<character id="U000F5" dec="245" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="otilde" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="otilde" set="8879-isolat1"/>
<entity id="otilde" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER O WITH TILDE</description>
</character>
<character id="U000F6" dec="246" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="ouml" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="ouml" set="8879-isolat1"/>
<entity id="ouml" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER O WITH DIAERESIS</description>
</character>
<character id="U000F7" dec="247" mode="math" type="binaryop">
<unicodedata category="Sm"/>
<entity id="divide" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="divide" set="8879-isonum"/>
<entity id="divide" set="9573-2003-isonum"/>
<entity id="div" set="mmlalias"/>
<description unicode="1.1">DIVISION SIGN</description>
</character>
<character id="U000F8" dec="248" mode="text" type="alphabetic">
<unicodedata category="Ll"/>
<entity id="oslash" set="8879-isolat1"/>
<entity id="oslash" set="9573-2003-isolat1"/>
<entity id="oslash" set="xhtml1-lat1" optional-semi="yes"/>
<description unicode="1.1">LATIN SMALL LETTER O WITH STROKE</description>
</character>
<character id="U000F9" dec="249" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="ugrave" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="ugrave" set="8879-isolat1"/>
<entity id="ugrave" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER U WITH GRAVE</description>
</character>
<character id="U000FA" dec="250" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="uacute" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="uacute" set="8879-isolat1"/>
<entity id="uacute" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER U WITH ACUTE</description>
</character>
<character id="U000FB" dec="251" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="ucirc" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="ucirc" set="8879-isolat1"/>
<entity id="ucirc" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER U WITH CIRCUMFLEX</description>
</character>
<character id="U000FC" dec="252" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="uuml" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="uuml" set="8879-isolat1"/>
<entity id="uuml" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER U WITH DIAERESIS</description>
</character>
<character id="U000FD" dec="253" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="yacute" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="yacute" set="8879-isolat1"/>
<entity id="yacute" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER Y WITH ACUTE</description>
</character>
<character id="U000FE" dec="254" mode="text" type="alphabetic">
<unicodedata category="Ll"/>
<entity id="thorn" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="thorn" set="8879-isolat1"/>
<entity id="thorn" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER THORN</description>
</character>
<character id="U000FF" dec="255" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="yuml" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="yuml" set="8879-isolat1"/>
<entity id="yuml" set="9573-2003-isolat1"/>
<description unicode="1.1">LATIN SMALL LETTER Y WITH DIAERESIS</description>
</character>
<character id="U00100" dec="256" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Amacr" set="8879-isolat2"/>
<entity id="Amacr" set="9573-2003-isolat2"/>
<description unicode="1.1">LATIN CAPITAL LETTER A WITH MACRON</description>
</character>
<character id="U00101" dec="257" mode="mixed" type="other">
<unicodedata category="Ll"/>
<entity id="amacr" set="8879-isolat2"/>
<entity id="amacr" set="9573-2003-isolat2"/>
<description unicode="1.1">LATIN SMALL LETTER A WITH MACRON</description>
</character>
</unicode>
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8" indent="no" omit-xml-declaration="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name='newline'><xsl:text>
</xsl:text></xsl:variable>
<xsl:template match="/unicode">
<xsl:for-each select="character">
<xsl:for-each select="entity">
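<!-- Emit each entity name once per character: skip an entity whose @id already appeared in an earlier sibling (same name, different set) -->
<xsl:if test="not(preceding-sibling::entity[@id = current()/@id])">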
<xsl:variable name="entity" select="@id"/>
<xsl:variable name="id" select="../@id"/>
<xsl:variable name="desc" select="../description"/>
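<!-- Decimal value(s) from @dec: when two values are joined by a hyphen, the hyphen is written as # -->
<xsl:variable name="dec">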
<xsl:variable name="modifiedId">
<xsl:analyze-string select="../@dec" regex="^([0-9]+)([-]?)([0-9]*)$">
<xsl:matching-substring>
<xsl:value-of select="regex-group(1)"/>
<xsl:if test="regex-group(2)">#</xsl:if>
<xsl:if test="regex-group(3)"><xsl:value-of select="regex-group(3)"/></xsl:if>
</xsl:matching-substring>
</xsl:analyze-string>
</xsl:variable>
<xsl:value-of select="$modifiedId"/>
</xsl:variable>
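<!-- Hex value(s) from the character @id: drop the leading U and leading zeros; a hyphen between two values is written as # -->
<xsl:variable name="hex">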
<xsl:variable name="strippedId">
<xsl:analyze-string select="$id" regex="^(U[0]*)([0-9A-F]+)([-]?)([0]*)([0-9A-F]+)?$">
<xsl:matching-substring>
<xsl:value-of select="regex-group(2)"/>
<xsl:if test="regex-group(3)">#</xsl:if>
<xsl:if test="regex-group(5)"><xsl:value-of select="regex-group(5)"/></xsl:if>
</xsl:matching-substring>
</xsl:analyze-string>
</xsl:variable>
<xsl:value-of select="$strippedId"/>
</xsl:variable>
<!-- One CSV row per entity: name, decimal, hex, description, each wrapped in double quotes -->
<xsl:text>"</xsl:text>
<xsl:value-of select="$entity"/>
<xsl:text>","</xsl:text>
<xsl:value-of select="$dec"/>
<xsl:text>","</xsl:text>
<xsl:value-of select="$hex"/>
<xsl:text>","</xsl:text>
<xsl:value-of select="$desc"/>
<xsl:text>"</xsl:text>
<xsl:value-of select="$newline"/>
</xsl:if>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
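To sanity-check what the stylesheet above should emit, the same logic can be approximated outside Saxon. The following is a minimal Python sketch, not part of the gist: the function names (`strip_hex`, `rows`, `to_csv_line`) and the inline `SAMPLE` are made up for illustration, and the hyphenated id `U0003C-020D2` used in the comments is only an assumed example of the two-code-point form. It mirrors three steps: one row per distinct entity name within a character, `U` and leading zeros stripped from the id, and hyphens replaced by `#`.

```python
import xml.etree.ElementTree as ET

# Two records copied from eg.xml, wrapped in the root element the stylesheet matches.
SAMPLE = """<unicode>
<character id="U000F7" dec="247" mode="math" type="binaryop">
<unicodedata category="Sm"/>
<entity id="divide" set="xhtml1-lat1" optional-semi="yes"/>
<entity id="divide" set="8879-isonum"/>
<entity id="divide" set="9573-2003-isonum"/>
<entity id="div" set="mmlalias"/>
<description unicode="1.1">DIVISION SIGN</description>
</character>
<character id="U00100" dec="256" mode="mixed" type="other">
<unicodedata category="Lu"/>
<entity id="Amacr" set="8879-isolat2"/>
<entity id="Amacr" set="9573-2003-isolat2"/>
<description unicode="1.1">LATIN CAPITAL LETTER A WITH MACRON</description>
</character>
</unicode>"""


def strip_hex(char_id):
    """Drop the leading 'U' and leading zeros; join two code points with '#'.

    Assumed id forms: 'U000F7' or, for two code points, 'U0003C-020D2'.
    """
    parts = char_id[1:].split("-")
    return "#".join(part.lstrip("0") or "0" for part in parts)


def rows(xml_text):
    """Return (entity, dec, hex, description) tuples, one per distinct entity name."""
    root = ET.fromstring(xml_text)
    result = []
    for character in root.findall("character"):
        seen = set()  # entity names already emitted for this character
        for entity in character.findall("entity"):
            name = entity.get("id")
            if name in seen:
                continue
            seen.add(name)
            result.append((
                name,
                character.get("dec").replace("-", "#"),
                strip_hex(character.get("id")),
                character.findtext("description"),
            ))
    return result


def to_csv_line(row):
    """Quote every field, as the stylesheet does (no escaping of embedded quotes)."""
    return '"' + '","'.join(row) + '"'


if __name__ == "__main__":
    for row in rows(SAMPLE):
        print(to_csv_line(row))
```

Like the stylesheet, this sketch writes the quotes by hand rather than using a CSV library, so it would not escape a double quote inside a description. Comparing its lines against `eg_saxon-he.csv` for a few records is a quick way to confirm the transform behaves as intended.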