@Angles
Last active February 3, 2016 23:59
Unicode dash hyphen stuff
<html>
<head>
<title>Dash Hyphen Unicode chars</title>
<style>
pre, code {
font-family: "SourceSansPro-Regular", "DejaVuSansMono", monospace;
}
a {
color: #A2BDE0;
text-decoration:none;
}
body {
margin: 2em;
font-family: Verdana, Arial, Tahoma, sans-serif;
font-size: 1.5em;
/* color: white; */
color: #CCCCCC;
/* background-color: #c0c0c0; */
/* background-color: #d0d0d0; */
/* background-color: #000000; */
background-color: #262B30;
}
</style>
</head>
<body>
<h1>Dash hyphen Unicode chars</h1>
<p></p>
<p>Note: when screen scraping text (news articles, etc.), it is often useful to substitute the standard char 45
hyphen-minus <a href="http://www.fileformat.info/info/unicode/char/002d/index.htm">u002D</a>
for higher-order Unicode dash characters, such as those below, to simplify the text.
Surrounding spaces are not consistent, except between words,
but fe63 and ff0d apparently include spaces. (General reference: <a href="http://www.fileformat.info/info/unicode/char/search.htm">fileformat.info</a>).</p>
<hr />
<p>dash&#x2010;zero <a href="http://www.fileformat.info/info/unicode/char/2010/index.htm">u2010</a>, hyphen</p>
<p>dash&#x2011;one <a href="http://www.fileformat.info/info/unicode/char/2011/index.htm">u2011</a>, non-breaking hyphen</p>
<p>dash&#x2012;two <a href="http://www.fileformat.info/info/unicode/char/2012/index.htm">u2012</a>, figure dash</p>
<p>dash&#x2013;three <a href="http://www.fileformat.info/info/unicode/char/2013/index.htm">u2013</a>, en dash, &amp;ndash; as&ndash;html</p>
<p>dash&#x2014;four <a href="http://www.fileformat.info/info/unicode/char/2014/index.htm">u2014</a>, em dash, &amp;mdash; as&mdash;html "Comments: may be used in pairs to offset parenthetical text."</p>
<p>dash&#x2015;five <a href="http://www.fileformat.info/info/unicode/char/2015/index.htm">u2015</a>, horizontal bar, "Comments: quotation dash. long dash introducing quoted text."</p>
<br />
<hr />
<p>dash&#xFE58;uFE58 <a href="http://www.fileformat.info/info/unicode/char/fe58/index.htm">small em dash</a>; Block: Small Form Variants; "Decomposition: &lt;small&gt; EM DASH (U+2014)".</p>
<p>dash&#xFE63;uFE63 <a href="http://www.fileformat.info/info/unicode/char/fe63/index.htm">small hyphen-minus</a>; Block: Small Form Variants; "BIDI: European Number Separator [ES]".</p>
<p>dash&#xFF0D;uFF0D <a href="http://www.fileformat.info/info/unicode/char/ff0d/index.htm">FULLWIDTH hyphen-minus (U+FF0D)</a>; Block: Halfwidth and Fullwidth Forms; "BIDI: European Number Separator [ES]"; "Decomposition: &lt;wide&gt; HYPHEN-MINUS (U+002D)".</p>
<br />
<hr />
<p>Note: Of the basic hyphen-minus char 45, <a href="http://www.fileformat.info/info/unicode/char/002d/index.htm">u002d</a>, <a href="http://www.fileformat.info">fileformat.info</a> says (literally): <br /></p>
See Also:<br />
hyphen <a href="http://www.fileformat.info/info/unicode/char/2010/index.htm">U+2010</a>
<br>
non-breaking hyphen <a href="http://www.fileformat.info/info/unicode/char/2011/index.htm">U+2011</a>
<br>
figure dash <a href="http://www.fileformat.info/info/unicode/char/2012/index.htm">U+2012</a>
<br>
en dash <a href="http://www.fileformat.info/info/unicode/char/2013/index.htm">U+2013</a>
<br>
hyphen bullet <a href="http://www.fileformat.info/info/unicode/char/2043/index.htm">U+2043</a>
<br>
minus sign <a href="http://www.fileformat.info/info/unicode/char/2212/index.htm">U+2212</a>
<br>
roman uncia sign <a href="http://www.fileformat.info/info/unicode/char/10191/index.htm">U+10191</a>
<br />
<hr />
<p>SOURCE: <a href="http://www.fileformat.info/info/unicode/char/search.htm">fileformat.info</a> unicode Char Search.</p>
<p>And block groups like <a href="http://www.fileformat.info/info/unicode/block/general_punctuation/list.htm">"Unicode Characters in the General Punctuation Block" - www.fileformat.info/info/unicode/block/general_punctuation/list.htm</a>
</p>
</body>
</html>

Dash hyphen Unicode chars

Note: when screen scraping text (news articles, etc.), it is often useful to substitute the standard char 45 hyphen-minus u002D for higher-order Unicode dash characters, such as those below, to simplify the text. Surrounding spaces are not consistent, except between words, but fe63 and ff0d apparently include spaces. (General reference: fileformat.info).
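
A minimal sketch of that substitution in Python, assuming the scraped text is already decoded to a `str`. The names `DASH_CODE_POINTS`, `DASH_TABLE`, and `simplify_dashes` are hypothetical, and the table covers only the characters listed on this page, not every dash in Unicode.

```python
# Dash-like code points from this page that should become U+002D.
DASH_CODE_POINTS = [
    0x2010,  # hyphen
    0x2011,  # non-breaking hyphen
    0x2012,  # figure dash
    0x2013,  # en dash
    0x2014,  # em dash
    0x2015,  # horizontal bar
    0xFE58,  # small em dash
    0xFE63,  # small hyphen-minus
    0xFF0D,  # fullwidth hyphen-minus
]

# str.translate accepts a dict mapping code point -> replacement string.
DASH_TABLE = {cp: "-" for cp in DASH_CODE_POINTS}


def simplify_dashes(text: str) -> str:
    """Return text with the listed dash characters replaced by '-'."""
    return text.translate(DASH_TABLE)


if __name__ == "__main__":
    sample = "dash\u2013three and dash\u2014four"
    print(simplify_dashes(sample))  # dash-three and dash-four
```

The table does nothing about surrounding spaces, so any spacing inconsistencies in the scraped text remain after the substitution.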


dash‐zero u2010, hyphen

dash‑one u2011, non-breaking hyphen

dash‒two u2012, figure dash

dash–three u2013, en dash, &ndash; as–html

dash—four u2014, em dash, &mdash; as—html "Comments: may be used in pairs to offset parenthetical text."

dash―five u2015, horizontal bar, "Comments: quotation dash. long dash introducing quoted text."


dash﹘uFE58 small em dash; Block: Small Form Variants; "Decomposition: <small> EM DASH (U+2014)".

dash﹣uFE63 small hyphen-minus; Block: Small Form Variants; "BIDI: European Number Separator [ES]".

dash－uFF0D FULLWIDTH hyphen-minus (U+FF0D); Block: Halfwidth and Fullwidth Forms; "BIDI: European Number Separator [ES]"; "Decomposition: <wide> HYPHEN-MINUS (U+002D)".
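
The BIDI and decomposition fields quoted for these three characters can be read from Python's standard `unicodedata` module. The sketch below is only an illustration. The helper name `describe` is hypothetical, and the values reported come from whichever Unicode database ships with the Python interpreter in use.

```python
import unicodedata

# The three compatibility characters listed above.
COMPAT_DASHES = ["\uFE58", "\uFE63", "\uFF0D"]


def describe(ch: str) -> dict:
    """Collect the Unicode database fields quoted in the notes above."""
    return {
        "code": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "bidi": unicodedata.bidirectional(ch),
        "decomposition": unicodedata.decomposition(ch),
        # NFKC applies the compatibility decomposition, then recomposes.
        "nfkc": unicodedata.normalize("NFKC", ch),
    }


if __name__ == "__main__":
    for ch in COMPAT_DASHES:
        print(describe(ch))
```

Note that NFKC normalization turns U+FE63 and U+FF0D into hyphen-minus but turns U+FE58 into an em dash (U+2014), so normalization alone does not reduce every character on this page to char 45.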


Note: Of the basic hyphen-minus char 45, u002d, fileformat.info says (literally):

See Also:
hyphen U+2010
non-breaking hyphen U+2011
figure dash U+2012
en dash U+2013
hyphen bullet U+2043
minus sign U+2212
roman uncia sign U+10191
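
As a cross-check, the names in that "See Also" list can be reproduced from the code points with Python's `unicodedata.name`. This is a small sketch; `SEE_ALSO` and `label` are hypothetical names, and the lookup relies on the interpreter's bundled Unicode database.

```python
import unicodedata

# Code points from the "See Also" list above.
SEE_ALSO = [0x2010, 0x2011, 0x2012, 0x2013, 0x2043, 0x2212, 0x10191]


def label(cp: int) -> str:
    """Format a code point as '<lowercase name> U+XXXX'."""
    return f"{unicodedata.name(chr(cp)).lower()} U+{cp:04X}"


if __name__ == "__main__":
    for cp in SEE_ALSO:
        print(label(cp))
```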


SOURCE: fileformat.info unicode Char Search.

NOTE: v.1 is HTML. This v.2 was converted from HTML to Markdown with the to-markdown online tool, whose JS source is on GitHub.
