@Hasnep
Created October 11, 2021 12:05
Unicode test file
C0 Controls and Basic Latin U+0000 – U+007F (0–127)
! 5 A a
C1 Controls and Latin–1 Supplement U+0080 – U+00FF (128–255)
¥ ¼ Ñ ñ
Latin Extended-A U+0100 – U+017F (256–383)
Ą ą Ĳ ĳ
Latin Extended-B U+0180 – U+024F (384–591)
Ə Ɛ ƕ ƺ
IPA Extensions U+0250 – U+02AF (592–687)
ɖ ɞ ɫ ɷ
Spacing Modifier Letters U+02B0 – U+02FF (688–767)
ʱ ʬ ˕ ˨
Combining Diacritical Marks U+0300 – U+036F (768–879)
o̕ o̚ ơ o͡o
Greek U+0370 – U+03FF (880–1023)
Ύ Δ δ Ϡ
Cyrillic U+0400 – U+04FF (1024–1279)
Љ Щ щ Ӄ
Cyrillic Supplement U+0500 – U+052F (1280–1327)
Ԁ Ԇ Ԉ Ԏ
Armenian U+0530 – U+058F (1328–1423)
Ա Բ ա բ
Hebrew U+0590 – U+05FF (1424–1535)
סֶ א ב ױ
Arabic U+0600 – U+06FF (1536–1791)
؟ ب حٍ ۳
Syriac U+0700 – U+074F (1792–1871)
܀ ܐ ܠ ܘ݉
Thaana U+0780 – U+07BF (1920–1983)
ހ ސ ޤ ހި
N’Ko U+07C0 – U+07FF (1984–2047)
߄ ߐ ߰ߋ ߹
Samaritan U+0800 – U+083F (2048–2111)
ࠀ ࠎ ࠏࠪ ࠽
Arabic Extended-A U+08A0 – U+08FF (2208–2303)
ࢧ ࣦ ࣲ ࣾ
Devanagari U+0900 – U+097F (2304–2431)
ठः अ ठी ३
Bengali U+0980 – U+09FF (2432–2559)
তঃ অ ৩ ৵
Gurmukhi U+0A00 – U+0A7F (2560–2687)
ਠਂ ਅ ਉ ਠੱ
Gujarati U+0A80 – U+0AFF (2688–2815)
ઠઃ અ ઠૌ ૩
Oriya U+0B00 – U+0B7F (2816–2943)
ଆ ଐ ଠୗ ୩
Tamil U+0B80 – U+0BFF (2944–3071)
பஂ அ பூ ௩
Telugu U+0C00 – U+0C7F (3072–3199)
అః ఓ అౌ ౩
Kannada U+0C80 – U+0CFF (3200–3327)
ಲಃ ಅ ಲೋ ೩
Malayalam U+0D00 – U+0D7F (3328–3455)
ഠഃ അ ഠൃ ൩
Sinhala U+0D80 – U+0DFF (3456–3583)
ෆං එ ඣ ෆූ
Thai U+0E00 – U+0E7F (3584–3711)
ก ญ กั ๓
Lao U+0E80 – U+0EFF (3712–3839)
ກ ຜ ໄ ໓
Tibetan U+0F00 – U+0FFF (3840–4095)
༣ ཁ ངཱུ ངྵ
Myanmar U+1000 – U+109F (4096–4255)
က ဂု ၄ ၍
Georgian U+10A0 – U+10FF (4256–4351)
ა ზ ჵ ჻
Hangul Jamo U+1100 – U+11FF (4352–4607)
ᄀ ᅙ ᇧ ᇸ
Ethiopic U+1200 – U+137F (4608–4991)
ሀ ቻ ፧ ፬
Cherokee U+13A0 – U+13FF (5024–5119)
Ꭰ Ꭻ Ꮞ Ᏼ
Unified Canadian Aboriginal Syllabics U+1400 – U+167F (5120–5759)
ᐁ ᑦ ᕵ ᙧ
Ogham U+1680 – U+169F (5760–5791)
ᚁ ᚈ ᚕ ᚜
Runic U+16A0 – U+16FF (5792–5887)
ᚠ ᚳ ᛦ ᛰ
Tagalog U+1700 – U+171F (5888–5919)
ᜀ ᜄ ᜌ ᜊᜒ
Hanunóo U+1720 – U+173F (5920–5951)
ᜠ ᜫ ᜪᜭ ᜯ
Buhid U+1740 – U+175F (5952–5983)
ᝁ ᝊ ᝐ ᝊᝒ
Tagbanwa U+1760 – U+177F (5984–6015)
ᝠ ᝦ ᝮ ᝪᝲ
Khmer U+1780 – U+17FF (6016–6143)
ក ឣ ខា ៤
Mongolian U+1800 – U+18AF (6144–6319)
᠀ ᠔ ᡎ ᢥ
Unified Canadian Aboriginal Syllabics Extended U+18B0 – U+18FF (6320–6399)
ᢰ ᣇ ᣠ ᣴ
Limbu U+1900 – U+194F (6400–6479)
ᤁ ᤥ ᥅ ᥉
Tai Le U+1950 – U+197F (6480–6527)
ᥐ ᥞ ᥨ ᥲ
Khmer Symbols U+19E0 – U+19FF (6624–6655)
᧠ ᧪ ᧴ ᧾
Buginese U+1A00 – U+1A1F (6656–6687)
ᨀ ᨁ ᨖ ᨔᨗ
Tai Tham U+1A20 – U+1AAF (6688–6831)
ᨠ ᨣᩯ ᪁ ᪭
Balinese U+1B00 – U+1B7F (6912–7039)
ᬧᬀ ᬊ ᬧᭀ ᭪
Sundanese U+1B80 – U+1BBF (7040–7103)
ᮗᮀ ᮋ ᮗᮦ ᮵
Lepcha U+1C00 – U+1C4F (7168–7247)
ᰁ ᰘ ᰓᰯ ᱅
Ol Chiki U+1C50 – U+1C7F (7248–7295)
᱕ ᱝ ᱰ ᱿
Vedic Extensions U+1CD0 – U+1CFF (7376–7423)
o᳐ o᳢ ᳩ ᳱ
Phonetic Extensions U+1D00 – U+1D7F (7424–7551)
ᴂ ᴥ ᴽ ᵫ
Latin Extended Additional U+1E00 – U+1EFF (7680–7935)
Ḁ Ẁ Ặ ỳ
Greek Extended U+1F00 – U+1FFF (7936–8191)
ἀ ὂ ᾑ ῼ
General Punctuation U+2000 – U+206F (8192–8303)
— “ ‰ ※
Superscripts and Subscripts U+2070 – U+209F (8304–8351)
⁴ ⁾ ₃ ₌
Currency Symbols U+20A0 – U+20CF (8352–8399)
₢ ₣ ₪ €
Combining Diacritical Marks for Symbols U+20D0 – U+20FF (8400–8447)
o⃐ o⃕ o⃚ o⃠
Letterlike Symbols U+2100 – U+214F (8448–8527)
℀ ℃ № ™
Number Forms U+2150 – U+218F (8528–8591)
⅛ Ⅳ ⅸ ↂ
Arrows U+2190 – U+21FF (8592–8703)
← ↯ ↻ ⇈
Mathematical Operators U+2200 – U+22FF (8704–8959)
∀ ∰ ⊇ ⋩
Miscellaneous Technical U+2300 – U+23FF (8960–9215)
⌂ ⌆ ⌣ ⌽
Control Pictures U+2400 – U+243F (9216–9279)
␂ ␊ ␢ ␣
Optical Character Recognition U+2440 – U+245F (9280–9311)
⑀ ⑃ ⑆ ⑊
Enclosed Alphanumerics U+2460 – U+24FF (9312–9471)
③ ⑷ ⒌ ⓦ
Box Drawing U+2500 – U+257F (9472–9599)
┍ ┝ ╤ ╳
Block Elements U+2580 – U+259F (9600–9631)
▀ ▃ ▏ ░
Geometric Shapes U+25A0 – U+25FF (9632–9727)
□ ▨ ◎ ◮
Miscellaneous Symbols U+2600 – U+26FF (9728–9983)
☂ ☺ ♀ ♪
Dingbats U+2700 – U+27BF (9984–10175)
✃ ✈ ❄ ➓
Miscellaneous Mathematical Symbols-A U+27C0 – U+27EF (10176–10223)
⟐ ⟟ ⟥ ⟫
Supplemental Arrows-A U+27F0 – U+27FF (10224–10239)
⟰ ⟶ ⟺ ⟿
Braille Patterns U+2800 – U+28FF (10240–10495)
⠀ ⠲ ⢖ ⣿
Supplemental Arrows-B U+2900 – U+297F (10496–10623)
⤄ ⤽ ⥈ ⥻
Miscellaneous Mathematical Symbols-B U+2980 – U+29FF (10624–10751)
⦀ ⦝ ⧰ ⧻
Supplemental Mathematical Operators U+2A00 – U+2AFF (10752–11007)
⨇ ⨋ ⫚ ⫸
Miscellaneous Symbols and Arrows U+2B00 – U+2BFF (11008–11263)
⬀ ⬄ ⬉ ⬍
Glagolitic U+2C00 – U+2C5F (11264–11359)
Ⰰ Ⰹ Ⰽ ⱙ
Latin Extended-C U+2C60 – U+2C7F (11360–11391)
Ⱡ ⱥ ⱶ ⱺ
Coptic U+2C80 – U+2CFF (11392–11519)
Ⲁ ⲑ Ⲷ Ⳃ
Georgian Supplement U+2D00 – U+2D2F (11520–11567)
ⴀ ⴆ ⴝ ⴢ
Tifinagh U+2D30 – U+2D7F (11568–11647)
ⴲ ⴶ ⵟ ⵥ
Ethiopic Extended U+2D80 – U+2DDF (11648–11743)
ⶀ ⶆ ⶐ ⷖ
Cyrillic Extended-A U+2DE0 – U+2DFF (11744–11775)
оⷠ оⷩ оⷶ оⷿ
Supplemental Punctuation U+2E00 – U+2E7F (11776–11903)
⸁ ⸎ ⸨ ⸭
CJK Radicals Supplement U+2E80 – U+2EFF (11904–12031)
⺀ ⺘ ⻂ ⻱
KangXi Radicals U+2F00 – U+2FDF (12032–12255)
⼀ ⼽ ⽺ ⿔
Ideographic Description Characters U+2FF0 – U+2FFF (12272–12287)
⿰ ⿳ ⿷ ⿻
CJK Symbols and Punctuation U+3000 – U+303F (12288–12351)
々 〒 〣 〰
Hiragana U+3040 – U+309F (12352–12447)
あ ぐ る ゞ
Katakana U+30A0 – U+30FF (12448–12543)
ア ヅ ヨ ヾ
Bopomofo U+3100 – U+312F (12544–12591)
ㄆ ㄓ ㄝ ㄩ
Hangul Compatibility Jamo U+3130 – U+318F (12592–12687)
ㄱ ㄸ ㅪ ㆍ
Kanbun U+3190 – U+319F (12688–12703)
㆐ ㆕ ㆚ ㆟
Bopomofo Extended U+31A0 – U+31BF (12704–12735)
ㆠ ㆧ ㆯ ㆷ
Katakana Phonetic Extensions U+31F0 – U+31FF (12784–12799)
ㇰ ㇵ ㇺ ㇿ
Enclosed CJK Letters and Months U+3200 – U+32FF (12800–13055)
㈔ ㈲ ㊧ ㋮
CJK Compatibility U+3300 – U+33FF (13056–13311)
㌃ ㍻ ㎡ ㏵
CJK Unified Ideographs Extension A U+3400 – U+4DB5 (13312–19893)
㐅 㒅 㝬 㿜
Yijing Hexagram Symbols U+4DC0 – U+4DFF (19904–19967)
䷂ ䷫ ䷴ ䷾
CJK Unified Ideographs U+4E00 – U+9FFF (19968–40959)
一 憨 田 龥
Yi Syllables U+A000 – U+A48F (40960–42127)
ꀀ ꅴ ꊩ ꒌ
Yi Radicals U+A490 – U+A4CF (42128–42191)
꒐ ꒡ ꒰ ꓆
Lisu U+A4D0 – U+A4FF (42192–42239)
ꓐ ꓫ ꓻ ꓿
Vai U+A500 – U+A63F (42240–42559)
ꔁ ꔂ ꕝ ꕢ
Cyrillic Extended-B U+A640 – U+A69F (42560–42655)
Ꙃ ꙉ ꙮ Ꚗ
Bamum U+A6A0 – U+A6FF (42656–42751)
ꚠ ꛠ ꛕ꛰ ꛷
Modifier Tone Letters U+A700 – U+A71F (42752–42783)
꜁ ꜉ ꜜ ꜟ
Latin Extended-D U+A720 – U+A7FF (42784–43007)
Ꜣ Ꜯ ꝿ ꟿ
Syloti Nagri U+A800 – U+A82F (43008–43055)
ꠀ ꠇ ꠠꠤ ꠪
Common Indic Number Forms U+A830 – U+A83F (43056–43071)
꠰ ꠶ ꠸ ꠹
Phags-pa U+A840 – U+A87F (43072–43135)
ꡁ ꡧ ꡳ ꡷
Saurashtra U+A880 – U+A8DF (43136–43231)
ꢝꢁ ꢍ ꢳ ꣕
Devanagari Extended U+A8E0 – U+A8FF (43232–43263)
ठ꣠ ठ꣮ ꣳ ꣻ
Kayah Li U+A900 – U+A92F (43264–43311)
꤅ ꤎ ꤍꤪ ꤮
Rejang U+A930 – U+A95F (43312–43359)
ꤰ ꤸ ꤷꥐ ꥟
Hangul Jamo Extended-A U+A960 – U+A97F (43360–43391)
ꥠ ꥪ ꥴ ꥼ
Javanese U+A980 – U+A9DF (43392–43487)
ꦮꦀ ꦣ ꦮꦺ ꧙
Cham U+AA00 – U+AA5F (43520–43615)
ꨅ ꨍ ꨂꨬ ꩖
Myanmar Extended-A U+AA60 – U+AA7F (43616–43647)
ꩠ ꩮ ꩴ ဂꩻ
Tai Viet U+AA80 – U+AADF (43648–43743)
ꪀ ꪙ ꪒꪷ ꫟
Meetei Mayek U+ABC0 – U+ABFF (43968–44031)
ꯀ ꯌ ꯁꯧ ꯹
Hangul Syllables U+AC00 – U+D7A3 (44032–55203)
가 뮀 윸 힣
Hangul Jamo Extended-B U+D7B0 – U+D7FF (55216–55295)
ힰ ퟎ ퟡ ퟻ
Private Use Area U+E000 – U+F8FF (57344–63743)
   
CJK Compatibility Ideographs U+F900 – U+FAFF (63744–64255)
豈 朗 歷 館
Alphabetic Presentation Forms U+FB00 – U+FB4F (64256–64335)
ﬀ ﬁ ﬗ ﭏ
Arabic Presentation Forms-A U+FB50 – U+FDFF (64336–65023)
ﭐ ﰡ ﲼ ﷻ
Variation Selectors U+FE00 – U+FE0F (65024–65039)
These characters are not permitted in HTML
Combining Half Marks U+FE20 – U+FE2F (65056–65071)
o︠ o︡ o︢ o︣
CJK Compatibility Forms U+FE30 – U+FE4F (65072–65103)
︴ ︵ ﹃ ﹌
Small Form Variants U+FE50 – U+FE6F (65104–65135)
﹖ ﹠ ﹩ ﹫
Arabic Presentation Forms-B U+FE70 – U+FEFF (65136–65279)
ﹰ ﺗ ﺺ ﻼ
Halfwidth and Fullwidth Forms U+FF00 – U+FFEF (65280–65519)
３ Ｆ ｶ ﾺ
Specials U+FEFF, U+FFF0 – U+FFFF (65279, 65520–65535)
   �
Linear B Syllabary U+10000 – U+1007F (65536–65663)
𐀀 𐀢 𐁀 𐁝
Linear B Ideograms U+10080 – U+100FF (65664–65791)
𐂀 𐂚 𐃃 𐃺
Aegean Numbers U+10100 – U+1013F (65792–65855)
𐄀 𐄎 𐄱 𐄸
Ancient Greek Numbers U+10140 – U+1018F (65856–65935)
𐅃 𐅉 𐅓 𐆉
Ancient Symbols U+10190 – U+101CF (65936–65999)
𐆐 𐆔 𐆘 𐆚
Phaistos Disc U+101D0 – U+101FF (66000–66047)
𐇐 𐇛 𐇯 𐇹
Lycian U+10280 – U+1029F (66176–66207)
𐊀 𐊉 𐊕 𐊚
Carian U+102A0 – U+102DF (66208–66271)
𐊡 𐊨 𐊾 𐋋
Old Italic U+10300 – U+1032F (66304–66351)
𐌀 𐌊 𐌜 𐌢
Gothic U+10330 – U+1034F (66352–66383)
𐌰 𐌸 𐍂 𐍊
Ugaritic U+10380 – U+1039F (66432–66463)
𐎀 𐎇 𐎖 𐎟
Deseret U+10400 – U+1044F (66560–66639)
𐐂 𐐉 𐐯 𐑉
Shavian U+10450 – U+1047F (66640–66687)
𐑐 𐑝 𐑫 𐑿
Osmanya U+10480 – U+104AF (66688–66735)
𐒀 𐒎 𐒝 𐒨
Cypriot Syllabary U+10800 – U+1083F (67584–67647)
𐠀 𐠓 𐠦 𐠿
Imperial Aramaic U+10840 – U+1085F (67648–67679)
𐡀 𐡋 𐡓 𐡟
Phoenician U+10900 – U+1091F (67840–67871)
𐤀 𐤈 𐤔 𐤕
Lydian U+10920 – U+1093F (67872–67903)
𐤠 𐤩 𐤰 𐤿
Kharoshthi U+10A00 – U+10A5F (68096–68191)
𐨀 𐨨𐨍 𐨲 𐩅
Old South Arabian U+10A60 – U+10A7F (68192–68223)
𐩠 𐩯 𐩽 𐩿
Avestan U+10B00 – U+10B3F (68352–68415)
𐬀 𐬟 𐬩 𐬿
Inscriptional Parthian U+10B40 – U+10B5F (68416–68447)
𐭀 𐭉 𐭚 𐭟
Inscriptional Pahlavi U+10B60 – U+10B7F (68448–68479)
𐭠 𐭬 𐭹 𐭿
Old Turkic U+10C00 – U+10C4F (68608–68687)
𐰀 𐰕 𐰯 𐱈
Rumi Numeral Symbols U+10E60 – U+10E7F (69216–69247)
𐹠 𐹮 𐹵 𐹻
Kaithi U+11080 – U+110CF (69760–69839)
𑂞𑂀 𑂚 𑂞𑂴 𑃁
Cuneiform U+12000 – U+123FF (73728–74751)
𒀀 𒀞 𒅑 𒍦
Cuneiform Numbers and Punctuation U+12400 – U+1247F (74752–74879)
𒐁 𒐌 𒐥 𒑳
Egyptian Hieroglyphs U+13000 – U+1342F (77824–78895)
𓀀 𓅸 𓉀 𓐮
Byzantine Musical Symbols U+1D000 – U+1D0FF (118784–119039)
𝁆 𝂋 𝃩 𝃰
Musical Symbols U+1D100 – U+1D1FF (119040–119295)
𝄁 𝄫 𝅘𝅥𝅮 𝇇
Tai Xuan Jing Symbols U+1D300 – U+1D35F (119552–119647)
𝌀 𝌃 𝌑 𝍊
Counting Rod Numerals U+1D360 – U+1D37F (119648–119679)
𝍠 𝍨 𝍬 𝍱
Mathematical Alphanumeric Symbols U+1D400 – U+1D7FF (119808–120831)
𝓐 𝕬 𝝃 𝟽
Mahjong Tiles U+1F000 – U+1F02F (126976–127023)
🀀 🀍 🀒 🀝
Domino Tiles U+1F030 – U+1F09F (127024–127135)
🀴 🁓 🁮 🂈
Enclosed Alphanumeric Supplement U+1F100 – U+1F1FF (127232–127487)
🄀 🄖 🄭 🆐
Enclosed Ideographic Supplement U+1F200 – U+1F2FF (127488–127743)
🈐 🈖 🈪 🉈
CJK Unified Ideographs Extension B U+20000 – U+2A6D6 (131072–173782)
𠀧 𠤩 𡨺 𡽫
CJK Unified Ideographs Extension C U+2A700 – U+2B73F (173824–177983)
𪜀 𪮘 𪾀 𫜴
CJK Compatibility Ideographs Supplement U+2F800 – U+2FA1F (194560–195103)
勺 卉 善 爨
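
Each block header above pairs a hexadecimal code point range with its decimal equivalent. The following is a minimal sketch, in Python, of one way to check that the two ranges in a header agree. The regular expression and the `check_header` helper are illustrative assumptions and are not part of the original file; headers that do not follow the single-range pattern (for example the Specials line) are simply not matched.

```python
import re

# Matches header lines such as "Greek U+0370 – U+03FF (880–1023)".
# The separator is an en dash (U+2013), as used in the headers above.
HEADER = re.compile(
    r"^(?P<name>.+?)\s+U\+(?P<lo>[0-9A-F]+)\s*–\s*U\+(?P<hi>[0-9A-F]+)"
    r"\s*\((?P<dlo>\d+)\s*–\s*(?P<dhi>\d+)\)$"
)


def check_header(line):
    """Return (name, ok) for a block header, or None if the line is not one.

    ok is True when both hexadecimal endpoints equal their decimal forms.
    """
    m = HEADER.match(line.strip())
    if m is None:
        return None
    ok = (int(m["lo"], 16) == int(m["dlo"])
          and int(m["hi"], 16) == int(m["dhi"]))
    return m["name"], ok


if __name__ == "__main__":
    print(check_header("Greek U+0370 – U+03FF (880–1023)"))
```

Running the helper over every line of the file and collecting the headers where `ok` is `False` would flag any range whose decimal form has drifted from its hexadecimal form.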