@RaffaeleSgarro
Last active August 29, 2015 14:11
package stackoverflow;

import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreekCodePoints {

    public static void main(String... args) {
        // Each \UXXXX escape in this input holds a single byte value (00 to ff).
        String str = "\\U00ce\\U009c\\U00ce\\U00b1\\U00ce\\U00bd\\U00cf\\U008c\\U00ce\\U00bb\\U00ce\\U00b7\\U00cf\\U0082";

        // Match a literal backslash, 'U', and exactly four hexadecimal digits.
        Pattern p = Pattern.compile("\\\\U([0-9a-fA-F]{4})");
        Matcher m = p.matcher(str);
        List<Integer> codePoints = new ArrayList<>();
        while (m.find()) {
            codePoints.add(Integer.parseInt(m.group(1), 16));
        }

        // Keep only the low byte of each parsed value.
        byte[] bytes = new byte[codePoints.size()];
        for (int i = 0; i < codePoints.size(); i++) {
            bytes[i] = codePoints.get(i).byteValue();
        }

        // Decode the same bytes with every charset the JVM provides and print each result.
        for (Map.Entry<String, Charset> charset : Charset.availableCharsets().entrySet()) {
            System.out.println(charset.getKey() + ": " + new String(bytes, charset.getValue()));
        }
    }
}
Big5: �帢彖�弇庢�
Big5-HKSCS: �帢彖�弇庢�
EUC-JP: �留僚�了侶�
EUC-KR: �慣館�貫管�
GB18030: 螠伪谓蠈位畏蟼
GB2312: �伪谓�位畏�
GBK: 螠伪谓蠈位畏蟼
IBM-Thai: ุลุ๑ุาูฟุะุ๗ูb
IBM00858: ╬£╬▒╬¢¤î╬╗╬À¤é
IBM01140: óæó£ó¨õðó]ó¼õb
IBM01141: óæó£ó¨õðó|ó¼õb
IBM01142: ó{ó£ó¨õðó|ó¼õb
IBM01143: óæó£ó¨õðó|ó¼õb
IBM01144: óæó#ó¨õðó|ó¼õb
IBM01145: óæó£ó~õðó!ó¼õb
IBM01146: óæó[ó¨õðó]ó¼õb
IBM01147: óæó#ó~õðó|ó¼õb
IBM01148: óæó£ó¨õðó|ó¼õb
IBM01149: ó}ó£ó¨õ`ó|ó¼õb
IBM037: óæó£ó¨õðó]ó¼õb
IBM1026: óæó£ó¨õ}ó|ó¼õb
IBM1047: óæó£ó]õðó¨ó¼õb
IBM273: óæó£ó¨õðó|ó¼õb
IBM277: ó{ó£ó¨õðó|ó¼õb
IBM278: óæó£ó¨õðó|ó¼õb
IBM280: óæó#ó¨õðó|ó¼õb
IBM284: óæó£ó~õðó!ó¼õb
IBM285: óæó[ó¨õðó]ó¼õb
IBM297: óæó#ó~õðó|ó¼õb
IBM420: �ﻋ�ل�نوﺻ�م��وb
IBM424: ���£�¨���]�¼�b
IBM437: Μανόλης
IBM500: óæó£ó¨õðó|ó¼õb
IBM775: ╬£╬▒╬ĮŽī╬╗╬ĘŽé
IBM850: ╬£╬▒╬¢¤î╬╗╬À¤é
IBM852: ╬ť╬▒╬Ż¤î╬╗╬̤é
IBM855: ╬ю╬▒╬й¤ї╬╗╬и¤ѓ
IBM857: ╬£╬▒╬¢¤î╬╗╬À¤é
IBM860: ╬£╬▒╬╜╧Ô╬╗╬╖╧é
IBM861: ╬£╬▒╬╜╧ð╬╗╬╖╧é
IBM862: ╬£╬▒╬╜╧ל╬╗╬╖╧ג
IBM863: Μανόλης
IBM864: ﺧ�ﺧ١ﺧﺵﺩ┐ﺧ؛ﺧ٧ﺩ∙
IBM865: Μανόλης
IBM866: ╬Ь╬▒╬╜╧М╬╗╬╖╧В
IBM868: ╬ﺝ╬▒╬ﺿﻊ؟╬╗╬ﺻﻊ۲
IBM869: ╬£╬▒╬ΞΣ’╬╗╬ΜΣ�
IBM870: óšóĄó¨őđóŃóźőb
IBM871: ó}ó£ó¨õ`ó|ó¼õb
IBM918: ﮞﺵﮞﻍﮞﮔﻥﺫﮞ|ﮞﻕﻥb
ISO-2022-CN: Μανόλης
ISO-2022-JP: ��������������
ISO-2022-JP-2: ��������������
ISO-2022-KR: Μανόλης
ISO-8859-1: Μανόλης
ISO-8859-13: ĪœĪ±Ī½ĻŒĪ»Ī·Ļ‚
ISO-8859-15: ΜαΜόλης
ISO-8859-2: Μανόλης
ISO-8859-3: ΜÎħνόÎğης
ISO-8859-4: ΜαÎŊĪŒÎģΡĪ‚
ISO-8859-5: ЮœЮБЮНЯŒЮЛЮЗЯ‚
ISO-8859-6: خœخ�خ�دŒخ؛خ�د‚
ISO-8859-7: ΞœΞ±Ξ½ΟŒΞ»Ξ·Ο‚
ISO-8859-8: �œ�±�½�Œ�»�·�‚
ISO-8859-9: Μανόλης
JIS_X0201: ホ�ホアホスマ�ホサホキマ�
JIS_X0212-1990: �������
KOI8-R: н°н╠н╫о▄н╩н╥о┌
KOI8-U: н°н╠нҐо▄н╩нЇо┌
Shift_JIS: ホ慚アホスマ湖サホキマ�
TIS-620: ฮ�ฮฑฮฝฯ�ฮปฮทฯ�
US-ASCII: ��������������
UTF-16: 캜캱캽쾌캻캷쾂
UTF-16BE: 캜캱캽쾌캻캷쾂
UTF-16LE: 鳎뇎뷎賏믎럎苏
UTF-32: ����
UTF-32BE: ����
UTF-32LE: ����
UTF-8: Μανόλης
windows-1250: Μανόλης
windows-1251: ОњО±ОЅПЊО»О·П‚
windows-1252: Μανόλης
windows-1253: �αν�λης
windows-1254: Μανόλης
windows-1255: ־�־±־½ֿ�־»־·ֿ‚
windows-1256: خœخ±خ½دŒخ»خ·د‚
windows-1257: Ī�Ī±Ī½Ļ�Ī»Ī·Ļ‚
windows-1258: Μανόλης
windows-31j: ホ慚アホスマ湖サホキマ�
x-Big5-HKSCS-2001: �帢彖�弇庢�
x-Big5-Solaris: �帢彖�弇庢�
x-euc-jp-linux: �留僚�了侶�
x-EUC-TW: ��杰武��杲枚��
x-eucJP-Open: �留僚�了侶�
x-IBM1006: ﺳœﺳﺎﺳﺛﺵŒﺳﭨﺳﺓﺵ‚
x-IBM1025: ЛмЛыЛЕМфЛЦЛъМb
x-IBM1046: ﺥﻶﺥ١ﺥﺿﺩ┐ﺥ؛ﺥ٧ﺩ÷
x-IBM1097: ﻭﺿﻭﻐﻭﻟﻩﺱﻭ]ﻭﻛﻩb
x-IBM1098: ╬ﭺ╬▒╬¤�ﺀ╬╗╬ﻁ�،
x-IBM1112: óæó£óĶõāó]ó¼õb
x-IBM1122: óæó£ó¨õšó|ó¼õb
x-IBM1123: ЛмЛыЛЕМфЛЦЛъМb
x-IBM1124: ЮœЮБЮНЯŒЮЛЮЗЯ‚
x-IBM1364: �ᅩ�����ᅣ�ᅴ���b
x-IBM1381: �伪谓�位畏�
x-IBM1383: �伪谓�位畏�
x-IBM33722: �留僚�了侶�
x-IBM737: ╬ε╬▒╬╜╧Ν╬╗╬╖╧Γ
x-IBM833: �ᅩ�����ᅣ�ᅴ���b
x-IBM834: �����풩�
x-IBM856: ╬£╬▒╬¢¤ל╬╗╬�¤ג
x-IBM874: ฮ�ฮฑฮฝฯ�ฮปฮทฯ�
x-IBM875: ‘ι‘ά‘φ―γ‘τ‘ύ―b
x-IBM921: ĪœĪ±Ī½ĻŒĪ»Ī·Ļ‚
x-IBM922: Μανόλης
x-IBM930: ���¢�ン�サ�ロ�x�イ
x-IBM933: �ᅩ�����ᅣ�ᅴ���b
x-IBM935: ���������]���b
x-IBM937: ���������]���b
x-IBM939: �ヒ�£�]�ナ�ワ�リ�b
x-IBM942: ホ慚アホスマ湖サホキマ�
x-IBM942C: ホ慚アホスマ湖サホキマ�
x-IBM943: ホ慚アホスマ湖サホキマ�
x-IBM943C: ホ慚アホスマ湖サホキマ�
x-IBM948: 罍蠠譅籙譹襩瓙
x-IBM949: �慣館�貫管�
x-IBM949C: �慣館�貫管�
x-IBM950: �帢彖�弇庢�
x-IBM964: �杰武�杲枚�
x-IBM970: �慣館�貫管�
x-ISCII91: य़�य़औय़टर�य़झय़ङर�
x-ISO-2022-CN-CNS: Μανόλης
x-ISO-2022-CN-GB: Μανόλης
x-iso-8859-11: ฮœฮฑฮฝฯŒฮปฮทฯ‚
x-JIS0208: �������
x-JISAutoDetect: ホ慚アホスマ湖サホキマ�
x-Johab: 풒풤풯픫풭풩픡
x-MacArabic: خúخ١خ�د«خ؛خ٧دÇ
x-MacCentralEurope: őúőĪőĹŌĆőĽő∑Ōā
x-MacCroatian: Μανόλης
x-MacCyrillic: ќЬќ±ќљѕМќїќЈѕВ
x-MacDingbat: ➎�➎⑥➎❽➏�➎❻➎❷➏�
x-MacGreek: Έ­Έ±ΈΫœ¨ΈΜΈΖœ²
x-MacHebrew: ֶúֶ�ֶ�ִåֶ�ֶ�ִÇ
x-MacIceland: Μανόλης
x-MacRoman: Μανόλης
x-MacRomania: Μανόλης
x-MacSymbol: ∈�∈±∈∉�∈≈∈•∉�
x-MacThai: ฮฮฑฮฝฯฮปฮทฯ…
x-MacTurkish: Μανόλης
x-MacUkraine: ќЬќ±ќљѕМќїќЈѕВ
x-MS932_0213: ホ慚アホスマ湖サホキマ�
x-MS950-HKSCS: �帢彖�弇庢�
x-MS950-HKSCS-XP: �帢彖�弇庢�
x-mswin-936: 螠伪谓蠈位畏蟼
x-PCK: ホ慚アホスマ湖サホキマ�
x-SJIS_0213: ホ慚アホスマ湖サホキマ�
x-UTF-16LE-BOM: 鳎뇎뷎賏믎럎苏
X-UTF-32BE-BOM: ����
X-UTF-32LE-BOM: ����
x-windows-50220: ��������������
x-windows-50221: ��������������
x-windows-874: ฮ�ฮฑฮฝฯ�ฮปฮทฯ�
x-windows-949: �慣館�貫管�
x-windows-950: �帢彖�弇庢�
x-windows-iso2022jp: ��������������
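In the listing above, the UTF-8 line is the one that yields readable Greek, which suggests the escaped values are the bytes of a UTF-8 encoded string. Under that assumption, the brute-force loop over every charset can be skipped. The following is a minimal sketch rather than part of the original gist; the class name `EscapedUtf8Decoder` and the helper `decode` are hypothetical names introduced only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapedUtf8Decoder {

    // Matches a literal backslash, 'U', and exactly four hexadecimal digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\U([0-9a-fA-F]{4})");

    // Hypothetical helper: treats every \UXXXX escape as one byte
    // and decodes the collected bytes as UTF-8 (an assumption about the input).
    public static String decode(String escaped) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        Matcher m = ESCAPE.matcher(escaped);
        while (m.find()) {
            int value = Integer.parseInt(m.group(1), 16);
            if (value > 0xFF) {
                // Fail loudly instead of silently truncating to the low byte.
                throw new IllegalArgumentException("Not a single byte: " + m.group());
            }
            bytes.write(value);
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String str = "\\U00ce\\U009c\\U00ce\\U00b1\\U00ce\\U00bd\\U00cf\\U008c"
                + "\\U00ce\\U00bb\\U00ce\\U00b7\\U00cf\\U0082";
        // Console rendering depends on the terminal's own encoding.
        System.out.println(decode(str));
    }
}
```

Unlike the `byteValue()` call in the original program, this sketch rejects any escape above `ff`, so an input that really contains code points rather than bytes is reported instead of being decoded into something misleading.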