@Santarh
Last active January 1, 2016 23:39
Converting character encoding in C#
using System;
using System.Text;

public class Test
{
    public static void Main()
    {
        string str_unicode = "あいうえお"; // (あ) U+3042, (い) U+3044 ...
        var unicode = Encoding.Unicode; // Encoding.Unicode is UTF-16LE in .NET
        var utf8 = Encoding.UTF8;

        byte[] byte_unicode = unicode.GetBytes(str_unicode);
        // UTF-16LE stores the low byte first, so (あ) is 0x42 (66), 0x30 (48)
        Console.WriteLine("Unicode 'あ' low byte: " + byte_unicode[0]);
        Console.WriteLine("Unicode 'あ' high byte: " + byte_unicode[1]);

        // Convert from Unicode (UTF-16LE) to UTF-8
        byte[] byte_utf8 = Encoding.Convert(unicode, utf8, byte_unicode);
        // Decode the UTF-8 bytes back into a string (same text as str_unicode)
        string string_utf8 = utf8.GetString(byte_utf8);
        // In UTF-8, (あ) is three bytes: 0xE3 (227), 0x81 (129), 0x82 (130)
        Console.WriteLine("UTF-8 'あ' byte 1: " + byte_utf8[0]);
        Console.WriteLine("UTF-8 'あ' byte 2: " + byte_utf8[1]);
        Console.WriteLine("UTF-8 'あ' byte 3: " + byte_utf8[2]);
    }
}
Santarh commented Jan 2, 2014

Output:

Unicode 'あ' low byte: 66
Unicode 'あ' high byte: 48
UTF-8 'あ' byte 1: 227
UTF-8 'あ' byte 2: 129
UTF-8 'あ' byte 3: 130
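
The decimal values above are easier to compare against code charts when printed in hex. The following is a minimal sketch of that cross-check, not part of the original gist: the class name `EncodingRoundTrip` and the helper `ToHex` are hypothetical names introduced here, and only standard `System.Text.Encoding` and `BitConverter` calls are used. It also decodes the UTF-8 bytes back to a string to confirm that the conversion is lossless for this character.

```csharp
using System;
using System.Text;

public static class EncodingRoundTrip
{
    // Hypothetical helper (not in the original gist):
    // formats a byte array as space-separated uppercase hex.
    public static string ToHex(byte[] bytes)
    {
        // BitConverter.ToString yields e.g. "E3-81-82"
        return BitConverter.ToString(bytes).Replace("-", " ");
    }

    public static void Main()
    {
        string text = "あ"; // U+3042

        // Encoding.Unicode is UTF-16LE: low byte first
        byte[] utf16 = Encoding.Unicode.GetBytes(text);
        // Re-encode the same character as UTF-8
        byte[] utf8 = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16);

        Console.WriteLine("UTF-16LE: " + ToHex(utf16)); // 42 30
        Console.WriteLine("UTF-8:    " + ToHex(utf8));  // E3 81 82

        // Decoding the UTF-8 bytes should give back the original text
        string decoded = Encoding.UTF8.GetString(utf8);
        Console.WriteLine("Round trip OK: " + (decoded == text)); // True
    }
}
```

Here 0x42 0x30 and 0xE3 0x81 0x82 are the same bytes as 66, 48 and 227, 129, 130 in the output above.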
