Created
April 20, 2016 17:52
Reading a Unicode file with genfromtxt in Python 3
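import numpy as np
# Thanks to http://stackoverflow.com/questions/33001373/loading-utf-8-file-in-python-3-using-numpy-genfromtxt
# The number before each converter is the column index in the delimited file (0-based).
# Read as-is, the text column comes back as bytes, which cannot be assigned to a unicode string field, so decode it explicitly.
# The isinstance check covers NumPy configurations that already pass str to the converter.
dtype = np.dtype([('eid', int), ('feature', 'U82')])  # np.int was removed in NumPy 1.24; the builtin int is equivalent
np.genfromtxt("foo.tsv", delimiter='\t', dtype=dtype,
              converters={1: lambda x: x.decode('utf_8') if isinstance(x, bytes) else x})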
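As a possible alternative to the converter, NumPy 1.14 and later accept an `encoding` argument in `genfromtxt`, which lets the function decode the file itself. The following is a minimal sketch under that assumption, not the gist author's method: the helper name `load_features` and the sample rows are hypothetical and stand in for the real `foo.tsv`.

```python
import os
import tempfile

import numpy as np


def load_features(path):
    """Read a two-column UTF-8 TSV (integer id, text) into a structured array."""
    dtype = np.dtype([("eid", int), ("feature", "U82")])
    # encoding="utf-8" asks genfromtxt to decode the file itself,
    # so the string column needs no converter.
    return np.genfromtxt(path, delimiter="\t", dtype=dtype, encoding="utf-8")


# Hypothetical sample data standing in for foo.tsv.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "foo.tsv")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("1\t東京\n2\t大阪\n")
    data = load_features(sample_path)
```

With this approach the `feature` field is filled with `str` values directly, so the column index in `converters` no longer has to be kept in sync with the file layout.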
For the second line, this one worked for me