Reading a Unicode file with genfromtxt in Python 3
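# Thanks to http://stackoverflow.com/questions/33001373/loading-utf-8-file-in-python-3-using-numpy-genfromtxt
# The key in converters is the column index (0-based) in the delimited file.
# Read normally, the values come back as bytes, which cannot be assigned to a
# Unicode string field, so they are decoded explicitly.
import numpy as np
# np.int was removed in NumPy 1.24; the built-in int is the replacement.
dtype = np.dtype([('eid', int), ('feature', 'U82')])
np.genfromtxt("foo.tsv", delimiter='\t', dtype=dtype, converters={1: lambda x: x.decode('utf_8')})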

tkamishima (Owner, Author) commented Apr 20, 2016

For the second line, this also worked:

np.genfromtxt("foo.tsv", delimiter='\t', dtype=dtype, converters={1:np.char.decode})
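
Both variants above decode the bytes by hand through a converter. NumPy 1.14 and later also accept an `encoding` argument to `genfromtxt`, so the following is a minimal sketch of the same read without any converter, assuming such a NumPy version is installed. The sample rows and the temporary file are hypothetical stand-ins for `foo.tsv`, not data from the original snippet.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample rows standing in for foo.tsv:
# an integer id and a UTF-8 string, separated by a tab.
sample = "1\t日本語\n2\tテキスト\n"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "foo.tsv")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    dtype = np.dtype([("eid", int), ("feature", "U82")])
    # encoding="utf-8" asks genfromtxt to decode the file itself,
    # so the string column needs no converter.
    data = np.genfromtxt(path, delimiter="\t", dtype=dtype, encoding="utf-8")

print(data["eid"], data["feature"])
```

Naming the encoding explicitly also avoids relying on the platform's default encoding, which may not be UTF-8.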