@tkf
Created April 1, 2012 16:25
Character <-> Int conversion in numpy

First thing to do:

>>> import numpy

Char array to int array [1]:

>>> ac = numpy.array('ABC', 'c')
>>> ac                                   #doctest: +NORMALIZE_WHITESPACE
array(['A', 'B', 'C'], dtype='|S1')
>>> ac_as_int = ac.view(numpy.uint8)
>>> ac_as_int                            #doctest: +NORMALIZE_WHITESPACE
array([65, 66, 67],
      dtype=uint8)
>>> # get new copy
>>> numpy.array(ac_as_int)               #doctest: +NORMALIZE_WHITESPACE
array([65, 66, 67], dtype=uint8)

[1] Stolen from [Numpy-discussion] String to integer array of ASCII values.
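
For readers on Python 3, here is a minimal sketch of the same view-versus-copy distinction. It assumes a recent NumPy, where byte strings print as b'A' rather than 'A'. The names chars, codes and snapshot are illustrative and not part of the original gist.

```python
import numpy as np

# Three one-byte strings; NumPy infers dtype 'S1' (one byte per item).
chars = np.array([b'A', b'B', b'C'])

# .view reinterprets the same buffer as uint8 without copying.
codes = chars.view(np.uint8)
print(codes.tolist())                  # [65, 66, 67]
print(np.shares_memory(chars, codes))  # True

# An explicit copy owns its own buffer.
snapshot = codes.copy()

# Writing through the view changes the original character array ...
codes[0] = ord('a')
print(chars.tolist())                  # [b'a', b'B', b'C']

# ... but leaves the copy untouched.
print(snapshot.tolist())               # [65, 66, 67]
```

Because the view shares memory with the original, take a copy whenever the integers should outlive or diverge from the character array.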

Unicode array to int array:

>>> au = numpy.array(list('ABC'), 'U1')
>>> au                                   #doctest: +NORMALIZE_WHITESPACE
array([u'A', u'B', u'C'], dtype='<U1')
>>> au_as_int = au.view(numpy.uint32)
>>> au_as_int
array([65, 66, 67], dtype=uint32)
>>> # get new copy
>>> numpy.array(au_as_int)               #doctest: +NORMALIZE_WHITESPACE
array([65, 66, 67], dtype=uint32)
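
The uint32 values are Unicode code points, which becomes visible once the text leaves ASCII. A small sketch, assuming Python 3 and native byte order; the sample string is arbitrary and not from the original gist:

```python
import numpy as np

# 'A', e-acute (U+00E9) and the euro sign (U+20AC), written with escapes
# so the source stays ASCII-only.
text = 'A\u00e9\u20ac'

# One character per element; each 'U1' item occupies 4 bytes (UCS-4).
au = np.array(list(text), dtype='U1')

# Reinterpret those 4-byte items as unsigned integers.
code_points = au.view(np.uint32)
print(code_points.tolist())              # [65, 233, 8364]

# The result agrees with Python's built-in ord().
expected = [ord(ch) for ch in text]
print(code_points.tolist() == expected)  # True
```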

Int array to char array:

>>> ac_as_int.view('c')                  #doctest: +NORMALIZE_WHITESPACE
array(['A', 'B', 'C'], dtype='|S1')

Int array to unicode array:

>>> au_as_int.view('U1')                 #doctest: +NORMALIZE_WHITESPACE
array([u'A', u'B', u'C'], dtype='<U1')
>>> # you can't do this
>>> ac_as_int.view('U1')                 #doctest: +NORMALIZE_WHITESPACE
Traceback (most recent call last):
  ...
ValueError: new type not compatible with array.
>>> # you need explicit conversion
>>> ac_as_int.astype(numpy.uint32).view('U1') #doctest: +NORMALIZE_WHITESPACE
array([u'A', u'B', u'C'], dtype='<U1')
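
The failing view and the astype fix can be wrapped in a small helper. This is only a sketch under Python 3: codes_to_chars is a hypothetical name, not a NumPy function, and it assumes the input holds valid Unicode code points.

```python
import numpy as np

def codes_to_chars(codes):
    """Return a 'U1' array for an integer array of code points (hypothetical helper)."""
    # astype copies the values into 4-byte items, so the 'U1' view
    # lines up element by element.
    wide = np.asarray(codes).astype(np.uint32)
    return wide.view('U1')

codes8 = np.array([65, 66, 67], dtype=np.uint8)

try:
    # 3 bytes in total cannot be reinterpreted as 4-byte 'U1' items.
    codes8.view('U1')
except ValueError as exc:
    print('view failed:', type(exc).__name__)

chars = codes_to_chars(codes8)
print(chars.tolist())           # ['A', 'B', 'C']
print(''.join(chars.tolist()))  # ABC
```

Unlike a plain view, the helper always returns a new array, because astype copies by default.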

For .view to map elements one-to-one, both dtypes need the same item size; that is the "compatible type" the error above refers to. Here is a quick way to check the size:

>>> numpy.array(0, dtype='c').dtype.itemsize
1
>>> numpy.array(0, dtype=numpy.uint8).dtype.itemsize
1
>>> numpy.array(0, dtype='U1').dtype.itemsize
4
>>> numpy.array(0, dtype=numpy.uint32).dtype.itemsize
4
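
The same sizes can be read from the dtype objects directly, without building an array first. In this sketch, same_itemsize is a hypothetical helper, not part of NumPy:

```python
import numpy as np

def same_itemsize(a, b):
    """True if dtypes a and b occupy the same number of bytes per item."""
    return np.dtype(a).itemsize == np.dtype(b).itemsize

print(np.dtype('c').itemsize)          # 1
print(np.dtype('U1').itemsize)         # 4
print(same_itemsize('c', np.uint8))    # True
print(same_itemsize('U1', np.uint32))  # True
print(same_itemsize('U1', np.uint8))   # False

# Longer unicode dtypes scale by 4 bytes per character.
print(np.dtype('U3').itemsize)         # 12
```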

(Not so many) resources:

ghost commented Apr 30, 2016

Thank you for this gist. Also I've just discovered:

>>> np.char.mod('%c', ac_as_int)
array(['A', 'B', 'C'], dtype='<U1')

@JaeDukSeo

This is an amazing post!

py-mako commented Jul 2, 2021

Very helpful indeed :)
