@albarrentine
Created May 2, 2012 14:44
numpy.fromiter using a custom generator and data type
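import numpy

# Imagine these come from a db cursor or something
coordinates = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]


def my_gen(rows):
    # Yield one (x, y, z) tuple per row; a structured dtype expects tuples
    for x, y, z in rows:
        yield x, y, z


# 'l' is the platform C long: 8 bytes on 64-bit Linux and macOS, hence '<i8' below
a = numpy.fromiter(my_gen(coordinates), dtype=[('x', 'l'), ('y', 'l'), ('z', 'l')])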
"""
This creates a numpy structured array (fields addressable by name), which looks like:
array([(1, 2, 3), (4, 5, 6), (7, 8, 9)],
      dtype=[('x', '<i8'), ('y', '<i8'), ('z', '<i8')])
>>> a['x']
array([1, 4, 7])

Some timings (Python 2, hence xrange):
python -m timeit -n1 -r100 "a = []" "for i in xrange(1000000):" " a.append(i)"
1 loops, best of 100: 93.1 msec per loop
python -m timeit -n1 -r100 -s "import numpy" "a = numpy.fromiter(xrange(1000000), dtype=numpy.int64)"
1 loops, best of 100: 74.3 msec per loop
python -m timeit -n1 -r100 -s "import numpy" "def gen():" " for i in xrange(1000000):" " yield i" "l = []" "for i in gen():" " l.append(i)"
1 loops, best of 100: 247 msec per loop
python -m timeit -n1 -r100 -s "import numpy" "def gen():" " for i in xrange(1000000):" " yield i" "a = numpy.fromiter(gen(), dtype=numpy.int64)"
1 loops, best of 100: 123 msec per loop
"""