@CMCDragonkai
Created June 5, 2018 06:32
Memory Mapped Numpy Array (out-of-core) #python #numpy
import tempfile
import numpy as np
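# the backing dtype is important:
# numpy arrays must have fixed-size elements,
# so if you want to do out-of-core analytics, you need to know your element size up front
# for example, here we use a byte length of 10
# the 'S' here means a fixed-length bytestring, so you can store bytes values of up to 10 bytes
# (shorter values are padded with null bytes, longer values are truncated)
# you can also use `('U', 10)` for unicode strings of up to 10 code points
# numpy stores those as 4 bytes per code point (UCS-4, not utf-8), so 40 bytes per element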
backing_type = np.dtype(('S', 10))
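# you need to know the shape ahead of time
# a memmap cannot be resized in place,
# but you can map the same backing file again later with a different shape
# make sure to only ever "enlarge" it,
# otherwise the new array no longer covers the data at the end of the file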
backing_shape = (100, )
with tempfile.TemporaryFile() as backing_f:
    arr = np.memmap(
        backing_f, dtype=backing_type, mode='r+', shape=backing_shape)
    # the % operator is needed here, print() does not format its arguments
    print('Using %d bytes' % backing_f.tell())
    arr[0] = b'abc'
    arr[1] = 'abc'.encode()
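
The comments above say the shape can be enlarged later. The following is a minimal sketch of one way to do that, not necessarily what the author had in mind: write to a named backing file, drop the mapping, extend the file explicitly, and map it again with a larger shape. The file name `backing.bin` and the sizes 100 and 200 are assumptions chosen for illustration.

```python
import os
import tempfile

import numpy as np

backing_type = np.dtype(('S', 10))

with tempfile.TemporaryDirectory() as tmp_dir:
    # hypothetical file name, used only for this sketch
    path = os.path.join(tmp_dir, 'backing.bin')

    # create the backing file with room for 100 elements
    arr = np.memmap(path, dtype=backing_type, mode='w+', shape=(100,))
    arr[0] = b'abc'
    arr.flush()
    del arr  # drop the old mapping before changing the file

    # extend the file so it can hold 200 elements; the new bytes are zero-filled
    with open(path, 'r+b') as f:
        f.truncate(200 * backing_type.itemsize)

    # map the same file again with the larger shape
    grown = np.memmap(path, dtype=backing_type, mode='r+', shape=(200,))
    first = bytes(grown[0])    # data written before growing is still there
    last = bytes(grown[199])   # new elements read back as empty bytestrings
    size_on_disk = os.path.getsize(path)
    del grown  # release the mapping before the directory is removed
```

Extending the file by hand keeps the sketch independent of how a particular numpy version treats a backing file that is shorter than the requested shape.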