Created
June 5, 2018 06:32
Memory Mapped Numpy Array (out-of-core) #python #numpy
import tempfile
import numpy as np

# the backing type is important
# numpy arrays must have fixed-size elements
# so if you want to do out-of-core analytics, you need to know your element size
# for example, here we use a byte length of 10
# the 'S' here means fixed-width bytes, thus you can put bytestrings into it
# you can also use `('U', 10)` for unicode strings, this would mean 10 unicode
# code points (numpy stores these as UTF-32, so 40 bytes per element, not utf-8)
backing_type = np.dtype(('S', 10))

# you need to know the shape ahead of time
# but you can also incrementally reshape it
# make sure to always "enlarge" it
# otherwise you risk losing data
backing_shape = (100, )

with tempfile.TemporaryFile() as backing_f:
    arr = np.memmap(
        backing_f, dtype=backing_type, mode='r+', shape=backing_shape)
    # the file has been extended to fit the mapped shape (100 * 10 bytes)
    print('Using %d bytes' % backing_f.tell())
    arr[0] = b'abc'
    arr[1] = 'abc'.encode()
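
The comments above say the array can be enlarged later. One way this might be done is to drop the current mapping and map the same file again with a larger shape. The sketch below assumes, as the snippet above already does, that `np.memmap` in `'r+'` mode extends a file that is too short for the requested shape. `map_items` is a hypothetical helper introduced only for this illustration, not part of NumPy.

```python
import tempfile

import numpy as np

# Same element type as in the snippet above: fixed-width 10-byte strings.
backing_type = np.dtype(('S', 10))


def map_items(f, n_items):
    """Hypothetical helper: map the first n_items elements of file object f.

    Assumes 'r+' mode grows the file when it is shorter than the mapping.
    """
    return np.memmap(f, dtype=backing_type, mode='r+', shape=(n_items,))


with tempfile.TemporaryFile() as f:
    arr = map_items(f, 100)
    arr[0] = b'abc'
    arr.flush()  # write pending changes to the file before remapping
    del arr      # drop the old mapping

    # "Enlarge" by mapping the same file again with a bigger shape.
    bigger = map_items(f, 200)
    first = bytes(bigger[0])     # data written through the old mapping
    last = bytes(bigger[199])    # newly added region, zero-filled
    n_bytes = bigger.nbytes      # 200 elements * 10 bytes each
    del bigger                   # release the mapping before the file closes
```

Mapping with a smaller shape leaves the file on disk untouched, but the elements past the new end are no longer reachable through the array, which is presumably the data-loss risk the comments warn about.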