Last active
November 8, 2023 00:10
De-interleave data using numpy
import numpy as np
CHANNEL_COUNT = 2
frames = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1])
deinterleaved = [frames[idx::CHANNEL_COUNT] for idx in range(CHANNEL_COUNT)]
print(deinterleaved[0])
# prints "[0 0 0 0 0 0 0 0 0 0]"
print(deinterleaved[1])
# prints "[1 1 1 1 1 1 1 1 1 1]"
This isn't needed, though: you can open your data with a structured dtype that names each channel within a frame. Better still, this lets you open memory-mapped files whose channels have different data types or sizes, e.g. the first channel might be unsigned 16-bit and the second signed 16-bit.
e.g.:
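The snippet that originally followed is missing from this copy, so here is a minimal sketch of what it might look like. The channel names `left` and `right`, the file name `frames.bin`, and the little-endian 16-bit layout are assumptions made for illustration; the demo file is written first only so the example runs on its own.

```python
import os
import tempfile

import numpy as np

# Hypothetical frame layout: one unsigned and one signed 16-bit channel,
# both little-endian. Adjust the field list to match the real data.
frame_dtype = np.dtype([("left", "<u2"), ("right", "<i2")])

# Write a small demo file of interleaved frames so the example is runnable.
source = np.zeros(5, dtype=frame_dtype)
source["left"] = np.arange(5)
source["right"] = -np.arange(5)
path = os.path.join(tempfile.mkdtemp(), "frames.bin")
source.tofile(path)

# Open the file as a read-only memory map; each element is one whole frame.
frames = np.memmap(path, dtype=frame_dtype, mode="r")
print(frames.shape, frames.dtype.itemsize)
```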
Then you can just access it like so (which does the de-interleaving for you):
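The access snippet is also missing here, so the following is a sketch under assumptions: it reuses the interleaved 0/1 data from the gist, stores it as 16-bit values, and views it through a hypothetical two-field dtype with the field names `ch0` and `ch1`.

```python
import numpy as np

# The gist's interleaved data, stored as little-endian 16-bit integers.
interleaved = np.array([0, 1] * 10, dtype="<i2")

# Hypothetical frame layout; the field names are illustrative only.
frame_dtype = np.dtype([("ch0", "<u2"), ("ch1", "<i2")])

# Reinterpret the same bytes as frames; no data is moved.
frames = interleaved.view(frame_dtype)

# Indexing by field name returns a strided view of a single channel.
ch0 = frames["ch0"]
ch1 = frames["ch1"]
print(ch0)
print(ch1)
print(np.shares_memory(ch1, interleaved))
```

The final line checks that the channel view still refers to the original buffer rather than a copy.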
This is better because the data isn't awkwardly copied into a Python list, so NumPy leaves it where it is. A memory-mapped file is very good for this: on a 64-bit system the file could be terabytes in size and this will still work, going as fast as the drive allows.
Or you might take a slice of the memory-mapped file by indexing into it, and then access the channels of that slice. It's a good way to avoid copying all of what could be a huge file!
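A rough sketch of the slicing idea, using a small temporary file in place of a huge one. The file name, the channel names `left` and `right`, and the frame range are hypothetical.

```python
import os
import tempfile

import numpy as np

# Hypothetical frame layout, as an illustration only.
frame_dtype = np.dtype([("left", "<u2"), ("right", "<i2")])

# Stand-in for a large recording: 1000 frames written to a temporary file.
source = np.zeros(1000, dtype=frame_dtype)
source["left"] = np.arange(1000)
source["right"] = -np.arange(1000)
path = os.path.join(tempfile.mkdtemp(), "big.bin")
source.tofile(path)

frames = np.memmap(path, dtype=frame_dtype, mode="r")

# Slice by frame index first; this is still a view onto the mapped file.
window = frames[100:200]

# Then pick channels out of the slice; only the pages touched get read.
left = window["left"]
right = window["right"]
print(left[:3], right[:3])
```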
Works for any number of channels, so long as you can make the right dtype object to describe it.
There are a couple of gotchas, and they only bite if the data isn't a power-of-two number of bytes (1, 2, 4, 8, etc.). In that case you can't do the above, at least not as of NumPy 1.23.5.
E.g., if the data is a set of booleans stored as flags in a byte, you must use 'u1' for the channel type and later use numpy.unpackbits() to separate the flags (you'd name them at that point). It's similar for flags in a word ('u2'), dword ('u4') or qword ('u8'), although in those cases it also matters whether the data is little-endian ('<') or big-endian ('>').
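As an illustration of the single-byte case, here is a small sketch. The frame layout and the flag names `enabled` and `fault` are made up for the example, and `bitorder="little"` (available since NumPy 1.17) is used so that column 0 corresponds to bit 0.

```python
import numpy as np

# Hypothetical frame: one byte of packed flags plus a 16-bit value.
frame_dtype = np.dtype([("flags", "u1"), ("level", "<i2")])
frames = np.array([(0b00000101, -7), (0b10000000, 42)], dtype=frame_dtype)

# unpackbits only accepts uint8 input. Adding a trailing axis gives one
# row of 8 bits per frame; bitorder="little" puts bit 0 in column 0.
bits = np.unpackbits(frames["flags"][:, np.newaxis], axis=1, bitorder="little")

# Name the individual flags (illustrative names only).
enabled = bits[:, 0].astype(bool)
fault = bits[:, 7].astype(bool)
print(bits)
```

Note that `unpackbits` builds a new array, so it is best applied to a slice rather than to an entire huge file.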
There's a way to deal with packed 24-bit data too ('>s3' doesn't work!), but it's a bit awkward: open it as 32-bit, 'break' the stride from 4 to 3 bytes to change how the view accesses the array under the hood, and mask off the extra overlapping byte on every access using a bitwise AND.
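One way the stride trick could look, as a sketch rather than a definitive recipe. It assumes unsigned, little-endian 24-bit samples and uses `numpy.lib.stride_tricks.as_strided`. A padding byte is appended because the final 32-bit read reaches one byte past the last sample; with a real file that last sample would need separate handling.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# Three packed 24-bit samples, little-endian, 3 bytes each (demo data).
samples = [0x010203, 0xFFFFFF, 0x000001]
packed = b"".join(s.to_bytes(3, "little") for s in samples)

# One padding byte so the last 4-byte read stays inside the buffer.
raw = np.frombuffer(packed + b"\x00", dtype=np.uint8)

# View the buffer as 32-bit words, but step only 3 bytes per element.
count = len(samples)
wide = as_strided(raw[:4].view("<u4"), shape=(count,), strides=(3,))

# Each word also contains the first byte of the next sample in its top
# byte, so mask it off with a bitwise AND.
values = wide & np.uint32(0x00FFFFFF)
print(values)
```

`as_strided` does no bounds checking, so `count` and the strides have to be right or the view will read outside the buffer. Signed samples would additionally need sign extension after masking.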
Still better than taking a massive copy though.