@paddy74
Created August 28, 2018 19:10
Create a memory-mapped array comprised of the concatenation of every .npy file in a directory
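import os

import numpy as np


def concatenate_array_files(src_dir, file_name):
    """Create a memory-mapped array comprised of the concatenation of every
    .npy file in a directory.

    Parameters
    ----------
    src_dir : str
        Directory of 2D .npy arrays with matching feature counts
    file_name : str
        File in which to save the concatenated arrays (raw float32 data)
    """
    # Sort so the row order of the output is deterministic
    files = sorted(f for f in os.listdir(src_dir) if f.endswith('.npy'))
    if not files:
        raise ValueError("no .npy files found in " + src_dir)

    # First pass: count the rows so the output can be allocated once
    total_samples = 0
    for f in files:
        pat_arr = np.load(os.path.join(src_dir, f), mmap_mode='r')
        n_samples, n_features = pat_arr.shape
        total_samples += n_samples

    # Create the concatenated file
    big_arr = np.memmap(file_name, dtype=np.float32, mode='w+',
                        shape=(total_samples, n_features))

    # Second pass: copy each array into its own block of rows
    start = 0
    for f in files:
        pat_arr = np.load(os.path.join(src_dir, f), mmap_mode='r')
        big_arr[start:start + pat_arr.shape[0], :] = pat_arr
        start += pat_arr.shape[0]

    # Write pending changes to disk before handing the array back
    big_arr.flush()
    return big_arr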
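
A file written through np.memmap holds raw bytes with no header, so the dtype and shape have to be supplied again when it is reopened. The following is a minimal sketch of one way to do that, assuming float32 data and a feature count known to the caller. The helper `open_concatenated` and the file name `big_arr.dat` are hypothetical and not part of the original gist.

```python
import os
import tempfile

import numpy as np


def open_concatenated(file_name, n_features, dtype=np.float32):
    """Reopen a raw memmap file read-only, inferring the row count from its size.

    Hypothetical helper: assumes the file holds a 2D array of `dtype`
    values with `n_features` columns and nothing else.
    """
    itemsize = np.dtype(dtype).itemsize
    n_values = os.path.getsize(file_name) // itemsize
    if n_values % n_features != 0:
        raise ValueError("file size is not a multiple of n_features")
    return np.memmap(file_name, dtype=dtype, mode='r',
                     shape=(n_values // n_features, n_features))


# Usage: write a small raw file the same way the gist does, then reopen it
path = os.path.join(tempfile.mkdtemp(), 'big_arr.dat')
data = np.arange(12, dtype=np.float32).reshape(4, 3)
out = np.memmap(path, dtype=np.float32, mode='w+', shape=data.shape)
out[:] = data
out.flush()  # make sure the data reaches the file
del out      # release the writable mapping

arr = open_concatenated(path, n_features=3)
```

If a self-describing file is preferable, np.lib.format.open_memmap creates a memory-mapped .npy file whose header records the shape and dtype, so np.load(..., mmap_mode='r') can reopen it without extra arguments.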