Hello Adrian,
I am trying something similar, but noticed an issue when trying to convert the array into a pandas dataframe as below:

def work_with_shared_memory(shm_name, shape, dtype):
    print(f'With SharedMemory: {current_process()=}')
    # Locate the shared memory by its name
    shm = SharedMemory(shm_name)
    # Create the np.recarray from the buffer of the shared memory
    np_array = np.recarray(shape=shape, dtype=dtype, buf=shm.buf)
    df = pd.DataFrame.from_records(np_array)  # <=== the process dies on this line
    return np.nansum(np_array.val)

I get the error: "concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending."
I will obviously further manipulate the df before returning a value, but the function breaks before any manipulation.
Any thoughts on how to fix this would be greatly appreciated!
Thanks
Hi @frankfdbr,
The problem is that np_array must not contain any field whose dtype is object. If it does, accessing that field (in this case, np_array.character_col) will cause a segfault. It's okay to use np_array.val and np_array.date because their dtypes are not object.
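To see which fields are affected before anything is copied into shared memory, here is a minimal sketch. The sample frame is made up for illustration (only the column names val, date and character_col come from this thread), and the string column is created with dtype=object explicitly so the result does not depend on the pandas version. An object field stores pointers to Python objects rather than the characters themselves, which is why it cannot safely be read through a shared buffer.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column names are taken from the discussion.
df = pd.DataFrame({
    "val": [1.0, 2.0, np.nan],
    "date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
    "character_col": pd.Series(["abc", "defghi", "xyz"], dtype=object),
})

# Without column_dtypes, the string column is exported as an object field.
rec = df.to_records(index=False)

# np.dtype.hasobject is True for any field that holds Python object pointers.
object_fields = [name for name in rec.dtype.names if rec.dtype[name].hasobject]

print(rec.dtype)
print(object_fields)
```

Any name that shows up in object_fields needs a fixed-width dtype before the array goes into shared memory.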
The solution to this problem is to set the dtype in to_records(), for example:

    np_array = df.to_records(index=False, column_dtypes={'character_col': 'S6'})

If you want unicode, replace S6 with U6 (6 is the length of the string).
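Put together, the round trip could look like the sketch below. It runs in a single process to stay short: the second np.recarray stands in for what a worker would build after attaching with SharedMemory(name=...). The sample data and the names shared, view and out are assumptions for illustration, not part of the original gist.

```python
from multiprocessing.shared_memory import SharedMemory

import numpy as np
import pandas as pd

# Hypothetical data; 'character_col' holds Python strings (object dtype).
df = pd.DataFrame({
    "val": [1.0, 2.0, np.nan],
    "character_col": pd.Series(["abc", "defghi", "xyz"], dtype=object),
})

# Fixed-width bytes ('S6') so each record stores the characters themselves.
np_array = df.to_records(index=False, column_dtypes={"character_col": "S6"})

shm = SharedMemory(create=True, size=np_array.nbytes)
try:
    # Copy the records into the shared buffer.
    shared = np.recarray(shape=np_array.shape, dtype=np_array.dtype, buf=shm.buf)
    np.copyto(shared, np_array)

    # Stand-in for the worker side: rebuild the recarray from the buffer.
    view = np.recarray(shape=np_array.shape, dtype=np_array.dtype, buf=shm.buf)
    out = pd.DataFrame.from_records(view).copy()  # detach from the buffer
    total = float(np.nansum(view.val))

    # Drop every array that still references shm.buf before closing it.
    del shared, view
finally:
    shm.close()
    shm.unlink()

print(out)
print(total)
```

With S6 the strings should come back as bytes, so decode them in the worker if you need str, or use U6 at the cost of four bytes per character.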
Best,
Adrian
Thanks Adrian, much appreciated!!!! ;)