@etrulls
Created May 25, 2020 09:31
Processing /home/trulls/imw-2020/milan_cathedral and saving to out/2020-05-25/768-depth-imsize-960/milan_cathedral
0%| | 0/100 [00:00<?, ?it/s]
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/queues.py", line 239, in _feed
obj = _ForkingPickler.dumps(obj)
File "/usr/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
File "/usr/local/lib/python3.8/dist-packages/torch/multiprocessing/reductions.py", line 333, in reduce_storage
fd, size = storage._share_fd_()
RuntimeError: unable to write to file </torch_2745_3720750461>
0%| | 0/100 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 761, in _try_get_data
data = self._data_queue.get(timeout=timeout)
File "/usr/lib/python3.8/queue.py", line 179, in get
self.not_empty.wait(remaining)
File "/usr/lib/python3.8/threading.py", line 306, in wait
gotit = waiter.acquire(True, timeout)
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 2743) is killed by signal: Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "detect.py", line 124, in <module>
described_samples = extract(model, dataset, args.h5_path)
File "detect.py", line 95, in extract
for (names, transforms, images) in tqdm(dataloader):
File "/usr/local/lib/python3.8/dist-packages/tqdm/std.py", line 1129, in __iter__
for obj in iterable:
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 345, in __next__
data = self._next_data()
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 841, in _next_data
idx, data = self._get_data()
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 798, in _get_data
success, data = self._try_get_data()
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 774, in _try_get_data
raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str))
RuntimeError: DataLoader worker (pid(s) 2743) exited unexpectedly
Exception in thread Thread-2:
Traceback (most recent call last):
File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/usr/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/pin_memory.py", line 25, in _pin_memory_loop
r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
File "/usr/lib/python3.8/multiprocessing/queues.py", line 116, in get
return _ForkingPickler.loads(res)
File "/usr/local/lib/python3.8/dist-packages/torch/multiprocessing/reductions.py", line 294, in rebuild_storage_fd
fd = df.detach()
File "/usr/lib/python3.8/multiprocessing/resource_sharer.py", line 57, in detach
with _resource_sharer.get_connection(self._id) as conn:
File "/usr/lib/python3.8/multiprocessing/resource_sharer.py", line 87, in get_connection
c = Client(address, authkey=process.current_process().authkey)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 508, in Client
answer_challenge(c, authkey)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 751, in answer_challenge
message = connection.recv_bytes(256) # reject large message
File "/usr/lib/python3.8/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
buf = self._recv(4)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer