@morganmcg1
Created May 5, 2021 15:14
cluster error
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
/srv/conda/envs/saturn/lib/python3.7/site-packages/wandb/sdk/wandb_init.py in init()
743 try:
--> 744 run = wi.init()
745 except_exit = wi.settings._except_exit
/srv/conda/envs/saturn/lib/python3.7/site-packages/wandb/sdk/wandb_init.py in init()
420 backend = Backend(settings=s)
--> 421 backend.ensure_launched()
422 backend.server_connect()
/srv/conda/envs/saturn/lib/python3.7/site-packages/wandb/sdk/backend/backend.py in ensure_launched()
124 # Start the process with __name__ == "__main__" workarounds
--> 125 self.wandb_process.start()
126 self._internal_pid = self.wandb_process.pid
/srv/conda/envs/saturn/lib/python3.7/multiprocessing/process.py in start()
109 assert not _current_process._config.get('daemon'), \
--> 110 'daemonic processes are not allowed to have children'
111 _cleanup()
AssertionError: daemonic processes are not allowed to have children
The above exception was the direct cause of the following exception:
Exception Traceback (most recent call last)
<ipython-input-16-d566959a97b0> in <module>
1 # If one or more worker jobs errors, this will describe the issue
----> 2 futures[0].result()
/srv/conda/envs/saturn/lib/python3.7/site-packages/distributed/client.py in result(self, timeout)
223 if self.status == "error":
224 typ, exc, tb = result
--> 225 raise exc.with_traceback(tb)
226 elif self.status == "cancelled":
227 raise result
/srv/conda/envs/saturn/lib/python3.7/site-packages/dask_pytorch_ddp/dispatch.py in dispatch_with_ddp()
117 try:
118 dist.init_process_group(backend=backend)
--> 119 val = pytorch_function(*args, **kwargs)
120 finally:
121 dist.destroy_process_group()
<ipython-input-13-96871946f5fa> in simple_train_cluster()
10 # --------- Start wandb --------- #
11 if worker_rank == 0:
---> 12 wandb.init(config=wbargs, entity='wandb', project = 'wandb_saturn_demo')
13 wandb.watch(model)
14
/srv/conda/envs/saturn/lib/python3.7/site-packages/wandb/sdk/wandb_init.py in init()
779 if except_exit:
780 os._exit(-1)
--> 781 six.raise_from(Exception("problem"), error_seen)
782 return run
/srv/conda/envs/saturn/lib/python3.7/site-packages/six.py in raise_from()
Exception: problem
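
The root failure in the trace above is the assertion in multiprocessing/process.py: the code calling wandb.init() is itself running inside a daemonic process, and a daemonic process may not start a child. The sketch below is one possible way to detect that condition up front using the public `multiprocessing.current_process().daemon` attribute. The helper names `in_daemonic_process` and `call_if_can_spawn` are hypothetical and are not part of wandb, dask, or dask_pytorch_ddp; this is only an illustration under those assumptions, not a confirmed fix for this error.

```python
import multiprocessing as mp
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def in_daemonic_process() -> bool:
    """Return True when the current process is daemonic.

    This reads the same flag that multiprocessing checks before
    raising "daemonic processes are not allowed to have children".
    """
    return mp.current_process().daemon


def call_if_can_spawn(fn: Callable[[], T]) -> Optional[T]:
    """Hypothetical guard: run fn only if this process may start children."""
    if in_daemonic_process():
        # Starting a subprocess here would trigger the AssertionError.
        print("skipping call: current process is daemonic")
        return None
    return fn()
```

In a worker function, a call that launches a background process could be wrapped as `call_if_can_spawn(lambda: start_logging())`, where `start_logging` stands in for whatever call spawns the child.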