@anj-s
Created August 3, 2021 14:04
SIGSEGV error when running on multiple nodes
SIGSEGV(11), PID: 1831084, Thread 1831084:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: <unknown function> + 0x133f4 (0x7f12694e43f4 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: <unknown function> + 0x134e8 (0x7f12694e44e8 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #4: PyThread_acquire_lock_timed + 0xd9 (0x558dde9eaa69 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #5: <unknown function> + 0x1af68a (0x558ddea5768a in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #6: <unknown function> + 0x1a51c7 (0x558ddea4d1c7 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #7: <unknown function> + 0x10075e (0x558dde9a875e in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #8: _PyEval_EvalCodeWithName + 0x2d2 (0x558ddea32a92 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #9: _PyFunction_Vectorcall + 0x1e3 (0x558ddea33943 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #10: <unknown function> + 0x10075e (0x558dde9a875e in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #11: _PyEval_EvalCodeWithName + 0x2d2 (0x558ddea32a92 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #12: _PyFunction_Vectorcall + 0x1e3 (0x558ddea33943 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #13: <unknown function> + 0x10075e (0x558dde9a875e in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #14: _PyEval_EvalCodeWithName + 0x2d2 (0x558ddea32a92 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #15: _PyFunction_Vectorcall + 0x1e3 (0x558ddea33943 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #16: PyVectorcall_Call + 0x71 (0x558dde9e5041 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #17: _PyEval_EvalFrameDefault + 0x1fdb (0x558ddea6a99b in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #18: _PyEval_EvalCodeWithName + 0x659 (0x558ddea32e19 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #19: _PyFunction_Vectorcall + 0x1e3 (0x558ddea33943 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #20: <unknown function> + 0x10011a (0x558dde9a811a in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #21: _PyFunction_Vectorcall + 0x10b (0x558ddea3386b in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #22: PyVectorcall_Call + 0x71 (0x558dde9e5041 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #23: _PyEval_EvalFrameDefault + 0x1fdb (0x558ddea6a99b in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #24: _PyEval_EvalCodeWithName + 0x659 (0x558ddea32e19 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #25: _PyFunction_Vectorcall + 0x1e3 (0x558ddea33943 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #26: <unknown function> + 0xfeb84 (0x558dde9a6b84 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #27: <unknown function> + 0x20a98d (0x558ddeab298d in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #28: PyVectorcall_Call + 0x71 (0x558dde9e5041 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #29: _PyEval_EvalFrameDefault + 0x1fdb (0x558ddea6a99b in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #30: _PyEval_EvalCodeWithName + 0x659 (0x558ddea32e19 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #31: _PyFunction_Vectorcall + 0x1e3 (0x558ddea33943 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #32: <unknown function> + 0x10077f (0x558dde9a877f in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #33: _PyFunction_Vectorcall + 0x10b (0x558ddea3386b in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #34: <unknown function> + 0xfeb84 (0x558dde9a6b84 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #35: _PyEval_EvalCodeWithName + 0x2d2 (0x558ddea32a92 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #36: PyEval_EvalCodeEx + 0x44 (0x558ddea33754 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #37: PyEval_EvalCode + 0x1c (0x558ddeac1edc in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #38: <unknown function> + 0x219f84 (0x558ddeac1f84 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #39: <unknown function> + 0x24c1f4 (0x558ddeaf41f4 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #40: PyRun_FileExFlags + 0xa1 (0x558dde9bc6e1 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #41: PyRun_SimpleFileExFlags + 0x3b4 (0x558dde9bcac6 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #42: <unknown function> + 0x11598b (0x558dde9bd98b in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #43: Py_BytesMain + 0x39 (0x558ddeaf6d19 in /private/home/anj/.conda/envs/test_clone/bin/python)
frame #44: __libc_start_main + 0xf3 (0x7f12693060b3 in /lib/x86_64-linux-gnu/libc.so.6)
frame #45: <unknown function> + 0x1dee93 (0x558ddea86e93 in /private/home/anj/.conda/envs/test_clone/bin/python)
SIGSEGV(11), PID: 1831084, Thread 1831113:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: accept4 + 0x60 (0x7f1269402c90 in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: <unknown function> + 0x249df6 (0x7f1167bb3df6 in /lib/x86_64-linux-gnu/libcuda.so.1)
frame #4: <unknown function> + 0x23b39d (0x7f1167ba539d in /lib/x86_64-linux-gnu/libcuda.so.1)
frame #5: <unknown function> + 0x24be38 (0x7f1167bb5e38 in /lib/x86_64-linux-gnu/libcuda.so.1)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831114:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: gloo::transport::tcp::Loop::run() + 0x55 (0x7f11f0383a05 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #4: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #5: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #6: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831115:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10d::ProcessGroupGloo::runLoop(int) + 0x241 (0x7f12676762f1 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831116:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10d::ProcessGroupGloo::runLoop(int) + 0x241 (0x7f12676762f1 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831117:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831118:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831119:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831120:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831121:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831122:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831123:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831124:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831125:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831126:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831127:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831128:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831129:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831130:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831131:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831132:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: c10::ThreadPool::main_loop(unsigned long) + 0x8f (0x7f11eaf71cef in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831133:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_timedwait + 0x271 (0x7f12694e17b1 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: torch::distributed::rpc::RpcAgent::retryExpiredRpcs() + 0x19a (0x7f11eedbd5da in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #4: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #5: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #6: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831134:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: __sched_yield + 0xb (0x7f12693e489b in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: <unknown function> + 0xf31072 (0x7f126759a072 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831135:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: tensorpipe::EpollLoop::loop() + 0x10d (0x7f12675b748d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #5: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #6: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831136:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: __sched_yield + 0xb (0x7f12693e489b in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: <unknown function> + 0xf31072 (0x7f126759a072 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831137:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: tensorpipe::EpollLoop::loop() + 0x10d (0x7f12675b748d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #5: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #6: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831138:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831139:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831140:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831141:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831142:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831143:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831144:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831145:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831146:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831147:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831148:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831149:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831150:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831151:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831152:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831153:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831154:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: epoll_wait + 0x5e (0x7f12694015ce in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: uv__io_poll + 0x295 (0x7f126769c635 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: uv_run + 0x107 (0x7f1267691fd7 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: tensorpipe::transport::uv::Loop::eventLoop() + 0x1d (0x7f12675b188d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xf3107c (0x7f126759a07c in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831155:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: tensorpipe::channel::cma::ContextImpl::handleCopyRequests() + 0x143 (0x7f1267566c23 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831156:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: c10::FatalSignalHandler::fatalSignalHandler(int) + 0x16a (0x7f11eaf85d7a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #2: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: <unknown function> + 0x18e70e (0x7f126946d70e in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x35e5a (0x7f114c7d4e5a in /usr/lib/x86_64-linux-gnu/libibverbs/libmlx5-rdmav25.so)
frame #5: <unknown function> + 0x1b246 (0x7f114c7ba246 in /usr/lib/x86_64-linux-gnu/libibverbs/libmlx5-rdmav25.so)
frame #6: tensorpipe::channel::cuda_gdr::IbvNic::pollOnce() + 0x2e (0x7f126759173e in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: tensorpipe::channel::cuda_gdr::ContextImpl::pollOnce() + 0x34 (0x7f1267592454 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0xf30d6a (0x7f1267599d6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #9: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tensorpipe::EventLoopDeferredExecutor::*)(std::string), tensorpipe::EventLoopDeferredExecutor*, std::string> > >::_M_run() + 0x41 (0x7f1267599981 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #10: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #11: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #12: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831157:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_wait + 0x216 (0x7f12694e1376 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: std::condition_variable::wait(std::unique_lock<std::mutex>&) + 0x10 (0x7f1268206e50 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #4: tensorpipe::CudaLoop::processCallbacks() + 0x750 (0x7f1267582340 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0xf19a55 (0x7f1267582a55 in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #7: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #8: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831158:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: pthread_cond_timedwait + 0x271 (0x7f12694e17b1 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #3: torch::distributed::rpc::TensorPipeAgent::pollTimeoutRpcs() + 0xfd (0x7f126745fa6d in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0xd6d84 (0x7f126820cd84 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #5: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #6: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
SIGSEGV(11), PID: 1831084, Thread 1831199:
frame #0: c10::FatalSignalHandler::stacktraceSignalHandler(bool) + 0x12a (0x7f11eaf85a6a in /private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x153c0 (0x7f12694e63c0 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #2: __poll + 0x4f (0x7f12693f4aff in /lib/x86_64-linux-gnu/libc.so.6)
frame #3: <unknown function> + 0x248d3b (0x7f1167bb2d3b in /lib/x86_64-linux-gnu/libcuda.so.1)
frame #4: <unknown function> + 0x30a54a (0x7f1167c7454a in /lib/x86_64-linux-gnu/libcuda.so.1)
frame #5: <unknown function> + 0x24be38 (0x7f1167bb5e38 in /lib/x86_64-linux-gnu/libcuda.so.1)
frame #6: <unknown function> + 0x9609 (0x7f12694da609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x43 (0x7f1269401293 in /lib/x86_64-linux-gnu/libc.so.6)
[W tensorpipe_agent.cpp:1102] RPC agent for worker0 encountered error when reading incoming response from worker1: eof (this error originated at tensorpipe/transport/ibv/connection_impl.cc:310)
Traceback (most recent call last):
  File "playground/herring/repro/repro_rpc_seg_fault.py", line 112, in <module>
    run_worker(node_id, args, world_size)
  File "playground/herring/repro/repro_rpc_seg_fault.py", line 58, in run_worker
    val = torch.distributed.rpc.rpc_sync(
  File "/private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/distributed/rpc/api.py", line 79, in wrapper
    return func(*args, **kwargs)
  File "/private/home/anj/.conda/envs/test_clone/lib/python3.8/site-packages/torch/distributed/rpc/api.py", line 746, in rpc_sync
    return fut.wait()
RuntimeError: eof (this error originated at tensorpipe/transport/ibv/connection_impl.cc:310)
[W tensorpipe_agent.cpp:899] RPC agent for worker0 encountered error when reading incoming request from worker1: eof (this error originated at tensorpipe/transport/ibv/connection_impl.cc:310)
srun: error: learnfair2366: task 1: Segmentation fault (core dumped)
srun: launch/slurm: _step_signal: Terminating StepId=44554066.2
srun: error: learnfair2353: task 0: Exited with exit code 1