(ray_dev) C:\Users\gagan\ray_project\ray>pytest -v -s --count=1 python\ray\tests\test_multinode_failures_2.py::test_actor_creation_node_failure
Test session starts (platform: win32, Python 3.8.11, pytest 5.4.3, pytest-sugar 0.9.4)
cachedir: .pytest_cache
rootdir: C:\Users\gagan\ray_project\ray\python
plugins: anyio-3.3.2, asyncio-0.15.1, lazy-fixture-0.6.3, repeat-0.9.1, rerunfailures-10.2, sugar-0.9.4, timeout-1.4.2
collecting ... 2021-11-02 10:14:14,937 INFO worker.py:838 -- Connecting to existing Ray cluster at address: 127.0.0.1:6379
2021-11-02 10:14:36,953 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffffae0bd0c1a64ad1765b178ede01000000 Worker ID: fa8e061419f1162b00067461c6bc66bfde0b6fb61388f2b95b7e50dc Node ID: f5036d1f1684f4ddfd568cd76dd63289447cb2781a8d28449d908ab8 Worker IP address: 127.0.0.1 Worker port: 53256 Worker PID: 11284
――――――――――――――――――――― test_actor_creation_node_failure[ray_start_cluster0] ―――――――――――――――――――――

ray_start_cluster = <ray.cluster_utils.Cluster object at 0x000001A15BB879D0>

    @pytest.mark.parametrize(
        "ray_start_cluster", [{
            "num_cpus": 4,
            "num_nodes": 3,
            "do_init": True
        }],
        indirect=True)
    def test_actor_creation_node_failure(ray_start_cluster):
        # TODO(swang): Refactor test_raylet_failed, etc to reuse the below code.
        cluster = ray_start_cluster

        @ray.remote
        class Child:
            def __init__(self, death_probability):
                self.death_probability = death_probability

            def ping(self):
                exit_chance = np.random.rand()
                if exit_chance < self.death_probability:
                    sys.exit(-1)

        num_children = 25
        # Children actors will die about half the time.
        death_probability = 0.5

        children = [Child.remote(death_probability) for _ in range(num_children)]
        while len(cluster.list_all_nodes()) > 1:
            for j in range(2):
                # Submit some tasks on the actors. About half of the actors will
                # fail.
                children_out = [child.ping.remote() for child in children]
                # Wait a while for all the tasks to complete. This should trigger
                # reconstruction for any actor creation tasks that were forwarded
                # to nodes that then failed.
                ready, _ = ray.wait(
                    children_out, num_returns=len(children_out), timeout=5 * 60.0)
>               assert len(ready) == len(children_out)
E               assert 14 == 25
E                 +14
E                 -25

python\ray\tests\test_multinode_failures_2.py:112: AssertionError

(pid=11284) 2021-11-02 10:14:36,937 ERROR worker.py:426 -- SystemExit was raised from the worker
(pid=11284) Traceback (most recent call last):
(pid=11284)   File "python\ray\_raylet.pyx", line 757, in ray._raylet.task_execution_handler
(pid=11284)     execute_task(task_type, task_name, ray_function, c_resources,
(pid=11284)   File "python\ray\_raylet.pyx", line 580, in ray._raylet.execute_task
(pid=11284)     with core_worker.profile_event(b"task", extra_data=extra_data):
(pid=11284)   File "python\ray\_raylet.pyx", line 618, in ray._raylet.execute_task
(pid=11284)     with core_worker.profile_event(b"task:execute"):
(pid=11284)   File "python\ray\_raylet.pyx", line 625, in ray._raylet.execute_task
(pid=11284)     with ray.worker._changeproctitle(title, next_title):
(pid=11284)   File "python\ray\_raylet.pyx", line 629, in ray._raylet.execute_task
(pid=11284)     outputs = function_executor(*args, **kwargs)
(pid=11284)   File "python\ray\_raylet.pyx", line 578, in ray._raylet.execute_task.function_executor
(pid=11284)     return function(actor, *arguments, **kwarguments)
(pid=11284)   File "c:\users\gagan\ray_project\ray\python\ray\_private\function_manager.py", line 594, in actor_method_executor
(pid=11284)     return method(__ray_actor, *args, **kwargs)
(pid=11284)   File "c:\users\gagan\ray_project\ray\python\ray\util\tracing\tracing_helper.py", line 451, in _resume_span
(pid=11284)     return method(self, *_args, **_kwargs)
(pid=11284)   File "C:\Users\gagan\ray_project\ray\python\ray\tests\test_multinode_failures_2.py", line 95, in ping
(pid=11284)     sys.exit(-1)
(pid=11284) SystemExit: -1

(pid=8784) 2021-11-02 10:14:37,718 ERROR worker.py:426 -- SystemExit was raised from the worker
(pid=8784) Traceback (most recent call last):
(pid=8784)   File "python\ray\_raylet.pyx", line 757, in ray._raylet.task_execution_handler
(pid=8784)     execute_task(task_type, task_name, ray_function, c_resources,
(pid=8784)   File "python\ray\_raylet.pyx", line 580, in ray._raylet.execute_task
(pid=8784)     with core_worker.profile_event(b"task", extra_data=extra_data):
(pid=8784)   File "python\ray\_raylet.pyx", line 618, in ray._raylet.execute_task
(pid=8784)     with core_worker.profile_event(b"task:execute"):
(pid=8784)   File "python\ray\_raylet.pyx", line 625, in ray._raylet.execute_task
(pid=8784)     with ray.worker._changeproctitle(title, next_title):
(pid=8784)   File "python\ray\_raylet.pyx", line 629, in ray._raylet.execute_task
(pid=8784)     outputs = function_executor(*args, **kwargs)
(pid=8784)   File "python\ray\_raylet.pyx", line 578, in ray._raylet.execute_task.function_executor
(pid=8784)     return function(actor, *arguments, **kwarguments)
(pid=8784)   File "c:\users\gagan\ray_project\ray\python\ray\_private\function_manager.py", line 594, in actor_method_executor
(pid=8784)     return method(__ray_actor, *args, **kwargs)
(pid=8784)   File "c:\users\gagan\ray_project\ray\python\ray\util\tracing\tracing_helper.py", line 451, in _resume_span
(pid=8784)     return method(self, *_args, **_kwargs)
(pid=8784)   File "C:\Users\gagan\ray_project\ray\python\ray\tests\test_multinode_failures_2.py", line 95, in ping
(pid=8784)     sys.exit(-1)
(pid=8784) SystemExit: -1

(pid=8512) 2021-11-02 10:14:37,846 ERROR worker.py:426 -- SystemExit was raised from the worker
(pid=8512) Traceback (most recent call last):
(pid=8512)   File "python\ray\_raylet.pyx", line 757, in ray._raylet.task_execution_handler
(pid=8512)     execute_task(task_type, task_name, ray_function, c_resources,
(pid=8512)   File "python\ray\_raylet.pyx", line 580, in ray._raylet.execute_task
(pid=8512)     with core_worker.profile_event(b"task", extra_data=extra_data):
(pid=8512)   File "python\ray\_raylet.pyx", line 618, in ray._raylet.execute_task
(pid=8512)     with core_worker.profile_event(b"task:execute"):
(pid=8512)   File "python\ray\_raylet.pyx", line 625, in ray._raylet.execute_task
(pid=8512)     with ray.worker._changeproctitle(title, next_title):
(pid=8512)   File "python\ray\_raylet.pyx", line 629, in ray._raylet.execute_task
(pid=8512)     outputs = function_executor(*args, **kwargs)
(pid=8512)   File "python\ray\_raylet.pyx", line 578, in ray._raylet.execute_task.function_executor
(pid=8512)     return function(actor, *arguments, **kwarguments)
(pid=8512)   File "c:\users\gagan\ray_project\ray\python\ray\_private\function_manager.py", line 594, in actor_method_executor
(pid=8512)     return method(__ray_actor, *args, **kwargs)
(pid=8512)   File "c:\users\gagan\ray_project\ray\python\ray\util\tracing\tracing_helper.py", line 451, in _resume_span
(pid=8512)     return method(self, *_args, **_kwargs)
(pid=8512)   File "C:\Users\gagan\ray_project\ray\python\ray\tests\test_multinode_failures_2.py", line 95, in ping
(pid=8512)     sys.exit(-1)
(pid=8512) SystemExit: -1

2021-11-02 10:38:44,509 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffff48ffebec2b5787bc344d73ad01000000 Worker ID: 5c86fa5f45db9992fb82254bc9b184f6ca3f4638617d0bd4e0afc845 Node ID: 575f423ffa24c16f9872628242726c2cc3a27ccb1d9e691563ac9a22 Worker IP address: 127.0.0.1 Worker port: 53296 Worker PID: 8784
2021-11-02 10:38:44,509 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffff82b7742008bdf710055519f301000000 Worker ID: 1039f96a0f767350c1a48acd9d87a91110edd5a629173b48409abc95 Node ID: 575f423ffa24c16f9872628242726c2cc3a27ccb1d9e691563ac9a22 Worker IP address: 127.0.0.1 Worker port: 53311 Worker PID: 8512
2021-11-02 10:38:44,509 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffff4c3313e0022f6ad6f7cb442901000000 Worker ID: f8866a810d6bbb06441375e5fb050543b1537f326c85edd555b84d43 Node ID: 070019f351d7299024ffac98817cd27c75e18695e67b5e4fba4da174 Worker IP address: 127.0.0.1 Worker port: 53326 Worker PID: 3832
2021-11-02 10:38:44,524 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffffee46e1f0bf2d6505bce826cf01000000 Worker ID: 2f2a3ef95d196f453e0902989770aba83b1d4ed4315149ad23777bbf Node ID: 575f423ffa24c16f9872628242726c2cc3a27ccb1d9e691563ac9a22 Worker IP address: 127.0.0.1 Worker port: 53304 Worker PID: 232
2021-11-02 10:38:44,524 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffff033a2b37e78afc0ce82ee9d701000000 Worker ID: 5bc3b9175dedc94145be8a5719005373944eb543d8b9fb1f801873d5 Node ID: f5036d1f1684f4ddfd568cd76dd63289447cb2781a8d28449d908ab8 Worker IP address: 127.0.0.1 Worker port: 53429 Worker PID: 9524
2021-11-02 10:38:44,571 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffffa8570e08c617c6dea57a983401000000 Worker ID: 8d1f1066884781aa246d421af6f2f2dbaeabd82b329e8b2edb0f0e55 Node ID: f5036d1f1684f4ddfd568cd76dd63289447cb2781a8d28449d908ab8 Worker IP address: 127.0.0.1 Worker port: 53438 Worker PID: 8024
2021-11-02 10:38:44,590 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffffd5e240c27f5038729286186101000000 Worker ID: 4314807ed14a9a6e2fa2f28d43c427cfc23a32a3df8c107df90ba00d Node ID: 070019f351d7299024ffac98817cd27c75e18695e67b5e4fba4da174 Worker IP address: 127.0.0.1 Worker port: 53453 Worker PID: 11316
2021-11-02 10:38:44,590 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: fffffffffffffffffdcd68ec7417cb4194ad77b101000000 Worker ID: e9c6c669592c2cbd620e873e5e1caea2334a7d4bc9ad3e55528e3702 Node ID: 575f423ffa24c16f9872628242726c2cc3a27ccb1d9e691563ac9a22 Worker IP address: 127.0.0.1 Worker port: 53471 Worker PID: 11372
2021-11-02 10:38:44,731 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffffe2b3a0959942db2ef065496d01000000 Worker ID: 791d6abe1d80a9ce2c77352181c8a0a44b1355c10044bfcdb9a649c4 Node ID: 575f423ffa24c16f9872628242726c2cc3a27ccb1d9e691563ac9a22 Worker IP address: 127.0.0.1 Worker port: 53486 Worker PID: 11692
2021-11-02 10:38:44,731 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffffeffa03c835af3ee332a7edda01000000 Worker ID: efc12a1e0af9662e19ea85e8a2120e80c7bebeb89ffb87dbe16dec14 Node ID: 575f423ffa24c16f9872628242726c2cc3a27ccb1d9e691563ac9a22 Worker IP address: 127.0.0.1 Worker port: 53591 Worker PID: 3188
2021-11-02 10:38:44,731 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffffd9e8e2da121274b5ab8b703701000000 Worker ID: aae722df4ad38baee0bcbbbbee4cc62f2aa30f91fc8178d223b5fc74 Node ID: 070019f351d7299024ffac98817cd27c75e18695e67b5e4fba4da174 Worker IP address: 127.0.0.1 Worker port: 53626 Worker PID: 7948
2021-11-02 10:38:44,731 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffffd4a74bf6aac6724c8e20f48001000000 Worker ID: 7852fb1d813dc22ad064ec9540b413c3d99a7e1909f209df29968307 Node ID: f5036d1f1684f4ddfd568cd76dd63289447cb2781a8d28449d908ab8 Worker IP address: 127.0.0.1 Worker port: 53629 Worker PID: 1256
2021-11-02 10:38:44,746 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffffdc123dc06f9b9919c326333901000000 Worker ID: bd6ed7e6903f064811661c5abdf210f6a63052c968e121a67f08659f Node ID: 575f423ffa24c16f9872628242726c2cc3a27ccb1d9e691563ac9a22 Worker IP address: 127.0.0.1 Worker port: 53710 Worker PID: 7728
2021-11-02 10:38:44,746 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffffdec929a2b58d1ecc72d67da801000000 Worker ID: 13e774cffa21b355833c947467ae16e171e820655ee096fb3b00b2a3 Node ID: 575f423ffa24c16f9872628242726c2cc3a27ccb1d9e691563ac9a22 Worker IP address: 127.0.0.1 Worker port: 53592 Worker PID: 7960
2021-11-02 10:38:44,762 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffff06aaba414ca1cbeb9db70cb301000000 Worker ID: 2b6be0239748d0e936bff9594840fc5762b96ceab46e1cbd72f85517 Node ID: 070019f351d7299024ffac98817cd27c75e18695e67b5e4fba4da174 Worker IP address: 127.0.0.1 Worker port: 53280 Worker PID: 8096
2021-11-02 10:38:44,790 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffffef0e9fa5af3e0a5df2ec657e01000000 Worker ID: 5ccbfa3ff5230781d3789772d63c943c304064420fbbc25bc4db9190 Node ID: 575f423ffa24c16f9872628242726c2cc3a27ccb1d9e691563ac9a22 Worker IP address: 127.0.0.1 Worker port: 53604 Worker PID: 216
2021-11-02 10:38:44,790 WARNING worker.py:1239 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffffd74d17c6d11e5f497765e7d001000000 Worker ID: a103a41221f9fb24398c1bae5d632c0db819ca9a25966c3edaed08ab Node ID: f5036d1f1684f4ddfd568cd76dd63289447cb2781a8d28449d908ab8 Worker IP address: 127.0.0.1 Worker port: 53269 Worker PID: 11556

(pid=None) Traceback (most recent call last):
(pid=None)   File "c:\users\gagan\ray_project\ray\python\ray\workers/default_worker.py", line 185, in <module>
(pid=None)     node = ray.node.Node(
(pid=None)   File "c:\users\gagan\ray_project\ray\python\ray\node.py", line 221, in __init__
(pid=None)     self.metrics_agent_port = self._get_cached_port(
(pid=None)   File "c:\users\gagan\ray_project\ray\python\ray\node.py", line 669, in _get_cached_port
(pid=None)     ports_by_node.update(json.load(f))
(pid=None)   File "c:\programdata\anaconda3\envs\ray_dev\lib\json\__init__.py", line 293, in load
(pid=None)     return loads(fp.read(),
(pid=None)   File "c:\programdata\anaconda3\envs\ray_dev\lib\json\__init__.py", line 357, in loads
(pid=None)     return _default_decoder.decode(s)
(pid=None)   File "c:\programdata\anaconda3\envs\ray_dev\lib\json\decoder.py", line 337, in decode
(pid=None)     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
(pid=None)   File "c:\programdata\anaconda3\envs\ray_dev\lib\json\decoder.py", line 355, in raw_decode
(pid=None)     raise JSONDecodeError("Expecting value", s, err.value) from None
(pid=None) json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

(pid=None) Traceback (most recent call last):
(pid=None)   File "c:\users\gagan\ray_project\ray\python\ray\workers/default_worker.py", line 185, in <module>
(pid=None)     node = ray.node.Node(
(pid=None)   File "c:\users\gagan\ray_project\ray\python\ray\node.py", line 221, in __init__
(pid=None)     self.metrics_agent_port = self._get_cached_port(
(pid=None)   File "c:\users\gagan\ray_project\ray\python\ray\node.py", line 669, in _get_cached_port
(pid=None)     ports_by_node.update(json.load(f))
(pid=None)   File "c:\programdata\anaconda3\envs\ray_dev\lib\json\__init__.py", line 293, in load
(pid=None)     return loads(fp.read(),
(pid=None)   File "c:\programdata\anaconda3\envs\ray_dev\lib\json\__init__.py", line 357, in loads
(pid=None)     return _default_decoder.decode(s)
(pid=None)   File "c:\programdata\anaconda3\envs\ray_dev\lib\json\decoder.py", line 337, in decode
(pid=None)     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
(pid=None)   File "c:\programdata\anaconda3\envs\ray_dev\lib\json\decoder.py", line 355, in raw_decode
(pid=None)     raise JSONDecodeError("Expecting value", s, err.value) from None
(pid=None) json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
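Note on the `(pid=None)` tracebacks: new workers die during startup because `Node._get_cached_port` reads a shared JSON port-cache file, and `json.load` raises `Expecting value: line 1 column 1 (char 0)` whenever that file is empty, for example while another process has truncated it but not yet rewritten it. A minimal sketch of that failure mode (the temp file below is a stand-in for Ray's cache file, not the real one):

```python
import json
import os
import tempfile

# Create an empty file, standing in for a ports-cache file that another
# process truncated but has not finished rewriting.
fd, path = tempfile.mkstemp(suffix=".json")
os.close(fd)

try:
    with open(path) as f:
        # json.load needs at least one JSON value; an empty file fails
        # immediately at character 0.
        ports_by_node = json.load(f)
except json.decoder.JSONDecodeError as exc:
    print(exc)  # Expecting value: line 1 column 1 (char 0)
finally:
    os.remove(path)
```

This is why the error points at character 0 rather than at malformed content: the decoder never sees a first token at all.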
ray\tests\test_multinode_failures_2.py::test_actor_creation_node_failure[ray_start_cluster0] ⨯100% ██████████
=================================== short test summary info ====================================
FAILED python\ray\tests\test_multinode_failures_2.py::test_actor_creation_node_failure[ray_start_cluster0]
Results (1478.75s):
1 failed
- ray\tests/test_multinode_failures_2.py:75 test_actor_creation_node_failure[ray_start_cluster0]
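Note on the `assert 14 == 25` failure: `ray.wait(children_out, num_returns=len(children_out), timeout=...)` returns once all results are ready or the timeout expires, so `ready` holds only the pings that completed; here 11 of the 25 never answered before the timeout. The arithmetic can be sketched without a cluster (`ping` and `fake_wait` below are illustrative stand-ins, not Ray APIs):

```python
import random

NUM_CHILDREN = 25
DEATH_PROBABILITY = 0.5

def ping(rng):
    """Mimic Child.ping: the worker exits with probability DEATH_PROBABILITY."""
    if rng.random() < DEATH_PROBABILITY:
        raise SystemExit(-1)  # worker process dies
    return "ok"

def fake_wait(tasks):
    """Mimic ray.wait with num_returns=len(tasks): only tasks whose
    worker survived ever report a result; the rest time out."""
    ready = []
    for t in tasks:
        try:
            ready.append(t())
        except SystemExit:
            pass  # result lost with the dead worker
    return ready

rng = random.Random(0)  # seeded only to make the sketch deterministic
ready = fake_wait([lambda r=rng: ping(r) for _ in range(NUM_CHILDREN)])
# Roughly half the children die, so len(ready) < NUM_CHILDREN and the
# test's `assert len(ready) == len(children_out)` fails.
assert len(ready) < NUM_CHILDREN
```

With `death_probability = 0.5` the expected number of surviving pings per round is about 12 or 13 of 25, consistent with the 14-of-25 in the log; the test only passes if the dead actors are restarted in time to answer.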
Created November 2, 2021 10:40
Gist: czgdp1807/bc9ab4da6bcf924222165689804d6f2a