@clarng
Created August 16, 2022 18:13
Stage 0: : 5it [00:07, 1.39s/it]1.87.52)
Stage 0: : 5it [00:07, 1.40s/it]1.85.195)
Stage 0: : 8it [00:15, 2.11s/it] pid=23663)
Stage 0: : 6it [00:10, 2.01s/it]1.93.21) 3)
Stage 0: : 6it [00:10, 2.02s/it]1.76.4)
Stage 0: : 6it [00:10, 2.03s/it]1.90.220)
Stage 0: : 6it [00:10, 2.02s/it]1.75.180)
Stage 0: : 6it [00:10, 2.03s/it]1.79.152)
Stage 0: : 6it [00:10, 2.03s/it]1.76.82)
Stage 0: : 6it [00:10, 2.04s/it]1.92.6)
Stage 0: : 6it [00:10, 2.03s/it]1.86.117)
Stage 0: : 6it [00:10, 2.02s/it]1.83.41)
Stage 0: : 6it [00:10, 2.03s/it]1.92.196)
Stage 0: : 6it [00:10, 2.05s/it]1.85.195)
Stage 0: : 6it [00:10, 2.04s/it]1.75.48)
Stage 0: : 6it [00:10, 2.04s/it]
Stage 0: : 6it [00:10, 2.05s/it]1.78.120)
Stage 0: : 6it [00:10, 2.04s/it]1.64.186)
Stage 0: : 6it [00:10, 2.05s/it]1.90.196)
Stage 0: : 6it [00:10, 2.04s/it]1.87.41)
Stage 0: : 6it [00:10, 2.04s/it]1.94.99)
Stage 0: : 6it [00:10, 2.05s/it]1.70.173)
Stage 0: : 6it [00:10, 2.05s/it]1.87.52)
Stage 0: : 9it [00:17, 1.99s/it] pid=23663)
Stage 0: : 6it [00:11, 1.99s/it]1.93.21) 3)
Stage 0: : 6it [00:11, 2.00s/it]1.76.4)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.01s/it]1.90.220)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.01s/it]1.92.6)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 2.00s/it]1.83.41)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.01s/it]1.86.117)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.01s/it]1.92.196)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.02s/it]1.85.195)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.01s/it]1.75.48)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.02s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.01s/it]1.78.120)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.01s/it]1.64.186)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.02s/it]1.90.196)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.01s/it]1.75.180)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.00s/it]1.79.152)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.01s/it]1.87.41)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.00s/it]1.76.82)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.02s/it]1.70.173)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.01s/it]1.94.99)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:12, 2.02s/it]1.87.52)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 10it [00:18, 1.92s/it]pid=23663)
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.17it/s]
Stage 0: : 2it [00:01, 1.17it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.17it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.15it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.17it/s]
Stage 0: : 2it [00:01, 1.15it/s]
Stage 0: : 2it [00:01, 1.15it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.14it/s]
Stage 0: : 2it [00:01, 1.15it/s]
Stage 0: : 2it [00:01, 1.15it/s]
Stage 0: : 2it [00:01, 1.15it/s]
Stage 0: : 2it [00:01, 1.12it/s]
Stage 0: : 11it [00:20, 1.94s/it]pid=23663)
Stage 0: : 3it [00:03, 1.43s/it]1.90.220) )
Stage 0: : 3it [00:03, 1.43s/it]1.83.41)
Stage 0: : 3it [00:03, 1.43s/it]1.86.117)
Stage 0: : 3it [00:03, 1.43s/it]1.92.196)
Stage 0: : 3it [00:03, 1.43s/it]1.93.21)
Stage 0: : 3it [00:03, 1.43s/it]1.75.48)
Stage 0: : 3it [00:03, 1.44s/it]1.76.4)
Stage 0: : 3it [00:03, 1.43s/it]1.78.120)
Stage 0: : 3it [00:03, 1.43s/it]
Stage 0: : 3it [00:03, 1.42s/it]1.90.196)
Stage 0: : 3it [00:03, 1.43s/it]1.79.152)
Stage 0: : 3it [00:03, 1.42s/it]1.75.180)
Stage 0: : 3it [00:03, 1.44s/it]1.92.6)
Stage 0: : 3it [00:03, 1.43s/it]1.70.173)
Stage 0: : 3it [00:03, 1.43s/it]1.76.82)
Stage 0: : 3it [00:03, 1.43s/it]1.87.52)
Stage 0: : 3it [00:03, 1.43s/it]1.64.186)
Stage 0: : 3it [00:03, 1.44s/it]1.94.99)
Stage 0: : 3it [00:03, 1.44s/it]1.87.41)
Stage 0: : 3it [00:04, 1.59s/it]1.85.195)
Stage 0: : 12it [00:22, 1.98s/it]pid=23663)
Stage 0: : 4it [00:05, 1.66s/it]1.75.48) 3)
Stage 0: : 4it [00:05, 1.64s/it]1.64.186)
Stage 0: : 4it [00:05, 1.64s/it]1.78.120)
Stage 0: : 4it [00:05, 1.65s/it]
Stage 0: : 13it [00:23, 1.54s/it]pid=23663)
Stage 0: : 4it [00:05, 1.63s/it]1.90.196) )
Stage 0: : 4it [00:06, 1.68s/it]1.90.220)
Stage 0: : 4it [00:05, 1.66s/it]1.79.152)
Stage 0: : 4it [00:05, 1.66s/it]1.92.6)
Stage 0: : 4it [00:05, 1.64s/it]1.75.180)
Stage 0: : 4it [00:05, 1.65s/it]1.70.173)
Stage 0: : 4it [00:05, 1.65s/it]1.76.82)
Stage 0: : 4it [00:05, 1.63s/it]1.94.99)
Stage 0: : 4it [00:06, 1.67s/it]1.83.41)
Stage 0: : 4it [00:06, 1.67s/it]1.86.117)
Stage 0: : 4it [00:05, 1.62s/it]1.87.41)
Stage 0: : 4it [00:06, 1.67s/it]1.92.196)
Stage 0: : 4it [00:06, 1.68s/it]1.93.21)
Stage 0: : 4it [00:05, 1.64s/it]1.87.52)
Stage 0: : 4it [00:05, 1.62s/it]1.85.195)
Stage 0: : 4it [00:06, 1.68s/it]1.76.4)
Stage 0: : 5it [00:06, 1.29s/it]1.75.48)
Stage 0: : 5it [00:06, 1.28s/it]
Stage 0: : 5it [00:06, 1.31s/it]1.90.220)
Stage 0: : 5it [00:06, 1.30s/it]1.79.152)
Stage 0: : 5it [00:06, 1.28s/it]1.92.6)
Stage 0: : 5it [00:06, 1.28s/it]1.75.180)
Stage 0: : 5it [00:06, 1.29s/it]1.83.41)
Stage 0: : 5it [00:06, 1.30s/it]1.86.117)
Stage 0: : 5it [00:06, 1.26s/it]1.87.41)
Stage 0: : 5it [00:06, 1.30s/it]1.92.196)
Stage 0: : 5it [00:06, 1.30s/it]1.93.21)
Stage 0: : 5it [00:06, 1.29s/it]1.87.52)
Stage 0: : 5it [00:06, 1.28s/it]1.85.195)
Stage 0: : 5it [00:06, 1.31s/it]1.76.4)
Stage 0: : 5it [00:06, 1.29s/it]1.64.186)
Stage 0: : 5it [00:06, 1.29s/it]1.78.120)
Stage 0: : 5it [00:06, 1.29s/it]1.90.196)
Stage 0: : 5it [00:06, 1.30s/it]1.70.173)
Stage 0: : 5it [00:06, 1.32s/it]1.76.82)
Stage 0: : 5it [00:06, 1.34s/it]1.94.99)
Stage 0: : 14it [00:26, 2.06s/it]pid=23663)
Stage 0: : 6it [00:09, 1.95s/it]1.92.6) 63)
Stage 0: : 6it [00:09, 1.94s/it]1.83.41)
Stage 0: : 6it [00:09, 1.96s/it]1.86.117)
Stage 0: : 6it [00:09, 1.95s/it]1.92.196)
Stage 0: : 6it [00:09, 1.92s/it]1.87.41)
Stage 0: : 6it [00:09, 1.95s/it]1.93.21)
Stage 0: : 6it [00:09, 1.93s/it]1.75.48)
Stage 0: : 6it [00:09, 1.96s/it]1.76.4)
Stage 0: : 6it [00:09, 1.96s/it]1.90.220)
Stage 0: : 6it [00:09, 1.94s/it]
Stage 0: : 6it [00:09, 1.95s/it]1.79.152)
Stage 0: : 6it [00:09, 1.96s/it]1.70.173)
Stage 0: : 6it [00:09, 1.95s/it]1.94.99)
Stage 0: : 6it [00:09, 1.96s/it]1.76.82)
Stage 0: : 6it [00:09, 1.95s/it]1.85.195)
Stage 0: : 6it [00:09, 1.99s/it]1.64.186)
Stage 0: : 6it [00:09, 1.99s/it]1.78.120)
Stage 0: : 6it [00:09, 1.98s/it]1.90.196)
Stage 0: : 6it [00:09, 1.98s/it]1.75.180)
Stage 0: : 6it [00:09, 1.99s/it]1.87.52)
Stage 0: : 15it [00:28, 1.98s/it]pid=23663)
Stage 0: : 6it [00:11, 1.90s/it]1.83.41) 3)
(ConsumingActor pid=746, ip=172.31.83.41) Time to read all data 29.179963346000022 seconds
(ConsumingActor pid=746, ip=172.31.83.41) P50/P95/Max batch delay (s) 0.01078775449997238 0.012131390349998127 2.7669360700000425
(ConsumingActor pid=746, ip=172.31.83.41) Num epochs read 2
(ConsumingActor pid=746, ip=172.31.83.41) Num batches read 512
(ConsumingActor pid=746, ip=172.31.83.41) Num bytes read 20480.0 MiB
(ConsumingActor pid=746, ip=172.31.83.41) Mean throughput 701.85 MiB/s
Stage 0: : 6it [00:11, 1.90s/it]1.92.196)
(ConsumingActor pid=739, ip=172.31.87.41) Time to read all data 29.170696408000026 seconds
(ConsumingActor pid=739, ip=172.31.87.41) P50/P95/Max batch delay (s) 0.010904575000012073 0.011703723450006009 2.8186021889999893
(ConsumingActor pid=739, ip=172.31.87.41) Num epochs read 2
(ConsumingActor pid=739, ip=172.31.87.41) Num batches read 512
(ConsumingActor pid=739, ip=172.31.87.41) Num bytes read 20480.0 MiB
(ConsumingActor pid=739, ip=172.31.87.41) Mean throughput 702.07 MiB/s
Stage 0: : 6it [00:11, 1.90s/it]1.86.117)
Stage 0: : 6it [00:11, 1.89s/it]1.87.41)
(ConsumingActor pid=740, ip=172.31.86.117) Time to read all data 29.250613789 seconds
(ConsumingActor pid=740, ip=172.31.86.117) P50/P95/Max batch delay (s) 0.01082026399998881 0.013117102449996306 2.7766691919999857
(ConsumingActor pid=740, ip=172.31.86.117) Num epochs read 2
(ConsumingActor pid=740, ip=172.31.86.117) Num batches read 512
(ConsumingActor pid=740, ip=172.31.86.117) Num bytes read 20480.0 MiB
(ConsumingActor pid=740, ip=172.31.86.117) Mean throughput 700.16 MiB/s
Stage 0: : 6it [00:11, 1.91s/it]1.93.21)
(ConsumingActor pid=740, ip=172.31.93.21) Time to read all data 29.251273184000013 seconds
(ConsumingActor pid=740, ip=172.31.93.21) P50/P95/Max batch delay (s) 0.010795337500013602 0.011299338899996769 2.760269125000036
(ConsumingActor pid=740, ip=172.31.93.21) Num epochs read 2
(ConsumingActor pid=740, ip=172.31.93.21) Num batches read 512
(ConsumingActor pid=740, ip=172.31.93.21) Num bytes read 20480.0 MiB
(ConsumingActor pid=740, ip=172.31.93.21) Mean throughput 700.14 MiB/s
Stage 0: : 6it [00:11, 1.90s/it]1.75.48)
(ConsumingActor pid=745, ip=172.31.75.48) Time to read all data 29.15819512799999 seconds
(ConsumingActor pid=745, ip=172.31.75.48) P50/P95/Max batch delay (s) 0.010786695000007285 0.013397468500019726 2.781348481000009
(ConsumingActor pid=745, ip=172.31.75.48) Num epochs read 2
(ConsumingActor pid=745, ip=172.31.75.48) Num batches read 512
(ConsumingActor pid=745, ip=172.31.75.48) Num bytes read 20480.0 MiB
(ConsumingActor pid=745, ip=172.31.75.48) Mean throughput 702.38 MiB/s
Stage 0: : 6it [00:11, 1.91s/it]1.76.4)
Stage 0: : 6it [00:11, 1.90s/it]1.78.120)
Stage 0: : 6it [00:11, 1.90s/it]1.90.196)
(ConsumingActor pid=738, ip=172.31.90.196) Time to read all data 29.26472254099997 seconds
(ConsumingActor pid=738, ip=172.31.90.196) P50/P95/Max batch delay (s) 0.010836551999972244 0.013735126749998014 2.818687676999957
(ConsumingActor pid=738, ip=172.31.90.196) Num epochs read 2
(ConsumingActor pid=738, ip=172.31.90.196) Num batches read 512
(ConsumingActor pid=738, ip=172.31.90.196) Num bytes read 20480.0 MiB
(ConsumingActor pid=738, ip=172.31.90.196) Mean throughput 699.82 MiB/s
(ConsumingActor pid=745, ip=172.31.90.220) Time to read all data 29.31621175800001 seconds
(ConsumingActor pid=745, ip=172.31.90.220) P50/P95/Max batch delay (s) 0.010861803500006317 0.012561205600010793 2.777011148999975
(ConsumingActor pid=745, ip=172.31.90.220) Num epochs read 2
(ConsumingActor pid=745, ip=172.31.90.220) Num batches read 512
(ConsumingActor pid=745, ip=172.31.90.220) Num bytes read 20480.0 MiB
(ConsumingActor pid=745, ip=172.31.90.220) Mean throughput 698.59 MiB/s
(ConsumingActor pid=745, ip=172.31.90.220) Ingest stats from rank=0:
(ConsumingActor pid=745, ip=172.31.90.220)
(ConsumingActor pid=745, ip=172.31.90.220) == Pipeline Window 10 ==
(ConsumingActor pid=745, ip=172.31.90.220) Stage 1 read->map_batches: 40/40 blocks executed in 2.06s
(ConsumingActor pid=745, ip=172.31.90.220) * Remote wall time: 1.4s min, 1.64s max, 1.47s mean, 58.66s total
(ConsumingActor pid=745, ip=172.31.90.220) * Remote cpu time: 1.41s min, 1.65s max, 1.47s mean, 58.94s total
(ConsumingActor pid=745, ip=172.31.90.220) * Peak heap memory usage (MiB): 6538040000.0 min, 10734672000.0 max, 9632884100 mean
(ConsumingActor pid=745, ip=172.31.90.220) * Output num rows: 104857 min, 104857 max, 104857 mean, 4194280 total
(ConsumingActor pid=745, ip=172.31.90.220) * Output size bytes: 1074155212 min, 1074155212 max, 1074155212 mean, 42966208480 total
(ConsumingActor pid=745, ip=172.31.90.220) * Tasks per node: 2 min, 2 max, 2 mean; 20 nodes used
(ConsumingActor pid=745, ip=172.31.90.220)
(ConsumingActor pid=745, ip=172.31.90.220) == Pipeline Window 11 ==
(ConsumingActor pid=745, ip=172.31.90.220) Stage 1 read->map_batches: 1/1 blocks executed in 0.01s
(ConsumingActor pid=745, ip=172.31.90.220) * Remote wall time: 2.62ms min, 2.62ms max, 2.62ms mean, 2.62ms total
(ConsumingActor pid=745, ip=172.31.90.220) * Remote cpu time: 2.61ms min, 2.61ms max, 2.61ms mean, 2.61ms total
(ConsumingActor pid=745, ip=172.31.90.220) * Peak heap memory usage (MiB): 9686232000.0 min, 9686232000.0 max, 9686232000 mean
(ConsumingActor pid=745, ip=172.31.90.220) * Output num rows: 120 min, 120 max, 120 mean, 120 total
(ConsumingActor pid=745, ip=172.31.90.220) * Output size bytes: 1229284 min, 1229284 max, 1229284 mean, 1229284 total
(ConsumingActor pid=745, ip=172.31.90.220) * Tasks per node: 1 min, 1 max, 1 mean; 1 nodes used
(ConsumingActor pid=745, ip=172.31.90.220)
(ConsumingActor pid=745, ip=172.31.90.220) == Pipeline Window 12 ==
(ConsumingActor pid=745, ip=172.31.90.220) Stage 1 read->map_batches: 40/40 blocks executed in 3.26s
(ConsumingActor pid=745, ip=172.31.90.220) * Remote wall time: 1.41s min, 1.53s max, 1.45s mean, 58.14s total
(ConsumingActor pid=745, ip=172.31.90.220) * Remote cpu time: 1.41s min, 1.52s max, 1.45s mean, 58.13s total
(ConsumingActor pid=745, ip=172.31.90.220) * Peak heap memory usage (MiB): 2341440000.0 min, 11783556000.0 max, 10471681900 mean
(ConsumingActor pid=745, ip=172.31.90.220) * Output num rows: 104857 min, 104857 max, 104857 mean, 4194280 total
(ConsumingActor pid=745, ip=172.31.90.220) * Output size bytes: 1074155212 min, 1074155212 max, 1074155212 mean, 42966208480 total
(ConsumingActor pid=745, ip=172.31.90.220) * Tasks per node: 1 min, 3 max, 2 mean; 20 nodes used
(ConsumingActor pid=745, ip=172.31.90.220)
(ConsumingActor pid=745, ip=172.31.90.220) ##### Overall Pipeline Time Breakdown #####
(ConsumingActor pid=745, ip=172.31.90.220) * Time stalled waiting for next dataset: 68.44ms min, 2.74s max, 1.48s mean, 17.78s total
(ConsumingActor pid=745, ip=172.31.90.220)
Stage 0: : 6it [00:11, 1.92s/it]1.90.220)
(ConsumingActor pid=747, ip=172.31.79.152) Time to read all data 29.229463819999978 seconds
(ConsumingActor pid=747, ip=172.31.79.152) P50/P95/Max batch delay (s) 0.010919250500023736 0.01276898569999218 2.790231468999991
(ConsumingActor pid=747, ip=172.31.79.152) Num epochs read 2
(ConsumingActor pid=747, ip=172.31.79.152) Num batches read 512
(ConsumingActor pid=747, ip=172.31.79.152) Num bytes read 20480.0 MiB
(ConsumingActor pid=747, ip=172.31.79.152) Mean throughput 700.66 MiB/s
Stage 0: : 6it [00:11, 1.90s/it]1.79.152)
Stage 0: : 6it [00:11, 1.89s/it]
(ConsumingActor pid=740, ip=172.31.92.6) Time to read all data 29.18884402399999 seconds
(ConsumingActor pid=740, ip=172.31.92.6) P50/P95/Max batch delay (s) 0.010854045000002088 0.012598926350008807 2.7959525549999853
(ConsumingActor pid=740, ip=172.31.92.6) Num epochs read 2
(ConsumingActor pid=740, ip=172.31.92.6) Num batches read 512
(ConsumingActor pid=740, ip=172.31.92.6) Num bytes read 20480.0 MiB
(ConsumingActor pid=740, ip=172.31.92.6) Mean throughput 701.64 MiB/s
(ConsumingActor pid=23652) Time to read all data 29.165431737999825 seconds
(ConsumingActor pid=23652) P50/P95/Max batch delay (s) 0.010879684499286668 0.016028892099893707 2.8205238620003
(ConsumingActor pid=23652) Num epochs read 2
(ConsumingActor pid=23652) Num batches read 512
(ConsumingActor pid=23652) Num bytes read 20480.0 MiB
(ConsumingActor pid=23652) Mean throughput 702.2 MiB/s
(ConsumingActor pid=735, ip=172.31.75.180) Time to read all data 29.249813272999972 seconds
(ConsumingActor pid=735, ip=172.31.75.180) P50/P95/Max batch delay (s) 0.010739783000019543 0.013223769249989912 2.7849573450000094
(ConsumingActor pid=735, ip=172.31.75.180) Num epochs read 2
(ConsumingActor pid=735, ip=172.31.75.180) Num batches read 512
(ConsumingActor pid=735, ip=172.31.75.180) Num bytes read 20480.0 MiB
(ConsumingActor pid=735, ip=172.31.75.180) Mean throughput 700.18 MiB/s
Stage 0: : 6it [00:11, 1.90s/it]1.92.6)
Stage 0: : 6it [00:11, 1.90s/it]167(ConsumingActor pid=735, ip=172.31.75.180)
Stage 0: : 6it [00:11, 1.90s/it]1.94.99)
(ConsumingActor pid=736, ip=172.31.94.99) Time to read all data 29.258913532999998 seconds
(ConsumingActor pid=736, ip=172.31.94.99) P50/P95/Max batch delay (s) 0.010915057500028524 0.03129511259999731 2.8173484679999774
(ConsumingActor pid=736, ip=172.31.94.99) Num epochs read 2
(ConsumingActor pid=736, ip=172.31.94.99) Num batches read 512
(ConsumingActor pid=736, ip=172.31.94.99) Num bytes read 20480.0 MiB
(ConsumingActor pid=736, ip=172.31.94.99) Mean throughput 699.96 MiB/s
Stage 0: : 6it [00:11, 1.91s/it]1.76.82)
(ConsumingActor pid=740, ip=172.31.76.82) Time to read all data 29.208217771000022 seconds
(ConsumingActor pid=740, ip=172.31.76.82) P50/P95/Max batch delay (s) 0.010992583499984221 0.011883279799994276 2.813059210000006
(ConsumingActor pid=740, ip=172.31.76.82) Num epochs read 2
(ConsumingActor pid=740, ip=172.31.76.82) Num batches read 512
(ConsumingActor pid=740, ip=172.31.76.82) Num bytes read 20480.0 MiB
(ConsumingActor pid=740, ip=172.31.76.82) Mean throughput 701.17 MiB/s
Stage 0: : 6it [00:11, 1.90s/it]1.70.173)
(ConsumingActor pid=746, ip=172.31.70.173) Time to read all data 29.236991578000016 seconds
(ConsumingActor pid=746, ip=172.31.70.173) P50/P95/Max batch delay (s) 0.010980795000023136 0.013579466749990842 2.819542859999956
(ConsumingActor pid=746, ip=172.31.70.173) Num epochs read 2
(ConsumingActor pid=746, ip=172.31.70.173) Num batches read 512
(ConsumingActor pid=746, ip=172.31.70.173) Num bytes read 20480.0 MiB
(ConsumingActor pid=746, ip=172.31.70.173) Mean throughput 700.48 MiB/s
(ConsumingActor pid=746, ip=172.31.92.196) Time to read all data 29.18918256500001 seconds
(ConsumingActor pid=746, ip=172.31.92.196) P50/P95/Max batch delay (s) 0.01082071050001332 0.01370501415005094 2.781268466999961
(ConsumingActor pid=746, ip=172.31.92.196) Num epochs read 2
(ConsumingActor pid=746, ip=172.31.92.196) Num batches read 512
(ConsumingActor pid=746, ip=172.31.92.196) Num bytes read 20480.0 MiB
(ConsumingActor pid=746, ip=172.31.92.196) Mean throughput 701.63 MiB/s
Stage 0: : 6it [00:11, 1.90s/it]1.87.52)
(ConsumingActor pid=737, ip=172.31.87.52) Time to read all data 29.274445688000014 seconds
(ConsumingActor pid=737, ip=172.31.87.52) P50/P95/Max batch delay (s) 0.010936502500015877 0.013563972550008426 2.826693067000008
(ConsumingActor pid=737, ip=172.31.87.52) Num epochs read 2
(ConsumingActor pid=737, ip=172.31.87.52) Num batches read 512
(ConsumingActor pid=737, ip=172.31.87.52) Num bytes read 20480.0 MiB
(ConsumingActor pid=737, ip=172.31.87.52) Mean throughput 699.59 MiB/s
Stage 0: : 6it [00:11, 1.91s/it]1.85.195)
(ConsumingActor pid=746, ip=172.31.85.195) Time to read all data 29.254682486999968 seconds
(ConsumingActor pid=746, ip=172.31.85.195) P50/P95/Max batch delay (s) 0.010890752000022985 0.01335758660000863 2.8021974099999625
(ConsumingActor pid=746, ip=172.31.85.195) Num epochs read 2
(ConsumingActor pid=746, ip=172.31.85.195) Num batches read 512
(ConsumingActor pid=746, ip=172.31.85.195) Num bytes read 20480.0 MiB
(ConsumingActor pid=746, ip=172.31.85.195) Mean throughput 700.06 MiB/s
(ConsumingActor pid=739, ip=172.31.76.4) Time to read all data 29.221106813000006 seconds
(ConsumingActor pid=739, ip=172.31.76.4) P50/P95/Max batch delay (s) 0.010787291500008678 0.011872523749980237 2.7714655030000017
(ConsumingActor pid=739, ip=172.31.76.4) Num epochs read 2
(ConsumingActor pid=739, ip=172.31.76.4) Num batches read 512
(ConsumingActor pid=739, ip=172.31.76.4) Num bytes read 20480.0 MiB
(ConsumingActor pid=739, ip=172.31.76.4) Mean throughput 700.86 MiB/s
(ConsumingActor pid=746, ip=172.31.64.186) Time to read all data 29.26256514800002 seconds
(ConsumingActor pid=746, ip=172.31.64.186) P50/P95/Max batch delay (s) 0.010883439000025419 0.013131027150021164 2.8298956819999717
(ConsumingActor pid=746, ip=172.31.64.186) Num epochs read 2
(ConsumingActor pid=746, ip=172.31.64.186) Num batches read 512
(ConsumingActor pid=746, ip=172.31.64.186) Num bytes read 20480.0 MiB
(ConsumingActor pid=746, ip=172.31.64.186) Mean throughput 699.87 MiB/s
Stage 0: : 6it [00:11, 1.90s/it]1.64.186)
(ConsumingActor pid=745, ip=172.31.78.120) Time to read all data 29.251510171999996 seconds
(ConsumingActor pid=745, ip=172.31.78.120) P50/P95/Max batch delay (s) 0.010865980999994918 0.012708394600002742 2.8385113969999907
(ConsumingActor pid=745, ip=172.31.78.120) Num epochs read 2
(ConsumingActor pid=745, ip=172.31.78.120) Num batches read 512
(ConsumingActor pid=745, ip=172.31.78.120) Num bytes read 20480.0 MiB
(ConsumingActor pid=745, ip=172.31.78.120) Mean throughput 700.13 MiB/s
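Note on reading the per-actor summaries above: the reported mean throughput appears to be bytes read divided by the time to read all data. The sketch below checks that against the figures printed for the actor at ip=172.31.83.41. The formula and the helper name `mean_throughput_mib_s` are assumptions made for illustration; the benchmark script itself is not shown in this log.

```python
# Hypothetical helper: assumes throughput = MiB read / wall-clock seconds.
def mean_throughput_mib_s(mib_read: float, seconds: float) -> float:
    return mib_read / seconds


# Values copied from the ConsumingActor summary for ip=172.31.83.41.
mib_read = 20480.0
time_to_read_s = 29.179963346000022
num_batches = 512

throughput = mean_throughput_mib_s(mib_read, time_to_read_s)
mib_per_batch = mib_read / num_batches

print(f"Mean throughput {throughput:.2f} MiB/s")
print(f"Average batch size {mib_per_batch:.1f} MiB")
```

Under that assumption the result rounds to the 701.85 MiB/s shown in the log, and each of the 512 batches averages 40 MiB.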
2022-08-16 10:40:38,006 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: c1520f197a2de5cdc0ae3afb97e5fa732f8b056302000000 Worker ID: c7a31d03fba652df0d3d0a7ba8bacdead90b187f7273670eeb195608 Node ID: 8df66f131aa1bcad6dcbcd0a6887e11ef4b7b581696d15ed5be72a2b Worker IP address: 172.31.85.195 Worker port: 10004 Worker PID: 780 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 10:40:38,007 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 4e2e2bc393a57229d1e21cab33657e5de05945c802000000 Worker ID: f3e584095fb3799c1d578e7beb652d36e3a1930ff52d4fb26f944fc1 Node ID: 27386e7f5da3cd79e688aa344bad444ad51d44234a6e808e9b520085 Worker IP address: 172.31.76.82 Worker port: 10003 Worker PID: 771 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 10:40:38,062 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 3d61a9771cb65a9e2c0b1e67ca04e89e7b4b043f02000000 Worker ID: 5ac43caaa6b5c3eb8086e2ec900daeed95999e302e2f6c9a67390024 Node ID: 11e42de41c46cad33bcd371f6d5a52d293e7d6c18e6cf08f370dc42e Worker IP address: 172.31.80.141 Worker port: 10011 Worker PID: 27324 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
(base) ray@ip-172-31-80-141:~/workspace-project-rtoom_pipeline_data_ingest2/release/nightly_tests/dataset$ git log
commit 436c89ba1a337fffcd6f46c78c4d5a5a3434d0a8 (HEAD -> master, origin/master, origin/HEAD)
Author: Sven Mika <svenmika1977@gmail.com>
Date: Tue Aug 16 12:05:55 2022 +0200
[RLlib] Eval workers use async req manager. (#27390)
commit 1c4b3879a1e3f596e74c9f4f46023dfcf627a1fe
Author: liuyang-my <84125729+liuyang-my@users.noreply.github.com>
Date: Tue Aug 16 15:22:00 2022 +0800
[Serve]Fix classloader bug in Java Deployment (#27899)
We have encountered `java.lang.ClassNotFoundException` when deploying Java Ray Serve deployments. The property `ray.job.code-search-path` which specifies the search path of user's classes is not working. The reason is that `ray.job.code-search-path` is loaded in an independent classloader in Ray context, but Serve Replica initialized user class with `AppClassLoader`. We need to change the classloader used to construct user classes to the one in Ray context.
commit c2abfdb2f7eee7f3e4320cb0d9e8e3bd639d5680
Author: Alex Wu <alex@anyscale.io>
Date: Mon Aug 15 18:14:29 2022 -0700
[autoscaler][observability] Observability into when/why nodes fail to launch (#27697)
This change adds launch failures to the recent failures section of ray status when a node provider provides structured error information. For node providers which don't provide this optional information, there is no change in behavior.
For reference, when trying to launch a node type with a quota issue, it looks like the following. InsufficientInstanceCapacity is the standard term for this issue.
```
======== Autoscaler status: 2022-08-11 22:22:10.735647 ========
Node status
---------------------------------------------------------------
Healthy:
1 cpu_4_ondemand
Pending:
quota, 1 launching
Recent failures:
quota: InsufficientInstanceCapacity (last_attempt: 22:22:00)
Resources
---------------------------------------------------------------
Usage:
0.0/4.0 CPU
0.00/9.079 GiB memory
0.00/4.539 GiB object_store_memory
(base) ray@ip-172-31-80-141:~/workspace-project-rtoom_pipeline_data_ingest2/release/nightly_tests/dataset$ python data_ingest_benchmark.py --dataset-size-gb=200 --num-workers=20 --streaming
2022-08-16 11:05:03,426 INFO worker.py:1203 -- Using address localhost:9031 set in the environment variable RAY_ADDRESS
2022-08-16 11:05:06,228 INFO worker.py:1312 -- Connecting to existing Ray cluster at address: 172.31.80.141:9031...
2022-08-16 11:05:06,235 INFO worker.py:1487 -- Connected to Ray cluster. View the dashboard at https://session-aqwatdmgevphyalje148wqvj.i.anyscaleuserdata-staging.com/auth/?token=agh0_CkcwRQIgIk7mkj-gUcqPe1OsvZ_H8CPE_wsJBiwWJBszQZGOg8gCIQDWWn_c_wIHFEow6P51__K6H3uqXn3u3z5ucA_mHK8lJBJmEiD6fIFsE2xA7OC9UAjVkUdqDfyAqnjZHYJO1YMBnRNSfRgCIgNuL2E6DAii24bjBhCIutWeA0IMCKLY7pcGEIi61Z4D-gEeChxzZXNfQXF3YVREbWdFdnBoeUFMSkUxNDhXUXZK&redirect_to=dashboard.
2022-08-16 11:05:06,505 INFO packaging.py:342 -- Pushing file package 'gcs://_ray_pkg_ad68ed085ebd21ad1f0420511834ac5c.zip' (91.13MiB) to Ray cluster...
2022-08-16 11:05:07,597 INFO packaging.py:351 -- Successfully pushed file package 'gcs://_ray_pkg_ad68ed085ebd21ad1f0420511834ac5c.zip'.
2022-08-16 11:05:10,590 WARNING read_api.py:281 -- ⚠️ The blocks of this dataset are estimated to be 2.0x larger than the target block size of 512 MiB. This may lead to out-of-memory errors during processing. Consider reducing the size of input files or using `.repartition(n)` to increase the number of dataset blocks.
Created dataset Dataset(num_blocks=201, num_rows=20971520, schema={__value__: ArrowTensorType(shape=(1280,), dtype=int64)}) of size 214748364800
2022-08-16 11:05:10,790 INFO dataset.py:3282 -- Created DatasetPipeline with 6 windows: 1.17MiB min, 40.0GiB max, 33.33GiB mean
2022-08-16 11:05:10,790 INFO dataset.py:3291 -- Blocks per window: 1 min, 40 max, 33 mean
2022-08-16 11:05:10,791 WARNING dataset.py:3299 -- ⚠️ This pipeline's parallelism is limited by its blocks per window to ~33 concurrent tasks per window. To maximize performance, increase the blocks per window to at least 320. This may require increasing the base dataset's parallelism and/or adjusting the windowing parameters.
2022-08-16 11:05:10,792 INFO dataset.py:3326 -- ✔️ This pipeline's windows likely fit in object store memory without spilling.
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 2it [00:03, 1.62s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s].25s/it]
Stage 0: : 3it [00:04, 1.67s/it] pid=32907)
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 4it [00:06, 1.72s/it] pid=32907)
Stage 0: : 2it [00:02, 1.07s/it]
Stage 0: : 2it [00:02, 1.07s/it]
Stage 0: : 2it [00:02, 1.08s/it]
Stage 0: : 2it [00:02, 1.08s/it]
Stage 0: : 2it [00:02, 1.08s/it]
Stage 0: : 2it [00:02, 1.10s/it]
Stage 0: : 2it [00:02, 1.09s/it]
Stage 0: : 2it [00:02, 1.08s/it]
Stage 0: : 2it [00:02, 1.08s/it]
Stage 0: : 2it [00:02, 1.08s/it]
Stage 0: : 2it [00:02, 1.08s/it]
Stage 0: : 2it [00:02, 1.09s/it]
Stage 0: : 2it [00:02, 1.09s/it]
Stage 0: : 2it [00:02, 1.09s/it]
Stage 0: : 2it [00:02, 1.09s/it]
Stage 0: : 2it [00:02, 1.09s/it]
Stage 0: : 2it [00:02, 1.09s/it]
Stage 0: : 2it [00:02, 1.08s/it]
Stage 0: : 2it [00:02, 1.08s/it]
Stage 0: : 2it [00:02, 1.08s/it]
Stage 0: : 5it [00:08, 1.77s/it]
Stage 0: : 3it [00:04, 1.50s/it]
Stage 0: : 3it [00:04, 1.50s/it]
Stage 0: : 3it [00:04, 1.50s/it]
Stage 0: : 3it [00:04, 1.50s/it]
Stage 0: : 3it [00:04, 1.51s/it]
Stage 0: : 3it [00:04, 1.50s/it]
Stage 0: : 3it [00:04, 1.50s/it]
Stage 0: : 3it [00:04, 1.51s/it]
Stage 0: : 3it [00:04, 1.49s/it]
Stage 0: : 3it [00:04, 1.49s/it]
Stage 0: : 3it [00:04, 1.49s/it]
Stage 0: : 3it [00:04, 1.50s/it]
Stage 0: : 3it [00:04, 1.50s/it]
Stage 0: : 3it [00:04, 1.50s/it]
Stage 0: : 3it [00:04, 1.52s/it]
Stage 0: : 3it [00:04, 1.50s/it]
Stage 0: : 3it [00:04, 1.50s/it]
Stage 0: : 3it [00:04, 1.51s/it]
Stage 0: : 3it [00:04, 1.51s/it]
Stage 0: : 3it [00:04, 1.50s/it]
Stage 0: : 6it [00:10, 1.79s/it]
Stage 0: : 4it [00:05, 1.59s/it]
Stage 0: : 4it [00:05, 1.59s/it]
Stage 0: : 4it [00:06, 1.61s/it]
Stage 0: : 4it [00:06, 1.62s/it]
Stage 0: : 4it [00:06, 1.61s/it]
Stage 0: : 4it [00:06, 1.61s/it]
Stage 0: : 4it [00:06, 1.61s/it]
Stage 0: : 7it [00:11, 1.40s/it]
Stage 0: : 4it [00:05, 1.59s/it]
Stage 0: : 4it [00:06, 1.60s/it]
Stage 0: : 4it [00:06, 1.60s/it]
Stage 0: : 4it [00:06, 1.59s/it]
Stage 0: : 4it [00:06, 1.61s/it]
Stage 0: : 4it [00:06, 1.61s/it]
Stage 0: : 4it [00:06, 1.61s/it]
Stage 0: : 4it [00:06, 1.59s/it]
Stage 0: : 4it [00:06, 1.60s/it]
Stage 0: : 4it [00:06, 1.59s/it]
Stage 0: : 4it [00:05, 1.58s/it]
Stage 0: : 4it [00:05, 1.59s/it]
Stage 0: : 4it [00:06, 1.59s/it]
Stage 0: : 5it [00:06, 1.28s/it]
Stage 0: : 5it [00:06, 1.28s/it]
Stage 0: : 5it [00:06, 1.28s/it]
Stage 0: : 5it [00:06, 1.26s/it]
Stage 0: : 5it [00:06, 1.27s/it]
Stage 0: : 5it [00:06, 1.27s/it]
Stage 0: : 5it [00:06, 1.27s/it]
Stage 0: : 5it [00:06, 1.27s/it]
Stage 0: : 5it [00:06, 1.27s/it]
Stage 0: : 5it [00:06, 1.26s/it]
Stage 0: : 5it [00:06, 1.26s/it]
Stage 0: : 5it [00:06, 1.25s/it]
Stage 0: : 5it [00:06, 1.26s/it]
Stage 0: : 5it [00:06, 1.27s/it]
Stage 0: : 5it [00:06, 1.26s/it]
Stage 0: : 5it [00:06, 1.26s/it]
Stage 0: : 5it [00:06, 1.28s/it]
Stage 0: : 5it [00:06, 1.27s/it]
Stage 0: : 5it [00:06, 1.27s/it]
Stage 0: : 5it [00:06, 1.27s/it]
Stage 0: : 8it [00:14, 1.98s/it]
Stage 0: : 6it [00:09, 1.92s/it]
Stage 0: : 6it [00:09, 1.92s/it]
Stage 0: : 6it [00:09, 1.92s/it]
Stage 0: : 6it [00:09, 1.93s/it]
Stage 0: : 6it [00:09, 1.92s/it]
Stage 0: : 6it [00:09, 1.92s/it]
Stage 0: : 6it [00:09, 1.92s/it]
Stage 0: : 6it [00:09, 1.91s/it]
Stage 0: : 6it [00:09, 1.91s/it]
Stage 0: : 6it [00:09, 1.92s/it]
Stage 0: : 6it [00:09, 1.92s/it]
Stage 0: : 6it [00:09, 1.91s/it]
Stage 0: : 6it [00:09, 1.91s/it]
Stage 0: : 6it [00:09, 1.92s/it]
Stage 0: : 6it [00:09, 1.92s/it]
Stage 0: : 6it [00:09, 1.93s/it]
Stage 0: : 6it [00:09, 1.92s/it]
Stage 0: : 6it [00:09, 1.93s/it]
Stage 0: : 6it [00:09, 1.92s/it]
Stage 0: : 6it [00:09, 1.93s/it]
Stage 0: : 9it [00:16, 1.90s/it]
Stage 0: : 6it [00:11, 1.89s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.89s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.89s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.89s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.88s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.88s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.88s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.89s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.88s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.89s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.89s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.90s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.89s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.89s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.90s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.89s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.89s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.90s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.89s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 6it [00:11, 1.89s/it]
Stage 0: 0%| | 0/1 [00:00<?, ?it/s]
Stage 0: : 10it [00:17, 1.86s/it]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.17it/s]
Stage 0: : 2it [00:01, 1.15it/s]
Stage 0: : 2it [00:01, 1.15it/s]
Stage 0: : 2it [00:01, 1.17it/s]
Stage 0: : 2it [00:01, 1.17it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.15it/s]
Stage 0: : 2it [00:01, 1.15it/s]
Stage 0: : 2it [00:01, 1.17it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.14it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.17it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 2it [00:01, 1.16it/s]
Stage 0: : 11it [00:19, 1.87s/it]
Stage 0: : 3it [00:03, 1.39s/it]
Stage 0: : 3it [00:03, 1.39s/it]
Stage 0: : 3it [00:03, 1.39s/it]
Stage 0: : 3it [00:03, 1.40s/it]
Stage 0: : 3it [00:03, 1.40s/it]
Stage 0: : 3it [00:03, 1.39s/it]
Stage 0: : 3it [00:03, 1.39s/it]
Stage 0: : 3it [00:03, 1.40s/it]
Stage 0: : 3it [00:03, 1.41s/it]
Stage 0: : 3it [00:03, 1.41s/it]
Stage 0: : 3it [00:03, 1.39s/it]
Stage 0: : 3it [00:03, 1.39s/it]
Stage 0: : 3it [00:03, 1.40s/it]
Stage 0: : 3it [00:03, 1.41s/it]
Stage 0: : 3it [00:03, 1.40s/it]
Stage 0: : 3it [00:03, 1.41s/it]
Stage 0: : 3it [00:03, 1.39s/it]
Stage 0: : 3it [00:03, 1.40s/it]
Stage 0: : 3it [00:03, 1.41s/it]
Stage 0: : 3it [00:04, 1.52s/it]
Stage 0: : 12it [00:21, 1.89s/it]
Stage 0: : 13it [00:22, 1.48s/it]
Stage 0: : 4it [00:05, 1.58s/it]
Stage 0: : 4it [00:05, 1.58s/it]
Stage 0: : 4it [00:05, 1.59s/it]
Stage 0: : 4it [00:05, 1.57s/it]
Stage 0: : 4it [00:05, 1.59s/it]
Stage 0: : 4it [00:05, 1.57s/it]
Stage 0: : 4it [00:05, 1.57s/it]
Stage 0: : 4it [00:05, 1.60s/it]
Stage 0: : 4it [00:05, 1.57s/it]
Stage 0: : 4it [00:05, 1.57s/it]
Stage 0: : 4it [00:05, 1.58s/it]
Stage 0: : 4it [00:05, 1.57s/it]
Stage 0: : 4it [00:05, 1.59s/it]
Stage 0: : 4it [00:05, 1.59s/it]
Stage 0: : 4it [00:05, 1.59s/it]
Stage 0: : 4it [00:05, 1.58s/it]
Stage 0: : 4it [00:05, 1.57s/it]
Stage 0: : 4it [00:05, 1.58s/it]
Stage 0: : 4it [00:05, 1.58s/it]
Stage 0: : 4it [00:05, 1.60s/it]
Stage 0: : 5it [00:06, 1.23s/it]
Stage 0: : 5it [00:06, 1.22s/it]
Stage 0: : 5it [00:06, 1.24s/it]
Stage 0: : 5it [00:06, 1.25s/it]
Stage 0: : 5it [00:06, 1.25s/it]
Stage 0: : 5it [00:06, 1.24s/it]
Stage 0: : 5it [00:06, 1.25s/it]
Stage 0: : 5it [00:06, 1.23s/it]
Stage 0: : 5it [00:06, 1.24s/it]
Stage 0: : 5it [00:06, 1.24s/it]
Stage 0: : 5it [00:06, 1.25s/it]
Stage 0: : 5it [00:06, 1.25s/it]
Stage 0: : 5it [00:06, 1.25s/it]
Stage 0: : 5it [00:06, 1.23s/it]
Stage 0: : 5it [00:06, 1.23s/it]
Stage 0: : 5it [00:06, 1.26s/it]
Stage 0: : 5it [00:06, 1.24s/it]
Stage 0: : 5it [00:06, 1.24s/it]
Stage 0: : 5it [00:06, 1.25s/it]
Stage 0: : 5it [00:06, 1.25s/it]
Stage 0: : 14it [00:25, 2.00s/it]
Stage 0: : 6it [00:09, 1.90s/it]
Stage 0: : 6it [00:09, 1.88s/it]
Stage 0: : 6it [00:09, 1.89s/it]
Stage 0: : 6it [00:09, 1.90s/it]
Stage 0: : 6it [00:09, 1.89s/it]
Stage 0: : 6it [00:09, 1.88s/it]
Stage 0: : 6it [00:09, 1.88s/it]
Stage 0: : 6it [00:09, 1.89s/it]
Stage 0: : 6it [00:09, 1.90s/it]
Stage 0: : 6it [00:09, 1.90s/it]
Stage 0: : 6it [00:09, 1.89s/it]
Stage 0: : 6it [00:09, 1.90s/it]
Stage 0: : 6it [00:09, 1.90s/it]
Stage 0: : 6it [00:09, 1.88s/it]
Stage 0: : 6it [00:09, 1.90s/it]
Stage 0: : 6it [00:09, 1.90s/it]
Stage 0: : 6it [00:09, 1.90s/it]
Stage 0: : 6it [00:09, 1.91s/it]
Stage 0: : 6it [00:09, 1.90s/it]
Stage 0: : 6it [00:09, 1.91s/it]
Stage 0: : 15it [00:27, 1.93s/it]
Stage 1: : 14it [00:27, 1.93s/it]
Stage 0: : 6it [00:11, 1.85s/it]
(ConsumingActor pid=4990, ip=172.31.75.48) Time to read all data 27.769030432000136 seconds
(ConsumingActor pid=4990, ip=172.31.75.48) P50/P95/Max batch delay (s) 0.01084901750004974 0.011308732550060084 2.682710188999863
(ConsumingActor pid=4990, ip=172.31.75.48) Num epochs read 2
(ConsumingActor pid=4990, ip=172.31.75.48) Num batches read 512
(ConsumingActor pid=4990, ip=172.31.75.48) Num bytes read 20480.0 MiB
(ConsumingActor pid=4990, ip=172.31.75.48) Mean throughput 737.51 MiB/s
Stage 0: : 6it [00:11, 1.84s/it]
Stage 0: : 6it [00:11, 1.85s/it]
(ConsumingActor pid=4998, ip=172.31.90.220) Time to read all data 27.80270644699999 seconds
(ConsumingActor pid=4998, ip=172.31.90.220) P50/P95/Max batch delay (s) 0.010772638500043286 0.014292749850073946 2.676380449999897
(ConsumingActor pid=4998, ip=172.31.90.220) Num epochs read 2
(ConsumingActor pid=4998, ip=172.31.90.220) Num batches read 512
(ConsumingActor pid=4998, ip=172.31.90.220) Num bytes read 20480.0 MiB
(ConsumingActor pid=4998, ip=172.31.90.220) Mean throughput 736.62 MiB/s
Stage 0: : 6it [00:11, 1.86s/it]
(ConsumingActor pid=4992, ip=172.31.94.99) Time to read all data 27.828023598999835 seconds
(ConsumingActor pid=4992, ip=172.31.94.99) P50/P95/Max batch delay (s) 0.010851683000055345 0.01175251415006641 2.700498185000015
(ConsumingActor pid=4992, ip=172.31.94.99) Num epochs read 2
(ConsumingActor pid=4992, ip=172.31.94.99) Num batches read 512
(ConsumingActor pid=4992, ip=172.31.94.99) Num bytes read 20480.0 MiB
(ConsumingActor pid=4992, ip=172.31.94.99) Mean throughput 735.95 MiB/s
Stage 0: : 6it [00:11, 1.84s/it]
(ConsumingActor pid=32903) Time to read all data 27.784072439000738 seconds
(ConsumingActor pid=32903) P50/P95/Max batch delay (s) 0.010890630000176316 0.013360943000589029 2.6969172389999585
(ConsumingActor pid=32903) Num epochs read 2
(ConsumingActor pid=32903) Num batches read 512
(ConsumingActor pid=32903) Num bytes read 20480.0 MiB
(ConsumingActor pid=32903) Mean throughput 737.11 MiB/s
Stage 0: : 6it [00:11, 1.84s/it]
(ConsumingActor pid=4991, ip=172.31.79.152) Time to read all data 27.765251075999913 seconds
(ConsumingActor pid=4991, ip=172.31.79.152) P50/P95/Max batch delay (s) 0.010812445000055959 0.01123660975011944 2.6809106820001034
(ConsumingActor pid=4991, ip=172.31.79.152) Num epochs read 2
(ConsumingActor pid=4991, ip=172.31.79.152) Num batches read 512
(ConsumingActor pid=4991, ip=172.31.79.152) Num bytes read 20480.0 MiB
(ConsumingActor pid=4991, ip=172.31.79.152) Mean throughput 737.61 MiB/s
Stage 0: : 6it [00:11, 1.85s/it]
(ConsumingActor pid=4991, ip=172.31.64.186) Time to read all data 27.782117384999992 seconds
(ConsumingActor pid=4991, ip=172.31.64.186) P50/P95/Max batch delay (s) 0.010847139999896172 0.01144872760006592 2.685125902999971
(ConsumingActor pid=4991, ip=172.31.64.186) Num epochs read 2
(ConsumingActor pid=4991, ip=172.31.64.186) Num batches read 512
(ConsumingActor pid=4991, ip=172.31.64.186) Num bytes read 20480.0 MiB
(ConsumingActor pid=4991, ip=172.31.64.186) Mean throughput 737.16 MiB/s
(ConsumingActor pid=4991, ip=172.31.64.186) Ingest stats from rank=0:
(ConsumingActor pid=4991, ip=172.31.64.186)
(ConsumingActor pid=4991, ip=172.31.64.186) == Pipeline Window 10 ==
(ConsumingActor pid=4991, ip=172.31.64.186) Stage 1 read->map_batches: 40/40 blocks executed in 1.94s
(ConsumingActor pid=4991, ip=172.31.64.186) * Remote wall time: 1.42s min, 1.69s max, 1.49s mean, 59.46s total
(ConsumingActor pid=4991, ip=172.31.64.186) * Remote cpu time: 1.39s min, 1.72s max, 1.48s mean, 59.24s total
(ConsumingActor pid=4991, ip=172.31.64.186) * Peak heap memory usage (MiB): 6538240000.0 min, 10734540000.0 max, 9685544900 mean
(ConsumingActor pid=4991, ip=172.31.64.186) * Output num rows: 104857 min, 104857 max, 104857 mean, 4194280 total
(ConsumingActor pid=4991, ip=172.31.64.186) * Output size bytes: 1074155212 min, 1074155212 max, 1074155212 mean, 42966208480 total
(ConsumingActor pid=4991, ip=172.31.64.186) * Tasks per node: 2 min, 2 max, 2 mean; 20 nodes used
(ConsumingActor pid=4991, ip=172.31.64.186)
(ConsumingActor pid=4991, ip=172.31.64.186) == Pipeline Window 11 ==
(ConsumingActor pid=4991, ip=172.31.64.186) Stage 1 read->map_batches: 1/1 blocks executed in 0.01s
(ConsumingActor pid=4991, ip=172.31.64.186) * Remote wall time: 2.62ms min, 2.62ms max, 2.62ms mean, 2.62ms total
(ConsumingActor pid=4991, ip=172.31.64.186) * Remote cpu time: 2.62ms min, 2.62ms max, 2.62ms mean, 2.62ms total
(ConsumingActor pid=4991, ip=172.31.64.186) * Peak heap memory usage (MiB): 9686036000.0 min, 9686036000.0 max, 9686036000 mean
(ConsumingActor pid=4991, ip=172.31.64.186) * Output num rows: 120 min, 120 max, 120 mean, 120 total
(ConsumingActor pid=4991, ip=172.31.64.186) * Output size bytes: 1229284 min, 1229284 max, 1229284 mean, 1229284 total
(ConsumingActor pid=4991, ip=172.31.64.186) * Tasks per node: 1 min, 1 max, 1 mean; 1 nodes used
(ConsumingActor pid=4991, ip=172.31.64.186)
(ConsumingActor pid=4991, ip=172.31.64.186) == Pipeline Window 12 ==
(ConsumingActor pid=4991, ip=172.31.64.186) Stage 1 read->map_batches: 40/40 blocks executed in 3.21s
(ConsumingActor pid=4991, ip=172.31.64.186) * Remote wall time: 1.42s min, 1.67s max, 1.48s mean, 59.36s total
(ConsumingActor pid=4991, ip=172.31.64.186) * Remote cpu time: 1.42s min, 1.69s max, 1.47s mean, 58.64s total
(ConsumingActor pid=4991, ip=172.31.64.186) * Peak heap memory usage (MiB): 2341260000.0 min, 11783480000.0 max, 10524350600 mean
(ConsumingActor pid=4991, ip=172.31.64.186) * Output num rows: 104857 min, 104857 max, 104857 mean, 4194280 total
(ConsumingActor pid=4991, ip=172.31.64.186) * Output size bytes: 1074155212 min, 1074155212 max, 1074155212 mean, 42966208480 total
(ConsumingActor pid=4991, ip=172.31.64.186) * Tasks per node: 1 min, 3 max, 2 mean; 20 nodes used
(ConsumingActor pid=4991, ip=172.31.64.186)
(ConsumingActor pid=4991, ip=172.31.64.186) ##### Overall Pipeline Time Breakdown #####
(ConsumingActor pid=4991, ip=172.31.64.186) * Time stalled waiting for next dataset: 58.44ms min, 2.65s max, 1.4s mean, 16.78s total
(ConsumingActor pid=4991, ip=172.31.64.186)
Stage 0: : 6it [00:11, 1.85s/it]
(ConsumingActor pid=5027, ip=172.31.83.41) Time to read all data 27.82684059500002 seconds
(ConsumingActor pid=5027, ip=172.31.83.41) P50/P95/Max batch delay (s) 0.010789019500066388 0.012367538300043173 2.6918804770000406
(ConsumingActor pid=5027, ip=172.31.83.41) Num epochs read 2
(ConsumingActor pid=5027, ip=172.31.83.41) Num batches read 512
(ConsumingActor pid=5027, ip=172.31.83.41) Num bytes read 20480.0 MiB
(ConsumingActor pid=5027, ip=172.31.83.41) Mean throughput 735.98 MiB/s
Stage 0: : 6it [00:11, 1.85s/it]
(ConsumingActor pid=4989, ip=172.31.87.52) Time to read all data 27.81175992600015 seconds
(ConsumingActor pid=4989, ip=172.31.87.52) P50/P95/Max batch delay (s) 0.010926474000029884 0.012382415699937609 2.70238580299997
(ConsumingActor pid=4989, ip=172.31.87.52) Num epochs read 2
(ConsumingActor pid=4989, ip=172.31.87.52) Num batches read 512
(ConsumingActor pid=4989, ip=172.31.87.52) Num bytes read 20480.0 MiB
(ConsumingActor pid=4989, ip=172.31.87.52) Mean throughput 736.38 MiB/s
Stage 0: : 6it [00:11, 1.85s/it]
(ConsumingActor pid=4833, ip=172.31.75.180) Time to read all data 27.785708548000002 seconds
(ConsumingActor pid=4833, ip=172.31.75.180) P50/P95/Max batch delay (s) 0.0107583029999887 0.012675734249967261 2.677450821999855
(ConsumingActor pid=4833, ip=172.31.75.180) Num epochs read 2
(ConsumingActor pid=4833, ip=172.31.75.180) Num batches read 512
(ConsumingActor pid=4833, ip=172.31.75.180) Num bytes read 20480.0 MiB
(ConsumingActor pid=4833, ip=172.31.75.180) Mean throughput 737.07 MiB/s
Stage 0: : 6it [00:11, 1.84s/it]
(ConsumingActor pid=4990, ip=172.31.92.6) Time to read all data 27.766196264999962 seconds
(ConsumingActor pid=4990, ip=172.31.92.6) P50/P95/Max batch delay (s) 0.0108298800000739 0.012002565549903463 2.6987975099998494
(ConsumingActor pid=4990, ip=172.31.92.6) Num epochs read 2
(ConsumingActor pid=4990, ip=172.31.92.6) Num batches read 512
(ConsumingActor pid=4990, ip=172.31.92.6) Num bytes read 20480.0 MiB
(ConsumingActor pid=4990, ip=172.31.92.6) Mean throughput 737.59 MiB/s
(ConsumingActor pid=4990, ip=172.31.76.4) Time to read all data 27.9069332439999 seconds
(ConsumingActor pid=4990, ip=172.31.76.4) P50/P95/Max batch delay (s) 0.010788822999984404 0.013265584350006064 2.7097362750000684
(ConsumingActor pid=4990, ip=172.31.76.4) Num epochs read 2
(ConsumingActor pid=4990, ip=172.31.76.4) Num batches read 512
(ConsumingActor pid=4990, ip=172.31.76.4) Num bytes read 20480.0 MiB
(ConsumingActor pid=4990, ip=172.31.76.4) Mean throughput 733.87 MiB/s
Stage 0: : 6it [00:11, 1.85s/it]
(ConsumingActor pid=4996, ip=172.31.86.117) Time to read all data 27.822303984999962 seconds
(ConsumingActor pid=4996, ip=172.31.86.117) P50/P95/Max batch delay (s) 0.01081017749993407 0.012647427199999582 2.692527888999848
(ConsumingActor pid=4996, ip=172.31.86.117) Num epochs read 2
(ConsumingActor pid=4996, ip=172.31.86.117) Num batches read 512
(ConsumingActor pid=4996, ip=172.31.86.117) Num bytes read 20480.0 MiB
(ConsumingActor pid=4996, ip=172.31.86.117) Mean throughput 736.1 MiB/s
Stage 0: : 6it [00:11, 1.85s/it]
Stage 0: : 6it [00:11, 1.86s/it]
(ConsumingActor pid=4992, ip=172.31.92.196) Time to read all data 27.868366673999844 seconds
(ConsumingActor pid=4992, ip=172.31.92.196) P50/P95/Max batch delay (s) 0.010817954000003738 0.012326456599976153 2.70574514499981
(ConsumingActor pid=4992, ip=172.31.92.196) Num epochs read 2
(ConsumingActor pid=4992, ip=172.31.92.196) Num batches read 512
(ConsumingActor pid=4992, ip=172.31.92.196) Num bytes read 20480.0 MiB
(ConsumingActor pid=4992, ip=172.31.92.196) Mean throughput 734.88 MiB/s
(ConsumingActor pid=4991, ip=172.31.70.173) Time to read all data 27.824825947999898 seconds
(ConsumingActor pid=4991, ip=172.31.70.173) P50/P95/Max batch delay (s) 0.010983539999983805 0.012774399100101159 2.697584390000202
(ConsumingActor pid=4991, ip=172.31.70.173) Num epochs read 2
(ConsumingActor pid=4991, ip=172.31.70.173) Num batches read 512
(ConsumingActor pid=4991, ip=172.31.70.173) Num bytes read 20480.0 MiB
(ConsumingActor pid=4991, ip=172.31.70.173) Mean throughput 736.03 MiB/s
Stage 0: : 6it [00:11, 1.84s/it]
(ConsumingActor pid=4836, ip=172.31.85.195) Time to read all data 27.77285567099989 seconds
(ConsumingActor pid=4836, ip=172.31.85.195) P50/P95/Max batch delay (s) 0.010962891000076525 0.012171528600117653 2.696978439000077
(ConsumingActor pid=4836, ip=172.31.85.195) Num epochs read 2
(ConsumingActor pid=4836, ip=172.31.85.195) Num batches read 512
(ConsumingActor pid=4836, ip=172.31.85.195) Num bytes read 20480.0 MiB
(ConsumingActor pid=4836, ip=172.31.85.195) Mean throughput 737.41 MiB/s
Stage 0: : 6it [00:11, 1.87s/it]
(ConsumingActor pid=4989, ip=172.31.93.21) Time to read all data 27.916048542 seconds
(ConsumingActor pid=4989, ip=172.31.93.21) P50/P95/Max batch delay (s) 0.010727032499971756 0.012820698599989553 2.7113253499999246
(ConsumingActor pid=4989, ip=172.31.93.21) Num epochs read 2
(ConsumingActor pid=4989, ip=172.31.93.21) Num batches read 512
(ConsumingActor pid=4989, ip=172.31.93.21) Num bytes read 20480.0 MiB
(ConsumingActor pid=4989, ip=172.31.93.21) Mean throughput 733.63 MiB/s
success! total time 36.53651547431946
Stage 0: : 6it [00:11, 1.87s/it]
(ConsumingActor pid=4990, ip=172.31.78.120) Time to read all data 27.878111763000106 seconds
(ConsumingActor pid=4990, ip=172.31.78.120) P50/P95/Max batch delay (s) 0.010836783000058858 0.012224998900069294 2.7157833920000485
(ConsumingActor pid=4990, ip=172.31.78.120) Num epochs read 2
(ConsumingActor pid=4990, ip=172.31.78.120) Num batches read 512
(ConsumingActor pid=4990, ip=172.31.78.120) Num bytes read 20480.0 MiB
(ConsumingActor pid=4990, ip=172.31.78.120) Mean throughput 734.63 MiB/s
Stage 0: : 6it [00:11, 1.86s/it]
(ConsumingActor pid=4990, ip=172.31.90.196) Time to read all data 27.90950742600012 seconds
(ConsumingActor pid=4990, ip=172.31.90.196) P50/P95/Max batch delay (s) 0.010865903999956572 0.011479466149876315 2.706424117000097
(ConsumingActor pid=4990, ip=172.31.90.196) Num epochs read 2
(ConsumingActor pid=4990, ip=172.31.90.196) Num batches read 512
(ConsumingActor pid=4990, ip=172.31.90.196) Num bytes read 20480.0 MiB
(ConsumingActor pid=4990, ip=172.31.90.196) Mean throughput 733.8 MiB/s
Stage 0: : 6it [00:11, 1.87s/it]
(ConsumingActor pid=5176, ip=172.31.87.41) Time to read all data 27.900899885999934 seconds
(ConsumingActor pid=5176, ip=172.31.87.41) P50/P95/Max batch delay (s) 0.010885396000048786 0.011435299549964383 2.7134943419998763
(ConsumingActor pid=5176, ip=172.31.87.41) Num epochs read 2
(ConsumingActor pid=5176, ip=172.31.87.41) Num batches read 512
(ConsumingActor pid=5176, ip=172.31.87.41) Num bytes read 20480.0 MiB
(ConsumingActor pid=5176, ip=172.31.87.41) Mean throughput 734.03 MiB/s
Stage 0: : 6it [00:11, 1.86s/it]
(ConsumingActor pid=4990, ip=172.31.76.82) Time to read all data 27.832898259999865 seconds
(ConsumingActor pid=4990, ip=172.31.76.82) P50/P95/Max batch delay (s) 0.011000080499911746 0.012230427000008598 2.6991563259998657
(ConsumingActor pid=4990, ip=172.31.76.82) Num epochs read 2
(ConsumingActor pid=4990, ip=172.31.76.82) Num batches read 512
(ConsumingActor pid=4990, ip=172.31.76.82) Num bytes read 20480.0 MiB
(ConsumingActor pid=4990, ip=172.31.76.82) Mean throughput 735.82 MiB/s
2022-08-16 11:05:40,066 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 4990898e63cef9d1afb993c4b8e2c37d510d259f03000000 Worker ID: 8daf286b5e5bd634b1f7e13452824c9142d659f3952159d4596a0f2d Node ID: 27386e7f5da3cd79e688aa344bad444ad51d44234a6e808e9b520085 Worker IP address: 172.31.76.82 Worker port: 10009 Worker PID: 5024 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,116 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 15be6f7919882808e433c46172826a7330534bac03000000 Worker ID: d4f4d331de1e8821ebf6d275cd969db495af4b9784ecfb5a36e0c6e0 Node ID: f0ebf46d577cdb6400f3ef60ae33f15526a6374f5bfd5c8a86206705 Worker IP address: 172.31.76.4 Worker port: 10012 Worker PID: 8269 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,201 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 4d3607cfe3a131e120a9b2a8806b64e6e8c7e94503000000 Worker ID: 49b08d07156fd0b8cebfc65ef0af631a52044aa723595782486ea25e Node ID: d1bc79e0ed15894b47071c10063c758b58f019cb491b800f680be5d7 Worker IP address: 172.31.94.99 Worker port: 10011 Worker PID: 6678 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,244 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 8dc01df80a5ffd23f4048d2484a521a7a5d23d5603000000 Worker ID: d30b23f4d64d0943b832e3c97f34b3220ff0adf506ea78f647d4adbe Node ID: a0acbc690f520f1be4ccd3e3979984ebd58c20aab7d0921ff56a38e0 Worker IP address: 172.31.92.6 Worker port: 10009 Worker PID: 5024 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,245 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 395e7bb83c1a928fec1895b9b7a77dea85279b8a03000000 Worker ID: fa45747bab5bab4e4124a5e6dec10998b4fe7b6d17809475cc4df18c Node ID: 4d17ca0a8b4a6c902bb646eee81d881e12b3deb46acf78b7568b9adc Worker IP address: 172.31.78.120 Worker port: 10008 Worker PID: 5020 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,252 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 0252b5aecfd00646dcbac25d34c31acb8b0bac6a03000000 Worker ID: 16c806c325cb4d408cb1449bf6d4d3d989396e33b5135113cb6a7ee2 Node ID: 27386e7f5da3cd79e688aa344bad444ad51d44234a6e808e9b520085 Worker IP address: 172.31.76.82 Worker port: 10008 Worker PID: 5021 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,257 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 5f248c9b3aa99783ca87ab81b384edf6cc88f18103000000 Worker ID: 700a1334d8480af57030fc57f0c9616e685de54196b1bfcd24948b72 Node ID: a549d4c7da7839bd35e8209464650ba383c4fd3af00f9fdf38c1914e Worker IP address: 172.31.75.180 Worker port: 10009 Worker PID: 4892 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,258 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: b26937f9ee58abf9a2535519f040e8e0234e0db803000000 Worker ID: f8828db09c5ac01a51b6c2a7e0412b13016cb30534878911ad57e8fe Node ID: 11e42de41c46cad33bcd371f6d5a52d293e7d6c18e6cf08f370dc42e Worker IP address: 172.31.80.141 Worker port: 10013 Worker PID: 32867 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,261 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 3167e7405c944f09da7abb211e4e2b2563fea91b03000000 Worker ID: 013d9ba77a289f685fdeba77c4fac61b57a0d9b282114a7231bac58a Node ID: 8c5313fc3a17538ad6c1b2d5f45d662b4575d5b4872a10146e66a8fd Worker IP address: 172.31.79.152 Worker port: 10009 Worker PID: 5024 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,262 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: b4389606a6f8da22ed592d28b0f0bb28e14f19f103000000 Worker ID: 4d9473fee20fe1bd579837ecf65ed52612ab1d9bb9409ca57d3b102a Node ID: 8fd4c15e22db6b4f705f9439430c7086bbaf9ea463a97982ac03cbf0 Worker IP address: 172.31.93.21 Worker port: 10009 Worker PID: 5023 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,266 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 6ac0afa6174e333358a5c66e5722a1f05efd39ed03000000 Worker ID: 63a966dbbf6af0b3822a3cbbb0cd7626cd58e19b491a6105ec75c97f Node ID: 7afd191589b24464a931d964d70b4d699dec9b82f944b3c1d011f807 Worker IP address: 172.31.90.220 Worker port: 10009 Worker PID: 5032 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,267 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: d7ab4091e189c2a050c8f79b0a64ad609aa9265503000000 Worker ID: f6a441be7830d0ea1c85213e57b37394d59fde44c7a68d934440535d Node ID: 18e753cd5e03c0ba8a7728c3b547347eb14a24b56f4bc234aa9d2a9c Worker IP address: 172.31.75.48 Worker port: 10008 Worker PID: 5020 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,268 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 14448e3833de4f8458c706aa578dfe87a2fb938c03000000 Worker ID: 3f86a0231ff10bd23d9fb3c2a8a2f87d815354457999937f25a6afe5 Node ID: d3042c75f114781d77e5ebb9a96293a503cdcb982fed3cf851e0e1a1 Worker IP address: 172.31.92.196 Worker port: 10009 Worker PID: 5025 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,273 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 4a33fb1e9e197d7a516bcaad882b38845fc4bc6503000000 Worker ID: 0baa25c4b99caa491611d59d4cbf314b2052940d23c3bc59a058d032 Node ID: 7271ea71dc40f358d48778fe33384cd161761d5bb74f84ea2b1b14c4 Worker IP address: 172.31.90.196 Worker port: 10009 Worker PID: 5023 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,274 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 0535ecda4b066fa44ac311ea9d05a17510b7bdf303000000 Worker ID: 9acc54c3f2efb7552d6b5f66987107c1b2318a747578839ef44f68a7 Node ID: c9b39092914b337fe8dbc5296d949bf804b4669ac055de1fcb0da499 Worker IP address: 172.31.87.41 Worker port: 10010 Worker PID: 5208 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,277 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 4a7ca94e331436fed8911eeb43f9119b2b863a3703000000 Worker ID: a8dcda93b82cc7aa46ba515647cbbfb26abd66b87ffc22f3316ff98b Node ID: 8df66f131aa1bcad6dcbcd0a6887e11ef4b7b581696d15ed5be72a2b Worker IP address: 172.31.85.195 Worker port: 10008 Worker PID: 4867 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,277 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 9f248f36bac0868da94e4b1472051276463580e203000000 Worker ID: c5b96840085e268ea8f5f668fdfb52e5d1b4490c5b857134055ca997 Node ID: 4d17ca0a8b4a6c902bb646eee81d881e12b3deb46acf78b7568b9adc Worker IP address: 172.31.78.120 Worker port: 10009 Worker PID: 5023 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,277 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 8320bd2c9d9445f4d177d8d5e202b44c35987db303000000 Worker ID: 58bfd07dce7635b425272fabafe70832e90c6da89372a62cfb48fa44 Node ID: 7afd191589b24464a931d964d70b4d699dec9b82f944b3c1d011f807 Worker IP address: 172.31.90.220 Worker port: 10008 Worker PID: 5029 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,277 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 3148d898f96dd43e9a1aa789b795b4b28b03d1b303000000 Worker ID: 40a9a2f78c683b6838d8e3ae4c41e1d45c6144139bbccbed4b77c86c Node ID: d3042c75f114781d77e5ebb9a96293a503cdcb982fed3cf851e0e1a1 Worker IP address: 172.31.92.196 Worker port: 10008 Worker PID: 5022 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,280 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 5877546d6e35c3c59015623601bf0450ef4c4ef903000000 Worker ID: 4bfe8365401a0a6e279aac43a859735b049352793d96446c90a40959 Node ID: 18e753cd5e03c0ba8a7728c3b547347eb14a24b56f4bc234aa9d2a9c Worker IP address: 172.31.75.48 Worker port: 10009 Worker PID: 5023 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,281 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 9ebcd5e3c9068c3748744ff797a8f30f6840af5903000000 Worker ID: 76225b5f235001799c492d2c3beb18f24ea535804426cc1051adf070 Node ID: 71b298367b6af24f9fcaec3f1cd37d35c0e7de1a1aa7223d69499904 Worker IP address: 172.31.70.173 Worker port: 10008 Worker PID: 5021 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,281 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 99938dbd01cc1b2b63483fcb3036d7e7c3b2708703000000 Worker ID: a4f56646f07f33032b40a3c8c751b8018d9e0095cc95e75194c52046 Node ID: f0ebf46d577cdb6400f3ef60ae33f15526a6374f5bfd5c8a86206705 Worker IP address: 172.31.76.4 Worker port: 10009 Worker PID: 5023 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,289 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 3e3c1bde8666739e9ac15a93edfd1f13985a48ec03000000 Worker ID: 32aa4d991356d03e89a916e21d446035158bc7d44fa3ddb1c9efb5f2 Node ID: 8c5313fc3a17538ad6c1b2d5f45d662b4575d5b4872a10146e66a8fd Worker IP address: 172.31.79.152 Worker port: 10008 Worker PID: 5021 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,290 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: f58c7b620714bdf7ad9b321e83947c4e2c0bf7e403000000 Worker ID: 6017a5886fe2d647569acdccad30f9e025d78ef23bd71d85359a12ea Node ID: c9b39092914b337fe8dbc5296d949bf804b4669ac055de1fcb0da499 Worker IP address: 172.31.87.41 Worker port: 10009 Worker PID: 5207 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,291 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 2fcbed8a1e34c23f084345cc0892b3fe18e5b32103000000 Worker ID: f5a67bfb021477c16bfae85a1d34ed3781a379d32e706e7325fcdbb4 Node ID: 71b298367b6af24f9fcaec3f1cd37d35c0e7de1a1aa7223d69499904 Worker IP address: 172.31.70.173 Worker port: 10009 Worker PID: 5024 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,291 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 521792c4dc15bb1b4c63f223dd74086e728935a603000000 Worker ID: dcd5fb1b75cdbe362edb14d914bb823c95f022e37cdccac32756089a Node ID: 8fd4c15e22db6b4f705f9439430c7086bbaf9ea463a97982ac03cbf0 Worker IP address: 172.31.93.21 Worker port: 10008 Worker PID: 5020 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,292 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 82a009ae37958935bfdd895bcb0a9b6a7d7f478003000000 Worker ID: 109ff81407bcc07096c559cf0af11b69f82a22a30cf647fdb86e99c9 Node ID: a0acbc690f520f1be4ccd3e3979984ebd58c20aab7d0921ff56a38e0 Worker IP address: 172.31.92.6 Worker port: 10008 Worker PID: 5021 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,294 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 8587da5e9a90dbfc145a4bf259ad1faac199ac4303000000 Worker ID: 780e52e81eddb264b34db86001d198236a93053f9fe583e77f145010 Node ID: d1bc79e0ed15894b47071c10063c758b58f019cb491b800f680be5d7 Worker IP address: 172.31.94.99 Worker port: 10008 Worker PID: 5022 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,298 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 5cb65acffa9c4e8ea7b8033c0c6d8105b0977bc103000000 Worker ID: bff70146fb9ee67a72a8e07ba23cdc291485a9f9dda8e0226375a68a Node ID: 1cba18f3f6d0531b74552710505f2e87eccbb33889289a1dfd5c6208 Worker IP address: 172.31.86.117 Worker port: 10009 Worker PID: 5029 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,299 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: c5917806ada2d8652acd84bc655c3b74137c7b3003000000 Worker ID: 8cf007e3fadaa410db8caa8dc7fba8a173dfb01a81f8430971de9830 Node ID: 5efa10c6538b595a48ba491bb22de2b12552d22b91d37d246060ac35 Worker IP address: 172.31.64.186 Worker port: 10009 Worker PID: 5025 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,300 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 6bbe1d0f019e747d1ae0a6bae997af0090289fe203000000 Worker ID: 79fbc33668520a2281aa67403db34e184825b3db73c0d9ee22634adf Node ID: fdcef9b03f9ab4835f3c3f330e97495386178aae3514ef3156d03991 Worker IP address: 172.31.83.41 Worker port: 10009 Worker PID: 5057 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,301 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: b9243fee925856e1d3e7e2e77aa9ae732e9983fe03000000 Worker ID: b190bdcdb4ca2f4932c0c01e51e6fe019233cc2afa0005bd9fee73f2 Node ID: 16426d55cd9aeeac53b26670921fd1507acd3399c6bb192c628c0ea1 Worker IP address: 172.31.87.52 Worker port: 10008 Worker PID: 5019 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,302 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 462d6fc844d6a816ab36f950e0abe78641630d7103000000 Worker ID: 481f1741b305e1a9a45570e738364e8c99e59e480fd3db184cadf3df Node ID: 7271ea71dc40f358d48778fe33384cd161761d5bb74f84ea2b1b14c4 Worker IP address: 172.31.90.196 Worker port: 10008 Worker PID: 5020 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,304 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: c389ca595ace7fc9adfb1e19fb8f668ba93273fa03000000 Worker ID: 68a9a0602e8d4c40a1375d39c90110fdb0947e5e97b845364af9ce78 Node ID: a549d4c7da7839bd35e8209464650ba383c4fd3af00f9fdf38c1914e Worker IP address: 172.31.75.180 Worker port: 10010 Worker PID: 4895 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,304 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: c8176601a16a4428dda5a660836750394d41bb0103000000 Worker ID: 845308e272969f9eaae55948cb80b902aeee58aaee2b18629fd44ab9 Node ID: 1cba18f3f6d0531b74552710505f2e87eccbb33889289a1dfd5c6208 Worker IP address: 172.31.86.117 Worker port: 10008 Worker PID: 5026 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,305 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 03308e5d50741771dc7bd1d0506813014fac3d9903000000 Worker ID: e573a953d9cddd64409ab33afd5468a55c93f8f895cea85c033128ad Node ID: fdcef9b03f9ab4835f3c3f330e97495386178aae3514ef3156d03991 Worker IP address: 172.31.83.41 Worker port: 10010 Worker PID: 5060 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,306 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 808a9a25d193ee288dbdeaa068a6b0b31061194a03000000 Worker ID: c4add67dda495882e1db1ca2529eb0bc57b0654ff171530ca8f6ca24 Node ID: 5efa10c6538b595a48ba491bb22de2b12552d22b91d37d246060ac35 Worker IP address: 172.31.64.186 Worker port: 10008 Worker PID: 5022 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,306 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 57333c79c7b1398eb4bc372f90ed0649cf92cdf303000000 Worker ID: 1e2838d7fac2123d488b32bd5a1e57e09108d26e1923ca83cd26a88d Node ID: 16426d55cd9aeeac53b26670921fd1507acd3399c6bb192c628c0ea1 Worker IP address: 172.31.87.52 Worker port: 10009 Worker PID: 5022 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,313 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 2630d4d579f3cb96a0e40df6da6f00aa4c840a0403000000 Worker ID: f3f7a232e651cf0c1e361c817833dbb57f77143cf612c53f95780645 Node ID: 11e42de41c46cad33bcd371f6d5a52d293e7d6c18e6cf08f370dc42e Worker IP address: 172.31.80.141 Worker port: 10016 Worker PID: 32993 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2022-08-16 11:05:40,328 WARNING worker.py:1799 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: 21b2f87b83c4f7b5bb504c501b33a19f62411e8203000000 Worker ID: 8c4f7b1241421d12d4ac4e59b7a4c6a3e9d9643a4d97fe2682db6897 Node ID: 8df66f131aa1bcad6dcbcd0a6887e11ef4b7b581696d15ed5be72a2b Worker IP address: 172.31.85.195 Worker port: 10009 Worker PID: 4870 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
(base) ray@ip-172-31-80-141:~/workspace-project-rtoom_pipeline_data_ingest2/release/nightly_tests/dataset$