@nsivabalan
Created August 5, 2022 15:20
22/08/04 18:53:06 INFO HoodieLogFormatWriter: HoodieLogFile{pathStr='hdfs://hdfs-namenodes:8020/tmp/jenkins-infra-hudi/hudi/job-run/LongSpark2.4.7HudiTestsManualEKS_Siva/data/2022-08-04/1/MERGE_ON_READdeltastreamer-long-running-multi-partitions-metadata.yamltest-metadata-aggressive-clean-archival.properties/output/.hoodie/metadata/.hoodie/archived/.commits_.archive.1_1-0-1', fileLen=0} exists. Appending to existing file
22/08/04 18:53:06 INFO DFSClient: Exception in createBlockOutputStream
java.io.IOException: Got error, status message , ack with firstBadLink as 10.2.6.189:9866
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:142)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1359)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1184)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
22/08/04 18:53:06 WARN DFSClient: Error Recovery for block BP-276436981-10.2.8.134-1659344985993:blk_1073945833_228043 in pipeline DatanodeInfoWithStorage[10.2.5.85:9866,DS-28999f8d-068e-4a1e-9907-8db72bfc0343,DISK], DatanodeInfoWithStorage[10.2.5.61:9866,DS-2212dc19-5a8e-499b-8ebe-60026e4dc924,DISK], DatanodeInfoWithStorage[10.2.6.189:9866,DS-97b5a5a3-1bbc-4759-a40f-5c6df0546090,DISK]: bad datanode DatanodeInfoWithStorage[10.2.6.189:9866,DS-97b5a5a3-1bbc-4759-a40f-5c6df0546090,DISK]
...
22/08/04 18:55:23 INFO HoodieLogFormatWriter: HoodieLogFile{pathStr='hdfs://hdfs-namenodes:8020/tmp/jenkins-infra-hudi/hudi/job-run/LongSpark2.4.7HudiTestsManualEKS_Siva/data/2022-08-04/1/MERGE_ON_READdeltastreamer-long-running-multi-partitions-metadata.yamltest-metadata-aggressive-clean-archival.properties/output/.hoodie/metadata/.hoodie/archived/.commits_.archive.1_1-0-1', fileLen=0} exists. Appending to existing file
22/08/04 18:55:23 INFO DFSClient: Exception in createBlockOutputStream
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2282)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1343)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1184)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
22/08/04 18:55:23 WARN DFSClient: Error Recovery for block BP-276436981-10.2.8.134-1659344985993:blk_1073945833_228148 in pipeline DatanodeInfoWithStorage[10.2.6.250:9866,DS-9a919509-8328-482c-bdab-9d1257a953e1,DISK], DatanodeInfoWithStorage[10.2.5.61:9866,DS-2212dc19-5a8e-499b-8ebe-60026e4dc924,DISK], DatanodeInfoWithStorage[10.2.5.85:9866,DS-28999f8d-068e-4a1e-9907-8db72bfc0343,DISK]: bad datanode DatanodeInfoWithStorage[10.2.6.250:9866,DS-9a919509-8328-482c-bdab-9d1257a953e1,DISK]
...
22/08/04 18:58:04 WARN TaskSetManager: Lost task 4.1 in stage 1685.0 (TID 7386, 10.2.7.226, executor 2): FetchFailed(null, shuffleId=303, mapId=-1, reduceId=4, message=
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 303
at org.apache.spark.MapOutputTracker$$anonfun$convertMapStatuses$2.apply(MapOutputTracker.scala:882)
at org.apache.spark.MapOutputTracker$$anonfun$convertMapStatuses$2.apply(MapOutputTracker.scala:878)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at org.apache.spark.MapOutputTracker$.convertMapStatuses(MapOutputTracker.scala:878)
at org.apache.spark.MapOutputTrackerWorker.getMapSizesByExecutorId(MapOutputTracker.scala:691)
at org.apache.spark.shuffle.BlockStoreShuffleReader.read(BlockStoreShuffleReader.scala:49)
at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:105)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
...
22/08/04 21:47:05 WARN TaskSetManager: Lost task 2.0 in stage 7077.0 (TID 35061, 10.2.7.68, executor 22): java.io.IOException: Not a data file.
at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
at org.apache.avro.mapreduce.AvroRecordReaderBase.createAvroFileReader(AvroRecordReaderBase.java:183)
at org.apache.avro.mapreduce.AvroRecordReaderBase.initialize(AvroRecordReaderBase.java:94)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.liftedTree1$1(NewHadoopRDD.scala:199)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:196)
at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:151)
at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:70)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:359)
at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:357)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1182)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:357)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:308)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.EOFException
at org.apache.avro.io.BinaryDecoder$InputStreamByteSource.readRaw(BinaryDecoder.java:827)
at org.apache.avro.io.BinaryDecoder.doReadBytes(BinaryDecoder.java:349)
at org.apache.avro.io.BinaryDecoder.readFixed(BinaryDecoder.java:302)
at org.apache.avro.io.Decoder.readFixed(Decoder.java:150)
at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:100)
... 41 more
22/08/04 21:47:05 INFO BlockManagerInfo: Added rdd_13820_0 in memory on 10.2.7.68:44221 (size: 477.2 KB, free: 997.3 MB)