@ad1happy2go
Created May 28, 2024 10:53
[ec2-user@ip-10-0-78-189 ~]$ spark-3.3.4-bin-hadoop3/bin/spark-submit --master local \
> --jars ${JAR_PATH}/hadoop-aws-3.2.0.jar,aws-java-sdk-bundle-1.11.375.jar,"${JAR_PATH}/hudi-spark${SPARK_VERSION}-bundle_2.12-${HUDI_VERSION}.jar,${JAR_PATH}/hudi-datahub-sync-bundle-${HUDI_VERSION}.jar" \
> --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
> ${JAR_PATH}/hudi-utilities-slim-bundle_2.12-$HUDI_VERSION.jar \
> --target-base-path ${DATA_PATH}/stocks/data/target/${NOW} \
> --target-table stocks${NOW} \
> --table-type COPY_ON_WRITE \
> --base-file-format PARQUET \
> --props ${DATA_PATH}/stocks/configs/hoodie.properties \
> --source-class org.apache.hudi.utilities.sources.JsonDFSSource \
> --source-ordering-field ts \
> --payload-class org.apache.hudi.common.model.DefaultHoodieRecordPayload \
> --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider \
> --hoodie-conf hoodie.deltastreamer.schemaprovider.source.schema.file=${DATA_PATH}/stocks/data/schema.avsc \
> --hoodie-conf hoodie.deltastreamer.schemaprovider.target.schema.file=${DATA_PATH}/stocks/data/schema.avsc \
> --op UPSERT \
> --enable-sync \
> --sync-tool-classes org.apache.hudi.sync.datahub.DataHubSyncTool \
> --spark-master yarn \
> --hoodie-conf hoodie.deltastreamer.source.dfs.root=${DATA_PATH}/stocks/data/source \
> --hoodie-conf hoodie.datasource.write.recordkey.field=symbol \
> --hoodie-conf hoodie.datasource.write.partitionpath.field=date \
> --hoodie-conf hoodie.datasource.write.precombine.field=ts \
> --hoodie-conf hoodie.datasource.write.keygenerator.type=SIMPLE \
> --hoodie-conf hoodie.datasource.write.hive_style_partitioning=false \
> --hoodie-conf hoodie.metadata.enable=true \
> --hoodie-conf hoodie.meta.sync.datahub.emitter.server=${EMITTER_SERVER} \
> --hoodie-conf hoodie.datasource.hive_sync.database=rxusandbox
24/05/28 10:51:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/05/28 10:51:23 WARN DependencyUtils: Local jar /home/ec2-user/hadoop-aws-3.2.0.jar does not exist, skipping.
24/05/28 10:51:23 WARN DependencyUtils: Local jar /home/ec2-user/aws-java-sdk-bundle-1.11.375.jar does not exist, skipping.
24/05/28 10:51:23 WARN SchedulerConfGenerator: Job Scheduling Configs will not be in effect as spark.scheduler.mode is not set to FAIR at instantiation time. Continuing without scheduling configs
24/05/28 10:51:23 INFO SparkContext: Running Spark version 3.3.4
24/05/28 10:51:23 INFO ResourceUtils: ==============================================================
24/05/28 10:51:23 INFO ResourceUtils: No custom resources configured for spark.driver.
24/05/28 10:51:23 INFO ResourceUtils: ==============================================================
24/05/28 10:51:23 INFO SparkContext: Submitted application: streamer-stocks20240528t103509
24/05/28 10:51:23 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
24/05/28 10:51:23 INFO ResourceProfile: Limiting resource is cpu
24/05/28 10:51:23 INFO ResourceProfileManager: Added ResourceProfile id: 0
24/05/28 10:51:23 INFO SecurityManager: Changing view acls to: ec2-user
24/05/28 10:51:23 INFO SecurityManager: Changing modify acls to: ec2-user
24/05/28 10:51:23 INFO SecurityManager: Changing view acls groups to:
24/05/28 10:51:23 INFO SecurityManager: Changing modify acls groups to:
24/05/28 10:51:23 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ec2-user); groups with view permissions: Set(); users with modify permissions: Set(ec2-user); groups with modify permissions: Set()
24/05/28 10:51:23 INFO deprecation: mapred.output.compression.codec is deprecated. Instead, use mapreduce.output.fileoutputformat.compress.codec
24/05/28 10:51:23 INFO deprecation: mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
24/05/28 10:51:23 INFO deprecation: mapred.output.compression.type is deprecated. Instead, use mapreduce.output.fileoutputformat.compress.type
24/05/28 10:51:24 INFO Utils: Successfully started service 'sparkDriver' on port 34659.
24/05/28 10:51:24 INFO SparkEnv: Registering MapOutputTracker
24/05/28 10:51:24 INFO SparkEnv: Registering BlockManagerMaster
24/05/28 10:51:24 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
24/05/28 10:51:24 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
24/05/28 10:51:24 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
24/05/28 10:51:24 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-878c700c-a69a-471d-ac78-d59fee573dcf
24/05/28 10:51:24 INFO MemoryStore: MemoryStore started with capacity 366.3 MiB
24/05/28 10:51:24 INFO SparkEnv: Registering OutputCommitCoordinator
24/05/28 10:51:24 INFO Utils: Successfully started service 'SparkUI' on port 8090.
24/05/28 10:51:25 ERROR SparkContext: Failed to add file:/home/ec2-user/hadoop-aws-3.2.0.jar to Spark environment
java.io.FileNotFoundException: Jar /home/ec2-user/hadoop-aws-3.2.0.jar not found
	at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1959)
	at org.apache.spark.SparkContext.addJar(SparkContext.scala:2014)
	at org.apache.spark.SparkContext.$anonfun$new$12(SparkContext.scala:507)
	at org.apache.spark.SparkContext.$anonfun$new$12$adapted(SparkContext.scala:507)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:507)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
	at org.apache.hudi.utilities.UtilHelpers.buildSparkContext(UtilHelpers.java:359)
	at org.apache.hudi.utilities.streamer.HoodieStreamer.main(HoodieStreamer.java:599)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:984)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:191)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:214)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1072)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1081)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
24/05/28 10:51:25 ERROR SparkContext: Failed to add file:/home/ec2-user/aws-java-sdk-bundle-1.11.375.jar to Spark environment
java.io.FileNotFoundException: Jar /home/ec2-user/aws-java-sdk-bundle-1.11.375.jar not found
	at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1959)
	at org.apache.spark.SparkContext.addJar(SparkContext.scala:2014)
	at org.apache.spark.SparkContext.$anonfun$new$12(SparkContext.scala:507)
	at org.apache.spark.SparkContext.$anonfun$new$12$adapted(SparkContext.scala:507)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:507)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
	at org.apache.hudi.utilities.UtilHelpers.buildSparkContext(UtilHelpers.java:359)
	at org.apache.hudi.utilities.streamer.HoodieStreamer.main(HoodieStreamer.java:599)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:984)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:191)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:214)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1072)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1081)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
24/05/28 10:51:25 INFO SparkContext: Added JAR file:///home/ec2-user/hudi-spark3.3-bundle_2.12-0.15.0-rc1.jar at spark://ip-10-0-78-189.us-west-2.compute.internal:34659/jars/hudi-spark3.3-bundle_2.12-0.15.0-rc1.jar with timestamp 1716893483672
24/05/28 10:51:25 INFO SparkContext: Added JAR file:///home/ec2-user/hudi-datahub-sync-bundle-0.15.0-rc1.jar at spark://ip-10-0-78-189.us-west-2.compute.internal:34659/jars/hudi-datahub-sync-bundle-0.15.0-rc1.jar with timestamp 1716893483672
24/05/28 10:51:25 INFO SparkContext: Added JAR file:/home/ec2-user/hudi-utilities-slim-bundle_2.12-0.15.0-rc1.jar at spark://ip-10-0-78-189.us-west-2.compute.internal:34659/jars/hudi-utilities-slim-bundle_2.12-0.15.0-rc1.jar with timestamp 1716893483672
24/05/28 10:51:25 INFO Executor: Starting executor ID driver on host ip-10-0-78-189.us-west-2.compute.internal
24/05/28 10:51:25 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): ''
24/05/28 10:51:25 INFO Executor: Fetching spark://ip-10-0-78-189.us-west-2.compute.internal:34659/jars/hudi-spark3.3-bundle_2.12-0.15.0-rc1.jar with timestamp 1716893483672
24/05/28 10:51:25 INFO TransportClientFactory: Successfully created connection to ip-10-0-78-189.us-west-2.compute.internal/10.0.78.189:34659 after 86 ms (0 ms spent in bootstraps)
24/05/28 10:51:25 INFO Utils: Fetching spark://ip-10-0-78-189.us-west-2.compute.internal:34659/jars/hudi-spark3.3-bundle_2.12-0.15.0-rc1.jar to /tmp/spark-a3a2d54a-4756-473e-8387-a35f0036b3f5/userFiles-ce7c90d1-8277-4d4f-b483-585497ed4b3e/fetchFileTemp324488218261745797.tmp
24/05/28 10:51:26 INFO Executor: Adding file:/tmp/spark-a3a2d54a-4756-473e-8387-a35f0036b3f5/userFiles-ce7c90d1-8277-4d4f-b483-585497ed4b3e/hudi-spark3.3-bundle_2.12-0.15.0-rc1.jar to class loader
24/05/28 10:51:26 INFO Executor: Fetching spark://ip-10-0-78-189.us-west-2.compute.internal:34659/jars/hudi-datahub-sync-bundle-0.15.0-rc1.jar with timestamp 1716893483672
24/05/28 10:51:26 INFO Utils: Fetching spark://ip-10-0-78-189.us-west-2.compute.internal:34659/jars/hudi-datahub-sync-bundle-0.15.0-rc1.jar to /tmp/spark-a3a2d54a-4756-473e-8387-a35f0036b3f5/userFiles-ce7c90d1-8277-4d4f-b483-585497ed4b3e/fetchFileTemp528721711297356980.tmp
24/05/28 10:51:26 INFO Executor: Adding file:/tmp/spark-a3a2d54a-4756-473e-8387-a35f0036b3f5/userFiles-ce7c90d1-8277-4d4f-b483-585497ed4b3e/hudi-datahub-sync-bundle-0.15.0-rc1.jar to class loader
24/05/28 10:51:26 INFO Executor: Fetching spark://ip-10-0-78-189.us-west-2.compute.internal:34659/jars/hudi-utilities-slim-bundle_2.12-0.15.0-rc1.jar with timestamp 1716893483672
24/05/28 10:51:26 INFO Utils: Fetching spark://ip-10-0-78-189.us-west-2.compute.internal:34659/jars/hudi-utilities-slim-bundle_2.12-0.15.0-rc1.jar to /tmp/spark-a3a2d54a-4756-473e-8387-a35f0036b3f5/userFiles-ce7c90d1-8277-4d4f-b483-585497ed4b3e/fetchFileTemp4136785405861798744.tmp
24/05/28 10:51:26 INFO Executor: Adding file:/tmp/spark-a3a2d54a-4756-473e-8387-a35f0036b3f5/userFiles-ce7c90d1-8277-4d4f-b483-585497ed4b3e/hudi-utilities-slim-bundle_2.12-0.15.0-rc1.jar to class loader
24/05/28 10:51:26 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 40923.
24/05/28 10:51:26 INFO NettyBlockTransferService: Server created on ip-10-0-78-189.us-west-2.compute.internal:40923
24/05/28 10:51:26 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
24/05/28 10:51:26 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, ip-10-0-78-189.us-west-2.compute.internal, 40923, None)
24/05/28 10:51:26 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-0-78-189.us-west-2.compute.internal:40923 with 366.3 MiB RAM, BlockManagerId(driver, ip-10-0-78-189.us-west-2.compute.internal, 40923, None)
24/05/28 10:51:26 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, ip-10-0-78-189.us-west-2.compute.internal, 40923, None)
24/05/28 10:51:26 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, ip-10-0-78-189.us-west-2.compute.internal, 40923, None)
24/05/28 10:51:27 WARN DFSPropertiesConfiguration: Cannot find HUDI_CONF_DIR, please set it as the dir of hudi-defaults.conf
24/05/28 10:51:27 WARN DFSPropertiesConfiguration: Properties file file:/etc/hudi/conf/hudi-defaults.conf not found. Ignoring to load props file
24/05/28 10:51:27 INFO UtilHelpers: Adding overridden properties to file properties.
24/05/28 10:51:27 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir.
24/05/28 10:51:27 INFO SharedState: Warehouse path is 'file:/home/ec2-user/spark-warehouse'.
24/05/28 10:51:28 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:28 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:28 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:29 INFO HoodieStreamer: Creating Hudi Streamer with configs:
hoodie.archive.delete.parallelism: 9
hoodie.auto.adjust.lock.configs: true
hoodie.bloom.index.parallelism: 9
hoodie.bootstrap.parallelism: 9
hoodie.bulkinsert.shuffle.parallelism: 9
hoodie.cleaner.parallelism: 9
hoodie.datasource.hive_sync.database: rxusandbox
hoodie.datasource.write.hive_style_partitioning: false
hoodie.datasource.write.keygenerator.type: SIMPLE
hoodie.datasource.write.partitionpath.field: date
hoodie.datasource.write.precombine.field: ts
hoodie.datasource.write.reconcile.schema: false
hoodie.datasource.write.recordkey.field: symbol
hoodie.delete.shuffle.parallelism: 9
hoodie.deltastreamer.schemaprovider.source.schema.file: /home/ec2-user/testcases/stocks/data/schema.avsc
hoodie.deltastreamer.schemaprovider.target.schema.file: /home/ec2-user/testcases/stocks/data/schema.avsc
hoodie.deltastreamer.source.dfs.root: /home/ec2-user/testcases/stocks/data/source
hoodie.file.listing.parallelism: 9
hoodie.finalize.write.parallelism: 9
hoodie.global.simple.index.parallelism: 9
hoodie.insert.shuffle.parallelism: 9
hoodie.markers.delete.parallelism: 9
hoodie.meta.sync.datahub.emitter.server: http://localhost:8080
hoodie.metadata.enable: true
hoodie.metadata.insert.parallelism: 1
hoodie.parquet.compression.codec: snappy
hoodie.rollback.parallelism: 9
hoodie.simple.index.parallelism: 1
hoodie.upsert.shuffle.parallelism: 9
24/05/28 10:51:29 WARN ConfigUtils: The configuration key 'hoodie.deltastreamer.schemaprovider.source.schema.file' has been deprecated and may be removed in the future. Please use the new key 'hoodie.streamer.schemaprovider.source.schema.file' instead.
24/05/28 10:51:29 WARN ConfigUtils: The configuration key 'hoodie.deltastreamer.schemaprovider.target.schema.file' has been deprecated and may be removed in the future. Please use the new key 'hoodie.streamer.schemaprovider.target.schema.file' instead.
24/05/28 10:51:29 INFO HadoopFSUtils: Resolving file /home/ec2-user/testcases/stocks/data/schema.avsc to be a remote file.
24/05/28 10:51:29 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:29 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:29 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:29 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20240528105055899__commit__REQUESTED__20240528105058119]}
24/05/28 10:51:29 WARN ConfigUtils: The configuration key 'hoodie.deltastreamer.source.dfs.root' has been deprecated and may be removed in the future. Please use the new key 'hoodie.streamer.source.dfs.root' instead.
24/05/28 10:51:29 INFO DFSPathSelector: Using path selector org.apache.hudi.utilities.sources.helpers.DFSPathSelector
24/05/28 10:51:29 INFO HoodieIngestionService: Ingestion service starts running in run-once mode
24/05/28 10:51:29 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:29 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:29 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:29 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20240528105055899__commit__REQUESTED__20240528105058119]}
24/05/28 10:51:29 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:29 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:29 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:29 INFO StreamSync: Checkpoint to resume from : Optional.empty
24/05/28 10:51:29 WARN ConfigUtils: The configuration key 'hoodie.deltastreamer.source.dfs.root' has been deprecated and may be removed in the future. Please use the new key 'hoodie.streamer.source.dfs.root' instead.
24/05/28 10:51:29 INFO DFSPathSelector: Root path => /home/ec2-user/testcases/stocks/data/source source limit => 9223372036854775807
24/05/28 10:51:29 WARN ConfigUtils: The configuration key 'hoodie.deltastreamer.source.dfs.root' has been deprecated and may be removed in the future. Please use the new key 'hoodie.streamer.source.dfs.root' instead.
24/05/28 10:51:29 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 366.9 KiB, free 365.9 MiB)
24/05/28 10:51:30 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 34.3 KiB, free 365.9 MiB)
24/05/28 10:51:30 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 34.3 KiB, free: 366.3 MiB)
24/05/28 10:51:30 INFO SparkContext: Created broadcast 0 from textFile at JsonDFSSource.java:54
24/05/28 10:51:30 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20240528105055899__commit__REQUESTED__20240528105058119]}
24/05/28 10:51:30 INFO UtilHelpers: Adding overridden properties to file properties.
24/05/28 10:51:30 INFO StreamSync: Setting up new Hoodie Write Client
24/05/28 10:51:30 INFO EmbeddedTimelineService: Overriding hostIp to (ip-10-0-78-189.us-west-2.compute.internal) found in spark-conf. It was null
24/05/28 10:51:30 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:30 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:30 INFO log: Logging initialized @10134ms to org.apache.hudi.org.eclipse.jetty.util.log.Slf4jLog
24/05/28 10:51:30 INFO Javalin:
       __                      __ _            __ __
      / /____ _ _   __ ____ _ / /(_)____      / // /
 __  / // __ `/| | / // __ `// // // __ \    / // /_
/ /_/ // /_/ / | |/ // /_/ // // // / / /   /__  __/
\____/ \__,_/  |___/ \__,_//_//_//_/ /_/      /_/
          https://javalin.io/documentation
24/05/28 10:51:30 INFO Javalin: Starting Javalin ...
24/05/28 10:51:30 INFO Javalin: You are running Javalin 4.6.7 (released October 24, 2022. Your Javalin version is 582 days old. Consider checking for a newer version.).
24/05/28 10:51:30 INFO Server: jetty-9.4.53.v20231009; built: 2023-10-09T12:29:09.265Z; git: 27bde00a0b95a1d5bbee0eae7984f891d2d0f8c9; jvm 1.8.0_392-b08
24/05/28 10:51:31 INFO Server: Started @10735ms
24/05/28 10:51:31 INFO Javalin: Listening on http://localhost:40045/
24/05/28 10:51:31 INFO Javalin: Javalin started in 300ms \o/
24/05/28 10:51:31 INFO TimelineService: Starting Timeline server on port :40045
24/05/28 10:51:31 INFO EmbeddedTimelineService: Started embedded timeline server at ip-10-0-78-189.us-west-2.compute.internal:40045
24/05/28 10:51:31 INFO BaseHoodieClient: Timeline Server already running. Not restarting the service
24/05/28 10:51:31 INFO BaseHoodieClient: Timeline Server already running. Not restarting the service
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20240528105055899__commit__REQUESTED__20240528105058119]}
24/05/28 10:51:31 INFO CleanerUtils: Cleaned failed attempts if any
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20240528105055899__commit__REQUESTED__20240528105058119]}
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[00000000000000010__deltacommit__COMPLETED__20240528105102403]}
24/05/28 10:51:31 INFO AbstractTableFileSystemView: Took 2 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:31 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:31 INFO FileSystemViewManager: Creating View Manager with storage type REMOTE_FIRST.
24/05/28 10:51:31 INFO FileSystemViewManager: Creating remote first table view
24/05/28 10:51:31 INFO BaseHoodieWriteClient: Begin rollback of instant 20240528105055899
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20240528105055899__commit__REQUESTED__20240528105058119]}
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[00000000000000010__deltacommit__COMPLETED__20240528105102403]}
24/05/28 10:51:31 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:31 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:31 INFO FileSystemViewManager: Creating View Manager with storage type REMOTE_FIRST.
24/05/28 10:51:31 INFO FileSystemViewManager: Creating remote first table view
24/05/28 10:51:31 INFO BaseHoodieWriteClient: Scheduling Rollback at instant time : 20240528105131254 (exists in active timeline: true), with rollback plan: false
24/05/28 10:51:31 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20240528105131254__rollback__REQUESTED__20240528105131379]}
24/05/28 10:51:31 INFO BaseRollbackPlanActionExecutor: Requesting Rollback with instant time [==>20240528105131254__rollback__REQUESTED]
24/05/28 10:51:31 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20240528105131254__rollback__REQUESTED__20240528105131379]}
24/05/28 10:51:31 INFO HoodieActiveTimeline: Create new file for toInstant ?/home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/20240528105131254.rollback.inflight
24/05/28 10:51:31 INFO CopyOnWriteRollbackActionExecutor: Time(in ms) taken to finish rollback 0
24/05/28 10:51:31 INFO BaseRollbackActionExecutor: Rolled back inflight instant 20240528105055899
24/05/28 10:51:31 INFO BaseRollbackActionExecutor: Index rolled back for commits [==>20240528105055899__commit__REQUESTED__20240528105058119]
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[00000000000000010__deltacommit__COMPLETED__20240528105102403]}
24/05/28 10:51:31 INFO HoodieBackedTableMetadataWriter: Async metadata indexing disabled and following partitions already initialized: [files]
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[00000000000000010__deltacommit__COMPLETED__20240528105102403]}
24/05/28 10:51:31 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:31 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:31 INFO HoodieTableMetadataUtil: Found at 20240528105131254 from Rollback. #partitions_updated=0, #files_deleted=0, #files_appended=0
24/05/28 10:51:31 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:31 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:31 INFO HoodieTableMetadataUtil: Loading latest file slices for metadata table partition files
24/05/28 10:51:31 INFO AbstractTableFileSystemView: Building file system view for partition (files)
24/05/28 10:51:31 INFO BaseHoodieClient: Embedded Timeline Server is disabled. Not starting timeline service
24/05/28 10:51:31 INFO BaseHoodieClient: Embedded Timeline Server is disabled. Not starting timeline service
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[00000000000000010__deltacommit__COMPLETED__20240528105102403]}
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:31 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:31 INFO HoodieBackedTableMetadataWriter: New commit at 20240528105131254 being applied to MDT.
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[00000000000000010__deltacommit__COMPLETED__20240528105102403]}
24/05/28 10:51:31 INFO CleanerUtils: Cleaned failed attempts if any
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[00000000000000010__deltacommit__COMPLETED__20240528105102403]}
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:31 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:31 INFO BaseHoodieWriteClient: Generate a new instant time: 20240528105131254 action: deltacommit
24/05/28 10:51:31 INFO HoodieActiveTimeline: Creating a new instant [==>20240528105131254__deltacommit__REQUESTED]
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:31 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20240528105131254__deltacommit__REQUESTED__20240528105131587]}
24/05/28 10:51:31 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:31 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:31 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:31 INFO AsyncCleanerService: The HoodieWriteClient is not configured to auto & async clean. Async clean service will not start.
24/05/28 10:51:31 INFO AsyncArchiveService: The HoodieWriteClient is not configured to auto & async archive. Async archive service will not start.
24/05/28 10:51:31 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:31 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:31 INFO SparkContext: Starting job: countByKey at HoodieJavaPairRDD.java:105
24/05/28 10:51:31 INFO DAGScheduler: Job 0 finished: countByKey at HoodieJavaPairRDD.java:105, took 0.027799 s
24/05/28 10:51:31 INFO BaseSparkCommitActionExecutor: Source read and index timer 177
24/05/28 10:51:31 INFO UpsertPartitioner: AvgRecordSize => 1024
24/05/28 10:51:31 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:31 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:31 INFO UpsertPartitioner: Total Buckets: 0, bucketInfoMap size: 0, partitionPathToInsertBucketInfos size: 0, updateLocationToBucket size: 0
24/05/28 10:51:31 INFO HoodieActiveTimeline: Create new file for toInstant ?/home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/20240528105131254.deltacommit.inflight
24/05/28 10:51:32 INFO BaseSparkCommitActionExecutor: no validators configured.
24/05/28 10:51:32 INFO BaseCommitActionExecutor: Auto commit enabled: Committing 20240528105131254
24/05/28 10:51:32 INFO SparkContext: Starting job: collect at HoodieJavaRDD.java:177
24/05/28 10:51:32 INFO DAGScheduler: Job 1 finished: collect at HoodieJavaRDD.java:177, took 0.001519 s
24/05/28 10:51:32 INFO CommitUtils: Creating metadata for UPSERT_PREPPED numWriteStats:0 numReplaceFileIds:0
24/05/28 10:51:32 INFO BaseSparkCommitActionExecutor: Committing 20240528105131254, action Type deltacommit, operation Type UPSERT_PREPPED
24/05/28 10:51:32 INFO HoodieActiveTimeline: Marking instant complete [==>20240528105131254__deltacommit__INFLIGHT]
24/05/28 10:51:32 INFO HoodieActiveTimeline: Create new file for toInstant ?/home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/20240528105131254.deltacommit
24/05/28 10:51:32 INFO HoodieActiveTimeline: Completed [==>20240528105131254__deltacommit__INFLIGHT]
24/05/28 10:51:32 INFO BaseSparkCommitActionExecutor: Committed 20240528105131254
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:32 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:32 INFO HoodieBackedTableMetadataWriter: Ignoring rollback of instant 20240528105055899 at 20240528105131254. The commit to rollback is not found in MDT
24/05/28 10:51:32 INFO BaseRollbackActionExecutor: Deleting instant=[==>20240528105055899__commit__REQUESTED__20240528105058119]
24/05/28 10:51:32 INFO HoodieActiveTimeline: Deleting instant [==>20240528105055899__commit__REQUESTED__20240528105058119]
24/05/28 10:51:32 INFO HoodieActiveTimeline: Removed instant [==>20240528105055899__commit__REQUESTED__20240528105058119]
24/05/28 10:51:32 INFO BaseRollbackActionExecutor: Deleted pending commit [==>20240528105055899__commit__REQUESTED__20240528105058119]
24/05/28 10:51:32 INFO HoodieActiveTimeline: Create new file for toInstant ?/home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/20240528105131254.rollback
24/05/28 10:51:32 INFO BaseRollbackActionExecutor: Rollback of Commits [20240528105055899] is complete
24/05/28 10:51:32 INFO BlockManager: Removing RDD 10
24/05/28 10:51:32 INFO BlockManager: Removing RDD 18
24/05/28 10:51:32 INFO BaseHoodieWriteClient: Generate a new instant time: 20240528105129280 action: commit
24/05/28 10:51:32 INFO HoodieActiveTimeline: Creating a new instant [==>20240528105129280__commit__REQUESTED]
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:32 INFO HoodieBackedTableMetadataWriter: Async metadata indexing disabled and following partitions already initialized: [files]
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:32 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:32 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:32 INFO BaseHoodieClient: Embedded Timeline Server is disabled. Not starting timeline service
24/05/28 10:51:32 INFO BaseHoodieClient: Embedded Timeline Server is disabled. Not starting timeline service
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:32 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:32 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:32 INFO HoodieBackedTableMetadataWriter: Latest deltacommit time found is 20240528105131254, running clean operations.
24/05/28 10:51:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:32 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:32 INFO BaseHoodieWriteClient: Cleaner started
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:32 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:32 INFO BaseHoodieWriteClient: Scheduling cleaning at instant time: 20240528105131254002
24/05/28 10:51:32 INFO FileSystemViewManager: Creating InMemory based view for basePath /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata.
24/05/28 10:51:32 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:32 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:32 INFO CleanPlanner: No earliest commit to retain. No need to scan partitions !!
24/05/28 10:51:32 INFO CleanPlanActionExecutor: Nothing to clean here. It is already clean
24/05/28 10:51:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:32 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:32 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:32 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:32 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
24/05/28 10:51:33 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:33 INFO HoodieBackedTableMetadataWriter: Cannot compact metadata table as there are 1 inflight instants in data table before latest deltacommit in metadata table: 20240528105131254. Inflight instants in data table: [[==>20240528105129280__commit__REQUESTED__20240528105132863]]
24/05/28 10:51:33 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:33 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:33 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:33 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:33 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:33 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:33 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:33 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:33 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:33 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:33 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:33 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:33 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
24/05/28 10:51:33 INFO HoodieTimelineArchiver: No Instants to archive
24/05/28 10:51:33 INFO HoodieBackedTableMetadataWriter: All the table services operations on MDT completed successfully
24/05/28 10:51:33 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:33 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:33 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:33 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:33 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:33 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:33 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:33 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:33 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:33 INFO FileSystemViewManager: Creating View Manager with storage type REMOTE_FIRST.
24/05/28 10:51:33 INFO FileSystemViewManager: Creating remote first table view
24/05/28 10:51:33 INFO AsyncCleanerService: The HoodieWriteClient is not configured to auto & async clean. Async clean service will not start.
24/05/28 10:51:33 INFO AsyncArchiveService: The HoodieWriteClient is not configured to auto & async archive. Async archive service will not start.
24/05/28 10:51:33 INFO SparkContext: Starting job: collect at HoodieJavaRDD.java:177
24/05/28 10:51:33 INFO FileInputFormat: Total input files to process : 2
24/05/28 10:51:33 INFO DAGScheduler: Registering RDD 20 (mapToPair at HoodieJavaRDD.java:149) as input to shuffle 3
24/05/28 10:51:33 INFO DAGScheduler: Registering RDD 26 (distinct at HoodieJavaRDD.java:157) as input to shuffle 2
24/05/28 10:51:33 INFO DAGScheduler: Got job 2 (collect at HoodieJavaRDD.java:177) with 9 output partitions
24/05/28 10:51:33 INFO DAGScheduler: Final stage: ResultStage 2 (collect at HoodieJavaRDD.java:177)
24/05/28 10:51:33 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 1)
24/05/28 10:51:33 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 1)
24/05/28 10:51:33 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[20] at mapToPair at HoodieJavaRDD.java:149), which has no missing parents
24/05/28 10:51:33 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 33.6 KiB, free 365.9 MiB)
24/05/28 10:51:33 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 15.7 KiB, free 365.9 MiB)
24/05/28 10:51:33 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 15.7 KiB, free: 366.3 MiB)
24/05/28 10:51:33 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:33 INFO DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[20] at mapToPair at HoodieJavaRDD.java:149) (first 15 tasks are for partitions Vector(0, 1))
24/05/28 10:51:33 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks resource profile 0
24/05/28 10:51:33 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4519 bytes) taskResourceAssignments Map()
24/05/28 10:51:33 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
24/05/28 10:51:34 INFO HadoopRDD: Input split: file:/home/ec2-user/testcases/stocks/data/source/batch_2.json:0+363815
24/05/28 10:51:34 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1259 bytes result sent to driver
24/05/28 10:51:34 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 1, PROCESS_LOCAL, 4519 bytes) taskResourceAssignments Map()
24/05/28 10:51:34 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
24/05/28 10:51:34 INFO HadoopRDD: Input split: file:/home/ec2-user/testcases/stocks/data/source/batch_1.json:0+759994
24/05/28 10:51:34 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1308 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/2)
24/05/28 10:51:35 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 1216 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 263 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (2/2)
24/05/28 10:51:35 INFO DAGScheduler: ShuffleMapStage 0 (mapToPair at HoodieJavaRDD.java:149) finished in 1.740 s
24/05/28 10:51:35 INFO DAGScheduler: looking for newly runnable stages
24/05/28 10:51:35 INFO DAGScheduler: running: Set()
24/05/28 10:51:35 INFO DAGScheduler: waiting: Set(ShuffleMapStage 1, ResultStage 2)
24/05/28 10:51:35 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
24/05/28 10:51:35 INFO DAGScheduler: failed: Set()
24/05/28 10:51:35 INFO DAGScheduler: Submitting ShuffleMapStage 1 (MapPartitionsRDD[26] at distinct at HoodieJavaRDD.java:157), which has no missing parents
24/05/28 10:51:35 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 27.1 KiB, free 365.8 MiB)
24/05/28 10:51:35 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 12.8 KiB, free 365.8 MiB)
24/05/28 10:51:35 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 12.8 KiB, free: 366.2 MiB)
24/05/28 10:51:35 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:35 INFO DAGScheduler: Submitting 9 missing tasks from ShuffleMapStage 1 (MapPartitionsRDD[26] at distinct at HoodieJavaRDD.java:157) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8))
24/05/28 10:51:35 INFO TaskSchedulerImpl: Adding task set 1.0 with 9 tasks resource profile 0
24/05/28 10:51:35 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO Executor: Running task 0.0 in stage 1.0 (TID 2)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 2 (1186.0 B) non-empty blocks including 2 (1186.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 14 ms
24/05/28 10:51:35 INFO MemoryStore: Block rdd_22_0 stored as values in memory (estimated size 1342.0 B, free 365.8 MiB)
24/05/28 10:51:35 INFO BlockManagerInfo: Added rdd_22_0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 1342.0 B, free: 366.2 MiB)
24/05/28 10:51:35 INFO Executor: Finished task 0.0 in stage 1.0 (TID 2). 1431 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 3) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 1, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO Executor: Running task 1.0 in stage 1.0 (TID 3)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 2 (2006.0 B) non-empty blocks including 2 (2006.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
24/05/28 10:51:35 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 125 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/9)
24/05/28 10:51:35 INFO MemoryStore: Block rdd_22_1 stored as values in memory (estimated size 2.6 KiB, free 365.8 MiB)
24/05/28 10:51:35 INFO BlockManagerInfo: Added rdd_22_1 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 2.6 KiB, free: 366.2 MiB)
24/05/28 10:51:35 INFO Executor: Finished task 1.0 in stage 1.0 (TID 3). 1431 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 2.0 in stage 1.0 (TID 4) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 2, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO Executor: Running task 2.0 in stage 1.0 (TID 4)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 2 (1434.0 B) non-empty blocks including 2 (1434.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
24/05/28 10:51:35 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 38 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (2/9)
24/05/28 10:51:35 INFO MemoryStore: Block rdd_22_2 stored as values in memory (estimated size 1726.0 B, free 365.8 MiB)
24/05/28 10:51:35 INFO BlockManagerInfo: Added rdd_22_2 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 1726.0 B, free: 366.2 MiB)
24/05/28 10:51:35 INFO Executor: Finished task 2.0 in stage 1.0 (TID 4). 1431 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 3.0 in stage 1.0 (TID 5) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 3, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO TaskSetManager: Finished task 2.0 in stage 1.0 (TID 4) in 26 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (3/9)
24/05/28 10:51:35 INFO Executor: Running task 3.0 in stage 1.0 (TID 5)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 2 (1578.0 B) non-empty blocks including 2 (1578.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
24/05/28 10:51:35 INFO MemoryStore: Block rdd_22_3 stored as values in memory (estimated size 2.1 KiB, free 365.8 MiB)
24/05/28 10:51:35 INFO BlockManagerInfo: Added rdd_22_3 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 2.1 KiB, free: 366.2 MiB)
24/05/28 10:51:35 INFO Executor: Finished task 3.0 in stage 1.0 (TID 5). 1431 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 4.0 in stage 1.0 (TID 6) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 4, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO TaskSetManager: Finished task 3.0 in stage 1.0 (TID 5) in 54 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (4/9)
24/05/28 10:51:35 INFO Executor: Running task 4.0 in stage 1.0 (TID 6)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 2 (1910.0 B) non-empty blocks including 2 (1910.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
24/05/28 10:51:35 INFO MemoryStore: Block rdd_22_4 stored as values in memory (estimated size 2.4 KiB, free 365.8 MiB)
24/05/28 10:51:35 INFO BlockManagerInfo: Added rdd_22_4 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 2.4 KiB, free: 366.2 MiB)
24/05/28 10:51:35 INFO Executor: Finished task 4.0 in stage 1.0 (TID 6). 1431 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 5.0 in stage 1.0 (TID 7) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 5, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO Executor: Running task 5.0 in stage 1.0 (TID 7)
24/05/28 10:51:35 INFO TaskSetManager: Finished task 4.0 in stage 1.0 (TID 6) in 33 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (5/9)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 2 (2.3 KiB) non-empty blocks including 2 (2.3 KiB) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
24/05/28 10:51:35 INFO MemoryStore: Block rdd_22_5 stored as values in memory (estimated size 3.2 KiB, free 365.8 MiB)
24/05/28 10:51:35 INFO BlockManagerInfo: Added rdd_22_5 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 3.2 KiB, free: 366.2 MiB)
24/05/28 10:51:35 INFO Executor: Finished task 5.0 in stage 1.0 (TID 7). 1431 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 6.0 in stage 1.0 (TID 8) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 6, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO Executor: Running task 6.0 in stage 1.0 (TID 8)
24/05/28 10:51:35 INFO TaskSetManager: Finished task 5.0 in stage 1.0 (TID 7) in 35 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (6/9)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 2 (1657.0 B) non-empty blocks including 2 (1657.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
24/05/28 10:51:35 INFO MemoryStore: Block rdd_22_6 stored as values in memory (estimated size 2.1 KiB, free 365.8 MiB)
24/05/28 10:51:35 INFO BlockManagerInfo: Added rdd_22_6 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 2.1 KiB, free: 366.2 MiB)
24/05/28 10:51:35 INFO Executor: Finished task 6.0 in stage 1.0 (TID 8). 1431 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 7.0 in stage 1.0 (TID 9) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 7, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO Executor: Running task 7.0 in stage 1.0 (TID 9)
24/05/28 10:51:35 INFO TaskSetManager: Finished task 6.0 in stage 1.0 (TID 8) in 38 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (7/9)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 2 (1304.0 B) non-empty blocks including 2 (1304.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
24/05/28 10:51:35 INFO MemoryStore: Block rdd_22_7 stored as values in memory (estimated size 1543.0 B, free 365.8 MiB)
24/05/28 10:51:35 INFO BlockManagerInfo: Added rdd_22_7 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 1543.0 B, free: 366.2 MiB)
24/05/28 10:51:35 INFO Executor: Finished task 7.0 in stage 1.0 (TID 9). 1431 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 8.0 in stage 1.0 (TID 10) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 8, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO TaskSetManager: Finished task 7.0 in stage 1.0 (TID 9) in 36 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (8/9)
24/05/28 10:51:35 INFO Executor: Running task 8.0 in stage 1.0 (TID 10)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 2 (1434.0 B) non-empty blocks including 2 (1434.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
24/05/28 10:51:35 INFO MemoryStore: Block rdd_22_8 stored as values in memory (estimated size 1719.0 B, free 365.8 MiB)
24/05/28 10:51:35 INFO BlockManagerInfo: Added rdd_22_8 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 1719.0 B, free: 366.2 MiB)
24/05/28 10:51:35 INFO Executor: Finished task 8.0 in stage 1.0 (TID 10). 1431 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Finished task 8.0 in stage 1.0 (TID 10) in 30 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (9/9)
24/05/28 10:51:35 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
24/05/28 10:51:35 INFO DAGScheduler: ShuffleMapStage 1 (distinct at HoodieJavaRDD.java:157) finished in 0.404 s
24/05/28 10:51:35 INFO DAGScheduler: looking for newly runnable stages
24/05/28 10:51:35 INFO DAGScheduler: running: Set()
24/05/28 10:51:35 INFO DAGScheduler: waiting: Set(ResultStage 2)
24/05/28 10:51:35 INFO DAGScheduler: failed: Set()
24/05/28 10:51:35 INFO DAGScheduler: Submitting ResultStage 2 (MapPartitionsRDD[28] at distinct at HoodieJavaRDD.java:157), which has no missing parents
24/05/28 10:51:35 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 6.3 KiB, free 365.8 MiB)
24/05/28 10:51:35 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 3.5 KiB, free 365.8 MiB)
24/05/28 10:51:35 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 3.5 KiB, free: 366.2 MiB)
24/05/28 10:51:35 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:35 INFO DAGScheduler: Submitting 9 missing tasks from ResultStage 2 (MapPartitionsRDD[28] at distinct at HoodieJavaRDD.java:157) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8))
24/05/28 10:51:35 INFO TaskSchedulerImpl: Adding task set 2.0 with 9 tasks resource profile 0
24/05/28 10:51:35 INFO TaskSetManager: Starting task 6.0 in stage 2.0 (TID 11) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 6, NODE_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO Executor: Running task 6.0 in stage 2.0 (TID 11)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 9 (540.0 B) non-empty blocks including 9 (540.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
24/05/28 10:51:35 INFO Executor: Finished task 6.0 in stage 2.0 (TID 11). 1247 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Finished task 6.0 in stage 2.0 (TID 11) in 19 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/9)
24/05/28 10:51:35 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 12) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO Executor: Running task 0.0 in stage 2.0 (TID 12)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:35 INFO Executor: Finished task 0.0 in stage 2.0 (TID 12). 1235 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 1.0 in stage 2.0 (TID 13) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 1, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 12) in 8 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (2/9)
24/05/28 10:51:35 INFO Executor: Running task 1.0 in stage 2.0 (TID 13)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:35 INFO Executor: Finished task 1.0 in stage 2.0 (TID 13). 1235 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 2.0 in stage 2.0 (TID 14) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 2, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO TaskSetManager: Finished task 1.0 in stage 2.0 (TID 13) in 12 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (3/9)
24/05/28 10:51:35 INFO Executor: Running task 2.0 in stage 2.0 (TID 14)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:35 INFO Executor: Finished task 2.0 in stage 2.0 (TID 14). 1235 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 3.0 in stage 2.0 (TID 15) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 3, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO TaskSetManager: Finished task 2.0 in stage 2.0 (TID 14) in 9 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (4/9)
24/05/28 10:51:35 INFO Executor: Running task 3.0 in stage 2.0 (TID 15)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:35 INFO Executor: Finished task 3.0 in stage 2.0 (TID 15). 1235 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 4.0 in stage 2.0 (TID 16) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 4, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO TaskSetManager: Finished task 3.0 in stage 2.0 (TID 15) in 12 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (5/9)
24/05/28 10:51:35 INFO Executor: Running task 4.0 in stage 2.0 (TID 16)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
24/05/28 10:51:35 INFO Executor: Finished task 4.0 in stage 2.0 (TID 16). 1235 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 5.0 in stage 2.0 (TID 17) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 5, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO TaskSetManager: Finished task 4.0 in stage 2.0 (TID 16) in 17 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (6/9)
24/05/28 10:51:35 INFO Executor: Running task 5.0 in stage 2.0 (TID 17)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
24/05/28 10:51:35 INFO Executor: Finished task 5.0 in stage 2.0 (TID 17). 1235 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 7.0 in stage 2.0 (TID 18) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 7, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO TaskSetManager: Finished task 5.0 in stage 2.0 (TID 17) in 11 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (7/9)
24/05/28 10:51:35 INFO Executor: Running task 7.0 in stage 2.0 (TID 18)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:35 INFO Executor: Finished task 7.0 in stage 2.0 (TID 18). 1235 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Starting task 8.0 in stage 2.0 (TID 19) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 8, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO TaskSetManager: Finished task 7.0 in stage 2.0 (TID 18) in 8 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (8/9)
24/05/28 10:51:35 INFO Executor: Running task 8.0 in stage 2.0 (TID 19)
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:35 INFO Executor: Finished task 8.0 in stage 2.0 (TID 19). 1235 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Finished task 8.0 in stage 2.0 (TID 19) in 11 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (9/9)
24/05/28 10:51:35 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool
24/05/28 10:51:35 INFO DAGScheduler: ResultStage 2 (collect at HoodieJavaRDD.java:177) finished in 0.120 s
24/05/28 10:51:35 INFO DAGScheduler: Job 2 is finished. Cancelling potential speculative or zombie tasks for this job
24/05/28 10:51:35 INFO TaskSchedulerImpl: Killing all running tasks in stage 2: Stage finished
24/05/28 10:51:35 INFO DAGScheduler: Job 2 finished: collect at HoodieJavaRDD.java:177, took 2.483629 s
24/05/28 10:51:35 INFO SparkContext: Starting job: collect at HoodieSparkEngineContext.java:150
24/05/28 10:51:35 INFO DAGScheduler: Got job 3 (collect at HoodieSparkEngineContext.java:150) with 1 output partitions
24/05/28 10:51:35 INFO DAGScheduler: Final stage: ResultStage 3 (collect at HoodieSparkEngineContext.java:150)
24/05/28 10:51:35 INFO DAGScheduler: Parents of final stage: List()
24/05/28 10:51:35 INFO DAGScheduler: Missing parents: List()
24/05/28 10:51:35 INFO DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[30] at flatMap at HoodieSparkEngineContext.java:150), which has no missing parents
24/05/28 10:51:35 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 266.2 KiB, free 365.5 MiB)
24/05/28 10:51:35 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 95.3 KiB, free 365.4 MiB)
24/05/28 10:51:35 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 95.3 KiB, free: 366.1 MiB)
24/05/28 10:51:35 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:35 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 3 (MapPartitionsRDD[30] at flatMap at HoodieSparkEngineContext.java:150) (first 15 tasks are for partitions Vector(0))
24/05/28 10:51:35 INFO TaskSchedulerImpl: Adding task set 3.0 with 1 tasks resource profile 0
24/05/28 10:51:35 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 20) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4344 bytes) taskResourceAssignments Map()
24/05/28 10:51:35 INFO Executor: Running task 0.0 in stage 3.0 (TID 20)
24/05/28 10:51:35 INFO Executor: Finished task 0.0 in stage 3.0 (TID 20). 805 bytes result sent to driver
24/05/28 10:51:35 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 20) in 90 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/1)
24/05/28 10:51:35 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have all completed, from pool
24/05/28 10:51:35 INFO DAGScheduler: ResultStage 3 (collect at HoodieSparkEngineContext.java:150) finished in 0.137 s
24/05/28 10:51:35 INFO DAGScheduler: Job 3 is finished. Cancelling potential speculative or zombie tasks for this job
24/05/28 10:51:35 INFO TaskSchedulerImpl: Killing all running tasks in stage 3: Stage finished
24/05/28 10:51:35 INFO DAGScheduler: Job 3 finished: collect at HoodieSparkEngineContext.java:150, took 0.142853 s
24/05/28 10:51:35 INFO MapPartitionsRDD: Removing RDD 22 from persistence list
24/05/28 10:51:35 INFO AbstractTableFileSystemView: Took 1 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:35 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:35 INFO BlockManager: Removing RDD 22
24/05/28 10:51:36 INFO SparkContext: Starting job: countByKey at HoodieJavaPairRDD.java:105
24/05/28 10:51:36 INFO DAGScheduler: Registering RDD 23 (mapToPair at HoodieJavaRDD.java:149) as input to shuffle 4
24/05/28 10:51:36 INFO DAGScheduler: Registering RDD 33 (mapToPair at HoodieJavaRDD.java:149) as input to shuffle 5
24/05/28 10:51:36 INFO DAGScheduler: Registering RDD 41 (countByKey at HoodieJavaPairRDD.java:105) as input to shuffle 6
24/05/28 10:51:36 INFO DAGScheduler: Got job 4 (countByKey at HoodieJavaPairRDD.java:105) with 9 output partitions
24/05/28 10:51:36 INFO DAGScheduler: Final stage: ResultStage 8 (countByKey at HoodieJavaPairRDD.java:105)
24/05/28 10:51:36 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 7)
24/05/28 10:51:36 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 7)
24/05/28 10:51:36 INFO DAGScheduler: Submitting ShuffleMapStage 5 (MapPartitionsRDD[23] at mapToPair at HoodieJavaRDD.java:149), which has no missing parents
24/05/28 10:51:36 INFO MemoryStore: Block broadcast_5 stored as values in memory (estimated size 26.0 KiB, free 365.4 MiB)
24/05/28 10:51:36 INFO MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 12.4 KiB, free 365.4 MiB)
24/05/28 10:51:36 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 12.4 KiB, free: 366.1 MiB)
24/05/28 10:51:36 INFO SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:36 INFO DAGScheduler: Submitting 9 missing tasks from ShuffleMapStage 5 (MapPartitionsRDD[23] at mapToPair at HoodieJavaRDD.java:149) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8))
24/05/28 10:51:36 INFO TaskSchedulerImpl: Adding task set 5.0 with 9 tasks resource profile 0
24/05/28 10:51:36 INFO DAGScheduler: Submitting ShuffleMapStage 6 (MapPartitionsRDD[33] at mapToPair at HoodieJavaRDD.java:149), which has no missing parents
24/05/28 10:51:36 INFO TaskSetManager: Starting task 0.0 in stage 5.0 (TID 21) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 0.0 in stage 5.0 (TID 21)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 2 (1186.0 B) non-empty blocks including 2 (1186.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 2 ms
24/05/28 10:51:36 INFO MemoryStore: Block broadcast_6 stored as values in memory (estimated size 268.2 KiB, free 365.2 MiB)
24/05/28 10:51:36 INFO MemoryStore: Block broadcast_6_piece0 stored as bytes in memory (estimated size 96.7 KiB, free 365.1 MiB)
24/05/28 10:51:36 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 96.7 KiB, free: 366.0 MiB)
24/05/28 10:51:36 INFO SparkContext: Created broadcast 6 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:36 INFO DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 6 (MapPartitionsRDD[33] at mapToPair at HoodieJavaRDD.java:149) (first 15 tasks are for partitions Vector(0))
24/05/28 10:51:36 INFO TaskSchedulerImpl: Adding task set 6.0 with 1 tasks resource profile 0
24/05/28 10:51:36 INFO Executor: Finished task 0.0 in stage 5.0 (TID 21). 1474 bytes result sent to driver
24/05/28 10:51:36 INFO BlockManagerInfo: Removed broadcast_4_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 95.3 KiB, free: 366.1 MiB)
24/05/28 10:51:36 INFO TaskSetManager: Starting task 1.0 in stage 5.0 (TID 22) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 1, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO TaskSetManager: Finished task 0.0 in stage 5.0 (TID 21) in 133 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/9)
24/05/28 10:51:36 INFO Executor: Running task 1.0 in stage 5.0 (TID 22)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 2 (2006.0 B) non-empty blocks including 2 (2006.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 14 ms
24/05/28 10:51:36 INFO BlockManagerInfo: Removed broadcast_2_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 12.8 KiB, free: 366.1 MiB)
24/05/28 10:51:36 INFO Executor: Finished task 1.0 in stage 5.0 (TID 22). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 2.0 in stage 5.0 (TID 23) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 2, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 2.0 in stage 5.0 (TID 23)
24/05/28 10:51:36 INFO TaskSetManager: Finished task 1.0 in stage 5.0 (TID 22) in 45 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (2/9)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 2 (1434.0 B) non-empty blocks including 2 (1434.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO BlockManagerInfo: Removed broadcast_3_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 3.5 KiB, free: 366.1 MiB)
24/05/28 10:51:36 INFO Executor: Finished task 2.0 in stage 5.0 (TID 23). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 3.0 in stage 5.0 (TID 24) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 3, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO TaskSetManager: Finished task 2.0 in stage 5.0 (TID 23) in 27 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (3/9)
24/05/28 10:51:36 INFO Executor: Running task 3.0 in stage 5.0 (TID 24)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 2 (1578.0 B) non-empty blocks including 2 (1578.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO Executor: Finished task 3.0 in stage 5.0 (TID 24). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 4.0 in stage 5.0 (TID 25) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 4, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO TaskSetManager: Finished task 3.0 in stage 5.0 (TID 24) in 36 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (4/9)
24/05/28 10:51:36 INFO Executor: Running task 4.0 in stage 5.0 (TID 25)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 2 (1910.0 B) non-empty blocks including 2 (1910.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO Executor: Finished task 4.0 in stage 5.0 (TID 25). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 5.0 in stage 5.0 (TID 26) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 5, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 5.0 in stage 5.0 (TID 26)
24/05/28 10:51:36 INFO TaskSetManager: Finished task 4.0 in stage 5.0 (TID 25) in 22 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (5/9)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 2 (2.3 KiB) non-empty blocks including 2 (2.3 KiB) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO Executor: Finished task 5.0 in stage 5.0 (TID 26). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 6.0 in stage 5.0 (TID 27) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 6, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 6.0 in stage 5.0 (TID 27)
24/05/28 10:51:36 INFO TaskSetManager: Finished task 5.0 in stage 5.0 (TID 26) in 22 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (6/9)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 2 (1657.0 B) non-empty blocks including 2 (1657.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO Executor: Finished task 6.0 in stage 5.0 (TID 27). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 7.0 in stage 5.0 (TID 28) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 7, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO TaskSetManager: Finished task 6.0 in stage 5.0 (TID 27) in 18 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (7/9)
24/05/28 10:51:36 INFO Executor: Running task 7.0 in stage 5.0 (TID 28)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 2 (1304.0 B) non-empty blocks including 2 (1304.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO Executor: Finished task 7.0 in stage 5.0 (TID 28). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 8.0 in stage 5.0 (TID 29) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 8, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO TaskSetManager: Finished task 7.0 in stage 5.0 (TID 28) in 18 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (8/9)
24/05/28 10:51:36 INFO Executor: Running task 8.0 in stage 5.0 (TID 29)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 2 (1434.0 B) non-empty blocks including 2 (1434.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
24/05/28 10:51:36 INFO Executor: Finished task 8.0 in stage 5.0 (TID 29). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 0.0 in stage 6.0 (TID 30) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4321 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 0.0 in stage 6.0 (TID 30)
24/05/28 10:51:36 INFO TaskSetManager: Finished task 8.0 in stage 5.0 (TID 29) in 18 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (9/9)
24/05/28 10:51:36 INFO TaskSchedulerImpl: Removed TaskSet 5.0, whose tasks have all completed, from pool
24/05/28 10:51:36 INFO DAGScheduler: ShuffleMapStage 5 (mapToPair at HoodieJavaRDD.java:149) finished in 0.350 s
24/05/28 10:51:36 INFO DAGScheduler: looking for newly runnable stages
24/05/28 10:51:36 INFO DAGScheduler: running: Set(ShuffleMapStage 6)
24/05/28 10:51:36 INFO DAGScheduler: waiting: Set(ShuffleMapStage 7, ResultStage 8)
24/05/28 10:51:36 INFO DAGScheduler: failed: Set()
24/05/28 10:51:36 INFO Executor: Finished task 0.0 in stage 6.0 (TID 30). 872 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Finished task 0.0 in stage 6.0 (TID 30) in 28 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/1)
24/05/28 10:51:36 INFO TaskSchedulerImpl: Removed TaskSet 6.0, whose tasks have all completed, from pool
24/05/28 10:51:36 INFO DAGScheduler: ShuffleMapStage 6 (mapToPair at HoodieJavaRDD.java:149) finished in 0.360 s
24/05/28 10:51:36 INFO DAGScheduler: looking for newly runnable stages
24/05/28 10:51:36 INFO DAGScheduler: running: Set()
24/05/28 10:51:36 INFO DAGScheduler: waiting: Set(ShuffleMapStage 7, ResultStage 8)
24/05/28 10:51:36 INFO DAGScheduler: failed: Set()
24/05/28 10:51:36 INFO DAGScheduler: Submitting ShuffleMapStage 7 (MapPartitionsRDD[41] at countByKey at HoodieJavaPairRDD.java:105), which has no missing parents
24/05/28 10:51:36 INFO MemoryStore: Block broadcast_7 stored as values in memory (estimated size 9.9 KiB, free 365.5 MiB)
24/05/28 10:51:36 INFO MemoryStore: Block broadcast_7_piece0 stored as bytes in memory (estimated size 5.1 KiB, free 365.5 MiB)
24/05/28 10:51:36 INFO BlockManagerInfo: Added broadcast_7_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 5.1 KiB, free: 366.1 MiB)
24/05/28 10:51:36 INFO SparkContext: Created broadcast 7 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:36 INFO DAGScheduler: Submitting 9 missing tasks from ShuffleMapStage 7 (MapPartitionsRDD[41] at countByKey at HoodieJavaPairRDD.java:105) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8))
24/05/28 10:51:36 INFO TaskSchedulerImpl: Adding task set 7.0 with 9 tasks resource profile 0
24/05/28 10:51:36 INFO TaskSetManager: Starting task 0.0 in stage 7.0 (TID 31) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 0.0 in stage 7.0 (TID 31)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 1 (593.0 B) non-empty blocks including 1 (593.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO MemoryStore: Block rdd_39_0 stored as values in memory (estimated size 1342.0 B, free 365.5 MiB)
24/05/28 10:51:36 INFO BlockManagerInfo: Added rdd_39_0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 1342.0 B, free: 366.1 MiB)
24/05/28 10:51:36 INFO Executor: Finished task 0.0 in stage 7.0 (TID 31). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 1.0 in stage 7.0 (TID 32) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 1, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO TaskSetManager: Finished task 0.0 in stage 7.0 (TID 31) in 36 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/9)
24/05/28 10:51:36 INFO Executor: Running task 1.0 in stage 7.0 (TID 32)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 1 (955.0 B) non-empty blocks including 1 (955.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO MemoryStore: Block rdd_39_1 stored as values in memory (estimated size 2.6 KiB, free 365.4 MiB)
24/05/28 10:51:36 INFO BlockManagerInfo: Added rdd_39_1 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 2.6 KiB, free: 366.1 MiB)
24/05/28 10:51:36 INFO Executor: Finished task 1.0 in stage 7.0 (TID 32). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 2.0 in stage 7.0 (TID 33) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 2, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 2.0 in stage 7.0 (TID 33)
24/05/28 10:51:36 INFO TaskSetManager: Finished task 1.0 in stage 7.0 (TID 32) in 23 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (2/9)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 1 (717.0 B) non-empty blocks including 1 (717.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO MemoryStore: Block rdd_39_2 stored as values in memory (estimated size 1726.0 B, free 365.4 MiB)
24/05/28 10:51:36 INFO BlockManagerInfo: Added rdd_39_2 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 1726.0 B, free: 366.1 MiB)
24/05/28 10:51:36 INFO Executor: Finished task 2.0 in stage 7.0 (TID 33). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 3.0 in stage 7.0 (TID 34) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 3, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 3.0 in stage 7.0 (TID 34)
24/05/28 10:51:36 INFO TaskSetManager: Finished task 2.0 in stage 7.0 (TID 33) in 36 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (3/9)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 1 (789.0 B) non-empty blocks including 1 (789.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 9 ms
24/05/28 10:51:36 INFO MemoryStore: Block rdd_39_3 stored as values in memory (estimated size 2.1 KiB, free 365.4 MiB)
24/05/28 10:51:36 INFO BlockManagerInfo: Added rdd_39_3 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 2.1 KiB, free: 366.1 MiB)
24/05/28 10:51:36 INFO Executor: Finished task 3.0 in stage 7.0 (TID 34). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 4.0 in stage 7.0 (TID 35) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 4, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO TaskSetManager: Finished task 3.0 in stage 7.0 (TID 34) in 40 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (4/9)
24/05/28 10:51:36 INFO Executor: Running task 4.0 in stage 7.0 (TID 35)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 1 (955.0 B) non-empty blocks including 1 (955.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO MemoryStore: Block rdd_39_4 stored as values in memory (estimated size 2.4 KiB, free 365.4 MiB)
24/05/28 10:51:36 INFO BlockManagerInfo: Added rdd_39_4 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 2.4 KiB, free: 366.1 MiB)
24/05/28 10:51:36 INFO Executor: Finished task 4.0 in stage 7.0 (TID 35). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 5.0 in stage 7.0 (TID 36) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 5, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 5.0 in stage 7.0 (TID 36)
24/05/28 10:51:36 INFO TaskSetManager: Finished task 4.0 in stage 7.0 (TID 35) in 34 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (5/9)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 1 (1271.0 B) non-empty blocks including 1 (1271.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO MemoryStore: Block rdd_39_5 stored as values in memory (estimated size 3.2 KiB, free 365.4 MiB)
24/05/28 10:51:36 INFO BlockManagerInfo: Added rdd_39_5 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 3.2 KiB, free: 366.1 MiB)
24/05/28 10:51:36 INFO Executor: Finished task 5.0 in stage 7.0 (TID 36). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 6.0 in stage 7.0 (TID 37) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 6, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO TaskSetManager: Finished task 5.0 in stage 7.0 (TID 36) in 30 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (6/9)
24/05/28 10:51:36 INFO Executor: Running task 6.0 in stage 7.0 (TID 37)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 1 (868.0 B) non-empty blocks including 1 (868.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 3 ms
24/05/28 10:51:36 INFO MemoryStore: Block rdd_39_6 stored as values in memory (estimated size 2.1 KiB, free 365.4 MiB)
24/05/28 10:51:36 INFO BlockManagerInfo: Added rdd_39_6 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 2.1 KiB, free: 366.1 MiB)
24/05/28 10:51:36 INFO Executor: Finished task 6.0 in stage 7.0 (TID 37). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 7.0 in stage 7.0 (TID 38) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 7, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO TaskSetManager: Finished task 6.0 in stage 7.0 (TID 37) in 23 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (7/9)
24/05/28 10:51:36 INFO Executor: Running task 7.0 in stage 7.0 (TID 38)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 1 (652.0 B) non-empty blocks including 1 (652.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO MemoryStore: Block rdd_39_7 stored as values in memory (estimated size 1543.0 B, free 365.4 MiB)
24/05/28 10:51:36 INFO BlockManagerInfo: Added rdd_39_7 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 1543.0 B, free: 366.1 MiB)
24/05/28 10:51:36 INFO Executor: Finished task 7.0 in stage 7.0 (TID 38). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 8.0 in stage 7.0 (TID 39) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 8, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO TaskSetManager: Finished task 7.0 in stage 7.0 (TID 38) in 33 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (8/9)
24/05/28 10:51:36 INFO Executor: Running task 8.0 in stage 7.0 (TID 39)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 1 (717.0 B) non-empty blocks including 1 (717.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO MemoryStore: Block rdd_39_8 stored as values in memory (estimated size 1719.0 B, free 365.4 MiB)
24/05/28 10:51:36 INFO BlockManagerInfo: Added rdd_39_8 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 1719.0 B, free: 366.1 MiB)
24/05/28 10:51:36 INFO Executor: Finished task 8.0 in stage 7.0 (TID 39). 1431 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Finished task 8.0 in stage 7.0 (TID 39) in 24 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (9/9)
24/05/28 10:51:36 INFO TaskSchedulerImpl: Removed TaskSet 7.0, whose tasks have all completed, from pool
24/05/28 10:51:36 INFO DAGScheduler: ShuffleMapStage 7 (countByKey at HoodieJavaPairRDD.java:105) finished in 0.286 s
24/05/28 10:51:36 INFO DAGScheduler: looking for newly runnable stages
24/05/28 10:51:36 INFO DAGScheduler: running: Set()
24/05/28 10:51:36 INFO DAGScheduler: waiting: Set(ResultStage 8)
24/05/28 10:51:36 INFO DAGScheduler: failed: Set()
24/05/28 10:51:36 INFO DAGScheduler: Submitting ResultStage 8 (ShuffledRDD[42] at countByKey at HoodieJavaPairRDD.java:105), which has no missing parents
24/05/28 10:51:36 INFO MemoryStore: Block broadcast_8 stored as values in memory (estimated size 5.5 KiB, free 365.4 MiB)
24/05/28 10:51:36 INFO MemoryStore: Block broadcast_8_piece0 stored as bytes in memory (estimated size 3.2 KiB, free 365.4 MiB)
24/05/28 10:51:36 INFO BlockManagerInfo: Added broadcast_8_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 3.2 KiB, free: 366.1 MiB)
24/05/28 10:51:36 INFO SparkContext: Created broadcast 8 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:36 INFO DAGScheduler: Submitting 9 missing tasks from ResultStage 8 (ShuffledRDD[42] at countByKey at HoodieJavaPairRDD.java:105) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8))
24/05/28 10:51:36 INFO TaskSchedulerImpl: Adding task set 8.0 with 9 tasks resource profile 0
24/05/28 10:51:36 INFO TaskSetManager: Starting task 6.0 in stage 8.0 (TID 40) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 6, NODE_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 6.0 in stage 8.0 (TID 40)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 9 (873.0 B) non-empty blocks including 9 (873.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 7 ms
24/05/28 10:51:36 INFO Executor: Finished task 6.0 in stage 8.0 (TID 40). 1292 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 0.0 in stage 8.0 (TID 41) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 0.0 in stage 8.0 (TID 41)
24/05/28 10:51:36 INFO TaskSetManager: Finished task 6.0 in stage 8.0 (TID 40) in 28 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/9)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO Executor: Finished task 0.0 in stage 8.0 (TID 41). 1235 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 1.0 in stage 8.0 (TID 42) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 1, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO TaskSetManager: Finished task 0.0 in stage 8.0 (TID 41) in 7 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (2/9)
24/05/28 10:51:36 INFO Executor: Running task 1.0 in stage 8.0 (TID 42)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO Executor: Finished task 1.0 in stage 8.0 (TID 42). 1235 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 2.0 in stage 8.0 (TID 43) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 2, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 2.0 in stage 8.0 (TID 43)
24/05/28 10:51:36 INFO TaskSetManager: Finished task 1.0 in stage 8.0 (TID 42) in 6 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (3/9)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO Executor: Finished task 2.0 in stage 8.0 (TID 43). 1235 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 3.0 in stage 8.0 (TID 44) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 3, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO TaskSetManager: Finished task 2.0 in stage 8.0 (TID 43) in 5 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (4/9)
24/05/28 10:51:36 INFO Executor: Running task 3.0 in stage 8.0 (TID 44)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO Executor: Finished task 3.0 in stage 8.0 (TID 44). 1235 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 4.0 in stage 8.0 (TID 45) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 4, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 4.0 in stage 8.0 (TID 45)
24/05/28 10:51:36 INFO TaskSetManager: Finished task 3.0 in stage 8.0 (TID 44) in 6 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (5/9)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO Executor: Finished task 4.0 in stage 8.0 (TID 45). 1235 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 5.0 in stage 8.0 (TID 46) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 5, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 5.0 in stage 8.0 (TID 46)
24/05/28 10:51:36 INFO TaskSetManager: Finished task 4.0 in stage 8.0 (TID 45) in 6 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (6/9)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO Executor: Finished task 5.0 in stage 8.0 (TID 46). 1235 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 7.0 in stage 8.0 (TID 47) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 7, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO TaskSetManager: Finished task 5.0 in stage 8.0 (TID 46) in 5 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (7/9)
24/05/28 10:51:36 INFO Executor: Running task 7.0 in stage 8.0 (TID 47)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:36 INFO Executor: Finished task 7.0 in stage 8.0 (TID 47). 1235 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Starting task 8.0 in stage 8.0 (TID 48) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 8, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO TaskSetManager: Finished task 7.0 in stage 8.0 (TID 47) in 5 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (8/9)
24/05/28 10:51:36 INFO Executor: Running task 8.0 in stage 8.0 (TID 48)
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:36 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 2 ms
24/05/28 10:51:36 INFO Executor: Finished task 8.0 in stage 8.0 (TID 48). 1235 bytes result sent to driver
24/05/28 10:51:36 INFO TaskSetManager: Finished task 8.0 in stage 8.0 (TID 48) in 14 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (9/9)
24/05/28 10:51:36 INFO TaskSchedulerImpl: Removed TaskSet 8.0, whose tasks have all completed, from pool
24/05/28 10:51:36 INFO DAGScheduler: ResultStage 8 (countByKey at HoodieJavaPairRDD.java:105) finished in 0.090 s
24/05/28 10:51:36 INFO DAGScheduler: Job 4 is finished. Cancelling potential speculative or zombie tasks for this job
24/05/28 10:51:36 INFO TaskSchedulerImpl: Killing all running tasks in stage 8: Stage finished
24/05/28 10:51:36 INFO DAGScheduler: Job 4 finished: countByKey at HoodieJavaPairRDD.java:105, took 0.785811 s
24/05/28 10:51:36 INFO BaseSparkCommitActionExecutor: Source read and index timer 843
24/05/28 10:51:36 INFO UpsertPartitioner: AvgRecordSize => 1024
24/05/28 10:51:36 INFO SparkContext: Starting job: collectAsMap at UpsertPartitioner.java:285
24/05/28 10:51:36 INFO DAGScheduler: Got job 5 (collectAsMap at UpsertPartitioner.java:285) with 1 output partitions
24/05/28 10:51:36 INFO DAGScheduler: Final stage: ResultStage 9 (collectAsMap at UpsertPartitioner.java:285)
24/05/28 10:51:36 INFO DAGScheduler: Parents of final stage: List()
24/05/28 10:51:36 INFO DAGScheduler: Missing parents: List()
24/05/28 10:51:36 INFO DAGScheduler: Submitting ResultStage 9 (MapPartitionsRDD[44] at mapToPair at UpsertPartitioner.java:284), which has no missing parents
24/05/28 10:51:36 INFO MemoryStore: Block broadcast_9 stored as values in memory (estimated size 267.0 KiB, free 365.2 MiB)
24/05/28 10:51:36 INFO MemoryStore: Block broadcast_9_piece0 stored as bytes in memory (estimated size 95.6 KiB, free 365.1 MiB)
24/05/28 10:51:36 INFO BlockManagerInfo: Added broadcast_9_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 95.6 KiB, free: 366.0 MiB)
24/05/28 10:51:36 INFO SparkContext: Created broadcast 9 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:36 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 9 (MapPartitionsRDD[44] at mapToPair at UpsertPartitioner.java:284) (first 15 tasks are for partitions Vector(0))
24/05/28 10:51:36 INFO TaskSchedulerImpl: Adding task set 9.0 with 1 tasks resource profile 0
24/05/28 10:51:36 INFO TaskSetManager: Starting task 0.0 in stage 9.0 (TID 49) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4344 bytes) taskResourceAssignments Map()
24/05/28 10:51:36 INFO Executor: Running task 0.0 in stage 9.0 (TID 49)
24/05/28 10:51:37 INFO Executor: Finished task 0.0 in stage 9.0 (TID 49). 799 bytes result sent to driver
24/05/28 10:51:37 INFO TaskSetManager: Finished task 0.0 in stage 9.0 (TID 49) in 33 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/1)
24/05/28 10:51:37 INFO TaskSchedulerImpl: Removed TaskSet 9.0, whose tasks have all completed, from pool
24/05/28 10:51:37 INFO DAGScheduler: ResultStage 9 (collectAsMap at UpsertPartitioner.java:285) finished in 0.082 s
24/05/28 10:51:37 INFO DAGScheduler: Job 5 is finished. Cancelling potential speculative or zombie tasks for this job
24/05/28 10:51:37 INFO TaskSchedulerImpl: Killing all running tasks in stage 9: Stage finished
24/05/28 10:51:37 INFO DAGScheduler: Job 5 finished: collectAsMap at UpsertPartitioner.java:285, took 0.089095 s
24/05/28 10:51:37 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:37 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:37 INFO UpsertPartitioner: For partitionPath : 2018/08/31 Total Small Files => 0
24/05/28 10:51:37 INFO UpsertPartitioner: After small file assignment: unassignedInserts => 99, totalInsertBuckets => 1, recordsPerBucket => 122880
24/05/28 10:51:37 INFO UpsertPartitioner: Total insert buckets for partition path 2018/08/31 => [(InsertBucket {bucketNumber=0, weight=1.0},1.0)]
24/05/28 10:51:37 INFO UpsertPartitioner: Total Buckets: 1, bucketInfoMap size: 1, partitionPathToInsertBucketInfos size: 1, updateLocationToBucket size: 0
24/05/28 10:51:37 INFO HoodieActiveTimeline: Create new file for toInstant ?/home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/20240528105129280.inflight
24/05/28 10:51:37 INFO BaseSparkCommitActionExecutor: no validators configured.
24/05/28 10:51:37 INFO BaseCommitActionExecutor: Auto commit disabled for 20240528105129280
24/05/28 10:51:37 INFO SparkContext: Starting job: sum at StreamSync.java:848
24/05/28 10:51:37 INFO DAGScheduler: Registering RDD 45 (mapToPair at HoodieJavaRDD.java:149) as input to shuffle 7
24/05/28 10:51:37 INFO DAGScheduler: Got job 6 (sum at StreamSync.java:848) with 1 output partitions
24/05/28 10:51:37 INFO DAGScheduler: Final stage: ResultStage 14 (sum at StreamSync.java:848)
24/05/28 10:51:37 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 13)
24/05/28 10:51:37 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 13)
24/05/28 10:51:37 INFO DAGScheduler: Submitting ShuffleMapStage 13 (MapPartitionsRDD[45] at mapToPair at HoodieJavaRDD.java:149), which has no missing parents
24/05/28 10:51:37 INFO MemoryStore: Block broadcast_10 stored as values in memory (estimated size 272.1 KiB, free 364.8 MiB)
24/05/28 10:51:37 INFO MemoryStore: Block broadcast_10_piece0 stored as bytes in memory (estimated size 96.9 KiB, free 364.7 MiB)
24/05/28 10:51:37 INFO BlockManagerInfo: Added broadcast_10_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 96.9 KiB, free: 365.9 MiB)
24/05/28 10:51:37 INFO SparkContext: Created broadcast 10 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:37 INFO DAGScheduler: Submitting 9 missing tasks from ShuffleMapStage 13 (MapPartitionsRDD[45] at mapToPair at HoodieJavaRDD.java:149) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8))
24/05/28 10:51:37 INFO TaskSchedulerImpl: Adding task set 13.0 with 9 tasks resource profile 0
24/05/28 10:51:37 INFO TaskSetManager: Starting task 0.0 in stage 13.0 (TID 50) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:37 INFO Executor: Running task 0.0 in stage 13.0 (TID 50)
24/05/28 10:51:37 INFO BlockManager: Found block rdd_39_0 locally
24/05/28 10:51:37 INFO Executor: Finished task 0.0 in stage 13.0 (TID 50). 1079 bytes result sent to driver
24/05/28 10:51:37 INFO TaskSetManager: Starting task 1.0 in stage 13.0 (TID 51) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 1, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:37 INFO TaskSetManager: Finished task 0.0 in stage 13.0 (TID 50) in 47 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/9)
24/05/28 10:51:37 INFO Executor: Running task 1.0 in stage 13.0 (TID 51)
24/05/28 10:51:37 INFO BlockManager: Found block rdd_39_1 locally
24/05/28 10:51:37 INFO Executor: Finished task 1.0 in stage 13.0 (TID 51). 1079 bytes result sent to driver
24/05/28 10:51:37 INFO TaskSetManager: Starting task 2.0 in stage 13.0 (TID 52) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 2, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:37 INFO Executor: Running task 2.0 in stage 13.0 (TID 52)
24/05/28 10:51:37 INFO TaskSetManager: Finished task 1.0 in stage 13.0 (TID 51) in 41 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (2/9)
24/05/28 10:51:37 INFO BlockManager: Found block rdd_39_2 locally
24/05/28 10:51:37 INFO Executor: Finished task 2.0 in stage 13.0 (TID 52). 1079 bytes result sent to driver
24/05/28 10:51:37 INFO TaskSetManager: Starting task 3.0 in stage 13.0 (TID 53) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 3, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:37 INFO Executor: Running task 3.0 in stage 13.0 (TID 53)
24/05/28 10:51:37 INFO TaskSetManager: Finished task 2.0 in stage 13.0 (TID 52) in 36 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (3/9)
24/05/28 10:51:37 INFO BlockManager: Found block rdd_39_3 locally
24/05/28 10:51:37 INFO Executor: Finished task 3.0 in stage 13.0 (TID 53). 1079 bytes result sent to driver
24/05/28 10:51:37 INFO TaskSetManager: Starting task 4.0 in stage 13.0 (TID 54) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 4, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:37 INFO TaskSetManager: Finished task 3.0 in stage 13.0 (TID 53) in 32 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (4/9)
24/05/28 10:51:37 INFO Executor: Running task 4.0 in stage 13.0 (TID 54)
24/05/28 10:51:37 INFO BlockManager: Found block rdd_39_4 locally
24/05/28 10:51:37 INFO Executor: Finished task 4.0 in stage 13.0 (TID 54). 1079 bytes result sent to driver
24/05/28 10:51:37 INFO TaskSetManager: Starting task 5.0 in stage 13.0 (TID 55) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 5, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:37 INFO Executor: Running task 5.0 in stage 13.0 (TID 55)
24/05/28 10:51:37 INFO TaskSetManager: Finished task 4.0 in stage 13.0 (TID 54) in 33 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (5/9)
24/05/28 10:51:37 INFO BlockManager: Found block rdd_39_5 locally
24/05/28 10:51:37 INFO Executor: Finished task 5.0 in stage 13.0 (TID 55). 1079 bytes result sent to driver
24/05/28 10:51:37 INFO TaskSetManager: Starting task 6.0 in stage 13.0 (TID 56) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 6, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:37 INFO TaskSetManager: Finished task 5.0 in stage 13.0 (TID 55) in 24 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (6/9)
24/05/28 10:51:37 INFO Executor: Running task 6.0 in stage 13.0 (TID 56)
24/05/28 10:51:37 INFO BlockManager: Found block rdd_39_6 locally
24/05/28 10:51:37 INFO Executor: Finished task 6.0 in stage 13.0 (TID 56). 1079 bytes result sent to driver
24/05/28 10:51:37 INFO TaskSetManager: Starting task 7.0 in stage 13.0 (TID 57) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 7, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:37 INFO TaskSetManager: Finished task 6.0 in stage 13.0 (TID 56) in 40 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (7/9)
24/05/28 10:51:37 INFO Executor: Running task 7.0 in stage 13.0 (TID 57)
24/05/28 10:51:37 INFO BlockManager: Found block rdd_39_7 locally
24/05/28 10:51:37 INFO Executor: Finished task 7.0 in stage 13.0 (TID 57). 1079 bytes result sent to driver
24/05/28 10:51:37 INFO TaskSetManager: Starting task 8.0 in stage 13.0 (TID 58) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 8, PROCESS_LOCAL, 4323 bytes) taskResourceAssignments Map()
24/05/28 10:51:37 INFO TaskSetManager: Finished task 7.0 in stage 13.0 (TID 57) in 24 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (8/9)
24/05/28 10:51:37 INFO Executor: Running task 8.0 in stage 13.0 (TID 58)
24/05/28 10:51:37 INFO BlockManager: Found block rdd_39_8 locally
24/05/28 10:51:37 INFO Executor: Finished task 8.0 in stage 13.0 (TID 58). 1079 bytes result sent to driver
24/05/28 10:51:37 INFO TaskSetManager: Finished task 8.0 in stage 13.0 (TID 58) in 24 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (9/9)
24/05/28 10:51:37 INFO TaskSchedulerImpl: Removed TaskSet 13.0, whose tasks have all completed, from pool
24/05/28 10:51:37 INFO DAGScheduler: ShuffleMapStage 13 (mapToPair at HoodieJavaRDD.java:149) finished in 0.344 s
24/05/28 10:51:37 INFO DAGScheduler: looking for newly runnable stages
24/05/28 10:51:37 INFO DAGScheduler: running: Set()
24/05/28 10:51:37 INFO DAGScheduler: waiting: Set(ResultStage 14)
24/05/28 10:51:37 INFO DAGScheduler: failed: Set()
24/05/28 10:51:37 INFO DAGScheduler: Submitting ResultStage 14 (MapPartitionsRDD[50] at mapToDouble at StreamSync.java:848), which has no missing parents
24/05/28 10:51:37 INFO MemoryStore: Block broadcast_11 stored as values in memory (estimated size 280.9 KiB, free 364.4 MiB)
24/05/28 10:51:37 INFO MemoryStore: Block broadcast_11_piece0 stored as bytes in memory (estimated size 102.3 KiB, free 364.3 MiB)
24/05/28 10:51:37 INFO BlockManagerInfo: Added broadcast_11_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 102.3 KiB, free: 365.8 MiB)
24/05/28 10:51:37 INFO SparkContext: Created broadcast 11 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:37 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 14 (MapPartitionsRDD[50] at mapToDouble at StreamSync.java:848) (first 15 tasks are for partitions Vector(0))
24/05/28 10:51:37 INFO TaskSchedulerImpl: Adding task set 14.0 with 1 tasks resource profile 0
24/05/28 10:51:37 INFO TaskSetManager: Starting task 0.0 in stage 14.0 (TID 59) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, NODE_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:37 INFO Executor: Running task 0.0 in stage 14.0 (TID 59)
24/05/28 10:51:37 INFO ShuffleBlockFetcherIterator: Getting 9 (7.8 KiB) non-empty blocks including 9 (7.8 KiB) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:37 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:37 INFO SimpleExecutor: Starting consumer, consuming records from the records iterator directly
24/05/28 10:51:37 INFO MarkerHandler: Request: create marker: 2018/08/31/782f2897-6acb-4819-ad56-ea707f60a39e-0_0-14-59_20240528105129280.parquet.marker.CREATE
24/05/28 10:51:37 INFO TimelineServerBasedWriteMarkers: [timeline-server-based] Created marker file 2018/08/31/782f2897-6acb-4819-ad56-ea707f60a39e-0_0-14-59_20240528105129280.parquet.marker.CREATE in 90 ms
24/05/28 10:51:37 INFO CodecPool: Got brand-new compressor [.snappy]
24/05/28 10:51:38 INFO HoodieCreateHandle: New CreateHandle for partition :2018/08/31 with fileId 782f2897-6acb-4819-ad56-ea707f60a39e-0
24/05/28 10:51:38 INFO HoodieCreateHandle: Closing the file 782f2897-6acb-4819-ad56-ea707f60a39e-0 as we are done with all the records 99
24/05/28 10:51:38 INFO BlockManagerInfo: Removed broadcast_9_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 95.6 KiB, free: 365.9 MiB)
24/05/28 10:51:38 INFO BlockManagerInfo: Removed broadcast_7_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 5.1 KiB, free: 365.9 MiB)
24/05/28 10:51:38 INFO BlockManagerInfo: Removed broadcast_8_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 3.2 KiB, free: 365.9 MiB)
24/05/28 10:51:38 INFO BlockManagerInfo: Removed broadcast_6_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 96.7 KiB, free: 366.0 MiB)
24/05/28 10:51:38 INFO BlockManagerInfo: Removed broadcast_10_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 96.9 KiB, free: 366.1 MiB)
24/05/28 10:51:38 INFO BlockManagerInfo: Removed broadcast_5_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 12.4 KiB, free: 366.1 MiB)
24/05/28 10:51:38 INFO HoodieCreateHandle: CreateHandle for partitionPath 2018/08/31 fileID 782f2897-6acb-4819-ad56-ea707f60a39e-0, took 1060 ms.
24/05/28 10:51:38 INFO MemoryStore: Block rdd_49_0 stored as values in memory (estimated size 376.0 B, free 365.5 MiB)
24/05/28 10:51:38 INFO BlockManagerInfo: Added rdd_49_0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 376.0 B, free: 366.1 MiB)
24/05/28 10:51:38 INFO Executor: Finished task 0.0 in stage 14.0 (TID 59). 1154 bytes result sent to driver
24/05/28 10:51:38 INFO TaskSetManager: Finished task 0.0 in stage 14.0 (TID 59) in 1127 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/1)
24/05/28 10:51:38 INFO TaskSchedulerImpl: Removed TaskSet 14.0, whose tasks have all completed, from pool
24/05/28 10:51:38 INFO DAGScheduler: ResultStage 14 (sum at StreamSync.java:848) finished in 1.165 s
24/05/28 10:51:38 INFO DAGScheduler: Job 6 is finished. Cancelling potential speculative or zombie tasks for this job
24/05/28 10:51:38 INFO TaskSchedulerImpl: Killing all running tasks in stage 14: Stage finished
24/05/28 10:51:38 INFO DAGScheduler: Job 6 finished: sum at StreamSync.java:848, took 1.531241 s
24/05/28 10:51:38 INFO SparkContext: Starting job: sum at StreamSync.java:849
24/05/28 10:51:38 INFO DAGScheduler: Got job 7 (sum at StreamSync.java:849) with 1 output partitions
24/05/28 10:51:38 INFO DAGScheduler: Final stage: ResultStage 19 (sum at StreamSync.java:849)
24/05/28 10:51:38 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 18)
24/05/28 10:51:38 INFO DAGScheduler: Missing parents: List()
24/05/28 10:51:38 INFO DAGScheduler: Submitting ResultStage 19 (MapPartitionsRDD[52] at mapToDouble at StreamSync.java:849), which has no missing parents
24/05/28 10:51:38 INFO MemoryStore: Block broadcast_12 stored as values in memory (estimated size 280.9 KiB, free 365.2 MiB)
24/05/28 10:51:38 INFO MemoryStore: Block broadcast_12_piece0 stored as bytes in memory (estimated size 102.3 KiB, free 365.1 MiB)
24/05/28 10:51:38 INFO BlockManagerInfo: Added broadcast_12_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 102.3 KiB, free: 366.0 MiB)
24/05/28 10:51:38 INFO SparkContext: Created broadcast 12 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:38 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 19 (MapPartitionsRDD[52] at mapToDouble at StreamSync.java:849) (first 15 tasks are for partitions Vector(0))
24/05/28 10:51:38 INFO TaskSchedulerImpl: Adding task set 19.0 with 1 tasks resource profile 0
24/05/28 10:51:38 INFO TaskSetManager: Starting task 0.0 in stage 19.0 (TID 60) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:38 INFO Executor: Running task 0.0 in stage 19.0 (TID 60)
24/05/28 10:51:38 INFO BlockManager: Found block rdd_49_0 locally
24/05/28 10:51:38 INFO Executor: Finished task 0.0 in stage 19.0 (TID 60). 896 bytes result sent to driver
24/05/28 10:51:38 INFO TaskSetManager: Finished task 0.0 in stage 19.0 (TID 60) in 22 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/1)
24/05/28 10:51:38 INFO TaskSchedulerImpl: Removed TaskSet 19.0, whose tasks have all completed, from pool
24/05/28 10:51:38 INFO DAGScheduler: ResultStage 19 (sum at StreamSync.java:849) finished in 0.057 s
24/05/28 10:51:38 INFO DAGScheduler: Job 7 is finished. Cancelling potential speculative or zombie tasks for this job
24/05/28 10:51:38 INFO TaskSchedulerImpl: Killing all running tasks in stage 19: Stage finished
24/05/28 10:51:38 INFO DAGScheduler: Job 7 finished: sum at StreamSync.java:849, took 0.066974 s
24/05/28 10:51:38 INFO StreamSync: instantTime=20240528105129280, totalRecords=99, totalErrorRecords=0, totalSuccessfulRecords=99
24/05/28 10:51:38 INFO SparkContext: Starting job: collect at SparkRDDWriteClient.java:107
24/05/28 10:51:38 INFO DAGScheduler: Got job 8 (collect at SparkRDDWriteClient.java:107) with 1 output partitions
24/05/28 10:51:38 INFO DAGScheduler: Final stage: ResultStage 24 (collect at SparkRDDWriteClient.java:107)
24/05/28 10:51:38 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 23)
24/05/28 10:51:38 INFO DAGScheduler: Missing parents: List()
24/05/28 10:51:38 INFO DAGScheduler: Submitting ResultStage 24 (MapPartitionsRDD[54] at map at SparkRDDWriteClient.java:107), which has no missing parents
24/05/28 10:51:38 INFO MemoryStore: Block broadcast_13 stored as values in memory (estimated size 281.0 KiB, free 364.8 MiB)
24/05/28 10:51:38 INFO MemoryStore: Block broadcast_13_piece0 stored as bytes in memory (estimated size 102.3 KiB, free 364.7 MiB)
24/05/28 10:51:38 INFO BlockManagerInfo: Added broadcast_13_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 102.3 KiB, free: 365.9 MiB)
24/05/28 10:51:38 INFO SparkContext: Created broadcast 13 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:38 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 24 (MapPartitionsRDD[54] at map at SparkRDDWriteClient.java:107) (first 15 tasks are for partitions Vector(0))
24/05/28 10:51:38 INFO TaskSchedulerImpl: Adding task set 24.0 with 1 tasks resource profile 0
24/05/28 10:51:38 INFO TaskSetManager: Starting task 0.0 in stage 24.0 (TID 61) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:38 INFO Executor: Running task 0.0 in stage 24.0 (TID 61)
24/05/28 10:51:38 INFO BlockManager: Found block rdd_49_0 locally
24/05/28 10:51:38 INFO Executor: Finished task 0.0 in stage 24.0 (TID 61). 1170 bytes result sent to driver
24/05/28 10:51:38 INFO TaskSetManager: Finished task 0.0 in stage 24.0 (TID 61) in 25 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/1)
24/05/28 10:51:38 INFO TaskSchedulerImpl: Removed TaskSet 24.0, whose tasks have all completed, from pool
24/05/28 10:51:38 INFO DAGScheduler: ResultStage 24 (collect at SparkRDDWriteClient.java:107) finished in 0.072 s
24/05/28 10:51:38 INFO DAGScheduler: Job 8 is finished. Cancelling potential speculative or zombie tasks for this job
24/05/28 10:51:38 INFO TaskSchedulerImpl: Killing all running tasks in stage 24: Stage finished
24/05/28 10:51:38 INFO DAGScheduler: Job 8 finished: collect at SparkRDDWriteClient.java:107, took 0.079586 s
24/05/28 10:51:38 INFO BaseHoodieWriteClient: Committing 20240528105129280 action commit
24/05/28 10:51:38 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:38 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:38 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:38 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:38 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
24/05/28 10:51:38 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:38 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:38 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:38 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:38 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:38 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:38 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:38 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:38 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:38 INFO FileSystemViewManager: Creating View Manager with storage type REMOTE_FIRST.
24/05/28 10:51:38 INFO FileSystemViewManager: Creating remote first table view
24/05/28 10:51:38 INFO CommitUtils: Creating metadata for UPSERT numWriteStats:1 numReplaceFileIds:0
24/05/28 10:51:38 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:38 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:38 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:38 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:38 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
24/05/28 10:51:38 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:38 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:38 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:38 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:38 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:38 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:38 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:38 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:38 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:38 INFO FileSystemViewManager: Creating View Manager with storage type REMOTE_FIRST.
24/05/28 10:51:38 INFO FileSystemViewManager: Creating remote first table view
24/05/28 10:51:38 INFO BaseHoodieWriteClient: Committing 20240528105129280 action commit
24/05/28 10:51:38 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:38 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:38 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:38 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:38 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:38 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:39 INFO HoodieBackedTableMetadataWriter: Async metadata indexing disabled and following partitions already initialized: [files]
24/05/28 10:51:39 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:39 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:39 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:39 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:39 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:39 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:39 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:39 INFO HoodieTableMetadataUtil: Updating at 20240528105129280 from Commit/UPSERT. #partitions_updated=2, #files_added=1
24/05/28 10:51:39 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:39 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:39 INFO HoodieTableMetadataUtil: Loading latest file slices for metadata table partition files
24/05/28 10:51:39 INFO AbstractTableFileSystemView: Building file system view for partition (files)
24/05/28 10:51:39 INFO BaseHoodieClient: Embedded Timeline Server is disabled. Not starting timeline service
24/05/28 10:51:39 INFO BaseHoodieClient: Embedded Timeline Server is disabled. Not starting timeline service
24/05/28 10:51:39 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:39 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:39 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:39 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:39 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:39 INFO HoodieBackedTableMetadataWriter: New commit at 20240528105129280 being applied to MDT.
24/05/28 10:51:39 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:39 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:39 INFO CleanerUtils: Cleaned failed attempts if any
24/05/28 10:51:39 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:39 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:39 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:39 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:39 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:39 INFO BaseHoodieWriteClient: Generate a new instant time: 20240528105129280 action: deltacommit
24/05/28 10:51:39 INFO HoodieActiveTimeline: Creating a new instant [==>20240528105129280__deltacommit__REQUESTED]
24/05/28 10:51:39 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:39 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:39 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:39 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:39 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:39 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:39 INFO AsyncCleanerService: The HoodieWriteClient is not configured to auto & async clean. Async clean service will not start.
24/05/28 10:51:39 INFO AsyncArchiveService: The HoodieWriteClient is not configured to auto & async archive. Async archive service will not start.
24/05/28 10:51:39 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:39 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:39 INFO SparkContext: Starting job: countByKey at HoodieJavaPairRDD.java:105
24/05/28 10:51:39 INFO DAGScheduler: Registering RDD 63 (countByKey at HoodieJavaPairRDD.java:105) as input to shuffle 8
24/05/28 10:51:39 INFO DAGScheduler: Got job 9 (countByKey at HoodieJavaPairRDD.java:105) with 1 output partitions
24/05/28 10:51:39 INFO DAGScheduler: Final stage: ResultStage 26 (countByKey at HoodieJavaPairRDD.java:105)
24/05/28 10:51:39 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 25)
24/05/28 10:51:39 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 25)
24/05/28 10:51:39 INFO DAGScheduler: Submitting ShuffleMapStage 25 (MapPartitionsRDD[63] at countByKey at HoodieJavaPairRDD.java:105), which has no missing parents
24/05/28 10:51:39 INFO MemoryStore: Block broadcast_14 stored as values in memory (estimated size 10.2 KiB, free 364.7 MiB)
24/05/28 10:51:39 INFO MemoryStore: Block broadcast_14_piece0 stored as bytes in memory (estimated size 5.5 KiB, free 364.7 MiB)
24/05/28 10:51:39 INFO BlockManagerInfo: Added broadcast_14_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 5.5 KiB, free: 365.9 MiB)
24/05/28 10:51:39 INFO SparkContext: Created broadcast 14 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:39 INFO DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 25 (MapPartitionsRDD[63] at countByKey at HoodieJavaPairRDD.java:105) (first 15 tasks are for partitions Vector(0))
24/05/28 10:51:39 INFO TaskSchedulerImpl: Adding task set 25.0 with 1 tasks resource profile 0
24/05/28 10:51:39 INFO TaskSetManager: Starting task 0.0 in stage 25.0 (TID 62) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4695 bytes) taskResourceAssignments Map()
24/05/28 10:51:39 INFO Executor: Running task 0.0 in stage 25.0 (TID 62)
24/05/28 10:51:39 INFO MemoryStore: Block rdd_61_0 stored as values in memory (estimated size 398.0 B, free 364.7 MiB)
24/05/28 10:51:39 INFO BlockManagerInfo: Added rdd_61_0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 398.0 B, free: 365.9 MiB)
24/05/28 10:51:39 INFO Executor: Finished task 0.0 in stage 25.0 (TID 62). 1122 bytes result sent to driver
24/05/28 10:51:39 INFO TaskSetManager: Finished task 0.0 in stage 25.0 (TID 62) in 21 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/1)
24/05/28 10:51:39 INFO TaskSchedulerImpl: Removed TaskSet 25.0, whose tasks have all completed, from pool
24/05/28 10:51:39 INFO DAGScheduler: ShuffleMapStage 25 (countByKey at HoodieJavaPairRDD.java:105) finished in 0.029 s
24/05/28 10:51:39 INFO DAGScheduler: looking for newly runnable stages
24/05/28 10:51:39 INFO DAGScheduler: running: Set()
24/05/28 10:51:39 INFO DAGScheduler: waiting: Set(ResultStage 26)
24/05/28 10:51:39 INFO DAGScheduler: failed: Set()
24/05/28 10:51:39 INFO DAGScheduler: Submitting ResultStage 26 (ShuffledRDD[64] at countByKey at HoodieJavaPairRDD.java:105), which has no missing parents
24/05/28 10:51:39 INFO MemoryStore: Block broadcast_15 stored as values in memory (estimated size 5.5 KiB, free 364.7 MiB)
24/05/28 10:51:39 INFO MemoryStore: Block broadcast_15_piece0 stored as bytes in memory (estimated size 3.2 KiB, free 364.7 MiB)
24/05/28 10:51:39 INFO BlockManagerInfo: Added broadcast_15_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 3.2 KiB, free: 365.9 MiB)
24/05/28 10:51:39 INFO SparkContext: Created broadcast 15 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:39 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 26 (ShuffledRDD[64] at countByKey at HoodieJavaPairRDD.java:105) (first 15 tasks are for partitions Vector(0))
24/05/28 10:51:39 INFO TaskSchedulerImpl: Adding task set 26.0 with 1 tasks resource profile 0
24/05/28 10:51:39 INFO TaskSetManager: Starting task 0.0 in stage 26.0 (TID 63) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, NODE_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:39 INFO Executor: Running task 0.0 in stage 26.0 (TID 63)
24/05/28 10:51:39 INFO ShuffleBlockFetcherIterator: Getting 1 (117.0 B) non-empty blocks including 1 (117.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:39 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
24/05/28 10:51:39 INFO Executor: Finished task 0.0 in stage 26.0 (TID 63). 1316 bytes result sent to driver
24/05/28 10:51:39 INFO TaskSetManager: Finished task 0.0 in stage 26.0 (TID 63) in 12 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/1)
24/05/28 10:51:39 INFO TaskSchedulerImpl: Removed TaskSet 26.0, whose tasks have all completed, from pool
24/05/28 10:51:39 INFO DAGScheduler: ResultStage 26 (countByKey at HoodieJavaPairRDD.java:105) finished in 0.019 s
24/05/28 10:51:39 INFO DAGScheduler: Job 9 is finished. Cancelling potential speculative or zombie tasks for this job
24/05/28 10:51:39 INFO TaskSchedulerImpl: Killing all running tasks in stage 26: Stage finished
24/05/28 10:51:39 INFO DAGScheduler: Job 9 finished: countByKey at HoodieJavaPairRDD.java:105, took 0.058790 s
24/05/28 10:51:39 INFO BaseSparkCommitActionExecutor: Source read and index timer 95
24/05/28 10:51:39 INFO UpsertPartitioner: AvgRecordSize => 1024
24/05/28 10:51:39 INFO SparkContext: Starting job: collectAsMap at UpsertPartitioner.java:285
24/05/28 10:51:39 INFO DAGScheduler: Got job 10 (collectAsMap at UpsertPartitioner.java:285) with 1 output partitions
24/05/28 10:51:39 INFO DAGScheduler: Final stage: ResultStage 27 (collectAsMap at UpsertPartitioner.java:285)
24/05/28 10:51:39 INFO DAGScheduler: Parents of final stage: List()
24/05/28 10:51:39 INFO DAGScheduler: Missing parents: List()
24/05/28 10:51:39 INFO DAGScheduler: Submitting ResultStage 27 (MapPartitionsRDD[66] at mapToPair at UpsertPartitioner.java:284), which has no missing parents
24/05/28 10:51:39 INFO MemoryStore: Block broadcast_16 stored as values in memory (estimated size 264.3 KiB, free 364.4 MiB)
24/05/28 10:51:39 INFO MemoryStore: Block broadcast_16_piece0 stored as bytes in memory (estimated size 92.1 KiB, free 364.3 MiB)
24/05/28 10:51:39 INFO BlockManagerInfo: Added broadcast_16_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 92.1 KiB, free: 365.8 MiB)
24/05/28 10:51:39 INFO SparkContext: Created broadcast 16 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:39 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 27 (MapPartitionsRDD[66] at mapToPair at UpsertPartitioner.java:284) (first 15 tasks are for partitions Vector(0))
24/05/28 10:51:39 INFO TaskSchedulerImpl: Adding task set 27.0 with 1 tasks resource profile 0
24/05/28 10:51:39 INFO TaskSetManager: Starting task 0.0 in stage 27.0 (TID 64) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4339 bytes) taskResourceAssignments Map()
24/05/28 10:51:39 INFO Executor: Running task 0.0 in stage 27.0 (TID 64)
24/05/28 10:51:39 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:39 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:39 INFO FileSystemViewManager: Creating InMemory based view for basePath /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata.
24/05/28 10:51:39 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:39 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:39 INFO AbstractTableFileSystemView: Building file system view for partition (files)
24/05/28 10:51:39 INFO Executor: Finished task 0.0 in stage 27.0 (TID 64). 837 bytes result sent to driver
24/05/28 10:51:39 INFO TaskSetManager: Finished task 0.0 in stage 27.0 (TID 64) in 29 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/1)
24/05/28 10:51:39 INFO TaskSchedulerImpl: Removed TaskSet 27.0, whose tasks have all completed, from pool
24/05/28 10:51:39 INFO DAGScheduler: ResultStage 27 (collectAsMap at UpsertPartitioner.java:285) finished in 0.064 s
24/05/28 10:51:39 INFO DAGScheduler: Job 10 is finished. Cancelling potential speculative or zombie tasks for this job
24/05/28 10:51:39 INFO TaskSchedulerImpl: Killing all running tasks in stage 27: Stage finished
24/05/28 10:51:39 INFO DAGScheduler: Job 10 finished: collectAsMap at UpsertPartitioner.java:285, took 0.066678 s
24/05/28 10:51:39 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:39 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:39 INFO UpsertPartitioner: Total Buckets: 1, bucketInfoMap size: 1, partitionPathToInsertBucketInfos size: 0, updateLocationToBucket size: 1
24/05/28 10:51:39 INFO HoodieActiveTimeline: Create new file for toInstant ?/home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/20240528105129280.deltacommit.inflight
24/05/28 10:51:39 INFO BaseSparkCommitActionExecutor: no validators configured.
24/05/28 10:51:39 INFO BaseCommitActionExecutor: Auto commit enabled: Committing 20240528105129280
24/05/28 10:51:39 INFO SparkContext: Starting job: collect at HoodieJavaRDD.java:177
24/05/28 10:51:39 INFO DAGScheduler: Registering RDD 67 (mapToPair at HoodieJavaRDD.java:149) as input to shuffle 9
24/05/28 10:51:39 INFO DAGScheduler: Got job 11 (collect at HoodieJavaRDD.java:177) with 1 output partitions
24/05/28 10:51:39 INFO DAGScheduler: Final stage: ResultStage 29 (collect at HoodieJavaRDD.java:177)
24/05/28 10:51:39 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 28)
24/05/28 10:51:39 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 28)
24/05/28 10:51:39 INFO DAGScheduler: Submitting ShuffleMapStage 28 (MapPartitionsRDD[67] at mapToPair at HoodieJavaRDD.java:149), which has no missing parents
24/05/28 10:51:39 INFO MemoryStore: Block broadcast_17 stored as values in memory (estimated size 269.1 KiB, free 364.1 MiB)
24/05/28 10:51:39 INFO MemoryStore: Block broadcast_17_piece0 stored as bytes in memory (estimated size 95.7 KiB, free 364.0 MiB)
24/05/28 10:51:39 INFO BlockManagerInfo: Added broadcast_17_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 95.7 KiB, free: 365.7 MiB)
24/05/28 10:51:39 INFO SparkContext: Created broadcast 17 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:39 INFO DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 28 (MapPartitionsRDD[67] at mapToPair at HoodieJavaRDD.java:149) (first 15 tasks are for partitions Vector(0))
24/05/28 10:51:39 INFO TaskSchedulerImpl: Adding task set 28.0 with 1 tasks resource profile 0
24/05/28 10:51:39 INFO TaskSetManager: Starting task 0.0 in stage 28.0 (TID 65) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4695 bytes) taskResourceAssignments Map()
24/05/28 10:51:39 INFO Executor: Running task 0.0 in stage 28.0 (TID 65)
24/05/28 10:51:39 INFO BlockManager: Found block rdd_61_0 locally
24/05/28 10:51:39 INFO Executor: Finished task 0.0 in stage 28.0 (TID 65). 1080 bytes result sent to driver
24/05/28 10:51:39 INFO TaskSetManager: Finished task 0.0 in stage 28.0 (TID 65) in 65 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/1)
24/05/28 10:51:39 INFO TaskSchedulerImpl: Removed TaskSet 28.0, whose tasks have all completed, from pool
24/05/28 10:51:39 INFO DAGScheduler: ShuffleMapStage 28 (mapToPair at HoodieJavaRDD.java:149) finished in 0.127 s
24/05/28 10:51:39 INFO DAGScheduler: looking for newly runnable stages
24/05/28 10:51:39 INFO DAGScheduler: running: Set()
24/05/28 10:51:39 INFO DAGScheduler: waiting: Set(ResultStage 29)
24/05/28 10:51:39 INFO DAGScheduler: failed: Set()
24/05/28 10:51:39 INFO DAGScheduler: Submitting ResultStage 29 (MapPartitionsRDD[72] at map at HoodieJavaRDD.java:125), which has no missing parents
24/05/28 10:51:39 INFO MemoryStore: Block broadcast_18 stored as values in memory (estimated size 375.6 KiB, free 363.6 MiB)
24/05/28 10:51:39 INFO MemoryStore: Block broadcast_18_piece0 stored as bytes in memory (estimated size 135.0 KiB, free 363.5 MiB)
24/05/28 10:51:39 INFO BlockManagerInfo: Added broadcast_18_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 135.0 KiB, free: 365.6 MiB)
24/05/28 10:51:39 INFO SparkContext: Created broadcast 18 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:39 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 29 (MapPartitionsRDD[72] at map at HoodieJavaRDD.java:125) (first 15 tasks are for partitions Vector(0))
24/05/28 10:51:39 INFO TaskSchedulerImpl: Adding task set 29.0 with 1 tasks resource profile 0
24/05/28 10:51:39 INFO TaskSetManager: Starting task 0.0 in stage 29.0 (TID 66) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, NODE_LOCAL, 4271 bytes) taskResourceAssignments Map()
24/05/28 10:51:39 INFO Executor: Running task 0.0 in stage 29.0 (TID 66)
24/05/28 10:51:39 INFO ShuffleBlockFetcherIterator: Getting 1 (368.0 B) non-empty blocks including 1 (368.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
24/05/28 10:51:39 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 4 ms
24/05/28 10:51:39 INFO BaseSparkDeltaCommitActionExecutor: Merging updates for commit 20240528105129280 for file files-0000-0
24/05/28 10:51:39 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:39 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:39 INFO FileSystemViewManager: Creating InMemory based view for basePath /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata.
24/05/28 10:51:39 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:39 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:39 INFO AbstractTableFileSystemView: Building file system view for partition (files)
# WARNING: Unable to attach Serviceability Agent. Unable to attach even with module exceptions: [org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed.]
24/05/28 10:51:40 INFO HoodieLogFormat$WriterBuilder: Building HoodieLogFormat Writer
24/05/28 10:51:40 INFO HoodieLogFormat$WriterBuilder: HoodieLogFile on path /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/files/.files-0000-0_00000000000000010.log.1_0-0-0
24/05/28 10:51:40 INFO DirectWriteMarkers: Creating Marker Path=/home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/.temp/20240528105129280/files/.files-0000-0_00000000000000010.log.2_0-29-66.marker.APPEND
24/05/28 10:51:40 INFO DirectWriteMarkers: [direct] Created marker file /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/.temp/20240528105129280/files/.files-0000-0_00000000000000010.log.2_0-29-66.marker.APPEND in 30 ms
24/05/28 10:51:40 INFO HoodieLogFormatWriter: Callback failed. Rolling over to HoodieLogFile{pathStr='/home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/files/.files-0000-0_00000000000000010.log.2_0-29-66', fileLen=-1}
24/05/28 10:51:41 INFO BlockManagerInfo: Removed broadcast_11_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 102.3 KiB, free: 365.7 MiB)
24/05/28 10:51:41 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
24/05/28 10:51:41 INFO BlockManagerInfo: Removed broadcast_15_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 3.2 KiB, free: 365.7 MiB)
24/05/28 10:51:41 INFO MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
24/05/28 10:51:41 INFO MetricsSystemImpl: HBase metrics system started
24/05/28 10:51:41 INFO BlockManagerInfo: Removed broadcast_16_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 92.1 KiB, free: 365.8 MiB)
24/05/28 10:51:41 INFO MetricRegistries: Loaded MetricRegistries class org.apache.hudi.org.apache.hadoop.hbase.metrics.impl.MetricRegistriesImpl
24/05/28 10:51:41 INFO BlockManagerInfo: Removed broadcast_14_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 5.5 KiB, free: 365.8 MiB)
24/05/28 10:51:41 INFO BlockManagerInfo: Removed broadcast_17_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 95.7 KiB, free: 365.9 MiB)
24/05/28 10:51:41 INFO BlockManagerInfo: Removed broadcast_12_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 102.3 KiB, free: 366.0 MiB)
24/05/28 10:51:41 INFO BlockManagerInfo: Removed broadcast_13_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 102.3 KiB, free: 366.1 MiB)
24/05/28 10:51:41 INFO CodecPool: Got brand-new compressor [.gz]
24/05/28 10:51:41 INFO CodecPool: Got brand-new compressor [.gz]
24/05/28 10:51:41 INFO HoodieAppendHandle: AppendHandle for partitionPath files filePath files/.files-0000-0_00000000000000010.log.2_0-29-66, took 1525 ms.
24/05/28 10:51:41 INFO MemoryStore: Block rdd_71_0 stored as values in memory (estimated size 485.0 B, free 365.3 MiB)
24/05/28 10:51:41 INFO BlockManagerInfo: Added rdd_71_0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 485.0 B, free: 366.1 MiB)
24/05/28 10:51:41 INFO Executor: Finished task 0.0 in stage 29.0 (TID 66). 1670 bytes result sent to driver
24/05/28 10:51:41 INFO TaskSetManager: Finished task 0.0 in stage 29.0 (TID 66) in 1623 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/1)
24/05/28 10:51:41 INFO TaskSchedulerImpl: Removed TaskSet 29.0, whose tasks have all completed, from pool
24/05/28 10:51:41 INFO DAGScheduler: ResultStage 29 (collect at HoodieJavaRDD.java:177) finished in 1.681 s
24/05/28 10:51:41 INFO DAGScheduler: Job 11 is finished. Cancelling potential speculative or zombie tasks for this job
24/05/28 10:51:41 INFO TaskSchedulerImpl: Killing all running tasks in stage 29: Stage finished
24/05/28 10:51:41 INFO DAGScheduler: Job 11 finished: collect at HoodieJavaRDD.java:177, took 1.823859 s
24/05/28 10:51:41 INFO CommitUtils: Creating metadata for UPSERT_PREPPED numWriteStats:1 numReplaceFileIds:0
24/05/28 10:51:41 INFO BaseSparkCommitActionExecutor: Committing 20240528105129280, action Type deltacommit, operation Type UPSERT_PREPPED
24/05/28 10:51:41 INFO SparkContext: Starting job: collect at HoodieSparkEngineContext.java:150
24/05/28 10:51:41 INFO DAGScheduler: Got job 12 (collect at HoodieSparkEngineContext.java:150) with 1 output partitions
24/05/28 10:51:41 INFO DAGScheduler: Final stage: ResultStage 30 (collect at HoodieSparkEngineContext.java:150)
24/05/28 10:51:41 INFO DAGScheduler: Parents of final stage: List()
24/05/28 10:51:41 INFO DAGScheduler: Missing parents: List()
24/05/28 10:51:41 INFO DAGScheduler: Submitting ResultStage 30 (MapPartitionsRDD[74] at flatMap at HoodieSparkEngineContext.java:150), which has no missing parents
24/05/28 10:51:41 INFO MemoryStore: Block broadcast_19 stored as values in memory (estimated size 102.1 KiB, free 365.2 MiB)
24/05/28 10:51:41 INFO MemoryStore: Block broadcast_19_piece0 stored as bytes in memory (estimated size 37.4 KiB, free 365.2 MiB)
24/05/28 10:51:41 INFO BlockManagerInfo: Added broadcast_19_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 37.4 KiB, free: 366.1 MiB)
24/05/28 10:51:41 INFO SparkContext: Created broadcast 19 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:41 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 30 (MapPartitionsRDD[74] at flatMap at HoodieSparkEngineContext.java:150) (first 15 tasks are for partitions Vector(0))
24/05/28 10:51:41 INFO TaskSchedulerImpl: Adding task set 30.0 with 1 tasks resource profile 0
24/05/28 10:51:41 INFO TaskSetManager: Starting task 0.0 in stage 30.0 (TID 67) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4455 bytes) taskResourceAssignments Map()
24/05/28 10:51:41 INFO Executor: Running task 0.0 in stage 30.0 (TID 67)
24/05/28 10:51:41 INFO Executor: Finished task 0.0 in stage 30.0 (TID 67). 805 bytes result sent to driver
24/05/28 10:51:41 INFO TaskSetManager: Finished task 0.0 in stage 30.0 (TID 67) in 22 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/1)
24/05/28 10:51:41 INFO TaskSchedulerImpl: Removed TaskSet 30.0, whose tasks have all completed, from pool
24/05/28 10:51:41 INFO DAGScheduler: ResultStage 30 (collect at HoodieSparkEngineContext.java:150) finished in 0.044 s
24/05/28 10:51:41 INFO DAGScheduler: Job 12 is finished. Cancelling potential speculative or zombie tasks for this job
24/05/28 10:51:41 INFO TaskSchedulerImpl: Killing all running tasks in stage 30: Stage finished
24/05/28 10:51:41 INFO DAGScheduler: Job 12 finished: collect at HoodieSparkEngineContext.java:150, took 0.046904 s
24/05/28 10:51:41 INFO SparkContext: Starting job: collect at HoodieSparkEngineContext.java:150
24/05/28 10:51:41 INFO DAGScheduler: Got job 13 (collect at HoodieSparkEngineContext.java:150) with 1 output partitions
24/05/28 10:51:41 INFO DAGScheduler: Final stage: ResultStage 31 (collect at HoodieSparkEngineContext.java:150)
24/05/28 10:51:41 INFO DAGScheduler: Parents of final stage: List()
24/05/28 10:51:41 INFO DAGScheduler: Missing parents: List()
24/05/28 10:51:41 INFO DAGScheduler: Submitting ResultStage 31 (MapPartitionsRDD[76] at flatMap at HoodieSparkEngineContext.java:150), which has no missing parents
24/05/28 10:51:41 INFO MemoryStore: Block broadcast_20 stored as values in memory (estimated size 102.1 KiB, free 365.1 MiB)
24/05/28 10:51:41 INFO MemoryStore: Block broadcast_20_piece0 stored as bytes in memory (estimated size 37.4 KiB, free 365.1 MiB)
24/05/28 10:51:41 INFO BlockManagerInfo: Added broadcast_20_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 37.4 KiB, free: 366.0 MiB)
24/05/28 10:51:41 INFO SparkContext: Created broadcast 20 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:41 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 31 (MapPartitionsRDD[76] at flatMap at HoodieSparkEngineContext.java:150) (first 15 tasks are for partitions Vector(0))
24/05/28 10:51:41 INFO TaskSchedulerImpl: Adding task set 31.0 with 1 tasks resource profile 0
24/05/28 10:51:41 INFO TaskSetManager: Starting task 0.0 in stage 31.0 (TID 68) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4455 bytes) taskResourceAssignments Map()
24/05/28 10:51:41 INFO Executor: Running task 0.0 in stage 31.0 (TID 68)
24/05/28 10:51:41 INFO Executor: Finished task 0.0 in stage 31.0 (TID 68). 858 bytes result sent to driver
24/05/28 10:51:41 INFO TaskSetManager: Finished task 0.0 in stage 31.0 (TID 68) in 12 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/1)
24/05/28 10:51:41 INFO TaskSchedulerImpl: Removed TaskSet 31.0, whose tasks have all completed, from pool
24/05/28 10:51:41 INFO DAGScheduler: ResultStage 31 (collect at HoodieSparkEngineContext.java:150) finished in 0.037 s
24/05/28 10:51:41 INFO DAGScheduler: Job 13 is finished. Cancelling potential speculative or zombie tasks for this job
24/05/28 10:51:41 INFO TaskSchedulerImpl: Killing all running tasks in stage 31: Stage finished
24/05/28 10:51:41 INFO DAGScheduler: Job 13 finished: collect at HoodieSparkEngineContext.java:150, took 0.041018 s
24/05/28 10:51:41 INFO HoodieActiveTimeline: Marking instant complete [==>20240528105129280__deltacommit__INFLIGHT]
24/05/28 10:51:41 INFO HoodieActiveTimeline: Create new file for toInstant ?/home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/20240528105129280.deltacommit
24/05/28 10:51:41 INFO HoodieActiveTimeline: Completed [==>20240528105129280__deltacommit__INFLIGHT]
24/05/28 10:51:41 INFO BaseSparkCommitActionExecutor: Committed 20240528105129280
24/05/28 10:51:41 INFO SparkContext: Starting job: collectAsMap at HoodieSparkEngineContext.java:164
24/05/28 10:51:41 INFO DAGScheduler: Got job 14 (collectAsMap at HoodieSparkEngineContext.java:164) with 1 output partitions
24/05/28 10:51:41 INFO DAGScheduler: Final stage: ResultStage 32 (collectAsMap at HoodieSparkEngineContext.java:164)
24/05/28 10:51:41 INFO DAGScheduler: Parents of final stage: List()
24/05/28 10:51:41 INFO DAGScheduler: Missing parents: List()
24/05/28 10:51:41 INFO DAGScheduler: Submitting ResultStage 32 (MapPartitionsRDD[78] at mapToPair at HoodieSparkEngineContext.java:161), which has no missing parents
24/05/28 10:51:41 INFO MemoryStore: Block broadcast_21 stored as values in memory (estimated size 102.3 KiB, free 365.0 MiB)
24/05/28 10:51:41 INFO MemoryStore: Block broadcast_21_piece0 stored as bytes in memory (estimated size 37.4 KiB, free 364.9 MiB)
24/05/28 10:51:41 INFO BlockManagerInfo: Added broadcast_21_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 37.4 KiB, free: 366.0 MiB)
24/05/28 10:51:41 INFO SparkContext: Created broadcast 21 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:41 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 32 (MapPartitionsRDD[78] at mapToPair at HoodieSparkEngineContext.java:161) (first 15 tasks are for partitions Vector(0))
24/05/28 10:51:41 INFO TaskSchedulerImpl: Adding task set 32.0 with 1 tasks resource profile 0
24/05/28 10:51:41 INFO TaskSetManager: Starting task 0.0 in stage 32.0 (TID 69) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4455 bytes) taskResourceAssignments Map()
24/05/28 10:51:41 INFO Executor: Running task 0.0 in stage 32.0 (TID 69)
24/05/28 10:51:41 INFO Executor: Finished task 0.0 in stage 32.0 (TID 69). 932 bytes result sent to driver
24/05/28 10:51:41 INFO TaskSetManager: Finished task 0.0 in stage 32.0 (TID 69) in 12 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/1)
24/05/28 10:51:41 INFO TaskSchedulerImpl: Removed TaskSet 32.0, whose tasks have all completed, from pool
24/05/28 10:51:41 INFO DAGScheduler: ResultStage 32 (collectAsMap at HoodieSparkEngineContext.java:164) finished in 0.032 s
24/05/28 10:51:41 INFO DAGScheduler: Job 14 is finished. Cancelling potential speculative or zombie tasks for this job
24/05/28 10:51:41 INFO TaskSchedulerImpl: Killing all running tasks in stage 32: Stage finished
24/05/28 10:51:41 INFO DAGScheduler: Job 14 finished: collectAsMap at HoodieSparkEngineContext.java:164, took 0.034118 s
24/05/28 10:51:41 INFO FSUtils: Removed directory at /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/.temp/20240528105129280
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:41 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:41 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:41 INFO FileSystemViewManager: Creating View Manager with storage type MEMORY.
24/05/28 10:51:41 INFO FileSystemViewManager: Creating in-memory based Table View
24/05/28 10:51:41 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:41 INFO HoodieActiveTimeline: Marking instant complete [==>20240528105129280__commit__INFLIGHT]
24/05/28 10:51:41 INFO HoodieActiveTimeline: Create new file for toInstant ?/home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/20240528105129280.commit
24/05/28 10:51:41 INFO HoodieActiveTimeline: Completed [==>20240528105129280__commit__INFLIGHT]
24/05/28 10:51:41 INFO SparkContext: Starting job: collectAsMap at HoodieSparkEngineContext.java:164
24/05/28 10:51:41 INFO DAGScheduler: Got job 15 (collectAsMap at HoodieSparkEngineContext.java:164) with 2 output partitions
24/05/28 10:51:41 INFO DAGScheduler: Final stage: ResultStage 33 (collectAsMap at HoodieSparkEngineContext.java:164)
24/05/28 10:51:41 INFO DAGScheduler: Parents of final stage: List()
24/05/28 10:51:41 INFO DAGScheduler: Missing parents: List()
24/05/28 10:51:41 INFO DAGScheduler: Submitting ResultStage 33 (MapPartitionsRDD[80] at mapToPair at HoodieSparkEngineContext.java:161), which has no missing parents
24/05/28 10:51:41 INFO MemoryStore: Block broadcast_22 stored as values in memory (estimated size 102.3 KiB, free 364.8 MiB)
24/05/28 10:51:41 INFO MemoryStore: Block broadcast_22_piece0 stored as bytes in memory (estimated size 37.4 KiB, free 364.8 MiB)
24/05/28 10:51:41 INFO BlockManagerInfo: Added broadcast_22_piece0 in memory on ip-10-0-78-189.us-west-2.compute.internal:40923 (size: 37.4 KiB, free: 366.0 MiB)
24/05/28 10:51:41 INFO SparkContext: Created broadcast 22 from broadcast at DAGScheduler.scala:1509
24/05/28 10:51:41 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 33 (MapPartitionsRDD[80] at mapToPair at HoodieSparkEngineContext.java:161) (first 15 tasks are for partitions Vector(0, 1))
24/05/28 10:51:41 INFO TaskSchedulerImpl: Adding task set 33.0 with 2 tasks resource profile 0
24/05/28 10:51:41 INFO TaskSetManager: Starting task 0.0 in stage 33.0 (TID 70) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 0, PROCESS_LOCAL, 4445 bytes) taskResourceAssignments Map()
24/05/28 10:51:41 INFO Executor: Running task 0.0 in stage 33.0 (TID 70)
24/05/28 10:51:41 INFO Executor: Finished task 0.0 in stage 33.0 (TID 70). 922 bytes result sent to driver
24/05/28 10:51:41 INFO TaskSetManager: Starting task 1.0 in stage 33.0 (TID 71) (ip-10-0-78-189.us-west-2.compute.internal, executor driver, partition 1, PROCESS_LOCAL, 4441 bytes) taskResourceAssignments Map()
24/05/28 10:51:41 INFO TaskSetManager: Finished task 0.0 in stage 33.0 (TID 70) in 10 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (1/2)
24/05/28 10:51:41 INFO Executor: Running task 1.0 in stage 33.0 (TID 71)
24/05/28 10:51:41 INFO Executor: Finished task 1.0 in stage 33.0 (TID 71). 918 bytes result sent to driver
24/05/28 10:51:41 INFO TaskSetManager: Finished task 1.0 in stage 33.0 (TID 71) in 13 ms on ip-10-0-78-189.us-west-2.compute.internal (executor driver) (2/2)
24/05/28 10:51:41 INFO TaskSchedulerImpl: Removed TaskSet 33.0, whose tasks have all completed, from pool
24/05/28 10:51:41 INFO DAGScheduler: ResultStage 33 (collectAsMap at HoodieSparkEngineContext.java:164) finished in 0.038 s
24/05/28 10:51:41 INFO DAGScheduler: Job 15 is finished. Cancelling potential speculative or zombie tasks for this job
24/05/28 10:51:41 INFO TaskSchedulerImpl: Killing all running tasks in stage 33: Stage finished
24/05/28 10:51:41 INFO DAGScheduler: Job 15 finished: collectAsMap at HoodieSparkEngineContext.java:164, took 0.041762 s
24/05/28 10:51:41 INFO FSUtils: Removed directory at /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/.temp/20240528105129280
24/05/28 10:51:41 INFO BaseHoodieWriteClient: Committed 20240528105129280
24/05/28 10:51:41 INFO MapPartitionsRDD: Removing RDD 39 from persistence list
24/05/28 10:51:41 INFO MapPartitionsRDD: Removing RDD 49 from persistence list
24/05/28 10:51:41 INFO UnionRDD: Removing RDD 61 from persistence list
24/05/28 10:51:41 INFO MapPartitionsRDD: Removing RDD 71 from persistence list
24/05/28 10:51:41 INFO BlockManager: Removing RDD 39
24/05/28 10:51:41 INFO BaseHoodieWriteClient: Start to clean synchronously.
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:41 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:41 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:41 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:41 INFO BlockManager: Removing RDD 49
24/05/28 10:51:41 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:41 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:41 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:41 INFO FileSystemViewManager: Creating View Manager with storage type REMOTE_FIRST.
24/05/28 10:51:41 INFO FileSystemViewManager: Creating remote first table view
24/05/28 10:51:41 INFO BaseHoodieWriteClient: Cleaner started
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:41 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:41 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:41 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:41 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:41 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:41 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:41 INFO FileSystemViewManager: Creating View Manager with storage type REMOTE_FIRST.
24/05/28 10:51:41 INFO FileSystemViewManager: Creating remote first table view
24/05/28 10:51:41 INFO BaseHoodieWriteClient: Scheduling cleaning at instant time: 20240528105141866
24/05/28 10:51:41 INFO FileSystemViewManager: Creating remote view for basePath /home/ec2-user/testcases/stocks/data/target/20240528t103509. Server=ip-10-0-78-189.us-west-2.compute.internal:40045, Timeout=300
24/05/28 10:51:41 INFO FileSystemViewManager: Creating InMemory based view for basePath /home/ec2-user/testcases/stocks/data/target/20240528t103509.
24/05/28 10:51:41 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:41 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:41 INFO BlockManager: Removing RDD 61
24/05/28 10:51:41 INFO RemoteHoodieTableFileSystemView: Sending request : (http://ip-10-0-78-189.us-west-2.compute.internal:40045/v1/hoodie/view/compactions/pending/?basepath=%2Fhome%2Fec2-user%2Ftestcases%2Fstocks%2Fdata%2Ftarget%2F20240528t103509&lastinstantts=20240528105131254&timelinehash=1f6aefc880525adfcf49f93cb473096049aabf6f2fdfd09d42f46125f63fb640)
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:41 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO FileSystemViewManager: Creating InMemory based view for basePath /home/ec2-user/testcases/stocks/data/target/20240528t103509.
24/05/28 10:51:41 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:41 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:41 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:41 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:41 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:41 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:41 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:41 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:41 INFO BlockManager: Removing RDD 71
24/05/28 10:51:41 INFO RemoteHoodieTableFileSystemView: Sending request : (http://ip-10-0-78-189.us-west-2.compute.internal:40045/v1/hoodie/view/logcompactions/pending/?basepath=%2Fhome%2Fec2-user%2Ftestcases%2Fstocks%2Fdata%2Ftarget%2F20240528t103509&lastinstantts=20240528105131254&timelinehash=1f6aefc880525adfcf49f93cb473096049aabf6f2fdfd09d42f46125f63fb640)
24/05/28 10:51:41 INFO CleanPlanner: No earliest commit to retain. No need to scan partitions !!
24/05/28 10:51:41 INFO CleanPlanActionExecutor: Nothing to clean here. It is already clean
24/05/28 10:51:41 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:41 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:41 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:41 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:41 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:41 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:41 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:41 INFO FileSystemViewManager: Creating View Manager with storage type REMOTE_FIRST.
24/05/28 10:51:41 INFO FileSystemViewManager: Creating remote first table view
24/05/28 10:51:41 INFO BaseHoodieWriteClient: Start to archive synchronously.
24/05/28 10:51:41 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
24/05/28 10:51:41 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:41 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:42 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:42 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:42 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata/.hoodie/hoodie.properties
24/05/28 10:51:42 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/metadata
24/05/28 10:51:42 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:42 INFO AbstractTableFileSystemView: Took 1 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:42 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:42 INFO HoodieTimelineArchiver: Not archiving as there is no compaction yet on the metadata table
24/05/28 10:51:42 INFO HoodieTimelineArchiver: No Instants to archive
24/05/28 10:51:42 INFO FileSystemViewManager: Creating remote view for basePath /home/ec2-user/testcases/stocks/data/target/20240528t103509. Server=ip-10-0-78-189.us-west-2.compute.internal:40045, Timeout=300
24/05/28 10:51:42 INFO FileSystemViewManager: Creating InMemory based view for basePath /home/ec2-user/testcases/stocks/data/target/20240528t103509.
24/05/28 10:51:42 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:42 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:42 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
24/05/28 10:51:42 INFO RemoteHoodieTableFileSystemView: Sending request : (http://ip-10-0-78-189.us-west-2.compute.internal:40045/v1/hoodie/view/refresh/?basepath=%2Fhome%2Fec2-user%2Ftestcases%2Fstocks%2Fdata%2Ftarget%2F20240528t103509&lastinstantts=20240528105131254&timelinehash=1f6aefc880525adfcf49f93cb473096049aabf6f2fdfd09d42f46125f63fb640)
24/05/28 10:51:42 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
24/05/28 10:51:42 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:42 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:42 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
24/05/28 10:51:42 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__deltacommit__COMPLETED__20240528105132179]}
24/05/28 10:51:42 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
24/05/28 10:51:42 INFO ClusteringUtils: Found 0 files in pending clustering operations
24/05/28 10:51:42 INFO StreamSync: Commit 20240528105129280 successful!
24/05/28 10:51:42 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:42 INFO HoodieTableConfig: Loading table properties from /home/ec2-user/testcases/stocks/data/target/20240528t103509/.hoodie/hoodie.properties
24/05/28 10:51:42 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:42 INFO HoodieTableMetaClient: Loading Active commit timeline for /home/ec2-user/testcases/stocks/data/target/20240528t103509
24/05/28 10:51:42 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20240528105131254__rollback__COMPLETED__20240528105132211]}
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
24/05/28 10:51:42 INFO BlockManagerInfo: Removed broadcast_19_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 37.4 KiB, free: 366.0 MiB)
24/05/28 10:51:42 INFO BlockManager: Removing RDD 61
24/05/28 10:51:42 INFO BlockManagerInfo: Removed broadcast_22_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 37.4 KiB, free: 366.0 MiB)
24/05/28 10:51:42 INFO BlockManagerInfo: Removed broadcast_20_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 37.4 KiB, free: 366.1 MiB)
24/05/28 10:51:42 INFO BlockManagerInfo: Removed broadcast_18_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 135.0 KiB, free: 366.2 MiB)
24/05/28 10:51:42 INFO BlockManagerInfo: Removed broadcast_21_piece0 on ip-10-0-78-189.us-west-2.compute.internal:40923 in memory (size: 37.4 KiB, free: 366.3 MiB)
24/05/28 10:51:42 INFO BlockManager: Removing RDD 71
24/05/28 10:51:43 INFO DatahubResponseLogger: Completed Datahub RestEmitter request. Status: succeeded
24/05/28 10:51:43 INFO DatahubResponseLogger: Completed Datahub RestEmitter request. Status: succeeded
24/05/28 10:51:43 INFO DatahubResponseLogger: Completed Datahub RestEmitter request. Status: succeeded
24/05/28 10:51:43 INFO StreamSync: [MetaSync] SyncTool class org.apache.hudi.sync.datahub.DataHubSyncTool completed successfully and took 0 s 0 ms
24/05/28 10:51:43 INFO StreamSync: Shutting down embedded timeline server
24/05/28 10:51:43 INFO EmbeddedTimelineService: Closing Timeline server
24/05/28 10:51:43 INFO TimelineService: Closing Timeline Service
24/05/28 10:51:43 INFO Javalin: Stopping Javalin ...
24/05/28 10:51:43 INFO Javalin: Javalin has stopped
24/05/28 10:51:43 INFO TimelineService: Closed Timeline Service
24/05/28 10:51:43 INFO EmbeddedTimelineService: Closed Timeline server
24/05/28 10:51:43 INFO HoodieIngestionService: Ingestion service (run-once mode) has been shut down.
24/05/28 10:51:43 INFO SparkUI: Stopped Spark web UI at http://ip-10-0-78-189.us-west-2.compute.internal:8090
24/05/28 10:51:43 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
24/05/28 10:51:43 INFO MemoryStore: MemoryStore cleared
24/05/28 10:51:43 INFO BlockManager: BlockManager stopped
24/05/28 10:51:43 INFO BlockManagerMaster: BlockManagerMaster stopped
24/05/28 10:51:43 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
24/05/28 10:51:43 INFO SparkContext: Successfully stopped SparkContext
24/05/28 10:51:43 INFO ShutdownHookManager: Shutdown hook called
24/05/28 10:51:43 INFO ShutdownHookManager: Deleting directory /tmp/spark-a22aa3d0-6da4-4f67-85d9-ba2e6fd1a653
24/05/28 10:51:43 INFO ShutdownHookManager: Deleting directory /tmp/spark-a3a2d54a-4756-473e-8387-a35f0036b3f5