@lukemarsden
Created August 12, 2022 14:54
22/08/12 15:51:56 WARN Utils: Your hostname, mind resolves to a loopback address: 127.0.1.1; using 10.1.255.235 instead (on interface enp6s0f0)
22/08/12 15:51:56 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
22/08/12 15:51:56 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/08/12 15:51:57 INFO SparkContext: Running Spark version 3.3.0
22/08/12 15:51:57 INFO ResourceUtils: ==============================================================
22/08/12 15:51:57 INFO ResourceUtils: No custom resources configured for spark.driver.
22/08/12 15:51:57 INFO ResourceUtils: ==============================================================
22/08/12 15:51:57 INFO SparkContext: Submitted application: spark.py
22/08/12 15:51:57 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
22/08/12 15:51:57 INFO ResourceProfile: Limiting resource is cpu
22/08/12 15:51:57 INFO ResourceProfileManager: Added ResourceProfile id: 0
22/08/12 15:51:57 INFO SecurityManager: Changing view acls to: luke
22/08/12 15:51:57 INFO SecurityManager: Changing modify acls to: luke
22/08/12 15:51:57 INFO SecurityManager: Changing view acls groups to:
22/08/12 15:51:57 INFO SecurityManager: Changing modify acls groups to:
22/08/12 15:51:57 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(luke); groups with view permissions: Set(); users with modify permissions: Set(luke); groups with modify permissions: Set()
22/08/12 15:51:57 INFO Utils: Successfully started service 'sparkDriver' on port 40305.
22/08/12 15:51:57 INFO SparkEnv: Registering MapOutputTracker
22/08/12 15:51:57 INFO SparkEnv: Registering BlockManagerMaster
22/08/12 15:51:57 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
22/08/12 15:51:57 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
22/08/12 15:51:57 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
22/08/12 15:51:57 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-d49ea96c-48f3-4b33-a6b0-6ddc85895d81
22/08/12 15:51:57 INFO MemoryStore: MemoryStore started with capacity 434.4 MiB
22/08/12 15:51:57 INFO SparkEnv: Registering OutputCommitCoordinator
22/08/12 15:51:57 INFO Utils: Successfully started service 'SparkUI' on port 4040.
22/08/12 15:51:57 INFO SparkContext: Added JAR file:///home/luke/pp/pachyderm/spark/hadoop-aws-3.3.3.jar at spark://10.1.255.235:40305/jars/hadoop-aws-3.3.3.jar with timestamp 1660315917186
22/08/12 15:51:57 INFO SparkContext: Added JAR file:///home/luke/pp/pachyderm/spark/aws-java-sdk-bundle-1.12.264.jar at spark://10.1.255.235:40305/jars/aws-java-sdk-bundle-1.12.264.jar with timestamp 1660315917186
22/08/12 15:51:57 INFO Executor: Starting executor ID driver on host 10.1.255.235
22/08/12 15:51:57 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): ''
22/08/12 15:51:57 INFO Executor: Fetching spark://10.1.255.235:40305/jars/hadoop-aws-3.3.3.jar with timestamp 1660315917186
22/08/12 15:51:58 INFO TransportClientFactory: Successfully created connection to /10.1.255.235:40305 after 21 ms (0 ms spent in bootstraps)
22/08/12 15:51:58 INFO Utils: Fetching spark://10.1.255.235:40305/jars/hadoop-aws-3.3.3.jar to /tmp/spark-659878ce-6457-4db6-8008-5727d42f7983/userFiles-b8bc4f4d-91a4-45f1-a722-3fdc5f65dce4/fetchFileTemp5104747579013272228.tmp
22/08/12 15:51:58 INFO Executor: Adding file:/tmp/spark-659878ce-6457-4db6-8008-5727d42f7983/userFiles-b8bc4f4d-91a4-45f1-a722-3fdc5f65dce4/hadoop-aws-3.3.3.jar to class loader
22/08/12 15:51:58 INFO Executor: Fetching spark://10.1.255.235:40305/jars/aws-java-sdk-bundle-1.12.264.jar with timestamp 1660315917186
22/08/12 15:51:58 INFO Utils: Fetching spark://10.1.255.235:40305/jars/aws-java-sdk-bundle-1.12.264.jar to /tmp/spark-659878ce-6457-4db6-8008-5727d42f7983/userFiles-b8bc4f4d-91a4-45f1-a722-3fdc5f65dce4/fetchFileTemp3889847115389988890.tmp
22/08/12 15:51:58 INFO Executor: Adding file:/tmp/spark-659878ce-6457-4db6-8008-5727d42f7983/userFiles-b8bc4f4d-91a4-45f1-a722-3fdc5f65dce4/aws-java-sdk-bundle-1.12.264.jar to class loader
22/08/12 15:51:58 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 37955.
22/08/12 15:51:58 INFO NettyBlockTransferService: Server created on 10.1.255.235:37955
22/08/12 15:51:58 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
22/08/12 15:51:58 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.1.255.235, 37955, None)
22/08/12 15:51:58 INFO BlockManagerMasterEndpoint: Registering block manager 10.1.255.235:37955 with 434.4 MiB RAM, BlockManagerId(driver, 10.1.255.235, 37955, None)
22/08/12 15:51:58 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.1.255.235, 37955, None)
22/08/12 15:51:58 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.1.255.235, 37955, None)
[('spark.driver.extraJavaOptions', '-XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED'),
 ('spark.hadoop.fs.s3a.connection.ssl.enabled', 'false'),
 ('spark.driver.host', '10.1.255.235'),
 ('spark.hadoop.fs.s3a.path.style.access', 'true'),
 ('spark.repl.local.jars', 'file:///home/luke/pp/pachyderm/spark/hadoop-aws-3.3.3.jar,file:///home/luke/pp/pachyderm/spark/aws-java-sdk-bundle-1.12.264.jar'),
 ('spark.executor.id', 'driver'),
 ('spark.hadoop.fs.s3a.change.detection.mode', 'none'),
 ('spark.jars', 'file:///home/luke/pp/pachyderm/spark/hadoop-aws-3.3.3.jar,file:///home/luke/pp/pachyderm/spark/aws-java-sdk-bundle-1.12.264.jar'),
 ('spark.hadoop.fs.s3a.impl', 'org.apache.hadoop.fs.s3a.S3AFileSystem'),
 ('spark.app.submitTime', '1660315916714'),
 ('spark.hadoop.fs.s3a.change.detection.version.required', 'false'),
 ('spark.app.initial.jar.urls', 'spark://10.1.255.235:40305/jars/aws-java-sdk-bundle-1.12.264.jar,spark://10.1.255.235:40305/jars/hadoop-aws-3.3.3.jar'),
 ('spark.app.name', 'spark.py'),
 ('spark.rdd.compress', 'True'),
 ('spark.hadoop.fs.s3a.endpoint', 'http://localhost:30600'),
 ('spark.app.id', 'local-1660315917910'),
 ('spark.executor.extraJavaOptions', '-XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED'),
 ('spark.hadoop.fs.s3a.access.key', 'lemon'),
 ('spark.serializer.objectStreamReset', '100'),
 ('spark.master', 'local[*]'),
 ('spark.submit.pyFiles', ''),
 ('spark.hadoop.fs.s3a.secret.key', 'lemon'),
 ('spark.submit.deployMode', 'client'),
 ('spark.driver.port', '40305'),
 ('spark.app.startTime', '1660315917186')]
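
The configuration dump above shows the job pointing the S3A filesystem at Pachyderm's S3 gateway on localhost:30600. The spark.py that produced it is not part of the gist; a minimal sketch that would yield roughly this configuration (endpoint, credentials, and jar paths copied from the dump, everything else an assumption) is:

from pyspark.sql import SparkSession

# Hedged sketch, not the original spark.py: reconstructed from the
# configuration dump above. The S3A endpoint is Pachyderm's S3 gateway;
# 'lemon'/'lemon' are the credentials shown in the dump.
jars = ",".join([
    "/home/luke/pp/pachyderm/spark/hadoop-aws-3.3.3.jar",
    "/home/luke/pp/pachyderm/spark/aws-java-sdk-bundle-1.12.264.jar",
])
spark = (
    SparkSession.builder
    .appName("spark.py")
    .master("local[*]")
    .config("spark.jars", jars)
    .config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
    .config("spark.hadoop.fs.s3a.endpoint", "http://localhost:30600")
    .config("spark.hadoop.fs.s3a.access.key", "lemon")
    .config("spark.hadoop.fs.s3a.secret.key", "lemon")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .config("spark.hadoop.fs.s3a.connection.ssl.enabled", "false")
    .config("spark.hadoop.fs.s3a.change.detection.mode", "none")
    .config("spark.hadoop.fs.s3a.change.detection.version.required", "false")
    .getOrCreate()
)
print(spark.sparkContext.getConf().getAll())  # prints the dump above
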
22/08/12 15:51:58 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir.
22/08/12 15:51:58 INFO SharedState: Warehouse path is 'file:/home/luke/pp/pachyderm/spark/spark-warehouse'.
22/08/12 15:52:01 INFO CodeGenerator: Code generated in 148.728963 ms
22/08/12 15:52:01 INFO SparkContext: Starting job: showString at NativeMethodAccessorImpl.java:0
22/08/12 15:52:01 INFO DAGScheduler: Got job 0 (showString at NativeMethodAccessorImpl.java:0) with 1 output partitions
22/08/12 15:52:01 INFO DAGScheduler: Final stage: ResultStage 0 (showString at NativeMethodAccessorImpl.java:0)
22/08/12 15:52:01 INFO DAGScheduler: Parents of final stage: List()
22/08/12 15:52:01 INFO DAGScheduler: Missing parents: List()
22/08/12 15:52:01 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[6] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents
22/08/12 15:52:01 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 12.4 KiB, free 434.4 MiB)
22/08/12 15:52:01 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 6.6 KiB, free 434.4 MiB)
22/08/12 15:52:01 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.1.255.235:37955 (size: 6.6 KiB, free: 434.4 MiB)
22/08/12 15:52:01 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1513
22/08/12 15:52:01 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[6] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0))
22/08/12 15:52:01 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks resource profile 0
22/08/12 15:52:01 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0) (10.1.255.235, executor driver, partition 0, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:01 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
22/08/12 15:52:02 INFO PythonRunner: Times: total = 411, boot = 350, init = 60, finish = 1
22/08/12 15:52:02 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 567 ms on 10.1.255.235 (executor driver) (1/1)
22/08/12 15:52:02 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
22/08/12 15:52:02 INFO PythonAccumulatorV2: Connected to AccumulatorServer at host: 127.0.0.1 port: 45075
22/08/12 15:52:02 INFO DAGScheduler: ResultStage 0 (showString at NativeMethodAccessorImpl.java:0) finished in 0.788 s
22/08/12 15:52:02 INFO DAGScheduler: Job 0 is finished. Cancelling potential speculative or zombie tasks for this job
22/08/12 15:52:02 INFO TaskSchedulerImpl: Killing all running tasks in stage 0: Stage finished
22/08/12 15:52:02 INFO DAGScheduler: Job 0 finished: showString at NativeMethodAccessorImpl.java:0, took 0.846117 s
22/08/12 15:52:02 INFO SparkContext: Starting job: showString at NativeMethodAccessorImpl.java:0
22/08/12 15:52:02 INFO DAGScheduler: Got job 1 (showString at NativeMethodAccessorImpl.java:0) with 4 output partitions
22/08/12 15:52:02 INFO DAGScheduler: Final stage: ResultStage 1 (showString at NativeMethodAccessorImpl.java:0)
22/08/12 15:52:02 INFO DAGScheduler: Parents of final stage: List()
22/08/12 15:52:02 INFO DAGScheduler: Missing parents: List()
22/08/12 15:52:02 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[6] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents
22/08/12 15:52:02 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 12.4 KiB, free 434.4 MiB)
22/08/12 15:52:02 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 6.6 KiB, free 434.4 MiB)
22/08/12 15:52:02 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 10.1.255.235:37955 (size: 6.6 KiB, free: 434.4 MiB)
22/08/12 15:52:02 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1513
22/08/12 15:52:02 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 1 (MapPartitionsRDD[6] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(1, 2, 3, 4))
22/08/12 15:52:02 INFO TaskSchedulerImpl: Adding task set 1.0 with 4 tasks resource profile 0
22/08/12 15:52:02 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1) (10.1.255.235, executor driver, partition 1, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 2) (10.1.255.235, executor driver, partition 2, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO TaskSetManager: Starting task 2.0 in stage 1.0 (TID 3) (10.1.255.235, executor driver, partition 3, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO TaskSetManager: Starting task 3.0 in stage 1.0 (TID 4) (10.1.255.235, executor driver, partition 4, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
22/08/12 15:52:02 INFO Executor: Running task 1.0 in stage 1.0 (TID 2)
22/08/12 15:52:02 INFO Executor: Running task 2.0 in stage 1.0 (TID 3)
22/08/12 15:52:02 INFO Executor: Running task 3.0 in stage 1.0 (TID 4)
22/08/12 15:52:02 INFO PythonRunner: Times: total = 64, boot = 4, init = 60, finish = 0
22/08/12 15:52:02 INFO Executor: Finished task 3.0 in stage 1.0 (TID 4). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO TaskSetManager: Finished task 3.0 in stage 1.0 (TID 4) in 87 ms on 10.1.255.235 (executor driver) (1/4)
22/08/12 15:52:02 INFO PythonRunner: Times: total = 69, boot = 5, init = 64, finish = 0
22/08/12 15:52:02 INFO Executor: Finished task 2.0 in stage 1.0 (TID 3). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO TaskSetManager: Finished task 2.0 in stage 1.0 (TID 3) in 97 ms on 10.1.255.235 (executor driver) (2/4)
22/08/12 15:52:02 INFO PythonRunner: Times: total = 88, boot = -85, init = 173, finish = 0
22/08/12 15:52:02 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 108 ms on 10.1.255.235 (executor driver) (3/4)
22/08/12 15:52:02 INFO PythonRunner: Times: total = 89, boot = 9, init = 80, finish = 0
22/08/12 15:52:02 INFO Executor: Finished task 1.0 in stage 1.0 (TID 2). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 2) in 115 ms on 10.1.255.235 (executor driver) (4/4)
22/08/12 15:52:02 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
22/08/12 15:52:02 INFO DAGScheduler: ResultStage 1 (showString at NativeMethodAccessorImpl.java:0) finished in 0.132 s
22/08/12 15:52:02 INFO DAGScheduler: Job 1 is finished. Cancelling potential speculative or zombie tasks for this job
22/08/12 15:52:02 INFO TaskSchedulerImpl: Killing all running tasks in stage 1: Stage finished
22/08/12 15:52:02 INFO DAGScheduler: Job 1 finished: showString at NativeMethodAccessorImpl.java:0, took 0.139647 s
22/08/12 15:52:02 INFO SparkContext: Starting job: showString at NativeMethodAccessorImpl.java:0
22/08/12 15:52:02 INFO DAGScheduler: Got job 2 (showString at NativeMethodAccessorImpl.java:0) with 11 output partitions
22/08/12 15:52:02 INFO DAGScheduler: Final stage: ResultStage 2 (showString at NativeMethodAccessorImpl.java:0)
22/08/12 15:52:02 INFO DAGScheduler: Parents of final stage: List()
22/08/12 15:52:02 INFO DAGScheduler: Missing parents: List()
22/08/12 15:52:02 INFO DAGScheduler: Submitting ResultStage 2 (MapPartitionsRDD[6] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents
22/08/12 15:52:02 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 12.4 KiB, free 434.4 MiB)
22/08/12 15:52:02 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 6.6 KiB, free 434.3 MiB)
22/08/12 15:52:02 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 10.1.255.235:37955 (size: 6.6 KiB, free: 434.4 MiB)
22/08/12 15:52:02 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1513
22/08/12 15:52:02 INFO DAGScheduler: Submitting 11 missing tasks from ResultStage 2 (MapPartitionsRDD[6] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15))
22/08/12 15:52:02 INFO TaskSchedulerImpl: Adding task set 2.0 with 11 tasks resource profile 0
22/08/12 15:52:02 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 5) (10.1.255.235, executor driver, partition 5, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO TaskSetManager: Starting task 1.0 in stage 2.0 (TID 6) (10.1.255.235, executor driver, partition 6, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO TaskSetManager: Starting task 2.0 in stage 2.0 (TID 7) (10.1.255.235, executor driver, partition 7, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO TaskSetManager: Starting task 3.0 in stage 2.0 (TID 8) (10.1.255.235, executor driver, partition 8, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO TaskSetManager: Starting task 4.0 in stage 2.0 (TID 9) (10.1.255.235, executor driver, partition 9, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO TaskSetManager: Starting task 5.0 in stage 2.0 (TID 10) (10.1.255.235, executor driver, partition 10, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO TaskSetManager: Starting task 6.0 in stage 2.0 (TID 11) (10.1.255.235, executor driver, partition 11, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO TaskSetManager: Starting task 7.0 in stage 2.0 (TID 12) (10.1.255.235, executor driver, partition 12, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO TaskSetManager: Starting task 8.0 in stage 2.0 (TID 13) (10.1.255.235, executor driver, partition 13, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO TaskSetManager: Starting task 9.0 in stage 2.0 (TID 14) (10.1.255.235, executor driver, partition 14, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO TaskSetManager: Starting task 10.0 in stage 2.0 (TID 15) (10.1.255.235, executor driver, partition 15, PROCESS_LOCAL, 4471 bytes) taskResourceAssignments Map()
22/08/12 15:52:02 INFO Executor: Running task 3.0 in stage 2.0 (TID 8)
22/08/12 15:52:02 INFO Executor: Running task 2.0 in stage 2.0 (TID 7)
22/08/12 15:52:02 INFO Executor: Running task 0.0 in stage 2.0 (TID 5)
22/08/12 15:52:02 INFO Executor: Running task 1.0 in stage 2.0 (TID 6)
22/08/12 15:52:02 INFO Executor: Running task 4.0 in stage 2.0 (TID 9)
22/08/12 15:52:02 INFO Executor: Running task 5.0 in stage 2.0 (TID 10)
22/08/12 15:52:02 INFO Executor: Running task 6.0 in stage 2.0 (TID 11)
22/08/12 15:52:02 INFO Executor: Running task 8.0 in stage 2.0 (TID 13)
22/08/12 15:52:02 INFO Executor: Running task 7.0 in stage 2.0 (TID 12)
22/08/12 15:52:02 INFO Executor: Running task 9.0 in stage 2.0 (TID 14)
22/08/12 15:52:02 INFO Executor: Running task 10.0 in stage 2.0 (TID 15)
22/08/12 15:52:02 INFO PythonRunner: Times: total = 81, boot = -50, init = 131, finish = 0
22/08/12 15:52:02 INFO PythonRunner: Times: total = 82, boot = -53, init = 135, finish = 0
22/08/12 15:52:02 INFO PythonRunner: Times: total = 79, boot = 14, init = 64, finish = 1
22/08/12 15:52:02 INFO PythonRunner: Times: total = 85, boot = -63, init = 147, finish = 1
22/08/12 15:52:02 INFO PythonRunner: Times: total = 84, boot = -41, init = 125, finish = 0
22/08/12 15:52:02 INFO Executor: Finished task 7.0 in stage 2.0 (TID 12). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO Executor: Finished task 4.0 in stage 2.0 (TID 9). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO Executor: Finished task 3.0 in stage 2.0 (TID 8). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO TaskSetManager: Finished task 7.0 in stage 2.0 (TID 12) in 106 ms on 10.1.255.235 (executor driver) (1/11)
22/08/12 15:52:02 INFO TaskSetManager: Finished task 4.0 in stage 2.0 (TID 9) in 109 ms on 10.1.255.235 (executor driver) (2/11)
22/08/12 15:52:02 INFO Executor: Finished task 1.0 in stage 2.0 (TID 6). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO TaskSetManager: Finished task 3.0 in stage 2.0 (TID 8) in 110 ms on 10.1.255.235 (executor driver) (3/11)
22/08/12 15:52:02 INFO PythonRunner: Times: total = 90, boot = 5, init = 85, finish = 0
22/08/12 15:52:02 INFO TaskSetManager: Finished task 1.0 in stage 2.0 (TID 6) in 118 ms on 10.1.255.235 (executor driver) (4/11)
22/08/12 15:52:02 INFO Executor: Finished task 5.0 in stage 2.0 (TID 10). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO Executor: Finished task 2.0 in stage 2.0 (TID 7). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO TaskSetManager: Finished task 5.0 in stage 2.0 (TID 10) in 120 ms on 10.1.255.235 (executor driver) (5/11)
22/08/12 15:52:02 INFO PythonRunner: Times: total = 97, boot = 8, init = 89, finish = 0
22/08/12 15:52:02 INFO TaskSetManager: Finished task 2.0 in stage 2.0 (TID 7) in 127 ms on 10.1.255.235 (executor driver) (6/11)
22/08/12 15:52:02 INFO Executor: Finished task 8.0 in stage 2.0 (TID 13). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO TaskSetManager: Finished task 8.0 in stage 2.0 (TID 13) in 131 ms on 10.1.255.235 (executor driver) (7/11)
22/08/12 15:52:02 INFO PythonRunner: Times: total = 113, boot = 4, init = 109, finish = 0
22/08/12 15:52:02 INFO Executor: Finished task 0.0 in stage 2.0 (TID 5). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 5) in 143 ms on 10.1.255.235 (executor driver) (8/11)
22/08/12 15:52:02 INFO PythonRunner: Times: total = 119, boot = 12, init = 107, finish = 0
22/08/12 15:52:02 INFO Executor: Finished task 10.0 in stage 2.0 (TID 15). 1830 bytes result sent to driver
22/08/12 15:52:02 INFO PythonRunner: Times: total = 125, boot = 19, init = 105, finish = 1
22/08/12 15:52:02 INFO TaskSetManager: Finished task 10.0 in stage 2.0 (TID 15) in 150 ms on 10.1.255.235 (executor driver) (9/11)
22/08/12 15:52:02 INFO PythonRunner: Times: total = 131, boot = 19, init = 112, finish = 0
22/08/12 15:52:02 INFO Executor: Finished task 9.0 in stage 2.0 (TID 14). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO Executor: Finished task 6.0 in stage 2.0 (TID 11). 1789 bytes result sent to driver
22/08/12 15:52:02 INFO TaskSetManager: Finished task 9.0 in stage 2.0 (TID 14) in 155 ms on 10.1.255.235 (executor driver) (10/11)
22/08/12 15:52:02 INFO TaskSetManager: Finished task 6.0 in stage 2.0 (TID 11) in 159 ms on 10.1.255.235 (executor driver) (11/11)
22/08/12 15:52:02 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool
22/08/12 15:52:02 INFO DAGScheduler: ResultStage 2 (showString at NativeMethodAccessorImpl.java:0) finished in 0.176 s
22/08/12 15:52:02 INFO DAGScheduler: Job 2 is finished. Cancelling potential speculative or zombie tasks for this job
22/08/12 15:52:02 INFO TaskSchedulerImpl: Killing all running tasks in stage 2: Stage finished
22/08/12 15:52:02 INFO DAGScheduler: Job 2 finished: showString at NativeMethodAccessorImpl.java:0, took 0.182775 s
22/08/12 15:52:02 INFO CodeGenerator: Code generated in 14.807713 ms
+---+---+
| a| b|
+---+---+
| 1|2.0|
+---+---+
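
The table above is df.show() output for a single-row DataFrame with a long column a and a double column b; jobs 0-2 are the showString stages it triggers across the 16 local partitions, and job 3 below is the Parquet write. A hedged reconstruction of the driver code (the exact code is not in the gist; the output path is taken from the FileFormatWriter error at the end of the log):

# Hedged reconstruction; the actual spark.py is not included in the gist.
df = spark.createDataFrame([(1, 2.0)], ["a", "b"])       # matches the table above
df.show()                                                # jobs 0-2 (showString)
df.write.parquet("s3a://master.rando2/nonemptyprefix5")  # job 3, aborts at commit
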
22/08/12 15:52:03 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
22/08/12 15:52:03 INFO MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
22/08/12 15:52:03 INFO MetricsSystemImpl: s3a-file-system metrics system started
22/08/12 15:52:03 INFO ParquetFileFormat: Using default output committer for Parquet: org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:03 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:03 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:03 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:03 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:03 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:03 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO CodeGenerator: Code generated in 13.698056 ms
22/08/12 15:52:04 INFO SparkContext: Starting job: parquet at NativeMethodAccessorImpl.java:0
22/08/12 15:52:04 INFO DAGScheduler: Got job 3 (parquet at NativeMethodAccessorImpl.java:0) with 16 output partitions
22/08/12 15:52:04 INFO DAGScheduler: Final stage: ResultStage 3 (parquet at NativeMethodAccessorImpl.java:0)
22/08/12 15:52:04 INFO DAGScheduler: Parents of final stage: List()
22/08/12 15:52:04 INFO DAGScheduler: Missing parents: List()
22/08/12 15:52:04 INFO DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[7] at parquet at NativeMethodAccessorImpl.java:0), which has no missing parents
22/08/12 15:52:04 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 214.9 KiB, free 434.1 MiB)
22/08/12 15:52:04 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 77.9 KiB, free 434.1 MiB)
22/08/12 15:52:04 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 10.1.255.235:37955 (size: 77.9 KiB, free: 434.3 MiB)
22/08/12 15:52:04 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1513
22/08/12 15:52:04 INFO DAGScheduler: Submitting 16 missing tasks from ResultStage 3 (MapPartitionsRDD[7] at parquet at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14))
22/08/12 15:52:04 INFO TaskSchedulerImpl: Adding task set 3.0 with 16 tasks resource profile 0
22/08/12 15:52:04 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 16) (10.1.255.235, executor driver, partition 0, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO TaskSetManager: Starting task 1.0 in stage 3.0 (TID 17) (10.1.255.235, executor driver, partition 1, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO TaskSetManager: Starting task 2.0 in stage 3.0 (TID 18) (10.1.255.235, executor driver, partition 2, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO TaskSetManager: Starting task 3.0 in stage 3.0 (TID 19) (10.1.255.235, executor driver, partition 3, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO TaskSetManager: Starting task 4.0 in stage 3.0 (TID 20) (10.1.255.235, executor driver, partition 4, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO TaskSetManager: Starting task 5.0 in stage 3.0 (TID 21) (10.1.255.235, executor driver, partition 5, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO TaskSetManager: Starting task 6.0 in stage 3.0 (TID 22) (10.1.255.235, executor driver, partition 6, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO TaskSetManager: Starting task 7.0 in stage 3.0 (TID 23) (10.1.255.235, executor driver, partition 7, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO TaskSetManager: Starting task 8.0 in stage 3.0 (TID 24) (10.1.255.235, executor driver, partition 8, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO TaskSetManager: Starting task 9.0 in stage 3.0 (TID 25) (10.1.255.235, executor driver, partition 9, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO TaskSetManager: Starting task 10.0 in stage 3.0 (TID 26) (10.1.255.235, executor driver, partition 10, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO TaskSetManager: Starting task 11.0 in stage 3.0 (TID 27) (10.1.255.235, executor driver, partition 11, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO BlockManagerInfo: Removed broadcast_1_piece0 on 10.1.255.235:37955 in memory (size: 6.6 KiB, free: 434.3 MiB)
22/08/12 15:52:04 INFO TaskSetManager: Starting task 12.0 in stage 3.0 (TID 28) (10.1.255.235, executor driver, partition 12, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO TaskSetManager: Starting task 13.0 in stage 3.0 (TID 29) (10.1.255.235, executor driver, partition 13, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO TaskSetManager: Starting task 14.0 in stage 3.0 (TID 30) (10.1.255.235, executor driver, partition 14, PROCESS_LOCAL, 4433 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO TaskSetManager: Starting task 15.0 in stage 3.0 (TID 31) (10.1.255.235, executor driver, partition 15, PROCESS_LOCAL, 4471 bytes) taskResourceAssignments Map()
22/08/12 15:52:04 INFO Executor: Running task 0.0 in stage 3.0 (TID 16)
22/08/12 15:52:04 INFO Executor: Running task 1.0 in stage 3.0 (TID 17)
22/08/12 15:52:04 INFO Executor: Running task 2.0 in stage 3.0 (TID 18)
22/08/12 15:52:04 INFO Executor: Running task 5.0 in stage 3.0 (TID 21)
22/08/12 15:52:04 INFO Executor: Running task 4.0 in stage 3.0 (TID 20)
22/08/12 15:52:04 INFO Executor: Running task 3.0 in stage 3.0 (TID 19)
22/08/12 15:52:04 INFO Executor: Running task 8.0 in stage 3.0 (TID 24)
22/08/12 15:52:04 INFO Executor: Running task 7.0 in stage 3.0 (TID 23)
22/08/12 15:52:04 INFO Executor: Running task 6.0 in stage 3.0 (TID 22)
22/08/12 15:52:04 INFO Executor: Running task 10.0 in stage 3.0 (TID 26)
22/08/12 15:52:04 INFO Executor: Running task 9.0 in stage 3.0 (TID 25)
22/08/12 15:52:04 INFO Executor: Running task 11.0 in stage 3.0 (TID 27)
22/08/12 15:52:04 INFO Executor: Running task 13.0 in stage 3.0 (TID 29)
22/08/12 15:52:04 INFO Executor: Running task 12.0 in stage 3.0 (TID 28)
22/08/12 15:52:04 INFO Executor: Running task 14.0 in stage 3.0 (TID 30)
22/08/12 15:52:04 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 10.1.255.235:37955 in memory (size: 6.6 KiB, free: 434.3 MiB)
22/08/12 15:52:04 INFO Executor: Running task 15.0 in stage 3.0 (TID 31)
22/08/12 15:52:04 INFO BlockManagerInfo: Removed broadcast_2_piece0 on 10.1.255.235:37955 in memory (size: 6.6 KiB, free: 434.3 MiB)
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO PythonRunner: Times: total = 124, boot = -1803, init = 1927, finish = 0
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO PythonRunner: Times: total = 101, boot = -1805, init = 1906, finish = 0
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO PythonRunner: Times: total = 126, boot = -1811, init = 1936, finish = 1
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO PythonRunner: Times: total = 120, boot = -1829, init = 1949, finish = 0
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO PythonRunner: Times: total = 109, boot = -1827, init = 1936, finish = 0
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO PythonRunner: Times: total = 194, boot = -1849, init = 2043, finish = 0
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO PythonRunner: Times: total = 153, boot = -1846, init = 1999, finish = 0
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO PythonRunner: Times: total = 190, boot = -1841, init = 2031, finish = 0
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO PythonRunner: Times: total = 93, boot = -1851, init = 1944, finish = 0
22/08/12 15:52:04 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO CodecConfig: Compression: SNAPPY
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO PythonRunner: Times: total = 120, boot = -1819, init = 1939, finish = 0
22/08/12 15:52:04 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
22/08/12 15:52:04 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
22/08/12 15:52:04 INFO CodecConfig: Compression: SNAPPY
22/08/12 15:52:04 INFO CodecConfig: Compression: SNAPPY
22/08/12 15:52:04 INFO CodecConfig: Compression: SNAPPY
22/08/12 15:52:04 INFO PythonRunner: Times: total = 252, boot = 6, init = 246, finish = 0
22/08/12 15:52:04 INFO PythonRunner: Times: total = 258, boot = 4, init = 254, finish = 0
22/08/12 15:52:04 INFO PythonRunner: Times: total = 272, boot = 18, init = 254, finish = 0
22/08/12 15:52:04 INFO ParquetOutputFormat: Parquet block size to 134217728
22/08/12 15:52:04 INFO ParquetOutputFormat: Validation is off
22/08/12 15:52:04 INFO ParquetOutputFormat: Maximum row group padding size is 8388608 bytes
22/08/12 15:52:04 INFO ParquetOutputFormat: Parquet properties are:
Parquet page size to 1048576
Parquet dictionary page size to 1048576
Dictionary is true
Writer version is: PARQUET_1_0
Page size checking is: estimated
Min row count for page size check is: 100
Max row count for page size check is: 10000
Truncate length for column indexes is: 64
Truncate length for statistics min/max is: 2147483647
Bloom filter enabled: false
Max Bloom filter size for a column is 1048576
Bloom filter expected number of distinct values are: null
Page row count limit to 20000
Writing page checksums is: on
22/08/12 15:52:04 INFO PythonRunner: Times: total = 300, boot = 24, init = 275, finish = 1
22/08/12 15:52:04 INFO ParquetOutputFormat: Parquet block size to 134217728
22/08/12 15:52:04 INFO ParquetOutputFormat: Validation is off
22/08/12 15:52:04 INFO ParquetOutputFormat: Maximum row group padding size is 8388608 bytes
22/08/12 15:52:04 INFO ParquetOutputFormat: Parquet properties are:
Parquet page size to 1048576
Parquet dictionary page size to 1048576
Dictionary is true
Writer version is: PARQUET_1_0
Page size checking is: estimated
Min row count for page size check is: 100
Max row count for page size check is: 10000
Truncate length for column indexes is: 64
Truncate length for statistics min/max is: 2147483647
Bloom filter enabled: false
Max Bloom filter size for a column is 1048576
Bloom filter expected number of distinct values are: null
Page row count limit to 20000
Writing page checksums is: on
22/08/12 15:52:05 INFO ParquetWriteSupport: Initialized Parquet WriteSupport with Catalyst schema:
{
"type" : "struct",
"fields" : [ {
"name" : "a",
"type" : "long",
"nullable" : true,
"metadata" : { }
}, {
"name" : "b",
"type" : "double",
"nullable" : true,
"metadata" : { }
} ]
}
and corresponding Parquet message type:
message spark_schema {
optional int64 a;
optional double b;
}
22/08/12 15:52:05 INFO ParquetWriteSupport: Initialized Parquet WriteSupport with Catalyst schema:
{
"type" : "struct",
"fields" : [ {
"name" : "a",
"type" : "long",
"nullable" : true,
"metadata" : { }
}, {
"name" : "b",
"type" : "double",
"nullable" : true,
"metadata" : { }
} ]
}
and corresponding Parquet message type:
message spark_schema {
optional int64 a;
optional double b;
}
22/08/12 15:52:05 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_202208121552044067496105161590318_0003_m_000004_20
22/08/12 15:52:05 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_202208121552041591615013203944757_0003_m_000012_28
22/08/12 15:52:05 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_202208121552043522334103219208464_0003_m_000008_24
22/08/12 15:52:05 INFO Executor: Finished task 12.0 in stage 3.0 (TID 28). 2828 bytes result sent to driver
22/08/12 15:52:05 INFO Executor: Finished task 8.0 in stage 3.0 (TID 24). 2785 bytes result sent to driver
22/08/12 15:52:05 INFO Executor: Finished task 4.0 in stage 3.0 (TID 20). 2828 bytes result sent to driver
22/08/12 15:52:05 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_20220812155204154530677864664845_0003_m_000002_18
22/08/12 15:52:05 INFO TaskSetManager: Finished task 12.0 in stage 3.0 (TID 28) in 615 ms on 10.1.255.235 (executor driver) (1/16)
22/08/12 15:52:05 INFO TaskSetManager: Finished task 4.0 in stage 3.0 (TID 20) in 621 ms on 10.1.255.235 (executor driver) (2/16)
22/08/12 15:52:05 INFO Executor: Finished task 2.0 in stage 3.0 (TID 18). 2785 bytes result sent to driver
22/08/12 15:52:05 INFO TaskSetManager: Finished task 8.0 in stage 3.0 (TID 24) in 621 ms on 10.1.255.235 (executor driver) (3/16)
22/08/12 15:52:05 INFO TaskSetManager: Finished task 2.0 in stage 3.0 (TID 18) in 634 ms on 10.1.255.235 (executor driver) (4/16)
22/08/12 15:52:05 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_202208121552044913070298919379450_0003_m_000006_22
22/08/12 15:52:05 INFO Executor: Finished task 6.0 in stage 3.0 (TID 22). 2785 bytes result sent to driver
22/08/12 15:52:05 INFO TaskSetManager: Finished task 6.0 in stage 3.0 (TID 22) in 648 ms on 10.1.255.235 (executor driver) (5/16)
22/08/12 15:52:05 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_202208121552048959010947190359361_0003_m_000010_26
22/08/12 15:52:05 INFO Executor: Finished task 10.0 in stage 3.0 (TID 26). 2785 bytes result sent to driver
22/08/12 15:52:05 INFO TaskSetManager: Finished task 10.0 in stage 3.0 (TID 26) in 661 ms on 10.1.255.235 (executor driver) (6/16)
22/08/12 15:52:05 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_20220812155204938114852032463722_0003_m_000003_19
22/08/12 15:52:05 INFO Executor: Finished task 3.0 in stage 3.0 (TID 19). 2785 bytes result sent to driver
22/08/12 15:52:05 INFO TaskSetManager: Finished task 3.0 in stage 3.0 (TID 19) in 701 ms on 10.1.255.235 (executor driver) (7/16)
22/08/12 15:52:05 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_202208121552042257571263133994627_0003_m_000007_23
22/08/12 15:52:05 INFO Executor: Finished task 7.0 in stage 3.0 (TID 23). 2785 bytes result sent to driver
22/08/12 15:52:05 INFO TaskSetManager: Finished task 7.0 in stage 3.0 (TID 23) in 709 ms on 10.1.255.235 (executor driver) (8/16)
22/08/12 15:52:05 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_202208121552045936362877118075426_0003_m_000001_17
22/08/12 15:52:05 INFO Executor: Finished task 1.0 in stage 3.0 (TID 17). 2785 bytes result sent to driver
22/08/12 15:52:05 INFO TaskSetManager: Finished task 1.0 in stage 3.0 (TID 17) in 740 ms on 10.1.255.235 (executor driver) (9/16)
22/08/12 15:52:05 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_20220812155204982699882531607104_0003_m_000014_30
22/08/12 15:52:05 INFO Executor: Finished task 14.0 in stage 3.0 (TID 30). 2785 bytes result sent to driver
22/08/12 15:52:05 INFO TaskSetManager: Finished task 14.0 in stage 3.0 (TID 30) in 738 ms on 10.1.255.235 (executor driver) (10/16)
22/08/12 15:52:05 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_202208121552042253980659633085161_0003_m_000005_21
22/08/12 15:52:05 INFO Executor: Finished task 5.0 in stage 3.0 (TID 21). 2785 bytes result sent to driver
22/08/12 15:52:05 INFO TaskSetManager: Finished task 5.0 in stage 3.0 (TID 21) in 758 ms on 10.1.255.235 (executor driver) (11/16)
22/08/12 15:52:05 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_202208121552048928859914082921635_0003_m_000011_27
22/08/12 15:52:05 INFO Executor: Finished task 11.0 in stage 3.0 (TID 27). 2785 bytes result sent to driver
22/08/12 15:52:05 INFO TaskSetManager: Finished task 11.0 in stage 3.0 (TID 27) in 761 ms on 10.1.255.235 (executor driver) (12/16)
22/08/12 15:52:05 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_202208121552043289737422578421763_0003_m_000009_25
22/08/12 15:52:05 INFO Executor: Finished task 9.0 in stage 3.0 (TID 25). 2785 bytes result sent to driver
22/08/12 15:52:05 INFO TaskSetManager: Finished task 9.0 in stage 3.0 (TID 25) in 771 ms on 10.1.255.235 (executor driver) (13/16)
22/08/12 15:52:05 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_202208121552041868800916138177761_0003_m_000013_29
22/08/12 15:52:05 INFO Executor: Finished task 13.0 in stage 3.0 (TID 29). 2785 bytes result sent to driver
22/08/12 15:52:05 INFO TaskSetManager: Finished task 13.0 in stage 3.0 (TID 29) in 779 ms on 10.1.255.235 (executor driver) (14/16)
22/08/12 15:52:05 INFO CodecPool: Got brand-new compressor [.snappy]
22/08/12 15:52:05 INFO CodecPool: Got brand-new compressor [.snappy]
22/08/12 15:52:05 INFO PythonRunner: Times: total = 113, boot = -1839, init = 1952, finish = 0
22/08/12 15:52:05 INFO PythonRunner: Times: total = 248, boot = 15, init = 232, finish = 1
22/08/12 15:52:09 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_20220812155204348417260796102101_0003_m_000015_31
22/08/12 15:52:09 WARN BasicWriteTaskStatsTracker: Expected 1 files, but only saw 0. This could be due to the output format not writing empty files, or files being not immediately visible in the filesystem.
22/08/12 15:52:09 INFO Executor: Finished task 15.0 in stage 3.0 (TID 31). 2828 bytes result sent to driver
22/08/12 15:52:09 INFO TaskSetManager: Finished task 15.0 in stage 3.0 (TID 31) in 5491 ms on 10.1.255.235 (executor driver) (15/16)
22/08/12 15:52:10 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_202208121552048305613759150302975_0003_m_000000_16
22/08/12 15:52:10 WARN BasicWriteTaskStatsTracker: Expected 1 files, but only saw 0. This could be due to the output format not writing empty files, or files being not immediately visible in the filesystem.
22/08/12 15:52:10 INFO Executor: Finished task 0.0 in stage 3.0 (TID 16). 2785 bytes result sent to driver
22/08/12 15:52:10 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 16) in 5510 ms on 10.1.255.235 (executor driver) (16/16)
22/08/12 15:52:10 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have all completed, from pool
22/08/12 15:52:10 INFO DAGScheduler: ResultStage 3 (parquet at NativeMethodAccessorImpl.java:0) finished in 5.602 s
22/08/12 15:52:10 INFO DAGScheduler: Job 3 is finished. Cancelling potential speculative or zombie tasks for this job
22/08/12 15:52:10 INFO TaskSchedulerImpl: Killing all running tasks in stage 3: Stage finished
22/08/12 15:52:10 INFO DAGScheduler: Job 3 finished: parquet at NativeMethodAccessorImpl.java:0, took 5.610808 s
22/08/12 15:52:10 INFO FileFormatWriter: Start to commit write Job 17da440a-28ed-4d3a-aa02-c3a3551500bf.
22/08/12 15:52:10 ERROR FileFormatWriter: Aborting job 17da440a-28ed-4d3a-aa02-c3a3551500bf.
java.io.FileNotFoundException: No such file or directory: s3a://master.rando2/nonemptyprefix5/_temporary/0
at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:3866)
at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3688)
at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:3300)
at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$null$20(S3AFileSystem.java:3264)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:117)
at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$21(S3AFileSystem.java:3263)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:444)
at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2337)
at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2356)
at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:3262)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1972)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:2014)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.getAllCommittedTaskPaths(FileOutputCommitter.java:334)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJobInternal(FileOutputCommitter.java:404)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJob(FileOutputCommitter.java:377)
at org.apache.parquet.hadoop.ParquetOutputCommitter.commitJob(ParquetOutputCommitter.java:48)
at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.commitJob(HadoopMapReduceCommitProtocol.scala:192)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$25(FileFormatWriter.scala:267)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:642)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:267)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:186)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:113)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:111)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:125)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:98)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:109)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:94)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:584)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:176)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:584)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:560)
at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:94)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:81)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:79)
at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:116)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:860)
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:390)
at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:363)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:239)
at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:793)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.base/java.lang.Thread.run(Thread.java:829)
Traceback (most recent call last):
File "/home/luke/pp/pachyderm/spark/spark.py", line 47, in <module>
df.write.parquet('s3a://master.rando2/nonemptyprefix5', mode="overwrite")
File "/home/luke/pp/pachyderm/venv/lib/python3.10/site-packages/pyspark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 1140, in parquet
File "/home/luke/pp/pachyderm/venv/lib/python3.10/site-packages/pyspark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
File "/home/luke/pp/pachyderm/venv/lib/python3.10/site-packages/pyspark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 190, in deco
File "/home/luke/pp/pachyderm/venv/lib/python3.10/site-packages/pyspark/python/lib/py4j-0.10.9.5-src.zip/py4j/protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o241.parquet.
: org.apache.spark.SparkException: Job aborted.
at org.apache.spark.sql.errors.QueryExecutionErrors$.jobAbortedError(QueryExecutionErrors.scala:638)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:278)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:186)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:113)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:111)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:125)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:98)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:109)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:94)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:584)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:176)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:584)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:560)
at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:94)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:81)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:79)
at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:116)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:860)
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:390)
at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:363)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:239)
at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:793)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.FileNotFoundException: No such file or directory: s3a://master.rando2/nonemptyprefix5/_temporary/0
at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:3866)
at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3688)
at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:3300)
at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$null$20(S3AFileSystem.java:3264)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:117)
at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$21(S3AFileSystem.java:3263)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:444)
at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2337)
at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2356)
at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:3262)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1972)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:2014)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.getAllCommittedTaskPaths(FileOutputCommitter.java:334)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJobInternal(FileOutputCommitter.java:404)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJob(FileOutputCommitter.java:377)
at org.apache.parquet.hadoop.ParquetOutputCommitter.commitJob(ParquetOutputCommitter.java:48)
at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.commitJob(HadoopMapReduceCommitProtocol.scala:192)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$25(FileFormatWriter.scala:267)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:642)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:267)
... 42 more
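Root cause: the default FileOutputCommitter (algorithm v1) finishes a job by listing s3a://master.rando2/nonemptyprefix5/_temporary/0 to gather the committed task paths, and that directory does not exist at commit time. This is a known weak point of rename-based committers on object stores, and judging by the paths in the traceback the target here appears to be Pachyderm's S3 gateway rather than S3 itself, which may not persist the _temporary tree at all. A minimal sketch of one common workaround, assuming the Spark 3.3.0 + hadoop-aws 3.3.3 setup from this log (endpoint and credentials omitted; the DataFrame is a stand-in for the original df in spark.py):

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("spark.py")
        # Algorithm v2 moves each task's files directly into the destination
        # at task commit, so job commit no longer needs to list _temporary/0.
        .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
        # Alternative: the S3A "magic" committer avoids renames entirely, but
        # it requires the spark-hadoop-cloud module on the classpath:
        # .config("spark.hadoop.fs.s3a.committer.name", "magic")
        # .config("spark.sql.sources.commitProtocolClass",
        #         "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol")
        # .config("spark.sql.parquet.output.committer.class",
        #         "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter")
        .getOrCreate()
    )

    # Stand-in data; the original script builds df elsewhere.
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])
    df.write.parquet("s3a://master.rando2/nonemptyprefix5", mode="overwrite")

Note that algorithm v2 still relies on per-task renames and listings, so if the gateway drops the _temporary tree entirely it may not be sufficient; the magic committer (or writing to a store with full rename semantics) is the more robust route. A successful run can be verified by reading the output back with spark.read.parquet on the same path.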
22/08/12 15:52:10 INFO SparkContext: Invoking stop() from shutdown hook
22/08/12 15:52:10 INFO SparkUI: Stopped Spark web UI at http://10.1.255.235:4040
22/08/12 15:52:10 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/08/12 15:52:10 INFO MemoryStore: MemoryStore cleared
22/08/12 15:52:10 INFO BlockManager: BlockManager stopped
22/08/12 15:52:10 INFO BlockManagerMaster: BlockManagerMaster stopped
22/08/12 15:52:10 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/08/12 15:52:10 INFO SparkContext: Successfully stopped SparkContext
22/08/12 15:52:10 INFO ShutdownHookManager: Shutdown hook called
22/08/12 15:52:10 INFO ShutdownHookManager: Deleting directory /tmp/spark-6186ab03-b328-44c3-951e-24cff4dae50f
22/08/12 15:52:10 INFO ShutdownHookManager: Deleting directory /tmp/spark-659878ce-6457-4db6-8008-5727d42f7983
22/08/12 15:52:10 INFO ShutdownHookManager: Deleting directory /tmp/spark-659878ce-6457-4db6-8008-5727d42f7983/pyspark-b01de86d-bb30-470a-b367-ce43dac16cc5
22/08/12 15:52:10 INFO MetricsSystemImpl: Stopping s3a-file-system metrics system...
22/08/12 15:52:10 INFO MetricsSystemImpl: s3a-file-system metrics system stopped.
22/08/12 15:52:10 INFO MetricsSystemImpl: s3a-file-system metrics system shutdown complete.