@Aslan
Created October 26, 2015 20:44
[hadoop@ip-10-65-200-150 ~]$ /usr/bin/spark-submit --master yarn-cluster --class com.truex.prometheus.CLIJob /home/hadoop/Prometheus-assembly-0.0.1.jar -e 'select x.id, x.title, x.description, x.mediaavailableDate as available_date, x.mediaexpirationDate as expiration_date, mediacategories.medianame as media_name, x.mediakeywords as keywords, mediaratings.scheme as rating_scheme, mediaratings.rating, cast(mediaratings.subRatings as String) as sub_ratings, content.plfileduration as duration, x.plmediaprovider as provider, x.ngccontentAdType as ad_type, x.ngcepisodeNumber as episode, ngcnetwork as network, x.ngcseasonNumber as season_number, x.ngcuID as ngc_uid, x.ngcvideoType as video_type from etl lateral view explode(entries) entries as x lateral view explode(x.mediacategories) cat as mediacategories lateral view explode(x.mediaratings) r as mediaratings lateral view explode(x.mediacontent) mediacontent as content lateral view outer explode(x.ngcnetwork) net as ngcnetwork' -j http://feed.theplatform.com/f/ngc/ngcngw-analytics /tmp/test
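
For reference, the same invocation with shell line continuations and the embedded HiveQL laid out across lines; it is functionally equivalent, since newlines inside the single-quoted -e argument are just whitespace to the SQL parser:

    /usr/bin/spark-submit --master yarn-cluster \
      --class com.truex.prometheus.CLIJob \
      /home/hadoop/Prometheus-assembly-0.0.1.jar \
      -e 'select
            x.id, x.title, x.description,
            x.mediaavailableDate as available_date,
            x.mediaexpirationDate as expiration_date,
            mediacategories.medianame as media_name,
            x.mediakeywords as keywords,
            mediaratings.scheme as rating_scheme,
            mediaratings.rating,
            cast(mediaratings.subRatings as String) as sub_ratings,
            content.plfileduration as duration,
            x.plmediaprovider as provider,
            x.ngccontentAdType as ad_type,
            x.ngcepisodeNumber as episode,
            ngcnetwork as network,
            x.ngcseasonNumber as season_number,
            x.ngcuID as ngc_uid,
            x.ngcvideoType as video_type
          from etl
            lateral view explode(entries) entries as x
            lateral view explode(x.mediacategories) cat as mediacategories
            lateral view explode(x.mediaratings) r as mediaratings
            lateral view explode(x.mediacontent) mediacontent as content
            lateral view outer explode(x.ngcnetwork) net as ngcnetwork' \
      -j http://feed.theplatform.com/f/ngc/ngcngw-analytics /tmp/test
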
15/10/26 20:40:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/10/26 20:40:15 INFO client.RMProxy: Connecting to ResourceManager at ip-10-65-200-150.ec2.internal/10.65.200.150:8032
15/10/26 20:40:15 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
15/10/26 20:40:15 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (11520 MB per container)
15/10/26 20:40:15 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
15/10/26 20:40:15 INFO yarn.Client: Setting up container launch context for our AM
15/10/26 20:40:15 INFO yarn.Client: Setting up the launch environment for our AM container
15/10/26 20:40:15 INFO yarn.Client: Preparing resources for our AM container
15/10/26 20:40:16 INFO yarn.Client: Uploading resource file:/usr/lib/spark/lib/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar -> hdfs://ip-10-65-200-150.ec2.internal:8020/user/hadoop/.sparkStaging/application_1444274555723_0064/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar
15/10/26 20:40:17 INFO metrics.MetricsSaver: MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60 clusterEngineCycleSec: 60 disableClusterEngine: false maxMemoryMb: 3072 maxInstanceCount: 500 lastModified: 1444274560440
15/10/26 20:40:17 INFO metrics.MetricsSaver: Created MetricsSaver j-2US4HNPLS1SJO:i-131cdec7:SparkSubmit:16504 period:60 /mnt/var/em/raw/i-131cdec7_20151026_SparkSubmit_16504_raw.bin
15/10/26 20:40:18 INFO metrics.MetricsSaver: 1 aggregated HDFSWriteDelay 1584 raw values into 1 aggregated values, total 1
15/10/26 20:40:18 INFO yarn.Client: Uploading resource file:/home/hadoop/Prometheus-assembly-0.0.1.jar -> hdfs://ip-10-65-200-150.ec2.internal:8020/user/hadoop/.sparkStaging/application_1444274555723_0064/Prometheus-assembly-0.0.1.jar
15/10/26 20:40:20 INFO yarn.Client: Uploading resource file:/tmp/spark-c36ee3b4-48ff-446b-9a99-c32551428073/__spark_conf__2551024085108713180.zip -> hdfs://ip-10-65-200-150.ec2.internal:8020/user/hadoop/.sparkStaging/application_1444274555723_0064/__spark_conf__2551024085108713180.zip
15/10/26 20:40:20 INFO spark.SecurityManager: Changing view acls to: hadoop
15/10/26 20:40:20 INFO spark.SecurityManager: Changing modify acls to: hadoop
15/10/26 20:40:20 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
15/10/26 20:40:20 INFO yarn.Client: Submitting application 64 to ResourceManager
15/10/26 20:40:20 INFO impl.YarnClientImpl: Submitted application application_1444274555723_0064
15/10/26 20:40:21 INFO yarn.Client: Application report for application_1444274555723_0064 (state: ACCEPTED)
15/10/26 20:40:21 INFO yarn.Client:
        client token: N/A
        diagnostics: N/A
        ApplicationMaster host: N/A
        ApplicationMaster RPC port: -1
        queue: default
        start time: 1445892020636
        final status: UNDEFINED
        tracking URL: http://ip-10-65-200-150.ec2.internal:20888/proxy/application_1444274555723_0064/
        user: hadoop
15/10/26 20:40:22 INFO yarn.Client: Application report for application_1444274555723_0064 (state: ACCEPTED)
15/10/26 20:40:23 INFO yarn.Client: Application report for application_1444274555723_0064 (state: ACCEPTED)
15/10/26 20:40:24 INFO yarn.Client: Application report for application_1444274555723_0064 (state: ACCEPTED)
15/10/26 20:40:25 INFO yarn.Client: Application report for application_1444274555723_0064 (state: ACCEPTED)
15/10/26 20:40:26 INFO yarn.Client: Application report for application_1444274555723_0064 (state: ACCEPTED)
15/10/26 20:40:27 INFO yarn.Client: Application report for application_1444274555723_0064 (state: ACCEPTED)
15/10/26 20:40:28 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:28 INFO yarn.Client:
        client token: N/A
        diagnostics: N/A
        ApplicationMaster host: 10.169.170.124
        ApplicationMaster RPC port: 0
        queue: default
        start time: 1445892020636
        final status: UNDEFINED
        tracking URL: http://ip-10-65-200-150.ec2.internal:20888/proxy/application_1444274555723_0064/
        user: hadoop
15/10/26 20:40:29 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:30 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:31 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:32 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:33 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:34 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:35 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:36 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:37 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:38 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:39 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:40 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:41 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:42 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:43 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:44 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:45 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:46 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:47 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:48 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:49 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:50 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:51 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:52 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:53 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:54 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:55 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:56 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:57 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:58 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:40:59 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:00 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:01 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:02 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:03 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:04 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:05 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:06 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:07 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:08 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:09 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:10 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:11 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:12 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:13 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:14 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:15 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:16 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:17 INFO yarn.Client: Application report for application_1444274555723_0064 (state: ACCEPTED)
15/10/26 20:41:17 INFO yarn.Client:
        client token: N/A
        diagnostics: N/A
        ApplicationMaster host: N/A
        ApplicationMaster RPC port: -1
        queue: default
        start time: 1445892020636
        final status: UNDEFINED
        tracking URL: http://ip-10-65-200-150.ec2.internal:20888/proxy/application_1444274555723_0064/
        user: hadoop
15/10/26 20:41:18 INFO yarn.Client: Application report for application_1444274555723_0064 (state: ACCEPTED)
15/10/26 20:41:19 INFO yarn.Client: Application report for application_1444274555723_0064 (state: ACCEPTED)
15/10/26 20:41:20 INFO yarn.Client: Application report for application_1444274555723_0064 (state: ACCEPTED)
15/10/26 20:41:21 INFO yarn.Client: Application report for application_1444274555723_0064 (state: ACCEPTED)
15/10/26 20:41:22 INFO yarn.Client: Application report for application_1444274555723_0064 (state: ACCEPTED)
15/10/26 20:41:23 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:23 INFO yarn.Client:
        client token: N/A
        diagnostics: N/A
        ApplicationMaster host: 10.169.170.124
        ApplicationMaster RPC port: 0
        queue: default
        start time: 1445892020636
        final status: UNDEFINED
        tracking URL: http://ip-10-65-200-150.ec2.internal:20888/proxy/application_1444274555723_0064/
        user: hadoop
15/10/26 20:41:24 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:25 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:26 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:27 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:28 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:29 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:30 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:31 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:32 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:33 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:34 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:35 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:36 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:37 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:38 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:39 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:40 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:41 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:42 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:43 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:44 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:45 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:46 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:47 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:48 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:49 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:50 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:51 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:52 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:53 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:54 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:55 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:56 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:57 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:58 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:41:59 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:42:00 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:42:01 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:42:02 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:42:03 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:42:04 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:42:05 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:42:06 INFO yarn.Client: Application report for application_1444274555723_0064 (state: RUNNING)
15/10/26 20:42:07 INFO yarn.Client: Application report for application_1444274555723_0064 (state: FINISHED)
15/10/26 20:42:07 INFO yarn.Client:
        client token: N/A
        diagnostics: N/A
        ApplicationMaster host: 10.169.170.124
        ApplicationMaster RPC port: 0
        queue: default
        start time: 1445892020636
        final status: FAILED
        tracking URL: http://ip-10-65-200-150.ec2.internal:20888/proxy/application_1444274555723_0064/history/application_1444274555723_0064/2
        user: hadoop
Exception in thread "main" org.apache.spark.SparkException: Application application_1444274555723_0064 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:920)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:966)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15/10/26 20:42:07 INFO util.ShutdownHookManager: Shutdown hook called
15/10/26 20:42:07 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-c36ee3b4-48ff-446b-9a99-c32551428073
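
The driver-side exception above only reports the generic "finished with failed status"; in yarn-cluster mode the actual cause is in the ApplicationMaster/executor container logs. The aggregated container logs below appear to have been pulled about a minute later (the exact command isn't shown, so this is an assumption); with log aggregation enabled on EMR, the YARN CLI produces output in this format:

    yarn logs -applicationId application_1444274555723_0064
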
15/10/26 20:43:21 INFO client.RMProxy: Connecting to ResourceManager at ip-10-65-200-150.ec2.internal/10.65.200.150:8032
Container: container_1444274555723_0064_02_000003 on ip-10-169-170-124.ec2.internal_8041
==========================================================================================
LogType:stderr
Log Upload Time:26-Oct-2015 20:42:08
LogLength:21000
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/mnt1/yarn/usercache/hadoop/filecache/119/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/10/26 20:41:25 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/10/26 20:41:26 INFO spark.SecurityManager: Changing view acls to: yarn,hadoop
15/10/26 20:41:26 INFO spark.SecurityManager: Changing modify acls to: yarn,hadoop
15/10/26 20:41:26 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hadoop); users with modify permissions: Set(yarn, hadoop)
15/10/26 20:41:27 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/10/26 20:41:27 INFO Remoting: Starting remoting
15/10/26 20:41:27 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@ip-10-169-170-124.ec2.internal:44676]
15/10/26 20:41:27 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 44676.
15/10/26 20:41:28 INFO spark.SecurityManager: Changing view acls to: yarn,hadoop
15/10/26 20:41:28 INFO spark.SecurityManager: Changing modify acls to: yarn,hadoop
15/10/26 20:41:28 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hadoop); users with modify permissions: Set(yarn, hadoop)
15/10/26 20:41:28 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/10/26 20:41:28 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/10/26 20:41:28 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/10/26 20:41:28 INFO Remoting: Starting remoting
15/10/26 20:41:28 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
15/10/26 20:41:28 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutor@ip-10-169-170-124.ec2.internal:42939]
15/10/26 20:41:28 INFO util.Utils: Successfully started service 'sparkExecutor' on port 42939.
15/10/26 20:41:28 INFO storage.DiskBlockManager: Created local directory at /mnt/yarn/usercache/hadoop/appcache/application_1444274555723_0064/blockmgr-42186a9e-48b8-49f7-8d64-05b7bd13558b
15/10/26 20:41:28 INFO storage.DiskBlockManager: Created local directory at /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/blockmgr-434f149c-c099-40c8-abef-fb3d6c80e9b2
15/10/26 20:41:28 INFO storage.MemoryStore: MemoryStore started with capacity 535.0 MB
15/10/26 20:41:28 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://sparkDriver@10.169.170.124:39575/user/CoarseGrainedScheduler
15/10/26 20:41:28 INFO executor.CoarseGrainedExecutorBackend: Successfully registered with driver
15/10/26 20:41:28 INFO executor.Executor: Starting executor ID 2 on host ip-10-169-170-124.ec2.internal
15/10/26 20:41:29 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41776.
15/10/26 20:41:29 INFO netty.NettyBlockTransferService: Server created on 41776
15/10/26 20:41:29 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/10/26 20:41:29 INFO storage.BlockManagerMaster: Registered BlockManager
15/10/26 20:41:29 INFO storage.BlockManager: Registering executor with local external shuffle service.
15/10/26 20:41:52 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 0
15/10/26 20:41:52 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 0)
15/10/26 20:41:52 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 0
15/10/26 20:41:52 INFO storage.MemoryStore: ensureFreeSpace(47141) called with curMem=0, maxMem=560993402
15/10/26 20:41:52 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 46.0 KB, free 535.0 MB)
15/10/26 20:41:52 INFO broadcast.TorrentBroadcast: Reading broadcast variable 0 took 291 ms
15/10/26 20:41:52 INFO storage.MemoryStore: ensureFreeSpace(135712) called with curMem=47141, maxMem=560993402
15/10/26 20:41:52 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 132.5 KB, free 534.8 MB)
15/10/26 20:41:52 INFO Configuration.deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
15/10/26 20:41:52 INFO Configuration.deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
15/10/26 20:41:52 INFO Configuration.deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
15/10/26 20:41:52 INFO Configuration.deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
15/10/26 20:41:52 INFO Configuration.deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
15/10/26 20:41:53 INFO metrics.MetricsSaver: MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60 clusterEngineCycleSec: 60 disableClusterEngine: false maxMemoryMb: 3072 maxInstanceCount: 500 lastModified: 1444274560440
15/10/26 20:41:53 INFO metrics.MetricsSaver: Created MetricsSaver j-2US4HNPLS1SJO:i-031cded7:CoarseGrainedExecutorBackend:16330 period:60 /mnt/var/em/raw/i-031cded7_20151026_CoarseGrainedExecutorBackend_16330_raw.bin
15/10/26 20:41:53 INFO output.FileOutputCommitter: Saved output of task 'attempt_201510262041_0000_m_000000_0' to hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.original/_temporary/0/task_201510262041_0000_m_000000
15/10/26 20:41:53 INFO mapred.SparkHadoopMapRedUtil: attempt_201510262041_0000_m_000000_0: Committed
15/10/26 20:41:53 INFO executor.Executor: Finished task 0.0 in stage 0.0 (TID 0). 1885 bytes result sent to driver
15/10/26 20:41:54 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 2
15/10/26 20:41:54 INFO executor.Executor: Running task 0.0 in stage 1.0 (TID 2)
15/10/26 20:41:54 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 1
15/10/26 20:41:54 INFO storage.MemoryStore: ensureFreeSpace(2070) called with curMem=0, maxMem=560993402
15/10/26 20:41:54 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.0 KB, free 535.0 MB)
15/10/26 20:41:54 INFO broadcast.TorrentBroadcast: Reading broadcast variable 1 took 12 ms
15/10/26 20:41:54 INFO storage.MemoryStore: ensureFreeSpace(3776) called with curMem=2070, maxMem=560993402
15/10/26 20:41:54 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.7 KB, free 535.0 MB)
15/10/26 20:41:54 INFO executor.Executor: Finished task 0.0 in stage 1.0 (TID 2). 1616 bytes result sent to driver
15/10/26 20:41:57 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 5
15/10/26 20:41:57 INFO executor.Executor: Running task 1.0 in stage 2.0 (TID 5)
15/10/26 20:41:57 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 3
15/10/26 20:41:57 INFO storage.MemoryStore: ensureFreeSpace(29338) called with curMem=0, maxMem=560993402
15/10/26 20:41:57 INFO storage.MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 28.7 KB, free 535.0 MB)
15/10/26 20:41:57 INFO broadcast.TorrentBroadcast: Reading broadcast variable 3 took 9 ms
15/10/26 20:41:57 INFO storage.MemoryStore: ensureFreeSpace(82904) called with curMem=29338, maxMem=560993402
15/10/26 20:41:57 INFO storage.MemoryStore: Block broadcast_3 stored as values in memory (estimated size 81.0 KB, free 534.9 MB)
15/10/26 20:41:58 INFO datasources.DefaultWriterContainer: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
15/10/26 20:41:59 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
15/10/26 20:41:59 INFO compress.CodecPool: Got brand-new compressor [.gz]
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
15/10/26 20:42:00 INFO output.FileOutputCommitter: Saved output of task 'attempt_201510262041_0002_m_000001_0' to hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.parquet/_temporary/0/task_201510262041_0002_m_000001
15/10/26 20:42:00 INFO mapred.SparkHadoopMapRedUtil: attempt_201510262041_0002_m_000001_0: Committed
15/10/26 20:42:00 INFO executor.Executor: Finished task 1.0 in stage 2.0 (TID 5). 936 bytes result sent to driver
15/10/26 20:42:03 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 6
15/10/26 20:42:03 INFO executor.Executor: Running task 0.0 in stage 3.0 (TID 6)
15/10/26 20:42:03 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 5
15/10/26 20:42:03 INFO storage.MemoryStore: ensureFreeSpace(58865) called with curMem=0, maxMem=560993402
15/10/26 20:42:03 INFO storage.MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 57.5 KB, free 534.9 MB)
15/10/26 20:42:03 INFO broadcast.TorrentBroadcast: Reading broadcast variable 5 took 66 ms
15/10/26 20:42:03 INFO storage.MemoryStore: ensureFreeSpace(170000) called with curMem=58865, maxMem=560993402
15/10/26 20:42:03 INFO storage.MemoryStore: Block broadcast_5 stored as values in memory (estimated size 166.0 KB, free 534.8 MB)
15/10/26 20:42:03 INFO output.FileOutputCommitter: Saved output of task 'attempt_201510262042_0003_m_000000_6' to hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.tsv/_temporary/0/task_201510262042_0003_m_000000
15/10/26 20:42:03 INFO mapred.SparkHadoopMapRedUtil: attempt_201510262042_0003_m_000000_6: Committed
15/10/26 20:42:03 INFO executor.Executor: Finished task 0.0 in stage 3.0 (TID 6). 2311 bytes result sent to driver
15/10/26 20:42:03 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 9
15/10/26 20:42:03 INFO executor.Executor: Running task 3.0 in stage 3.0 (TID 9)
15/10/26 20:42:03 INFO spark.CacheManager: Partition rdd_21_1 not found, computing it
15/10/26 20:42:04 INFO codegen.GenerateUnsafeProjection: Code generated in 389.388605 ms
15/10/26 20:42:04 INFO codegen.GenerateSafeProjection: Code generated in 66.894668 ms
15/10/26 20:42:04 INFO codegen.GenerateUnsafeProjection: Code generated in 240.134448 ms
15/10/26 20:42:04 INFO codegen.GenerateSafeProjection: Code generated in 48.817602 ms
15/10/26 20:42:04 INFO codegen.GenerateUnsafeProjection: Code generated in 221.004339 ms
15/10/26 20:42:04 INFO codegen.GenerateSafeProjection: Code generated in 38.258951 ms
15/10/26 20:42:04 INFO codegen.GenerateUnsafeProjection: Code generated in 73.309384 ms
15/10/26 20:42:05 ERROR executor.Executor: Exception in task 3.0 in stage 3.0 (TID 9)
java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/26 20:42:05 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 10
15/10/26 20:42:05 INFO executor.Executor: Running task 3.1 in stage 3.0 (TID 10)
15/10/26 20:42:06 INFO spark.CacheManager: Partition rdd_21_1 not found, computing it
15/10/26 20:42:06 ERROR executor.Executor: Exception in task 3.1 in stage 3.0 (TID 10)
java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/26 20:42:07 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown
15/10/26 20:42:07 INFO storage.MemoryStore: MemoryStore cleared
15/10/26 20:42:07 INFO storage.BlockManager: BlockManager stopped
15/10/26 20:42:07 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/10/26 20:42:07 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/10/26 20:42:07 WARN channel.DefaultChannelPipeline: An exception was thrown by an exception handler.
java.util.concurrent.RejectedExecutionException: Worker has already been shutdown
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.registerTask(AbstractNioSelector.java:115)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:71)
at org.jboss.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:55)
at org.jboss.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
at org.jboss.netty.channel.socket.nio.AbstractNioChannelSink.execute(AbstractNioChannelSink.java:34)
at org.jboss.netty.channel.Channels.fireExceptionCaughtLater(Channels.java:496)
at org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:46)
at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:54)
at org.jboss.netty.channel.Channels.disconnect(Channels.java:781)
at org.jboss.netty.channel.AbstractChannel.disconnect(AbstractChannel.java:211)
at akka.remote.transport.netty.NettyTransport$$anonfun$gracefulClose$1.apply(NettyTransport.scala:223)
at akka.remote.transport.netty.NettyTransport$$anonfun$gracefulClose$1.apply(NettyTransport.scala:222)
at scala.util.Success.foreach(Try.scala:205)
at scala.concurrent.Future$$anonfun$foreach$1.apply(Future.scala:204)
at scala.concurrent.Future$$anonfun$foreach$1.apply(Future.scala:204)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
15/10/26 20:42:07 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
15/10/26 20:42:07 INFO util.ShutdownHookManager: Shutdown hook called
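
Both attempts of task 3 in stage 3 (TID 9 and TID 10) fail identically: a java.util.NoSuchElementException ("key not found: http://data.media.theplatform.com/media/data/Media/525991491676") thrown from DictionaryEncoding$Encoder.compress while InMemoryColumnarTableScan materializes a cached relation, which suggests the in-memory columnar cache's dictionary encoder is looking up a value it did not record during its earlier statistics pass. Two possible workarounds (sketches under that assumption, not a fix for the underlying Spark 1.5.0 behavior): avoid caching that intermediate DataFrame, or disable compression for the in-memory columnar cache so the dictionary encoder is bypassed, for example by adding the SQL conf to the submit command (it could equally be set with sqlContext.setConf inside the job):

    /usr/bin/spark-submit --master yarn-cluster \
      --conf spark.sql.inMemoryColumnarStorage.compressed=false \
      --class com.truex.prometheus.CLIJob \
      /home/hadoop/Prometheus-assembly-0.0.1.jar \
      -e '<same query as above>' \
      -j http://feed.theplatform.com/f/ngc/ngcngw-analytics /tmp/test
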
LogType:stdout
Log Upload Time:26-Oct-2015 20:42:08
LogLength:16564
Log Contents:
2015-10-26T20:41:27.713+0000: [GC [1 CMS-initial-mark: 0K(707840K)] 267411K(1014528K), 0.0499410 secs] [Times: user=0.05 sys=0.00, real=0.05 secs]
2015-10-26T20:41:27.787+0000: [CMS-concurrent-mark: 0.023/0.024 secs] [Times: user=0.03 sys=0.00, real=0.02 secs]
2015-10-26T20:41:27.789+0000: [GC2015-10-26T20:41:27.789+0000: [ParNew: 272640K->17630K(306688K), 0.0270300 secs] 272640K->17630K(1014528K), 0.0271010 secs] [Times: user=0.03 sys=0.03, real=0.02 secs]
2015-10-26T20:41:27.816+0000: [CMS-concurrent-preclean: 0.001/0.028 secs] [Times: user=0.04 sys=0.03, real=0.03 secs]
2015-10-26T20:41:27.816+0000: [GC[YG occupancy: 17630 K (306688 K)]2015-10-26T20:41:27.816+0000: [Rescan (parallel) , 0.0081770 secs]2015-10-26T20:41:27.824+0000: [weak refs processing, 0.0000320 secs]2015-10-26T20:41:27.824+0000: [class unloading, 0.0007990 secs]2015-10-26T20:41:27.825+0000: [scrub symbol table, 0.0025380 secs]2015-10-26T20:41:27.828+0000: [scrub string table, 0.0002320 secs] [1 CMS-remark: 0K(707840K)] 17630K(1014528K), 0.0120720 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
2015-10-26T20:41:27.834+0000: [CMS-concurrent-sweep: 0.006/0.006 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
2015-10-26T20:41:27.871+0000: [CMS-concurrent-reset: 0.037/0.037 secs] [Times: user=0.04 sys=0.03, real=0.04 secs]
2015-10-26T20:41:31.257+0000: [GC [1 CMS-initial-mark: 0K(707840K)] 250601K(1014528K), 0.0497860 secs] [Times: user=0.05 sys=0.00, real=0.04 secs]
2015-10-26T20:41:31.330+0000: [CMS-concurrent-mark: 0.023/0.023 secs] [Times: user=0.02 sys=0.00, real=0.03 secs]
2015-10-26T20:41:31.358+0000: [CMS-concurrent-preclean: 0.024/0.029 secs] [Times: user=0.03 sys=0.00, real=0.03 secs]
CMS: abort preclean due to time 2015-10-26T20:41:36.431+0000: [CMS-concurrent-abortable-preclean: 1.668/5.072 secs] [Times: user=1.69 sys=0.01, real=5.07 secs]
2015-10-26T20:41:36.431+0000: [GC[YG occupancy: 250601 K (306688 K)]2015-10-26T20:41:36.431+0000: [Rescan (parallel) , 0.0452000 secs]2015-10-26T20:41:36.476+0000: [weak refs processing, 0.0000330 secs]2015-10-26T20:41:36.476+0000: [class unloading, 0.0014420 secs]2015-10-26T20:41:36.478+0000: [scrub symbol table, 0.0041110 secs]2015-10-26T20:41:36.482+0000: [scrub string table, 0.0004200 secs] [1 CMS-remark: 0K(707840K)] 250601K(1014528K), 0.0516690 secs] [Times: user=0.15 sys=0.00, real=0.05 secs]
2015-10-26T20:41:36.491+0000: [CMS-concurrent-sweep: 0.008/0.008 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2015-10-26T20:41:36.494+0000: [CMS-concurrent-reset: 0.003/0.003 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
2015-10-26T20:41:52.465+0000: [GC2015-10-26T20:41:52.465+0000: [ParNew: 290270K->31810K(306688K), 0.0837810 secs] 290270K->54078K(1014528K), 0.0838730 secs] [Times: user=0.13 sys=0.06, real=0.09 secs]
2015-10-26T20:41:58.947+0000: [GC [1 CMS-initial-mark: 22267K(707840K)] 318699K(1014528K), 0.0568690 secs] [Times: user=0.06 sys=0.00, real=0.06 secs]
2015-10-26T20:41:59.068+0000: [CMS-concurrent-mark: 0.062/0.064 secs] [Times: user=0.13 sys=0.00, real=0.06 secs]
2015-10-26T20:41:59.091+0000: [GC2015-10-26T20:41:59.091+0000: [ParNew: 304450K->34048K(306688K), 0.0501470 secs] 326718K->81876K(1014528K), 0.0502210 secs] [Times: user=0.09 sys=0.06, real=0.05 secs]
2015-10-26T20:41:59.142+0000: [CMS-concurrent-preclean: 0.018/0.073 secs] [Times: user=0.13 sys=0.09, real=0.08 secs]
2015-10-26T20:41:59.142+0000: [GC[YG occupancy: 36076 K (306688 K)]2015-10-26T20:41:59.142+0000: [Rescan (parallel) , 0.0143310 secs]2015-10-26T20:41:59.156+0000: [weak refs processing, 0.0000590 secs]2015-10-26T20:41:59.156+0000: [class unloading, 0.0023280 secs]2015-10-26T20:41:59.159+0000: [scrub symbol table, 0.0063340 secs]2015-10-26T20:41:59.165+0000: [scrub string table, 0.0004640 secs] [1 CMS-remark: 47828K(707840K)] 83905K(1014528K), 0.0239070 secs] [Times: user=0.07 sys=0.00, real=0.02 secs]
2015-10-26T20:41:59.198+0000: [CMS-concurrent-sweep: 0.020/0.032 secs] [Times: user=0.08 sys=0.00, real=0.04 secs]
2015-10-26T20:41:59.201+0000: [CMS-concurrent-reset: 0.003/0.003 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.codec.CodecConfig: Compression: GZIP
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Parquet block size to 134217728
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Parquet page size to 1048576
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Parquet dictionary page size to 1048576
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Dictionary is on
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Validation is off
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Writer version is: PARQUET_1_0
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.InternalParquetRecordWriter: Flushing mem columnStore to file. allocated memory: 133,966
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 129B for [$xmlns, dcterms] BINARY: 1 values, 36B raw, 54B comp, 1 pages, encodings: [RLE, PLAIN, BIT_PACKED]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 141B for [$xmlns, media] BINARY: 1 values, 40B raw, 58B comp, 1 pages, encodings: [RLE, PLAIN, BIT_PACKED]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 144B for [$xmlns, ngc] BINARY: 1 values, 41B raw, 59B comp, 1 pages, encodings: [RLE, PLAIN, BIT_PACKED]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 169B for [$xmlns, pl] BINARY: 1 values, 49B raw, 67B comp, 1 pages, encodings: [RLE, PLAIN, BIT_PACKED]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 187B for [$xmlns, pla] BINARY: 1 values, 55B raw, 73B comp, 1 pages, encodings: [RLE, PLAIN, BIT_PACKED]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 194B for [$xmlns, plfile] BINARY: 1 values, 58B raw, 74B comp, 1 pages, encodings: [RLE, PLAIN, BIT_PACKED]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 182B for [$xmlns, plmedia] BINARY: 1 values, 54B raw, 70B comp, 1 pages, encodings: [RLE, PLAIN, BIT_PACKED]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 190B for [$xmlns, plrelease] BINARY: 1 values, 56B raw, 74B comp, 1 pages, encodings: [RLE, PLAIN, BIT_PACKED]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 537B for [entries, bag, array_element, description] BINARY: 100 values, 109B raw, 130B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 98 entries, 19,996B raw, 98B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 993B for [entries, bag, array_element, id] BINARY: 100 values, 6,716B raw, 839B comp, 1 pages, encodings: [RLE, PLAIN]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 161B for [entries, bag, array_element, mediaavailableDate] INT64: 100 values, 96B raw, 117B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 49 entries, 392B raw, 49B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 147B for [entries, bag, array_element, mediacategories, bag, array_element, medialabel] BINARY: 190 values, 117B raw, 108B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 7 entries, 86B raw, 7B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 257B for [entries, bag, array_element, mediacategories, bag, array_element, medianame] BINARY: 190 values, 178B raw, 159B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 24 entries, 822B raw, 24B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 126B for [entries, bag, array_element, mediacategories, bag, array_element, mediascheme] BINARY: 190 values, 79B raw, 84B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 2 entries, 22B raw, 2B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 207B for [entries, bag, array_element, mediacontent, bag, array_element, plfileduration] DOUBLE: 200 values, 189B raw, 163B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 29 entries, 232B raw, 29B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 62B for [entries, bag, array_element, mediacopyright] BINARY: 100 values, 19B raw, 36B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 1 entries, 4B raw, 1B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 62B for [entries, bag, array_element, mediacopyrightUrl] BINARY: 100 values, 19B raw, 36B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 1 entries, 4B raw, 1B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 59B for [entries, bag, array_element, mediacountries, bag, array_element] BINARY: 100 values, 17B raw, 36B comp, 1 pages, encodings: [RLE, PLAIN]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 122B for [entries, bag, array_element, mediacredits, bag, array_element, mediarole] BINARY: 181 values, 80B raw, 89B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 2 entries, 13B raw, 2B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 94B for [entries, bag, array_element, mediacredits, bag, array_element, mediascheme] BINARY: 181 values, 61B raw, 67B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 1 entries, 4B raw, 1B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 248B for [entries, bag, array_element, mediacredits, bag, array_element, mediavalue] BINARY: 181 values, 198B raw, 198B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 43 entries, 757B raw, 43B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 67B for [entries, bag, array_element, mediaexcludeCountries] BOOLEAN: 100 values, 29B raw, 39B comp, 1 pages, encodings: [RLE, PLAIN]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 148B for [entries, bag, array_element, mediaexpirationDate] INT64: 100 values, 83B raw, 104B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 21 entries, 168B raw, 21B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 1,806B for [entries, bag, array_element, mediakeywords] BINARY: 100 values, 4,910B raw, 1,677B comp, 1 pages, encodings: [RLE, PLAIN]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 87B for [entries, bag, array_element, mediaratings, bag, array_element, rating] BINARY: 100 values, 32B raw, 51B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 2 entries, 18B raw, 2B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 83B for [entries, bag, array_element, mediaratings, bag, array_element, scheme] BINARY: 100 values, 20B raw, 37B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 1 entries, 14B raw, 1B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 75B for [entries, bag, array_element, mediaratings, bag, array_element, subRatings, bag, array_element] BINARY: 100 values, 29B raw, 46B comp, 1 pages, encodings: [RLE, PLAIN]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 62B for [entries, bag, array_element, mediatext] BINARY: 100 values, 19B raw, 36B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 1 entries, 4B raw, 1B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 79B for [entries, bag, array_element, mediathumbnails, bag, array_element, plfileduration] DOUBLE: 100 values, 20B raw, 37B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 1 entries, 8B raw, 1B comp}
2015-10-26T20:42:04.439+0000: [GC2015-10-26T20:42:04.439+0000: [ParNew: 306688K->14030K(306688K), 0.0498740 secs] 354461K->83664K(1014528K), 0.0499460 secs] [Times: user=0.16 sys=0.01, real=0.05 secs]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 74B for [entries, bag, array_element, ngccontentAdType] BINARY: 100 values, 22B raw, 41B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 2 entries, 15B raw, 2B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 163B for [entries, bag, array_element, ngcepisodeNumber] INT64: 100 values, 96B raw, 119B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 30 entries, 240B raw, 30B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 107B for [entries, bag, array_element, ngcnetwork, bag, array_element] BINARY: 100 values, 35B raw, 54B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 2 entries, 35B raw, 2B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 120B for [entries, bag, array_element, ngcseasonNumber] INT64: 100 values, 57B raw, 77B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 6 entries, 48B raw, 6B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 707B for [entries, bag, array_element, ngcuID] BINARY: 100 values, 2,711B raw, 639B comp, 1 pages, encodings: [RLE, PLAIN]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 86B for [entries, bag, array_element, ngcvideoType] BINARY: 100 values, 19B raw, 36B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 1 entries, 16B raw, 1B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 177B for [entries, bag, array_element, plmediachapters, bag, array_element, plmediaendTime] DOUBLE: 569 values, 207B raw, 133B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 46 entries, 368B raw, 46B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 797B for [entries, bag, array_element, plmediachapters, bag, array_element, plmediastartTime] DOUBLE: 569 values, 809B raw, 753B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 411 entries, 3,288B raw, 411B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 113B for [entries, bag, array_element, plmediachapters, bag, array_element, plmediathumbnailUrl] BINARY: 569 values, 161B raw, 85B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 1 entries, 4B raw, 1B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 125B for [entries, bag, array_element, plmediachapters, bag, array_element, plmediatitle] BINARY: 569 values, 172B raw, 95B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 2 entries, 10B raw, 2B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 68B for [entries, bag, array_element, plmediaprovider] BINARY: 100 values, 19B raw, 36B comp, 1 pages, encodings: [RLE, PLAIN_DICTIONARY], dic { 1 entries, 7B raw, 1B comp}
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 1,358B for [entries, bag, array_element, title] BINARY: 100 values, 2,271B raw, 1,300B comp, 1 pages, encodings: [RLE, PLAIN]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 70B for [entryCount] INT64: 1 values, 14B raw, 29B comp, 1 pages, encodings: [RLE, PLAIN, BIT_PACKED]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 70B for [itemsPerPage] INT64: 1 values, 14B raw, 29B comp, 1 pages, encodings: [RLE, PLAIN, BIT_PACKED]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 70B for [startIndex] INT64: 1 values, 14B raw, 29B comp, 1 pages, encodings: [RLE, PLAIN, BIT_PACKED]
Oct 26, 2015 8:41:59 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 110B for [title] BINARY: 1 values, 29B raw, 47B comp, 1 pages, encodings: [RLE, PLAIN, BIT_PACKED]
Container: container_1444274555723_0064_01_000003 on ip-10-169-170-124.ec2.internal_8041
==========================================================================================
LogType:stderr
Log Upload Time:26-Oct-2015 20:42:08
LogLength:14766
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/mnt1/yarn/usercache/hadoop/filecache/119/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/10/26 20:40:30 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/10/26 20:40:31 INFO spark.SecurityManager: Changing view acls to: yarn,hadoop
15/10/26 20:40:31 INFO spark.SecurityManager: Changing modify acls to: yarn,hadoop
15/10/26 20:40:31 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hadoop); users with modify permissions: Set(yarn, hadoop)
15/10/26 20:40:32 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/10/26 20:40:32 INFO Remoting: Starting remoting
15/10/26 20:40:33 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@ip-10-169-170-124.ec2.internal:45582]
15/10/26 20:40:33 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 45582.
15/10/26 20:40:33 INFO spark.SecurityManager: Changing view acls to: yarn,hadoop
15/10/26 20:40:33 INFO spark.SecurityManager: Changing modify acls to: yarn,hadoop
15/10/26 20:40:33 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hadoop); users with modify permissions: Set(yarn, hadoop)
15/10/26 20:40:33 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/10/26 20:40:33 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/10/26 20:40:33 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/10/26 20:40:33 INFO Remoting: Starting remoting
15/10/26 20:40:33 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
15/10/26 20:40:33 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutor@ip-10-169-170-124.ec2.internal:59701]
15/10/26 20:40:33 INFO util.Utils: Successfully started service 'sparkExecutor' on port 59701.
15/10/26 20:40:33 INFO storage.DiskBlockManager: Created local directory at /mnt/yarn/usercache/hadoop/appcache/application_1444274555723_0064/blockmgr-ee9025b1-db1c-470d-a4e1-f1fd687ca036
15/10/26 20:40:33 INFO storage.DiskBlockManager: Created local directory at /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/blockmgr-46e1243a-6f8a-4996-bbdb-8e8651ab1a56
15/10/26 20:40:34 INFO storage.MemoryStore: MemoryStore started with capacity 535.0 MB
15/10/26 20:40:34 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://sparkDriver@10.169.170.124:53152/user/CoarseGrainedScheduler
15/10/26 20:40:34 INFO executor.CoarseGrainedExecutorBackend: Successfully registered with driver
15/10/26 20:40:34 INFO executor.Executor: Starting executor ID 2 on host ip-10-169-170-124.ec2.internal
15/10/26 20:40:34 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42829.
15/10/26 20:40:34 INFO netty.NettyBlockTransferService: Server created on 42829
15/10/26 20:40:34 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/10/26 20:40:34 INFO storage.BlockManagerMaster: Registered BlockManager
15/10/26 20:40:34 INFO storage.BlockManager: Registering executor with local external shuffle service.
15/10/26 20:41:00 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 0
15/10/26 20:41:00 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 0)
15/10/26 20:41:00 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 0
15/10/26 20:41:00 INFO storage.MemoryStore: ensureFreeSpace(47141) called with curMem=0, maxMem=560993402
15/10/26 20:41:00 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 46.0 KB, free 535.0 MB)
15/10/26 20:41:00 INFO broadcast.TorrentBroadcast: Reading broadcast variable 0 took 278 ms
15/10/26 20:41:00 INFO storage.MemoryStore: ensureFreeSpace(135712) called with curMem=47141, maxMem=560993402
15/10/26 20:41:00 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 132.5 KB, free 534.8 MB)
15/10/26 20:41:01 INFO Configuration.deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
15/10/26 20:41:01 INFO Configuration.deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
15/10/26 20:41:01 INFO Configuration.deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
15/10/26 20:41:01 INFO Configuration.deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
15/10/26 20:41:01 INFO Configuration.deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
15/10/26 20:41:02 INFO metrics.MetricsSaver: MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60 clusterEngineCycleSec: 60 disableClusterEngine: false maxMemoryMb: 3072 maxInstanceCount: 500 lastModified: 1444274560440
15/10/26 20:41:02 INFO metrics.MetricsSaver: Created MetricsSaver j-2US4HNPLS1SJO:i-031cded7:CoarseGrainedExecutorBackend:15964 period:60 /mnt/var/em/raw/i-031cded7_20151026_CoarseGrainedExecutorBackend_15964_raw.bin
15/10/26 20:41:02 INFO output.FileOutputCommitter: Saved output of task 'attempt_201510262041_0000_m_000000_0' to hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.original/_temporary/0/task_201510262041_0000_m_000000
15/10/26 20:41:02 INFO mapred.SparkHadoopMapRedUtil: attempt_201510262041_0000_m_000000_0: Committed
15/10/26 20:41:02 INFO executor.Executor: Finished task 0.0 in stage 0.0 (TID 0). 1885 bytes result sent to driver
15/10/26 20:41:02 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 3
15/10/26 20:41:02 INFO executor.Executor: Running task 1.0 in stage 1.0 (TID 3)
15/10/26 20:41:02 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 1
15/10/26 20:41:02 INFO storage.MemoryStore: ensureFreeSpace(2070) called with curMem=0, maxMem=560993402
15/10/26 20:41:02 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.0 KB, free 535.0 MB)
15/10/26 20:41:02 INFO broadcast.TorrentBroadcast: Reading broadcast variable 1 took 85 ms
15/10/26 20:41:02 INFO storage.MemoryStore: ensureFreeSpace(3776) called with curMem=2070, maxMem=560993402
15/10/26 20:41:02 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.7 KB, free 535.0 MB)
15/10/26 20:41:04 INFO executor.Executor: Finished task 1.0 in stage 1.0 (TID 3). 6395 bytes result sent to driver
15/10/26 20:41:05 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4
15/10/26 20:41:05 INFO executor.Executor: Running task 0.0 in stage 2.0 (TID 4)
15/10/26 20:41:05 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 3
15/10/26 20:41:05 INFO storage.MemoryStore: ensureFreeSpace(29341) called with curMem=0, maxMem=560993402
15/10/26 20:41:05 INFO storage.MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 28.7 KB, free 535.0 MB)
15/10/26 20:41:05 INFO broadcast.TorrentBroadcast: Reading broadcast variable 3 took 23 ms
15/10/26 20:41:05 INFO storage.MemoryStore: ensureFreeSpace(82904) called with curMem=29341, maxMem=560993402
15/10/26 20:41:05 INFO storage.MemoryStore: Block broadcast_3 stored as values in memory (estimated size 81.0 KB, free 534.9 MB)
15/10/26 20:41:06 INFO datasources.DefaultWriterContainer: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
15/10/26 20:41:06 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
15/10/26 20:41:06 INFO compress.CodecPool: Got brand-new compressor [.gz]
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
15/10/26 20:41:07 INFO output.FileOutputCommitter: Saved output of task 'attempt_201510262041_0002_m_000000_0' to hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.parquet/_temporary/0/task_201510262041_0002_m_000000
15/10/26 20:41:07 INFO mapred.SparkHadoopMapRedUtil: attempt_201510262041_0002_m_000000_0: Committed
15/10/26 20:41:07 INFO executor.Executor: Finished task 0.0 in stage 2.0 (TID 4). 936 bytes result sent to driver
15/10/26 20:41:11 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 7
15/10/26 20:41:11 INFO executor.Executor: Running task 1.0 in stage 3.0 (TID 7)
15/10/26 20:41:11 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 5
15/10/26 20:41:11 INFO storage.MemoryStore: ensureFreeSpace(58865) called with curMem=0, maxMem=560993402
15/10/26 20:41:11 INFO storage.MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 57.5 KB, free 534.9 MB)
15/10/26 20:41:11 INFO broadcast.TorrentBroadcast: Reading broadcast variable 5 took 12 ms
15/10/26 20:41:11 INFO storage.MemoryStore: ensureFreeSpace(170000) called with curMem=58865, maxMem=560993402
15/10/26 20:41:11 INFO storage.MemoryStore: Block broadcast_5 stored as values in memory (estimated size 166.0 KB, free 534.8 MB)
15/10/26 20:41:12 INFO output.FileOutputCommitter: Saved output of task 'attempt_201510262041_0003_m_000001_7' to hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.tsv/_temporary/0/task_201510262041_0003_m_000001
15/10/26 20:41:12 INFO mapred.SparkHadoopMapRedUtil: attempt_201510262041_0003_m_000001_7: Committed
15/10/26 20:41:12 INFO executor.Executor: Finished task 1.0 in stage 3.0 (TID 7). 2311 bytes result sent to driver
15/10/26 20:41:12 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 9
15/10/26 20:41:12 INFO executor.Executor: Running task 3.0 in stage 3.0 (TID 9)
15/10/26 20:41:12 INFO spark.CacheManager: Partition rdd_21_1 not found, computing it
15/10/26 20:41:12 INFO codegen.GenerateUnsafeProjection: Code generated in 401.321266 ms
15/10/26 20:41:12 INFO codegen.GenerateSafeProjection: Code generated in 93.663103 ms
15/10/26 20:41:13 INFO codegen.GenerateUnsafeProjection: Code generated in 317.367712 ms
15/10/26 20:41:13 INFO codegen.GenerateSafeProjection: Code generated in 35.166778 ms
15/10/26 20:41:13 INFO codegen.GenerateUnsafeProjection: Code generated in 258.889519 ms
15/10/26 20:41:13 INFO codegen.GenerateSafeProjection: Code generated in 46.844147 ms
15/10/26 20:41:13 INFO codegen.GenerateUnsafeProjection: Code generated in 100.836632 ms
15/10/26 20:41:14 ERROR executor.Executor: Exception in task 3.0 in stage 3.0 (TID 9)
java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/26 20:41:15 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown
15/10/26 20:41:15 INFO storage.MemoryStore: MemoryStore cleared
15/10/26 20:41:15 INFO storage.BlockManager: BlockManager stopped
15/10/26 20:41:15 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/10/26 20:41:15 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/10/26 20:41:15 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
15/10/26 20:41:15 INFO util.ShutdownHookManager: Shutdown hook called
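
Note on the failure above: the NoSuchElementException ("key not found: http://data.media.theplatform.com/media/data/Media/...") is thrown from DictionaryEncoding$Encoder.compress while CacheManager / InMemoryColumnarTableScan materialize a cached DataFrame, which matches a known dictionary-compression bug in Spark 1.5.0's in-memory columnar cache rather than a problem with the query itself. A minimal Scala sketch of the usual workaround, assuming a SQLContext named sqlContext and that upgrading past 1.5.0 is not immediately an option:

    import org.apache.spark.sql.SQLContext

    // Hedged sketch only -- not the job's actual code. Turning off per-column
    // compression for cached tables skips the DictionaryEncoding path that
    // throws the "key not found" error above.
    def cacheWithoutColumnarCompression(sqlContext: SQLContext): Unit = {
      sqlContext.setConf("spark.sql.inMemoryColumnarStorage.compressed", "false")
      // df.cache() / sqlContext.cacheTable(...) calls made after this store
      // columns uncompressed, trading some memory for stability.
    }
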
LogType:stdout
Log Upload Time:26-Oct-2015 20:42:08
LogLength:4212
Log Contents:
2015-10-26T20:40:32.451+0000: [GC [1 CMS-initial-mark: 0K(707840K)] 212737K(1014528K), 0.0428540 secs] [Times: user=0.04 sys=0.00, real=0.04 secs]
2015-10-26T20:40:32.518+0000: [CMS-concurrent-mark: 0.023/0.024 secs] [Times: user=0.03 sys=0.02, real=0.03 secs]
2015-10-26T20:40:32.557+0000: [CMS-concurrent-preclean: 0.031/0.039 secs] [Times: user=0.07 sys=0.03, real=0.04 secs]
2015-10-26T20:40:32.736+0000: [GC2015-10-26T20:40:32.736+0000: [ParNew: 272640K->17621K(306688K), 0.0272100 secs] 272640K->17621K(1014528K), 0.0272960 secs] [Times: user=0.05 sys=0.02, real=0.03 secs]
2015-10-26T20:40:34.284+0000: [CMS-concurrent-abortable-preclean: 1.142/1.726 secs] [Times: user=3.21 sys=0.57, real=1.72 secs]
2015-10-26T20:40:34.284+0000: [GC[YG occupancy: 172530 K (306688 K)]2015-10-26T20:40:34.284+0000: [Rescan (parallel) , 0.0125280 secs]2015-10-26T20:40:34.297+0000: [weak refs processing, 0.0000310 secs]2015-10-26T20:40:34.297+0000: [class unloading, 0.0028710 secs]2015-10-26T20:40:34.300+0000: [scrub symbol table, 0.0032650 secs]2015-10-26T20:40:34.303+0000: [scrub string table, 0.0002620 secs] [1 CMS-remark: 0K(707840K)] 172530K(1014528K), 0.0192580 secs] [Times: user=0.06 sys=0.00, real=0.02 secs]
2015-10-26T20:40:34.311+0000: [CMS-concurrent-sweep: 0.006/0.007 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
2015-10-26T20:40:34.343+0000: [CMS-concurrent-reset: 0.032/0.032 secs] [Times: user=0.03 sys=0.03, real=0.03 secs]
2015-10-26T20:41:00.663+0000: [GC2015-10-26T20:41:00.663+0000: [ParNew: 277492K->30472K(306688K), 0.0712370 secs] 277492K->36288K(1014528K), 0.0713170 secs] [Times: user=0.14 sys=0.05, real=0.07 secs]
2015-10-26T20:41:01.711+0000: [GC [1 CMS-initial-mark: 5815K(707840K)] 128580K(1014528K), 0.0208610 secs] [Times: user=0.02 sys=0.00, real=0.02 secs]
2015-10-26T20:41:01.792+0000: [CMS-concurrent-mark: 0.058/0.061 secs] [Times: user=0.13 sys=0.02, real=0.06 secs]
2015-10-26T20:41:01.824+0000: [CMS-concurrent-preclean: 0.025/0.031 secs] [Times: user=0.06 sys=0.00, real=0.03 secs]
2015-10-26T20:41:05.302+0000: [GC2015-10-26T20:41:05.302+0000: [ParNew: 303112K->34048K(306688K), 0.0731860 secs] 308928K->64619K(1014528K), 0.0732730 secs] [Times: user=0.11 sys=0.04, real=0.07 secs]
CMS: abort preclean due to time 2015-10-26T20:41:06.892+0000: [CMS-concurrent-abortable-preclean: 3.305/5.068 secs] [Times: user=8.11 sys=0.82, real=5.07 secs]
2015-10-26T20:41:06.893+0000: [GC[YG occupancy: 129688 K (306688 K)]2015-10-26T20:41:06.893+0000: [Rescan (parallel) , 0.0085610 secs]2015-10-26T20:41:06.901+0000: [weak refs processing, 0.0000540 secs]2015-10-26T20:41:06.902+0000: [class unloading, 0.0103110 secs]2015-10-26T20:41:06.912+0000: [scrub symbol table, 0.0065780 secs]2015-10-26T20:41:06.918+0000: [scrub string table, 0.0004800 secs] [1 CMS-remark: 30571K(707840K)] 160259K(1014528K), 0.0264900 secs] [Times: user=0.05 sys=0.00, real=0.03 secs]
2015-10-26T20:41:06.937+0000: [CMS-concurrent-sweep: 0.015/0.017 secs] [Times: user=0.03 sys=0.00, real=0.02 secs]
2015-10-26T20:41:06.939+0000: [CMS-concurrent-reset: 0.003/0.003 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2015-10-26T20:41:13.290+0000: [GC2015-10-26T20:41:13.290+0000: [ParNew: 306688K->14326K(306688K), 0.0486080 secs] 337207K->66146K(1014528K), 0.0486920 secs] [Times: user=0.15 sys=0.03, real=0.05 secs]
Oct 26, 2015 8:41:06 PM INFO: org.apache.parquet.hadoop.codec.CodecConfig: Compression: GZIP
Oct 26, 2015 8:41:06 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Parquet block size to 134217728
Oct 26, 2015 8:41:06 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Parquet page size to 1048576
Oct 26, 2015 8:41:06 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Parquet dictionary page size to 1048576
Oct 26, 2015 8:41:06 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Dictionary is on
Oct 26, 2015 8:41:06 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Validation is off
Oct 26, 2015 8:41:06 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Writer version is: PARQUET_1_0
Oct 26, 2015 8:41:06 PM INFO: org.apache.parquet.hadoop.InternalParquetRecordWriter: Flushing mem columnStore to file. allocated memory: 65,568
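
For reference, the Parquet writer settings logged in this stdout section (GZIP compression, 134,217,728-byte block size, 1,048,576-byte pages, PARQUET_1_0) appear to be the Spark 1.5 / parquet-mr defaults. A minimal sketch of how they would typically be set explicitly from Scala; the function and parameter names are illustrative, not taken from the job:

    import org.apache.spark.sql.{DataFrame, SQLContext}

    // Hedged sketch of how the writer settings logged above are normally
    // controlled from a Spark 1.5 job.
    def writeParquetGzip(sqlContext: SQLContext, df: DataFrame, path: String): Unit = {
      // "Compression: GZIP" -- gzip was the default Parquet codec in Spark 1.5.
      sqlContext.setConf("spark.sql.parquet.compression.codec", "gzip")
      // "Parquet block size to 134217728" -- 128 MB row groups.
      sqlContext.sparkContext.hadoopConfiguration.setInt("parquet.block.size", 134217728)
      df.write.parquet(path)
    }
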
Container: container_1444274555723_0064_01_000001 on ip-10-169-170-124.ec2.internal_8041
==========================================================================================
LogType:stderr
Log Upload Time:26-Oct-2015 20:42:08
LogLength:68050
Log Contents:
log4j:ERROR Could not read configuration file from URL [file:/etc/spark/conf/log4j.properties].
java.io.FileNotFoundException: /etc/spark/conf/log4j.properties (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:146)
at java.io.FileInputStream.<init>(FileInputStream.java:101)
at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:557)
at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
at org.apache.spark.Logging$class.initializeLogging(Logging.scala:122)
at org.apache.spark.Logging$class.initializeIfNecessary(Logging.scala:107)
at org.apache.spark.Logging$class.log(Logging.scala:51)
at org.apache.spark.deploy.yarn.ApplicationMaster$.log(ApplicationMaster.scala:603)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:617)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
log4j:ERROR Ignoring configuration file [file:/etc/spark/conf/log4j.properties].
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/mnt1/yarn/usercache/hadoop/filecache/119/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/10/26 20:40:23 INFO ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
15/10/26 20:40:24 INFO ApplicationMaster: ApplicationAttemptId: appattempt_1444274555723_0064_000001
15/10/26 20:40:24 INFO SecurityManager: Changing view acls to: yarn,hadoop
15/10/26 20:40:24 INFO SecurityManager: Changing modify acls to: yarn,hadoop
15/10/26 20:40:24 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hadoop); users with modify permissions: Set(yarn, hadoop)
15/10/26 20:40:25 INFO ApplicationMaster: Starting the user application in a separate Thread
15/10/26 20:40:25 INFO ApplicationMaster: Waiting for spark context initialization
15/10/26 20:40:25 INFO ApplicationMaster: Waiting for spark context initialization ...
15/10/26 20:40:25 INFO SparkContext: Running Spark version 1.5.0
15/10/26 20:40:25 INFO SecurityManager: Changing view acls to: yarn,hadoop
15/10/26 20:40:25 INFO SecurityManager: Changing modify acls to: yarn,hadoop
15/10/26 20:40:25 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hadoop); users with modify permissions: Set(yarn, hadoop)
15/10/26 20:40:25 INFO Slf4jLogger: Slf4jLogger started
15/10/26 20:40:25 INFO Remoting: Starting remoting
15/10/26 20:40:25 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@10.169.170.124:53152]
15/10/26 20:40:25 INFO Utils: Successfully started service 'sparkDriver' on port 53152.
15/10/26 20:40:26 INFO SparkEnv: Registering MapOutputTracker
15/10/26 20:40:26 INFO SparkEnv: Registering BlockManagerMaster
15/10/26 20:40:26 INFO DiskBlockManager: Created local directory at /mnt/yarn/usercache/hadoop/appcache/application_1444274555723_0064/blockmgr-c05f984d-c1d4-4eef-ae80-567e181fd8ec
15/10/26 20:40:26 INFO DiskBlockManager: Created local directory at /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/blockmgr-7e5e6b80-7032-40a2-a0d8-833812ec57ec
15/10/26 20:40:26 INFO MemoryStore: MemoryStore started with capacity 535.0 MB
15/10/26 20:40:26 INFO HttpFileServer: HTTP File server directory is /mnt/yarn/usercache/hadoop/appcache/application_1444274555723_0064/spark-058eba6b-ad25-4176-8cf6-4cb0e2f5e8c4/httpd-0444ac19-2b65-476c-8016-5b33db12b444
15/10/26 20:40:26 INFO HttpServer: Starting HTTP Server
15/10/26 20:40:26 INFO Utils: Successfully started service 'HTTP file server' on port 58440.
15/10/26 20:40:26 INFO SparkEnv: Registering OutputCommitCoordinator
15/10/26 20:40:26 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/10/26 20:40:26 INFO Utils: Successfully started service 'SparkUI' on port 43347.
15/10/26 20:40:26 INFO SparkUI: Started SparkUI at http://10.169.170.124:43347
15/10/26 20:40:26 INFO YarnClusterScheduler: Created YarnClusterScheduler
15/10/26 20:40:26 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
15/10/26 20:40:27 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 52722.
15/10/26 20:40:27 INFO NettyBlockTransferService: Server created on 52722
15/10/26 20:40:27 INFO BlockManagerMaster: Trying to register BlockManager
15/10/26 20:40:27 INFO BlockManagerMasterEndpoint: Registering block manager 10.169.170.124:52722 with 535.0 MB RAM, BlockManagerId(driver, 10.169.170.124, 52722)
15/10/26 20:40:27 INFO BlockManagerMaster: Registered BlockManager
15/10/26 20:40:27 INFO MetricsSaver: MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60 clusterEngineCycleSec: 60 disableClusterEngine: false maxMemoryMb: 3072 maxInstanceCount: 500 lastModified: 1444274560440
15/10/26 20:40:27 INFO MetricsSaver: Created MetricsSaver j-2US4HNPLS1SJO:i-031cded7:ApplicationMaster:15861 period:60 /mnt/var/em/raw/i-031cded7_20151026_ApplicationMaster_15861_raw.bin
15/10/26 20:40:27 INFO EventLoggingListener: Logging events to hdfs:///var/log/spark/apps/application_1444274555723_0064_1
15/10/26 20:40:27 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka://sparkDriver/user/YarnAM#769513966])
15/10/26 20:40:27 INFO RMProxy: Connecting to ResourceManager at ip-10-65-200-150.ec2.internal/10.65.200.150:8030
15/10/26 20:40:27 INFO YarnRMClient: Registering the ApplicationMaster
15/10/26 20:40:27 INFO YarnAllocator: Will request 2 executor containers, each with 1 cores and 1408 MB memory including 384 MB overhead
15/10/26 20:40:27 INFO YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
15/10/26 20:40:27 INFO YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
15/10/26 20:40:27 INFO ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
15/10/26 20:40:29 INFO AMRMClientImpl: Received new token for : ip-10-67-169-247.ec2.internal:8041
15/10/26 20:40:29 INFO AMRMClientImpl: Received new token for : ip-10-169-170-124.ec2.internal:8041
15/10/26 20:40:29 INFO YarnAllocator: Launching container container_1444274555723_0064_01_000002 for on host ip-10-67-169-247.ec2.internal
15/10/26 20:40:29 INFO YarnAllocator: Launching ExecutorRunnable. driverUrl: akka.tcp://sparkDriver@10.169.170.124:53152/user/CoarseGrainedScheduler, executorHostname: ip-10-67-169-247.ec2.internal
15/10/26 20:40:29 INFO YarnAllocator: Launching container container_1444274555723_0064_01_000003 for on host ip-10-169-170-124.ec2.internal
15/10/26 20:40:29 INFO ExecutorRunnable: Starting Executor Container
15/10/26 20:40:29 INFO YarnAllocator: Launching ExecutorRunnable. driverUrl: akka.tcp://sparkDriver@10.169.170.124:53152/user/CoarseGrainedScheduler, executorHostname: ip-10-169-170-124.ec2.internal
15/10/26 20:40:29 INFO YarnAllocator: Received 2 containers from YARN, launching executors on 2 of them.
15/10/26 20:40:29 INFO ExecutorRunnable: Starting Executor Container
15/10/26 20:40:29 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
15/10/26 20:40:29 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
15/10/26 20:40:29 INFO ExecutorRunnable: Setting up ContainerLaunchContext
15/10/26 20:40:29 INFO ExecutorRunnable: Setting up ContainerLaunchContext
15/10/26 20:40:29 INFO ExecutorRunnable: Preparing Local resources
15/10/26 20:40:29 INFO ExecutorRunnable: Preparing Local resources
15/10/26 20:40:29 INFO ExecutorRunnable: Prepared Local resources Map(__app__.jar -> resource { scheme: "hdfs" host: "ip-10-65-200-150.ec2.internal" port: 8020 file: "/user/hadoop/.sparkStaging/application_1444274555723_0064/Prometheus-assembly-0.0.1.jar" } size: 162982714 timestamp: 1445892020379 type: FILE visibility: PRIVATE, __spark__.jar -> resource { scheme: "hdfs" host: "ip-10-65-200-150.ec2.internal" port: 8020 file: "/user/hadoop/.sparkStaging/application_1444274555723_0064/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar" } size: 206949550 timestamp: 1445892018878 type: FILE visibility: PRIVATE)
15/10/26 20:40:29 INFO ExecutorRunnable: Prepared Local resources Map(__app__.jar -> resource { scheme: "hdfs" host: "ip-10-65-200-150.ec2.internal" port: 8020 file: "/user/hadoop/.sparkStaging/application_1444274555723_0064/Prometheus-assembly-0.0.1.jar" } size: 162982714 timestamp: 1445892020379 type: FILE visibility: PRIVATE, __spark__.jar -> resource { scheme: "hdfs" host: "ip-10-65-200-150.ec2.internal" port: 8020 file: "/user/hadoop/.sparkStaging/application_1444274555723_0064/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar" } size: 206949550 timestamp: 1445892018878 type: FILE visibility: PRIVATE)
15/10/26 20:40:29 INFO ExecutorRunnable:
===============================================================================
YARN executor launch context:
env:
CLASSPATH -> /etc/hadoop/conf:/etc/hive/conf:/usr/lib/hadoop/*:/usr/lib/hadoop-hdfs/*:/usr/lib/hadoop-mapreduce/*:/usr/lib/hadoop-yarn/*:/usr/lib/hadoop-lzo/lib/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*<CPS>{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>/usr/lib/hadoop-lzo/lib/*<CPS>/usr/share/aws/emr/emrfs/conf<CPS>/usr/share/aws/emr/emrfs/lib/*<CPS>/usr/share/aws/emr/emrfs/auxlib/*<CPS>/usr/share/aws/emr/lib/*<CPS>/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar<CPS>/usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar<CPS>/usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar<CPS>/usr/share/aws/emr/cloudwatch-sink/lib/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*<CPS>/usr/lib/hadoop-lzo/lib/*<CPS>/usr/share/aws/emr/emrfs/conf<CPS>/usr/share/aws/emr/emrfs/lib/*<CPS>/usr/share/aws/emr/emrfs/auxlib/*<CPS>/usr/share/aws/emr/lib/*<CPS>/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar<CPS>/usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar<CPS>/usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar<CPS>/usr/share/aws/emr/cloudwatch-sink/lib/*
SPARK_LOG_URL_STDERR -> http://ip-10-169-170-124.ec2.internal:8042/node/containerlogs/container_1444274555723_0064_01_000003/hadoop/stderr?start=-4096
SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1444274555723_0064
SPARK_YARN_CACHE_FILES_FILE_SIZES -> 206949550,162982714
SPARK_USER -> hadoop
SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PRIVATE
SPARK_YARN_MODE -> true
SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1445892018878,1445892020379
SPARK_LOG_URL_STDOUT -> http://ip-10-169-170-124.ec2.internal:8042/node/containerlogs/container_1444274555723_0064_01_000003/hadoop/stdout?start=-4096
SPARK_YARN_CACHE_FILES -> hdfs://ip-10-65-200-150.ec2.internal:8020/user/hadoop/.sparkStaging/application_1444274555723_0064/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar#__spark__.jar,hdfs://ip-10-65-200-150.ec2.internal:8020/user/hadoop/.sparkStaging/application_1444274555723_0064/Prometheus-assembly-0.0.1.jar#__app__.jar
command:
LD_LIBRARY_PATH="/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:$LD_LIBRARY_PATH" {{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms1024m -Xmx1024m '-verbose:gc' '-XX:+PrintGCDetails' '-XX:+PrintGCDateStamps' '-XX:+UseConcMarkSweepGC' '-XX:CMSInitiatingOccupancyFraction=70' '-XX:MaxHeapFreeRatio=70' '-XX:+CMSClassUnloadingEnabled' '-XX:OnOutOfMemoryError=kill -9 %p' -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.driver.port=53152' '-Dspark.history.ui.port=18080' '-Dspark.ui.port=0' -Dspark.yarn.app.container.log.dir=<LOG_DIR> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url akka.tcp://sparkDriver@10.169.170.124:53152/user/CoarseGrainedScheduler --executor-id 2 --hostname ip-10-169-170-124.ec2.internal --cores 1 --app-id application_1444274555723_0064 --user-class-path file:$PWD/__app__.jar 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
===============================================================================
15/10/26 20:40:29 INFO ExecutorRunnable:
===============================================================================
YARN executor launch context:
env:
CLASSPATH -> /etc/hadoop/conf:/etc/hive/conf:/usr/lib/hadoop/*:/usr/lib/hadoop-hdfs/*:/usr/lib/hadoop-mapreduce/*:/usr/lib/hadoop-yarn/*:/usr/lib/hadoop-lzo/lib/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*<CPS>{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>/usr/lib/hadoop-lzo/lib/*<CPS>/usr/share/aws/emr/emrfs/conf<CPS>/usr/share/aws/emr/emrfs/lib/*<CPS>/usr/share/aws/emr/emrfs/auxlib/*<CPS>/usr/share/aws/emr/lib/*<CPS>/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar<CPS>/usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar<CPS>/usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar<CPS>/usr/share/aws/emr/cloudwatch-sink/lib/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*<CPS>/usr/lib/hadoop-lzo/lib/*<CPS>/usr/share/aws/emr/emrfs/conf<CPS>/usr/share/aws/emr/emrfs/lib/*<CPS>/usr/share/aws/emr/emrfs/auxlib/*<CPS>/usr/share/aws/emr/lib/*<CPS>/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar<CPS>/usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar<CPS>/usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar<CPS>/usr/share/aws/emr/cloudwatch-sink/lib/*
SPARK_LOG_URL_STDERR -> http://ip-10-67-169-247.ec2.internal:8042/node/containerlogs/container_1444274555723_0064_01_000002/hadoop/stderr?start=-4096
SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1444274555723_0064
SPARK_YARN_CACHE_FILES_FILE_SIZES -> 206949550,162982714
SPARK_USER -> hadoop
SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PRIVATE
SPARK_YARN_MODE -> true
SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1445892018878,1445892020379
SPARK_LOG_URL_STDOUT -> http://ip-10-67-169-247.ec2.internal:8042/node/containerlogs/container_1444274555723_0064_01_000002/hadoop/stdout?start=-4096
SPARK_YARN_CACHE_FILES -> hdfs://ip-10-65-200-150.ec2.internal:8020/user/hadoop/.sparkStaging/application_1444274555723_0064/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar#__spark__.jar,hdfs://ip-10-65-200-150.ec2.internal:8020/user/hadoop/.sparkStaging/application_1444274555723_0064/Prometheus-assembly-0.0.1.jar#__app__.jar
command:
LD_LIBRARY_PATH="/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:$LD_LIBRARY_PATH" {{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms1024m -Xmx1024m '-verbose:gc' '-XX:+PrintGCDetails' '-XX:+PrintGCDateStamps' '-XX:+UseConcMarkSweepGC' '-XX:CMSInitiatingOccupancyFraction=70' '-XX:MaxHeapFreeRatio=70' '-XX:+CMSClassUnloadingEnabled' '-XX:OnOutOfMemoryError=kill -9 %p' -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.driver.port=53152' '-Dspark.history.ui.port=18080' '-Dspark.ui.port=0' -Dspark.yarn.app.container.log.dir=<LOG_DIR> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url akka.tcp://sparkDriver@10.169.170.124:53152/user/CoarseGrainedScheduler --executor-id 1 --hostname ip-10-67-169-247.ec2.internal --cores 1 --app-id application_1444274555723_0064 --user-class-path file:$PWD/__app__.jar 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
===============================================================================
15/10/26 20:40:29 INFO ContainerManagementProtocolProxy: Opening proxy : ip-10-67-169-247.ec2.internal:8041
15/10/26 20:40:29 INFO ContainerManagementProtocolProxy: Opening proxy : ip-10-169-170-124.ec2.internal:8041
15/10/26 20:40:33 INFO ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. ip-10-169-170-124.ec2.internal:45582
15/10/26 20:40:34 INFO YarnClusterSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@ip-10-169-170-124.ec2.internal:59701/user/Executor#1167937417]) with ID 2
15/10/26 20:40:34 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-169-170-124.ec2.internal:42829 with 535.0 MB RAM, BlockManagerId(2, ip-10-169-170-124.ec2.internal, 42829)
15/10/26 20:40:35 INFO ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. ip-10-67-169-247.ec2.internal:49653
15/10/26 20:40:35 INFO YarnClusterSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@ip-10-67-169-247.ec2.internal:45434/user/Executor#-142258348]) with ID 1
15/10/26 20:40:35 INFO YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
15/10/26 20:40:35 INFO YarnClusterScheduler: YarnClusterScheduler.postStartHook done
15/10/26 20:40:35 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-67-169-247.ec2.internal:37713 with 535.0 MB RAM, BlockManagerId(1, ip-10-67-169-247.ec2.internal, 37713)
15/10/26 20:40:37 INFO HiveContext: Initializing execution hive, version 1.2.1
15/10/26 20:40:37 INFO ClientWrapper: Inspected Hadoop version: 2.6.0-amzn-1
15/10/26 20:40:37 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0-amzn-1
15/10/26 20:40:37 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
15/10/26 20:40:37 INFO ObjectStore: ObjectStore, initialize called
15/10/26 20:40:38 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/10/26 20:40:38 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
15/10/26 20:40:40 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
15/10/26 20:40:42 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:40:42 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:40:44 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:40:44 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:40:44 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
15/10/26 20:40:44 INFO ObjectStore: Initialized ObjectStore
15/10/26 20:40:44 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/10/26 20:40:44 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/10/26 20:40:44 INFO HiveMetaStore: Added admin role in metastore
15/10/26 20:40:44 INFO HiveMetaStore: Added public role in metastore
15/10/26 20:40:44 INFO HiveMetaStore: No user is added in admin role, since config is empty
15/10/26 20:40:45 INFO HiveMetaStore: 0: get_all_databases
15/10/26 20:40:45 INFO audit: ugi=hadoop ip=unknown-ip-addr cmd=get_all_databases
15/10/26 20:40:45 INFO HiveMetaStore: 0: get_functions: db=default pat=*
15/10/26 20:40:45 INFO audit: ugi=hadoop ip=unknown-ip-addr cmd=get_functions: db=default pat=*
15/10/26 20:40:45 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:40:45 INFO SessionState: Created local directory: /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/container_1444274555723_0064_01_000001/tmp/yarn
15/10/26 20:40:45 INFO SessionState: Created local directory: /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/container_1444274555723_0064_01_000001/tmp/6f97d6db-8359-4210-a071-9b20a7bb71f7_resources
15/10/26 20:40:45 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/6f97d6db-8359-4210-a071-9b20a7bb71f7
15/10/26 20:40:45 INFO SessionState: Created local directory: /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/container_1444274555723_0064_01_000001/tmp/yarn/6f97d6db-8359-4210-a071-9b20a7bb71f7
15/10/26 20:40:45 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/6f97d6db-8359-4210-a071-9b20a7bb71f7/_tmp_space.db
15/10/26 20:40:45 INFO HiveContext: default warehouse location is /user/hive/warehouse
15/10/26 20:40:45 INFO HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
15/10/26 20:40:45 INFO ClientWrapper: Inspected Hadoop version: 2.4.0
15/10/26 20:40:45 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.4.0
15/10/26 20:40:45 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:40:45 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:40:46 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:40:46 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:40:46 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:40:46 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:40:46 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/10/26 20:40:46 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
15/10/26 20:40:46 INFO ObjectStore: ObjectStore, initialize called
15/10/26 20:40:46 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/10/26 20:40:46 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
15/10/26 20:40:49 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:40:49 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:40:49 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:40:49 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
15/10/26 20:40:50 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:40:50 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:40:52 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:40:52 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:40:52 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
15/10/26 20:40:52 INFO ObjectStore: Initialized ObjectStore
15/10/26 20:40:52 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/10/26 20:40:52 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/10/26 20:40:52 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:40:52 INFO HiveMetaStore: Added admin role in metastore
15/10/26 20:40:52 INFO HiveMetaStore: Added public role in metastore
15/10/26 20:40:53 INFO HiveMetaStore: No user is added in admin role, since config is empty
15/10/26 20:40:53 INFO HiveMetaStore: 0: get_all_databases
15/10/26 20:40:53 INFO audit: ugi=yarn ip=unknown-ip-addr cmd=get_all_databases
15/10/26 20:40:53 INFO HiveMetaStore: 0: get_functions: db=default pat=*
15/10/26 20:40:53 INFO audit: ugi=yarn ip=unknown-ip-addr cmd=get_functions: db=default pat=*
15/10/26 20:40:53 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:40:53 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:40:53 INFO SessionState: Created local directory: /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/container_1444274555723_0064_01_000001/tmp/dca9c923-98ad-4cd5-b70a-8f0da8e4936e_resources
15/10/26 20:40:53 INFO SessionState: Created HDFS directory: /tmp/hive/yarn/dca9c923-98ad-4cd5-b70a-8f0da8e4936e
15/10/26 20:40:53 INFO SessionState: Created local directory: /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/container_1444274555723_0064_01_000001/tmp/yarn/dca9c923-98ad-4cd5-b70a-8f0da8e4936e
15/10/26 20:40:53 INFO SessionState: Created HDFS directory: /tmp/hive/yarn/dca9c923-98ad-4cd5-b70a-8f0da8e4936e/_tmp_space.db
15/10/26 20:41:00 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
15/10/26 20:41:00 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
15/10/26 20:41:00 INFO deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
15/10/26 20:41:00 INFO deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
15/10/26 20:41:00 INFO deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
15/10/26 20:41:00 INFO SparkContext: Starting job: saveAsTextFile at CLIJob.scala:96
15/10/26 20:41:00 INFO DAGScheduler: Got job 0 (saveAsTextFile at CLIJob.scala:96) with 2 output partitions
15/10/26 20:41:00 INFO DAGScheduler: Final stage: ResultStage 0(saveAsTextFile at CLIJob.scala:96)
15/10/26 20:41:00 INFO DAGScheduler: Parents of final stage: List()
15/10/26 20:41:00 INFO DAGScheduler: Missing parents: List()
15/10/26 20:41:00 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at saveAsTextFile at CLIJob.scala:96), which has no missing parents
15/10/26 20:41:00 INFO MemoryStore: ensureFreeSpace(135712) called with curMem=0, maxMem=560993402
15/10/26 20:41:00 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 132.5 KB, free 534.9 MB)
15/10/26 20:41:00 INFO MemoryStore: ensureFreeSpace(47141) called with curMem=135712, maxMem=560993402
15/10/26 20:41:00 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 46.0 KB, free 534.8 MB)
15/10/26 20:41:00 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.169.170.124:52722 (size: 46.0 KB, free: 535.0 MB)
15/10/26 20:41:00 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:861
15/10/26 20:41:00 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at saveAsTextFile at CLIJob.scala:96)
15/10/26 20:41:00 INFO YarnClusterScheduler: Adding task set 0.0 with 2 tasks
15/10/26 20:41:00 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, ip-10-169-170-124.ec2.internal, PROCESS_LOCAL, 2087 bytes)
15/10/26 20:41:00 WARN TaskSetManager: Stage 0 contains a task of very large size (188 KB). The maximum recommended task size is 100 KB.
15/10/26 20:41:00 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 193533 bytes)
15/10/26 20:41:00 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-10-169-170-124.ec2.internal:42829 (size: 46.0 KB, free: 535.0 MB)
15/10/26 20:41:00 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-10-67-169-247.ec2.internal:37713 (size: 46.0 KB, free: 535.0 MB)
15/10/26 20:41:02 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1756 ms on ip-10-169-170-124.ec2.internal (1/2)
15/10/26 20:41:02 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 1799 ms on ip-10-67-169-247.ec2.internal (2/2)
15/10/26 20:41:02 INFO YarnClusterScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/10/26 20:41:02 INFO DAGScheduler: ResultStage 0 (saveAsTextFile at CLIJob.scala:96) finished in 1.821 s
15/10/26 20:41:02 INFO DAGScheduler: Job 0 finished: saveAsTextFile at CLIJob.scala:96, took 2.037637 s
15/10/26 20:41:02 INFO BlockManagerInfo: Removed broadcast_0_piece0 on ip-10-67-169-247.ec2.internal:37713 in memory (size: 46.0 KB, free: 535.0 MB)
15/10/26 20:41:02 INFO BlockManagerInfo: Removed broadcast_0_piece0 on ip-10-169-170-124.ec2.internal:42829 in memory (size: 46.0 KB, free: 535.0 MB)
15/10/26 20:41:02 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 10.169.170.124:52722 in memory (size: 46.0 KB, free: 535.0 MB)
15/10/26 20:41:02 INFO ContextCleaner: Cleaned accumulator 1
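
The "task of very large size (188 KB)" warning in this stage suggests a sizeable object (possibly the fetched feed contents) is being serialized into each task closure. A hedged Scala sketch of the standard remedy, shipping the data once per executor as a broadcast variable; `payload` and the function name are illustrative, not from the job:

    import org.apache.spark.SparkContext

    // Hedged sketch, not the job's code. Broadcasting large read-only data
    // keeps task descriptions near the 100 KB guideline the scheduler warns about.
    def distributeLargePayload(sc: SparkContext, payload: String): Unit = {
      val bc = sc.broadcast(payload)      // shipped to each executor once
      sc.parallelize(1 to 4).foreach { _ =>
        val local = bc.value              // dereferenced on the executor side
        require(local != null)            // placeholder for real per-record work
      }
    }
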
15/10/26 20:41:02 INFO SparkContext: Starting job: json at CLIJob.scala:104
15/10/26 20:41:02 INFO DAGScheduler: Got job 1 (json at CLIJob.scala:104) with 2 output partitions
15/10/26 20:41:02 INFO DAGScheduler: Final stage: ResultStage 1(json at CLIJob.scala:104)
15/10/26 20:41:02 INFO DAGScheduler: Parents of final stage: List()
15/10/26 20:41:02 INFO DAGScheduler: Missing parents: List()
15/10/26 20:41:02 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[5] at json at CLIJob.scala:104), which has no missing parents
15/10/26 20:41:02 INFO MemoryStore: ensureFreeSpace(3776) called with curMem=0, maxMem=560993402
15/10/26 20:41:02 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.7 KB, free 535.0 MB)
15/10/26 20:41:02 INFO MemoryStore: ensureFreeSpace(2070) called with curMem=3776, maxMem=560993402
15/10/26 20:41:02 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.0 KB, free 535.0 MB)
15/10/26 20:41:02 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 10.169.170.124:52722 (size: 2.0 KB, free: 535.0 MB)
15/10/26 20:41:02 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:861
15/10/26 20:41:02 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 1 (MapPartitionsRDD[5] at json at CLIJob.scala:104)
15/10/26 20:41:02 INFO YarnClusterScheduler: Adding task set 1.0 with 2 tasks
15/10/26 20:41:02 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 2087 bytes)
15/10/26 20:41:02 WARN TaskSetManager: Stage 1 contains a task of very large size (188 KB). The maximum recommended task size is 100 KB.
15/10/26 20:41:02 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 3, ip-10-169-170-124.ec2.internal, PROCESS_LOCAL, 193533 bytes)
15/10/26 20:41:02 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on ip-10-67-169-247.ec2.internal:37713 (size: 2.0 KB, free: 535.0 MB)
15/10/26 20:41:02 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 116 ms on ip-10-67-169-247.ec2.internal (1/2)
15/10/26 20:41:02 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on ip-10-169-170-124.ec2.internal:42829 (size: 2.0 KB, free: 535.0 MB)
15/10/26 20:41:04 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 2424 ms on ip-10-169-170-124.ec2.internal (2/2)
15/10/26 20:41:04 INFO DAGScheduler: ResultStage 1 (json at CLIJob.scala:104) finished in 2.425 s
15/10/26 20:41:04 INFO YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
15/10/26 20:41:05 INFO DAGScheduler: Job 1 finished: json at CLIJob.scala:104, took 2.573385 s
15/10/26 20:41:05 INFO ContextCleaner: Cleaned accumulator 2
15/10/26 20:41:05 INFO BlockManagerInfo: Removed broadcast_1_piece0 on 10.169.170.124:52722 in memory (size: 2.0 KB, free: 535.0 MB)
15/10/26 20:41:05 INFO BlockManagerInfo: Removed broadcast_1_piece0 on ip-10-67-169-247.ec2.internal:37713 in memory (size: 2.0 KB, free: 535.0 MB)
15/10/26 20:41:05 INFO BlockManagerInfo: Removed broadcast_1_piece0 on ip-10-169-170-124.ec2.internal:42829 in memory (size: 2.0 KB, free: 535.0 MB)
15/10/26 20:41:05 INFO MemoryStore: ensureFreeSpace(93288) called with curMem=0, maxMem=560993402
15/10/26 20:41:05 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 91.1 KB, free 534.9 MB)
15/10/26 20:41:05 INFO MemoryStore: ensureFreeSpace(21698) called with curMem=93288, maxMem=560993402
15/10/26 20:41:05 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 21.2 KB, free 534.9 MB)
15/10/26 20:41:05 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 10.169.170.124:52722 (size: 21.2 KB, free: 535.0 MB)
15/10/26 20:41:05 INFO SparkContext: Created broadcast 2 from parquet at CLIJob.scala:108
15/10/26 20:41:05 INFO ParquetRelation: Using default output committer for Parquet: org.apache.parquet.hadoop.ParquetOutputCommitter
15/10/26 20:41:05 INFO BlockManagerInfo: Removed broadcast_2_piece0 on 10.169.170.124:52722 in memory (size: 21.2 KB, free: 535.0 MB)
15/10/26 20:41:05 INFO DefaultWriterContainer: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
15/10/26 20:41:05 INFO SparkContext: Starting job: parquet at CLIJob.scala:108
15/10/26 20:41:05 INFO DAGScheduler: Got job 2 (parquet at CLIJob.scala:108) with 2 output partitions
15/10/26 20:41:05 INFO DAGScheduler: Final stage: ResultStage 2(parquet at CLIJob.scala:108)
15/10/26 20:41:05 INFO DAGScheduler: Parents of final stage: List()
15/10/26 20:41:05 INFO DAGScheduler: Missing parents: List()
15/10/26 20:41:05 INFO DAGScheduler: Submitting ResultStage 2 (MapPartitionsRDD[6] at parquet at CLIJob.scala:108), which has no missing parents
15/10/26 20:41:05 INFO MemoryStore: ensureFreeSpace(82904) called with curMem=0, maxMem=560993402
15/10/26 20:41:05 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 81.0 KB, free 534.9 MB)
15/10/26 20:41:05 INFO MemoryStore: ensureFreeSpace(29341) called with curMem=82904, maxMem=560993402
15/10/26 20:41:05 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 28.7 KB, free 534.9 MB)
15/10/26 20:41:05 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 10.169.170.124:52722 (size: 28.7 KB, free: 535.0 MB)
15/10/26 20:41:05 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:861
15/10/26 20:41:05 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 2 (MapPartitionsRDD[6] at parquet at CLIJob.scala:108)
15/10/26 20:41:05 INFO YarnClusterScheduler: Adding task set 2.0 with 2 tasks
15/10/26 20:41:05 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 4, ip-10-169-170-124.ec2.internal, PROCESS_LOCAL, 2087 bytes)
15/10/26 20:41:05 WARN TaskSetManager: Stage 2 contains a task of very large size (188 KB). The maximum recommended task size is 100 KB.
15/10/26 20:41:05 INFO TaskSetManager: Starting task 1.0 in stage 2.0 (TID 5, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 193533 bytes)
15/10/26 20:41:05 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on ip-10-67-169-247.ec2.internal:37713 (size: 28.7 KB, free: 535.0 MB)
15/10/26 20:41:05 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on ip-10-169-170-124.ec2.internal:42829 (size: 28.7 KB, free: 535.0 MB)
15/10/26 20:41:07 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 4) in 1254 ms on ip-10-169-170-124.ec2.internal (1/2)
15/10/26 20:41:08 INFO TaskSetManager: Finished task 1.0 in stage 2.0 (TID 5) in 2810 ms on ip-10-67-169-247.ec2.internal (2/2)
15/10/26 20:41:08 INFO DAGScheduler: ResultStage 2 (parquet at CLIJob.scala:108) finished in 2.812 s
15/10/26 20:41:08 INFO YarnClusterScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool
15/10/26 20:41:08 INFO DAGScheduler: Job 2 finished: parquet at CLIJob.scala:108, took 2.853673 s
15/10/26 20:41:08 INFO DefaultWriterContainer: Job job_201510262041_0000 committed.
15/10/26 20:41:08 INFO ParquetRelation: Listing hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.parquet on driver
15/10/26 20:41:08 INFO ParquetRelation: Listing hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.parquet on driver
15/10/26 20:41:09 INFO ContextCleaner: Cleaned accumulator 3
15/10/26 20:41:09 INFO BlockManagerInfo: Removed broadcast_3_piece0 on 10.169.170.124:52722 in memory (size: 28.7 KB, free: 535.0 MB)
15/10/26 20:41:09 INFO BlockManagerInfo: Removed broadcast_3_piece0 on ip-10-169-170-124.ec2.internal:42829 in memory (size: 28.7 KB, free: 535.0 MB)
15/10/26 20:41:09 INFO BlockManagerInfo: Removed broadcast_3_piece0 on ip-10-67-169-247.ec2.internal:37713 in memory (size: 28.7 KB, free: 535.0 MB)
15/10/26 20:41:09 INFO ParseDriver: Parsing command: select x.id, x.title, x.description, x.mediaavailableDate as available_date, x.mediaexpirationDate as expiration_date, mediacategories.medianame as media_name, x.mediakeywords as keywords, mediaratings.scheme as rating_scheme, mediaratings.rating, cast(mediaratings.subRatings as String) as sub_ratings, content.plfileduration as duration, x.plmediaprovider as provider, x.ngccontentAdType as ad_type, x.ngcepisodeNumber as episode, ngcnetwork as network, x.ngcseasonNumber as season_number, x.ngcuID as ngc_uid, x.ngcvideoType as video_type from etl lateral view explode(entries) entries as x lateral view explode(x.mediacategories) cat as mediacategories lateral view explode(x.mediaratings) r as mediaratings lateral view explode(x.mediacontent) mediacontent as content lateral view outer explode(x.ngcnetwork) net as ngcnetwork
15/10/26 20:41:10 INFO ParseDriver: Parse Completed
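
For readers less used to HiveQL, each "lateral view explode(...)" in the query above flattens one nested array of the feed (entries, mediacategories, mediaratings, mediacontent) into one row per element, and "lateral view outer explode" additionally keeps rows whose ngcnetwork array is empty or null. A rough Spark 1.5 DataFrame-API equivalent of the first two flattening steps, assuming etl is the DataFrame registered as the etl temp table (illustrative only, not taken from CLIJob):

    import org.apache.spark.sql.functions.explode

    // One row per element of the top-level entries array.
    val entriesDf = etl.select(explode(etl("entries")).as("x"))
    // Repeat the pattern for each nested array, e.g. the media categories.
    val withCats = entriesDf.select(entriesDf("x"),
      explode(entriesDf("x.mediacategories")).as("mediacategories"))

The outer variant has no single-function DataFrame counterpart in 1.5, which is why the SQL form is the natural way to express this query.
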
15/10/26 20:41:10 INFO MemoryStore: ensureFreeSpace(243880) called with curMem=0, maxMem=560993402
15/10/26 20:41:10 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 238.2 KB, free 534.8 MB)
15/10/26 20:41:11 INFO MemoryStore: ensureFreeSpace(21698) called with curMem=243880, maxMem=560993402
15/10/26 20:41:11 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 21.2 KB, free 534.8 MB)
15/10/26 20:41:11 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on 10.169.170.124:52722 (size: 21.2 KB, free: 535.0 MB)
15/10/26 20:41:11 INFO SparkContext: Created broadcast 4 from cache at CLIJob.scala:114
15/10/26 20:41:11 INFO BlockManagerInfo: Removed broadcast_4_piece0 on 10.169.170.124:52722 in memory (size: 21.2 KB, free: 535.0 MB)
15/10/26 20:41:11 INFO SparkContext: Starting job: saveAsTextFile at CLIJob.scala:115
15/10/26 20:41:11 INFO DAGScheduler: Got job 3 (saveAsTextFile at CLIJob.scala:115) with 4 output partitions
15/10/26 20:41:11 INFO DAGScheduler: Final stage: ResultStage 3(saveAsTextFile at CLIJob.scala:115)
15/10/26 20:41:11 INFO DAGScheduler: Parents of final stage: List()
15/10/26 20:41:11 INFO DAGScheduler: Missing parents: List()
15/10/26 20:41:11 INFO DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[27] at saveAsTextFile at CLIJob.scala:115), which has no missing parents
15/10/26 20:41:11 INFO MemoryStore: ensureFreeSpace(170000) called with curMem=0, maxMem=560993402
15/10/26 20:41:11 INFO MemoryStore: Block broadcast_5 stored as values in memory (estimated size 166.0 KB, free 534.8 MB)
15/10/26 20:41:11 INFO MemoryStore: ensureFreeSpace(58865) called with curMem=170000, maxMem=560993402
15/10/26 20:41:11 INFO MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 57.5 KB, free 534.8 MB)
15/10/26 20:41:11 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on 10.169.170.124:52722 (size: 57.5 KB, free: 534.9 MB)
15/10/26 20:41:11 INFO SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:861
15/10/26 20:41:11 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 3 (MapPartitionsRDD[27] at saveAsTextFile at CLIJob.scala:115)
15/10/26 20:41:11 INFO YarnClusterScheduler: Adding task set 3.0 with 4 tasks
15/10/26 20:41:11 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 6, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 2196 bytes)
15/10/26 20:41:11 INFO TaskSetManager: Starting task 1.0 in stage 3.0 (TID 7, ip-10-169-170-124.ec2.internal, PROCESS_LOCAL, 2378 bytes)
15/10/26 20:41:11 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on ip-10-169-170-124.ec2.internal:42829 (size: 57.5 KB, free: 534.9 MB)
15/10/26 20:41:11 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on ip-10-67-169-247.ec2.internal:37713 (size: 57.5 KB, free: 534.9 MB)
15/10/26 20:41:12 INFO TaskSetManager: Starting task 2.0 in stage 3.0 (TID 8, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 2196 bytes)
15/10/26 20:41:12 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 6) in 270 ms on ip-10-67-169-247.ec2.internal (1/4)
15/10/26 20:41:12 WARN TaskSetManager: Stage 3 contains a task of very large size (189 KB). The maximum recommended task size is 100 KB.
15/10/26 20:41:12 INFO TaskSetManager: Starting task 3.0 in stage 3.0 (TID 9, ip-10-169-170-124.ec2.internal, PROCESS_LOCAL, 193642 bytes)
15/10/26 20:41:12 INFO TaskSetManager: Finished task 1.0 in stage 3.0 (TID 7) in 273 ms on ip-10-169-170-124.ec2.internal (2/4)
15/10/26 20:41:13 INFO BlockManagerInfo: Added rdd_21_0 in memory on ip-10-67-169-247.ec2.internal:37713 (size: 16.0 B, free: 534.9 MB)
15/10/26 20:41:13 INFO TaskSetManager: Finished task 2.0 in stage 3.0 (TID 8) in 1525 ms on ip-10-67-169-247.ec2.internal (3/4)
15/10/26 20:41:14 WARN TaskSetManager: Lost task 3.0 in stage 3.0 (TID 9, ip-10-169-170-124.ec2.internal): java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/26 20:41:14 INFO TaskSetManager: Starting task 3.1 in stage 3.0 (TID 10, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 193642 bytes)
15/10/26 20:41:15 INFO TaskSetManager: Lost task 3.1 in stage 3.0 (TID 10) on executor ip-10-67-169-247.ec2.internal: java.util.NoSuchElementException (key not found: http://data.media.theplatform.com/media/data/Media/525991491676) [duplicate 1]
15/10/26 20:41:15 INFO TaskSetManager: Starting task 3.2 in stage 3.0 (TID 11, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 193642 bytes)
15/10/26 20:41:15 INFO TaskSetManager: Lost task 3.2 in stage 3.0 (TID 11) on executor ip-10-67-169-247.ec2.internal: java.util.NoSuchElementException (key not found: http://data.media.theplatform.com/media/data/Media/525991491676) [duplicate 2]
15/10/26 20:41:15 INFO TaskSetManager: Starting task 3.3 in stage 3.0 (TID 12, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 193642 bytes)
15/10/26 20:41:15 INFO TaskSetManager: Lost task 3.3 in stage 3.0 (TID 12) on executor ip-10-67-169-247.ec2.internal: java.util.NoSuchElementException (key not found: http://data.media.theplatform.com/media/data/Media/525991491676) [duplicate 3]
15/10/26 20:41:15 ERROR TaskSetManager: Task 3 in stage 3.0 failed 4 times; aborting job
15/10/26 20:41:15 INFO YarnClusterScheduler: Removed TaskSet 3.0, whose tasks have all completed, from pool
15/10/26 20:41:15 INFO YarnClusterScheduler: Cancelling stage 3
15/10/26 20:41:15 INFO DAGScheduler: ResultStage 3 (saveAsTextFile at CLIJob.scala:115) failed in 3.625 s
15/10/26 20:41:15 INFO DAGScheduler: Job 3 failed: saveAsTextFile at CLIJob.scala:115, took 3.823248 s
15/10/26 20:41:15 ERROR ApplicationMaster: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 3.0 failed 4 times, most recent failure: Lost task 3.3 in stage 3.0 (TID 12, ip-10-67-169-247.ec2.internal): java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 3.0 failed 4 times, most recent failure: Lost task 3.3 in stage 3.0 (TID 12, ip-10-67-169-247.ec2.internal): java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1280)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1268)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1267)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1267)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1493)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1455)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1444)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1813)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1826)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1903)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1124)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1065)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1065)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1065)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply$mcV$sp(PairRDDFunctions.scala:989)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply(PairRDDFunctions.scala:965)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply(PairRDDFunctions.scala:965)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:965)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply$mcV$sp(PairRDDFunctions.scala:897)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply(PairRDDFunctions.scala:897)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply(PairRDDFunctions.scala:897)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:896)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply$mcV$sp(RDD.scala:1426)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1405)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1405)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1405)
at com.truex.prometheus.CLIJob$$anon$1.execute(CLIJob.scala:115)
at com.truex.prometheus.CLIJob$.main(CLIJob.scala:122)
at com.truex.prometheus.CLIJob.main(CLIJob.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:525)
Caused by: java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/26 20:41:15 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 3.0 failed 4 times, most recent failure: Lost task 3.3 in stage 3.0 (TID 12, ip-10-67-169-247.ec2.internal): java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:)
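
What the four failed attempts above have in common: every one dies inside DictionaryEncoding$Encoder.compress while the columnar cache created by cache at CLIJob.scala:114 is being built. In other words, Spark SQL's in-memory columnar store hits a string value during its compression pass that is missing from the dictionary it assembled on the first pass, so the job never reaches the saveAsTextFile output at all. Two hedged ways to sidestep it without touching the query, assuming the Spark 1.5.0 behavior shown in this log (workarounds, not a fix of the underlying cache bug; moving past 1.5.0 may also make it go away):

    // Option 1: keep the cache but disable columnar compression, which bypasses
    // DictionaryEncoding entirely. Equivalent spark-submit flag:
    //   --conf spark.sql.inMemoryColumnarStorage.compressed=false
    sqlContext.setConf("spark.sql.inMemoryColumnarStorage.compressed", "false")

    // Option 2: drop the cache() at CLIJob.scala:114 and recompute the result
    // for each output; with a feed of this size the extra pass is cheap.
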
15/10/26 20:41:15 INFO SparkContext: Invoking stop() from shutdown hook
15/10/26 20:41:15 INFO SparkUI: Stopped Spark web UI at http://10.169.170.124:43347
15/10/26 20:41:15 INFO DAGScheduler: Stopping DAGScheduler
15/10/26 20:41:15 INFO YarnClusterSchedulerBackend: Shutting down all executors
15/10/26 20:41:15 INFO YarnClusterSchedulerBackend: Asking each executor to shut down
15/10/26 20:41:15 INFO ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. ip-10-67-169-247.ec2.internal:45434
15/10/26 20:41:15 INFO ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. ip-10-169-170-124.ec2.internal:59701
15/10/26 20:41:15 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
15/10/26 20:41:15 INFO MemoryStore: MemoryStore cleared
15/10/26 20:41:15 INFO BlockManager: BlockManager stopped
15/10/26 20:41:15 INFO BlockManagerMaster: BlockManagerMaster stopped
15/10/26 20:41:15 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
15/10/26 20:41:15 INFO SparkContext: Successfully stopped SparkContext
15/10/26 20:41:15 INFO ShutdownHookManager: Shutdown hook called
15/10/26 20:41:15 INFO ShutdownHookManager: Deleting directory /mnt/yarn/usercache/hadoop/appcache/application_1444274555723_0064/spark-058eba6b-ad25-4176-8cf6-4cb0e2f5e8c4
15/10/26 20:41:15 INFO ShutdownHookManager: Deleting directory /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/spark-b8e2f488-e40c-4671-b5f0-40d5b292abc2
15/10/26 20:41:15 INFO ShutdownHookManager: Deleting directory /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/container_1444274555723_0064_01_000001/tmp/spark-29701bea-a219-4967-bf4e-d33f9bfd4ea3
15/10/26 20:41:15 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/10/26 20:41:15 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
LogType:stdout
Log Upload Time:26-Oct-2015 20:42:08
LogLength:0
Log Contents:
Container: container_1444274555723_0064_02_000001 on ip-10-169-170-124.ec2.internal_8041
==========================================================================================
LogType:stderr
Log Upload Time:26-Oct-2015 20:42:08
LogLength:71922
Log Contents:
log4j:ERROR Could not read configuration file from URL [file:/etc/spark/conf/log4j.properties].
java.io.FileNotFoundException: /etc/spark/conf/log4j.properties (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:146)
at java.io.FileInputStream.<init>(FileInputStream.java:101)
at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:557)
at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
at org.apache.spark.Logging$class.initializeLogging(Logging.scala:122)
at org.apache.spark.Logging$class.initializeIfNecessary(Logging.scala:107)
at org.apache.spark.Logging$class.log(Logging.scala:51)
at org.apache.spark.deploy.yarn.ApplicationMaster$.log(ApplicationMaster.scala:603)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:617)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
log4j:ERROR Ignoring configuration file [file:/etc/spark/conf/log4j.properties].
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/mnt1/yarn/usercache/hadoop/filecache/119/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/10/26 20:41:17 INFO ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
15/10/26 20:41:18 INFO ApplicationMaster: ApplicationAttemptId: appattempt_1444274555723_0064_000002
15/10/26 20:41:19 INFO SecurityManager: Changing view acls to: yarn,hadoop
15/10/26 20:41:19 INFO SecurityManager: Changing modify acls to: yarn,hadoop
15/10/26 20:41:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hadoop); users with modify permissions: Set(yarn, hadoop)
15/10/26 20:41:19 INFO ApplicationMaster: Starting the user application in a separate Thread
15/10/26 20:41:19 INFO ApplicationMaster: Waiting for spark context initialization
15/10/26 20:41:19 INFO ApplicationMaster: Waiting for spark context initialization ...
15/10/26 20:41:19 INFO SparkContext: Running Spark version 1.5.0
15/10/26 20:41:19 INFO SecurityManager: Changing view acls to: yarn,hadoop
15/10/26 20:41:19 INFO SecurityManager: Changing modify acls to: yarn,hadoop
15/10/26 20:41:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hadoop); users with modify permissions: Set(yarn, hadoop)
15/10/26 20:41:20 INFO Slf4jLogger: Slf4jLogger started
15/10/26 20:41:20 INFO Remoting: Starting remoting
15/10/26 20:41:20 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@10.169.170.124:39575]
15/10/26 20:41:20 INFO Utils: Successfully started service 'sparkDriver' on port 39575.
15/10/26 20:41:20 INFO SparkEnv: Registering MapOutputTracker
15/10/26 20:41:20 INFO SparkEnv: Registering BlockManagerMaster
15/10/26 20:41:20 INFO DiskBlockManager: Created local directory at /mnt/yarn/usercache/hadoop/appcache/application_1444274555723_0064/blockmgr-de30fe47-b14d-4cd0-8405-ce176ee47357
15/10/26 20:41:20 INFO DiskBlockManager: Created local directory at /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/blockmgr-88e6f361-d3b6-4d2e-967b-c2f51f18c0df
15/10/26 20:41:20 INFO MemoryStore: MemoryStore started with capacity 535.0 MB
15/10/26 20:41:20 INFO HttpFileServer: HTTP File server directory is /mnt/yarn/usercache/hadoop/appcache/application_1444274555723_0064/spark-70134e0e-f756-4a8e-9957-c26c4ee31a50/httpd-180c5862-8a9a-4c28-8834-c604a4374260
15/10/26 20:41:20 INFO HttpServer: Starting HTTP Server
15/10/26 20:41:20 INFO Utils: Successfully started service 'HTTP file server' on port 44746.
15/10/26 20:41:20 INFO SparkEnv: Registering OutputCommitCoordinator
15/10/26 20:41:21 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/10/26 20:41:21 INFO Utils: Successfully started service 'SparkUI' on port 44780.
15/10/26 20:41:21 INFO SparkUI: Started SparkUI at http://10.169.170.124:44780
15/10/26 20:41:21 INFO YarnClusterScheduler: Created YarnClusterScheduler
15/10/26 20:41:21 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
15/10/26 20:41:21 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 32888.
15/10/26 20:41:21 INFO NettyBlockTransferService: Server created on 32888
15/10/26 20:41:21 INFO BlockManagerMaster: Trying to register BlockManager
15/10/26 20:41:21 INFO BlockManagerMasterEndpoint: Registering block manager 10.169.170.124:32888 with 535.0 MB RAM, BlockManagerId(driver, 10.169.170.124, 32888)
15/10/26 20:41:21 INFO BlockManagerMaster: Registered BlockManager
15/10/26 20:41:22 INFO MetricsSaver: MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60 clusterEngineCycleSec: 60 disableClusterEngine: false maxMemoryMb: 3072 maxInstanceCount: 500 lastModified: 1444274560440
15/10/26 20:41:22 INFO MetricsSaver: Created MetricsSaver j-2US4HNPLS1SJO:i-031cded7:ApplicationMaster:16232 period:60 /mnt/var/em/raw/i-031cded7_20151026_ApplicationMaster_16232_raw.bin
15/10/26 20:41:22 INFO EventLoggingListener: Logging events to hdfs:///var/log/spark/apps/application_1444274555723_0064_2
15/10/26 20:41:22 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka://sparkDriver/user/YarnAM#-1806506353])
15/10/26 20:41:22 INFO RMProxy: Connecting to ResourceManager at ip-10-65-200-150.ec2.internal/10.65.200.150:8030
15/10/26 20:41:22 INFO YarnRMClient: Registering the ApplicationMaster
15/10/26 20:41:22 INFO YarnAllocator: Will request 2 executor containers, each with 1 cores and 1408 MB memory including 384 MB overhead
15/10/26 20:41:22 INFO YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
15/10/26 20:41:22 INFO YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
15/10/26 20:41:22 INFO ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
15/10/26 20:41:24 INFO AMRMClientImpl: Received new token for : ip-10-67-169-247.ec2.internal:8041
15/10/26 20:41:24 INFO AMRMClientImpl: Received new token for : ip-10-169-170-124.ec2.internal:8041
15/10/26 20:41:24 INFO YarnAllocator: Launching container container_1444274555723_0064_02_000002 for on host ip-10-67-169-247.ec2.internal
15/10/26 20:41:24 INFO YarnAllocator: Launching ExecutorRunnable. driverUrl: akka.tcp://sparkDriver@10.169.170.124:39575/user/CoarseGrainedScheduler, executorHostname: ip-10-67-169-247.ec2.internal
15/10/26 20:41:24 INFO YarnAllocator: Launching container container_1444274555723_0064_02_000003 for on host ip-10-169-170-124.ec2.internal
15/10/26 20:41:24 INFO ExecutorRunnable: Starting Executor Container
15/10/26 20:41:24 INFO YarnAllocator: Launching ExecutorRunnable. driverUrl: akka.tcp://sparkDriver@10.169.170.124:39575/user/CoarseGrainedScheduler, executorHostname: ip-10-169-170-124.ec2.internal
15/10/26 20:41:24 INFO ExecutorRunnable: Starting Executor Container
15/10/26 20:41:24 INFO YarnAllocator: Received 2 containers from YARN, launching executors on 2 of them.
15/10/26 20:41:24 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
15/10/26 20:41:24 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
15/10/26 20:41:24 INFO ExecutorRunnable: Setting up ContainerLaunchContext
15/10/26 20:41:24 INFO ExecutorRunnable: Setting up ContainerLaunchContext
15/10/26 20:41:24 INFO ExecutorRunnable: Preparing Local resources
15/10/26 20:41:24 INFO ExecutorRunnable: Preparing Local resources
15/10/26 20:41:24 INFO ExecutorRunnable: Prepared Local resources Map(__app__.jar -> resource { scheme: "hdfs" host: "ip-10-65-200-150.ec2.internal" port: 8020 file: "/user/hadoop/.sparkStaging/application_1444274555723_0064/Prometheus-assembly-0.0.1.jar" } size: 162982714 timestamp: 1445892020379 type: FILE visibility: PRIVATE, __spark__.jar -> resource { scheme: "hdfs" host: "ip-10-65-200-150.ec2.internal" port: 8020 file: "/user/hadoop/.sparkStaging/application_1444274555723_0064/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar" } size: 206949550 timestamp: 1445892018878 type: FILE visibility: PRIVATE)
15/10/26 20:41:24 INFO ExecutorRunnable: Prepared Local resources Map(__app__.jar -> resource { scheme: "hdfs" host: "ip-10-65-200-150.ec2.internal" port: 8020 file: "/user/hadoop/.sparkStaging/application_1444274555723_0064/Prometheus-assembly-0.0.1.jar" } size: 162982714 timestamp: 1445892020379 type: FILE visibility: PRIVATE, __spark__.jar -> resource { scheme: "hdfs" host: "ip-10-65-200-150.ec2.internal" port: 8020 file: "/user/hadoop/.sparkStaging/application_1444274555723_0064/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar" } size: 206949550 timestamp: 1445892018878 type: FILE visibility: PRIVATE)
15/10/26 20:41:24 INFO ExecutorRunnable:
===============================================================================
YARN executor launch context:
env:
CLASSPATH -> /etc/hadoop/conf:/etc/hive/conf:/usr/lib/hadoop/*:/usr/lib/hadoop-hdfs/*:/usr/lib/hadoop-mapreduce/*:/usr/lib/hadoop-yarn/*:/usr/lib/hadoop-lzo/lib/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*<CPS>{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>/usr/lib/hadoop-lzo/lib/*<CPS>/usr/share/aws/emr/emrfs/conf<CPS>/usr/share/aws/emr/emrfs/lib/*<CPS>/usr/share/aws/emr/emrfs/auxlib/*<CPS>/usr/share/aws/emr/lib/*<CPS>/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar<CPS>/usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar<CPS>/usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar<CPS>/usr/share/aws/emr/cloudwatch-sink/lib/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*<CPS>/usr/lib/hadoop-lzo/lib/*<CPS>/usr/share/aws/emr/emrfs/conf<CPS>/usr/share/aws/emr/emrfs/lib/*<CPS>/usr/share/aws/emr/emrfs/auxlib/*<CPS>/usr/share/aws/emr/lib/*<CPS>/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar<CPS>/usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar<CPS>/usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar<CPS>/usr/share/aws/emr/cloudwatch-sink/lib/*
SPARK_LOG_URL_STDERR -> http://ip-10-67-169-247.ec2.internal:8042/node/containerlogs/container_1444274555723_0064_02_000002/hadoop/stderr?start=-4096
SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1444274555723_0064
SPARK_YARN_CACHE_FILES_FILE_SIZES -> 206949550,162982714
SPARK_USER -> hadoop
SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PRIVATE
SPARK_YARN_MODE -> true
SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1445892018878,1445892020379
SPARK_LOG_URL_STDOUT -> http://ip-10-67-169-247.ec2.internal:8042/node/containerlogs/container_1444274555723_0064_02_000002/hadoop/stdout?start=-4096
SPARK_YARN_CACHE_FILES -> hdfs://ip-10-65-200-150.ec2.internal:8020/user/hadoop/.sparkStaging/application_1444274555723_0064/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar#__spark__.jar,hdfs://ip-10-65-200-150.ec2.internal:8020/user/hadoop/.sparkStaging/application_1444274555723_0064/Prometheus-assembly-0.0.1.jar#__app__.jar
command:
LD_LIBRARY_PATH="/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:$LD_LIBRARY_PATH" {{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms1024m -Xmx1024m '-verbose:gc' '-XX:+PrintGCDetails' '-XX:+PrintGCDateStamps' '-XX:+UseConcMarkSweepGC' '-XX:CMSInitiatingOccupancyFraction=70' '-XX:MaxHeapFreeRatio=70' '-XX:+CMSClassUnloadingEnabled' '-XX:OnOutOfMemoryError=kill -9 %p' -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.driver.port=39575' '-Dspark.history.ui.port=18080' '-Dspark.ui.port=0' -Dspark.yarn.app.container.log.dir=<LOG_DIR> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url akka.tcp://sparkDriver@10.169.170.124:39575/user/CoarseGrainedScheduler --executor-id 1 --hostname ip-10-67-169-247.ec2.internal --cores 1 --app-id application_1444274555723_0064 --user-class-path file:$PWD/__app__.jar 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
===============================================================================
15/10/26 20:41:24 INFO ExecutorRunnable:
===============================================================================
YARN executor launch context:
env:
CLASSPATH -> /etc/hadoop/conf:/etc/hive/conf:/usr/lib/hadoop/*:/usr/lib/hadoop-hdfs/*:/usr/lib/hadoop-mapreduce/*:/usr/lib/hadoop-yarn/*:/usr/lib/hadoop-lzo/lib/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*<CPS>{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>/usr/lib/hadoop-lzo/lib/*<CPS>/usr/share/aws/emr/emrfs/conf<CPS>/usr/share/aws/emr/emrfs/lib/*<CPS>/usr/share/aws/emr/emrfs/auxlib/*<CPS>/usr/share/aws/emr/lib/*<CPS>/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar<CPS>/usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar<CPS>/usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar<CPS>/usr/share/aws/emr/cloudwatch-sink/lib/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*<CPS>/usr/lib/hadoop-lzo/lib/*<CPS>/usr/share/aws/emr/emrfs/conf<CPS>/usr/share/aws/emr/emrfs/lib/*<CPS>/usr/share/aws/emr/emrfs/auxlib/*<CPS>/usr/share/aws/emr/lib/*<CPS>/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar<CPS>/usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar<CPS>/usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar<CPS>/usr/share/aws/emr/cloudwatch-sink/lib/*
SPARK_LOG_URL_STDERR -> http://ip-10-169-170-124.ec2.internal:8042/node/containerlogs/container_1444274555723_0064_02_000003/hadoop/stderr?start=-4096
SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1444274555723_0064
SPARK_YARN_CACHE_FILES_FILE_SIZES -> 206949550,162982714
SPARK_USER -> hadoop
SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PRIVATE
SPARK_YARN_MODE -> true
SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1445892018878,1445892020379
SPARK_LOG_URL_STDOUT -> http://ip-10-169-170-124.ec2.internal:8042/node/containerlogs/container_1444274555723_0064_02_000003/hadoop/stdout?start=-4096
SPARK_YARN_CACHE_FILES -> hdfs://ip-10-65-200-150.ec2.internal:8020/user/hadoop/.sparkStaging/application_1444274555723_0064/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar#__spark__.jar,hdfs://ip-10-65-200-150.ec2.internal:8020/user/hadoop/.sparkStaging/application_1444274555723_0064/Prometheus-assembly-0.0.1.jar#__app__.jar
command:
LD_LIBRARY_PATH="/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:$LD_LIBRARY_PATH" {{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms1024m -Xmx1024m '-verbose:gc' '-XX:+PrintGCDetails' '-XX:+PrintGCDateStamps' '-XX:+UseConcMarkSweepGC' '-XX:CMSInitiatingOccupancyFraction=70' '-XX:MaxHeapFreeRatio=70' '-XX:+CMSClassUnloadingEnabled' '-XX:OnOutOfMemoryError=kill -9 %p' -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.driver.port=39575' '-Dspark.history.ui.port=18080' '-Dspark.ui.port=0' -Dspark.yarn.app.container.log.dir=<LOG_DIR> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url akka.tcp://sparkDriver@10.169.170.124:39575/user/CoarseGrainedScheduler --executor-id 2 --hostname ip-10-169-170-124.ec2.internal --cores 1 --app-id application_1444274555723_0064 --user-class-path file:$PWD/__app__.jar 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
===============================================================================
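
The 1408 MB in the container requests above is simply the 1024 MB default heap plus YARN memory overhead: in Spark 1.5 the overhead defaults to the larger of 384 MB and 10% of the requested memory, so for 1 GB executors the 384 MB floor wins. A small sketch of that arithmetic (the constants mirror the 1.5 defaults; the function name is made up for illustration):

    // Container size asked of YARN = executor (or AM) memory + overhead,
    // where overhead defaults to max(384 MB, 0.10 * requested memory).
    def yarnContainerMb(requestedMb: Int): Int =
      requestedMb + math.max(384, (0.10 * requestedMb).toInt)

    yarnContainerMb(1024)  // 1024 + 384 = 1408, matching the log lines above
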
15/10/26 20:41:24 INFO ContainerManagementProtocolProxy: Opening proxy : ip-10-169-170-124.ec2.internal:8041
15/10/26 20:41:24 INFO ContainerManagementProtocolProxy: Opening proxy : ip-10-67-169-247.ec2.internal:8041
15/10/26 20:41:28 INFO ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. ip-10-169-170-124.ec2.internal:44676
15/10/26 20:41:28 INFO ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. ip-10-67-169-247.ec2.internal:58106
15/10/26 20:41:28 INFO YarnClusterSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@ip-10-169-170-124.ec2.internal:42939/user/Executor#-938790163]) with ID 2
15/10/26 20:41:29 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-169-170-124.ec2.internal:41776 with 535.0 MB RAM, BlockManagerId(2, ip-10-169-170-124.ec2.internal, 41776)
15/10/26 20:41:29 INFO YarnClusterSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@ip-10-67-169-247.ec2.internal:43171/user/Executor#612359746]) with ID 1
15/10/26 20:41:29 INFO YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
15/10/26 20:41:29 INFO YarnClusterScheduler: YarnClusterScheduler.postStartHook done
15/10/26 20:41:29 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-67-169-247.ec2.internal:37374 with 535.0 MB RAM, BlockManagerId(1, ip-10-67-169-247.ec2.internal, 37374)
15/10/26 20:41:30 INFO HiveContext: Initializing execution hive, version 1.2.1
15/10/26 20:41:30 INFO ClientWrapper: Inspected Hadoop version: 2.6.0-amzn-1
15/10/26 20:41:30 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0-amzn-1
15/10/26 20:41:31 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
15/10/26 20:41:31 INFO ObjectStore: ObjectStore, initialize called
15/10/26 20:41:31 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/10/26 20:41:31 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
15/10/26 20:41:33 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
15/10/26 20:41:35 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:41:35 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:41:36 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:41:36 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:41:37 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
15/10/26 20:41:37 INFO ObjectStore: Initialized ObjectStore
15/10/26 20:41:37 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/10/26 20:41:37 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/10/26 20:41:37 INFO HiveMetaStore: Added admin role in metastore
15/10/26 20:41:37 INFO HiveMetaStore: Added public role in metastore
15/10/26 20:41:37 INFO HiveMetaStore: No user is added in admin role, since config is empty
15/10/26 20:41:37 INFO HiveMetaStore: 0: get_all_databases
15/10/26 20:41:37 INFO audit: ugi=hadoop ip=unknown-ip-addr cmd=get_all_databases
15/10/26 20:41:37 INFO HiveMetaStore: 0: get_functions: db=default pat=*
15/10/26 20:41:37 INFO audit: ugi=hadoop ip=unknown-ip-addr cmd=get_functions: db=default pat=*
15/10/26 20:41:37 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:41:38 INFO SessionState: Created local directory: /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/container_1444274555723_0064_02_000001/tmp/yarn
15/10/26 20:41:38 INFO SessionState: Created local directory: /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/container_1444274555723_0064_02_000001/tmp/e22c2577-e1e5-43eb-a0ff-36c2aa2367f4_resources
15/10/26 20:41:38 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/e22c2577-e1e5-43eb-a0ff-36c2aa2367f4
15/10/26 20:41:38 INFO SessionState: Created local directory: /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/container_1444274555723_0064_02_000001/tmp/yarn/e22c2577-e1e5-43eb-a0ff-36c2aa2367f4
15/10/26 20:41:38 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/e22c2577-e1e5-43eb-a0ff-36c2aa2367f4/_tmp_space.db
15/10/26 20:41:38 INFO HiveContext: default warehouse location is /user/hive/warehouse
15/10/26 20:41:38 INFO HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
15/10/26 20:41:38 INFO ClientWrapper: Inspected Hadoop version: 2.4.0
15/10/26 20:41:38 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.4.0
15/10/26 20:41:38 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:41:38 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:41:39 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:41:39 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:41:39 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:41:39 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:41:39 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/10/26 20:41:39 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
15/10/26 20:41:39 INFO ObjectStore: ObjectStore, initialize called
15/10/26 20:41:39 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/10/26 20:41:39 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
15/10/26 20:41:41 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:41:41 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:41:41 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:41:41 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
15/10/26 20:41:42 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:41:42 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:41:43 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:41:43 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:41:44 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
15/10/26 20:41:44 INFO ObjectStore: Initialized ObjectStore
15/10/26 20:41:44 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/10/26 20:41:44 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/10/26 20:41:44 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:41:44 INFO HiveMetaStore: Added admin role in metastore
15/10/26 20:41:44 INFO HiveMetaStore: Added public role in metastore
15/10/26 20:41:44 INFO HiveMetaStore: No user is added in admin role, since config is empty
15/10/26 20:41:44 INFO HiveMetaStore: 0: get_all_databases
15/10/26 20:41:44 INFO audit: ugi=yarn ip=unknown-ip-addr cmd=get_all_databases
15/10/26 20:41:44 INFO HiveMetaStore: 0: get_functions: db=default pat=*
15/10/26 20:41:44 INFO audit: ugi=yarn ip=unknown-ip-addr cmd=get_functions: db=default pat=*
15/10/26 20:41:44 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
15/10/26 20:41:45 WARN Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.cluster.local.dir; Ignoring.
15/10/26 20:41:45 INFO SessionState: Created local directory: /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/container_1444274555723_0064_02_000001/tmp/9653ecf3-b9d4-4cd3-8278-fd10048bb9ae_resources
15/10/26 20:41:45 INFO SessionState: Created HDFS directory: /tmp/hive/yarn/9653ecf3-b9d4-4cd3-8278-fd10048bb9ae
15/10/26 20:41:45 INFO SessionState: Created local directory: /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/container_1444274555723_0064_02_000001/tmp/yarn/9653ecf3-b9d4-4cd3-8278-fd10048bb9ae
15/10/26 20:41:45 INFO SessionState: Created HDFS directory: /tmp/hive/yarn/9653ecf3-b9d4-4cd3-8278-fd10048bb9ae/_tmp_space.db
15/10/26 20:41:51 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
15/10/26 20:41:51 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
15/10/26 20:41:51 INFO deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
15/10/26 20:41:51 INFO deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
15/10/26 20:41:51 INFO deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
15/10/26 20:41:51 INFO SparkContext: Starting job: saveAsTextFile at CLIJob.scala:96
15/10/26 20:41:51 INFO DAGScheduler: Got job 0 (saveAsTextFile at CLIJob.scala:96) with 2 output partitions
15/10/26 20:41:51 INFO DAGScheduler: Final stage: ResultStage 0(saveAsTextFile at CLIJob.scala:96)
15/10/26 20:41:51 INFO DAGScheduler: Parents of final stage: List()
15/10/26 20:41:51 INFO DAGScheduler: Missing parents: List()
15/10/26 20:41:51 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at saveAsTextFile at CLIJob.scala:96), which has no missing parents
15/10/26 20:41:52 INFO MemoryStore: ensureFreeSpace(135712) called with curMem=0, maxMem=560993402
15/10/26 20:41:52 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 132.5 KB, free 534.9 MB)
15/10/26 20:41:52 INFO MemoryStore: ensureFreeSpace(47141) called with curMem=135712, maxMem=560993402
15/10/26 20:41:52 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 46.0 KB, free 534.8 MB)
15/10/26 20:41:52 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.169.170.124:32888 (size: 46.0 KB, free: 535.0 MB)
15/10/26 20:41:52 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:861
15/10/26 20:41:52 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at saveAsTextFile at CLIJob.scala:96)
15/10/26 20:41:52 INFO YarnClusterScheduler: Adding task set 0.0 with 2 tasks
15/10/26 20:41:52 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, ip-10-169-170-124.ec2.internal, PROCESS_LOCAL, 2087 bytes)
15/10/26 20:41:52 WARN TaskSetManager: Stage 0 contains a task of very large size (188 KB). The maximum recommended task size is 100 KB.
15/10/26 20:41:52 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 193533 bytes)
15/10/26 20:41:52 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-10-67-169-247.ec2.internal:37374 (size: 46.0 KB, free: 535.0 MB)
15/10/26 20:41:52 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-10-169-170-124.ec2.internal:41776 (size: 46.0 KB, free: 535.0 MB)
15/10/26 20:41:53 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1516 ms on ip-10-169-170-124.ec2.internal (1/2)
15/10/26 20:41:54 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 2026 ms on ip-10-67-169-247.ec2.internal (2/2)
15/10/26 20:41:54 INFO DAGScheduler: ResultStage 0 (saveAsTextFile at CLIJob.scala:96) finished in 2.050 s
15/10/26 20:41:54 INFO YarnClusterScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/10/26 20:41:54 INFO DAGScheduler: Job 0 finished: saveAsTextFile at CLIJob.scala:96, took 2.257813 s
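
The "very large task size" warnings above (tasks of roughly 188 KB against a recommended 100 KB, with one task launching at 193533 bytes) suggest that the feed fetched from the -j URL is parallelized on the driver, so the raw document travels inside the serialized task data. As a hedged sketch only, and assuming that reading of the log is correct, one mitigation is to ship the payload as a broadcast variable (or stage it in HDFS and read it with sc.textFile) instead of embedding it in each task; feedBody and its use below are illustrative names, not the actual CLIJob code:

    // Fetch once on the driver, then broadcast: shipped once per executor, not per task.
    val feedBody = scala.io.Source.fromURL("http://feed.theplatform.com/f/ngc/ngcngw-analytics").mkString
    val feedBc   = sc.broadcast(feedBody)

    // Tasks reference feedBc.value instead of carrying the payload in their own data.
    sc.parallelize(0 until 2, 2).map(_ => feedBc.value.length).collect()
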
15/10/26 20:41:54 INFO SparkContext: Starting job: json at CLIJob.scala:104
15/10/26 20:41:54 INFO DAGScheduler: Got job 1 (json at CLIJob.scala:104) with 2 output partitions
15/10/26 20:41:54 INFO DAGScheduler: Final stage: ResultStage 1(json at CLIJob.scala:104)
15/10/26 20:41:54 INFO DAGScheduler: Parents of final stage: List()
15/10/26 20:41:54 INFO DAGScheduler: Missing parents: List()
15/10/26 20:41:54 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[5] at json at CLIJob.scala:104), which has no missing parents
15/10/26 20:41:54 INFO MemoryStore: ensureFreeSpace(3776) called with curMem=182853, maxMem=560993402
15/10/26 20:41:54 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.7 KB, free 534.8 MB)
15/10/26 20:41:54 INFO MemoryStore: ensureFreeSpace(2070) called with curMem=186629, maxMem=560993402
15/10/26 20:41:54 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.0 KB, free 534.8 MB)
15/10/26 20:41:54 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 10.169.170.124:32888 (size: 2.0 KB, free: 535.0 MB)
15/10/26 20:41:54 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:861
15/10/26 20:41:54 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 1 (MapPartitionsRDD[5] at json at CLIJob.scala:104)
15/10/26 20:41:54 INFO YarnClusterScheduler: Adding task set 1.0 with 2 tasks
15/10/26 20:41:54 INFO ContextCleaner: Cleaned accumulator 1
15/10/26 20:41:54 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, ip-10-169-170-124.ec2.internal, PROCESS_LOCAL, 2087 bytes)
15/10/26 20:41:54 INFO BlockManagerInfo: Removed broadcast_0_piece0 on ip-10-67-169-247.ec2.internal:37374 in memory (size: 46.0 KB, free: 535.0 MB)
15/10/26 20:41:54 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 10.169.170.124:32888 in memory (size: 46.0 KB, free: 535.0 MB)
15/10/26 20:41:54 INFO BlockManagerInfo: Removed broadcast_0_piece0 on ip-10-169-170-124.ec2.internal:41776 in memory (size: 46.0 KB, free: 535.0 MB)
15/10/26 20:41:54 WARN TaskSetManager: Stage 1 contains a task of very large size (188 KB). The maximum recommended task size is 100 KB.
15/10/26 20:41:54 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 3, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 193533 bytes)
15/10/26 20:41:54 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on ip-10-169-170-124.ec2.internal:41776 (size: 2.0 KB, free: 535.0 MB)
15/10/26 20:41:54 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on ip-10-67-169-247.ec2.internal:37374 (size: 2.0 KB, free: 535.0 MB)
15/10/26 20:41:54 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 128 ms on ip-10-169-170-124.ec2.internal (1/2)
15/10/26 20:41:57 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 2660 ms on ip-10-67-169-247.ec2.internal (2/2)
15/10/26 20:41:57 INFO DAGScheduler: ResultStage 1 (json at CLIJob.scala:104) finished in 2.664 s
15/10/26 20:41:57 INFO YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
15/10/26 20:41:57 INFO DAGScheduler: Job 1 finished: json at CLIJob.scala:104, took 2.799319 s
15/10/26 20:41:57 INFO ContextCleaner: Cleaned accumulator 2
15/10/26 20:41:57 INFO BlockManagerInfo: Removed broadcast_1_piece0 on 10.169.170.124:32888 in memory (size: 2.0 KB, free: 535.0 MB)
15/10/26 20:41:57 INFO BlockManagerInfo: Removed broadcast_1_piece0 on ip-10-67-169-247.ec2.internal:37374 in memory (size: 2.0 KB, free: 535.0 MB)
15/10/26 20:41:57 INFO BlockManagerInfo: Removed broadcast_1_piece0 on ip-10-169-170-124.ec2.internal:41776 in memory (size: 2.0 KB, free: 535.0 MB)
15/10/26 20:41:57 INFO MemoryStore: ensureFreeSpace(93288) called with curMem=0, maxMem=560993402
15/10/26 20:41:57 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 91.1 KB, free 534.9 MB)
15/10/26 20:41:57 INFO MemoryStore: ensureFreeSpace(21698) called with curMem=93288, maxMem=560993402
15/10/26 20:41:57 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 21.2 KB, free 534.9 MB)
15/10/26 20:41:57 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 10.169.170.124:32888 (size: 21.2 KB, free: 535.0 MB)
15/10/26 20:41:57 INFO SparkContext: Created broadcast 2 from parquet at CLIJob.scala:108
15/10/26 20:41:57 INFO ParquetRelation: Using default output committer for Parquet: org.apache.parquet.hadoop.ParquetOutputCommitter
15/10/26 20:41:57 INFO DefaultWriterContainer: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
15/10/26 20:41:57 INFO SparkContext: Starting job: parquet at CLIJob.scala:108
15/10/26 20:41:57 INFO DAGScheduler: Got job 2 (parquet at CLIJob.scala:108) with 2 output partitions
15/10/26 20:41:57 INFO DAGScheduler: Final stage: ResultStage 2(parquet at CLIJob.scala:108)
15/10/26 20:41:57 INFO DAGScheduler: Parents of final stage: List()
15/10/26 20:41:57 INFO DAGScheduler: Missing parents: List()
15/10/26 20:41:57 INFO DAGScheduler: Submitting ResultStage 2 (MapPartitionsRDD[6] at parquet at CLIJob.scala:108), which has no missing parents
15/10/26 20:41:57 INFO MemoryStore: ensureFreeSpace(82904) called with curMem=114986, maxMem=560993402
15/10/26 20:41:57 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 81.0 KB, free 534.8 MB)
15/10/26 20:41:57 INFO MemoryStore: ensureFreeSpace(29338) called with curMem=197890, maxMem=560993402
15/10/26 20:41:57 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 28.7 KB, free 534.8 MB)
15/10/26 20:41:57 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 10.169.170.124:32888 (size: 28.7 KB, free: 535.0 MB)
15/10/26 20:41:57 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:861
15/10/26 20:41:57 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 2 (MapPartitionsRDD[6] at parquet at CLIJob.scala:108)
15/10/26 20:41:57 INFO YarnClusterScheduler: Adding task set 2.0 with 2 tasks
15/10/26 20:41:57 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 4, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 2087 bytes)
15/10/26 20:41:57 WARN TaskSetManager: Stage 2 contains a task of very large size (188 KB). The maximum recommended task size is 100 KB.
15/10/26 20:41:57 INFO TaskSetManager: Starting task 1.0 in stage 2.0 (TID 5, ip-10-169-170-124.ec2.internal, PROCESS_LOCAL, 193533 bytes)
15/10/26 20:41:57 INFO BlockManagerInfo: Removed broadcast_2_piece0 on 10.169.170.124:32888 in memory (size: 21.2 KB, free: 535.0 MB)
15/10/26 20:41:57 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on ip-10-67-169-247.ec2.internal:37374 (size: 28.7 KB, free: 535.0 MB)
15/10/26 20:41:57 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on ip-10-169-170-124.ec2.internal:41776 (size: 28.7 KB, free: 535.0 MB)
15/10/26 20:41:58 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 4) in 825 ms on ip-10-67-169-247.ec2.internal (1/2)
15/10/26 20:42:00 INFO TaskSetManager: Finished task 1.0 in stage 2.0 (TID 5) in 2657 ms on ip-10-169-170-124.ec2.internal (2/2)
15/10/26 20:42:00 INFO DAGScheduler: ResultStage 2 (parquet at CLIJob.scala:108) finished in 2.658 s
15/10/26 20:42:00 INFO YarnClusterScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool
15/10/26 20:42:00 INFO DAGScheduler: Job 2 finished: parquet at CLIJob.scala:108, took 2.699284 s
15/10/26 20:42:00 INFO DefaultWriterContainer: Job job_201510262041_0000 committed.
15/10/26 20:42:00 INFO ParquetRelation: Listing hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.parquet on driver
15/10/26 20:42:00 INFO ParquetRelation: Listing hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.parquet on driver
15/10/26 20:42:01 INFO ParseDriver: Parsing command: select x.id, x.title, x.description, x.mediaavailableDate as available_date, x.mediaexpirationDate as expiration_date, mediacategories.medianame as media_name, x.mediakeywords as keywords, mediaratings.scheme as rating_scheme, mediaratings.rating, cast(mediaratings.subRatings as String) as sub_ratings, content.plfileduration as duration, x.plmediaprovider as provider, x.ngccontentAdType as ad_type, x.ngcepisodeNumber as episode, ngcnetwork as network, x.ngcseasonNumber as season_number, x.ngcuID as ngc_uid, x.ngcvideoType as video_type from etl lateral view explode(entries) entries as x lateral view explode(x.mediacategories) cat as mediacategories lateral view explode(x.mediaratings) r as mediaratings lateral view explode(x.mediacontent) mediacontent as content lateral view outer explode(x.ngcnetwork) net as ngcnetwork
15/10/26 20:42:01 INFO ContextCleaner: Cleaned accumulator 3
15/10/26 20:42:01 INFO BlockManagerInfo: Removed broadcast_3_piece0 on 10.169.170.124:32888 in memory (size: 28.7 KB, free: 535.0 MB)
15/10/26 20:42:01 INFO BlockManagerInfo: Removed broadcast_3_piece0 on ip-10-169-170-124.ec2.internal:41776 in memory (size: 28.7 KB, free: 535.0 MB)
15/10/26 20:42:01 INFO BlockManagerInfo: Removed broadcast_3_piece0 on ip-10-67-169-247.ec2.internal:37374 in memory (size: 28.7 KB, free: 535.0 MB)
15/10/26 20:42:01 INFO ParseDriver: Parse Completed
15/10/26 20:42:02 INFO MemoryStore: ensureFreeSpace(243880) called with curMem=0, maxMem=560993402
15/10/26 20:42:02 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 238.2 KB, free 534.8 MB)
15/10/26 20:42:02 INFO MemoryStore: ensureFreeSpace(21698) called with curMem=243880, maxMem=560993402
15/10/26 20:42:02 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 21.2 KB, free 534.8 MB)
15/10/26 20:42:02 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on 10.169.170.124:32888 (size: 21.2 KB, free: 535.0 MB)
15/10/26 20:42:02 INFO SparkContext: Created broadcast 4 from cache at CLIJob.scala:114
15/10/26 20:42:02 INFO BlockManagerInfo: Removed broadcast_4_piece0 on 10.169.170.124:32888 in memory (size: 21.2 KB, free: 535.0 MB)
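
The call sites logged so far (saveAsTextFile at CLIJob.scala:96, json at :104, parquet at :108, the ParseDriver run over the lateral-view query, cache at :114, saveAsTextFile at :115) suggest roughly the pipeline below. This is only a reconstruction inferred from the log, not the actual CLIJob source; feedUrl, query, and every identifier not shown in the log are assumptions, while the output paths and the "etl" table name are taken from the log itself:

    // Hypothetical sketch of what CLIJob appears to do, inferred from the logged call sites.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object CLIJobSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("CLIJobSketch"))
        val sqlContext = new HiveContext(sc)

        val feedUrl = "http://feed.theplatform.com/f/ngc/ngcngw-analytics" // from the -j flag
        val query   = args(0)                                              // the -e SQL shown above

        // Fetch the feed on the driver and keep a raw copy (job 0: saveAsTextFile at :96).
        val raw = sc.parallelize(Seq(scala.io.Source.fromURL(feedUrl).mkString), 2)
        raw.saveAsTextFile("/tmp/ngcngw-analytics.original")

        // Infer a schema from the JSON feed (job 1: json at :104) and archive it as Parquet
        // (job 2: parquet at :108), then expose it to SQL as the "etl" table.
        val etl = sqlContext.read.json(raw)
        etl.write.parquet("/tmp/ngcngw-analytics.parquet")
        etl.registerTempTable("etl")

        // Run the lateral-view query, cache the result (:114), and dump it as text (:115) --
        // the step that fails in the log below.
        val result = sqlContext.sql(query).cache()
        result.map(_.mkString("\t")).saveAsTextFile("/tmp/ngcngw-analytics.tsv")
      }
    }
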
15/10/26 20:42:02 INFO SparkContext: Starting job: saveAsTextFile at CLIJob.scala:115
15/10/26 20:42:02 INFO DAGScheduler: Got job 3 (saveAsTextFile at CLIJob.scala:115) with 4 output partitions
15/10/26 20:42:02 INFO DAGScheduler: Final stage: ResultStage 3(saveAsTextFile at CLIJob.scala:115)
15/10/26 20:42:02 INFO DAGScheduler: Parents of final stage: List()
15/10/26 20:42:03 INFO DAGScheduler: Missing parents: List()
15/10/26 20:42:03 INFO DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[27] at saveAsTextFile at CLIJob.scala:115), which has no missing parents
15/10/26 20:42:03 INFO MemoryStore: ensureFreeSpace(170000) called with curMem=0, maxMem=560993402
15/10/26 20:42:03 INFO MemoryStore: Block broadcast_5 stored as values in memory (estimated size 166.0 KB, free 534.8 MB)
15/10/26 20:42:03 INFO MemoryStore: ensureFreeSpace(58865) called with curMem=170000, maxMem=560993402
15/10/26 20:42:03 INFO MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 57.5 KB, free 534.8 MB)
15/10/26 20:42:03 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on 10.169.170.124:32888 (size: 57.5 KB, free: 534.9 MB)
15/10/26 20:42:03 INFO SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:861
15/10/26 20:42:03 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 3 (MapPartitionsRDD[27] at saveAsTextFile at CLIJob.scala:115)
15/10/26 20:42:03 INFO YarnClusterScheduler: Adding task set 3.0 with 4 tasks
15/10/26 20:42:03 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 6, ip-10-169-170-124.ec2.internal, PROCESS_LOCAL, 2196 bytes)
15/10/26 20:42:03 INFO TaskSetManager: Starting task 1.0 in stage 3.0 (TID 7, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 2378 bytes)
15/10/26 20:42:03 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on ip-10-67-169-247.ec2.internal:37374 (size: 57.5 KB, free: 534.9 MB)
15/10/26 20:42:03 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on ip-10-169-170-124.ec2.internal:41776 (size: 57.5 KB, free: 534.9 MB)
15/10/26 20:42:03 INFO TaskSetManager: Starting task 2.0 in stage 3.0 (TID 8, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 2196 bytes)
15/10/26 20:42:03 INFO TaskSetManager: Finished task 1.0 in stage 3.0 (TID 7) in 283 ms on ip-10-67-169-247.ec2.internal (1/4)
15/10/26 20:42:03 WARN TaskSetManager: Stage 3 contains a task of very large size (189 KB). The maximum recommended task size is 100 KB.
15/10/26 20:42:03 INFO TaskSetManager: Starting task 3.0 in stage 3.0 (TID 9, ip-10-169-170-124.ec2.internal, PROCESS_LOCAL, 193642 bytes)
15/10/26 20:42:03 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 6) in 356 ms on ip-10-169-170-124.ec2.internal (2/4)
15/10/26 20:42:04 INFO BlockManagerInfo: Added rdd_21_0 in memory on ip-10-67-169-247.ec2.internal:37374 (size: 16.0 B, free: 534.9 MB)
15/10/26 20:42:05 INFO TaskSetManager: Finished task 2.0 in stage 3.0 (TID 8) in 1672 ms on ip-10-67-169-247.ec2.internal (3/4)
15/10/26 20:42:05 WARN TaskSetManager: Lost task 3.0 in stage 3.0 (TID 9, ip-10-169-170-124.ec2.internal): java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
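
Every retry of task 3 dies in the same place: DictionaryEncoding$Encoder.compress, invoked while the DataFrame cached at CLIJob.scala:114 is materialized into Spark SQL's in-memory columnar format (CacheManager.putInBlockManager building the InMemoryRelation). The missing key is a media URL from the feed, which points at a data-dependent problem in the columnar dictionary encoder rather than an infrastructure fault. Two low-risk experiments, offered only as a hedged sketch (sqlContext, query, and result follow the naming of the sketch above; spark.sql.inMemoryColumnarStorage.compressed is the standard Spark SQL property):

    // 1. Disable in-memory columnar compression so the dictionary encoder is not used
    //    when the cached relation is built.
    sqlContext.setConf("spark.sql.inMemoryColumnarStorage.compressed", "false")

    // 2. Or drop the cache() for this one-shot export; the query result is written only once,
    //    so caching buys nothing here.
    val result = sqlContext.sql(query)                       // no .cache()
    result.map(_.mkString("\t")).saveAsTextFile("/tmp/ngcngw-analytics.tsv")
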
15/10/26 20:42:05 INFO TaskSetManager: Starting task 3.1 in stage 3.0 (TID 10, ip-10-169-170-124.ec2.internal, PROCESS_LOCAL, 193642 bytes)
15/10/26 20:42:06 INFO TaskSetManager: Lost task 3.1 in stage 3.0 (TID 10) on executor ip-10-169-170-124.ec2.internal: java.util.NoSuchElementException (key not found: http://data.media.theplatform.com/media/data/Media/525991491676) [duplicate 1]
15/10/26 20:42:06 INFO TaskSetManager: Starting task 3.2 in stage 3.0 (TID 11, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 193642 bytes)
15/10/26 20:42:07 INFO TaskSetManager: Lost task 3.2 in stage 3.0 (TID 11) on executor ip-10-67-169-247.ec2.internal: java.util.NoSuchElementException (key not found: http://data.media.theplatform.com/media/data/Media/525991491676) [duplicate 2]
15/10/26 20:42:07 INFO TaskSetManager: Starting task 3.3 in stage 3.0 (TID 12, ip-10-67-169-247.ec2.internal, PROCESS_LOCAL, 193642 bytes)
15/10/26 20:42:07 INFO TaskSetManager: Lost task 3.3 in stage 3.0 (TID 12) on executor ip-10-67-169-247.ec2.internal: java.util.NoSuchElementException (key not found: http://data.media.theplatform.com/media/data/Media/525991491676) [duplicate 3]
15/10/26 20:42:07 ERROR TaskSetManager: Task 3 in stage 3.0 failed 4 times; aborting job
15/10/26 20:42:07 INFO YarnClusterScheduler: Removed TaskSet 3.0, whose tasks have all completed, from pool
15/10/26 20:42:07 INFO YarnClusterScheduler: Cancelling stage 3
15/10/26 20:42:07 INFO DAGScheduler: ResultStage 3 (saveAsTextFile at CLIJob.scala:115) failed in 4.293 s
15/10/26 20:42:07 INFO DAGScheduler: Job 3 failed: saveAsTextFile at CLIJob.scala:115, took 4.369250 s
15/10/26 20:42:07 ERROR ApplicationMaster: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 3.0 failed 4 times, most recent failure: Lost task 3.3 in stage 3.0 (TID 12, ip-10-67-169-247.ec2.internal): java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 3.0 failed 4 times, most recent failure: Lost task 3.3 in stage 3.0 (TID 12, ip-10-67-169-247.ec2.internal): java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1280)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1268)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1267)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1267)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1493)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1455)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1444)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1813)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1826)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1903)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1124)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1065)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1065)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1065)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply$mcV$sp(PairRDDFunctions.scala:989)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply(PairRDDFunctions.scala:965)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply(PairRDDFunctions.scala:965)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:965)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply$mcV$sp(PairRDDFunctions.scala:897)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply(PairRDDFunctions.scala:897)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply(PairRDDFunctions.scala:897)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:896)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply$mcV$sp(RDD.scala:1426)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1405)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1405)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1405)
at com.truex.prometheus.CLIJob$$anon$1.execute(CLIJob.scala:115)
at com.truex.prometheus.CLIJob$.main(CLIJob.scala:122)
at com.truex.prometheus.CLIJob.main(CLIJob.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:525)
Caused by: java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/26 20:42:07 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 3.0 failed 4 times, most recent failure: Lost task 3.3 in stage 3.0 (TID 12, ip-10-67-169-247.ec2.internal): java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:)
15/10/26 20:42:07 INFO SparkContext: Invoking stop() from shutdown hook
15/10/26 20:42:07 INFO SparkUI: Stopped Spark web UI at http://10.169.170.124:44780
15/10/26 20:42:07 INFO DAGScheduler: Stopping DAGScheduler
15/10/26 20:42:07 INFO YarnClusterSchedulerBackend: Shutting down all executors
15/10/26 20:42:07 INFO YarnClusterSchedulerBackend: Asking each executor to shut down
15/10/26 20:42:07 INFO ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. ip-10-169-170-124.ec2.internal:42939
15/10/26 20:42:07 INFO ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. ip-10-67-169-247.ec2.internal:43171
15/10/26 20:42:07 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
15/10/26 20:42:07 INFO MemoryStore: MemoryStore cleared
15/10/26 20:42:07 INFO BlockManager: BlockManager stopped
15/10/26 20:42:07 INFO BlockManagerMaster: BlockManagerMaster stopped
15/10/26 20:42:07 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
15/10/26 20:42:07 INFO SparkContext: Successfully stopped SparkContext
15/10/26 20:42:07 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/10/26 20:42:07 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 3.0 failed 4 times, most recent failure: Lost task 3.3 in stage 3.0 (TID 12, ip-10-67-169-247.ec2.internal): java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:)
15/10/26 20:42:07 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/10/26 20:42:07 INFO AMRMClientImpl: Waiting for application to be successfully unregistered.
15/10/26 20:42:07 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
15/10/26 20:42:07 INFO ApplicationMaster: Deleting staging directory .sparkStaging/application_1444274555723_0064
15/10/26 20:42:07 INFO ShutdownHookManager: Shutdown hook called
15/10/26 20:42:07 INFO ShutdownHookManager: Deleting directory /mnt/yarn/usercache/hadoop/appcache/application_1444274555723_0064/spark-70134e0e-f756-4a8e-9957-c26c4ee31a50
15/10/26 20:42:07 INFO ShutdownHookManager: Deleting directory /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/spark-ac3d61be-f8ae-4e20-815e-00b7815358e2
15/10/26 20:42:07 INFO ShutdownHookManager: Deleting directory /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/container_1444274555723_0064_02_000001/tmp/spark-bec2839f-bb7a-4fba-bf4c-82a158f7098b
LogType:stdout
Log Upload Time:26-Oct-2015 20:42:08
LogLength:0
Log Contents:
15/10/26 20:43:23 INFO metrics.MetricsSaver: MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60 clusterEngineCycleSec: 60 disableClusterEngine: false maxMemoryMb: 3072 maxInstanceCount: 500 lastModified: 1444274560440
15/10/26 20:43:23 INFO metrics.MetricsSaver: Created MetricsSaver j-2US4HNPLS1SJO:i-131cdec7:LogsCLI:17354 period:60 /mnt/var/em/raw/i-131cdec7_20151026_LogsCLI_17354_raw.bin
Container: container_1444274555723_0064_02_000002 on ip-10-67-169-247.ec2.internal_8041
=========================================================================================
LogType:stderr
Log Upload Time:26-Oct-2015 20:42:09
LogLength:19321
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/mnt1/yarn/usercache/hadoop/filecache/117/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/10/26 20:41:25 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/10/26 20:41:26 INFO spark.SecurityManager: Changing view acls to: yarn,hadoop
15/10/26 20:41:26 INFO spark.SecurityManager: Changing modify acls to: yarn,hadoop
15/10/26 20:41:26 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hadoop); users with modify permissions: Set(yarn, hadoop)
15/10/26 20:41:27 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/10/26 20:41:27 INFO Remoting: Starting remoting
15/10/26 20:41:28 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@ip-10-67-169-247.ec2.internal:58106]
15/10/26 20:41:28 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 58106.
15/10/26 20:41:28 INFO spark.SecurityManager: Changing view acls to: yarn,hadoop
15/10/26 20:41:28 INFO spark.SecurityManager: Changing modify acls to: yarn,hadoop
15/10/26 20:41:28 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hadoop); users with modify permissions: Set(yarn, hadoop)
15/10/26 20:41:28 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/10/26 20:41:28 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/10/26 20:41:28 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/10/26 20:41:28 INFO Remoting: Starting remoting
15/10/26 20:41:28 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
15/10/26 20:41:28 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutor@ip-10-67-169-247.ec2.internal:43171]
15/10/26 20:41:28 INFO util.Utils: Successfully started service 'sparkExecutor' on port 43171.
15/10/26 20:41:28 INFO storage.DiskBlockManager: Created local directory at /mnt/yarn/usercache/hadoop/appcache/application_1444274555723_0064/blockmgr-226ebb11-5777-4c53-8e24-3b2d97150051
15/10/26 20:41:28 INFO storage.DiskBlockManager: Created local directory at /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/blockmgr-7bf4bad9-f097-487c-a301-1874cf677bb4
15/10/26 20:41:28 INFO storage.MemoryStore: MemoryStore started with capacity 535.0 MB
15/10/26 20:41:29 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://sparkDriver@10.169.170.124:39575/user/CoarseGrainedScheduler
15/10/26 20:41:29 INFO executor.CoarseGrainedExecutorBackend: Successfully registered with driver
15/10/26 20:41:29 INFO executor.Executor: Starting executor ID 1 on host ip-10-67-169-247.ec2.internal
15/10/26 20:41:29 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 37374.
15/10/26 20:41:29 INFO netty.NettyBlockTransferService: Server created on 37374
15/10/26 20:41:29 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/10/26 20:41:29 INFO storage.BlockManagerMaster: Registered BlockManager
15/10/26 20:41:29 INFO storage.BlockManager: Registering executor with local external shuffle service.
15/10/26 20:41:52 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 1
15/10/26 20:41:52 INFO executor.Executor: Running task 1.0 in stage 0.0 (TID 1)
15/10/26 20:41:52 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 0
15/10/26 20:41:52 INFO storage.MemoryStore: ensureFreeSpace(47141) called with curMem=0, maxMem=560993402
15/10/26 20:41:52 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 46.0 KB, free 535.0 MB)
15/10/26 20:41:52 INFO broadcast.TorrentBroadcast: Reading broadcast variable 0 took 242 ms
15/10/26 20:41:52 INFO storage.MemoryStore: ensureFreeSpace(135712) called with curMem=47141, maxMem=560993402
15/10/26 20:41:52 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 132.5 KB, free 534.8 MB)
15/10/26 20:41:52 INFO Configuration.deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
15/10/26 20:41:52 INFO Configuration.deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
15/10/26 20:41:52 INFO Configuration.deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
15/10/26 20:41:52 INFO Configuration.deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
15/10/26 20:41:52 INFO Configuration.deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
15/10/26 20:41:53 INFO metrics.MetricsSaver: MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60 clusterEngineCycleSec: 60 disableClusterEngine: false maxMemoryMb: 3072 maxInstanceCount: 500 lastModified: 1444274560440
15/10/26 20:41:53 INFO metrics.MetricsSaver: Created MetricsSaver j-2US4HNPLS1SJO:i-021cded6:CoarseGrainedExecutorBackend:16941 period:60 /mnt/var/em/raw/i-021cded6_20151026_CoarseGrainedExecutorBackend_16941_raw.bin
15/10/26 20:41:54 INFO output.FileOutputCommitter: Saved output of task 'attempt_201510262041_0000_m_000001_1' to hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.original/_temporary/0/task_201510262041_0000_m_000001
15/10/26 20:41:54 INFO mapred.SparkHadoopMapRedUtil: attempt_201510262041_0000_m_000001_1: Committed
15/10/26 20:41:54 INFO executor.Executor: Finished task 1.0 in stage 0.0 (TID 1). 1884 bytes result sent to driver
15/10/26 20:41:54 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 3
15/10/26 20:41:54 INFO executor.Executor: Running task 1.0 in stage 1.0 (TID 3)
15/10/26 20:41:54 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 1
15/10/26 20:41:54 INFO storage.MemoryStore: ensureFreeSpace(2070) called with curMem=0, maxMem=560993402
15/10/26 20:41:54 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.0 KB, free 535.0 MB)
15/10/26 20:41:54 INFO broadcast.TorrentBroadcast: Reading broadcast variable 1 took 72 ms
15/10/26 20:41:54 INFO storage.MemoryStore: ensureFreeSpace(3776) called with curMem=2070, maxMem=560993402
15/10/26 20:41:54 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.7 KB, free 535.0 MB)
15/10/26 20:41:57 INFO executor.Executor: Finished task 1.0 in stage 1.0 (TID 3). 6394 bytes result sent to driver
15/10/26 20:41:57 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4
15/10/26 20:41:57 INFO executor.Executor: Running task 0.0 in stage 2.0 (TID 4)
15/10/26 20:41:57 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 3
15/10/26 20:41:57 INFO storage.MemoryStore: ensureFreeSpace(29338) called with curMem=0, maxMem=560993402
15/10/26 20:41:57 INFO storage.MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 28.7 KB, free 535.0 MB)
15/10/26 20:41:57 INFO broadcast.TorrentBroadcast: Reading broadcast variable 3 took 16 ms
15/10/26 20:41:57 INFO storage.MemoryStore: ensureFreeSpace(82904) called with curMem=29338, maxMem=560993402
15/10/26 20:41:57 INFO storage.MemoryStore: Block broadcast_3 stored as values in memory (estimated size 81.0 KB, free 534.9 MB)
15/10/26 20:41:58 INFO datasources.DefaultWriterContainer: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
15/10/26 20:41:58 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
15/10/26 20:41:58 INFO compress.CodecPool: Got brand-new compressor [.gz]
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
15/10/26 20:41:58 INFO output.FileOutputCommitter: Saved output of task 'attempt_201510262041_0002_m_000000_0' to hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.parquet/_temporary/0/task_201510262041_0002_m_000000
15/10/26 20:41:58 INFO mapred.SparkHadoopMapRedUtil: attempt_201510262041_0002_m_000000_0: Committed
15/10/26 20:41:58 INFO executor.Executor: Finished task 0.0 in stage 2.0 (TID 4). 935 bytes result sent to driver
15/10/26 20:42:03 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 7
15/10/26 20:42:03 INFO executor.Executor: Running task 1.0 in stage 3.0 (TID 7)
15/10/26 20:42:03 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 5
15/10/26 20:42:03 INFO storage.MemoryStore: ensureFreeSpace(58865) called with curMem=0, maxMem=560993402
15/10/26 20:42:03 INFO storage.MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 57.5 KB, free 534.9 MB)
15/10/26 20:42:03 INFO broadcast.TorrentBroadcast: Reading broadcast variable 5 took 15 ms
15/10/26 20:42:03 INFO storage.MemoryStore: ensureFreeSpace(170000) called with curMem=58865, maxMem=560993402
15/10/26 20:42:03 INFO storage.MemoryStore: Block broadcast_5 stored as values in memory (estimated size 166.0 KB, free 534.8 MB)
15/10/26 20:42:03 INFO output.FileOutputCommitter: Saved output of task 'attempt_201510262042_0003_m_000001_7' to hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.tsv/_temporary/0/task_201510262042_0003_m_000001
15/10/26 20:42:03 INFO mapred.SparkHadoopMapRedUtil: attempt_201510262042_0003_m_000001_7: Committed
15/10/26 20:42:03 INFO executor.Executor: Finished task 1.0 in stage 3.0 (TID 7). 2310 bytes result sent to driver
15/10/26 20:42:03 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 8
15/10/26 20:42:03 INFO executor.Executor: Running task 2.0 in stage 3.0 (TID 8)
15/10/26 20:42:03 INFO spark.CacheManager: Partition rdd_21_0 not found, computing it
15/10/26 20:42:03 INFO codegen.GenerateUnsafeProjection: Code generated in 421.829475 ms
15/10/26 20:42:04 INFO codegen.GenerateSafeProjection: Code generated in 93.378192 ms
15/10/26 20:42:04 INFO codegen.GenerateUnsafeProjection: Code generated in 408.928592 ms
15/10/26 20:42:04 INFO codegen.GenerateSafeProjection: Code generated in 54.541443 ms
15/10/26 20:42:04 INFO codegen.GenerateUnsafeProjection: Code generated in 189.240094 ms
15/10/26 20:42:04 INFO codegen.GenerateSafeProjection: Code generated in 42.07393 ms
15/10/26 20:42:04 INFO codegen.GenerateUnsafeProjection: Code generated in 88.93472 ms
15/10/26 20:42:04 INFO storage.MemoryStore: ensureFreeSpace(16) called with curMem=228865, maxMem=560993402
15/10/26 20:42:04 INFO storage.MemoryStore: Block rdd_21_0 stored as values in memory (estimated size 16.0 B, free 534.8 MB)
15/10/26 20:42:04 INFO codegen.GeneratePredicate: Code generated in 5.99599 ms
15/10/26 20:42:05 INFO output.FileOutputCommitter: Saved output of task 'attempt_201510262042_0003_m_000002_8' to hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.tsv/_temporary/0/task_201510262042_0003_m_000002
15/10/26 20:42:05 INFO mapred.SparkHadoopMapRedUtil: attempt_201510262042_0003_m_000002_8: Committed
15/10/26 20:42:05 INFO executor.Executor: Finished task 2.0 in stage 3.0 (TID 8). 2890 bytes result sent to driver
15/10/26 20:42:06 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 11
15/10/26 20:42:06 INFO executor.Executor: Running task 3.2 in stage 3.0 (TID 11)
15/10/26 20:42:06 INFO spark.CacheManager: Partition rdd_21_1 not found, computing it
15/10/26 20:42:07 ERROR executor.Executor: Exception in task 3.2 in stage 3.0 (TID 11)
java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/26 20:42:07 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 12
15/10/26 20:42:07 INFO executor.Executor: Running task 3.3 in stage 3.0 (TID 12)
15/10/26 20:42:07 INFO spark.CacheManager: Partition rdd_21_1 not found, computing it
15/10/26 20:42:07 ERROR executor.Executor: Exception in task 3.3 in stage 3.0 (TID 12)
java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/26 20:42:07 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown
15/10/26 20:42:07 INFO storage.MemoryStore: MemoryStore cleared
15/10/26 20:42:07 INFO storage.BlockManager: BlockManager stopped
15/10/26 20:42:07 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/10/26 20:42:07 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/10/26 20:42:07 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
15/10/26 20:42:07 INFO util.ShutdownHookManager: Shutdown hook called
LogType:stdout
Log Upload Time:26-Oct-2015 20:42:09
LogLength:4212
Log Contents:
2015-10-26T20:41:27.665+0000: [GC [1 CMS-initial-mark: 0K(707840K)] 256519K(1014528K), 0.0490610 secs] [Times: user=0.05 sys=0.00, real=0.06 secs]
2015-10-26T20:41:27.741+0000: [CMS-concurrent-mark: 0.025/0.026 secs] [Times: user=0.03 sys=0.02, real=0.02 secs]
2015-10-26T20:41:27.770+0000: [GC2015-10-26T20:41:27.770+0000: [ParNew: 272640K->17632K(306688K), 0.0270940 secs] 272640K->17632K(1014528K), 0.0271760 secs] [Times: user=0.05 sys=0.02, real=0.03 secs]
2015-10-26T20:41:27.797+0000: [CMS-concurrent-preclean: 0.022/0.056 secs] [Times: user=0.10 sys=0.03, real=0.06 secs]
2015-10-26T20:41:29.093+0000: [CMS-concurrent-abortable-preclean: 1.003/1.296 secs] [Times: user=2.77 sys=0.44, real=1.29 secs]
2015-10-26T20:41:29.093+0000: [GC[YG occupancy: 166245 K (306688 K)]2015-10-26T20:41:29.093+0000: [Rescan (parallel) , 0.0167640 secs]2015-10-26T20:41:29.110+0000: [weak refs processing, 0.0000400 secs]2015-10-26T20:41:29.110+0000: [class unloading, 0.0030270 secs]2015-10-26T20:41:29.113+0000: [scrub symbol table, 0.0043010 secs]2015-10-26T20:41:29.117+0000: [scrub string table, 0.0003570 secs] [1 CMS-remark: 0K(707840K)] 166245K(1014528K), 0.0248900 secs] [Times: user=0.08 sys=0.00, real=0.03 secs]
2015-10-26T20:41:29.123+0000: [CMS-concurrent-sweep: 0.004/0.005 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
2015-10-26T20:41:29.148+0000: [CMS-concurrent-reset: 0.025/0.025 secs] [Times: user=0.04 sys=0.03, real=0.03 secs]
2015-10-26T20:41:52.361+0000: [GC2015-10-26T20:41:52.361+0000: [ParNew: 278442K->30836K(306688K), 0.0710210 secs] 278442K->36719K(1014528K), 0.0710890 secs] [Times: user=0.17 sys=0.04, real=0.07 secs]
2015-10-26T20:41:53.295+0000: [GC [1 CMS-initial-mark: 5883K(707840K)] 123655K(1014528K), 0.0252480 secs] [Times: user=0.04 sys=0.00, real=0.02 secs]
2015-10-26T20:41:53.376+0000: [CMS-concurrent-mark: 0.047/0.055 secs] [Times: user=0.08 sys=0.01, real=0.06 secs]
2015-10-26T20:41:53.399+0000: [CMS-concurrent-preclean: 0.017/0.023 secs] [Times: user=0.05 sys=0.00, real=0.02 secs]
2015-10-26T20:41:56.338+0000: [GC2015-10-26T20:41:56.338+0000: [ParNew: 303476K->34048K(306688K), 0.0510070 secs] 309359K->64733K(1014528K), 0.0510870 secs] [Times: user=0.11 sys=0.04, real=0.05 secs]
CMS: abort preclean due to time 2015-10-26T20:41:58.417+0000: [CMS-concurrent-abortable-preclean: 3.200/5.019 secs] [Times: user=8.33 sys=1.06, real=5.02 secs]
2015-10-26T20:41:58.418+0000: [GC[YG occupancy: 110817 K (306688 K)]2015-10-26T20:41:58.418+0000: [Rescan (parallel) , 0.0113940 secs]2015-10-26T20:41:58.429+0000: [weak refs processing, 0.0000530 secs]2015-10-26T20:41:58.429+0000: [class unloading, 0.0099670 secs]2015-10-26T20:41:58.439+0000: [scrub symbol table, 0.0064180 secs]2015-10-26T20:41:58.446+0000: [scrub string table, 0.0004890 secs] [1 CMS-remark: 30685K(707840K)] 141503K(1014528K), 0.0288120 secs] [Times: user=0.06 sys=0.00, real=0.03 secs]
2015-10-26T20:41:58.467+0000: [CMS-concurrent-sweep: 0.019/0.020 secs] [Times: user=0.04 sys=0.00, real=0.02 secs]
2015-10-26T20:41:58.470+0000: [CMS-concurrent-reset: 0.003/0.003 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2015-10-26T20:42:04.373+0000: [GC2015-10-26T20:42:04.373+0000: [ParNew: 306688K->31381K(306688K), 0.0486970 secs] 337319K->83202K(1014528K), 0.0487840 secs] [Times: user=0.14 sys=0.03, real=0.05 secs]
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.codec.CodecConfig: Compression: GZIP
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Parquet block size to 134217728
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Parquet page size to 1048576
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Parquet dictionary page size to 1048576
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Dictionary is on
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Validation is off
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Writer version is: PARQUET_1_0
Oct 26, 2015 8:41:58 PM INFO: org.apache.parquet.hadoop.InternalParquetRecordWriter: Flushing mem columnStore to file. allocated memory: 65,568
Container: container_1444274555723_0064_01_000002 on ip-10-67-169-247.ec2.internal_8041
=========================================================================================
LogType:stderr
Log Upload Time:26-Oct-2015 20:42:09
LogLength:22880
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/mnt1/yarn/usercache/hadoop/filecache/117/spark-assembly-1.5.0-hadoop2.6.0-amzn-1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/10/26 20:40:32 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/10/26 20:40:32 INFO spark.SecurityManager: Changing view acls to: yarn,hadoop
15/10/26 20:40:32 INFO spark.SecurityManager: Changing modify acls to: yarn,hadoop
15/10/26 20:40:32 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hadoop); users with modify permissions: Set(yarn, hadoop)
15/10/26 20:40:33 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/10/26 20:40:33 INFO Remoting: Starting remoting
15/10/26 20:40:34 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@ip-10-67-169-247.ec2.internal:49653]
15/10/26 20:40:34 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 49653.
15/10/26 20:40:34 INFO spark.SecurityManager: Changing view acls to: yarn,hadoop
15/10/26 20:40:34 INFO spark.SecurityManager: Changing modify acls to: yarn,hadoop
15/10/26 20:40:34 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hadoop); users with modify permissions: Set(yarn, hadoop)
15/10/26 20:40:35 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/10/26 20:40:35 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/10/26 20:40:35 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/10/26 20:40:35 INFO Remoting: Starting remoting
15/10/26 20:40:35 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutor@ip-10-67-169-247.ec2.internal:45434]
15/10/26 20:40:35 INFO util.Utils: Successfully started service 'sparkExecutor' on port 45434.
15/10/26 20:40:35 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
15/10/26 20:40:35 INFO storage.DiskBlockManager: Created local directory at /mnt/yarn/usercache/hadoop/appcache/application_1444274555723_0064/blockmgr-8400987e-8021-4b1a-a965-238ec6e1f253
15/10/26 20:40:35 INFO storage.DiskBlockManager: Created local directory at /mnt1/yarn/usercache/hadoop/appcache/application_1444274555723_0064/blockmgr-6f95ef44-3ba9-47f4-a896-49537ccc391a
15/10/26 20:40:35 INFO storage.MemoryStore: MemoryStore started with capacity 535.0 MB
15/10/26 20:40:35 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://sparkDriver@10.169.170.124:53152/user/CoarseGrainedScheduler
15/10/26 20:40:35 INFO executor.CoarseGrainedExecutorBackend: Successfully registered with driver
15/10/26 20:40:35 INFO executor.Executor: Starting executor ID 1 on host ip-10-67-169-247.ec2.internal
15/10/26 20:40:35 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 37713.
15/10/26 20:40:35 INFO netty.NettyBlockTransferService: Server created on 37713
15/10/26 20:40:35 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/10/26 20:40:35 INFO storage.BlockManagerMaster: Registered BlockManager
15/10/26 20:40:35 INFO storage.BlockManager: Registering executor with local external shuffle service.
15/10/26 20:41:00 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 1
15/10/26 20:41:00 INFO executor.Executor: Running task 1.0 in stage 0.0 (TID 1)
15/10/26 20:41:00 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 0
15/10/26 20:41:00 INFO storage.MemoryStore: ensureFreeSpace(47141) called with curMem=0, maxMem=560993402
15/10/26 20:41:00 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 46.0 KB, free 535.0 MB)
15/10/26 20:41:00 INFO broadcast.TorrentBroadcast: Reading broadcast variable 0 took 264 ms
15/10/26 20:41:00 INFO storage.MemoryStore: ensureFreeSpace(135712) called with curMem=47141, maxMem=560993402
15/10/26 20:41:00 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 132.5 KB, free 534.8 MB)
15/10/26 20:41:01 INFO Configuration.deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
15/10/26 20:41:01 INFO Configuration.deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
15/10/26 20:41:01 INFO Configuration.deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
15/10/26 20:41:01 INFO Configuration.deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
15/10/26 20:41:01 INFO Configuration.deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
15/10/26 20:41:01 INFO metrics.MetricsSaver: MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60 clusterEngineCycleSec: 60 disableClusterEngine: false maxMemoryMb: 3072 maxInstanceCount: 500 lastModified: 1444274560440
15/10/26 20:41:01 INFO metrics.MetricsSaver: Created MetricsSaver j-2US4HNPLS1SJO:i-021cded6:CoarseGrainedExecutorBackend:16762 period:60 /mnt/var/em/raw/i-021cded6_20151026_CoarseGrainedExecutorBackend_16762_raw.bin
15/10/26 20:41:02 INFO output.FileOutputCommitter: Saved output of task 'attempt_201510262041_0000_m_000001_1' to hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.original/_temporary/0/task_201510262041_0000_m_000001
15/10/26 20:41:02 INFO mapred.SparkHadoopMapRedUtil: attempt_201510262041_0000_m_000001_1: Committed
15/10/26 20:41:02 INFO executor.Executor: Finished task 1.0 in stage 0.0 (TID 1). 1884 bytes result sent to driver
15/10/26 20:41:02 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 2
15/10/26 20:41:02 INFO executor.Executor: Running task 0.0 in stage 1.0 (TID 2)
15/10/26 20:41:02 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 1
15/10/26 20:41:02 INFO storage.MemoryStore: ensureFreeSpace(2070) called with curMem=0, maxMem=560993402
15/10/26 20:41:02 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.0 KB, free 535.0 MB)
15/10/26 20:41:02 INFO broadcast.TorrentBroadcast: Reading broadcast variable 1 took 21 ms
15/10/26 20:41:02 INFO storage.MemoryStore: ensureFreeSpace(3776) called with curMem=2070, maxMem=560993402
15/10/26 20:41:02 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.7 KB, free 535.0 MB)
15/10/26 20:41:02 INFO executor.Executor: Finished task 0.0 in stage 1.0 (TID 2). 1615 bytes result sent to driver
15/10/26 20:41:05 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 5
15/10/26 20:41:05 INFO executor.Executor: Running task 1.0 in stage 2.0 (TID 5)
15/10/26 20:41:05 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 3
15/10/26 20:41:05 INFO storage.MemoryStore: ensureFreeSpace(29341) called with curMem=0, maxMem=560993402
15/10/26 20:41:05 INFO storage.MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 28.7 KB, free 535.0 MB)
15/10/26 20:41:05 INFO broadcast.TorrentBroadcast: Reading broadcast variable 3 took 13 ms
15/10/26 20:41:05 INFO storage.MemoryStore: ensureFreeSpace(82904) called with curMem=29341, maxMem=560993402
15/10/26 20:41:05 INFO storage.MemoryStore: Block broadcast_3 stored as values in memory (estimated size 81.0 KB, free 534.9 MB)
15/10/26 20:41:07 INFO datasources.DefaultWriterContainer: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
15/10/26 20:41:07 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
15/10/26 20:41:07 INFO compress.CodecPool: Got brand-new compressor [.gz]
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
15/10/26 20:41:08 INFO output.FileOutputCommitter: Saved output of task 'attempt_201510262041_0002_m_000001_0' to hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.parquet/_temporary/0/task_201510262041_0002_m_000001
15/10/26 20:41:08 INFO mapred.SparkHadoopMapRedUtil: attempt_201510262041_0002_m_000001_0: Committed
15/10/26 20:41:08 INFO executor.Executor: Finished task 1.0 in stage 2.0 (TID 5). 935 bytes result sent to driver
15/10/26 20:41:11 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 6
15/10/26 20:41:11 INFO executor.Executor: Running task 0.0 in stage 3.0 (TID 6)
15/10/26 20:41:11 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 5
15/10/26 20:41:11 INFO storage.MemoryStore: ensureFreeSpace(58865) called with curMem=0, maxMem=560993402
15/10/26 20:41:11 INFO storage.MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 57.5 KB, free 534.9 MB)
15/10/26 20:41:11 INFO broadcast.TorrentBroadcast: Reading broadcast variable 5 took 12 ms
15/10/26 20:41:11 INFO storage.MemoryStore: ensureFreeSpace(170000) called with curMem=58865, maxMem=560993402
15/10/26 20:41:11 INFO storage.MemoryStore: Block broadcast_5 stored as values in memory (estimated size 166.0 KB, free 534.8 MB)
15/10/26 20:41:12 INFO output.FileOutputCommitter: Saved output of task 'attempt_201510262041_0003_m_000000_6' to hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.tsv/_temporary/0/task_201510262041_0003_m_000000
15/10/26 20:41:12 INFO mapred.SparkHadoopMapRedUtil: attempt_201510262041_0003_m_000000_6: Committed
15/10/26 20:41:12 INFO executor.Executor: Finished task 0.0 in stage 3.0 (TID 6). 2310 bytes result sent to driver
15/10/26 20:41:12 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 8
15/10/26 20:41:12 INFO executor.Executor: Running task 2.0 in stage 3.0 (TID 8)
15/10/26 20:41:12 INFO spark.CacheManager: Partition rdd_21_0 not found, computing it
15/10/26 20:41:12 INFO codegen.GenerateUnsafeProjection: Code generated in 413.294749 ms
15/10/26 20:41:12 INFO codegen.GenerateSafeProjection: Code generated in 89.654254 ms
15/10/26 20:41:13 INFO codegen.GenerateUnsafeProjection: Code generated in 276.535225 ms
15/10/26 20:41:13 INFO codegen.GenerateSafeProjection: Code generated in 34.850834 ms
15/10/26 20:41:13 INFO codegen.GenerateUnsafeProjection: Code generated in 166.601785 ms
15/10/26 20:41:13 INFO codegen.GenerateSafeProjection: Code generated in 67.975471 ms
15/10/26 20:41:13 INFO codegen.GenerateUnsafeProjection: Code generated in 96.271111 ms
15/10/26 20:41:13 INFO storage.MemoryStore: ensureFreeSpace(16) called with curMem=228865, maxMem=560993402
15/10/26 20:41:13 INFO storage.MemoryStore: Block rdd_21_0 stored as values in memory (estimated size 16.0 B, free 534.8 MB)
15/10/26 20:41:13 INFO codegen.GeneratePredicate: Code generated in 6.680021 ms
15/10/26 20:41:13 INFO output.FileOutputCommitter: Saved output of task 'attempt_201510262041_0003_m_000002_8' to hdfs://ip-10-65-200-150.ec2.internal:8020/tmp/ngcngw-analytics.tsv/_temporary/0/task_201510262041_0003_m_000002
15/10/26 20:41:13 INFO mapred.SparkHadoopMapRedUtil: attempt_201510262041_0003_m_000002_8: Committed
15/10/26 20:41:13 INFO executor.Executor: Finished task 2.0 in stage 3.0 (TID 8). 2890 bytes result sent to driver
15/10/26 20:41:14 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 10
15/10/26 20:41:14 INFO executor.Executor: Running task 3.1 in stage 3.0 (TID 10)
15/10/26 20:41:14 INFO spark.CacheManager: Partition rdd_21_1 not found, computing it
15/10/26 20:41:15 ERROR executor.Executor: Exception in task 3.1 in stage 3.0 (TID 10)
java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/26 20:41:15 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 11
15/10/26 20:41:15 INFO executor.Executor: Running task 3.2 in stage 3.0 (TID 11)
15/10/26 20:41:15 INFO spark.CacheManager: Partition rdd_21_1 not found, computing it
15/10/26 20:41:15 ERROR executor.Executor: Exception in task 3.2 in stage 3.0 (TID 11)
java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/26 20:41:15 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 12
15/10/26 20:41:15 INFO executor.Executor: Running task 3.3 in stage 3.0 (TID 12)
15/10/26 20:41:15 INFO spark.CacheManager: Partition rdd_21_1 not found, computing it
15/10/26 20:41:15 ERROR executor.Executor: Exception in task 3.3 in stage 3.0 (TID 12)
java.util.NoSuchElementException: key not found: http://data.media.theplatform.com/media/data/Media/525991491676
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/26 20:41:15 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown
15/10/26 20:41:15 INFO storage.MemoryStore: MemoryStore cleared
15/10/26 20:41:15 INFO storage.BlockManager: BlockManager stopped
15/10/26 20:41:15 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/10/26 20:41:15 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/10/26 20:41:15 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
15/10/26 20:41:15 INFO util.ShutdownHookManager: Shutdown hook called
LogType:stdout
Log Upload Time:26-Oct-2015 20:42:09
LogLength:15575
Log Contents:
2015-10-26T20:40:33.956+0000: [GC [1 CMS-initial-mark: 0K(707840K)] 251060K(1014528K), 0.0494220 secs] [Times: user=0.04 sys=0.00, real=0.04 secs]
2015-10-26T20:40:34.040+0000: [CMS-concurrent-mark: 0.032/0.034 secs] [Times: user=0.04 sys=0.02, real=0.04 secs]
2015-10-26T20:40:34.070+0000: [GC2015-10-26T20:40:34.070+0000: [ParNew: 272640K->17627K(306688K), 0.0271620 secs] 272640K->17627K(1014528K), 0.0272360 secs] [Times: user=0.04 sys=0.02, real=0.03 secs]
2015-10-26T20:40:34.097+0000: [CMS-concurrent-preclean: 0.025/0.058 secs] [Times: user=0.10 sys=0.03, real=0.06 secs]
2015-10-26T20:40:35.522+0000: [CMS-concurrent-abortable-preclean: 1.038/1.424 secs] [Times: user=2.75 sys=0.67, real=1.42 secs]
2015-10-26T20:40:35.522+0000: [GC[YG occupancy: 172672 K (306688 K)]2015-10-26T20:40:35.522+0000: [Rescan (parallel) , 0.0125850 secs]2015-10-26T20:40:35.535+0000: [weak refs processing, 0.0000340 secs]2015-10-26T20:40:35.535+0000: [class unloading, 0.0023690 secs]2015-10-26T20:40:35.537+0000: [scrub symbol table, 0.0032500 secs]2015-10-26T20:40:35.541+0000: [scrub string table, 0.0002630 secs] [1 CMS-remark: 0K(707840K)] 172672K(1014528K), 0.0188170 secs] [Times: user=0.05 sys=0.01, real=0.02 secs]
2015-10-26T20:40:35.548+0000: [CMS-concurrent-sweep: 0.006/0.007 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
2015-10-26T20:40:35.580+0000: [CMS-concurrent-reset: 0.032/0.032 secs] [Times: user=0.03 sys=0.03, real=0.03 secs]
2015-10-26T20:41:00.677+0000: [GC2015-10-26T20:41:00.677+0000: [ParNew: 275053K->30835K(306688K), 0.0708730 secs] 275053K->36721K(1014528K), 0.0709460 secs] [Times: user=0.16 sys=0.06, real=0.07 secs]
2015-10-26T20:41:01.815+0000: [GC [1 CMS-initial-mark: 5886K(707840K)] 142692K(1014528K), 0.0237070 secs] [Times: user=0.02 sys=0.00, real=0.02 secs]
2015-10-26T20:41:01.909+0000: [CMS-concurrent-mark: 0.064/0.070 secs] [Times: user=0.12 sys=0.01, real=0.07 secs]
2015-10-26T20:41:01.948+0000: [CMS-concurrent-preclean: 0.030/0.039 secs] [Times: user=0.06 sys=0.02, real=0.04 secs]
CMS: abort preclean due to time 2015-10-26T20:41:06.975+0000: [CMS-concurrent-abortable-preclean: 1.945/5.028 secs] [Times: user=3.29 sys=0.65, real=5.03 secs]
2015-10-26T20:41:06.976+0000: [GC[YG occupancy: 244452 K (306688 K)]2015-10-26T20:41:06.976+0000: [Rescan (parallel) , 0.0482870 secs]2015-10-26T20:41:07.024+0000: [weak refs processing, 0.0000550 secs]2015-10-26T20:41:07.024+0000: [class unloading, 0.0046910 secs]2015-10-26T20:41:07.029+0000: [scrub symbol table, 0.0059150 secs]2015-10-26T20:41:07.035+0000: [scrub string table, 0.0004610 secs] [1 CMS-remark: 5886K(707840K)] 250338K(1014528K), 0.0598030 secs] [Times: user=0.20 sys=0.00, real=0.06 secs]
2015-10-26T20:41:07.050+0000: [CMS-concurrent-sweep: 0.013/0.014 secs] [Times: user=0.01 sys=0.01, real=0.01 secs]
2015-10-26T20:41:07.053+0000: [CMS-concurrent-reset: 0.003/0.003 secs] [Times: user=0.00 sys=0.01, real=0.00 secs]
2015-10-26T20:41:07.530+0000: [GC2015-10-26T20:41:07.530+0000: [ParNew: 303475K->34048K(306688K), 0.0700810 secs] 309329K->81643K(1014528K), 0.0701630 secs] [Times: user=0.15 sys=0.07, real=0.07 secs]
Oct 26, 2015 8:41:07 PM INFO: org.apache.parquet.hadoop.codec.CodecConfig: Compression: GZIP
Oct 26, 2015 8:41:07 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Parquet block size to 134217728
Oct 26, 2015 8:41:07 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Parquet page size to 1048576
Oct 26, 2015 8:41:07 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Parquet dictionary page size to 1048576
Oct 26, 2015 8:41:07 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Dictionary is on
Oct 26, 2015 8:41:07 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Validation is off
Oct 26, 2015 8:41:07 PM INFO: org.apache.parquet.hadoop.ParquetOutputFormat: Writer version is: PARQUET_1_0
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.InternalParquetRecordWriter: Flushing mem columnStore to file. allocated memory: 133,966
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 129B for [$xmlns, dcterms] BINARY: 1 values, 36B raw, 54B comp, 1 pages, encodings: [PLAIN, RLE, BIT_PACKED]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 141B for [$xmlns, media] BINARY: 1 values, 40B raw, 58B comp, 1 pages, encodings: [PLAIN, RLE, BIT_PACKED]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 144B for [$xmlns, ngc] BINARY: 1 values, 41B raw, 59B comp, 1 pages, encodings: [PLAIN, RLE, BIT_PACKED]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 169B for [$xmlns, pl] BINARY: 1 values, 49B raw, 67B comp, 1 pages, encodings: [PLAIN, RLE, BIT_PACKED]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 187B for [$xmlns, pla] BINARY: 1 values, 55B raw, 73B comp, 1 pages, encodings: [PLAIN, RLE, BIT_PACKED]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 194B for [$xmlns, plfile] BINARY: 1 values, 58B raw, 74B comp, 1 pages, encodings: [PLAIN, RLE, BIT_PACKED]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 182B for [$xmlns, plmedia] BINARY: 1 values, 54B raw, 70B comp, 1 pages, encodings: [PLAIN, RLE, BIT_PACKED]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 190B for [$xmlns, plrelease] BINARY: 1 values, 56B raw, 74B comp, 1 pages, encodings: [PLAIN, RLE, BIT_PACKED]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 537B for [entries, bag, array_element, description] BINARY: 100 values, 109B raw, 130B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 98 entries, 19,996B raw, 98B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 993B for [entries, bag, array_element, id] BINARY: 100 values, 6,716B raw, 839B comp, 1 pages, encodings: [PLAIN, RLE]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 161B for [entries, bag, array_element, mediaavailableDate] INT64: 100 values, 96B raw, 117B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 49 entries, 392B raw, 49B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 147B for [entries, bag, array_element, mediacategories, bag, array_element, medialabel] BINARY: 190 values, 117B raw, 108B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 7 entries, 86B raw, 7B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 257B for [entries, bag, array_element, mediacategories, bag, array_element, medianame] BINARY: 190 values, 178B raw, 159B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 24 entries, 822B raw, 24B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 126B for [entries, bag, array_element, mediacategories, bag, array_element, mediascheme] BINARY: 190 values, 79B raw, 84B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 2 entries, 22B raw, 2B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 207B for [entries, bag, array_element, mediacontent, bag, array_element, plfileduration] DOUBLE: 200 values, 189B raw, 163B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 29 entries, 232B raw, 29B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 62B for [entries, bag, array_element, mediacopyright] BINARY: 100 values, 19B raw, 36B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 1 entries, 4B raw, 1B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 62B for [entries, bag, array_element, mediacopyrightUrl] BINARY: 100 values, 19B raw, 36B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 1 entries, 4B raw, 1B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 59B for [entries, bag, array_element, mediacountries, bag, array_element] BINARY: 100 values, 17B raw, 36B comp, 1 pages, encodings: [PLAIN, RLE]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 122B for [entries, bag, array_element, mediacredits, bag, array_element, mediarole] BINARY: 181 values, 80B raw, 89B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 2 entries, 13B raw, 2B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 94B for [entries, bag, array_element, mediacredits, bag, array_element, mediascheme] BINARY: 181 values, 61B raw, 67B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 1 entries, 4B raw, 1B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 248B for [entries, bag, array_element, mediacredits, bag, array_element, mediavalue] BINARY: 181 values, 198B raw, 198B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 43 entries, 757B raw, 43B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 67B for [entries, bag, array_element, mediaexcludeCountries] BOOLEAN: 100 values, 29B raw, 39B comp, 1 pages, encodings: [PLAIN, RLE]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 148B for [entries, bag, array_element, mediaexpirationDate] INT64: 100 values, 83B raw, 104B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 21 entries, 168B raw, 21B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 1,806B for [entries, bag, array_element, mediakeywords] BINARY: 100 values, 4,910B raw, 1,677B comp, 1 pages, encodings: [PLAIN, RLE]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 87B for [entries, bag, array_element, mediaratings, bag, array_element, rating] BINARY: 100 values, 32B raw, 51B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 2 entries, 18B raw, 2B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 83B for [entries, bag, array_element, mediaratings, bag, array_element, scheme] BINARY: 100 values, 20B raw, 37B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 1 entries, 14B raw, 1B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 75B for [entries, bag, array_element, mediaratings, bag, array_element, subRatings, bag, array_element] BINARY: 100 values, 29B raw, 46B comp, 1 pages, encodings: [PLAIN, RLE]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 62B for [entries, bag, array_element, mediatext] BINARY: 100 values, 19B raw, 36B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 1 entries, 4B raw, 1B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 79B for [entries, bag, array_element, mediathumbnails, bag, array_element, plfileduration] DOUBLE: 100 values, 20B raw, 37B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 1 entries, 8B raw, 1B comp}
2015-10-26T20:41:12.958+0000: [GC2015-10-26T20:41:12.958+0000: [ParNew: 306688K->15267K(306688K), 0.0455180 secs] 354283K->83862K(1014528K), 0.0456030 secs] [Times: user=0.14 sys=0.03, real=0.04 secs]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 74B for [entries, bag, array_element, ngccontentAdType] BINARY: 100 values, 22B raw, 41B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 2 entries, 15B raw, 2B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 163B for [entries, bag, array_element, ngcepisodeNumber] INT64: 100 values, 96B raw, 119B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 30 entries, 240B raw, 30B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 107B for [entries, bag, array_element, ngcnetwork, bag, array_element] BINARY: 100 values, 35B raw, 54B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 2 entries, 35B raw, 2B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 120B for [entries, bag, array_element, ngcseasonNumber] INT64: 100 values, 57B raw, 77B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 6 entries, 48B raw, 6B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 707B for [entries, bag, array_element, ngcuID] BINARY: 100 values, 2,711B raw, 639B comp, 1 pages, encodings: [PLAIN, RLE]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 86B for [entries, bag, array_element, ngcvideoType] BINARY: 100 values, 19B raw, 36B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 1 entries, 16B raw, 1B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 177B for [entries, bag, array_element, plmediachapters, bag, array_element, plmediaendTime] DOUBLE: 569 values, 207B raw, 133B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 46 entries, 368B raw, 46B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 797B for [entries, bag, array_element, plmediachapters, bag, array_element, plmediastartTime] DOUBLE: 569 values, 809B raw, 753B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 411 entries, 3,288B raw, 411B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 113B for [entries, bag, array_element, plmediachapters, bag, array_element, plmediathumbnailUrl] BINARY: 569 values, 161B raw, 85B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 1 entries, 4B raw, 1B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 125B for [entries, bag, array_element, plmediachapters, bag, array_element, plmediatitle] BINARY: 569 values, 172B raw, 95B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 2 entries, 10B raw, 2B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 68B for [entries, bag, array_element, plmediaprovider] BINARY: 100 values, 19B raw, 36B comp, 1 pages, encodings: [PLAIN_DICTIONARY, RLE], dic { 1 entries, 7B raw, 1B comp}
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 1,358B for [entries, bag, array_element, title] BINARY: 100 values, 2,271B raw, 1,300B comp, 1 pages, encodings: [PLAIN, RLE]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 70B for [entryCount] INT64: 1 values, 14B raw, 29B comp, 1 pages, encodings: [PLAIN, RLE, BIT_PACKED]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 70B for [itemsPerPage] INT64: 1 values, 14B raw, 29B comp, 1 pages, encodings: [PLAIN, RLE, BIT_PACKED]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 70B for [startIndex] INT64: 1 values, 14B raw, 29B comp, 1 pages, encodings: [PLAIN, RLE, BIT_PACKED]
Oct 26, 2015 8:41:08 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 110B for [title] BINARY: 1 values, 29B raw, 47B comp, 1 pages, encodings: [PLAIN, RLE, BIT_PACKED]
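
Note on the failure recorded above: every failing attempt (tasks 3.1-3.3 of stage 3.0, on both executors) throws java.util.NoSuchElementException: key not found inside org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress while the in-memory columnar cache partition rdd_21_1 is being materialized, so the job never finishes writing /tmp/ngcngw-analytics.tsv. Below is a minimal Scala sketch of one possible workaround, assuming the job caches the DataFrame produced by its SQL query before writing it out (an assumption; the com.truex.prometheus.CLIJob source is not part of this log): turn off columnar-cache compression so the dictionary encoder is never invoked. Names such as WorkaroundSketch, the input path, and the output call are illustrative only.

// Hedged sketch, not the original CLIJob code.
// spark.sql.inMemoryColumnarStorage.compressed is an existing Spark SQL
// setting (default true); with it set to false, cached tables are stored
// uncompressed and the DictionaryEncoding code path shown in the stack
// traces above is skipped.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object WorkaroundSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("workaround-sketch"))
    val sqlContext = new SQLContext(sc)

    // Disable compression of the in-memory columnar cache.
    sqlContext.setConf("spark.sql.inMemoryColumnarStorage.compressed", "false")

    // Hypothetical JSON input path; the real feed location differs.
    val etl = sqlContext.read.json("hdfs:///tmp/feed.json")
    etl.registerTempTable("etl")

    // The query string is supplied at run time (args(0)); omitted here.
    val result = sqlContext.sql(args(0)).cache()

    // Write tab-separated output, mirroring the .tsv path seen in the log.
    result.rdd.map(_.mkString("\t")).saveAsTextFile("/tmp/ngcngw-analytics.tsv")
  }
}

The same setting can also be supplied at submit time with --conf spark.sql.inMemoryColumnarStorage.compressed=false, without modifying the job code.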