@mshirley
Last active August 29, 2015 14:17
elasticsearch-hadoop spark-shell error
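For quick reference, the commands exercised in this session (copied from the log below) are:

SPARK_DAEMON_JAVA_OPTS="-Dlog4j.configuration=../conf/log4j.properties" ./spark-shell --jars elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar

import org.apache.spark.SparkContext._
import org.elasticsearch.spark._
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext._
val input = sqlContext.jsonFile("hdfs://localhost:9000/test.json")
input.saveToEs("spark_test/test")

The full spark-shell output, including DEBUG-level logging, follows.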
[root@localhost bin]# SPARK_DAEMON_JAVA_OPTS="-Dlog4j.configuration=../conf/log4j.properties" ./spark-shell --jars elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar
15/03/27 21:17:59 INFO SecurityManager: Changing view acls to: root
15/03/27 21:17:59 INFO SecurityManager: Changing modify acls to: root
15/03/27 21:17:59 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/03/27 21:17:59 INFO HttpServer: Starting HTTP Server
15/03/27 21:18:00 DEBUG HttpServer: HttpServer is not using security
15/03/27 21:18:00 INFO Utils: Successfully started service 'HTTP class server' on port 44365.
15/03/27 21:18:03 DEBUG SparkILoop: Clearing 6 thunks.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.2.1
      /_/
Using Scala version 2.10.4 (OpenJDK 64-Bit Server VM, Java 1.7.0_75)
Type in expressions to have them evaluated.
Type :help for more information.
15/03/27 21:18:06 WARN Utils: Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using 192.168.122.235 instead (on interface eth0)
15/03/27 21:18:06 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/03/27 21:18:06 INFO SecurityManager: Changing view acls to: root
15/03/27 21:18:06 INFO SecurityManager: Changing modify acls to: root
15/03/27 21:18:06 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/03/27 21:18:06 DEBUG AkkaUtils: In createActorSystem, requireCookie is: off
15/03/27 21:18:06 INFO Slf4jLogger: Slf4jLogger started
15/03/27 21:18:06 INFO Remoting: Starting remoting
15/03/27 21:18:06 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.122.235:47248]
15/03/27 21:18:06 INFO Utils: Successfully started service 'sparkDriver' on port 47248.
15/03/27 21:18:06 DEBUG SparkEnv: Using serializer: class org.apache.spark.serializer.JavaSerializer
15/03/27 21:18:06 INFO SparkEnv: Registering MapOutputTracker
15/03/27 21:18:06 INFO SparkEnv: Registering BlockManagerMaster
15/03/27 21:18:07 DEBUG BlockManagerMasterActor: [actor] received message ExpireDeadHosts from Actor[akka://sparkDriver/user/BlockManagerMaster#-1026505032]
15/03/27 21:18:07 DEBUG BlockManagerMasterActor: [actor] handled message (1.630384 ms) ExpireDeadHosts from Actor[akka://sparkDriver/user/BlockManagerMaster#-1026505032]
15/03/27 21:18:07 INFO DiskBlockManager: Created local directory at /tmp/spark-8cdefa93-cab0-425b-93ec-0c88b2838636/spark-0b67d94b-741c-4267-98a5-355f03d882d3
15/03/27 21:18:07 INFO MemoryStore: MemoryStore started with capacity 265.4 MB
15/03/27 21:18:07 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[Rate of successful kerberos logins and latency (milliseconds)], about=, valueName=Time, type=DEFAULT, always=false, sampleName=Ops)
15/03/27 21:18:07 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[Rate of failed kerberos logins and latency (milliseconds)], about=, valueName=Time, type=DEFAULT, always=false, sampleName=Ops)
15/03/27 21:18:07 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[GetGroups], about=, valueName=Time, type=DEFAULT, always=false, sampleName=Ops)
15/03/27 21:18:07 DEBUG MetricsSystemImpl: UgiMetrics, User and group related metrics
15/03/27 21:18:07 DEBUG Groups: Creating new Groups object
15/03/27 21:18:07 DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
15/03/27 21:18:07 DEBUG NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
15/03/27 21:18:07 DEBUG NativeCodeLoader: java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
15/03/27 21:18:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/03/27 21:18:07 DEBUG JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
15/03/27 21:18:07 DEBUG JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
15/03/27 21:18:07 DEBUG Shell: Failed to detect a valid hadoop home directory
java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:265)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:290)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
at org.apache.hadoop.security.Groups.<init>(Groups.java:77)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:283)
at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:44)
at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:214)
at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1873)
at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:105)
at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:180)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:308)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:159)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:240)
at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
at $line3.$read$$iwC$$iwC.<init>(<console>:9)
at $line3.$read$$iwC.<init>(<console>:18)
at $line3.$read.<init>(<console>:20)
at $line3.$read$.<init>(<console>:24)
at $line3.$read$.<clinit>(<console>)
at $line3.$eval$.<init>(<console>:7)
at $line3.$eval$.<clinit>(<console>)
at $line3.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:852)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1125)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:674)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:705)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:669)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:828)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:873)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:785)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:123)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:122)
at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:270)
at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:122)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:60)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:147)
at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:60)
at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:106)
at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:60)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:962)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:916)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1011)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
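Note: the IOException above is only Hadoop reporting that no local Hadoop installation is configured; the shell continues normally. If the noise is unwanted, pointing HADOOP_HOME at a Hadoop distribution before launching spark-shell should silence it; the path here is just a hypothetical example:

export HADOOP_HOME=/opt/hadoop    # hypothetical install location; adjust to the actual Hadoop directory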
15/03/27 21:18:07 DEBUG Shell: setsid exited with exit code 0
15/03/27 21:18:07 DEBUG Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
15/03/27 21:18:07 INFO HttpFileServer: HTTP File server directory is /tmp/spark-314df99a-b2f7-466a-84c6-847f5766bdc5/spark-6fa09169-2be9-4001-8ccf-38969816a940
15/03/27 21:18:07 INFO HttpServer: Starting HTTP Server
15/03/27 21:18:07 DEBUG HttpServer: HttpServer is not using security
15/03/27 21:18:07 INFO Utils: Successfully started service 'HTTP file server' on port 32985.
15/03/27 21:18:07 DEBUG HttpFileServer: HTTP file server started at: http://192.168.122.235:32985
15/03/27 21:18:07 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/03/27 21:18:07 INFO SparkUI: Started SparkUI at http://192.168.122.235:4040
15/03/27 21:18:07 INFO SparkContext: Added JAR file:/root/spark-1.2.1-bin-hadoop2.4/bin/elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar at http://192.168.122.235:32985/jars/elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar with timestamp 1427505487813
15/03/27 21:18:07 INFO Executor: Starting executor ID <driver> on host localhost
15/03/27 21:18:07 DEBUG InternalLoggerFactory: Using SLF4J as the default logging framework
15/03/27 21:18:07 INFO Executor: Using REPL class URI: http://192.168.122.235:44365
15/03/27 21:18:07 DEBUG PlatformDependent0: java.nio.Buffer.address: available
15/03/27 21:18:07 DEBUG PlatformDependent0: sun.misc.Unsafe.theUnsafe: available
15/03/27 21:18:07 DEBUG PlatformDependent0: sun.misc.Unsafe.copyMemory: available
15/03/27 21:18:07 DEBUG PlatformDependent0: java.nio.Bits.unaligned: true
15/03/27 21:18:07 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@192.168.122.235:47248/user/HeartbeatReceiver
15/03/27 21:18:07 DEBUG PlatformDependent: UID: 0
15/03/27 21:18:07 DEBUG PlatformDependent: Java version: 7
15/03/27 21:18:07 DEBUG PlatformDependent: -Dio.netty.noUnsafe: false
15/03/27 21:18:07 DEBUG PlatformDependent: sun.misc.Unsafe: available
15/03/27 21:18:07 DEBUG PlatformDependent: -Dio.netty.noJavassist: false
15/03/27 21:18:07 DEBUG PlatformDependent: Javassist: unavailable
15/03/27 21:18:07 DEBUG PlatformDependent: You don't have Javassist in your class path or you don't have enough permission to load dynamically generated classes. Please check the configuration for better performance.
15/03/27 21:18:07 DEBUG PlatformDependent: -Dio.netty.tmpdir: /tmp (java.io.tmpdir)
15/03/27 21:18:07 DEBUG PlatformDependent: -Dio.netty.bitMode: 64 (sun.arch.data.model)
15/03/27 21:18:07 DEBUG PlatformDependent: -Dio.netty.noPreferDirect: false
15/03/27 21:18:07 DEBUG MultithreadEventLoopGroup: -Dio.netty.eventLoopThreads: 4
15/03/27 21:18:07 DEBUG NioEventLoop: -Dio.netty.noKeySetOptimization: false
15/03/27 21:18:07 DEBUG NioEventLoop: -Dio.netty.selectorAutoRebuildThreshold: 512
15/03/27 21:18:07 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.numHeapArenas: 2
15/03/27 21:18:07 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.numDirectArenas: 2
15/03/27 21:18:07 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.pageSize: 8192
15/03/27 21:18:07 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.maxOrder: 11
15/03/27 21:18:07 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.chunkSize: 16777216
15/03/27 21:18:07 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.tinyCacheSize: 512
15/03/27 21:18:07 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.smallCacheSize: 256
15/03/27 21:18:07 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.normalCacheSize: 64
15/03/27 21:18:07 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.maxCachedBufferCapacity: 32768
15/03/27 21:18:07 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.cacheTrimInterval: 8192
15/03/27 21:18:08 DEBUG ThreadLocalRandom: -Dio.netty.initialSeedUniquifier: 0x83820ca774c58e7c (took 1 ms)
15/03/27 21:18:08 DEBUG ByteBufUtil: -Dio.netty.allocator.type: unpooled
15/03/27 21:18:08 DEBUG ByteBufUtil: -Dio.netty.threadLocalDirectBufferSize: 65536
15/03/27 21:18:08 DEBUG NetUtil: Loopback interface: lo (lo, 0:0:0:0:0:0:0:1%1)
15/03/27 21:18:08 DEBUG NetUtil: /proc/sys/net/core/somaxconn: 128
15/03/27 21:18:08 DEBUG TransportServer: Shuffle server started on port :47191
15/03/27 21:18:08 INFO NettyBlockTransferService: Server created on 47191
15/03/27 21:18:08 INFO BlockManagerMaster: Trying to register BlockManager
15/03/27 21:18:08 DEBUG BlockManagerMasterActor: [actor] received message RegisterBlockManager(BlockManagerId(<driver>, localhost, 47191),278302556,Actor[akka://sparkDriver/user/BlockManagerActor1#-1010684591]) from Actor[akka://sparkDriver/temp/$a]
15/03/27 21:18:08 INFO BlockManagerMasterActor: Registering block manager localhost:47191 with 265.4 MB RAM, BlockManagerId(<driver>, localhost, 47191)
15/03/27 21:18:08 INFO BlockManagerMaster: Registered BlockManager
15/03/27 21:18:08 DEBUG BlockManagerMasterActor: [actor] handled message (12.576447 ms) RegisterBlockManager(BlockManagerId(<driver>, localhost, 47191),278302556,Actor[akka://sparkDriver/user/BlockManagerActor1#-1010684591]) from Actor[akka://sparkDriver/temp/$a]
15/03/27 21:18:08 INFO SparkILoop: Created spark context..
Spark context available as sc.
scala> import org.apache.spark.SparkContext._
import org.apache.spark.SparkContext._
scala> import org.elasticsearch.spark._
import org.elasticsearch.spark._
scala>
scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@1316f8c2
scala> import sqlContext._
import sqlContext._
scala>
scala> val input = sqlContext.jsonFile("hdfs://localhost:9000/test.json")
15/03/27 21:18:17 INFO MemoryStore: ensureFreeSpace(163705) called with curMem=0, maxMem=278302556
15/03/27 21:18:17 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 159.9 KB, free 265.3 MB)
15/03/27 21:18:17 DEBUG BlockManager: Put block broadcast_0 locally took 145 ms
15/03/27 21:18:17 DEBUG BlockManager: Putting block broadcast_0 without replication took 146 ms
15/03/27 21:18:17 INFO MemoryStore: ensureFreeSpace(22692) called with curMem=163705, maxMem=278302556
15/03/27 21:18:17 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 22.2 KB, free 265.2 MB)
15/03/27 21:18:17 DEBUG BlockManagerMasterActor: [actor] received message UpdateBlockInfo(BlockManagerId(<driver>, localhost, 47191),broadcast_0_piece0,StorageLevel(false, true, false, false, 1),22692,0,0) from Actor[akka://sparkDriver/temp/$b]
15/03/27 21:18:17 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:47191 (size: 22.2 KB, free: 265.4 MB)
15/03/27 21:18:17 DEBUG BlockManagerMasterActor: [actor] handled message (1.313954 ms) UpdateBlockInfo(BlockManagerId(<driver>, localhost, 47191),broadcast_0_piece0,StorageLevel(false, true, false, false, 1),22692,0,0) from Actor[akka://sparkDriver/temp/$b]
15/03/27 21:18:17 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
15/03/27 21:18:17 DEBUG BlockManager: Told master about block broadcast_0_piece0
15/03/27 21:18:17 DEBUG BlockManager: Put block broadcast_0_piece0 locally took 5 ms
15/03/27 21:18:17 DEBUG BlockManager: Putting block broadcast_0_piece0 without replication took 5 ms
15/03/27 21:18:17 INFO SparkContext: Created broadcast 0 from textFile at SQLContext.scala:193
15/03/27 21:18:17 DEBUG BlockManager: Getting local block broadcast_0
15/03/27 21:18:17 DEBUG BlockManager: Level for block broadcast_0 is StorageLevel(true, true, false, true, 1)
15/03/27 21:18:17 DEBUG BlockManager: Getting block broadcast_0 from memory
15/03/27 21:18:17 DEBUG HadoopRDD: SplitLocationInfo and other new Hadoop classes are unavailable. Using the older Hadoop location info code.
java.lang.ClassNotFoundException: org.apache.hadoop.mapred.InputSplitWithLocationInfo
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:191)
at org.apache.spark.rdd.HadoopRDD$SplitInfoReflections.<init>(HadoopRDD.scala:381)
at org.apache.spark.rdd.HadoopRDD$.liftedTree1$1(HadoopRDD.scala:391)
at org.apache.spark.rdd.HadoopRDD$.<init>(HadoopRDD.scala:390)
at org.apache.spark.rdd.HadoopRDD$.<clinit>(HadoopRDD.scala)
at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:159)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:194)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:222)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:220)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:220)
at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:222)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:220)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:220)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:222)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:220)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:220)
at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:222)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:220)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:220)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1390)
at org.apache.spark.rdd.RDD.reduce(RDD.scala:884)
at org.apache.spark.sql.json.JsonRDD$.inferSchema(JsonRDD.scala:57)
at org.apache.spark.sql.SQLContext.jsonRDD(SQLContext.scala:232)
at org.apache.spark.sql.SQLContext.jsonFile(SQLContext.scala:194)
at org.apache.spark.sql.SQLContext.jsonFile(SQLContext.scala:173)
at $line21.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:23)
at $line21.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:28)
at $line21.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
at $line21.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:32)
at $line21.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:34)
at $line21.$read$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
at $line21.$read$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
at $line21.$read$$iwC$$iwC$$iwC.<init>(<console>:40)
at $line21.$read$$iwC$$iwC.<init>(<console>:42)
at $line21.$read$$iwC.<init>(<console>:44)
at $line21.$read.<init>(<console>:46)
at $line21.$read$.<init>(<console>:50)
at $line21.$read$.<clinit>(<console>)
at $line21.$eval$.<init>(<console>:7)
at $line21.$eval$.<clinit>(<console>)
at $line21.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:852)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1125)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:674)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:705)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:669)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:828)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:873)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:785)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:628)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:636)
at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:641)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:968)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:916)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1011)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15/03/27 21:18:17 DEBUG HadoopRDD: Creating new JobConf and caching it for later re-use
15/03/27 21:18:17 DEBUG UserGroupInformation: hadoop login
15/03/27 21:18:17 DEBUG UserGroupInformation: hadoop login commit
15/03/27 21:18:17 DEBUG UserGroupInformation: using local user:UnixPrincipal: root
15/03/27 21:18:17 DEBUG UserGroupInformation: UGI loginUser:root (auth:SIMPLE)
15/03/27 21:18:17 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
15/03/27 21:18:17 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = false
15/03/27 21:18:17 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
15/03/27 21:18:17 DEBUG BlockReaderLocal: dfs.domain.socket.path =
15/03/27 21:18:17 DEBUG RetryUtils: multipleLinearRandomRetry = null
15/03/27 21:18:17 DEBUG Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@7f9aba9a
15/03/27 21:18:17 DEBUG Client: getting client out of cache: org.apache.hadoop.ipc.Client@76984879
15/03/27 21:18:18 DEBUG BlockReaderLocal: Both short-circuit local reads and UNIX domain socket are disabled.
15/03/27 21:18:18 DEBUG Client: The ping interval is 60000 ms.
15/03/27 21:18:18 DEBUG Client: Connecting to localhost/127.0.0.1:9000
15/03/27 21:18:18 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root: starting, having connections 1
15/03/27 21:18:18 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root sending #0
15/03/27 21:18:18 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root got value #0
15/03/27 21:18:18 DEBUG ProtobufRpcEngine: Call: getFileInfo took 49ms
15/03/27 21:18:18 DEBUG FileInputFormat: Time taken to get FileStatuses: 509
15/03/27 21:18:18 INFO FileInputFormat: Total input paths to process : 1
15/03/27 21:18:18 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root sending #1
15/03/27 21:18:18 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root got value #1
15/03/27 21:18:18 DEBUG ProtobufRpcEngine: Call: getBlockLocations took 4ms
15/03/27 21:18:18 DEBUG FileInputFormat: Total # of splits generated by getSplits: 2, TimeTaken: 554
15/03/27 21:18:18 INFO SparkContext: Starting job: reduce at JsonRDD.scala:57
15/03/27 21:18:18 INFO DAGScheduler: Got job 0 (reduce at JsonRDD.scala:57) with 2 output partitions (allowLocal=false)
15/03/27 21:18:18 INFO DAGScheduler: Final stage: Stage 0(reduce at JsonRDD.scala:57)
15/03/27 21:18:18 INFO DAGScheduler: Parents of final stage: List()
15/03/27 21:18:18 DEBUG BlockManagerMasterActor: [actor] received message GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@290f6bff) from Actor[akka://sparkDriver/temp/$c]
15/03/27 21:18:18 DEBUG BlockManagerMasterActor: [actor] handled message (0.664791 ms) GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@290f6bff) from Actor[akka://sparkDriver/temp/$c]
15/03/27 21:18:18 DEBUG BlockManagerMasterActor: [actor] received message GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@4eb3255a) from Actor[akka://sparkDriver/temp/$d]
15/03/27 21:18:18 DEBUG BlockManagerMasterActor: [actor] handled message (0.110894 ms) GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@4eb3255a) from Actor[akka://sparkDriver/temp/$d]
15/03/27 21:18:18 DEBUG BlockManagerMasterActor: [actor] received message GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@53596013) from Actor[akka://sparkDriver/temp/$e]
15/03/27 21:18:18 DEBUG BlockManagerMasterActor: [actor] handled message (0.087779 ms) GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@53596013) from Actor[akka://sparkDriver/temp/$e]
15/03/27 21:18:18 DEBUG BlockManagerMasterActor: [actor] received message GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@10b2aa25) from Actor[akka://sparkDriver/temp/$f]
15/03/27 21:18:18 DEBUG BlockManagerMasterActor: [actor] handled message (0.088175 ms) GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@10b2aa25) from Actor[akka://sparkDriver/temp/$f]
15/03/27 21:18:18 INFO DAGScheduler: Missing parents: List()
15/03/27 21:18:18 DEBUG DAGScheduler: submitStage(Stage 0)
15/03/27 21:18:18 DEBUG DAGScheduler: missing: List()
15/03/27 21:18:18 INFO DAGScheduler: Submitting Stage 0 (MappedRDD[3] at map at JsonRDD.scala:57), which has no missing parents
15/03/27 21:18:18 DEBUG DAGScheduler: submitMissingTasks(Stage 0)
15/03/27 21:18:18 INFO MemoryStore: ensureFreeSpace(3176) called with curMem=186397, maxMem=278302556
15/03/27 21:18:18 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.1 KB, free 265.2 MB)
15/03/27 21:18:18 DEBUG BlockManager: Put block broadcast_1 locally took 3 ms
15/03/27 21:18:18 DEBUG BlockManager: Putting block broadcast_1 without replication took 3 ms
15/03/27 21:18:18 INFO MemoryStore: ensureFreeSpace(2179) called with curMem=189573, maxMem=278302556
15/03/27 21:18:18 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.1 KB, free 265.2 MB)
15/03/27 21:18:18 DEBUG BlockManagerMasterActor: [actor] received message UpdateBlockInfo(BlockManagerId(<driver>, localhost, 47191),broadcast_1_piece0,StorageLevel(false, true, false, false, 1),2179,0,0) from Actor[akka://sparkDriver/temp/$g]
15/03/27 21:18:18 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:47191 (size: 2.1 KB, free: 265.4 MB)
15/03/27 21:18:18 DEBUG BlockManagerMasterActor: [actor] handled message (1.008528 ms) UpdateBlockInfo(BlockManagerId(<driver>, localhost, 47191),broadcast_1_piece0,StorageLevel(false, true, false, false, 1),2179,0,0) from Actor[akka://sparkDriver/temp/$g]
15/03/27 21:18:18 INFO BlockManagerMaster: Updated info of block broadcast_1_piece0
15/03/27 21:18:18 DEBUG BlockManager: Told master about block broadcast_1_piece0
15/03/27 21:18:18 DEBUG BlockManager: Put block broadcast_1_piece0 locally took 4 ms
15/03/27 21:18:18 DEBUG BlockManager: Putting block broadcast_1_piece0 without replication took 4 ms
15/03/27 21:18:18 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:838
15/03/27 21:18:18 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (MappedRDD[3] at map at JsonRDD.scala:57)
15/03/27 21:18:18 DEBUG DAGScheduler: New pending tasks: Set(ResultTask(0, 0), ResultTask(0, 1))
15/03/27 21:18:18 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
15/03/27 21:18:18 DEBUG TaskSetManager: Epoch for TaskSet 0.0: 0
15/03/27 21:18:18 DEBUG TaskSetManager: Valid locality levels for TaskSet 0.0: NO_PREF, ANY
15/03/27 21:18:18 DEBUG LocalActor: [actor] received message ReviveOffers from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:18 DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_0, runningTasks: 0
15/03/27 21:18:18 DEBUG TaskSetManager: Valid locality levels for TaskSet 0.0: NO_PREF, ANY
15/03/27 21:18:18 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1395 bytes)
15/03/27 21:18:18 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, PROCESS_LOCAL, 1395 bytes)
15/03/27 21:18:18 DEBUG LocalActor: [actor] handled message (38.69249 ms) ReviveOffers from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:18 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
15/03/27 21:18:18 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
15/03/27 21:18:18 DEBUG LocalActor: [actor] received message StatusUpdate(1,RUNNING,java.nio.HeapByteBuffer[pos=0 lim=0 cap=0]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:18 DEBUG LocalActor: [actor] handled message (1.094402 ms) StatusUpdate(1,RUNNING,java.nio.HeapByteBuffer[pos=0 lim=0 cap=0]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:18 DEBUG LocalActor: [actor] received message StatusUpdate(0,RUNNING,java.nio.HeapByteBuffer[pos=0 lim=0 cap=0]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:18 DEBUG LocalActor: [actor] handled message (0.036259 ms) StatusUpdate(0,RUNNING,java.nio.HeapByteBuffer[pos=0 lim=0 cap=0]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:18 INFO Executor: Fetching http://192.168.122.235:32985/jars/elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar with timestamp 1427505487813
15/03/27 21:18:18 INFO Utils: Fetching http://192.168.122.235:32985/jars/elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar to /tmp/spark-34a36258-a1fb-4710-aa7c-e9172e881f22/spark-ba16588b-cb63-49f6-940c-70c3af72353e/fetchFileTemp7165382945673504248.tmp
15/03/27 21:18:18 DEBUG Utils: fetchFile not using security
15/03/27 21:18:18 INFO Executor: Adding file:/tmp/spark-34a36258-a1fb-4710-aa7c-e9172e881f22/spark-ba16588b-cb63-49f6-940c-70c3af72353e/elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar to class loader
15/03/27 21:18:18 DEBUG Executor: Task 0's epoch is 0
15/03/27 21:18:18 DEBUG Executor: Task 1's epoch is 0
15/03/27 21:18:18 DEBUG BlockManager: Getting local block broadcast_1
15/03/27 21:18:18 DEBUG BlockManager: Level for block broadcast_1 is StorageLevel(true, true, false, true, 1)
15/03/27 21:18:18 DEBUG BlockManager: Getting block broadcast_1 from memory
15/03/27 21:18:18 DEBUG BlockManager: Getting local block broadcast_1
15/03/27 21:18:18 DEBUG BlockManager: Level for block broadcast_1 is StorageLevel(true, true, false, true, 1)
15/03/27 21:18:18 DEBUG BlockManager: Getting block broadcast_1 from memory
15/03/27 21:18:18 INFO HadoopRDD: Input split: hdfs://localhost:9000/test.json:0+8
15/03/27 21:18:18 DEBUG BlockManager: Getting local block broadcast_0
15/03/27 21:18:18 DEBUG BlockManager: Level for block broadcast_0 is StorageLevel(true, true, false, true, 1)
15/03/27 21:18:18 DEBUG BlockManager: Getting block broadcast_0 from memory
15/03/27 21:18:18 DEBUG HadoopRDD: Re-using cached JobConf
15/03/27 21:18:18 INFO HadoopRDD: Input split: hdfs://localhost:9000/test.json:8+8
15/03/27 21:18:18 DEBUG BlockManager: Getting local block broadcast_0
15/03/27 21:18:18 DEBUG BlockManager: Level for block broadcast_0 is StorageLevel(true, true, false, true, 1)
15/03/27 21:18:18 DEBUG BlockManager: Getting block broadcast_0 from memory
15/03/27 21:18:18 DEBUG HadoopRDD: Re-using cached JobConf
15/03/27 21:18:18 DEBUG SparkHadoopUtil: Couldn't find method for retrieving thread-level FileSystem input data
java.lang.NoSuchMethodException: org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()
at java.lang.Class.getDeclaredMethod(Class.java:2009)
at org.apache.spark.util.Utils$.invoke(Utils.scala:1822)
at org.apache.spark.deploy.SparkHadoopUtil$$anonfun$getFileSystemThreadStatistics$1.apply(SparkHadoopUtil.scala:179)
at org.apache.spark.deploy.SparkHadoopUtil$$anonfun$getFileSystemThreadStatistics$1.apply(SparkHadoopUtil.scala:179)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.deploy.SparkHadoopUtil.getFileSystemThreadStatistics(SparkHadoopUtil.scala:179)
at org.apache.spark.deploy.SparkHadoopUtil.getFSBytesReadOnThreadCallback(SparkHadoopUtil.scala:139)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:220)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:210)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:99)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/03/27 21:18:18 DEBUG SparkHadoopUtil: Couldn't find method for retrieving thread-level FileSystem input data
java.lang.NoSuchMethodException: org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()
at java.lang.Class.getDeclaredMethod(Class.java:2009)
at org.apache.spark.util.Utils$.invoke(Utils.scala:1822)
at org.apache.spark.deploy.SparkHadoopUtil$$anonfun$getFileSystemThreadStatistics$1.apply(SparkHadoopUtil.scala:179)
at org.apache.spark.deploy.SparkHadoopUtil$$anonfun$getFileSystemThreadStatistics$1.apply(SparkHadoopUtil.scala:179)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.deploy.SparkHadoopUtil.getFileSystemThreadStatistics(SparkHadoopUtil.scala:179)
at org.apache.spark.deploy.SparkHadoopUtil.getFSBytesReadOnThreadCallback(SparkHadoopUtil.scala:139)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:220)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:210)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:99)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/03/27 21:18:18 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
15/03/27 21:18:18 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
15/03/27 21:18:18 INFO deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
15/03/27 21:18:18 INFO deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
15/03/27 21:18:18 INFO deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
15/03/27 21:18:18 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root sending #3
15/03/27 21:18:18 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root sending #2
15/03/27 21:18:18 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root got value #3
15/03/27 21:18:18 DEBUG ProtobufRpcEngine: Call: getBlockLocations took 2ms
15/03/27 21:18:18 DEBUG DFSClient: newInfo = LocatedBlocks{
fileLength=16
underConstruction=false
blocks=[LocatedBlock{BP-2062118600-127.0.0.1-1427504068037:blk_1073741825_1001; getBlockSize()=16; corrupt=false; offset=0; locs=[127.0.0.1:50010]}]
lastLocatedBlock=LocatedBlock{BP-2062118600-127.0.0.1-1427504068037:blk_1073741825_1001; getBlockSize()=16; corrupt=false; offset=0; locs=[127.0.0.1:50010]}
isLastBlockComplete=true}
15/03/27 21:18:18 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root got value #2
15/03/27 21:18:18 DEBUG DFSClient: Connecting to datanode 127.0.0.1:50010
15/03/27 21:18:18 DEBUG ProtobufRpcEngine: Call: getBlockLocations took 6ms
15/03/27 21:18:18 DEBUG DFSClient: newInfo = LocatedBlocks{
fileLength=16
underConstruction=false
blocks=[LocatedBlock{BP-2062118600-127.0.0.1-1427504068037:blk_1073741825_1001; getBlockSize()=16; corrupt=false; offset=0; locs=[127.0.0.1:50010]}]
lastLocatedBlock=LocatedBlock{BP-2062118600-127.0.0.1-1427504068037:blk_1073741825_1001; getBlockSize()=16; corrupt=false; offset=0; locs=[127.0.0.1:50010]}
isLastBlockComplete=true}
15/03/27 21:18:18 DEBUG DFSClient: Connecting to datanode 127.0.0.1:50010
15/03/27 21:18:18 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root sending #4
15/03/27 21:18:18 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root got value #4
15/03/27 21:18:18 DEBUG ProtobufRpcEngine: Call: getServerDefaults took 2ms
15/03/27 21:18:18 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root sending #5
15/03/27 21:18:18 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root got value #5
15/03/27 21:18:18 DEBUG ProtobufRpcEngine: Call: getServerDefaults took 3ms
15/03/27 21:18:18 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 1734 bytes result sent to driver
15/03/27 21:18:18 DEBUG LocalActor: [actor] received message StatusUpdate(1,FINISHED,java.nio.HeapByteBuffer[pos=0 lim=1734 cap=1734]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:18 DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_0, runningTasks: 1
15/03/27 21:18:18 DEBUG LocalActor: [actor] handled message (3.654888 ms) StatusUpdate(1,FINISHED,java.nio.HeapByteBuffer[pos=661 lim=1734 cap=1734]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:18 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 449 ms on localhost (1/2)
15/03/27 21:18:19 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 2026 bytes result sent to driver
15/03/27 21:18:19 DEBUG LocalActor: [actor] received message StatusUpdate(0,FINISHED,java.nio.HeapByteBuffer[pos=0 lim=2026 cap=2026]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:19 DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_0, runningTasks: 0
15/03/27 21:18:19 DEBUG LocalActor: [actor] handled message (0.921676 ms) StatusUpdate(0,FINISHED,java.nio.HeapByteBuffer[pos=0 lim=2026 cap=2026]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:19 INFO DAGScheduler: Stage 0 (reduce at JsonRDD.scala:57) finished in 0.623 s
15/03/27 21:18:19 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 600 ms on localhost (2/2)
15/03/27 21:18:19 DEBUG DAGScheduler: After removal of stage 0, remaining stages = 0
15/03/27 21:18:19 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/03/27 21:18:19 INFO DAGScheduler: Job 0 finished: reduce at JsonRDD.scala:57, took 0.774018 s
input: org.apache.spark.sql.SchemaRDD =
SchemaRDD[6] at RDD at SchemaRDD.scala:108
== Query Plan ==
== Physical Plan ==
PhysicalRDD [key#0], MappedRDD[5] at map at JsonRDD.scala:47
scala> input.saveToEs("spark_test/test")
15/03/27 21:18:20 INFO SparkContext: Starting job: runJob at EsSpark.scala:51
15/03/27 21:18:20 INFO DAGScheduler: Got job 1 (runJob at EsSpark.scala:51) with 2 output partitions (allowLocal=false)
15/03/27 21:18:20 INFO DAGScheduler: Final stage: Stage 1(runJob at EsSpark.scala:51)
15/03/27 21:18:20 INFO DAGScheduler: Parents of final stage: List()
15/03/27 21:18:20 DEBUG BlockManagerMasterActor: [actor] received message GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@48e0ba6f) from Actor[akka://sparkDriver/temp/$h]
15/03/27 21:18:20 DEBUG BlockManagerMasterActor: [actor] handled message (0.19888 ms) GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@48e0ba6f) from Actor[akka://sparkDriver/temp/$h]
15/03/27 21:18:20 DEBUG BlockManagerMasterActor: [actor] received message GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@19eff2ca) from Actor[akka://sparkDriver/temp/$i]
15/03/27 21:18:20 DEBUG BlockManagerMasterActor: [actor] handled message (0.128824 ms) GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@19eff2ca) from Actor[akka://sparkDriver/temp/$i]
15/03/27 21:18:20 DEBUG BlockManagerMasterActor: [actor] received message GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@582cb913) from Actor[akka://sparkDriver/temp/$j]
15/03/27 21:18:20 DEBUG BlockManagerMasterActor: [actor] handled message (0.346206 ms) GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@582cb913) from Actor[akka://sparkDriver/temp/$j]
15/03/27 21:18:20 DEBUG BlockManagerMasterActor: [actor] received message GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@602abb9e) from Actor[akka://sparkDriver/temp/$k]
15/03/27 21:18:20 DEBUG BlockManagerMasterActor: [actor] handled message (0.125965 ms) GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@602abb9e) from Actor[akka://sparkDriver/temp/$k]
15/03/27 21:18:20 DEBUG BlockManagerMasterActor: [actor] received message GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@1583b365) from Actor[akka://sparkDriver/temp/$l]
15/03/27 21:18:20 DEBUG BlockManagerMasterActor: [actor] handled message (0.125473 ms) GetLocationsMultipleBlockIds([Lorg.apache.spark.storage.BlockId;@1583b365) from Actor[akka://sparkDriver/temp/$l]
15/03/27 21:18:20 INFO DAGScheduler: Missing parents: List()
15/03/27 21:18:20 DEBUG DAGScheduler: submitStage(Stage 1)
15/03/27 21:18:20 DEBUG DAGScheduler: missing: List()
15/03/27 21:18:20 INFO DAGScheduler: Submitting Stage 1 (SchemaRDD[6] at RDD at SchemaRDD.scala:108
== Query Plan ==
== Physical Plan ==
PhysicalRDD [key#0], MappedRDD[5] at map at JsonRDD.scala:47), which has no missing parents
15/03/27 21:18:20 DEBUG DAGScheduler: submitMissingTasks(Stage 1)
15/03/27 21:18:20 INFO MemoryStore: ensureFreeSpace(5128) called with curMem=191752, maxMem=278302556
15/03/27 21:18:20 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 5.0 KB, free 265.2 MB)
15/03/27 21:18:20 DEBUG BlockManager: Put block broadcast_2 locally took 2 ms
15/03/27 21:18:20 DEBUG BlockManager: Putting block broadcast_2 without replication took 2 ms
15/03/27 21:18:20 INFO MemoryStore: ensureFreeSpace(3685) called with curMem=196880, maxMem=278302556
15/03/27 21:18:20 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 3.6 KB, free 265.2 MB)
15/03/27 21:18:20 DEBUG BlockManagerMasterActor: [actor] received message UpdateBlockInfo(BlockManagerId(<driver>, localhost, 47191),broadcast_2_piece0,StorageLevel(false, true, false, false, 1),3685,0,0) from Actor[akka://sparkDriver/temp/$m]
15/03/27 21:18:20 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:47191 (size: 3.6 KB, free: 265.4 MB)
15/03/27 21:18:20 DEBUG BlockManagerMasterActor: [actor] handled message (0.485172 ms) UpdateBlockInfo(BlockManagerId(<driver>, localhost, 47191),broadcast_2_piece0,StorageLevel(false, true, false, false, 1),3685,0,0) from Actor[akka://sparkDriver/temp/$m]
15/03/27 21:18:20 INFO BlockManagerMaster: Updated info of block broadcast_2_piece0
15/03/27 21:18:20 DEBUG BlockManager: Told master about block broadcast_2_piece0
15/03/27 21:18:20 DEBUG BlockManager: Put block broadcast_2_piece0 locally took 3 ms
15/03/27 21:18:20 DEBUG BlockManager: Putting block broadcast_2_piece0 without replication took 3 ms
15/03/27 21:18:20 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:838
15/03/27 21:18:20 INFO DAGScheduler: Submitting 2 missing tasks from Stage 1 (SchemaRDD[6] at RDD at SchemaRDD.scala:108
== Query Plan ==
== Physical Plan ==
PhysicalRDD [key#0], MappedRDD[5] at map at JsonRDD.scala:47)
15/03/27 21:18:20 DEBUG DAGScheduler: New pending tasks: Set(ResultTask(1, 0), ResultTask(1, 1))
15/03/27 21:18:20 INFO TaskSchedulerImpl: Adding task set 1.0 with 2 tasks
15/03/27 21:18:20 DEBUG TaskSetManager: Epoch for TaskSet 1.0: 0
15/03/27 21:18:20 DEBUG TaskSetManager: Valid locality levels for TaskSet 1.0: NO_PREF, ANY
15/03/27 21:18:20 DEBUG LocalActor: [actor] received message ReviveOffers from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:20 DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_1, runningTasks: 0
15/03/27 21:18:20 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, localhost, PROCESS_LOCAL, 1395 bytes)
15/03/27 21:18:20 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 3, localhost, PROCESS_LOCAL, 1395 bytes)
15/03/27 21:18:20 INFO Executor: Running task 0.0 in stage 1.0 (TID 2)
15/03/27 21:18:20 DEBUG Executor: Task 2's epoch is 0
15/03/27 21:18:20 DEBUG BlockManager: Getting local block broadcast_2
15/03/27 21:18:20 DEBUG BlockManager: Level for block broadcast_2 is StorageLevel(true, true, false, true, 1)
15/03/27 21:18:20 DEBUG BlockManager: Getting block broadcast_2 from memory
15/03/27 21:18:20 INFO Executor: Running task 1.0 in stage 1.0 (TID 3)
15/03/27 21:18:20 DEBUG Executor: Task 3's epoch is 0
15/03/27 21:18:20 DEBUG BlockManager: Getting local block broadcast_2
15/03/27 21:18:20 DEBUG BlockManager: Level for block broadcast_2 is StorageLevel(true, true, false, true, 1)
15/03/27 21:18:20 DEBUG BlockManager: Getting block broadcast_2 from memory
15/03/27 21:18:20 DEBUG LocalActor: [actor] handled message (8.544249 ms) ReviveOffers from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:20 DEBUG LocalActor: [actor] received message StatusUpdate(2,RUNNING,java.nio.HeapByteBuffer[pos=0 lim=0 cap=0]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:20 DEBUG LocalActor: [actor] handled message (0.053521 ms) StatusUpdate(2,RUNNING,java.nio.HeapByteBuffer[pos=0 lim=0 cap=0]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:20 DEBUG LocalActor: [actor] received message StatusUpdate(3,RUNNING,java.nio.HeapByteBuffer[pos=0 lim=0 cap=0]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:20 DEBUG LocalActor: [actor] handled message (0.034453 ms) StatusUpdate(3,RUNNING,java.nio.HeapByteBuffer[pos=0 lim=0 cap=0]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:20 INFO HadoopRDD: Input split: hdfs://localhost:9000/test.json:0+8
15/03/27 21:18:20 DEBUG BlockManager: Getting local block broadcast_0
15/03/27 21:18:20 DEBUG BlockManager: Level for block broadcast_0 is StorageLevel(true, true, false, true, 1)
15/03/27 21:18:20 DEBUG BlockManager: Getting block broadcast_0 from memory
15/03/27 21:18:20 DEBUG HadoopRDD: Re-using cached JobConf
15/03/27 21:18:20 DEBUG SparkHadoopUtil: Couldn't find method for retrieving thread-level FileSystem input data
java.lang.NoSuchMethodException: org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()
at java.lang.Class.getDeclaredMethod(Class.java:2009)
at org.apache.spark.util.Utils$.invoke(Utils.scala:1822)
at org.apache.spark.deploy.SparkHadoopUtil$$anonfun$getFileSystemThreadStatistics$1.apply(SparkHadoopUtil.scala:179)
at org.apache.spark.deploy.SparkHadoopUtil$$anonfun$getFileSystemThreadStatistics$1.apply(SparkHadoopUtil.scala:179)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.deploy.SparkHadoopUtil.getFileSystemThreadStatistics(SparkHadoopUtil.scala:179)
at org.apache.spark.deploy.SparkHadoopUtil.getFSBytesReadOnThreadCallback(SparkHadoopUtil.scala:139)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:220)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:210)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:99)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.sql.SchemaRDD.compute(SchemaRDD.scala:120)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/03/27 21:18:20 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root sending #6
15/03/27 21:18:20 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root got value #6
15/03/27 21:18:20 DEBUG ProtobufRpcEngine: Call: getBlockLocations took 2ms
15/03/27 21:18:20 DEBUG DFSClient: newInfo = LocatedBlocks{
fileLength=16
underConstruction=false
blocks=[LocatedBlock{BP-2062118600-127.0.0.1-1427504068037:blk_1073741825_1001; getBlockSize()=16; corrupt=false; offset=0; locs=[127.0.0.1:50010]}]
lastLocatedBlock=LocatedBlock{BP-2062118600-127.0.0.1-1427504068037:blk_1073741825_1001; getBlockSize()=16; corrupt=false; offset=0; locs=[127.0.0.1:50010]}
isLastBlockComplete=true}
15/03/27 21:18:20 INFO HadoopRDD: Input split: hdfs://localhost:9000/test.json:8+8
15/03/27 21:18:20 DEBUG BlockManager: Getting local block broadcast_0
15/03/27 21:18:20 DEBUG BlockManager: Level for block broadcast_0 is StorageLevel(true, true, false, true, 1)
15/03/27 21:18:20 DEBUG BlockManager: Getting block broadcast_0 from memory
15/03/27 21:18:20 DEBUG HadoopRDD: Re-using cached JobConf
15/03/27 21:18:20 DEBUG SparkHadoopUtil: Couldn't find method for retrieving thread-level FileSystem input data
java.lang.NoSuchMethodException: org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()
at java.lang.Class.getDeclaredMethod(Class.java:2009)
at org.apache.spark.util.Utils$.invoke(Utils.scala:1822)
at org.apache.spark.deploy.SparkHadoopUtil$$anonfun$getFileSystemThreadStatistics$1.apply(SparkHadoopUtil.scala:179)
at org.apache.spark.deploy.SparkHadoopUtil$$anonfun$getFileSystemThreadStatistics$1.apply(SparkHadoopUtil.scala:179)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.deploy.SparkHadoopUtil.getFileSystemThreadStatistics(SparkHadoopUtil.scala:179)
at org.apache.spark.deploy.SparkHadoopUtil.getFSBytesReadOnThreadCallback(SparkHadoopUtil.scala:139)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:220)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:210)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:99)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.sql.SchemaRDD.compute(SchemaRDD.scala:120)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/03/27 21:18:20 DEBUG EsRDDWriter: Using pre-defined writer serializer [org.elasticsearch.spark.serialization.ScalaValueWriter] as default
15/03/27 21:18:20 DEBUG EsRDDWriter: Using pre-defined field extractor [org.elasticsearch.spark.serialization.ScalaMapFieldExtractor] as default
15/03/27 21:18:20 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root sending #7
15/03/27 21:18:20 DEBUG Client: IPC Client (1633098863) connection to localhost/127.0.0.1:9000 from root got value #7
15/03/27 21:18:20 DEBUG ProtobufRpcEngine: Call: getBlockLocations took 3ms
15/03/27 21:18:20 DEBUG DFSClient: newInfo = LocatedBlocks{
fileLength=16
underConstruction=false
blocks=[LocatedBlock{BP-2062118600-127.0.0.1-1427504068037:blk_1073741825_1001; getBlockSize()=16; corrupt=false; offset=0; locs=[127.0.0.1:50010]}]
lastLocatedBlock=LocatedBlock{BP-2062118600-127.0.0.1-1427504068037:blk_1073741825_1001; getBlockSize()=16; corrupt=false; offset=0; locs=[127.0.0.1:50010]}
isLastBlockComplete=true}
15/03/27 21:18:20 DEBUG DFSClient: Connecting to datanode 127.0.0.1:50010
15/03/27 21:18:20 DEBUG EsRDDWriter: Using pre-defined writer serializer [org.elasticsearch.spark.serialization.ScalaValueWriter] as default
15/03/27 21:18:20 DEBUG EsRDDWriter: Using pre-defined field extractor [org.elasticsearch.spark.serialization.ScalaMapFieldExtractor] as default
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.useragent = Jakarta Commons-HttpClient/3.1
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.protocol.version = HTTP/1.1
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.connection-manager.class = class org.apache.commons.httpclient.SimpleHttpConnectionManager
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.protocol.cookie-policy = default
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.protocol.element-charset = US-ASCII
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.protocol.content-charset = ISO-8859-1
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.method.retry-handler = org.apache.commons.httpclient.DefaultHttpMethodRetryHandler@e3dcba3
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.dateparser.patterns = [EEE, dd MMM yyyy HH:mm:ss zzz, EEEE, dd-MMM-yy HH:mm:ss zzz, EEE MMM d HH:mm:ss yyyy, EEE, dd-MMM-yyyy HH:mm:ss z, EEE, dd-MMM-yyyy HH-mm-ss z, EEE, dd MMM yy HH:mm:ss z, EEE dd-MMM-yyyy HH:mm:ss z, EEE dd MMM yyyy HH:mm:ss z, EEE dd-MMM-yyyy HH-mm-ss z, EEE dd-MMM-yy HH:mm:ss z, EEE dd MMM yy HH:mm:ss z, EEE,dd-MMM-yy HH:mm:ss z, EEE,dd-MMM-yyyy HH:mm:ss z, EEE, dd-MM-yyyy HH:mm:ss z]
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.method.retry-handler = org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport$1@2a7631c0
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.method.retry-handler = org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport$1@35bc4a07
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.connection-manager.timeout = 60000
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.socket.timeout = 60000
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.connection-manager.timeout = 60000
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.socket.timeout = 60000
15/03/27 21:18:20 DEBUG HttpClient: Java version: 1.7.0_75
15/03/27 21:18:20 DEBUG HttpClient: Java vendor: Oracle Corporation
15/03/27 21:18:20 DEBUG HttpClient: Java class path: ::/root/spark-1.2.1-bin-hadoop2.4/conf:/root/spark-1.2.1-bin-hadoop2.4/lib/spark-assembly-1.2.1-hadoop2.4.0.jar
15/03/27 21:18:20 DEBUG HttpClient: Operating system name: Linux
15/03/27 21:18:20 DEBUG HttpClient: Operating system architecture: amd64
15/03/27 21:18:20 DEBUG HttpClient: Operating system version: 2.6.32-504.el6.x86_64
15/03/27 21:18:20 DEBUG HttpClient: SUN 1.7: SUN (DSA key/parameter generation; DSA signing; SHA-1, MD5 digests; SecureRandom; X.509 certificates; JKS keystore; PKIX CertPathValidator; PKIX CertPathBuilder; LDAP, Collection CertStores, JavaPolicy Policy; JavaLoginConfig Configuration)
15/03/27 21:18:20 DEBUG HttpClient: SunRsaSign 1.7: Sun RSA signature provider
15/03/27 21:18:20 DEBUG HttpClient: SunJSSE 1.7: Sun JSSE provider(PKCS12, SunX509 key/trust factories, SSLv3, TLSv1)
15/03/27 21:18:20 DEBUG HttpClient: SunJCE 1.7: SunJCE Provider (implements RSA, DES, Triple DES, AES, Blowfish, ARCFOUR, RC2, PBE, Diffie-Hellman, HMAC)
15/03/27 21:18:20 DEBUG HttpClient: SunJGSS 1.7: Sun (Kerberos v5, SPNEGO)
15/03/27 21:18:20 DEBUG HttpClient: SunSASL 1.7: Sun SASL provider(implements client mechanisms for: DIGEST-MD5, GSSAPI, EXTERNAL, PLAIN, CRAM-MD5, NTLM; server mechanisms for: DIGEST-MD5, GSSAPI, CRAM-MD5, NTLM)
15/03/27 21:18:20 DEBUG HttpClient: XMLDSig 1.0: XMLDSig (DOM XMLSignatureFactory; DOM KeyInfoFactory)
15/03/27 21:18:20 DEBUG HttpClient: SunPCSC 1.7: Sun PC/SC provider
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.tcp.nodelay = true
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.tcp.nodelay = true
15/03/27 21:18:20 DEBUG HttpConnection: Open connection to localhost:9200
15/03/27 21:18:20 DEBUG HttpConnection: Open connection to localhost:9200
15/03/27 21:18:20 DEBUG header: >> "GET /_nodes/transport HTTP/1.1[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "GET /_nodes/transport HTTP/1.1[\r][\n]"
15/03/27 21:18:20 DEBUG HttpMethodBase: Adding Host request header
15/03/27 21:18:20 DEBUG HttpMethodBase: Adding Host request header
15/03/27 21:18:20 DEBUG header: >> "User-Agent: Jakarta Commons-HttpClient/3.1[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "User-Agent: Jakarta Commons-HttpClient/3.1[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "Host: localhost:9200[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "Host: localhost:9200[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Type: application/json; charset=UTF-8[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Type: application/json; charset=UTF-8[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Length: 373[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Length: 373[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "[\r][\n]"
15/03/27 21:18:20 DEBUG content: << "{"cluster_name":"elasticsearch","nodes":{"LqqiMzLcQAuBISXvr7FspA":{"name":"Grotesk","transport_address":"inet[/192.168.122.235:9300]","host":"localhost.localdomain","ip":"127.0.0.1","version":"1.4.0","build":"bc94bd8","http_address":"inet[/192.168.122.235:9200]","transport":{"bound_address":"inet[/0:0:0:0:0:0:0:0:9300]","publish_address":"inet[/192.168.122.235:9300]"}}}}"
15/03/27 21:18:20 DEBUG content: << "{"cluster_name":"elasticsearch","nodes":{"LqqiMzLcQAuBISXvr7FspA":{"name":"Grotesk","transport_address":"inet[/192.168.122.235:9300]","host":"localhost.localdomain","ip":"127.0.0.1","version":"1.4.0","build":"bc94bd8","http_address":"inet[/192.168.122.235:9200]","transport":{"bound_address":"inet[/0:0:0:0:0:0:0:0:9300]","publish_address":"inet[/192.168.122.235:9300]"}}}}"
15/03/27 21:18:20 DEBUG EsRDDWriter: Nodes discovery enabled - found [192.168.122.235:9200]
15/03/27 21:18:20 DEBUG EsRDDWriter: Nodes discovery enabled - found [192.168.122.235:9200]
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.method.retry-handler = org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport$1@5927f89b
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.method.retry-handler = org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport$1@4f32b5d7
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.connection-manager.timeout = 60000
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.connection-manager.timeout = 60000
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.socket.timeout = 60000
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.socket.timeout = 60000
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.tcp.nodelay = true
15/03/27 21:18:20 DEBUG HttpConnection: Open connection to 192.168.122.235:9200
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.tcp.nodelay = true
15/03/27 21:18:20 DEBUG HttpConnection: Open connection to localhost:9200
15/03/27 21:18:20 DEBUG header: >> "GET / HTTP/1.1[\r][\n]"
15/03/27 21:18:20 DEBUG HttpMethodBase: Adding Host request header
15/03/27 21:18:20 DEBUG header: >> "User-Agent: Jakarta Commons-HttpClient/3.1[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "Host: 192.168.122.235:9200[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "GET / HTTP/1.1[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "[\r][\n]"
15/03/27 21:18:20 DEBUG HttpMethodBase: Adding Host request header
15/03/27 21:18:20 DEBUG header: >> "User-Agent: Jakarta Commons-HttpClient/3.1[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "Host: localhost:9200[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Type: application/json; charset=UTF-8[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Length: 335[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "[\r][\n]"
15/03/27 21:18:20 DEBUG content: << "{[\n]"
15/03/27 21:18:20 DEBUG content: << " "status" : 200,[\n]"
15/03/27 21:18:20 DEBUG content: << " "name" : "Grotesk",[\n]"
15/03/27 21:18:20 DEBUG content: << " "cluster_name" : "elasticsearch",[\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG content: << " "version" : {[\n]"
15/03/27 21:18:20 DEBUG content: << " "number" : "1.4.0",[\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG content: << " "build_hash" : "bc94bd81298f81c656893ab1ddddd30a99356066",[\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Type: application/json; charset=UTF-8[\r][\n]"
15/03/27 21:18:20 DEBUG content: << " "build_timestamp" : "2014-11-05T14:26:12Z",[\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Length: 335[\r][\n]"
15/03/27 21:18:20 DEBUG content: << " "build_snapshot" : false,[\n]"
15/03/27 21:18:20 DEBUG header: << "[\r][\n]"
15/03/27 21:18:20 DEBUG content: << " "lucene_version" : "4.10.2"[\n]"
15/03/27 21:18:20 DEBUG content: << " },[\n]"
15/03/27 21:18:20 DEBUG content: << "{[\n]"
15/03/27 21:18:20 DEBUG content: << " "tagline" : "You Know, for Search"[\n]"
15/03/27 21:18:20 DEBUG content: << "}[\n]"
15/03/27 21:18:20 DEBUG content: << " "status" : 200,[\n]"
15/03/27 21:18:20 DEBUG content: << " "name" : "Grotesk",[\n]"
15/03/27 21:18:20 DEBUG content: << " "cluster_name" : "elasticsearch",[\n]"
15/03/27 21:18:20 DEBUG content: << " "version" : {[\n]"
15/03/27 21:18:20 DEBUG content: << " "number" : "1.4.0",[\n]"
15/03/27 21:18:20 DEBUG content: << " "build_hash" : "bc94bd81298f81c656893ab1ddddd30a99356066",[\n]"
15/03/27 21:18:20 DEBUG content: << " "build_timestamp" : "2014-11-05T14:26:12Z",[\n]"
15/03/27 21:18:20 DEBUG content: << " "build_snapshot" : false,[\n]"
15/03/27 21:18:20 DEBUG content: << " "lucene_version" : "4.10.2"[\n]"
15/03/27 21:18:20 DEBUG content: << " },[\n]"
15/03/27 21:18:20 DEBUG content: << " "tagline" : "You Know, for Search"[\n]"
15/03/27 21:18:20 DEBUG content: << "}[\n]"
15/03/27 21:18:20 DEBUG EsRDDWriter: Discovered Elasticsearch version [1.4.0]
15/03/27 21:18:20 DEBUG EsRDDWriter: Discovered Elasticsearch version [1.4.0]
15/03/27 21:18:20 INFO EsRDDWriter: Writing to [spark_test/test]
15/03/27 21:18:20 INFO Version: Elasticsearch Hadoop v2.1.0.BUILD-SNAPSHOT [c06648e958]
15/03/27 21:18:20 INFO EsRDDWriter: Writing to [spark_test/test]
15/03/27 21:18:20 DEBUG EsRDDWriter: Resource [spark_test/test] resolves as a single index
15/03/27 21:18:20 DEBUG EsRDDWriter: Resource [spark_test/test] resolves as a single index
15/03/27 21:18:20 DEBUG NetworkClient: Opening (pinned) network client to 192.168.122.235:9200
15/03/27 21:18:20 DEBUG NetworkClient: Opening (pinned) network client to localhost:9200
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.method.retry-handler = org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport$1@1b8edbe1
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.connection-manager.timeout = 60000
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.method.retry-handler = org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport$1@40059ee9
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.socket.timeout = 60000
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.connection-manager.timeout = 60000
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.socket.timeout = 60000
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.tcp.nodelay = true
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.tcp.nodelay = true
15/03/27 21:18:20 DEBUG HttpConnection: Open connection to localhost:9200
15/03/27 21:18:20 DEBUG HttpConnection: Open connection to 192.168.122.235:9200
15/03/27 21:18:20 DEBUG header: >> "PUT /spark_test HTTP/1.1[\r][\n]"
15/03/27 21:18:20 DEBUG HttpMethodBase: Adding Host request header
15/03/27 21:18:20 DEBUG header: >> "User-Agent: Jakarta Commons-HttpClient/3.1[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "PUT /spark_test HTTP/1.1[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "Host: 192.168.122.235:9200[\r][\n]"
15/03/27 21:18:20 DEBUG HttpMethodBase: Adding Host request header
15/03/27 21:18:20 DEBUG header: >> "Content-Length: 0[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "User-Agent: Jakarta Commons-HttpClient/3.1[\r][\n]"
15/03/27 21:18:20 DEBUG EntityEnclosingMethod: Request body has not been specified
15/03/27 21:18:20 DEBUG header: >> "Host: localhost:9200[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "Content-Length: 0[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "[\r][\n]"
15/03/27 21:18:20 DEBUG EntityEnclosingMethod: Request body has not been specified
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 400 Bad Request[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 400 Bad Request[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Type: application/json; charset=UTF-8[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Length: 81[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 400 Bad Request[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 400 Bad Request[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Type: application/json; charset=UTF-8[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Length: 81[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "[\r][\n]"
15/03/27 21:18:20 DEBUG content: << "{"error":"IndexAlreadyExistsException[[spark_test] already exists]","status":400}"
15/03/27 21:18:20 DEBUG content: << "{"error":"IndexAlreadyExistsException[[spark_test] already exists]","status":400}"
15/03/27 21:18:20 DEBUG HttpMethodBase: Resorting to protocol version default close connection policy
15/03/27 21:18:20 DEBUG HttpMethodBase: Resorting to protocol version default close connection policy
15/03/27 21:18:20 DEBUG HttpMethodBase: Should NOT close connection, using HTTP/1.1
15/03/27 21:18:20 DEBUG HttpConnection: Releasing connection back to connection manager.
15/03/27 21:18:20 DEBUG HttpMethodBase: Should NOT close connection, using HTTP/1.1
15/03/27 21:18:20 DEBUG HttpConnection: Releasing connection back to connection manager.
15/03/27 21:18:20 DEBUG header: >> "GET /spark_test/_search_shards HTTP/1.1[\r][\n]"
15/03/27 21:18:20 DEBUG HttpMethodBase: Adding Host request header
15/03/27 21:18:20 DEBUG header: >> "User-Agent: Jakarta Commons-HttpClient/3.1[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "Host: 192.168.122.235:9200[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "GET /spark_test/_search_shards HTTP/1.1[\r][\n]"
15/03/27 21:18:20 DEBUG HttpMethodBase: Adding Host request header
15/03/27 21:18:20 DEBUG header: >> "User-Agent: Jakarta Commons-HttpClient/3.1[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "Host: localhost:9200[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Type: application/json; charset=UTF-8[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Length: 731[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "[\r][\n]"
15/03/27 21:18:20 DEBUG content: << "{"nodes":{"LqqiMzLcQAuBISXvr7FspA":{"name":"Grotesk","transport_address":"inet[/192.168.122.235:9300]"}},"shards":[[{"state":"STARTED","primary":true,"node":"LqqiMzLcQAuBISXvr7FspA","relocating_node":null,"shard":0,"index":"spark_test"}],[{"state":"STARTED","primary":true,"node":"LqqiMzLcQAuBISXvr7FspA","relocating_node":null,"shard":1,"index":"spark_test"}],[{"state":"STARTED","primary":true,"node":"LqqiMzLcQAuBISXvr7FspA","relocating_node":null,"shard":2,"index":"spark_test"}],[{"state":"STARTED","primary":true,"node":"LqqiMzLcQAuBISXvr7FspA","relocating_node":null,"shard":3,"index":"spark_test"}],[{"state":"STARTED","primary":true,"node":"LqqiMzLcQAuBISXvr7FspA","relocating_node":null,"shard":4,"index":"spark_test"}]]}"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Type: application/json; charset=UTF-8[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Length: 731[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "[\r][\n]"
15/03/27 21:18:20 DEBUG content: << "{"nodes":{"LqqiMzLcQAuBISXvr7FspA":{"name":"Grotesk","transport_address":"inet[/192.168.122.235:9300]"}},"shards":[[{"state":"STARTED","primary":true,"node":"LqqiMzLcQAuBISXvr7FspA","relocating_node":null,"shard":0,"index":"spark_test"}],[{"state":"STARTED","primary":true,"node":"LqqiMzLcQAuBISXvr7FspA","relocating_node":null,"shard":1,"index":"spark_test"}],[{"state":"STARTED","primary":true,"node":"LqqiMzLcQAuBISXvr7FspA","relocating_node":null,"shard":2,"index":"spark_test"}],[{"state":"STARTED","primary":true,"node":"LqqiMzLcQAuBISXvr7FspA","relocating_node":null,"shard":3,"index":"spark_test"}],[{"state":"STARTED","primary":true,"node":"LqqiMzLcQAuBISXvr7FspA","relocating_node":null,"shard":4,"index":"spark_test"}]]}"
15/03/27 21:18:20 DEBUG HttpMethodBase: Resorting to protocol version default close connection policy
15/03/27 21:18:20 DEBUG HttpMethodBase: Resorting to protocol version default close connection policy
15/03/27 21:18:20 DEBUG HttpMethodBase: Should NOT close connection, using HTTP/1.1
15/03/27 21:18:20 DEBUG HttpMethodBase: Should NOT close connection, using HTTP/1.1
15/03/27 21:18:20 DEBUG HttpConnection: Releasing connection back to connection manager.
15/03/27 21:18:20 DEBUG HttpConnection: Releasing connection back to connection manager.
15/03/27 21:18:20 DEBUG header: >> "GET /_nodes/http HTTP/1.1[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "GET /_nodes/http HTTP/1.1[\r][\n]"
15/03/27 21:18:20 DEBUG HttpMethodBase: Adding Host request header
15/03/27 21:18:20 DEBUG HttpMethodBase: Adding Host request header
15/03/27 21:18:20 DEBUG header: >> "User-Agent: Jakarta Commons-HttpClient/3.1[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "User-Agent: Jakarta Commons-HttpClient/3.1[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "Host: 192.168.122.235:9200[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "Host: localhost:9200[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Type: application/json; charset=UTF-8[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Length: 408[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Type: application/json; charset=UTF-8[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Length: 408[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "[\r][\n]"
15/03/27 21:18:20 DEBUG content: << "{"cluster_name":"elasticsearch","nodes":{"LqqiMzLcQAuBISXvr7FspA":{"name":"Grotesk","transport_address":"inet[/192.168.122.235:9300]","host":"localhost.localdomain","ip":"127.0.0.1","version":"1.4.0","build":"bc94bd8","http_address":"inet[/192.168.122.235:9200]","http":{"bound_address":"inet[/0:0:0:0:0:0:0:0:9200]","publish_address":"inet[/192.168.122.235:9200]","max_content_length_in_bytes":104857600}}}}"
15/03/27 21:18:20 DEBUG header: << "[\r][\n]"
15/03/27 21:18:20 DEBUG content: << "{"cluster_name":"elasticsearch","nodes":{"LqqiMzLcQAuBISXvr7FspA":{"name":"Grotesk","transport_address":"inet[/192.168.122.235:9300]","host":"localhost.localdomain","ip":"127.0.0.1","version":"1.4.0","build":"bc94bd8","http_address":"inet[/192.168.122.235:9200]","http":{"bound_address":"inet[/0:0:0:0:0:0:0:0:9200]","publish_address":"inet[/192.168.122.235:9200]","max_content_length_in_bytes":104857600}}}}"
15/03/27 21:18:20 DEBUG RestRepository: Closing repository and connection to Elasticsearch ...
15/03/27 21:18:20 DEBUG RestRepository: Closing repository and connection to Elasticsearch ...
15/03/27 21:18:20 DEBUG RestRepository: Sending batch of [0] bytes/[0] entries
15/03/27 21:18:20 DEBUG RestRepository: Sending batch of [0] bytes/[0] entries
15/03/27 21:18:20 DEBUG NetworkClient: Opening (pinned) network client to 192.168.122.235:9200
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.method.retry-handler = org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport$1@74c9fc32
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.connection-manager.timeout = 60000
15/03/27 21:18:20 DEBUG NetworkClient: Opening (pinned) network client to 192.168.122.235:9200
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.socket.timeout = 60000
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.method.retry-handler = org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport$1@78cc7284
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.connection-manager.timeout = 60000
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.tcp.nodelay = true
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.socket.timeout = 60000
15/03/27 21:18:20 DEBUG EsRDDWriter: Partition writer instance [3] assigned to primary shard [3] at address [192.168.122.235:9200]
15/03/27 21:18:20 DEBUG DefaultHttpParams: Set parameter http.tcp.nodelay = true
15/03/27 21:18:20 DEBUG EsRDDWriter: Partition writer instance [1] assigned to primary shard [1] at address [192.168.122.235:9200]
15/03/27 21:18:20 DEBUG DFSClient: Connecting to datanode 127.0.0.1:50010
15/03/27 21:18:20 DEBUG RestRepository: Closing repository and connection to Elasticsearch ...
15/03/27 21:18:20 DEBUG RestRepository: Sending batch of [0] bytes/[0] entries
15/03/27 21:18:20 INFO Executor: Finished task 1.0 in stage 1.0 (TID 3). 1719 bytes result sent to driver
15/03/27 21:18:20 DEBUG LocalActor: [actor] received message StatusUpdate(3,FINISHED,java.nio.HeapByteBuffer[pos=0 lim=1719 cap=1719]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:20 DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_1, runningTasks: 1
15/03/27 21:18:20 DEBUG LocalActor: [actor] handled message (0.801022 ms) StatusUpdate(3,FINISHED,java.nio.HeapByteBuffer[pos=525 lim=1719 cap=1719]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:20 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 432 ms on localhost (1/2)
15/03/27 21:18:20 DEBUG RestRepository: Closing repository and connection to Elasticsearch ...
15/03/27 21:18:20 DEBUG RestRepository: Sending batch of [23] bytes/[1] entries
15/03/27 21:18:20 DEBUG HttpConnection: Open connection to 192.168.122.235:9200
15/03/27 21:18:20 DEBUG header: >> "PUT /spark_test/test/_bulk HTTP/1.1[\r][\n]"
15/03/27 21:18:20 DEBUG HttpMethodBase: Adding Host request header
15/03/27 21:18:20 DEBUG header: >> "User-Agent: Jakarta Commons-HttpClient/3.1[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "Host: 192.168.122.235:9200[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "Content-Length: 23[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "Content-Type: application/json; charset=UTF-8[\r][\n]"
15/03/27 21:18:20 DEBUG header: >> "[\r][\n]"
15/03/27 21:18:20 DEBUG content: >> "{"index":{}}[\n]"
15/03/27 21:18:20 DEBUG content: >> "["value"][\n]"
15/03/27 21:18:20 DEBUG EntityEnclosingMethod: Request body sent
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "HTTP/1.1 200 OK[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Type: application/json; charset=UTF-8[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "Content-Length: 376[\r][\n]"
15/03/27 21:18:20 DEBUG header: << "[\r][\n]"
15/03/27 21:18:20 DEBUG content: << "{"took":2,"errors":true,"items":[{"create":{"_index":"spark_test","_type":"test","_id":"AUxd9VHHw0xLzjueIqCk","status":400,"error":"MapperParsingException[failed to parse]; nested: ElasticsearchParseException[Failed to derive xcontent from (offset=13, length=9): [123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 34, 118, 97, 108, 117, 101, 34, 93, 10]]; "}}]}"
15/03/27 21:18:20 ERROR TaskContextImpl: Error in TaskCompletionListener
org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Found unrecoverable error [Bad Request(400) - Invalid JSON fragment received[["value"]][MapperParsingException[failed to parse]; nested: ElasticsearchParseException[Failed to derive xcontent from (offset=13, length=9): [123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 34, 118, 97, 108, 117, 101, 34, 93, 10]]; ]]; Bailing out..
at org.elasticsearch.hadoop.rest.RestClient.retryFailedEntries(RestClient.java:202)
at org.elasticsearch.hadoop.rest.RestClient.bulk(RestClient.java:166)
at org.elasticsearch.hadoop.rest.RestRepository.tryFlush(RestRepository.java:209)
at org.elasticsearch.hadoop.rest.RestRepository.flush(RestRepository.java:232)
at org.elasticsearch.hadoop.rest.RestRepository.close(RestRepository.java:245)
at org.elasticsearch.hadoop.rest.RestService$PartitionWriter.close(RestService.java:129)
at org.elasticsearch.spark.rdd.EsRDDWriter$$anonfun$write$1.apply$mcV$sp(EsRDDWriter.scala:40)
at org.apache.spark.TaskContextImpl$$anon$2.onTaskCompletion(TaskContextImpl.scala:57)
at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:68)
at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:66)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:58)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/03/27 21:18:20 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 2)
org.apache.spark.util.TaskCompletionListenerException: Found unrecoverable error [Bad Request(400) - Invalid JSON fragment received[["value"]][MapperParsingException[failed to parse]; nested: ElasticsearchParseException[Failed to derive xcontent from (offset=13, length=9): [123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 34, 118, 97, 108, 117, 101, 34, 93, 10]]; ]]; Bailing out..
at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:76)
at org.apache.spark.scheduler.Task.run(Task.scala:58)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/03/27 21:18:20 DEBUG LocalActor: [actor] received message StatusUpdate(2,FAILED,java.nio.HeapByteBuffer[pos=0 lim=3602 cap=3602]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:20 DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_1, runningTasks: 0
15/03/27 21:18:20 DEBUG LocalActor: [actor] handled message (2.565788 ms) StatusUpdate(2,FAILED,java.nio.HeapByteBuffer[pos=0 lim=3602 cap=3602]) from Actor[akka://sparkDriver/deadLetters]
15/03/27 21:18:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 2, localhost): org.apache.spark.util.TaskCompletionListenerException: Found unrecoverable error [Bad Request(400) - Invalid JSON fragment received[["value"]][MapperParsingException[failed to parse]; nested: ElasticsearchParseException[Failed to derive xcontent from (offset=13, length=9): [123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 34, 118, 97, 108, 117, 101, 34, 93, 10]]; ]]; Bailing out..
at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:76)
at org.apache.spark.scheduler.Task.run(Task.scala:58)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/03/27 21:18:20 ERROR TaskSetManager: Task 0 in stage 1.0 failed 1 times; aborting job
15/03/27 21:18:20 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
15/03/27 21:18:20 INFO TaskSchedulerImpl: Cancelling stage 1
15/03/27 21:18:20 INFO DAGScheduler: Job 1 failed: runJob at EsSpark.scala:51, took 0.542840 s
15/03/27 21:18:20 DEBUG DAGScheduler: Removing running stage 1
15/03/27 21:18:20 DEBUG DAGScheduler: After removal of stage 1, remaining stages = 0
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 2, localhost): org.apache.spark.util.TaskCompletionListenerException: Found unrecoverable error [Bad Request(400) - Invalid JSON fragment received[["value"]][MapperParsingException[failed to parse]; nested: ElasticsearchParseException[Failed to derive xcontent from (offset=13, length=9): [123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 34, 118, 97, 108, 117, 101, 34, 93, 10]]; ]]; Bailing out..
at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:76)
at org.apache.spark.scheduler.Task.run(Task.scala:58)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:696)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1420)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundReceive(DAGScheduler.scala:1375)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
scala>
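
Note on the failure above: the bulk payload logged at 21:18:20 ({"index":{}} followed by ["value"]) shows that the document handed to Elasticsearch was a bare JSON array, and Elasticsearch rejects it with MapperParsingException because each document in a _bulk request must be a JSON object. Below is a minimal, illustrative sketch (not taken from the original session) of how the same write could be expressed with the elasticsearch-spark 2.1 API so that each element serializes as an object; the resource name spark_test/test matches the log, while the RDD names docs and json are hypothetical:

    // assumes spark-shell with the elasticsearch-hadoop jar on the classpath, as in the log
    import org.elasticsearch.spark._   // adds saveToEs / saveJsonToEs to RDDs

    // Scala Maps are serialized as JSON objects by the connector:
    val docs = sc.makeRDD(Seq(Map("field" -> "value"), Map("field" -> "other")))
    docs.saveToEs("spark_test/test")

    // If the RDD already holds complete JSON *objects* as strings,
    // saveJsonToEs passes them through verbatim instead of re-serializing:
    val json = sc.makeRDD(Seq("""{"field":"value"}"""))
    json.saveJsonToEs("spark_test/test")

The SchemaRDD.compute frame in the earlier stack trace suggests the failing job was writing rows derived from a JSON file; converting each row to a Map of field name to value (or using saveJsonToEs on the original JSON strings) is one way to avoid the array-shaped documents seen in the failing bulk request.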