
[2018-03-02 15:49:18,256] ERROR .jobserver.JobManagerActor [] [akka://JobServer/user/jobManager-1f-9f09-200f964be651] - Exception from job f373c504-cb25-49f6-a78d-226340db7cf8:
org.apache.spark.SparkException: org.apache.spark.SparkException: Error getting partition metadata for 'energyEvent'. Does the topic exist?
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$checkErrors$1.apply(KafkaCluster.scala:385)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$checkErrors$1.apply(KafkaCluster.scala:385)
at scala.util.Either.fold(Either.scala:98)
at org.apache.spark.streaming.kafka.KafkaCluster$.checkErrors(KafkaCluster.scala:384)
at org.apache.spark.streaming.kafka.KafkaUtils$.getFromOffsets(KafkaUtils.scala:222)
at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:484)
at com.test.KafkaReceiver$.getKafkaStream(KafkaReceiver.scala:57)
at com.test.KafkaReceiver$.runJob(KafkaReceiver.scala:44)
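This exception is raised by the direct Kafka stream when partition metadata for the topic cannot be fetched, which most often means the topic does not exist on the brokers the stream points at. A quick way to check is Kafka's own CLI (a sketch; the ZooKeeper address and the partition/replication counts are illustrative assumptions, only the topic name `energyEvent` comes from the log):

```shell
# Assumes Kafka's CLI tools are on the PATH and ZooKeeper runs on localhost:2181
# (adjust host/port for your cluster). List topics to confirm 'energyEvent' exists:
kafka-topics.sh --zookeeper localhost:2181 --list

# If it is missing, create it (partition and replication counts are illustrative):
kafka-topics.sh --zookeeper localhost:2181 --create \
  --topic energyEvent --partitions 3 --replication-factor 2
```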

Spark - High availability

Components in play

As a reminder, here are the components involved in running an application:

  • The cluster:
    • Spark Master: coordinates the resources
    • Spark Workers: offer resources to run the applications
  • The application:
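With ZooKeeper-based recovery (as configured in spark-env.sh below), high availability on the application side comes from listing every master in the `--master` URL, so the driver can fail over when the active master dies. A sketch, where the master hostnames and jar name are placeholders and only the job class comes from the stack trace above:

```shell
# Start a master on each HA node first (sbin/start-master.sh), then submit
# against all of them; the driver registers with whichever master is active.
spark-submit \
  --master spark://master1:7077,master2:7077 \
  --class com.test.KafkaReceiver \
  my-app.jar
```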
bsikander / spark-env.sh
Created March 23, 2017 10:28
Spark-env.sh on MASTER
SPARK_MASTER_IP=<MASTER_IP>
SPARK_MASTER_PORT=7077
SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=<OUR_IPS> -Dspark.deploy.zookeeper.dir=/spark"
SPARK_WORKER_DIR=/<BASE>/spark/worker_dir
SPARK_LOCAL_DIRS=/<BASE>/spark/local_dirs
SPARK_CONF_DIR=/<BASE>/spark-master/config
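The `<OUR_IPS>` placeholder in `spark.deploy.zookeeper.url` takes a comma-separated list of ZooKeeper `host:port` pairs. A filled-in sketch with purely illustrative addresses (substitute your own hosts; the `<BASE>` paths above are left to your layout):

```shell
# Illustrative values only — one master entry per HA node, plus the ZK ensemble
SPARK_MASTER_IP=10.0.0.1
SPARK_MASTER_PORT=7077
SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
  -Dspark.deploy.zookeeper.url=10.0.0.2:2181,10.0.0.3:2181,10.0.0.4:2181 \
  -Dspark.deploy.zookeeper.dir=/spark"
```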
bsikander / spark-default.conf
Last active March 23, 2017 09:39
Default Configuration file for MASTER
spark.streaming.backpressure.enabled true
spark.eventLog.enabled true
#### configurable ####
spark.driver.memory 2g
spark.executor.memory 2g
bsikander / spark-default.conf
Last active March 23, 2017 09:39
Default configuration file for SLAVE
spark.streaming.backpressure.enabled true
spark.eventLog.enabled true
spark.worker.cleanup.enabled true
# check for outdated stuff every 30 minutes
spark.worker.cleanup.interval 1800
# remove stuff older than 7 days
spark.worker.cleanup.appDataTtl 604800
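Note that the `spark.worker.cleanup.*` properties configure the worker daemon itself; the standalone-mode documentation applies them through `SPARK_WORKER_OPTS` in spark-env.sh rather than through spark-defaults.conf. The same settings expressed that way (a sketch, values copied from above):

```shell
# spark-env.sh on each worker — cleanup settings as daemon JVM options:
# scan every 30 minutes, remove application directories older than 7 days
SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true \
  -Dspark.worker.cleanup.interval=1800 \
  -Dspark.worker.cleanup.appDataTtl=604800"
```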
bsikander / Worker crashing
Created March 23, 2017 09:11
Exception in thread "dispatcher-event-loop-0" java.lang.OutOfMemoryError: GC overhead limit exceeded (max heap: 1024 MB)
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2710)
at java.lang.Class.getDeclaredMethod(Class.java:2137)
at java.io.ObjectStreamClass.getInheritableMethod(ObjectStreamClass.java:1444)
at java.io.ObjectStreamClass.access$2200(ObjectStreamClass.java:72)
at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:508)
at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
at java.security.AccessController.doPrivileged(Native Method)
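A `GC overhead limit exceeded` error in the worker's dispatcher thread indicates the daemon's heap (1024 MB here) is exhausted. One common mitigation is to raise the daemon heap via `SPARK_DAEMON_MEMORY` in spark-env.sh; the 2g value below is illustrative, not a recommendation for this cluster:

```shell
# spark-env.sh — memory for the standalone master/worker daemons themselves
# (sets the daemon JVM's max heap; default is 1g, 2g here is illustrative)
SPARK_DAEMON_MEMORY=2g
```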