@apple-corps
Last active April 26, 2016 00:48
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>ad-export</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <spark.version>1.2.0</spark.version>
  </properties>
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka_2.10</artifactId>
        <version>${spark.version}</version>
      </dependency>
      <dependency>
        <groupId>com.cloudera</groupId>
        <artifactId>spark-hbase</artifactId>
        <version>0.0.2-clabs</version>
      </dependency>
    </dependencies>
  </dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka_2.10</artifactId>
    </dependency>
    <dependency>
      <groupId>com.cloudera</groupId>
      <artifactId>spark-hbase</artifactId>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>2.4.3</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <transformers>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                  <manifestEntries>
                    <Main-Class>com.example.KafkaToAD</Main-Class>
                    <Build-Number>1</Build-Number>
                  </manifestEntries>
                </transformer>
              </transformers>
              <filters>
                <filter>
                  <artifact>*:*</artifact>
                  <excludes>
                    <exclude>META-INF/*.SF</exclude>
                    <exclude>META-INF/*.DSA</exclude>
                    <exclude>META-INF/*.RSA</exclude>
                  </excludes>
                </filter>
              </filters>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

apple-corps commented Apr 22, 2016

I've changed my spark-submit script to pass the jars as the author suggests in the README. When I launch the application, it runs for 10-20 seconds and then completes successfully according to YARN. However, I cannot find the debugging message that the first line of the main method should print to STDOUT.

```bash
#!/usr/bin/env bash

export SPARK_CONF_DIR=/home/colin.williams/spark
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HADOOP_CLASSPATH=/etc/hbase/conf:/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/*:/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/hbase-protocol-0.98.6-cdh5.3.0.jar

# Each option line ends with a backslash so the shell passes everything to a single spark-submit call.
spark-submit \
  --class com.example.ad.KafkaToAD \
  --master yarn \
  --deploy-mode client \
  --num-executors 2 \
  --driver-memory 512m \
  --executor-memory 4096m \
  --executor-cores 3 \
  --queue search \
  --jars /home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/zookeeper/zookeeper-3.4.5-cdh5.3.0.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/guava-12.0.1.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/protobuf-java-2.5.0.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/hbase-protocol.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/hbase-client.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/hbase-common.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/hbase-hadoop2-compat.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/hbase-hadoop-compat.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/hbase-server.jar,/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/htrace-core.jar \
  --conf spark.yarn.submit.waitAppCompletion=true \
  --conf spark.app.name="Kafka To AD" \
  --conf spark.streaming.receiver.maxRate=1000 \
  --conf spark.streaming.concurrentJobs=2 \
  --conf spark.eventLog.dir="hdfs:///user/spark/applicationHistory" \
  --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.overwrite=true \
  --conf spark.yarn.historyServer.address="http://utl03.comp.local:18080/" \
  --conf spark.yarn.dist.files=hdfs:///user/colin.williams/ad-export-1.0-SNAPSHOT-uber.jar \
  --driver-java-options -Dspark.executor.extraClassPath=/home/colin.williams/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/* \
  /home/colin.williams/ad-export-1.0-SNAPSHOT-uber.jar "hello world"
```
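
With `--deploy-mode client`, the driver's `main` method runs inside the `spark-submit` process on the launching machine, so whatever it writes to STDOUT should appear in that terminal rather than in the YARN container logs. Below is a minimal sketch of a first-line debug message that is hard to miss, assuming a Java entry point. The class name `DebugEntryPoint` and the helper `startupMessage` are hypothetical and not taken from the project.

```java
import java.util.Arrays;

// Hypothetical stand-in for the real entry point; not the project's actual class.
public class DebugEntryPoint {

    // Builds the startup line so it can be checked without running Spark.
    static String startupMessage(String[] args) {
        return "DEBUG main() entered, args=" + Arrays.toString(args);
    }

    public static void main(String[] args) {
        String message = startupMessage(args);
        // Write to both streams and flush, in case one of them is redirected or buffered.
        System.out.println(message);
        System.out.flush();
        System.err.println(message);
        System.err.flush();
        // ... the actual job setup would follow here ...
    }
}
```

If neither line shows up in the terminal, it may be worth checking which class actually runs: the POM manifest names `com.example.KafkaToAD`, while the script passes `--class com.example.ad.KafkaToAD`.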
