Skip to content

Instantly share code, notes, and snippets.

@abdullahainun
Forked from pratos/zeppelin_ubuntu.md
Created July 19, 2018 22:44
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save abdullahainun/1a3469dbee1c5c97c5efb4286d817c23 to your computer and use it in GitHub Desktop.
Save abdullahainun/1a3469dbee1c5c97c5efb4286d817c23 to your computer and use it in GitHub Desktop.
To Install Zeppelin [Scala and Spark] in Ubuntu 16.04LTS

Install Zeppelin in Ubuntu systems

  • First install Java, Scala and Spark in Ubuntu

    • Install Java

      sudo apt-add-repository ppa:webupd8team/java
      sudo apt-get update
      sudo apt-get install oracle-java8-installer
      
      java -version
      
    • Install Scala

      wget http://downloads.lightbend.com/scala/2.12.0/scala-2.12.0.tgz
      sudo mkdir /usr/local/src/scala
      tar -xvf scala-2.12.0.tgz -C /usr/local/src/scala/
      nano .bashrc
      export SCALA_HOME=/usr/local/src/scala/scala-2.12.0
      export PATH=$SCALA_HOME/bin:$PATH
      source .bashrc
      
      scala -version
      
    • Install Git

     sudo apt-get install git
    
    • Install sbt wget https://bintray.com/artifact/download/sbt/debian/sbt-0.13.6.deb sudo dpkg -i sbt-0.13.6.deb sudo apt-get update sudo apt-get install sbt

    • To Install Spark

      # Follow the link: https://www.santoshsrinivas.com/installing-apache-spark-on-ubuntu-16-04/
      wget http://d3kbcqa49mib13.cloudfront.net/spark-2.0.2-bin-hadoop2.7.tgz
      tar -xvf spark-2.0.2-bin-hadoop2.7.tgz
      mv spark-2.0.2-bin-hadoop2.7/ spark
      
      cd conf/
      cp spark-env.sh.template spark-env.sh
      nano spark-env.sh
      
  • Add the following lines to spark-env.sh

JAVA_HOME=/usr/lib/jvm/java-8-oracle  
SPARK_WORKER_MEMORY=4g
PYSPARK_PYTHON=/home/<username>/anaconda3/bin/python

source spark-env.sh
  • Run pyspark
username@machine:~$ pyspark
Python 3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul  2 2016, 17:53:06)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/12/06 12:05:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/12/06 12:05:55 WARN Utils: Your hostname, username resolves to a loopback address: xxx.x.x.x; using xxx.x.x.xx instead (on interface enp0s31f6)
16/12/06 12:05:55 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.0.2
      /_/

Using Python version 3.5.2 (default, Jul  2 2016 17:53:06)
SparkSession available as 'spark'.
>>> exit()
  • To install Zeppelin notebook
sudo apt-get update
sudo apt-get install npm
sudo apt install nodejs-legacy
sudo apt-get install libfontconfig
  • Install maven
wget http://www-eu.apache.org/dist/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
sudo tar -zxf apache-maven-3.3.9-bin.tar.gz -C /usr/local/
sudo ln -s /usr/local/apache-maven-3.3.9/bin/mvn /usr/local/bin/mvn
  • Check for the versions
node --version
mvn --version
  • Clone zeppelin repo
git clone https://github.com/apache/zeppelin.git
  • Add the below lines to .bashrc
export M2_HOME=/usr/local/apache-maven-3.3.9
export PATH=${M2_HOME}/bin:${PATH}

source .bashrc
  • Install 'bower'
sudo npm install -g bower
  • Maven Build for running on local
mvn clean install -DskipTests

# The Zeppelin: web-application [FAILURE] would be there
# Do the below steps to rectify it

cd zeppelin/zeppelin-web

bower install

# A lot of packages would be shown on the screen
# At the end it would show the following message::
# Answer? 
# Type --- !2 (Select the option which has "... needed for zeppelin-web"
# Above persists the angularjs package for our maven build
# Source: https://stackoverflow.com/questions/35014273/failed-to-run-task-bower-allow-root-install-failed/37527480#37527480

# Run
mvn clean install -DskipTests

# We should get the below message for successful build::
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ zeppelin-distribution ---
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Zeppelin ........................................... SUCCESS [  2.173 s]
[INFO] Zeppelin: Interpreter .............................. SUCCESS [  5.891 s]
[INFO] Zeppelin: Zengine .................................. SUCCESS [  3.633 s]
[INFO] Zeppelin: Display system apis ...................... SUCCESS [  7.594 s]
[INFO] Zeppelin: Spark dependencies ....................... SUCCESS [ 20.124 s]
[INFO] Zeppelin: Spark .................................... SUCCESS [ 10.737 s]
[INFO] Zeppelin: Markdown interpreter ..................... SUCCESS [  0.249 s]
[INFO] Zeppelin: Angular interpreter ...................... SUCCESS [  0.170 s]
[INFO] Zeppelin: Shell interpreter ........................ SUCCESS [  0.270 s]
[INFO] Zeppelin: Livy interpreter ......................... SUCCESS [  0.429 s]
[INFO] Zeppelin: HBase interpreter ........................ SUCCESS [  1.928 s]
[INFO] Zeppelin: PostgreSQL interpreter ................... SUCCESS [  0.268 s]
[INFO] Zeppelin: JDBC interpreter ......................... SUCCESS [  0.277 s]
[INFO] Zeppelin: File System Interpreters ................. SUCCESS [  0.445 s]
[INFO] Zeppelin: Flink .................................... SUCCESS [  3.841 s]
[INFO] Zeppelin: Apache Ignite interpreter ................ SUCCESS [  0.496 s]
[INFO] Zeppelin: Kylin interpreter ........................ SUCCESS [  0.229 s]
[INFO] Zeppelin: Python interpreter ....................... SUCCESS [  0.249 s]
[INFO] Zeppelin: Lens interpreter ......................... SUCCESS [  1.866 s]
[INFO] Zeppelin: Apache Cassandra interpreter ............. SUCCESS [ 20.993 s]
[INFO] Zeppelin: Elasticsearch interpreter ................ SUCCESS [  1.514 s]
[INFO] Zeppelin: BigQuery interpreter ..................... SUCCESS [  0.410 s]
[INFO] Zeppelin: Alluxio interpreter ...................... SUCCESS [  1.497 s]
[INFO] Zeppelin: web Application .......................... SUCCESS [01:07 min]
[INFO] Zeppelin: Server ................................... SUCCESS [ 35.707 s]
[INFO] Zeppelin: Packaging distribution ................... SUCCESS [  1.318 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 03:10 min
[INFO] Finished at: 2016-12-06T16:47:24+05:30
[INFO] Final Memory: 225M/3228M
[INFO] ------------------------------------------------------------------------
  • Start the daemon.sh
$ bin/zeppelin-daemon.sh start
Zeppelin start                                             [  OK  ]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment