Skip to content

Instantly share code, notes, and snippets.

@purukaushik
Last active September 22, 2016 02:00
Show Gist options
  • Save purukaushik/78e3e4b9178dc87d7c13cae3b5f500d1 to your computer and use it in GitHub Desktop.
Save purukaushik/78e3e4b9178dc87d7c13cae3b5f500d1 to your computer and use it in GitHub Desktop.
Hadoop-Spark setup on standalone ubuntu
  1. Download Hadoop and install.

  2. Setup $HADOOP_HOME

  3. Install openssh-server and run

    sudo /etc/init.d/ssh restart
    

    Check if ssh is running on port 22 by:

    netstat -tupln | grep 22
    
  4. Edit core-site.xml, hdfs-site.xml to add the following

    $HADOOP_HOME/etc/hadoop/core-site-xml
        <configuration>
            <property>
                <name>fs.defaultFS</name>
                <value>hdfs://localhost:9000</value>
            </property>
        </configuration>
    
    $HADOOP_HOME/etc/hadoop/hdfs-site.xml
        <configuration>
            <property>
                <name>dfs.replication</name>
                <value>1</value>
            </property>
        </configuration>
    
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment