purukaushik/spark-hadoop-setup.md

## spark-hadoop-setup.md

      
Download and install Hadoop.


Set $HADOOP_HOME to the directory where Hadoop is installed.
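
A minimal sketch of setting the variable, for example in `~/.bashrc`. The path `/usr/local/hadoop` is only an assumption; substitute the directory where you actually unpacked Hadoop.

```shell
# Assumed install location; replace with the directory where Hadoop was unpacked.
export HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"

# Put the hadoop/hdfs commands (bin) and the start/stop scripts (sbin) on the PATH.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"

echo "HADOOP_HOME=$HADOOP_HOME"
```

The `${HADOOP_HOME:-...}` form keeps any value that is already set and only falls back to the assumed path when the variable is empty.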


Install openssh-server, then restart the SSH service:

    sudo /etc/init.d/ssh restart

Check that ssh is listening on port 22:

    netstat -tupln | grep 22
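
Because `grep 22` also matches lines such as port 2222 or a PID that happens to contain 22, a stricter check can compare the port field exactly. The helper below is a sketch; `listening_on_port` is a hypothetical name, and it assumes the usual Linux `netstat -tupln` layout where the fourth column is the local address.

```shell
# listening_on_port PORT: read `netstat -tupln` style output on stdin and
# succeed only if some LISTEN line has a local address ending in :PORT.
listening_on_port() {
    awk -v port="$1" '
        /LISTEN/ {
            # Column 4 is the local address, e.g. 0.0.0.0:22 or :::22.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Usage: `netstat -tupln 2>/dev/null | listening_on_port 22 && echo "ssh is listening"`.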


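Edit core-site.xml and hdfs-site.xml to add the following. fs.defaultFS sets the default filesystem URI (here, HDFS on localhost port 9000), and a dfs.replication of 1 keeps a single copy of each block, which suits a single-node setup.

$HADOOP_HOME/etc/hadoop/core-site.xml:
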
    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://localhost:9000</value>
        </property>
    </configuration>

$HADOOP_HOME/etc/hadoop/hdfs-site.xml:

    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
    </configuration>
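
If you would rather script this step than edit the files by hand, something like the following could work. It is only a sketch: `write_hadoop_conf` is a hypothetical helper name, and it overwrites any existing core-site.xml and hdfs-site.xml in the target directory, so back those up first if they already hold other properties.

```shell
# write_hadoop_conf DIR: write the two single-node config files into DIR.
# Warning: replaces any existing core-site.xml and hdfs-site.xml in DIR.
write_hadoop_conf() {
    conf_dir="$1"
    mkdir -p "$conf_dir" || return 1

    # Default filesystem: HDFS on localhost, port 9000.
    cat > "$conf_dir/core-site.xml" <<'EOF'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
EOF

    # One replica per block, since there is only one node.
    cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
EOF
}
```

Usage, assuming $HADOOP_HOME is already set: `write_hadoop_conf "$HADOOP_HOME/etc/hadoop"`.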