@bakkujp
Forked from viecode09/Hadoop_install_osx.md
Created May 6, 2021 04:16
This is how to install Hadoop on macOS.

STEP 1: Install Homebrew. The current install command is published at https://brew.sh

$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

STEP 2: Install Hadoop

$ brew search hadoop
$ brew install hadoop

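Homebrew installs Hadoop under /usr/local/Cellar/hadoop on Intel Macs (/opt/homebrew/Cellar/hadoop on Apple Silicon). The paths in the remaining steps assume /usr/local.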
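The version directory (2.6.0 in this guide) depends on what Homebrew actually installed. Here is a minimal sketch for finding the configuration directory without hard-coding the version. `hadoop_conf_dir` is a hypothetical helper name, and the layout `<cellar>/<version>/libexec/etc/hadoop` is assumed from the paths used in this guide.

```shell
# Hypothetical helper: print the config directory of the newest Hadoop
# version found under a Homebrew Cellar root (default: the Intel location).
hadoop_conf_dir() {
  cellar="${1:-/usr/local/Cellar/hadoop}"
  # Keep only entries that start with a digit (skips e.g. an "hdfs" data
  # directory), then sort numerically on major.minor.patch and take the last.
  version="$(ls "$cellar" 2>/dev/null | grep -E '^[0-9]' \
    | sort -t. -k1,1n -k2,2n -k3,3n | tail -n 1)"
  [ -n "$version" ] || return 1
  printf '%s\n' "$cellar/$version/libexec/etc/hadoop"
}
```

On Apple Silicon, pass the other Cellar root, for example `hadoop_conf_dir /opt/homebrew/Cellar/hadoop`. The function returns a non-zero status when no version directory is found.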

STEP 3: Configure Hadoop

Edit hadoop-env.sh, located at /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/hadoop-env.sh, where 2.6.0 is the installed Hadoop version. Change the line

export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

to

export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="

Edit core-site.xml, located at /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/core-site.xml, and add the following properties inside its <configuration> element:

<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
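The two properties above only take effect inside a `<configuration>` element. As a sketch of what the complete file could look like, the hypothetical helper `write_core_site` below writes a minimal core-site.xml to a path you pass in. It overwrites that file, so point it at a scratch location first if the existing file has other settings.

```shell
# Hypothetical helper: write a minimal core-site.xml to the path given as $1.
# Property names and values are the ones from this guide; the target file
# is overwritten.
write_core_site() {
  target="${1:?usage: write_core_site <path-to-core-site.xml>}"
  cat > "$target" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
}
```

For example, `write_core_site /tmp/core-site.xml` produces a file you can compare against the installed one before copying anything over.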

Edit mapred-site.xml, located at /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/mapred-site.xml. The file is blank by default; add the following configuration:

<configuration>
 <property>
  <name>mapred.job.tracker</name>
  <value>localhost:9010</value>
 </property>
</configuration>

Edit hdfs-site.xml, located at /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/hdfs-site.xml, and add the following. A single-node setup has only one DataNode, so the replication factor is set to 1:

<configuration>
 <property>
  <name>dfs.replication</name>
  <value>1</value>
 </property>
</configuration>

To save typing, add the following aliases to ~/.profile. The file might not exist by default; create it if needed.

alias hstart="/usr/local/Cellar/hadoop/2.6.0/sbin/start-dfs.sh;/usr/local/Cellar/hadoop/2.6.0/sbin/start-yarn.sh"
alias hstop="/usr/local/Cellar/hadoop/2.6.0/sbin/stop-yarn.sh;/usr/local/Cellar/hadoop/2.6.0/sbin/stop-dfs.sh"

Then source the file:

$ source ~/.profile
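As an alternative to the two aliases, the same commands can be written as shell functions that read the install location from one variable, so a Hadoop upgrade means changing a single line. This is only a sketch: `HADOOP_KEG` is a hypothetical variable name introduced here, and unlike the aliases, `hstart` skips YARN when the HDFS start script fails. Use either the aliases or the functions, not both.

```shell
# HADOOP_KEG is a hypothetical variable name used only by this sketch;
# it defaults to the versioned install directory from this guide.
HADOOP_KEG="${HADOOP_KEG:-/usr/local/Cellar/hadoop/2.6.0}"

# Start HDFS first, then YARN; YARN is skipped if HDFS fails to start.
hstart() {
  "$HADOOP_KEG/sbin/start-dfs.sh" && "$HADOOP_KEG/sbin/start-yarn.sh"
}

# Stop in the reverse order: YARN first, then HDFS.
hstop() {
  "$HADOOP_KEG/sbin/stop-yarn.sh"
  "$HADOOP_KEG/sbin/stop-dfs.sh"
}
```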

Before running Hadoop for the first time, format HDFS:

$ hdfs namenode -format
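Formatting an existing NameNode discards its file system metadata, so it can help to check first whether a format has already happened. The sketch below assumes the default NameNode directory, which resolves to `<hadoop.tmp.dir>/dfs/name`, and looks for the `current/VERSION` file that formatting creates. `hdfs_is_formatted` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the NameNode under the given hadoop.tmp.dir
# already looks formatted. Assumes the default dfs.namenode.name.dir,
# i.e. <hadoop.tmp.dir>/dfs/name.
hdfs_is_formatted() {
  tmp_dir="${1:-/usr/local/Cellar/hadoop/hdfs/tmp}"
  [ -f "$tmp_dir/dfs/name/current/VERSION" ]
}

# Report what to do next; this sketch never runs the format itself.
if hdfs_is_formatted; then
  echo "HDFS already formatted; skip 'hdfs namenode -format'"
else
  echo "HDFS not formatted yet; run: hdfs namenode -format"
fi
```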

STEP 4: Verify that SSH to localhost works

The Hadoop start scripts use SSH to launch the daemons, even on a single machine. Check for the files ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub. If they don’t exist, generate the keys using the command below:

$ ssh-keygen -t rsa

Enable Remote Login: open “System Preferences” -> “Sharing” and check “Remote Login”.

Authorize SSH keys: to allow your system to accept the login, make it aware of the key that will be used:

$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
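Running the `cat ... >>` command more than once appends the same key again each time. A small sketch that adds the key only when it is missing and tightens the file permissions, which sshd commonly requires before it will trust the file. `authorize_key` is a hypothetical helper name, and the default paths are the ones from this guide.

```shell
# Hypothetical helper: append a public key to authorized_keys only if that
# exact line is not already present, then restrict the file to its owner.
authorize_key() {
  pub_key="${1:-$HOME/.ssh/id_rsa.pub}"
  auth_file="${2:-$HOME/.ssh/authorized_keys}"
  [ -f "$pub_key" ] || { echo "no public key at $pub_key" >&2; return 1; }
  touch "$auth_file"
  # -x matches whole lines, -F treats the key as a fixed string.
  grep -qxF -- "$(cat "$pub_key")" "$auth_file" || cat "$pub_key" >> "$auth_file"
  chmod 600 "$auth_file"
}
```

Calling `authorize_key` with no arguments uses ~/.ssh/id_rsa.pub and ~/.ssh/authorized_keys.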

Test login.

$ ssh localhost
Last login: Fri Mar 6 20:30:53 2015
$ exit

STEP 5: Run Hadoop

$ hstart

and stop it using

$ hstop
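While Hadoop is running, the JDK's `jps` tool lists the Java processes on the machine, which gives a quick way to see whether every daemon came up. The sketch below reads a `jps`-style listing on standard input and prints the daemons that are absent. `missing_daemons` is a hypothetical helper, and the five daemon names are the ones a single-node HDFS plus YARN setup is assumed to start.

```shell
# Hypothetical helper: read "PID Name" lines (the format jps prints) from
# stdin and print each expected Hadoop daemon that is not in the listing.
missing_daemons() {
  listing="$(cat)"
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare the second field exactly so that SecondaryNameNode does not
    # count as NameNode.
    printf '%s\n' "$listing" \
      | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' \
      || printf '%s\n' "$daemon"
  done
}
```

Run it as `jps | missing_daemons` after `hstart`; no output means all five daemons were found.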
