Config to apply once Hadoop (2.7.0 as of now) is installed on OS X
Here are some thoughts on the config:
-
NameNode host = localhost means Hadoop will only be accessed from the same host. To access it from another host, change localhost to the actual host name. If the NameNode address in core-site.xml doesn't match what the client uses to connect, you'll get the dreaded "connection refused" error.
-
The system has plenty (16G) of RAM. It's better to overcommit the maximum allocation, so you might as well set it even higher, to something like 24G. Otherwise your YARN jobs can get stuck in wait states. YARN's memory accounting is based on what containers request, not on what they actually use.
-
To run many small, long-running YARN jobs (e.g. Samza tasks), specify the minimum allocation and make it low. Samza tasks are typically pretty small, but every Samza job ends up starting an ApplicationMaster (AM). If you don't lower the minimum allocation when running many Samza tasks, YARN rounds every container request up to the minimum, over-allocates memory, and stops launching more jobs while plenty of memory is still free.
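
For the NameNode address note above, here's a rough sketch of what the relevant core-site.xml entry might look like. fs.defaultFS is the standard Hadoop 2.x property for this, but the port (9000) is just an assumption for illustration; use whatever your NameNode actually listens on.

```xml
<!-- core-site.xml (fragment) -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <!-- Port 9000 is an assumed value. Replace localhost with the
         actual host name if clients connect from another machine;
         clients must use the same host name to connect. -->
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```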
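
For the two memory notes above, a yarn-site.xml fragment along these lines is one way to express them. The property names are standard YARN settings, but mapping "maximum allocation" onto both the NodeManager memory and the scheduler's per-container maximum is my reading of the note, and the 128 MB minimum is only an illustrative value.

```xml
<!-- yarn-site.xml (fragment) -->
<configuration>
  <property>
    <!-- Memory the NodeManager offers to containers.
         24576 MB = 24G, deliberately above the 16G of physical RAM. -->
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>24576</value>
  </property>
  <property>
    <!-- Largest single container the scheduler will grant. -->
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>24576</value>
  </property>
  <property>
    <!-- Smallest container size; requests are rounded up to this.
         128 is an assumed value for small Samza containers. -->
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>128</value>
  </property>
</configuration>
```

With a low minimum like this, a small Samza container or AM that asks for a few hundred MB gets charged roughly that much instead of a full 1024 MB (the stock default minimum).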