
Elasticsearch Cluster Installation & Configuration

A guide for installing and configuring an Elasticsearch cluster according to best practices. It contains a mix of setup steps and theory. Assuming Ubuntu Server is already installed and you have the server resources at your disposal (I used virtual machines), you can set up a complete, robust Elasticsearch cluster.

The information in this guide was assembled from the Elasticsearch documentation, notes from online training classes, and personal experience with the software.

What We're Building

A fully-functional Elasticsearch cluster consisting of nine total servers. There will be two client nodes, three master nodes and four data nodes. The theory section will provide insight into the node types and how to plan accordingly. The hands-on installation section will go through all the steps required to build it.

Theory

This section covers key concepts that are important for planning, installation and management of a healthy Elasticsearch cluster. Any elasticsearch.yml configuration settings mentioned here will be covered more thoroughly in the implementation section later in this guide.

Node Types

There are three types of nodes in Elasticsearch: client nodes, master nodes, and data nodes.

Client nodes are the gateway to the cluster. They handle all query requests and direct them to data nodes in the cluster. They're important, but they don't need a lot of server resources.

Master nodes are the brains of the whole operation. They are in charge of maintaining the cluster state and are the only nodes allowed to update the state. Without a master the cluster won't function.

Data nodes are the workhorses of the cluster, since they physically house all the shards and data. They do not typically receive query requests directly. They tend to have the beefiest resources.

Capacity Planning

Many factors will contribute to your particular capacity needs. This guide will cover some high-level concepts to keep in mind.

A good approach to start is to set up and load test a single node in your environment and establish some baseline metrics. Once you know how much a single node can handle, you can better predict the necessary number of nodes for your cluster.

There is an elasticsearch.yml setting called discovery.zen.minimum_master_nodes. This is how many master-eligible nodes must be visible before one can be elected master. A good formula to use is (number of master-eligible nodes / 2) + 1, rounding down. For example, if there are 10 master-eligible nodes, the quorum would be 6: six nodes must be online before a master is elected.

This setting must be configured carefully in order to protect against a "split brain" scenario (two masters at the same time). If there are ever two master nodes, each may believe it is in charge, and the resulting conflicting cluster states will cause all kinds of problems.

Elastic's recommendation is a minimum of 3 master-eligible nodes. In an outage, this would be a quorum of 2.
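
Applied to the three master-eligible nodes in this guide, the math works out to floor(3 / 2) + 1 = 2, which is exactly the value we will configure later:

discovery.zen.minimum_master_nodes: 2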

Remember, this example environment will contain:

  • 4 Data nodes
  • 3 Master nodes
  • 2 Client nodes (ideally, behind a virtual IP or load balancer)

Server Hardware

CPU

In general, it is best to prefer more cores over clock speed.

Server RAM

The Java Virtual Machine (JVM) has heap limitations to consider. 64 GB of RAM is the sweet spot for data nodes (and no more than 32 GB for master and client nodes). Performance will degrade if the Elasticsearch heap is set above 32 GB, because the JVM can no longer use compressed object pointers. On the data nodes, half of the RAM should be allocated to the Elasticsearch heap, leaving the other half available for Lucene to use via the file system cache.

Disks

This should be fairly obvious: get the fastest disks you possibly can (especially for the data nodes). It's safe to use RAID 0 for more speed even though it's not fault tolerant, because the cluster has multiple nodes. If a disk fails, replicas on the other Elasticsearch nodes pick up the slack.

Network-based storage will not perform as well as local storage on dedicated hardware, but it may be adequate for your needs.

Networking

1 Gigabit Ethernet will suffice. 10 Gigabit Ethernet will provide better throughput if shards are large. Never cluster across datacenters or geographically distant locations.

Virtual Machines and Docker

Virtualization makes it easy to add nodes, especially for a development environment. In fact, VMware was used to create VMs for all the nodes in this guide.

The usual caveats apply:

  • Make sure the hosts aren't too busy or virtual machine performance will suffer.
  • Data nodes may not perform as well as they would on dedicated hardware.
  • Spread nodes of the same role across multiple hosts, in case of host failure.

A production-ready cluster might look like this:

  • Data nodes: dedicated servers with 64 GB of RAM, a 4-core CPU, and 4 ✕ 1 TB SSDs in RAID 0 (optionally with the operating system on its own disk array)
  • Client and master nodes: virtual machines with 32 GB of RAM, 2 CPU cores, and 20 GB of disk space

Operating System and Elasticsearch Settings

As your cluster grows, it is no fun to modify the many configuration settings manually. This guide is focused on the nuts and bolts of building an Elasticsearch cluster, so we will tweak the configuration by hand. Configuration management (using a solution like Chef or Ansible) is great, but outside the scope of this guide.

Settings for all nodes in the cluster

These settings apply to all nodes, configured in /etc/elasticsearch/elasticsearch.yml.

Set the cluster name and give each node an appropriate name:

cluster.name: awesome_production_cluster
node.name: es-role-0x

You may notice that other guides change the default paths (path.data, path.logs and path.plugins) used by Elasticsearch. We do not need to worry about this because the Deb package we will use takes care of it automatically.

Configure the minimum number of master nodes:

discovery.zen.minimum_master_nodes: 2

Avoid rebalancing problems in situations like full cluster shutdown / start up:

gateway.recover_after_nodes: 8
gateway.expected_nodes: 9
gateway.recover_after_time: 3m

We have 9 nodes in the cluster, so we'll wait for 8 of them to be live before recovering. Recovery will start after 3 minutes even if not all expected nodes have joined by then.

Role-specific configuration

Master Node

node.master: true
node.data: false

Client Node

node.client: true
node.data: false

Setting node.client to true sets node.master to false by default.

Data Node

node.data: true
node.master: false
node.client: false

http.enabled: false

http.enabled is set to false because nothing should query data nodes directly! Query requests will be handled by the client nodes.
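
As a quick sanity check once the cluster is up, REST calls should target a client node; a data node with HTTP disabled no longer answers on port 9200:

# Works: the client node exposes the REST API
curl http://es-client-01:9200/_cluster/health?pretty

# Fails with connection refused: HTTP is disabled on the data node
curl http://es-data-01:9200/_cluster/health?pretty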

Java JVM

There is a pretty simple rule of thumb when it comes to JVM settings: don't mess with most of them! All we need to do is adjust the heap size based on how much RAM is available. The default heap is 1 GB.

Recommended heap for data nodes = 32 GB (64 GB RAM / 2)

Why?

  • 32 GB is enough room to work
  • Lucene needs the rest of the RAM
  • Heap sizes > 32 GB are inefficient

The easiest way to set heap size is to set up an environment variable with a value of 32g:

ES_HEAP_SIZE=32g

Set it to 16g on the other nodes (master and client).

Swap settings:

To minimize swapping (a swappiness of 0 tells the kernel to avoid swap except to prevent out-of-memory conditions), edit /etc/sysctl.conf and add the following line to it:

vm.swappiness=0

After rebooting the server, you can verify the swappiness setting with cat /proc/sys/vm/swappiness.
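
If you'd rather not wait for a reboot, the same setting can be applied immediately with sysctl (the /etc/sysctl.conf entry still makes it permanent):

sysctl -w vm.swappiness=0
cat /proc/sys/vm/swappiness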

Or, in elasticsearch.yml...

bootstrap.mlockall: true

The bootstrap.mlockall setting can be checked by querying the node:

curl http://localhost:9200/_nodes/process?pretty

Often, mlockall remains false even though it is configured to be true. This happens when the user running the process does not have permission to lock memory. Running the ulimit -l unlimited command before starting the Elasticsearch service grants the necessary privilege.
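
A minimal sketch of that sequence, run as root on a node (the grep just trims the JSON output down to the relevant field):

# Raise the lock-memory limit in this shell, then restart Elasticsearch
ulimit -l unlimited
service elasticsearch restart

# mlockall should now report true
curl -s http://localhost:9200/_nodes/process?pretty | grep mlockall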

MMap and File Descriptors

The file descriptors setting determines how many files a process is allowed to have open on the operating system. Elasticsearch and Lucene use many files, so this needs to be increased. The defaults can vary by OS, but on Ubuntu, the file descriptor limit defaults to 1024 and MMap to unlimited. We'll change the file descriptor value to 64000 and leave MMap at unlimited.

Step-By-Step Installation Guide

Now that you've made it through the theory, it is time to build the cluster!

Notes:

  • Software versions used: Ubuntu version 14.04.3 LTS and Elasticsearch version 2.0.2.
  • Check your OS version details with cat /etc/os-release
  • Run commands as root

The first node we'll set up is a client node, named es-client-01. Use this as the hostname when installing Ubuntu on the first server.

Java Installation

Elasticsearch requires Java. Use the following commands to install Java 8.

add-apt-repository ppa:webupd8team/java
apt-get update && apt-get install oracle-java8-installer

Verify your Java installation with java -version.

Elasticsearch Installation

Download Elasticsearch

wget https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/deb/elasticsearch/2.0.2/elasticsearch-2.0.2.deb

Install the Elasticsearch Deb package

dpkg -i elasticsearch-2.0.2.deb

Start the Elasticsearch Service

service elasticsearch start

Any time this guide mentions restarting the Elasticsearch service, run service elasticsearch restart.

Verify that Elasticsearch is running

curl http://localhost:9200

You should get back a JSON response with node and cluster names, version information and their tagline, "You Know, for Search":

{
  "name" : "(random Marvel character name)",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.0.2",
    "build_hash" : "6abf5d8e5b37c2735d8e2f110f9743c453f71e92",
    "build_timestamp" : "2015-12-16T12:49:58Z",
    "build_snapshot" : false,
    "lucene_version" : "5.2.1"
  },
  "tagline" : "You Know, for Search"
}

Elasticsearch Configuration

The primary Elasticsearch configuration file is /etc/elasticsearch/elasticsearch.yml (this location was determined by the Deb package). Edit this file with nano, or the editor of your choice.

nano /etc/elasticsearch/elasticsearch.yml

In the configuration file, change the values for cluster.name and node.name:

cluster.name: awesome_production_cluster

node.name: es-client-01

The first node will be configured as a client node in the cluster.

Right under node.name, add:

node.client: true
node.data: false

Restart the Elasticsearch service for these changes to go into effect.

Next, configure Elasticsearch to start automatically when the server boots:

update-rc.d elasticsearch defaults 95 10

For clarity, make sure the hostname of the server matches the name used in the elasticsearch.yml configuration.

nano /etc/hostname

Now modify system-level settings for this node.

Check the current file descriptor limit setting with ulimit -n. The default value is 1024. We'll increase this limit to 64,000 for all users and processes.

nano /etc/security/limits.conf

Before the end of the file, add:

*       soft      nofile    64000
*       hard      nofile    64000
root    soft      nofile    64000
root    hard      nofile    64000

Save changes and exit.

Next up...

nano /etc/pam.d/common-session

Look for the session required line and add another right below it:

session required            pam_limits.so

Save and exit.

Next...

nano /etc/pam.d/common-session-noninteractive

Follow the same steps as the previous file, adding a line right after session required:

session required            pam_limits.so

That's it for file descriptors! This change will not go into effect until the server is rebooted.
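
After the reboot, you can confirm the new limit took effect:

ulimit -n
# Expected output: 64000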

Check MMap settings with the ulimit -m command. On modern operating systems, this is usually set to a high enough or unlimited value. (We want unlimited.)

Add one final setting to the system: the ES_HEAP_SIZE environment variable! Remember, the formula to calculate the appropriate heap value is RAM / 2:

nano /etc/environment

There is one line in the file by default. On the second line, add ES_HEAP_SIZE="4g" (use the value appropriate to your server).
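
The finished file might look like this (the PATH line shown is Ubuntu 14.04's stock entry and may differ on your system; 4g assumes a small VM, per the RAM / 2 rule):

PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"
ES_HEAP_SIZE="4g"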

Save the changes and reboot the server.

If this is a VM, you can clone it to make all the other nodes that we'll use.

Following this guide, we'll need nine total:

es-client-01
es-client-02
es-master-01
es-master-02
es-master-03
es-data-01
es-data-02
es-data-03
es-data-04

On each server, update the hosts file (/etc/hosts) with the IP addresses and names of each node, so they can use hostnames to talk to each other on the network.

Hosts file entries look like this (use the appropriate IP addresses for your servers):

10.1.0.1    es-client-01
10.1.0.2    es-client-02
10.1.0.3    es-master-01
10.1.0.4    es-master-02
10.1.0.5    es-master-03
10.1.0.6    es-data-01
10.1.0.7    es-data-02
10.1.0.8    es-data-03
10.1.0.9    es-data-04

Next we'll configure the first master node.

On the es-master-01 node, edit /etc/elasticsearch/elasticsearch.yml:

node.name: es-master-01
node.master: true
node.data: false

At the end of the file, look for the network.host: line. Set it to the server's IP address so that Elasticsearch binds to the external interface and can be reached remotely.
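
Using the example addresses from the hosts file above, the entry for es-master-01 would be:

network.host: 10.1.0.3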

Set the unicast and multicast settings so the nodes can find each other (in the Discovery section):

discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["es-client-01", "es-client-02", "es-master-01", "es-master-02", "es-master-03", "es-data-01", "es-data-02", "es-data-03", "es-data-04"]

Save and exit. At the command line, restart the Elasticsearch service.

Back on the first client node es-client-01, edit /etc/elasticsearch/elasticsearch.yml.

Update the network.host setting with the external IP address for the server and add the same multicast and unicast settings as on es-master-01. Save and exit, then restart the Elasticsearch service.
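
Putting it together, the relevant lines on es-client-01 would look like this (IP address taken from the example hosts file):

network.host: 10.1.0.1
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["es-client-01", "es-client-02", "es-master-01", "es-master-02", "es-master-03", "es-data-01", "es-data-02", "es-data-03", "es-data-04"]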

At this point, the client and master should have found each other and we should have a working cluster.

We can verify this with a curl command:

curl http://es-client-01:9200/_cluster/stats?pretty

(Look in the nodes section of the response to see the node count for the various roles.)
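
An abridged, illustrative response (field names as in Elasticsearch 2.x; the values assume only es-client-01 and es-master-01 have joined so far):

{
  "nodes" : {
    "count" : {
      "total" : 2,
      "master_only" : 1,
      "data_only" : 0,
      "master_data" : 0,
      "client" : 1
    }
  }
}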

Plugins

Looking through raw JSON data can be difficult, and there are plugins that make it easier to view important information.

On the client node, navigate to the /usr/share/elasticsearch/bin directory. Install the head plugin with this command: ./plugin install mobz/elasticsearch-head

Restart Elasticsearch to enable the plugin.

In a browser, navigate to http://es-client-01:9200/_plugin/head. This will show a visual representation of the cluster.

Client nodes get a circle icon. The master node gets a star icon.

Data Node configuration

Log in to the first data node, es-data-01.

Edit the /etc/elasticsearch/elasticsearch.yml file.

Update the node settings:

node.name: es-data-01
node.master: false
node.data: true

Just like with the other nodes, update the network.host setting with the server's external IP address, and add the multicast and unicast settings.

Restart Elasticsearch.

Check the Head plugin UI to make sure the data node is included. It will appear with a filled-in black circle icon.

Update the settings on any remaining nodes, following the steps for each node type. The elected master will have a star icon, and the other master-eligible nodes will have a hollow circle.

Note: Only one client node might appear in the UI, but the other one is still part of the cluster. This can be verified by looking at the Info dropdown and choosing Cluster State.

At this point, the cluster installation is complete. Two client nodes, three master nodes and four data nodes have been provisioned and the head plugin is installed. 👍
