@davewongillies
Forked from timconradinc/gist:5654473
Last active December 17, 2015 22:39
ElasticSearch Quick Guide

ElasticSearch for Logstash Overview

There are two ways to send data to ElasticSearch from Logstash: the 'elasticsearch' output and the 'elasticsearch_http' output. In a nutshell, the 'elasticsearch' output is tightly coupled with your ElasticSearch cluster, and the 'elasticsearch_http' output isn't.

What does this mean? The 'elasticsearch' output always starts a local ElasticSearch node and tries to join it to your ElasticSearch cluster. The goal is to make Logstash aware of your cluster: if a node goes down, Logstash can simply re-route the data to a functioning node. The only difference when you set 'embedded' => false in your Logstash config is that the Logstash node gets set to 'data = false' in the ElasticSearch configuration, so it joins the cluster without storing data locally.

The 'elasticsearch_http' output sends data to port 9200 using the ElasticSearch HTTP API, with events encoded as JSON. This makes it cross-version compatible, so you can run ElasticSearch 0.90 even though the embedded ElasticSearch is only 0.20.5.
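To make the HTTP path concrete, here is a minimal Python sketch of the kind of request that indexes one JSON event over port 9200. It is not the plugin's actual implementation. The host, index name, document type, and event fields are illustrative assumptions, and the request is only built, not sent.

```python
import json
import urllib.request


def build_index_request(event, host="localhost", port=9200,
                        index="logstash-2013.05.30", doc_type="logs"):
    """Build (but do not send) an HTTP request that indexes one event.

    The index and doc_type defaults are assumptions for illustration.
    """
    url = "http://{}:{}/{}/{}".format(host, port, index, doc_type)
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical event; the field names are illustrative only.
event = {"@timestamp": "2013-05-30T12:00:00Z", "@message": "hello world"}
request = build_index_request(event)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it to a running node.
```

Because only HTTP and JSON are involved, nothing in this request depends on which ElasticSearch version is bundled with Logstash.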

Planning for ElasticSearch

  • Events stored in ES will take 2-3x what a raw text event takes while compressed.
  • This can vary based on how the data in the event is modified during the filter stage - as an example, the 'geoip' filter adds a number of fields which obviously will take more space.
  • ES memory should be 50% of physical memory, up to 30 GB.
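The two rules of thumb above can be turned into quick arithmetic. The sketch below is only a planning aid: the function names are made up for this example, and the baseline size you feed into the storage estimate is whatever event size you measure yourself.

```python
GB = 1024 ** 3


def recommended_heap_bytes(physical_bytes, cap_bytes=30 * GB):
    """Half of physical memory, capped at 30 GB."""
    return min(physical_bytes // 2, cap_bytes)


def estimated_index_bytes(baseline_bytes, low=2, high=3):
    """Return the (low, high) storage range for the 2-3x rule of thumb."""
    return baseline_bytes * low, baseline_bytes * high


print(recommended_heap_bytes(16 * GB) // GB)   # 8
print(recommended_heap_bytes(128 * GB) // GB)  # 30
```

Filters such as 'geoip' add fields, so treat the upper end of the storage range as the safer planning number.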

Tips for running ElasticSearch Embedded

  • It will always start up, even if you have 'embedded' => false. This is due to the nature of how the plugin works.
  • When you set 'embedded' => false, there just won't be any local data stored. You probably don't want to run the embedded plugin as a data node.
  • The embedded ElasticSearch can be configured either by placing an elasticsearch.yml file in the same directory as your Logstash process or by passing -Des.config.directive=foo on the command line.
  • Make sure to prefix the directive with es.
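As a small illustration of the es. prefix, the following Python sketch turns a dictionary of settings into -D flags. The helper name and the example values are hypothetical, and where the flags go on your Logstash command line depends on how you launch the JVM.

```python
def es_system_properties(settings):
    """Render ElasticSearch settings as -Des.<name>=<value> JVM flags."""
    return ["-Des.{}={}".format(name, value)
            for name, value in sorted(settings.items())]


# Example values are placeholders, not recommendations.
flags = es_system_properties({
    "cluster.name": "logstash",
    "node.name": "shipper-1",
})
print(" ".join(flags))
# -Des.cluster.name=logstash -Des.node.name=shipper-1
```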

Running ElasticSearch

  • The number of open files needed to run ElasticSearch will exceed 1024.
  • Make sure the user that ES is running under can open more than 1024 files by (most likely) editing /etc/security/limits.conf and modifying the following:

Ensure ElasticSearch can open files and lock memory!

    elasticsearch   soft    nofile          64000
    elasticsearch   hard    nofile          64000
    elasticsearch   -       memlock         unlimited

Then make sure the startup script does 'ulimit -n 64000' prior to starting up ES.
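To confirm that the limits actually apply, you can inspect them from a process started the same way as ES. This is a minimal sketch using Python's standard resource module (Unix only). It reports the limits of the Python process itself, so run it as the elasticsearch user; the 64000 threshold simply mirrors the values above.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_is_sufficient(limit, required=64000):
    """True if a limit allows at least `required` open files."""
    return limit == resource.RLIM_INFINITY or limit >= required


soft, hard = nofile_limits()
print("soft:", soft, "hard:", hard, "ok:", limit_is_sufficient(soft))
```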

  • By default, ES is only given 1 GB of memory. This can be expanded to 30 GB - but a general recommendation is to use no more than 50% of your system memory up to 30 GB. There needs to be plenty of memory left over for the Linux filesystem cache.

Securing ElasticSearch

ElasticSearch isn't very secure by default. It doesn't have much by way of built-in security: no users, groups, etc.

ElasticSearch Tools

  • There are a number of plugins available for ES that make managing it much easier. They're highly recommended.
    • ElasticSearch Head - An excellent plugin that shows basic cluster status and lets you build custom queries quickly with minimal API knowledge.
    • BigDesk - Shows graphs of what your ES nodes are doing.
      • Quite helpful to diagnose GC related issues.
      • Can show the # of open files
      • All data in this plugin comes from ElasticSearch
    • Paramedic - Shows graphs of
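Since these plugins get their data from ElasticSearch itself, you can fetch similar information directly over HTTP. The sketch below builds the URL of the cluster health endpoint and summarizes a response. It does not reproduce what any plugin does internally, and the sample response is hand-written for illustration rather than captured from a real cluster.

```python
import json


def health_url(host="localhost", port=9200):
    """URL of the cluster health endpoint on the given node."""
    return "http://{}:{}/_cluster/health".format(host, port)


def summarize_health(response_body):
    """Condense a cluster health JSON response into one line."""
    health = json.loads(response_body)
    return "{}: {} ({} nodes)".format(
        health["cluster_name"], health["status"], health["number_of_nodes"])


# Hand-written sample containing only the fields used above.
sample = '{"cluster_name": "logstash", "status": "green", "number_of_nodes": 3}'
print(summarize_health(sample))
# logstash: green (3 nodes)
```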

Helpful Things to Read

Other things to add

  • section about mmapfs with more memory