
Private ELK Server

I needed a syslog server and had been reading about ELK for the past few months. I finally decided to throw together a basic implementation in my home lab. I've recorded my notes for the process in this document and dumped them online.

The implementation I built is super basic - it's just in my lab for dev purposes at the moment, so I didn't finish securing it or building out the integrations; I just needed something to visualize some syslog data.

What is ELK?

From the overview pages:

Elasticsearch is a highly scalable open-source full-text search and analytics engine. It allows you to store, search, and analyze big volumes of data quickly and in near real time. It is generally used as the underlying engine/technology that powers applications that have complex search features and requirements.

Logstash is a tool for receiving, processing and outputting logs. All kinds of logs. System logs, webserver logs, error logs, application logs and just about anything you can throw at it. Sounds great, eh?

Kibana is an open source (Apache Licensed), browser based analytics and search dashboard for ElasticSearch. Kibana is a snap to setup and start using. Written entirely in HTML and Javascript it requires only a plain webserver, Kibana requires no fancy server side components. Kibana strives to be easy to get started with, while also being flexible and powerful, just like Elasticsearch.

Using Elasticsearch as a backend datastore, and kibana as a frontend reporting tool, Logstash acts as the workhorse, creating a powerful pipeline for storing, querying and analyzing your logs. With an arsenal of built-in inputs, filters, codecs and outputs, you can harness some powerful functionality with a small amount of effort. So, let’s get started!

-- Getting Started with Logstash

And in case you're wondering how the datastore works:

Elasticsearch is a standalone database server, written in Java, that takes data in and stores it in a sophisticated format optimized for language based searches. Working with it is convenient as its main protocol is implemented with HTTP/JSON. Elasticsearch is also easily scalable, supporting clustering and leader election out of the box.

-- What is Elasticsearch

A more in-depth overview of each tool can be found in each project's documentation.

Installation

There are specific versions of each tool that are recommended for compatibility. The logstash changelog recommends using the following versions:

  • Elasticsearch 1.1
  • Logstash 1.4.2
  • Kibana 3.0.1

I'm installing the current stack (as of 2014.08.13) on a completely fresh, private Ubuntu 14.04 x64 server (running on my ESXi server) - so I'll start from the very beginning.

# copy our ssh keys to the fresh server
ssh-copy-id logs.homenet.home
ssh logs.homenet.home

# update the server & reboot
sudo apt-get -y update
sudo apt-get -y upgrade
sudo apt-get -y dist-upgrade
sudo reboot

# log back in & install some basic tools
sudo apt-get -y install python-software-properties software-properties-common
sudo apt-get -y install git-core vim-nox tmux byobu
byobu-enable

# install java - it's required by logstash & elasticsearch
sudo add-apt-repository -y ppa:webupd8team/java
sudo apt-get -y update
echo debconf shared/accepted-oracle-license-v1-1 select true | sudo debconf-set-selections
echo debconf shared/accepted-oracle-license-v1-1 seen true | sudo debconf-set-selections
sudo apt-get -q -y install oracle-java7-installer
sudo bash -c "echo JAVA_HOME=/usr/lib/jvm/java-7-oracle/ >> /etc/environment"
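
Before moving on, it's worth sanity-checking that the JVM installed correctly, since both logstash and elasticsearch depend on it:

# verify the oracle jvm is active
java -version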

# install nginx (or another webserver of your choice) so we can use the kibana web frontend
# we also want apache2-utils to generate an htpasswd file
sudo apt-get -y install nginx apache2-utils

Logstash

Install the latest version of logstash using the apt repo:

echo 'deb http://packages.elasticsearch.org/logstash/1.4/debian stable main' | sudo tee /etc/apt/sources.list.d/logstash.list
sudo apt-get -y update
sudo apt-get install logstash
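
Note: the logstash packages are signed with the same GPG key used for elasticsearch below; if apt complains about unauthenticated packages, import the key before installing:

wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -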

Now we can test logstash:

# start a new logstash pipeline
/opt/logstash/bin/logstash -e 'input { stdin { } } output { stdout {} }'

The logstash pipeline will take a few seconds to start up. We can now type directly into stdin and logstash will throw some messages back to us on stdout.

test
> 2014-08-14T03:17:05.521+0000 log test
test2
> 2014-08-14T03:17:06.845+0000 log test2

We can also pretty-print events by attaching the rubydebug codec to the stdout output:

/opt/logstash/bin/logstash -e 'input { stdin { } } output { stdout { codec => rubydebug } }'
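
The same test messages now come back as formatted Ruby hashes; typing test again produces output along these lines (illustrative, the values will differ):

test
> {
>        "message" => "test",
>       "@version" => "1",
>     "@timestamp" => "2014-08-14T03:17:05.521Z",
>           "host" => "log"
> }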

Great, logstash works, but it isn't running as a service yet. Before we configure the service, we'll first set up elasticsearch so we can use it as a backend.

Elasticsearch

Install elasticsearch 1.1.1 (for logstash compatibility) using the apt repo:

wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
echo 'deb http://packages.elasticsearch.org/elasticsearch/1.1/debian stable main' | sudo tee /etc/apt/sources.list.d/elasticsearch.list
sudo apt-get update
sudo apt-get -y install elasticsearch=1.1.1

# disable dynamic updates
sudo bash -c 'echo -e "\nscript.disable_dynamic: true\n" >> /etc/elasticsearch/elasticsearch.yml'

# setup the rc.d defaults for the service and restart it
sudo update-rc.d elasticsearch defaults 95 10
sudo service elasticsearch start

Now that we have the elasticsearch service running with the default configuration, we should see java services running on ports 9200 (HTTP query API) & 9300 (elasticsearch transport).

sudo netstat -tulpn

> Proto Recv-Q Send-Q Local Address Foreign Address State       PID/Program name
> tcp6       0      0 :::9200       :::*            LISTEN      17031/java
> tcp6       0      0 :::9300       :::*            LISTEN      17031/java

We can use the API service on 9200 to query the datastore:

curl http://localhost:9200
# or curl http://localhost:9200/_status?pretty=true

> {
>   "status" : 200,
>   "name" : "Rachel van Helsing",
>   "version" : {
>     "number" : "1.1.1",
>     "build_hash" : "f1585f096d3f3985e73456debdc1a0745f512bbc",
>     "build_timestamp" : "2014-04-16T14:27:12Z",
>     "build_snapshot" : false,
>     "lucene_version" : "4.7"
>   },
>   "tagline" : "You Know, for Search"
> }

So now we can fire up another logstash pipeline (using the elasticsearch datastore as the output) and create a few test messages:

/opt/logstash/bin/logstash -e 'input { stdin { } } output { elasticsearch { host => localhost } }'

> test
> test2

Now that we have some data in the datastore, let's query the API and ensure we're seeing it:

curl http://localhost:9200/_search?pretty

> {
>   "took" : 1,
>   "timed_out" : false,
>   "_shards" : {
>     "total" : 5,
>     "successful" : 5,
>     "failed" : 0
>   },
>   "hits" : {
>     "total" : 2,
>     "max_score" : 1.0,
>     "hits" : [ {
>       "_index" : "logstash-2014.08.14",
>       "_type" : "logs",
>       "_id" : "dN6YnS72R1eNdVzknHs9ww",
>       "_score" : 1.0, "_source" : {"message":"test","@version":"1","@timestamp":"2014-08-14T03:02:03.130Z","host":"log"}
>     }, {
>       "_index" : "logstash-2014.08.14",
>       "_type" : "logs",
>       "_id" : "ZXAutT4gT824GgAc8JNSRw",
>       "_score" : 1.0, "_source" : {"message":"test2","@version":"1","@timestamp":"2014-08-14T03:02:11.043Z","host":"log"}
>     } ]
>   }
> }

We can query the indexes using the format:

curl "http://localhost:9200/logstash-$(date +'%Y.%m.%d')/_search?pretty=true&q=type:stdin"

... which is just a more specific way of getting the same data we saw previously; this becomes useful as we shove more data into the datastore.

Kibana

Install kibana (there is no apt repository as of 2014.08.13):

sudo mkdir -p /srv/www
wget -P /tmp https://download.elasticsearch.org/kibana/kibana/kibana-3.1.0.tar.gz
sudo tar xf /tmp/kibana-3.1.0.tar.gz -C /srv/www/
sudo mv /srv/www/kibana-3.1.0 /srv/www/kibana
sudo chown -R www-data:www-data /srv/www/kibana

And update the kibana config so that it queries elasticsearch through the web server on port 80 (rather than hitting port 9200 directly):

sudo sed -i 's|window.location.hostname+":9200"|window.location.hostname+":80"|' /srv/www/kibana/config.js

Now we need to set up our elk site configuration for nginx. Elasticsearch provides a nice nginx template that we can work from.

sudo wget -O /etc/nginx/sites-available/elk https://raw.githubusercontent.com/elasticsearch/kibana/master/sample/nginx.conf

Modify the site configuration with your settings:

    server {
      listen                *:80 ;

      server_name           logs.homenet.home;
      access_log            /var/log/nginx/logs.homenet.home.access.log;

      location / {
        root   /srv/www/kibana;
        index  index.html  index.htm;
      }
    }
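
The rest of the template can be left as-is; it proxies the elasticsearch endpoints that kibana needs through port 80, using location blocks roughly like this one (repeated for _nodes, _search, _mapping and kibana-int):

      location ~ ^/_aliases$ {
        proxy_pass http://127.0.0.1:9200;
        proxy_read_timeout 90;
      }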

Kill the default site and enable the elk site configuration:

sudo unlink /etc/nginx/sites-enabled/default
pushd /etc/nginx/sites-enabled
sudo ln -s ../sites-available/elk elk
popd

Restart nginx:

sudo service nginx restart

And you should now be able to browse to the kibana front end and view the logs we shoved into the elasticsearch backend earlier.
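
A quick check from the shell before opening a browser (this assumes logs.homenet.home resolves to the server):

curl -I http://logs.homenet.home/

> HTTP/1.1 200 OK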

Build a Syslog Service

Now that we have everything running, we need to create a persistent configuration to run a logstash syslog service, using elasticsearch as the backend. Note: The logstash configuration files are stored in /etc/logstash/conf.d/.

sudo su -
cat > /etc/logstash/conf.d/10-syslog.conf <<EOF
input {
  tcp {
    port => 5514
    type => syslog
  }
  udp {
    port => 5514
    type => syslog
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
EOF

service logstash restart

netstat -ant | grep 5514
tcp6       0      0 :::5514                 :::*                    LISTEN

Note: To listen on the standard syslog port 514, you'd need to configure the service to run as root; binding a port above 1024 (5514 here) is better practice. Note: Our syslog service is currently unsecured. It would be really bad to run it this way on a public network.
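
To ship logs from another machine, point its rsyslog at the listener. A minimal client-side sketch (assumes the server is reachable as logs.homenet.home):

# /etc/rsyslog.d/90-logstash.conf on a client host
*.* @logs.homenet.home:5514     # single @ forwards over UDP
# *.* @@logs.homenet.home:5514  # double @@ forwards over TCP

# then restart rsyslog on the client
sudo service rsyslog restart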


Testing Syslog Messages

Time to test the syslog server. I've created a python client for generating test messages and dumped it here.

Note: The log level is not currently being parsed correctly. Need to figure out why.
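
In the meantime, a quick smoke test can be fired straight from bash using its /dev/udp pseudo-device (a sketch; <13> is the PRI value for user.notice, which the syslog_pri filter decodes):

# emit a single syslog-formatted message over udp to the listener
echo "<13>$(date '+%b %d %H:%M:%S') testhost testprog[42]: hello elk" > /dev/udp/localhost/5514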

Securing the Syslog Service

Note: This section is currently not completed.

We can query data (via the elasticsearch API), view and search it (via the kibana web frontend), and receive it from remote hosts (via the logstash syslog listener), but none of it is secured yet.

Use htpasswd to generate a password configuration file that we will use to secure the logstash service:

sudo htpasswd -c /etc/nginx/conf.d/logs.homenet.home.htpasswd nfarrar

Generate a self-signed SSL certificate:

sudo mkdir /etc/nginx/ssl
sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/nginx/ssl/kibana.key -out /etc/nginx/ssl/kibana.crt

    Country Name (2 letter code) [AU]:US
    State or Province Name (full name) [Some-State]:CO
    Locality Name (eg, city) []:Colorado Springs
    Organization Name (eg, company) [Internet Widgits Pty Ltd]:Lab
    Organizational Unit Name (eg, section) []:Logging Server
    Common Name (e.g. server FQDN or YOUR name) []:logs.homenet.home
    Email Address []:me@myemail.com
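
The rough idea, once finished, is to bolt the certificate and the htpasswd file onto the nginx site configuration, along these lines (an untested sketch):

    server {
      listen                *:443 ssl;

      server_name           logs.homenet.home;
      access_log            /var/log/nginx/logs.homenet.home.access.log;

      ssl_certificate       /etc/nginx/ssl/kibana.crt;
      ssl_certificate_key   /etc/nginx/ssl/kibana.key;

      auth_basic            "Restricted";
      auth_basic_user_file  /etc/nginx/conf.d/logs.homenet.home.htpasswd;

      location / {
        root   /srv/www/kibana;
        index  index.html  index.htm;
      }
    }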

Elasticsearch Python Client

Note: This section is currently not completed.

pip install elasticsearch
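
The package is enough to run basic queries against the stack. A minimal sketch using the official elasticsearch-py client (field names match the logstash events shown earlier):

# query today's logstash index for recent syslog events
from datetime import date
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])
index = "logstash-" + date.today().strftime("%Y.%m.%d")
results = es.search(index=index, q="type:syslog", size=10)
for hit in results["hits"]["hits"]:
    print(hit["_source"]["message"])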

Graphite

Note: This section is currently not completed.

