@ejhayes
Created February 18, 2015 20:03
Logstash setup notes

Logstash Master Setup on AWS

AWS Configurations

  • Port 80 should be open on the server so the Kibana frontend can be accessed
  • Port 5000 should be open for incoming traffic from logstash-forwarder, aka lumberjack (this is where logstash event data comes in). I set this to 0.0.0.0/0, although you could restrict it to just the internal machines you are monitoring.

MASTER: ElasticSearch Install

First install dependencies on the "master" server:

sudo add-apt-repository -y ppa:webupd8team/java
sudo apt-get update
sudo apt-get -y install oracle-java7-installer
wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
echo 'deb http://packages.elasticsearch.org/elasticsearch/1.1/debian stable main' | sudo tee /etc/apt/sources.list.d/elasticsearch.list
sudo apt-get update
sudo apt-get -y install elasticsearch=1.1.1

Following the install, configure elasticsearch to listen on localhost only (the master runs elasticsearch itself):

sudo vi /etc/elasticsearch/elasticsearch.yml
# EDIT: network.host: localhost

Then start elasticsearch, setting it to start on reboot as well:

sudo update-rc.d elasticsearch defaults 95 10
sudo service elasticsearch restart

MASTER: Kibana Install (frontend graphing interface)

Install from the release tarball:

cd ~; wget https://download.elasticsearch.org/kibana/kibana/kibana-3.0.1.tar.gz
tar xvf kibana-3.0.1.tar.gz

Edit the configuration so Kibana queries elasticsearch through port 80 rather than port 9200 directly. This is because we'll have nginx reverse proxy those requests to elasticsearch for us.

sudo vi ~/kibana-3.0.1/config.js
# EDIT: elasticsearch: "http://"+window.location.hostname+":80",

Now move things into place:

sudo mkdir -p /var/www/kibana3
sudo cp -R ~/kibana-3.0.1/* /var/www/kibana3/
sudo apt-get install nginx -y

And configure nginx. Update /etc/nginx/sites-enabled/kibana to look like:

# Nginx proxy for Elasticsearch + Kibana
#
# In this setup, we are password protecting the saving of dashboards. You may
# wish to extend the password protection to all paths.
#
# Even though these paths are being called as the result of an ajax request, the
# browser will prompt for a username/password on the first request
#
# If you use this, you'll want to point config.js at http://FQDN:80/ instead of
# http://FQDN:9200
#
server {
  listen                *:80 ;

  server_name           kibana.XXXXXX.com;
  access_log            /var/log/nginx/kibana.XXXXXX.com.access.log;

  location / {
    root  /var/www/kibana3;
    index  index.html  index.htm;
  }

  location ~ ^/_aliases$ {
    proxy_pass http://127.0.0.1:9200;
    proxy_read_timeout 90;
  }
  location ~ ^/.*/_aliases$ {
    proxy_pass http://127.0.0.1:9200;
    proxy_read_timeout 90;
  }
  location ~ ^/_nodes$ {
    proxy_pass http://127.0.0.1:9200;
    proxy_read_timeout 90;
  }
  location ~ ^/.*/_search$ {
    proxy_pass http://127.0.0.1:9200;
    proxy_read_timeout 90;
  }
  location ~ ^/.*/_mapping {
    proxy_pass http://127.0.0.1:9200;
    proxy_read_timeout 90;
  }

  # Password protected end points
  location ~ ^/kibana-int/dashboard/.*$ {
    proxy_pass http://127.0.0.1:9200;
    proxy_read_timeout 90;
    limit_except GET {
      proxy_pass http://127.0.0.1:9200;
      auth_basic "Restricted";
      auth_basic_user_file /etc/nginx/conf.d/kibana.XXXXXX.com.htpasswd;
    }
  }
  location ~ ^/kibana-int/temp.*$ {
    proxy_pass http://127.0.0.1:9200;
    proxy_read_timeout 90;
    limit_except GET {
      proxy_pass http://127.0.0.1:9200;
      auth_basic "Restricted";
      auth_basic_user_file /etc/nginx/conf.d/kibana.XXXXXX.com.htpasswd;
    }
  }
}

The above uses basic auth, so you can set that up like this:

sudo apt-get install -y apache2-utils
sudo htpasswd -c /etc/nginx/conf.d/kibana.XXXXXX.com.htpasswd admin # set to asdf1234

MASTER: Install LogStash

Install the package:

echo 'deb http://packages.elasticsearch.org/logstash/1.4/debian stable main' | sudo tee /etc/apt/sources.list.d/logstash.list
sudo apt-get update
sudo apt-get install logstash=1.4.2-1-2c0f5a1

We want all of the logstash-forwarder clients to verify that they are talking to the correct server (this one), so generate the certificate and key they will need:

sudo mkdir -p /etc/pki/tls/certs
sudo mkdir /etc/pki/tls/private
cd /etc/pki/tls; sudo openssl req -x509 -batch -nodes -days 3650 -newkey rsa:2048 -keyout private/logstash-forwarder.key -out certs/logstash-forwarder.crt
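One gotcha worth hedging against: logstash-forwarder validates the server it dials against the certificate, so a subject-less cert like the one above can cause TLS verification failures. A sketch that bakes a CN into the cert, with "logstash.example.com" as a placeholder for your master's DNS name (it writes to a temp dir here; use /etc/pki/tls on the real master):

```shell
# Generate the keypair with a CN matching the hostname the forwarders
# will connect to. "logstash.example.com" is a placeholder.
DIR=$(mktemp -d)   # substitute /etc/pki/tls on the master
openssl req -x509 -batch -nodes -days 3650 -newkey rsa:2048 \
  -subj "/CN=logstash.example.com" \
  -keyout "$DIR/logstash-forwarder.key" \
  -out "$DIR/logstash-forwarder.crt"
# Confirm the subject came out as expected
openssl x509 -in "$DIR/logstash-forwarder.crt" -noout -subject
```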

MASTER: Configure Input

Tell the master that all input will come in on port 5000 via logstash-forwarder. Edit /etc/logstash/conf.d/01-lumberjack-input.conf:

input {
  lumberjack {
    port => 5000
    type => "logs"
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }
}

Logstash receives different types of data, and all we need to do is tell it how to correctly interpret each incoming type. This is done using the grok filter plus a conditional check for type "syslog" (the type name is arbitrary, so you can come up with whatever you like). Update /etc/logstash/conf.d/10-syslog.conf:

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
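Grok compiles its patterns down to regular expressions, so as a rough illustration, here's a hand-written approximation of what the filter above pulls out of a typical syslog line. The regex is a simplified stand-in, not grok's actual expansion (the real SYSLOGTIMESTAMP/SYSLOGHOST/etc. patterns are more permissive), but the field names mirror the config:

```ruby
# Simplified stand-in for the syslog grok match above, for illustration only.
line = "Feb  8 19:35:49 myhost sshd[9026]: Accepted publickey for ubuntu"

pattern = /
  ^(?<syslog_timestamp>\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s
  (?<syslog_hostname>\S+)\s
  (?<syslog_program>[^\[:\s]+)(?:\[(?<syslog_pid>\d+)\])?:\s
  (?<syslog_message>.*)$
/x

m = line.match(pattern)
puts m[:syslog_program]   # prints "sshd"
puts m[:syslog_pid]       # prints "9026"
puts m[:syslog_message]   # prints "Accepted publickey for ubuntu"
```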

In the event that you're using the ruby Logger gem to do logging, you could set up a filter like this:

filter {
  if [type] == "rubylog" {
    grok {
      match => { "message" => "%{RUBY_LOGGER}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
  }
}

The RUBY_LOGGER match above is built into logstash's default grok patterns and is defined as:

RUBY_LOGGER [DFEWI], \[%{TIMESTAMP_ISO8601:timestamp} #%{POSINT:pid}\] *%{RUBY_LOGLEVEL:loglevel} -- +%{DATA:progname}: %{GREEDYDATA:message}

Using irb, I'll create a sample log message using ruby's Logger (the default for rails):

irb(main):003:0> a=Logger.new(STDOUT)
=> #<Logger:0x00000002049eb0 @progname=nil, @level=0, @default_formatter=#<Logger::Formatter:0x00000002049e10 @datetime_format=nil>, @formatter=nil, @logdev=#<Logger::LogDevice:0x00000002049d98 @shift_size=nil, @shift_age=nil, @filename=nil, @dev=#<IO:<STDOUT>>, @mutex=#<Logger::LogDevice::LogDeviceMutex:0x00000002049d48 @mon_owner=nil, @mon_count=0, @mon_mutex=#<Mutex:0x00000002049c58>>>>
irb(main):004:0> a.warn('hey')
W, [2015-02-08T19:35:49.634481 #9026]  WARN -- : hey
=> true

Running that line through the RUBY_LOGGER pattern (e.g. in a grok debugger) produces the following captures:

{
  "RUBY_LOGGER": [
    [
      "W, [2015-02-08T19:35:49.634481 #9026]  WARN -- : hey"
    ]
  ],
  "timestamp": [
    [
      "2015-02-08T19:35:49.634481"
    ]
  ],
  "YEAR": [
    [
      "2015"
    ]
  ],
  "MONTHNUM": [
    [
      "02"
    ]
  ],
  "MONTHDAY": [
    [
      "08"
    ]
  ],
  "HOUR": [
    [
      "19",
      null
    ]
  ],
  "MINUTE": [
    [
      "35",
      null
    ]
  ],
  "SECOND": [
    [
      "49.634481"
    ]
  ],
  "ISO8601_TIMEZONE": [
    [
      null
    ]
  ],
  "pid": [
    [
      "9026"
    ]
  ],
  "loglevel": [
    [
      "WARN"
    ]
  ],
  "progname": [
    [
      ""
    ]
  ],
  "message": [
    [
      "hey"
    ]
  ]
}


Now tell logstash to output data to the local elasticsearch we set up earlier, and also to stdout. Yes, you can output to multiple destinations. Edit /etc/logstash/conf.d/30-lumberjack-output.conf:

output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}

And restart logstash so we can test this out:

sudo service logstash restart

How do I prevent uncontrolled growth?

We're going to be collecting a lot of data...so it's a good idea to clean things up on a regular basis. There's elasticsearch-curator for this. To set it up:

# install curator (to keep elasticsearch cleaned up)
sudo apt-get install python-pip
sudo pip install elasticsearch-curator

# keep logs for 1 week
/usr/local/bin/curator delete --older-than 7

# only today's index is actively written to, so disable bloom filter caches on older ones
/usr/local/bin/curator bloom --older-than 1
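To run these on a schedule, you could drop them into cron; a sketch /etc/cron.d entry (the times are arbitrary, and the paths assume the pip install location above):

```
# /etc/cron.d/curator -- sketch; adjust schedule to taste
20 0 * * * root /usr/local/bin/curator delete --older-than 7
30 0 * * * root /usr/local/bin/curator bloom --older-than 1
```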

Logstash Forwarder Setup

This includes information for setting up the forwarders. Event data makes it to the master like this:

  • Something generates a log
  • logstash-forwarder runs on the client and harvests new lines from the files it is watching (the log above may be one of them)
  • The forwarder ships only what it is configured to, stripping out any log data that doesn't need to go to the master. Not all event data is created equal...
  • Periodically the forwarder sends a batch of event data off to the master

Configure logstash-forwarder

On the client, install the package (the repo is signed with the same elasticsearch GPG key, so add that here too):

wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
echo 'deb http://packages.elasticsearch.org/logstashforwarder/debian stable main' | sudo tee /etc/apt/sources.list.d/logstashforwarder.list
sudo apt-get update
sudo apt-get install logstash-forwarder

Set up an init script for the forwarder (the project already provides one, so use that):

cd /etc/init.d/; sudo wget https://raw.github.com/elasticsearch/logstash-forwarder/master/logstash-forwarder.init -O logstash-forwarder
sudo chmod +x logstash-forwarder
sudo update-rc.d logstash-forwarder defaults

Put the certificate in place (this is the cert the master generated earlier):

sudo mkdir -p /etc/pki/tls/certs

cat > /etc/pki/tls/certs/logstash-forwarder.crt <<EOF
-----BEGIN CERTIFICATE-----
MIIDXTCCAkWgAwIBAgIJAOWPVPL6qoz7MA0GCSqGSIb3DQEBCwUAMEUxCzAJBgNV
BAYTAkFVMRMwEQYDVQQIDApTb21lLVN0YXRlMSEwHwYDVQQKDBhJbnRlcm5ldCBX
aWRnaXRzIFB0eSBMdGQwHhcNMTUwMjA3MTc1NzIyWhcNMjUwMjA0MTc1NzIyWjBF
MQswCQYDVQQGEwJBVTETMBEGA1UECAwKU29tZS1TdGF0ZTEhMB8GA1UECgwYSW50
ZXJuZXQgV2lkZ2l0cyBQdHkgTHRkMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIB
CgKCAQEA2wlOHPyrECV3aEQ+/SilVYCD5410HQPH2sovldZSW/I8YCF4VTbeTb1q
tCvm4HSPGyqd2I/ry6l3L97X9ZRKh8TqwtSax8MKYbt58swncPEAHwj0QhTr+3CP
Vg4LIlX8CY52b64//gzQqRgaw1mT+Hvm/Mr9Nn5iuyn6SN71ub4/GzOK88g3cgOH
GvsT5Hn4F1DWKUqVXqb48684Fidxb4u0stvujOULsZVTu5Xz7BRK5qag2Vy5xHuR
rdPJbMyRzmdwxYS8hbPbT/PyGZskwXnkMg2gEzL57lAiMNYFk4jQ1wP6mEQNQMez
9yp6ywE8QAMwrA0Pw3XSmraxx0sQZQIDAQABo1AwTjAdBgNVHQ4EFgQUWEOF7j1Q
wMb3gfLsYYXorCDTSxAwHwYDVR0jBBgwFoAUWEOF7j1QwMb3gfLsYYXorCDTSxAw
DAYDVR0TBAUwAwEB/zANBgkqhkiG9w0BAQsFAAOCAQEAc+c6iTFP4OOciknNk298
HI8T+m0FIUWzUXQ/IstrmLk2QThP++sE6ag6i1IGMZWZ+eB3a4NcusuBXyYa7z4s
mO6LFq7AfsVg3A8UDWufO73gu1iQw8RMJAVc6bgkFXAK4AyPWhhYGEw+8l0D7IxM
uneWHMpK8st3TR50CrCVFEkbfkNgInJOb+Gw7SQSNHpqZOYVnxtEeESVM/cIEAnG
axYAu0IYZU2BSv9ah6CkBMhGwbe2XsHeD6w1uhn4u6yBjXIEiZ5ywh9bgqnrArex
xdBjQ+VImSvZ+ImyFrex2532+sEzP22IjB8wmdviTY3+vkKS1G8J1j1X4YQ6mPpd
XA==
-----END CERTIFICATE-----
EOF

What would you like to harvest and ship off? Tell the forwarder by editing /etc/logstash-forwarder:

{
  "network": {
    "servers": [ "localhost:5000" ],
    "timeout": 15,
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt"
  },
  "files": [
    {
      "paths": [
        "/var/log/syslog",
        "/var/log/auth.log"
      ],
      "fields": { "type": "syslog" }
    }
  ]
}
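Since the forwarder config is plain JSON, a malformed edit is easy to catch before restarting. A quick sketch using python's json.tool, here validating a sample written to a temp file (on a real client, point the json.tool line at /etc/logstash-forwarder instead):

```shell
# Write a sample forwarder config to a temp file and validate it as JSON.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
{
  "network": {
    "servers": [ "localhost:5000" ],
    "timeout": 15,
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt"
  },
  "files": [
    {
      "paths": [ "/var/log/syslog", "/var/log/auth.log" ],
      "fields": { "type": "syslog" }
    }
  ]
}
EOF
python3 -m json.tool < "$CONF" > /dev/null && echo "config OK"
```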

Bam! Restart the forwarder to start shipping off events:

sudo service logstash-forwarder restart

Memory Consumption

Keep in mind that memory may become an issue. Since we are running selenium, logstash, and elasticsearch on a 2gb box, the java heap sizes may become a problem: if every process keeps its default settings, we will eventually exhaust the available memory and processes will start dying. I went with the following settings:

  • logstash: heap reduced from 512mb to 256mb
  • STOPPED the logstash-web process, which caused major CPU spikes on the box
  • elasticsearch: heap reduced from 1gb to 512mb
  • selenium: 512mb
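Where those heap sizes live varies by package, but on these Debian packages they are typically environment variables picked up by the init scripts; a sketch (the exact file locations are assumptions, so check your init scripts):

```
# /etc/default/elasticsearch (assumption: read by the elasticsearch init script)
ES_HEAP_SIZE=512m

# /etc/default/logstash or /etc/init.d/logstash (assumption)
LS_HEAP_SIZE=256m
```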

Running with these settings now, and it looks like memory is still available:

    root@ip-172-31-20-30:/etc/init# free -m
                 total       used       free     shared    buffers     cached
    Mem:          2000       1831        168          0         24        314
    -/+ buffers/cache:       1493        507
    Swap:            0          0          0