View sysinfo.js
/**
* Get system information in JSON format. Gets run queue, memory, and swap info.
*/
var os = require('os');
var fs = require('fs');
var sysinfo = {};
sysinfo.hostname = os.hostname();
View log_filters.md

Log Filtering

This is a filter/rating class to look at log objects and decide if they're interesting (worthy of review). Error messages are rated higher, as are logs from production hosts.
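As a rough illustration of the rating idea (this is not the module that follows), here's a minimal sketch. The field names `level` and `host`, the weights, and the helper names `rateLog` and `interestingLogs` are all assumptions made up for the example.

```javascript
'use strict';

// Hypothetical weights; tune them for your own logs.
var LEVEL_SCORES = { error: 3, warning: 2, info: 1 };
var PRODUCTION_BONUS = 2;

// Rate a log object; higher scores are more worthy of review.
function rateLog(log, productionHosts) {
  var level = String(log.level || '').toLowerCase();
  var known = Object.prototype.hasOwnProperty.call(LEVEL_SCORES, level);
  var score = known ? LEVEL_SCORES[level] : 0;
  // Logs from production hosts get a fixed bonus.
  if (productionHosts.indexOf(log.host) !== -1) {
    score += PRODUCTION_BONUS;
  }
  return score;
}

// Keep only the logs whose rating reaches the threshold.
function interestingLogs(logs, productionHosts, threshold) {
  return logs.filter(function (log) {
    return rateLog(log, productionHosts) >= threshold;
  });
}
```

With these weights an error from a production host scores 5 and an info line from a dev host scores 1, so a threshold of 3 keeps the first and drops the second.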

'use strict'

module.exports = function(options) {
  var my = {};
View nodejs_fileparser.md

NodeJS File parsing

Here's a skeleton for ripping files apart in NodeJS and processing each line.

var fs = require('fs');
var zlib = require('zlib');
var stream = require('stream');
var es = require('event-stream');
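The skeleton is cut off after its requires. For comparison, here's a minimal sketch of the same line-by-line idea using only Node's built-in `readline` instead of `event-stream`. The helper names `processLines` and `openLogFile` are hypothetical, and gunzipping based on a `.gz` suffix is an assumption.

```javascript
'use strict';

var fs = require('fs');
var zlib = require('zlib');
var readline = require('readline');

// Call onLine(line, lineNumber) for every line of a readable stream.
// Returns a Promise that resolves with the number of lines seen.
function processLines(input, onLine) {
  return new Promise(function (resolve, reject) {
    var count = 0;
    var rl = readline.createInterface({ input: input, crlfDelay: Infinity });
    input.on('error', reject);
    rl.on('line', function (line) {
      count += 1;
      onLine(line, count);
    });
    rl.on('close', function () {
      resolve(count);
    });
  });
}

// Open a file for reading, gunzipping it when the name ends in .gz.
function openLogFile(path) {
  var raw = fs.createReadStream(path);
  return /\.gz$/.test(path) ? raw.pipe(zlib.createGunzip()) : raw;
}
```

Usage would look something like `processLines(openLogFile('app.log.gz'), function (line) { /* parse it */ })`, where the file name is only a placeholder.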
View ElasticSearchTuning.md

ElasticSearch Tuning in Anger

So. I ran into a great deal of stress around ElasticSearch/Logstash performance lately. These are just a few lessons learned, documented so I have a chance of finding them again.

Logs

Both ElasticSearch and Logstash produce logs. On my RHEL install they're located in /var/log/elasticsearch and /var/log/logstash. These will give you some idea of the problems when things go really wrong. For example, in my case, ElasticSearch got so slow that Logstash would time out sending it logs. These issues show up in the logs. Also, ElasticSearch would start logging problems when JVM garbage collection took longer than 30 seconds, which is a good indicator of memory pressure on ElasticSearch.

Pending Tasks

ElasticSearch (and Logstash, when it's joined to an ES cluster) processes tasks in a queue that you can peek into. Before realizing this, I had no way to understand what was happening inside ElasticSearch besides the logs. You can look at the pending tasks queue with this command
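The command itself is cut off in this preview. As a hedged stand-in, here's a minimal Node sketch that reads ElasticSearch's `/_cluster/pending_tasks` endpoint and boils the response down to a few numbers. The host and port are assumptions, and `fetchPendingTasks` and `summarizePendingTasks` are hypothetical helper names.

```javascript
'use strict';

var http = require('http');

// Fetch the pending cluster tasks as a parsed object.
// host and port are whatever your ES node listens on (assumed, not from the notes).
function fetchPendingTasks(host, port, callback) {
  var options = { host: host, port: port, path: '/_cluster/pending_tasks' };
  http.get(options, function (res) {
    var body = '';
    res.setEncoding('utf8');
    res.on('data', function (chunk) { body += chunk; });
    res.on('end', function () {
      var parsed;
      try {
        parsed = JSON.parse(body);
      } catch (err) {
        return callback(err);
      }
      callback(null, parsed);
    });
  }).on('error', callback);
}

// Count tasks per priority and find the longest wait in the queue.
function summarizePendingTasks(response) {
  var tasks = response.tasks || [];
  var summary = { total: tasks.length, byPriority: {}, oldestMillis: 0 };
  tasks.forEach(function (task) {
    summary.byPriority[task.priority] = (summary.byPriority[task.priority] || 0) + 1;
    summary.oldestMillis = Math.max(summary.oldestMillis, task.time_in_queue_millis || 0);
  });
  return summary;
}
```

A queue that keeps growing, or an `oldestMillis` that keeps climbing, is the kind of thing worth watching alongside the logs.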

View JavaScriptModules.md
View LogstashReplay.md

Replaying logs to logstash

  • Copy compressed log files to a work area.
  • Uncompress them and remove the date part of each file name.
  • Copy /etc/logstash/conf.d/*.conf to a work location.
  • Modify conf files to change output to stdout { codec => "rubydebug" }
  • You want to do this to make sure things are working before you push logs into ElasticSearch.
  • Modify conf files to change path in the input/file section
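The uncompress-and-rename steps above can be sketched in Node. This assumes gzip files with a logrotate-style date suffix such as `app.log-20150609.gz`; the notes don't say what the real names look like, so the pattern and the helper names `stripDate` and `uncompressTo` are hypothetical.

```javascript
'use strict';

var fs = require('fs');
var path = require('path');
var zlib = require('zlib');

// Drop a trailing .gz and an 8-digit date suffix (assumed naming scheme).
function stripDate(fileName) {
  return fileName.replace(/\.gz$/, '').replace(/[-.]\d{8}$/, '');
}

// Gunzip srcPath into workDir under the date-free name; returns the new path.
// Reads the whole file into memory, so it only suits modest file sizes.
function uncompressTo(srcPath, workDir) {
  var target = path.join(workDir, stripDate(path.basename(srcPath)));
  fs.writeFileSync(target, zlib.gunzipSync(fs.readFileSync(srcPath)));
  return target;
}
```

Under that assumption, `stripDate('app.log-20150609.gz')` gives `app.log`, which is the name the input/file `path` setting would then point at.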
View gist:8ccc6a9e711ee229efa6

Setting up InfluxDB on CentOS/RHEL

The InfluxDB Docs give you a very brief overview of installing InfluxDB on a host. It boils down to 'here's the RPM, install it.' That's fine for looking at the software, but you'll probably want to adjust the configuration a bit for a production environment.

Basic Install

https://influxdb.com/docs/v0.9/introduction/installation.html

Config changes

Modify /etc/opt/influxdb/influxdb.conf

View gist:1b549b783e909d546eec

Remove Syslog line headers from multi-line logs in logstash

Overview

Rather than run a log shipper on hosts, we use Syslog when shipping logs out of monolog. This works great for single-line logs. It breaks when a log message gets split up by syslog. When syslog does this, it duplicates the line header, like so:

2015-06-09T05:39:31.457042-05:00 host.example.edu : This is a really really really
2015-06-09T05:39:31.475414-05:00 host.example.edu : really long message
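To make the problem concrete, here's a minimal Node sketch that strips the duplicated header from each piece and glues the message back together. It is not the Logstash approach this gist goes on to describe. The header shape (timestamp, host, ` : `) is inferred from the two sample lines, joining with a single space is an assumption, and `parseLine` and `joinSplitMessage` are hypothetical names.

```javascript
'use strict';

// Assumed header shape, based on the sample lines: "<timestamp> <host> : <message>".
var HEADER = /^(\S+) (\S+) : (.*)$/;

// Split one syslog line into its header fields and message, or return null.
function parseLine(line) {
  var m = HEADER.exec(line);
  return m ? { timestamp: m[1], host: m[2], message: m[3] } : null;
}

// Rebuild one logical message from the pieces syslog split it into.
// Lines that don't match the header pattern are kept as they are.
function joinSplitMessage(lines) {
  return lines.map(function (line) {
    var parsed = parseLine(line);
    return parsed ? parsed.message : line;
  }).join(' ');
}
```

Deciding which lines belong to the same message is the hard part, and this sketch leaves that to the caller.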
View gist:7f7035edf40a679ff9c4
# ==[ printSlack ]=============================================================
# Function to send output from the commandline to Slack.
# (wants SLACK_TOKEN to be defined in .bashrc or other ENV method, or you can set it here.)
#
# @parameter string $LEVEL INFO/ERROR/WARNING message. Changes emoji
# @parameter string $MESSAGE Message to send to slack.
printSlack()
{
SLACK_HOSTNAME=${SLACK_HOSTNAME-'mycompany.slack.com'};
SLACK_TOKEN=${SLACK_TOKEN-'oops'};
View gist:0c5f4fa9baa0f3a45a76
# Docker file to create a CentOS StatsD host.
# This uses Elasticsearch as a backend rather than Graphite/Carbon.
# Depends on having an Elasticsearch container.
FROM centos:centos6
MAINTAINER Gary Rogers <gary-rogers@uiowa.edu>
# Install things as root
USER root
RUN \