@bmaupin
Last active May 17, 2019 14:30
Graylog vs Splunk Free

Overall impression

Splunk, the de facto standard, is a much better product overall and requires less tweaking to get things working. If you can work within the limitations of Splunk Free (500 MB daily limit, no authentication/RBAC, no alerts, no clustering, etc.), use it.

Otherwise, Graylog is a decent product and works well, but will require some tweaking.

Role-based access control (RBAC)

Splunk Free has no authentication or RBAC. You can, however, put a reverse proxy server in front of it (Nginx, Apache, etc.) to provide authentication if you know what you're doing. You will still not have any RBAC: every user who logs in will effectively be logged in as the same administrative user.

Graylog has RBAC, but it has its limits. For example, dashboards are not separated by user (I imagine this applies to many other features, but I'm not sure which). If you create a dashboard, any user who can see dashboards can see yours, and any user who can modify dashboards can modify yours.

Creating user roles in Graylog, surprisingly, cannot be done in the UI and must be done via the API. For example, see: Create Graylog power user role
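As a rough illustration of what "via the API" can look like, here is a minimal sketch that builds (but does not send) a role-creation request. It assumes the Graylog REST API is reachable at http://localhost:9000/api, that roles are created with a POST to /roles, and that the body takes name, description, permissions and read_only fields. The credentials, the role, the permission string and the X-Requested-By value are all hypothetical, so check them against the API browser of your Graylog version.

```python
import base64
import json
import urllib.request


def build_create_role_request(base_url, user, password, role):
    """Build (but do not send) a POST request that would create a Graylog role.

    Assumption: the endpoint is <base_url>/roles and accepts a JSON body.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/roles",
        data=json.dumps(role).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
            # Assumption: some Graylog versions reject POSTs without this header.
            "X-Requested-By": "role-setup-script",
        },
    )


# Hypothetical role; the permission string is illustrative only.
example_role = {
    "name": "Dashboard Reader",
    "description": "Can view dashboards but not edit them",
    "permissions": ["dashboards:read"],
    "read_only": False,
}

request = build_create_role_request(
    "http://localhost:9000/api", "admin", "changeme", example_role
)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and JSON body before touching a real server.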

Documentation

The Splunk documentation is excellent. Period.

The Graylog documentation looks organized, but it only takes a few minutes of actually reading it to realize it's a bit of a mess. It's an organized mess, but a mess nonetheless. It's much more difficult to determine how to do a task using the Graylog documentation.

This becomes evident as soon as you attempt one of the first tasks after setting up the Graylog server: sending data to it (see: Sending in log data).

Parsing received data

In Splunk, received data is quite often automatically parsed and there's nothing more to do. Any additional parsing/formatting you wish to do with the data after it's been received can typically be done on demand.

Timestamps

Graylog seems to require much more customization to handle received data. It often doesn't parse timestamps correctly, so they have to be modified at the source (for example, the default log4j v1 timestamp format wasn't correctly parsed).
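To make "modified at the source" concrete, here is a minimal sketch of normalizing a timestamp before it is shipped. The text above does not say which layout was in use; this assumes log4j's ISO8601 layout, yyyy-MM-dd HH:mm:ss,SSS (note the comma before the milliseconds), and rewrites it into a more conventional ISO 8601 string. The function name is hypothetical, and whether a given Graylog input accepts the result has not been verified here.

```python
from datetime import datetime

# Assumed input layout: "2019-05-17 14:30:00,123" (comma before milliseconds).
LOG4J_ISO8601 = "%Y-%m-%d %H:%M:%S,%f"


def normalize_log4j_timestamp(raw):
    """Convert the assumed log4j layout to ISO 8601 with a 'T' and a dot.

    The result carries no time zone, because the input has none.
    """
    parsed = datetime.strptime(raw, LOG4J_ISO8601)
    return parsed.isoformat(timespec="milliseconds")


print(normalize_log4j_timestamp("2019-05-17 14:30:00,123"))
# prints 2019-05-17T14:30:00.123
```

In practice the equivalent change is usually made in the logging configuration itself rather than in a separate script.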

Syslog

Another example: Splunk will by default join any multiline syslog messages that may get split by the client. Graylog not only does not do this, but it doesn't even seem possible to customize it to do so, so this must be corrected at the client.

Syslog messages over 1024 bytes will be split in Splunk just as they are in Graylog. With Splunk this can be fixed, e.g. by putting this into props.conf (under the stanza for the relevant source type):

SEDCMD-join_log4j_syslog_lines=s/\.\.\.[\r\n]+\.\.\.//g

There doesn't seem to be a way to fix this at all in Graylog.
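For readers less familiar with sed syntax, the Splunk SEDCMD above can be sketched in Python. It assumes, as the regex implies, that the client ends each truncated chunk with "..." and starts the continuation with "..."; the sample message and function name are made up for illustration.

```python
import re

# Same pattern as the SEDCMD: "..." + one or more line breaks + "..."
SPLIT_MARKER = re.compile(r"\.\.\.[\r\n]+\.\.\.")


def join_split_lines(raw):
    """Remove the split markers so the chunks form one message again."""
    return SPLIT_MARKER.sub("", raw)


sample = "ERROR something very lo...\n...ng happened"
print(join_split_lines(sample))
# prints ERROR something very long happened
```

A substitution like this only works when both chunks are visible at the same time, which may be why it is straightforward as a Splunk SEDCMD but has to be handled at the client for Graylog.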

Custom parsing

While Splunk can easily do custom parsing of data after it's been received, with Graylog this typically needs to be set up before the data is received. For example, see: Extract Java exceptions in Graylog
