@D-saif
Last active February 3, 2022 15:35

What is a Logging system?

We will build a system that continuously gathers, stores, processes and visualizes log data from disparate Docker containers in order to optimize system performance, identify technical issues, better manage resources, strengthen security and improve compliance. The process is divided into three steps:

  1. Collecting log data
  2. Storing log data
  3. Visualizing the logs in a dashboard
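
The three steps above can be sketched as a tiny in-memory pipeline. This is only an illustration, not the stack chosen later: the function names (`collect`, `store`, `visualize`), the "container level message" line layout and the list used as storage are all assumptions made for the example.

```python
from collections import Counter

def collect(raw_lines):
    """Step 1: turn raw 'container level message' lines into records."""
    records = []
    for line in raw_lines:
        # Assumed layout: first word is the container, second is the level.
        container, level, message = line.split(" ", 2)
        records.append({"container": container, "level": level, "message": message})
    return records

def store(records, storage):
    """Step 2: append records to a store (a plain list stands in for a database)."""
    storage.extend(records)
    return storage

def visualize(storage):
    """Step 3: summarize the stored logs as a text bar chart of counts per level."""
    counts = Counter(r["level"] for r in storage)
    return "\n".join(
        f"{level:<5} {'#' * n} ({n})" for level, n in sorted(counts.items())
    )

raw = [
    "web ERROR upstream timed out",
    "web INFO GET /health 200",
    "db INFO checkpoint complete",
]
storage = store(collect(raw), [])
print(visualize(storage))
```

In a real deployment each step would be handled by a dedicated tool (a collector, a search or storage engine, a dashboard), but the data flow stays the same.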

Why do we need log management?

Our Docker environment generates a lot of log data. To ensure our containers are secure and performing well, it is important to analyze these logs and gather insights. Locating, monitoring, and managing them manually quickly becomes a complex task. This is where log management and analysis tools come into play: they gather the log data, analyze it, and create visualizations.

Characteristics of the logging system

Level of monitoring (docker-level vs host-level)

In our logging system we will focus on logs generated by Docker containers rather than collecting every log on the host.
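
As a rough illustration of what a container-level log looks like, the sketch below parses one line in the shape written by Docker's default `json-file` logging driver, where each line is a JSON object with `log`, `stream` and `time` keys. It assumes that driver is in use; the sample line and the helper name `parse_docker_log_line` are made up for the example.

```python
import json

def parse_docker_log_line(line):
    """Parse one line written by Docker's json-file logging driver."""
    entry = json.loads(line)
    return {
        # The driver keeps the trailing newline of the original output line.
        "message": entry["log"].rstrip("\n"),
        "stream": entry["stream"],  # "stdout" or "stderr"
        "time": entry["time"],      # timestamp recorded by Docker
    }

# Hypothetical sample line, not taken from a real container.
sample = '{"log":"listening on port 8080\\n","stream":"stdout","time":"2022-02-03T15:35:00.000000000Z"}'
record = parse_docker_log_line(sample)
print(record)
```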

Which architecture are we seeking? (pull-based vs push-based)

In a pull-based architecture, the collector requests data from the monitored system. In a push-based architecture (our preference here), containers are emitters that send data to a central collector, so collection is fully distributed across the containers.

Advantages of the push-based approach over the pull-based approach:

  • Pull requires resources on the monitoring platform, since it is responsible for pulling the data.
  • A push-based platform only receives data, because collection is fully distributed across the hosts that emit it.
  • Push works well for stateless components such as Docker containers.
  • Push works well in dynamic environments.
  • Pull is less secure because it needs bidirectional network channels.
  • Push is unidirectional, so we don't have to expose services.
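
The difference between the two models can be sketched with a small in-process simulation. No networking is involved, and the `PushCollector`, `Container` and `pull_all` names are hypothetical; the point is only who initiates the transfer and who needs to know about whom.

```python
class PushCollector:
    """Central collector that only receives; it never contacts containers."""

    def __init__(self):
        self.received = []

    def receive(self, source, message):
        self.received.append((source, message))


class Container:
    """Hypothetical stand-in for a container that emits log lines."""

    def __init__(self, name, collector=None):
        self.name = name
        self.collector = collector
        self.buffer = []

    def log(self, message):
        self.buffer.append(message)
        if self.collector is not None:
            # Push: the emitter sends the line as soon as it is produced.
            self.collector.receive(self.name, message)


def pull_all(containers):
    """Pull: the collector must know every container and query each one."""
    collected = []
    for c in containers:
        collected.extend((c.name, m) for m in c.buffer)
    return collected


push_collector = PushCollector()
web = Container("web", push_collector)
web.log("started")
web.log("GET / 200")

db = Container("db")  # not wired to a collector; only reachable by pulling
db.log("ready")
pulled = pull_all([db])

print(push_collector.received)
print(pulled)
```

In the push case a new container only needs the collector's address, which is why the model suits short-lived, dynamically scheduled containers. In the pull case the collector has to maintain an up-to-date list of every container it should query.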

Which data are we collecting? (logs vs metrics)

Metrics: Metrics are measures of properties of a piece of software or hardware. When using metrics we keep track of a component's state, generally by recording data points or observations over time.

Logs: Even though metrics often appear to be the most straightforward part of a monitoring system, our system will rely most heavily on logs generated by Docker containers to help us understand what is going on in our environment.
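
To make the distinction concrete, here is a minimal sketch in which a metric (errors per minute) is derived from log records. The records and the `errors_per_minute` helper are invented for the example; the idea is that a log entry describes a single event, while a metric is a number tracked over time.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log records: each one describes a single event.
logs = [
    {"time": "2022-02-03T15:35:02", "level": "ERROR", "message": "upstream timed out"},
    {"time": "2022-02-03T15:35:40", "level": "INFO", "message": "GET /health 200"},
    {"time": "2022-02-03T15:36:10", "level": "ERROR", "message": "connection refused"},
]

def errors_per_minute(records):
    """Reduce individual log events to a metric: ERROR count per minute."""
    counts = Counter()
    for r in records:
        if r["level"] == "ERROR":
            minute = datetime.fromisoformat(r["time"]).strftime("%H:%M")
            counts[minute] += 1
    return dict(counts)

metric = errors_per_minute(logs)
print(metric)
```

The metric says how often something went wrong; the underlying log messages say what went wrong, which is why the system described here centers on logs.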

Stacks comparison

Categorization

Monitoring tools are divided into two big families:

  • SaaS solutions (Datadog, Sematext, Sensu, ...)
  • Open source tools (ELK, Grafana, Fluentd, ...)

Since we want to build our own logging system, we will choose an open source stack. With the previous information in hand, we can now compare tools and decide which one suits our needs.

Elastic Stack (ELK) (Elastic)

ELK is the acronym for three open source projects: Elasticsearch, Logstash, and Kibana. Elasticsearch is a search and analytics engine. Logstash is a server‑side data processing pipeline that ingests data from multiple sources simultaneously, transforms it, and then sends it to a "stash" like Elasticsearch. Kibana lets users visualize Elasticsearch data with charts and graphs.
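
As a rough idea of what ends up in the "stash", the sketch below builds a request body in the newline-delimited JSON format accepted by Elasticsearch's `_bulk` endpoint: one action line followed by one document line per record. It only builds the string and sends nothing. The index name `docker-logs` and the sample records are assumptions for the example, and in an ELK setup Logstash would normally do this work for us.

```python
import json

def build_bulk_body(index, records):
    """Build an NDJSON body for the _bulk endpoint: action line, then document."""
    lines = []
    for record in records:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(record))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"

# Hypothetical records and index name.
records = [
    {"container": "web", "level": "ERROR", "message": "upstream timed out"},
    {"container": "db", "level": "INFO", "message": "checkpoint complete"},
]
body = build_bulk_body("docker-logs", records)
print(body)
```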

Fluentd

Fluentd is a robust solution for data collection and is entirely open source. It does not offer a full frontend interface but instead acts as a collection layer to help organize different pipelines. Fluentd is used by some of the largest companies worldwide but can be implemented in smaller organizations as well.

Graylog

Graylog is a popular open source log management tool with a GUI that uses Elasticsearch as a backend. It provides centralized log collection, analysis, searching, visualization, and alerting features.

Nagios

Nagios started with a single developer back in 1999 and has since evolved into one of the most reliable open source tools for managing log data. The current version of Nagios can integrate with servers running Microsoft Windows, Linux, or Unix. However, Nagios has a pull-based architecture, which does not fit the push-based approach we prefer.

Heka (Mozilla)

Heka is an open-source log management solution. Designed by the team at Mozilla, it is a log and data collection tool written in Go. Like Logstash, it provides a pluggable framework for collecting and processing logs. It also allows output to a variety of destinations, including Elasticsearch. However, Heka is no longer maintained.
