Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. Since its inception in 2012, many companies and organizations have adopted Prometheus, and the project has a very active developer and user community. It is now a standalone open-source project and is maintained independently of any company.
In this guide we will use Debian 10 (Buster). If you are using a cloud instance, for example on Amazon Web Services, you can select the Debian 10 (Buster) 64-bit Amazon Machine Image (AMI) when provisioning it.
Don't forget to attach an additional raw disk, because this tutorial also shows how to manage that disk with a ZFS storage pool (zpool), which will later hold the Prometheus data store.
Before starting the installation, make sure you can escalate your privileges on the system by typing the following command:
sudo -i
To get the latest ZFS version, we need to enable the Debian 10 (Buster) Backports repository. To do that, create a new file with a text editor:
nano /etc/apt/sources.list.d/backports.list
Add the following line to the file:
deb http://deb.debian.org/debian buster-backports main contrib
Next, we pin the ZFS packages to the backports repository so that they are always installed from there. This is done by creating another file:
nano /etc/apt/preferences.d/90_zfs
Then add the following lines:
Package: libnvpair1linux libuutil1linux libzfs2linux libzpool2linux spl-dkms zfs-dkms zfs-test zfsutils-linux zfsutils-linux-dev zfs-zed
Pin: release n=buster-backports
Pin-Priority: 990
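If you prefer not to open an editor, the two files above can also be written from a script. This is only a minimal sketch: the function name write_zfs_backports_config and its optional prefix argument are hypothetical, added here so the result can be inspected in a scratch directory before anything under /etc is touched.

```shell
#!/bin/sh
# Write the backports source list and the ZFS pin file under an optional prefix.
# With no argument the files land in /etc/apt (requires root).
write_zfs_backports_config() {
  prefix=${1:-}
  mkdir -p "$prefix/etc/apt/sources.list.d" "$prefix/etc/apt/preferences.d"

  cat > "$prefix/etc/apt/sources.list.d/backports.list" <<'EOF'
deb http://deb.debian.org/debian buster-backports main contrib
EOF

  cat > "$prefix/etc/apt/preferences.d/90_zfs" <<'EOF'
Package: libnvpair1linux libuutil1linux libzfs2linux libzpool2linux spl-dkms zfs-dkms zfs-test zfsutils-linux zfsutils-linux-dev zfs-zed
Pin: release n=buster-backports
Pin-Priority: 990
EOF
}

# Dry run into a scratch directory; call the function without an argument,
# as root, to write the real files.
scratch=$(mktemp -d)
write_zfs_backports_config "$scratch"
cat "$scratch/etc/apt/preferences.d/90_zfs"
```

Once the real files are in place and the package lists have been refreshed, apt-cache policy zfs-dkms should show the buster-backports candidate with priority 990.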
Now that you have root access and the Backports repository is in place, make sure the operating system (OS), in this case Debian 10 (Buster), is running its latest packages before installing or configuring anything else. Type the following commands to upgrade the OS to the latest version:
apt-get update
apt-get -y dist-upgrade
apt-get -y install wget nano tar gzip
apt-get -y purge --autoremove
apt-get -y clean
We use a distribution upgrade (dist-upgrade) here rather than a standard upgrade because we want to make sure the OS is also on its latest minor release.
After the upgrade is complete, reboot the OS to make sure it is also running the latest Linux kernel provided by the distribution:
reboot
After the reboot has completed, escalate your privileges again with sudo -i, since this tutorial continues to use the root user.
Before installing Prometheus we install ZFS, since we want better disk management in the future. With the Debian 10 (Buster) Backports repository added, we can get the latest ZFS by running the following commands:
apt-get -y install dpkg-dev linux-headers-$(uname -r)
apt-get -y install zfs-dkms zfsutils-linux
To make sure ZFS works properly, you can type the following commands:
lsmod | grep zfs
zfs version
They should return the loaded zfs kernel module and the installed ZFS version.
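The module check can also be scripted, which is handy when the host is provisioned automatically. A minimal sketch follows; the function name zfs_module_loaded and its optional file argument are hypothetical, and it reads /proc/modules, the file that lsmod formats.

```shell
#!/bin/sh
# Return success when a module named exactly "zfs" appears in the module list.
# The list defaults to /proc/modules; another file can be passed for testing.
zfs_module_loaded() {
  modules_file=${1:-/proc/modules}
  grep -q '^zfs ' "$modules_file" 2>/dev/null
}

if zfs_module_loaded; then
  echo "zfs kernel module is loaded"
else
  echo "zfs kernel module is NOT loaded; try: modprobe zfs" >&2
fi
```

Matching on "zfs " at the start of the line avoids false positives from helper modules whose lines merely mention zfs as a dependency.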
Before creating the zpool, make sure the raw disk has been initialized. This step is optional. We can initialize the disk with the following steps:
-
Check the available disks
lsblk
In this tutorial we will use the disk that the command above shows as /dev/vdb
-
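Then create a partition on the disk by running fdisk and answering its prompts in the order shown. The ">" below separates keystrokes typed inside fdisk and is not shell redirection; the three (Enter) presses accept the default partition number, first sector, and last sector
fdisk /dev/vdb > n > p > (Enter) > (Enter) > (Enter) > w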
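The interactive answers can also be fed to fdisk from a script. The sketch below assumes an empty disk, where fdisk asks for the partition type, partition number, first sector, and last sector in that order; a disk that already carries a signature or partitions may prompt differently. The function name fdisk_answers is hypothetical, and the destructive call is left commented out.

```shell
#!/bin/sh
# Emit the answers for fdisk, one per line:
# n (new), p (primary), three empty lines to accept the default partition
# number, first sector, and last sector, and finally w (write and quit).
fdisk_answers() {
  printf 'n\np\n\n\n\nw\n'
}

# Show the sequence with the empty answers made visible.
fdisk_answers | sed 's/^$/<Enter>/'

# DESTRUCTIVE: run only after confirming the device name with lsblk.
# fdisk_answers | fdisk /dev/vdb
```

After the real run, lsblk should list the new partition (vdb1 in this tutorial) under the disk.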
Once partitioning is complete, we can create the zpool with the following commands:
zpool create -O compression=lz4 -O atime=off -O mountpoint=none -o ashift=12 data vdb1
zpool list -v
Next, create a ZFS dataset for Prometheus:
mkdir -p /var/lib/prometheus
zfs create -o mountpoint=/var/lib/prometheus data/prometheus
That completes the Prometheus data store.
Since Prometheus is built with the Go programming language, it is easy to install from the pre-built releases in its GitHub repository. First, download the pre-built release from GitHub with the following commands:
PROMETHEUS_VERSION=2.28.1 # You can change the version as you need it
PROMETHEUS_ARCH="amd64" # You can change the arch based on your architecture
wget -O /tmp/prometheus-${PROMETHEUS_VERSION}-${PROMETHEUS_ARCH}.tar.gz https://github.com/prometheus/prometheus/releases/download/v${PROMETHEUS_VERSION}/prometheus-${PROMETHEUS_VERSION}.linux-${PROMETHEUS_ARCH}.tar.gz
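Before extracting the archive it is worth comparing its checksum with the SHA-256 value listed for the same file on the project's release page. The helper below is a minimal sketch; the name verify_sha256 is hypothetical, and the expected digest is deliberately left as a placeholder to be copied from the release page.

```shell
#!/bin/sh
# Succeed only when the SHA-256 digest of the file equals the expected hex string.
verify_sha256() {
  file=$1
  expected=$2
  actual=$(sha256sum "$file" | awk '{print $1}')
  [ "$actual" = "$expected" ]
}

# Example usage (EXPECTED is a placeholder; copy the real digest for your
# version and architecture from the release page):
# EXPECTED="<sha256 from the release page>"
# if verify_sha256 "/tmp/prometheus-${PROMETHEUS_VERSION}-${PROMETHEUS_ARCH}.tar.gz" "$EXPECTED"; then
#   echo "checksum OK"
# else
#   echo "checksum MISMATCH" >&2
# fi
```

Comparing the digest directly, rather than using sha256sum -c, sidesteps the fact that the file was saved under a different name than the one used upstream.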
We have now downloaded the pre-built release and saved it in the temporary directory /tmp. Next, prepare the installation directory; in this guide we use /opt/prometheus as the installation destination. To do that, use the following commands:
mkdir -p /opt/prometheus
tar xzvf /tmp/prometheus-${PROMETHEUS_VERSION}-${PROMETHEUS_ARCH}.tar.gz -C /opt/prometheus --strip-components=1 --no-same-owner
rm -f /tmp/prometheus-${PROMETHEUS_VERSION}-${PROMETHEUS_ARCH}.tar.gz
At this point the pre-built Prometheus release has been extracted successfully. Next, create the prometheus user and group:
groupadd prometheus --system
useradd prometheus --system -g prometheus -M -N -d /opt/prometheus
Then set the ownership of all Prometheus files to the prometheus user and group:
chown -R prometheus:prometheus /opt/prometheus
chown -R prometheus:prometheus /var/lib/prometheus
Prometheus should run as a service so that it starts automatically when the system boots. First, create the service file:
nano /etc/systemd/system/prometheus.service
Add the following lines as the service file content:
[Unit]
Description=Prometheus Service
Documentation=https://prometheus.io/docs/
Requires=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
ExecStart=/opt/prometheus/prometheus --config.file /opt/prometheus/prometheus.yml --storage.tsdb.path /var/lib/prometheus --storage.tsdb.retention.time=30d --web.console.templates=/opt/prometheus/consoles --web.console.libraries=/opt/prometheus/console_libraries
[Install]
WantedBy=multi-user.target
Hint: You can change the Prometheus data retention by adjusting the --storage.tsdb.retention.time flag in the service file
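As an illustration of the hint, the retention value can be changed with a small sed helper instead of an editor. This is a sketch only: the function name set_retention is hypothetical, it relies on GNU sed's -i option as shipped with Debian, and it expects the flag to be written in the --storage.tsdb.retention.time=VALUE form used in the service file above.

```shell
#!/bin/sh
# Replace the value of --storage.tsdb.retention.time in a unit file.
# Usage: set_retention <unit-file> <duration>, for example 15d or 90d.
set_retention() {
  unit_file=$1
  duration=$2
  sed -i "s/\(--storage\.tsdb\.retention\.time=\)[^ ]*/\1${duration}/" "$unit_file"
}

# Example (as root); systemd must be reloaded afterwards, and a service that
# is already running must be restarted to pick up the new value:
# set_retention /etc/systemd/system/prometheus.service 90d
```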
After creating or editing the service file, reload systemd so that it picks up the change:
systemctl daemon-reload
In this section we configure Prometheus to monitor our OpenShift / OKD cluster by following these steps:
-
Back up the default Prometheus configuration file
mv /opt/prometheus/prometheus.yml /opt/prometheus/prometheus.yml.orig~
-
Create the Prometheus configuration file for OpenShift / OKD metrics
nano /opt/prometheus/prometheus.yml
Then add the following content:
global:
  scrape_interval: 30s
  scrape_timeout: 10s
scrape_configs:
  - job_name: cadvisor
    scheme: https
    tls_config:
      insecure_skip_verify: true
    bearer_token_file: /etc/prometheus/token_YOUR-CLUSTER-NAME.gob
    metrics_path: /metrics/cadvisor
    kubernetes_sd_configs:
      - api_server: https://YOUR-OKD-CLUSTER-CONSOLE-URL
        tls_config:
          insecure_skip_verify: true
        bearer_token_file: /etc/prometheus/token_YOUR-CLUSTER-NAME.gob
        role: node
    relabel_configs:
    metric_relabel_configs:
      - action: replace
        source_labels: [id]
        regex: '^/system\.slice/(.+)\.service$'
        target_label: systemd_service_name
        replacement: '${1}'
-
Create the Prometheus configuration directory for the OpenShift / OKD cluster token
mkdir -p /etc/prometheus
-
Create the OpenShift / OKD cluster token to be used by Prometheus (run from a cluster master node)
oc create sa prometheus -n default
oc adm policy add-cluster-role-to-user cluster-admin -z prometheus -n default
oc sa get-token prometheus
-
Copy the token output and save it in the Prometheus configuration directory that we created before
nano /etc/prometheus/token_YOUR-CLUSTER-NAME.gob
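Two checks are worth running once the steps above are done: the service runs as the prometheus user, so the token file has to be readable by that user, and the configuration can be validated with promtool before the first start. The sketch below assumes promtool was extracted alongside the prometheus binary in /opt/prometheus; the function names secure_token_file and check_prometheus_config are hypothetical.

```shell
#!/bin/sh
# Restrict a token file to its owner; optionally hand it to a given user
# (changing ownership requires root).
secure_token_file() {
  token_file=$1
  owner=${2:-}
  chmod 600 "$token_file"
  if [ -n "$owner" ]; then
    chown "$owner" "$token_file"
  fi
}

# Run "promtool check config" if the binary is available; return 2 otherwise.
check_prometheus_config() {
  promtool=${1:-/opt/prometheus/promtool}
  config=${2:-/opt/prometheus/prometheus.yml}
  if [ ! -x "$promtool" ]; then
    echo "promtool not found at $promtool" >&2
    return 2
  fi
  "$promtool" check config "$config"
}

# Example (as root):
# secure_token_file /etc/prometheus/token_YOUR-CLUSTER-NAME.gob prometheus:prometheus
# check_prometheus_config
```

Keeping the token at mode 600 matters here because it belongs to a service account that was granted the cluster-admin role.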
To start and enable the Prometheus service, use the following commands:
systemctl start prometheus
systemctl enable prometheus
To check the Prometheus service status and logs, use either of the following commands:
systemctl status -l prometheus
# OR
journalctl -ru prometheus
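Beyond systemd's view of the process, Prometheus exposes a health endpoint that can be probed over HTTP. The following is a minimal sketch using wget, which was installed earlier; it assumes the default listen address of localhost:9090, since the service file above does not override it, and the function name prometheus_healthy is hypothetical.

```shell
#!/bin/sh
# Probe the Prometheus health endpoint; succeeds only on an HTTP success response.
# The base URL defaults to the standard listen address on the local host.
prometheus_healthy() {
  base_url=${1:-http://localhost:9090}
  wget -q -T 5 -t 1 -O /dev/null "$base_url/-/healthy"
}

if prometheus_healthy; then
  echo "Prometheus is healthy"
else
  echo "Prometheus did not answer on the health endpoint" >&2
fi
```

A single try with a short timeout keeps the probe from hanging when the service is still starting.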