@jakedsouza
Created February 16, 2018 02:18
Prometheus notes

% Memory Used

(node_memory_MemTotal - node_memory_MemFree) / node_memory_MemTotal * 100

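% CPU

((sum(node_cpu{mode=~"user|nice|system|irq|softirq|steal|idle|iowait"}) by (instance, job)) - (sum(node_cpu{mode=~"idle|iowait"}) by (instance, job))) / (sum(node_cpu{mode=~"user|nice|system|irq|softirq|steal|idle|iowait"}) by (instance, job)) * 100

Note: node_cpu is a cumulative counter, so this expression gives the average since boot. For utilization over a recent window, wrap each node_cpu selector in rate(), e.g. rate(node_cpu{...}[5m]).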

http://stackoverflow.com/questions/23367857/accurate-calculation-of-cpu-usage-given-in-percentage-in-linux

According to the htop source code at the time of writing:

https://github.com/hishamhm/htop/blob/master/ProcessList.c

// Guest time is already accounted in usertime

  • usertime = usertime - guest; # As you see here, it subtracts guest from user time
  • nicetime = nicetime - guestnice; # and guest_nice from nice time
  • // Fields existing on kernels >= 2.6 (and RHEL's patched kernel 2.4...)
  • idlealltime = idletime + ioWait; # ioWait is added in the idleTime
  • systemalltime = systemtime + irq + softIrq;
  • virtalltime = guest + guestnice;
  • totaltime = usertime + nicetime + systemalltime + idlealltime + steal + virtalltime;

So from the following:

      user   nice  system     idle  iowait   irq  softirq  steal  guest  guest_nice
cpu  74608   2520   24433  1117073    6176  4054        0      0      0           0

CPU usage between a previous sample (the Prev* values) and the current sample is then:
  • PrevIdle=previdle+previowait
  • Idle=idle+iowait
  • PrevNonIdle=prevuser+prevnice+prevsystem+previrq+prevsoftirq+prevsteal
  • NonIdle=user+nice+system+irq+softirq+steal
  • PrevTotal=PrevIdle+PrevNonIdle
  • Total=Idle+NonIdle
  • CPU_Percentage=((Total-PrevTotal)-(Idle-PrevIdle))/(Total-PrevTotal)
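
The steps above can be sketched in Python as follows. This is only an illustration: the `CpuSample` class and the second sample's numbers are hypothetical (the first sample is the /proc/stat line shown earlier), and the result is multiplied by 100 to give a percentage rather than a fraction.

```python
from dataclasses import dataclass


@dataclass
class CpuSample:
    """One 'cpu' line from /proc/stat (guest fields omitted, as in the formula)."""
    user: int
    nice: int
    system: int
    idle: int
    iowait: int
    irq: int
    softirq: int
    steal: int

    @property
    def idle_all(self) -> int:
        # Idle = idle + iowait
        return self.idle + self.iowait

    @property
    def non_idle(self) -> int:
        # NonIdle = user + nice + system + irq + softirq + steal
        return (self.user + self.nice + self.system
                + self.irq + self.softirq + self.steal)

    @property
    def total(self) -> int:
        # Total = Idle + NonIdle
        return self.idle_all + self.non_idle


def cpu_percentage(prev: CpuSample, cur: CpuSample) -> float:
    """Percent of non-idle time between two samples."""
    total_delta = cur.total - prev.total
    idle_delta = cur.idle_all - prev.idle_all
    if total_delta <= 0:
        return 0.0  # no time elapsed between samples (or counters reset)
    return 100.0 * (total_delta - idle_delta) / total_delta


# First sample: the values from the /proc/stat line in these notes.
prev = CpuSample(74608, 2520, 24433, 1117073, 6176, 4054, 0, 0)
# Second sample: hypothetical values, made up for illustration.
cur = CpuSample(74708, 2520, 24483, 1117873, 6226, 4054, 0, 0)

usage = cpu_percentage(prev, cur)
print(f"{usage:.1f}%")
```

The PromQL query under % CPU does the same subtraction, but on the cumulative counters rather than on the difference between two samples.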

Disk

http://www.xaprb.com/blog/2010/01/09/how-linux-iostat-computes-its-results/

https://en.wikipedia.org/wiki/Df_(Unix)

Disk percent used, as computed below, differs from what df reports because of how df does the calculation: space used / (space used + space free). Prometheus does not expose "space used" directly; only free, avail and size are available.

% Disk Used

100 * (1 - (node_filesystem_free{mountpoint="/"} / node_filesystem_size{mountpoint="/"}))
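
To see why the two numbers disagree, here is a small Python sketch comparing the size-based formula above with a df-style one. It assumes "space used" can be approximated as size - free and that the df-style denominator uses the space available to unprivileged users (the avail value). The function names and the sample numbers are hypothetical.

```python
def pct_used_size_based(size: float, free: float) -> float:
    """Same arithmetic as the PromQL query: 100 * (1 - free / size)."""
    return 100.0 * (1.0 - free / size)


def pct_used_df_style(size: float, free: float, avail: float) -> float:
    """df-style: used / (used + available), with used assumed to be size - free."""
    used = size - free
    return 100.0 * used / (used + avail)


# Hypothetical filesystem: 100 units total, 20 free, of which 15 are
# available to unprivileged users (the rest being reserved blocks).
size, free, avail = 100.0, 20.0, 15.0

size_based = pct_used_size_based(size, free)
df_style = pct_used_df_style(size, free, avail)
print(f"size-based: {size_based:.2f}%  df-style: {df_style:.2f}%")
```

With reserved blocks present, the df-style figure comes out higher because the reserved space is dropped from the denominator.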

% iNodes used

100 * (1 - (node_filesystem_files_free{mountpoint="/"} / node_filesystem_files{mountpoint="/"}))

Disk Queue Length

node_disk_io_now{device="xvda"}

https://www.kernel.org/doc/Documentation/iostats.txt

No. of I/Os currently in progress - The only field that should go to zero. Incremented as requests are given to appropriate struct request_queue and decremented as they finish.

Disk weighted i/o milliseconds

weighted # of milliseconds spent doing I/Os

This field is incremented at each I/O start, I/O completion, I/O merge, or read of these stats by the number of I/Os in progress (field 9) times the number of milliseconds spent doing I/O since the last update of this field. This can provide an easy measure of both I/O completion time and the backlog that may be accumulating.
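
One way to turn this counter into a backlog figure is to divide its increase over an interval by the interval's length, which gives the average number of I/Os in flight. This is understood to be the approach iostat takes for its average queue size, but treat it as an assumption. The function name and sample values below are hypothetical.

```python
def avg_queue_length(prev_weighted_ms: float,
                     cur_weighted_ms: float,
                     elapsed_ms: float) -> float:
    """Average number of in-flight I/Os over the interval.

    The weighted counter grows by (I/Os in progress) * (ms elapsed), so its
    delta divided by wall-clock ms is the time-averaged queue depth.
    """
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return (cur_weighted_ms - prev_weighted_ms) / elapsed_ms


# Hypothetical readings of the weighted counter taken 5 seconds apart.
queue = avg_queue_length(prev_weighted_ms=120_000,
                         cur_weighted_ms=127_500,
                         elapsed_ms=5_000)
print(queue)
```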
