Last active
March 12, 2021 14:43
one-more-prom-slides
version: '2.3'
services:
  prom:
    image: prom/prometheus:v2.23.0
    ports:
      - 9090:9090
    volumes:
      - "$PWD/configs:/etc/prometheus:ro"
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.retention=1d
  sample-app:
    build:
      dockerfile: Dockerfile
      context: sample-app
    # ports:
    #   - 8017:8000
  hey_summary:
    restart: always
    image: williamyeh/hey:latest
    command: -n 10 -c 1 http://sample-app:8000/summary/
  hey_histogram:
    restart: always
    image: williamyeh/hey:latest
    command: -n 10 -c 2 http://sample-app:8000/histogram/
  hey_counter:
    restart: always
    image: williamyeh/hey:latest
    command: -n 10 -c 1 http://sample-app:8000/counter/1/
  node_exporter:
    image: prom/node-exporter:v0.18.1
    ports:
      - 7080:9100
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.ignored-mount-points'
      - "^/(sys|proc|dev|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/aufs)($$|/)"
---
type: slide
---
# Yet another Prometheus getting started
---
## before we begin
```shell
git clone \
    git@....
cd prom-workshop-v2
docker-compose build
```
---
## docker-compose
```shell
sudo curl -L "https://github.com/docker/compose/releases/download/1.27.4/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
```
https://docs.docker.com/compose/install/
---
# monitoring models
---
## Push
```mermaid
graph LR;
App[App \'] -->C[collector \'];
C[collector \'] --> S(storage \');
G[grafana \'] -->S((storage \'));
```
---
## Pull
```mermaid
graph LR;
C[collector \'] --> A[App \'];
C[collector \'] --> S((storage \'));
G[grafana \'] --> S((storage \'));
```
---
# sample app
sample-app/app.py
---
# Metric types
* gauge
* counter
* summary
* histogram
---
## Gauge
* temperature
* status
* memory used
* disk used
---
## gauge typical operations
* show raw value
* deriv -- per second change
* delta -- delta between two points
Note:
The temperature rose by 10. delta does not account for jumps between the two points.
---
```python
from prometheus_client import Gauge
g = Gauge('sensor_temperature', 'CPU temperature')
g.inc()      # Increment by 1
g.dec(10)    # Decrement by given value
g.set(4.2)   # Set to a given value
```
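The gauge operations listed on the previous slide can be sketched in plain Python. This is a simplified illustration, not Prometheus's implementation: `delta` here is only last minus first (PromQL additionally extrapolates to the window boundaries), and `deriv` is the slope of a least-squares line through the samples. The `temperature` readings are made up.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, gauge value)


def delta(samples: List[Sample]) -> float:
    """Last value minus first value; spikes in between are invisible."""
    return samples[-1][1] - samples[0][1]


def deriv(samples: List[Sample]) -> float:
    """Per-second slope of a least-squares line fitted through the samples."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    return cov / var


# Hypothetical temperature readings, one every 15 seconds.
temperature = [(0.0, 40.0), (15.0, 55.0), (30.0, 45.0), (45.0, 50.0)]

print(delta(temperature))  # 10.0, the jump to 55 in between is not reflected
print(deriv(temperature))  # roughly 0.133 degrees per second
```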
---
## Counter
* increments each time an event happens
* http requests count
* error count
---
## Counter typical operations
* rate -- per second change
* increase -- delta between two points
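A rough Python sketch of what `increase` and `rate` do with raw counter samples, including the handling of a counter reset after a restart. It assumes a simplified model: PromQL also extrapolates the result to the edges of the range window, which is left out here. The `requests` samples are invented for illustration.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def increase(samples: List[Sample]) -> float:
    """Total growth of the counter over the window, tolerating resets."""
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the process restarted and the counter began again from zero,
        # so the whole current value counts as new growth.
        total += cur if cur < prev else cur - prev
    return total


def rate(samples: List[Sample]) -> float:
    """Average per-second growth over the window."""
    elapsed = samples[-1][0] - samples[0][0]
    return increase(samples) / elapsed


# Hypothetical request counter; the app restarts between t=15 and t=30.
requests = [(0.0, 100.0), (15.0, 130.0), (30.0, 5.0), (45.0, 20.0)]

print(increase(requests))  # 50.0 (30 before the restart, 5 + 15 after it)
print(rate(requests))      # roughly 1.11 requests per second
```

A plain last-minus-first on the same data would give -80, which is why counters are queried through `rate` or `increase` rather than `delta`.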
---
```python=
from prometheus_client import Counter
c = Counter('app_failures', 'Errors count')
c.inc()     # Increment by 1
c.inc(1.6)  # Increment by given value
```
---
# Counter is king!
## When in doubt, use a counter
---
# Looks like we are ready to start
```shell
% docker-compose up -d
```
---
# Latency && sizes
Note:
Some events occur many times within one scrape interval and carry several measurements, for example response size or response time.
---
# Complex types
* summary
* histogram
---
## summary
* precalculated
* incomparable (quantiles cannot be aggregated across instances)
* uses gauges
* 2/4 of people jump lower than 50 cm
* 3/4 of people jump lower than 90 cm
* 99/100 of people jump lower than 120 cm
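Why precalculated quantiles are "incomparable" can be shown with a few lines of Python. The sketch below uses two hypothetical instances with invented latencies and compares the average of their medians with the median of all observations together.

```python
from statistics import median

# Hypothetical latencies in seconds: instance A served nine fast requests,
# instance B served a single slow one.
instance_a = [0.1] * 9
instance_b = [2.0]

# What you get by averaging the precalculated per-instance medians.
averaged = (median(instance_a) + median(instance_b)) / 2

# What the median really is across all ten requests.
true_median = median(instance_a + instance_b)

print(averaged)     # about 1.05
print(true_median)  # 0.1
```

The averaged value is more than ten times off, because each instance's quantile carries no information about how many observations stand behind it.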
---
## summary example
```
go_gc_duration_seconds{quantile="0"} 0.000008394
go_gc_duration_seconds{quantile="0.25"} 0.000010507
go_gc_duration_seconds{quantile="0.5"} 0.000011205
go_gc_duration_seconds{quantile="0.75"} 0.000012347
go_gc_duration_seconds{quantile="1"} 0.000040238
```
---
```shell
Latency distribution:
  10% in 0.1564 secs
  25% in 0.2939 secs
  50% in 0.4126 secs
  75% in 0.8355 secs
  90% in 1.0241 secs
  95% in 1.4024 secs
  99% in 1.4448 secs
```
---
## summary typical operations
* show specific quantile
* alert on it :)
---
```python=
from prometheus_client import Summary
s = Summary('request_latency_seconds', 'Description of summary')
s.observe(4.7)    # Observe 4.7 (seconds in this case)
```
<span>Useless with the official Python client: it does not expose quantiles<!-- .element: class="fragment" data-fragment-index="1" --></span>
---
## histogram
* uses counters
* comparable
---
## histogram example
```
sample_app_histogram_bucket{le="0.005"} 0
sample_app_histogram_bucket{le="0.01"} 0
sample_app_histogram_bucket{le="0.025"} 0
sample_app_histogram_bucket{le="0.05"} 0
sample_app_histogram_bucket{le="0.075"} 0
sample_app_histogram_bucket{le="0.1"} 0
...
sample_app_histogram_bucket{le="2.5"} 3
sample_app_histogram_bucket{le="5.0"} 8
sample_app_histogram_bucket{le="7.5"} 11
sample_app_histogram_bucket{le="10.0"} 19
sample_app_histogram_bucket{le="+Inf"} 20
```
---
```
Response time histogram:
  0.069 [1]  |■
  0.214 [24] |■■■■■■■■■■■■■■
  0.359 [71] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.505 [12] |■■■■■■■
  0.650 [33] |■■■■■■■■■■■■■■■■■■■
  0.795 [5]  |■■■
  0.941 [19] |■■■■■■■■■■■
  1.086 [19] |■■■■■■■■■■■
  1.231 [1]  |■
  1.376 [1]  |■
  1.522 [14] |■■■■■■■■
```
---
## histogram typical operations
* turn it into a quantile with some precision
* rate it to get a distribution
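"Quantile with some precision" means the value is estimated from cumulative bucket counts, so it can never be more precise than the bucket boundaries. Below is a minimal Python sketch of that estimation, assuming linear interpolation inside the bucket that contains the requested rank, which is the approach PromQL's `histogram_quantile` takes. The function is an illustration written for these slides, not the Prometheus code, and it is fed the bucket values from the histogram example slide.

```python
import math
from typing import List, Tuple

Bucket = Tuple[float, float]  # (le upper bound, cumulative count)


def histogram_quantile(q: float, buckets: List[Bucket]) -> float:
    """Estimate the q-quantile from cumulative buckets by linear interpolation."""
    buckets = sorted(buckets)
    rank = q * buckets[-1][1]  # the +Inf bucket holds the total count
    lower_bound, count_below = 0.0, 0.0
    for upper_bound, cumulative in buckets:
        if cumulative >= rank:
            in_bucket = cumulative - count_below
            if math.isinf(upper_bound) or in_bucket == 0:
                # Nothing to interpolate towards: report the last finite bound.
                return lower_bound
            fraction = (rank - count_below) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, count_below = upper_bound, cumulative
    return float("nan")


# Bucket values taken from the histogram example slide.
sample_app = [(0.1, 0), (2.5, 3), (5.0, 8), (7.5, 11), (10.0, 19), (math.inf, 20)]

print(histogram_quantile(0.5, sample_app))   # between 5.0 and 7.5
print(histogram_quantile(0.99, sample_app))  # 10.0, capped at the last finite bucket
```

In a real query the buckets would first go through `rate(...)`, so the quantile describes recent traffic rather than everything since the process started.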
---
```python=
from prometheus_client import Histogram
h = Histogram('request_latency_seconds', 'Description of histogram')
h.observe(4.7)    # Observe 4.7 (seconds in this case)
```
---
# Labels
* key-value pairs
* label value is always quoted
```{tier="prod"}```
---
## Labels with special meaning
* instance -- a single running copy of a service
* job -- a group of instances
---
## label operations
* key = "value"
* key =~ "value"
* key != "value"
* key !~ "value"
---
# RE2 warning
https://github.com/google/re2/wiki/Syntax
Lots of syntax is marked (NOT SUPPORTED)
---
## typical regex operations
```
{status=~"200|201|3.."}
{status!~"5.."}
```
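The four label operations can be mimicked in Python to see how they behave. This is a sketch under two assumptions worth stating: Python's `re` module stands in for RE2 (the syntaxes overlap for simple patterns like the ones above, but are not identical), and `matches` is a hypothetical helper written for this slide. Regex matchers in PromQL are fully anchored, which `re.fullmatch` reproduces, and a missing label is treated as an empty value.

```python
import re
from typing import Dict


def matches(labels: Dict[str, str], key: str, op: str, value: str) -> bool:
    """Evaluate one label matcher against a series' label set."""
    actual = labels.get(key, "")  # a missing label behaves like an empty value
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    if op == "=~":
        return re.fullmatch(value, actual) is not None
    if op == "!~":
        return re.fullmatch(value, actual) is None
    raise ValueError(f"unknown operator: {op}")


print(matches({"status": "301"}, "status", "=~", "200|201|3.."))   # True
print(matches({"status": "2000"}, "status", "=~", "200|201|3.."))  # False, anchored
print(matches({"status": "503"}, "status", "!~", "5.."))           # False
```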
---
# Let's get back to our workshop
---
```shell
% docker-compose up -d
```
---
## description
* http://0.0.0.0:9090 -- prometheus
* http://0.0.0.0:8017 -- sample app (disabled)
---
# basic querying
http://0.0.0.0:9090
---
## prepare
http://0.0.0.0:9090/targets
green?
---
## simple query
```plaintext
up
```
---
## add labels
```plaintext
up{job="prometheus"}
```
---
## apply function
```
count(up{})
```
---
## add group by
```
count(up{}) by (job)
count(up{}) by (job, instance)
```
Note:
all functions and help
---
# all metrics returned by app
```
{job="app"}
```
---
tasks
1. show up instances for prometheus/node_exporter using regexp
2. show disk space usage over the last minute. Which filesystem type is in use? How did you figure that out?
3. make a distribution for prometheus_http_request_duration_seconds_bucket
4. availability of exporters
Note:
```
1. count(up) by (job)
2. deriv(node_filesystem_free_bytes[1m])
3. rate(prometheus_http_request_duration_seconds_bucket[1m])
4. sum_over_time(up[1h]) / count_over_time(up[1h])
```
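The availability answer is simply the share of successful scrapes in the window: `up` is 1 when a scrape succeeds and 0 when it fails. A tiny Python sketch with made-up scrape results, assuming one sample per scrape:

```python
from typing import List


def availability(up_samples: List[int]) -> float:
    """sum_over_time / count_over_time: the fraction of scrapes that succeeded."""
    return sum(up_samples) / len(up_samples)


# Hypothetical scrape results for one exporter: two of eight scrapes failed.
print(availability([1, 1, 0, 1, 1, 1, 0, 1]))  # 0.75
```

The same ratio can be written more compactly in PromQL as `avg_over_time(up[1h])`.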
---
# Questions?
---
# Next workshop
1. how to get an alert
2. how our monitoring relates to this
---
# homework
* latency buckets for your app; exclude static content
from uvicorn import run
from fastapi import FastAPI
from fastapi.responses import PlainTextResponse
import random
from prometheus_client import generate_latest, REGISTRY, Counter, Gauge, Histogram, Summary

app = FastAPI()

PROMETHEUS_COUNTER: Counter = Counter('sample_app_counter', 'count')
PROMETHEUS_GAUGE: Gauge = Gauge('sample_app_gauge', 'gauge')
PROMETHEUS_HISTOGRAM: Histogram = Histogram('sample_app_histogram', 'histogram')
PROMETHEUS_SUMMARY: Summary = Summary('sample_app_summary', 'summary')
# The Python client doesn't store or expose quantile information at this time.


@app.get('/summary/{num}')
def summary(num: int) -> None:
    PROMETHEUS_SUMMARY.observe(num)


@app.get('/histogram/')
def histogram() -> None:
    num = random.uniform(0, 11.0)
    PROMETHEUS_HISTOGRAM.observe(num)


@app.get('/gauge/+/{num}')
def gauge_inc(num: int) -> None:
    PROMETHEUS_GAUGE.inc(num)


@app.get('/gauge/-/{num}')
def gauge_dec(num: int) -> None:
    PROMETHEUS_GAUGE.dec(num)


@app.get('/gauge/=/{num}')
def gauge_set(num: int) -> None:
    PROMETHEUS_GAUGE.set(num)


@app.get('/counter/{num}')
def inc(num: int) -> None:
    PROMETHEUS_COUNTER.inc(num)


@app.get('/metrics', response_class=PlainTextResponse)
def metrics():
    return generate_latest(REGISTRY)