sudo apt install prometheus prometheus-node-exporter
sudo tee /etc/ufw/applications.d/prometheus <<EOF
[Prometheus]
title=Prometheus UI
description=Prometheus monitoring web UI.
ports=9090/tcp
EOF
sudo ufw allow from 85.195.116.121 to any app prometheus
In Grafana: Add data source -> Prometheus -> URL: http://server:9090/
Run Fantom Opera with the following parameters:
opera ... --pprof --metrics --metrics.expensive
You can skip --metrics.expensive if you are not interested in StateDB timers or per-method RPC stats.
You can check which metrics are available, and how long one scrape takes (to choose a sufficient scrape_timeout), using:
time curl http://localhost:6060/debug/metrics/prometheus
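The check above can be split into two small helpers, one for the scrape duration and one for the number of exported samples. This is only a sketch, assuming bash, curl and grep; the function names scrape_seconds and count_samples are made up for this example and are not part of Opera or Prometheus.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative only.

# Print how long one scrape of the given URL takes, in seconds.
scrape_seconds() {
    curl -s -o /dev/null -w '%{time_total}\n' "$1"
}

# Count sample lines in Prometheus text-format input on stdin
# (blank lines and '#' comment/metadata lines are skipped).
count_samples() {
    grep -Evc '^[[:space:]]*(#|$)'
}

# Usage (needs a running Opera started with --metrics):
#   scrape_seconds http://localhost:6060/debug/metrics/prometheus
#   curl -s http://localhost:6060/debug/metrics/prometheus | count_samples
```

If scrape_seconds reports a value close to the configured scrape_timeout, the timeout is probably too tight.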
To let Prometheus scrape data from Opera, append the following to the scrape_configs section of /etc/prometheus/prometheus.yml:
  - job_name: opera
    scrape_interval: 30s
    scrape_timeout: 30s
    metrics_path: '/debug/metrics/prometheus'
    static_configs:
      - targets: ['localhost:6060']
sudo systemctl restart prometheus
The job_name needs to be unique, otherwise Prometheus will not start!
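Since a duplicate job_name prevents Prometheus from starting, it can be worth scanning the config before the restart. A minimal sketch, assuming bash with sed, sort and uniq; duplicate_job_names is a hypothetical helper name, and it does a plain text scan rather than real YAML parsing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every job_name that occurs more than once
# in the given Prometheus config file (prints nothing if all are unique).
duplicate_job_names() {
    sed -n -E "s/^[[:space:]]*-?[[:space:]]*job_name:[[:space:]]*['\"]?([^'\"#[:space:]]+).*/\1/p" "$1" \
        | sort | uniq -d
}

# Usage:
#   duplicate_job_names /etc/prometheus/prometheus.yml
# If promtool is installed alongside Prometheus, it validates the whole file:
#   promtool check config /etc/prometheus/prometheus.yml
```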
# /etc/google-cloud-ops-agent/config.yaml
# apply changes: sudo service google-cloud-ops-agent restart
# from https://cloud.google.com/monitoring/agent/ops-agent/prometheus#oagent-config-json-exporter
logging:
  service:
    pipelines:
      default_pipeline:
        receivers: []
metrics:
  receivers:
    prometheus:
      type: prometheus
      config:
        scrape_configs:
          - job_name: 'sonic'
            scrape_interval: 30s
            scrape_timeout: 30s
            metrics_path: /debug/metrics/prometheus
            static_configs:
              - targets: ['localhost:6060']
  service:
    pipelines:
      prometheus_pipeline:
        receivers:
          - prometheus
- p2p_peers - the number of Opera nodes this node is connected to
- rpc_success - the number of successful RPC requests
- rpc_failure - the number of failed RPC requests
- rpc_duration_all - the time consumed by one RPC request (nanoseconds)
- rpc_duration_${Operation}_success_count - the number of successful requests of the given RPC method
- rpc_duration_${Operation}_failure_count - the number of failed requests of the given RPC method
- txpool_slots - current number of used memory slots (each pending or queued tx consumes one or more 32kB slots)
- txpool_pending - current number of pending txs (waiting to be included in the chain)
- txpool_queued - current number of queued txs (nonce out of order, waiting for a previous tx of the account)
- txpool_local - current number of local txs (received from RPC, not from P2P; pending or queued)
- txpool_reheap - time consumed by the Reheap operation
- txpool_valid - total number of valid txs added (multiple txs of one account in one batch are counted as one tx) (???)
- txpool_invalid - total number of invalid txs discarded (invalid signature, underpriced, nonce too low, insufficient balance for value+gas*gasPrice)
- txpool_underpriced - total number of txs removed for being underpriced (when adding to the pool, or when making space for a more valuable one)
- txpool_overflowed - total number of remote txs discarded because no space could be made for them (although they were not underpriced)
- txpool_queued_discard - txs discarded when inserting into the queue (a tx for the same sender+nonce already exists and the price bump is insufficient)
- txpool_pending_discard - txs discarded when inserting into pending (a tx for the same sender+nonce already exists and the price bump is insufficient)
- txpool_queued_replace - txs replaced using a price bump in the queue
- txpool_pending_replace - txs replaced using a price bump in pending
- txpool_queued_ratelimit - txs dropped from the queue because of rate limiting
- txpool_pending_ratelimit - txs dropped from pending because of rate limiting
- txpool_queued_nofunds - txs dropped from the queue because of insufficient sender balance
- txpool_pending_nofunds - txs dropped from pending because of insufficient sender balance
- txpool_queued_eviction - txs dropped from the queue because the account was inactive for too long (lifetime exceeded)
go-opera-norma specific metrics:
- chain_txs_processed - the total number of txs in the chain (for on-chain txs/sec)
- txpool_received - the total number of txs added into the txpool (excluding invalid and already included ones)
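To spot-check one of the metrics listed above without going through Prometheus or Grafana, the raw endpoint output can be filtered directly. This is a rough sketch, assuming bash and awk; metric_value and rpc_failure_percent are hypothetical helper names, and the failure percentage assumes rpc_success and rpc_failure are both running totals since the node started.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative only.

# Print the value of one un-labelled metric from Prometheus
# text-format input on stdin (first matching sample only).
metric_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Print failed RPC requests as a percentage of all RPC requests,
# ASSUMING rpc_success and rpc_failure are cumulative totals.
rpc_failure_percent() {
    awk '$1 == "rpc_success" { ok = $2 }
         $1 == "rpc_failure" { bad = $2 }
         END {
             total = ok + bad
             if (total > 0) printf "%.2f\n", 100 * bad / total
             else print "n/a"
         }'
}

# Usage (needs a running Opera started with --metrics):
#   curl -s http://localhost:6060/debug/metrics/prometheus | metric_value txpool_pending
#   curl -s http://localhost:6060/debug/metrics/prometheus | rpc_failure_percent
```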
Some details are in the Ethereum blog.
Tip: when --pprof is enabled, you can also open http://localhost:6060/debug/pprof/ to browse the currently running goroutines or memory allocations.
Metrics are also available during an events import:
opera --datadir /var/opera/mainnet --pprof --metrics --metrics.expensive import events ./exported-events-file
To export the events first:
opera --datadir /var/opera/mainnet/ export events ./exported-events-file
For Prometheus installed using APT, the data retention time can be configured in its defaults file:
sudo nano /etc/default/prometheus
ARGS="--storage.tsdb.retention.time=365d"
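After editing the file, the setting can be read back before restarting the service. A minimal sketch, assuming bash and sed; retention_setting is a hypothetical helper name, and it only looks at lines that start with ARGS=.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the retention time set via ARGS in the
# given defaults file (prints nothing if the flag is not set there).
retention_setting() {
    sed -n -E '/^ARGS=/ s/.*--storage\.tsdb\.retention\.time=([0-9a-z]+).*/\1/p' "$1"
}

# Usage:
#   retention_setting /etc/default/prometheus
#   sudo systemctl restart prometheus   # the new ARGS only apply after a restart
```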