@thaarok
Last active June 24, 2024 08:28

Prometheus installation

sudo apt install prometheus prometheus-node-exporter
sudo tee /etc/ufw/applications.d/prometheus <<EOF
[Prometheus]
title=Prometheus UI
description=Prometheus monitoring web UI.
ports=9090/tcp
EOF
sudo ufw allow from 85.195.116.121 to any app prometheus

In Grafana: Add data source -> Prometheus -> URL: http://server:9090/

Enable Prometheus metrics in Fantom Opera

Run Fantom Opera with the following parameters:

opera ... --pprof --metrics --metrics.expensive

You can skip --metrics.expensive if you are not interested in StateDB timers or per-method RPC stats.

To list the available metrics and measure how long one scrape takes (which tells you the scrape_timeout you need), run:

time curl http://localhost:6060/debug/metrics/prometheus
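
The endpoint returns Prometheus text format, so its output can be filtered with standard tools. As a minimal sketch (the helper name metric_names is made up for illustration and is not part of Opera or Prometheus), this shell function reduces the output to a sorted list of distinct metric names:

```shell
# List the distinct metric names found in Prometheus text-format output on stdin.
# Comment lines (# HELP / # TYPE) are skipped and label sets such as
# {quantile="0.5"} are stripped from the name.
metric_names() {
    grep -v '^#' | awk 'NF { sub(/[{].*/, "", $1); print $1 }' | sort -u
}

# Example (assumes Opera is running with --metrics and listening on port 6060):
#   curl -s http://localhost:6060/debug/metrics/prometheus | metric_names | grep '^txpool_'
```

The elapsed time reported by the time curl command above is the figure to compare against scrape_timeout; the timeout should comfortably exceed it.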

To let Prometheus scrape Opera, append the following job to the scrape_configs section of /etc/prometheus/prometheus.yml:

  - job_name: opera
    scrape_interval: 30s
    scrape_timeout: 30s
    metrics_path: '/debug/metrics/prometheus'
    static_configs:
      - targets: ['localhost:6060']

Then restart Prometheus to apply the change:

sudo systemctl restart prometheus

The job_name needs to be unique, otherwise Prometheus will not start!
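
A quick way to catch a duplicate before restarting is to scan the config for repeated names. The sketch below is a plain text check, not a YAML parser; the function name duplicate_job_names is hypothetical, and it assumes every job is written as "job_name: <name>" on a single line. If promtool is installed alongside Prometheus, promtool check config /etc/prometheus/prometheus.yml is the more thorough option.

```shell
# Print every job_name that occurs more than once in a Prometheus config file.
# Text-based check only: assumes each job is declared as "job_name: <name>"
# on one line; quotes and trailing comments are ignored.
duplicate_job_names() {
    grep -E '^[[:space:]]*(-[[:space:]]*)?job_name:' "$1" \
        | sed -E 's/^[^:]*:[[:space:]]*//; s/[[:space:]]*(#.*)?$//' \
        | tr -d "\"'" \
        | sort | uniq -d
}

# Example (path from the step above); no output means all names are unique:
#   duplicate_job_names /etc/prometheus/prometheus.yml
```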

Google Cloud Monitoring using Ops Agent

# /etc/google-cloud-ops-agent/config.yaml
# apply changes: sudo service google-cloud-ops-agent restart
# from https://cloud.google.com/monitoring/agent/ops-agent/prometheus#oagent-config-json-exporter
logging:
  service:
    pipelines:
      default_pipeline:
        receivers: []
metrics:
  receivers:
    prometheus:
        type: prometheus
        config:
          scrape_configs:
            - job_name: 'sonic'
              scrape_interval: 30s
              scrape_timeout: 30s
              metrics_path: /debug/metrics/prometheus
              static_configs:
                - targets: ['localhost:6060']
  service:
    pipelines:
      prometheus_pipeline:
        receivers:
          - prometheus

Interesting metrics

  • p2p_peers - number of Opera nodes this node is connected to

  • rpc_success - number of successful RPC requests

  • rpc_failure - number of failed RPC requests

  • rpc_duration_all - time consumed by one RPC request (nanoseconds)

  • rpc_duration_${Operation}_success_count - number of successful requests for the given RPC method

  • rpc_duration_${Operation}_failure_count - number of failed requests for the given RPC method

  • txpool_slots - number of memory slots currently in use (each pending or queued tx occupies one or more 32 kB slots)

  • txpool_pending - current number of pending txs (waiting to be included in the chain)

  • txpool_queued - current number of queued txs (nonce out of order, waiting for an earlier tx from the same account)

  • txpool_local - current number of local txs (received from RPC rather than P2P; pending or queued)

  • txpool_reheap - time consumed by the Reheap operation

  • txpool_valid - total number of valid txs added (multiple txs from one account in one batch are counted as one tx) (???)

  • txpool_invalid - total number of invalid txs discarded (invalid signature, underpriced, nonce too low, insufficient balance for value+gas*gasPrice)

  • txpool_underpriced - total number of txs removed for being underpriced (when being added to the pool, or when making space for a more valuable tx)

  • txpool_overflowed - total number of remote txs discarded because no space could be made for them (although they were not underpriced)

  • txpool_queued_discard - txs discarded when inserting into the queue (a tx for the same sender+nonce already exists and the price bump is insufficient)

  • txpool_pending_discard - txs discarded when inserting into pending (a tx for the same sender+nonce already exists and the price bump is insufficient)

  • txpool_queued_replace - txs in the queue replaced via a price bump

  • txpool_pending_replace - txs in pending replaced via a price bump

  • txpool_queued_ratelimit - txs dropped from the queue because of rate limiting

  • txpool_pending_ratelimit - txs dropped from pending because of rate limiting

  • txpool_queued_nofunds - txs dropped from the queue because of insufficient sender balance

  • txpool_pending_nofunds - txs dropped from pending because of insufficient sender balance

  • txpool_queued_eviction - txs dropped from the queue because the account was inactive for too long (lifetime exceeded)

go-opera-norma specific metrics:

  • chain_txs_processed - total number of txs in the chain (for computing on-chain txs/sec)
  • txpool_received - total number of txs added to the txpool (excluding invalid and already included ones)
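
Counters such as chain_txs_processed only become a rate once two samples are compared. As a rough sketch of that arithmetic outside Prometheus (the helper names metric_value and rate_per_sec are hypothetical, and it assumes the metric appears as a plain "name value" line without labels):

```shell
# Read the value of a label-free metric from Prometheus text-format output on stdin.
metric_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Average per-second rate of a counter between two samples.
# Arguments: first sample, second sample, seconds between them.
rate_per_sec() {
    awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN { printf "%.2f\n", (b - a) / s }'
}

# Example (assumes a go-opera-norma node exposing chain_txs_processed on port 6060):
#   url=http://localhost:6060/debug/metrics/prometheus
#   first=$(curl -s "$url" | metric_value chain_txs_processed)
#   sleep 30
#   second=$(curl -s "$url" | metric_value chain_txs_processed)
#   rate_per_sec "$first" "$second" 30
```

Once the data is in Prometheus, the same figure is normally obtained with a PromQL rate() query over the counter, so this is only useful for a quick check from the command line.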

Some details are covered in the Ethereum blog.

Tip: when --pprof is enabled, you can also open http://localhost:6060/debug/pprof/ to browse the currently running goroutines and memory allocations.

Importing events with metrics

Metrics are also available during an event import:

opera --datadir /var/opera/mainnet --pprof --metrics --metrics.expensive import events ./exported-events-file

To produce the events file, export it first:

opera --datadir /var/opera/mainnet/ export events ./exported-events-file

Prometheus Retention

For Prometheus installed via APT, the retention period can be set in the defaults file:

sudo nano /etc/default/prometheus
ARGS="--storage.tsdb.retention.time=365d"
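
Before choosing a long retention it can help to estimate the disk space it implies. The Prometheus storage documentation gives the rule of thumb needed_disk_space = retention_time_seconds * ingested_samples_per_second * bytes_per_sample, with roughly 1 to 2 bytes per sample. A sketch of that arithmetic follows; the function name retention_disk_mib and the series count in the example are assumptions chosen for illustration, not measurements from an Opera node.

```shell
# Rough disk estimate in MiB:
#   retention_seconds * samples_per_second * bytes_per_sample
# Arguments: retention in days, series per scrape, scrape interval in seconds,
#            bytes per sample (about 1 to 2 according to the Prometheus docs).
retention_disk_mib() {
    awk -v d="$1" -v n="$2" -v i="$3" -v b="$4" \
        'BEGIN { printf "%.0f\n", d * 86400 * (n / i) * b / (1024 * 1024) }'
}

# Example: 365 days, 3000 series (hypothetical count), 30s interval, 2 bytes per sample
#   retention_disk_mib 365 3000 30 2
```

The series count can be approximated by counting the non-comment lines returned by the metrics endpoint.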