Skip to content

Instantly share code, notes, and snippets.

@scmmishra
Created March 5, 2024 06:53
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save scmmishra/fec0177fa6047367204afaf66bfb8da7 to your computer and use it in GitHub Desktop.
Save scmmishra/fec0177fa6047367204afaf66bfb8da7 to your computer and use it in GitHub Desktop.
Queries I run to check for DB health

Cache Efficiency Ratio

SELECT relname, heap_blks_read, heap_blks_hit, 
       (heap_blks_hit / NULLIF(heap_blks_hit + heap_blks_read, 0)) AS hit_ratio 
FROM pg_statio_user_tables 
ORDER BY hit_ratio ASC, heap_blks_read DESC 
LIMIT 10;

-- hit_ratio: Represents the proportion of times data requested from a table was 
-- found in the cache (memory) as opposed to being read from the disk. 
-- A high ratio indicates good cache performance, meaning most data
-- requests are satisfied from memory, which is faster than disk reads.

-- heap_blks_read and heap_blks_hit: These columns provide raw numbers of 
-- block reads from disk (heap_blks_read) and block hits in cache (heap_blks_hit). 
-- Tables with high heap_blks_read and low heap_blks_hit are candidates for optimization.

This query identifies tables with low cache hit ratios, indicating frequent disk reads over cache (memory) reads. Tables with low hit ratios suggest performance issues, as accessing data from disk is slower than from cache. This situation often arises when tables are too large for the available cache, or the workload involves diverse data access, leading to insufficient cache utilization.

To address this, consider increasing cache memory, optimizing query patterns, improving indexing strategies, or employing data management techniques like partitioning. These steps aim to enhance cache performance, reduce disk reads, and overall improve database efficiency.


High Data Volume Queries

SELECT query, rows, calls, rows / calls AS rows_per_call
FROM pg_stat_statements
WHERE calls > 100 AND query NOT LIKE '-- Metabase%'
ORDER BY rows_per_call DESC
LIMIT 10;

Identifies frequently executed queries returning many rows per call. Useful for pinpointing performance-impacting queries needing optimization like selectivity refinement or pagination.


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment