Notes from Stefan Frost's PostgreSQL performance talk

Identify Slow Queries

See talk

  • Set application_name per connection (useful when more than one app connects to the DB); see the example below
  • Native (declarative) partitioning is better than pg_partman
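
For example, application_name can be set per connection (the name here is illustrative):

    SET application_name = 'billing-worker';
    -- or directly in the libpq connection string:
    -- postgresql://user@host/db?application_name=billing-worker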

Logging

=> The more you log, the better you can locate performance bottlenecks. But the more you log, the more you slow down your system.
=> pgBadger (analyze the logs after the fact), pg_stat_statements (aggregated per-query statistics inside the database)
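
With pg_stat_statements loaded (shared_preload_libraries = 'pg_stat_statements' plus CREATE EXTENSION pg_stat_statements), a quick sketch for finding the worst offenders; note the column is total_exec_time on PostgreSQL 13+ and total_time before that:

    -- Top 10 queries by cumulative execution time
    SELECT query, calls, total_exec_time AS total_ms
    FROM pg_stat_statements
    ORDER BY total_exec_time DESC
    LIMIT 10;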

Settings (a reload is enough, no restart needed):

  • log_min_duration_statement = {x: time in ms} => log every query that runs longer than x ms (prefer this over log_statement + log_duration)

  • log_line_prefix => add context (timestamp, pid, user, database, application_name) to every log line

  • log_lock_waits => log sessions that wait longer than deadlock_timeout for a lock

  • log_temp_files => log queries that spill to temporary files on disk

  • log_autovacuum_min_duration

  • Use pgBadger to analyze the logs
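
A hedged postgresql.conf sketch combining the logging settings above (the thresholds are illustrative; tune to your workload):

    log_min_duration_statement = 250            # log every query slower than 250 ms
    log_line_prefix = '%m [%p] %u@%d app=%a '   # timestamp, pid, user, db, application_name
    log_lock_waits = on                         # log waits longer than deadlock_timeout
    log_temp_files = 0                          # log every temp-file spill, regardless of size
    log_autovacuum_min_duration = 250           # log autovacuum runs over 250 ms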

Configuration

  • work_mem

  • maintenance_work_mem

  • effective_cache_size

  • shared_buffers

  • checkpoint_segments (removed in PostgreSQL 9.5; replaced by min/max_wal_size)

  • {min,max}_wal_size

  • checkpoint_timeout

  • checkpoint_completion_target

  • auto_explain

    • Logs the EXPLAIN plan for any query that runs longer than a threshold
    shared_preload_libraries = 'auto_explain'
    auto_explain.log_min_duration = 50   # ms
    auto_explain.log_nested_statements = true
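
A hedged postgresql.conf sketch for the memory and checkpoint settings above (values are illustrative, e.g. for a dedicated server with 32 GB RAM):

    shared_buffers = 8GB              # ~25% of RAM is a common starting point
    effective_cache_size = 24GB       # planner hint only, allocates nothing (~75% of RAM)
    work_mem = 64MB                   # per sort/hash node and per query; multiply by concurrency
    maintenance_work_mem = 1GB        # used by VACUUM, CREATE INDEX, ...
    min_wal_size = 1GB
    max_wal_size = 4GB                # replaces checkpoint_segments (PostgreSQL 9.5+)
    checkpoint_timeout = 15min
    checkpoint_completion_target = 0.9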
    

Note: EXPLAIN ANALYZE gives the same information as the auto_explain settings above, but running it on every query is too expensive; use it for targeted investigation. Paste plans into explain.depesz.com to visualize them.
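
A minimal sketch of running it by hand (table and value are illustrative):

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM orders WHERE customer_id = 42;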

Vacuum

  • Marks dead rows as reusable space (a kind of garbage collector)
  • Bloat: accumulated dead tuples. Indexes bloat too; see the queries below
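
A hedged sketch for spotting bloat candidates via the statistics views (table name illustrative):

    -- Tables with the most dead tuples
    SELECT relname, n_live_tup, n_dead_tup
    FROM pg_stat_user_tables
    ORDER BY n_dead_tup DESC
    LIMIT 10;

    -- Reclaim the space for reuse (VERBOSE reports what was done)
    VACUUM (VERBOSE) orders;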

Retrieving Data

  • Seq. Scan
  • Bulk Operation
  • Bitmap Scan

=> Use indexes as much as possible. Index-only scan: the result is read from the index alone, without touching the table heap (requires an up-to-date visibility map).
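
A sketch of a covering index that can enable an index-only scan (INCLUDE needs PostgreSQL 11+; names are illustrative):

    CREATE INDEX idx_orders_customer ON orders (customer_id) INCLUDE (total);
    EXPLAIN SELECT customer_id, total FROM orders WHERE customer_id = 42;
    -- look for "Index Only Scan"; it also needs a reasonably fresh visibility map (run VACUUM)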

Joins

  • Nested Loop

  • Merge Join

    • Order each table
    • Walk through both, return matches
    • Good for bulk
    • Sorting is expensive
  • Hash Join:

    • Scan One table and build a hash table
    • Step through other table using hash table
    • Slow start
    • Very fast but memory intensive
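
To see which join strategy the planner picks, the enable_* planner toggles can be flipped per session for experimentation (never in production config); a sketch with illustrative tables:

    EXPLAIN SELECT * FROM orders o JOIN customers c ON c.id = o.customer_id;
    SET enable_hashjoin = off;   -- steer the planner away from a hash join
    EXPLAIN SELECT * FROM orders o JOIN customers c ON c.id = o.customer_id;
    RESET enable_hashjoin;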

Aggregates

  • Group Agg
    • Order / Sort Input
    • Step through each record; if it matches, combine
    • Expensive Sort
  • Hash Agg
    • Scan Table, building Hash tbl
    • Hash table matches are combined
    • Memory intensive
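
Which aggregate strategy wins depends largely on work_mem and input ordering; a session-level sketch (table illustrative):

    SET work_mem = '64kB';   -- tiny budget: the planner tends toward a sort-based GroupAggregate
    EXPLAIN SELECT customer_id, count(*) FROM orders GROUP BY customer_id;
    RESET work_mem;          -- with more memory, HashAggregate usually wins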

Query Plan

Database gathers & uses statistics

  • ANALYZE
  • VACUUM ANALYZE
  • pg_statistic (exposed via the pg_stats view)
  • Autovacuum
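
A minimal sketch of refreshing and inspecting the statistics (table name illustrative):

    ANALYZE orders;
    SELECT attname, n_distinct, null_frac
    FROM pg_stats
    WHERE tablename = 'orders';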

Solution

  • Indexes
    • PostgreSQL index types & where to use them (B-tree, GIN, GiST, BRIN, hash, ...)
  • Increase work_mem (can also be set per session or per query; see the sketches after this list)
  • ANALYZE tables (to refresh the planner statistics)
  • Prepared Queries
    • Avoids repeated planning cost
    • First-five-executions rule: the first five runs are planned individually; if a generic plan performs as well, it is reused from then on
  • Don't use SELECT * (especially if you have TOASTed columns)
  • Prefer JOINs.
  • Avoid NOT EXISTS if possible
  • Use CTEs (Common Table Expressions): if the CTE part is expensive to compute but small in size, it is absolutely worth it :)
  • For a really fast approximate count(*), read the planner statistics (pg_class.reltuples) instead of counting; it will not be an exact count
  • Use partial (conditional) indexes that exclude NULL or irrelevant rows
  • Make sure your query-plan uses your index
  • Drop unused indexes
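
A few hedged SQL sketches for the tips above (all table, index, and statement names are illustrative):

    -- Per-transaction work_mem bump; reverts automatically at COMMIT
    BEGIN;
    SET LOCAL work_mem = '256MB';
    -- ... run the expensive sort/hash query here ...
    COMMIT;

    -- Prepared statement: plan once, execute many times
    PREPARE orders_by_customer (int) AS
      SELECT id, total FROM orders WHERE customer_id = $1;
    EXECUTE orders_by_customer(42);

    -- Fast approximate count from planner statistics
    SELECT reltuples::bigint AS approx_rows FROM pg_class WHERE relname = 'orders';

    -- Partial index that skips rows most queries never touch
    CREATE INDEX idx_orders_open ON orders (customer_id) WHERE status <> 'done';

    -- Candidate unused indexes (check all replicas before dropping!)
    SELECT indexrelname, idx_scan FROM pg_stat_user_indexes WHERE idx_scan = 0;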