See talk
- use application_name in qs (if more than one app connects to db)
- Native Partitioning is better than pg_partman
=> The more you log, the better you locate performance battlenecks. Also the mor you log, the more you slow down your system.
=> pgbadger, pg_stat_statements (what is currently happening)
Settings (reload-only):
-
log_min_duration_system = {x: time in ms} => log every query longer than x (don't use log_statement, log_duration)
-
log_line_prefix:
-
log_locks_wait
-
log_temp_files => log every disk-based process
-
log_autovacuum_min_duration
-
Use pgbadger (to analyze logs)
-
work_mem
-
maintainance_work_mem
-
effective_cache_size
-
shared_buffers
-
CHECKPOINT_SEGMENTS
-
{min,max}_wal_size
-
checkpoint_timeout
-
chekpoint_completetion_TARGET
-
auto_explain
- Logs explain for query based on legth of time
shared_preload_libraries = 'auto_explain' explain.log_min_duration = 50; # ms explain.log_nested_statements = true
Note: explain analyze
is also same with the settings above but too expensive.
explain.depesz.com
- Marks records as reusable (kind of garbage collector)
- Bloat: dead tuples. Also for indexes too
- Seq. Scan
- Bulk Operation
- Bitmap Scan
=> Use index as much as possible Index-only scan: ?
-
Nested Loop
-
Merge Join
- Order each table
- Walk trhough both, return matches
- Good for bulk
- Sorting is expensive
-
Hash Join:
- Scan One table and build a hash table
- Step through other table using hash table
- Slow start
- Very fast but memory intensive
- Group Agg
- Order / Sort Input
- Step through eac record, if matches, combine
- Expensive Sort
- Hash Agg
- Scan Table, building Hash tbl
- Hash tablematches are combined
- Memory instensive
Database gathers & uses statistics
- ANALYZE
- VACUUM ANALYZE
- pg_statistics
- Autovacuum
- Indexes
- Postgresql Index Types & where to use
- Increase work_mem (can be configured on query)
- ANALYZE tables (to fix the stats & indexes)
- Prepared Queries
- Avoids repeated planning cost
- First five times rule: if the first 5 time returns the same plan, continue with it
- Don't use select (*) (especially if you have TOASTED columns)
- Prefer JOINS.
- Don't use
NOT EXISTS
if possible - Use CTE (Common Table Expression): If CTE part is expensive to pull but will be small in size absolutely use it :)
- For really fast count(*) use pg_statistics (will not be exact count). pg_class.reltuples
- Use partial (conditional) indexes that eliminate null or non-related
- Make sure your query-plan uses your index
- Drop un-used indexes