just3ws/explain_explained_notes.md

## explain_explained_notes.md

      
    Raw
  

              explain_explained_notes.md
            
          
    Postgres EXPLAIN Lunch & Learn @ BenchPrep

EXPLAIN Explained video on YouTube

What EXPLAIN doesn't do


Tell you why a particular index isn't used
Explain how to rewrite your queries
Show what other factors make the DB slow
Tell you how much time the request took outside the DB

The query planner


Break the query down into atomic "nodes"
Figure out every possible way to execute each "node" and it's cost
Chain the combinations together into plans
Calculate the total "cost" of each plan
Pick the plan with the lowest "cost"

Nodes

(Rowset + Parameters) =(Node Operation)=> Rowset
Sort node

(Scan rows + Sort columns) =(Sort)=> Sorted rows
$cost$

A figure in completely arbitrary cost units (float) which is intended to
represent estimated time and system resources required to execute the query
based on dubious theor and proven practice.
(some people adjust seq_page_cost)
EXPLAIN vs EXPLAIN ANALYZE


EXPLAIN shows what the planner planned to do.
EXPLAIN ANALYZE shows the plan, executes the query, then shows how it did.
Most of the time you want to run EXPLAIN ANALYZE unless:

The query never finishes.
You're just checking index usage.
Your just checking for partition scans.


Other EXPLAIN options


buffers
costs & timing
format (yaml, xml, json)
verbose

explain (analyze on, verbose on, format yaml)
NOTE

The explain output is an inverted tree. Read from inside to outside.
Node Types

Scans

Most will contain a filter condition
Seq Scan

Full-table scan
Index Scan


Scan index
Look up tuples in table

Index-Only Scan


Scan index
Ignore Heap

Bitmap Scan

Bitmap Heap Scan, Bitmap Index Scan

Scan index
Build bitmap of compliant rows

Joins

NestLoop


For each row in Set A:

Look up rows in Set B


# Essentially...
for each row in input_a
  for each row in input_b
    do something
  next
next
# Ref: https://wiki.postgresql.org/wiki/Introduction_to_VACUUM,_ANALYZE,_EXPLAIN,_and_COUNT

Hash Join


Make a Hash Table of Set A:

Match groups of rows in Set B


Merge Join


Sort both Set A and Set B

The "zip" them together


Lateral Join


Join against table expressions

Semi-Join


Partial join, for outer join

Anti-Join


Exclusion, such as for NOT IN()

Aggregates

GroupAggregate


Regular aggregation

HashAggregate


In-memory aggregate using hash tables

WindowAgg


Used for windowing clause expressions

Other operations

Unique


De-deduplicate rows, like for DISTINCT or UNION

Sort


Sort

Limit


Supports LIMIT/OFFSET

CTEScan


Joins a CTE (WITH clause) to the main query

SubqueryScan


Same as CTEScan but for subqueries

Materialize


Create a recordset in-memory from a query fragment

Append


Merge rowsets for UNION ALL and partitions

Hints when reading EXPLAIN


Seq Scan usually can mean an index is needed on the scanned columns
Index Scan usually can mean there was a bad row estimate (?)
Nested Loops look at time consumption, rows, loop count
Index Scan look at time consumption, rows, loop count

Tweaking


Run EXPLAIN
Toggle options based on initial output
GOTO 1

Automatically check


auto_explain


F.3. auto_explain
The auto_explain module provides a means for logging execution plans of slow
statements automatically, without having to run EXPLAIN by hand. This is
especially helpful for tracking down un-optimized queries in large
applications.
The module provides no SQL-accessible functions. To use it, simply load it
into the server. You can load it into an individual session:
LOAD 'auto_explain'; (You must be superuser to do that.) More typical usage
is to preload it into some or all sessions by including auto_explain in
session_preload_libraries or shared_preload_libraries in postgresql.conf.
Then you can track unexpectedly slow queries no matter when they happen. Of
course there is a price in overhead for that.

Reading


Using EXPLAIN
EXPLAIN docs
Reading PgAdmin Graphical EXPLAIN Plans
Introduction to VACUUM, ANALYZE, EXPLAIN, and COUNT
Reading a Postgres EXPLAIN ANALYZE Query Plan
Advanced Postgres Performance Tips
Explaining Your PostgreSQL Data
Understanding Postgres Performance
More on Postgres Performance
Speed up with Materialized Views on PostgreSQL and Rails

Other tools

User.where(:id => 1).joins(:posts).explain

config.active_record.auto_explain_threshold_in_seconds = 0.5

Bullet

Help to kill N+1 queries and unused eager loading.
pghero

A performance dashboard for Postgres
sqlint

pg_query

Parses SQL queries using a copy of the PostgreSQL server query parser