- Don’t SELECT *. Specify explicit column names (columnar store)
- Avoid large JOINs (filter each table first)
- In Presto, tables are joined in the order they are listed!
- Join small tables earlier in the plan and leave larger fact tables to the end
- Avoid cross joins or one-to-many joins, as these can degrade performance
- ORDER BY and GROUP BY take time
- Only use ORDER BY in subqueries if it is really necessary
- When using GROUP BY, order the columns from the highest cardinality (that is, the largest number of unique values) to the lowest.
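
A rough sketch of how several of the tips above might look together in one query. The tables (`orders`, `customers`) and every column name are hypothetical, made up for illustration, and the join order simply follows the note above; whether it is the best order may depend on the Presto version and optimizer settings.

```sql
-- Hypothetical schema: customers is a small dimension table,
-- orders is a large fact table.
WITH recent_orders AS (
    -- Filter the fact table before joining, and name only the columns needed
    -- instead of SELECT *.
    SELECT customer_id, order_total
    FROM orders
    WHERE order_date >= DATE '2024-01-01'
),
active_customers AS (
    -- Filter the dimension table too, so the join sees fewer rows.
    SELECT customer_id, country
    FROM customers
    WHERE is_active = true
)
SELECT
    o.customer_id,
    c.country,
    SUM(o.order_total) AS total_spent
FROM active_customers c        -- smaller table listed first, as the note suggests
JOIN recent_orders o
    ON o.customer_id = c.customer_id
GROUP BY
    o.customer_id,             -- assumed higher cardinality, so listed first
    c.country;                 -- assumed lower cardinality, so listed last
```

There is no ORDER BY inside the subqueries, since the outer aggregation does not depend on row order.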
FWIW: I (@rondy) am not the creator of the content shared here, which is an excerpt from Edmond Lau's book. I simply copied and pasted it from another location and saved it as a personal note, before it gained popularity on news.ycombinator.com. Unfortunately, I cannot recall the exact origin of the original source, nor was I able to find the author's name, so I can't provide the appropriate credits.
- By Edmond Lau
- Highly Recommended 👍
- http://www.theeffectiveengineer.com/