Skip to content

Instantly share code, notes, and snippets.

View aaneja's full-sized avatar

Anant Aneja aaneja

  • Ahana
  • New Delhi
  • 14:13 (UTC +05:30)
View GitHub Profile
We can make this file beautiful and searchable if this error is corrected: It looks like row 5 should actually have 9 columns, instead of 6. in line 4.
"dt","query_id","query","schema_name","catalog_name","environment","query_wall_time_ms","total_split_cpu_time_ms","splits"
"2024-03-19 03:53:16","20240319_035316_00019_bcy7w","explain analyze SELECT CAST(ss_quantity AS decimal(10,0)) FROM hive.tpcds_sf1000_parquet_varchar.store_sales
",tpcds_sf1000_parquet_varchar,hive,enganantb1022n,24113,153564,1066
"2024-03-19 03:53:42","20240319_035342_00020_bcy7w","explain analyze SELECT CAST(ss_quantity AS decimal(10,0)) FROM hive.tpcds_sf1000_parquet_varchar.store_sales
",tpcds_sf1000_parquet_varchar,hive,enganantb1022n,21210,159054,1066
"2024-03-19 03:54:05","20240319_035405_00021_bcy7w","explain analyze SELECT CAST(ss_quantity AS decimal(10,0)) FROM hive.tpcds_sf1000_parquet_varchar.store_sales
",tpcds_sf1000_parquet_varchar,hive,enganantb1022n,21156,162602,1065
"2024-03-19 03:54:28","20240319_035428_00022_bcy7w","explain analyze SELECT CAST(ss_quantity AS decimal(10,0)) FROM hive.tpcds_sf1000_parquet_varchar.store_sales
",tpcds_sf1000_parquet_varchar,hive,enganantb1

Cost based evaluation of predicates

Queries are run against a Parquet TPCH SF 1 schema on Presto 0.286 on a single node cluster

Logical plan comparison of equivalent queries

Q1

presto:tpch_sf1_parquet> explain select l_quantity from lineitem l, orders o where l_orderkey = o_orderkey and o_totalprice < 1000 and l_quantity  < 48;
Query Plan
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
{
"id" : "8",
@aaneja
aaneja / plan_log_with_filter_pushdown.log
Last active September 25, 2023 09:14
TPCDS Q9 with and without connector filter pushdown
2023-09-25T14:32:22.405+0530 INFO Query-20230925_090220_00000_asxxc-1169 com.facebook.presto.sql.planner.optimizations.LogPlanTreeOptimizer ------ Before ApplyConnectorOptimization
2023-09-25T14:32:22.409+0530 INFO Query-20230925_090220_00000_asxxc-1169 com.facebook.presto.sql.planner.optimizations.LogPlanTreeOptimizer Plan :
project, Assignments ({expr_449=SWITCH(true, WHEN(GREATER_THAN(count, 74129), avg), avg_64), expr_450=SWITCH(true, WHEN(GREATER_THAN(count_95, 122840), avg_127), avg_159), expr_451=SWITCH(true, WHEN(GREATER_THAN(count_190, 56580), avg_222), avg_254), expr_452=SWITCH(true, WHEN(GREATER_THAN(count_285, 10097), avg_317), avg_349), expr_453=SWITCH(true, WHEN(GREATER_THAN(count_380, 165306), avg_412), avg_444)}), Outputs ([expr_449, expr_450, expr_451, expr_452, expr_453]):
join (INNER), Equi-join condition([]), Filter (Optional.empty), Outputs([r_reason_sk, count, avg, avg_64, count_95, avg_127, avg_159, count_190, avg_222, avg_254, count_285, avg_317, avg_349, count_380, avg_412, avg_44
@aaneja
aaneja / results.txt
Created August 19, 2022 18:19
net.agkn.hll.HLL v. com.facebook.airlift.stats.cardinality.HyperLogLog
net.agkn.hll.HLL cardinality estimate : 100000
com.facebook.airlift.stats.cardinality.HyperLogLog cardinality estimate : 100426
@aaneja
aaneja / show_stats_all.sql
Last active August 15, 2022 18:48
Presto PR 18188 : SF100 Stats from TPCH connector v. Stats from materialized table
show stats for customer ;
show stats for lineitem ;
show stats for nation ;
show stats for orders ;
show stats for part ;
show stats for partsupp ;
show stats for region ;
show stats for supplier;
@aaneja
aaneja / run-java
Created February 22, 2019 22:38 — forked from PetrGlad/run-java
Enable java remote debugger (JDWP, openjdk)
java -agentlib:jdwp=transport=dt_socket,address=8000,server=y,suspend=n ClassName