@1ambda
Created December 25, 2021 04:05
== Physical Plan ==
AdaptiveSparkPlan (6)
+- == Final Plan ==
   * HashAggregate (5)
   +- ShuffleQueryStage (4)
      +- Exchange (3)
         +- * HashAggregate (2)
            +- InMemoryTableScan (1)
               +- InMemoryRelation (2)
                  +- * Filter (4)
                     +- InMemoryTableScan (3)
                        +- InMemoryRelation (4)
                           +- * ColumnarToRow (6)
                              +- Scan parquet (5)
+- == Initial Plan ==
   HashAggregate (unknown)
   +- Exchange (unknown)
      +- HashAggregate (unknown)
         +- InMemoryTableScan (1)
            +- InMemoryRelation (2)
               +- * Filter (4)
                  +- InMemoryTableScan (3)
                     +- InMemoryRelation (4)
                        +- * ColumnarToRow (6)
                           +- Scan parquet (5)

(1) InMemoryTableScan
Output: []

(2) InMemoryRelation
Arguments: [listing_id#10, listing_name#12], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@df15d03,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Filter (isnotnull(listing_id#10) AND (listing_id#10 >= 10000000))
+- InMemoryTableScan [listing_id#10, listing_name#12], [isnotnull(listing_id#10), (listing_id#10 >= 10000000)]
+- InMemoryRelation [listing_id#10, listing_url#11, listing_name#12, listing_summary#13, listing_desc#14], StorageLevel(disk, memory, deserialized, 1 replicas)
+- *(1) ColumnarToRow
+- FileScan parquet [listing_id#10,listing_url#11,listing_name#12,listing_summary#13,listing_desc#14] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/home/1ambda/airbnb_listings_parquet], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<listing_id:int,listing_url:string,listing_name:string,listing_summary:string,listing_desc:...
,None)

(3) InMemoryTableScan
Output [2]: [listing_id#10, listing_name#12]
Arguments: [listing_id#10, listing_name#12], [isnotnull(listing_id#10), (listing_id#10 >= 10000000)]

(4) InMemoryRelation
Arguments: [listing_id#10, listing_url#11, listing_name#12, listing_summary#13, listing_desc#14], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@df15d03,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) ColumnarToRow
+- FileScan parquet [listing_id#10,listing_url#11,listing_name#12,listing_summary#13,listing_desc#14] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/home/1ambda/airbnb_listings_parquet], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<listing_id:int,listing_url:string,listing_name:string,listing_summary:string,listing_desc:...
,None)