@aialenti
Created September 17, 2020 23:46
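from pyspark.sql import SparkSession

# Reuse the active session if one exists (e.g. in a notebook or the pyspark shell)
spark = SparkSession.builder.getOrCreate()

# Read the source table in Parquet format
sales_table = spark.read.parquet("./data/sales_parquet")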
# Equivalent SQL, kept for reference only (this string is never executed)
'''
SELECT order_id AS the_order_id,
       seller_id AS the_seller_id,
       num_pieces_sold AS the_number_of_pieces_sold
FROM sales_table
'''
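# Build the execution plan (lazy: nothing is computed yet)
sales_table_execution_plan = sales_table.select(
    sales_table["order_id"].alias("the_order_id"),
    sales_table["seller_id"].alias("the_seller_id"),
    sales_table["num_pieces_sold"].alias("the_number_of_pieces_sold")
)
# show() is the action that triggers execution; it prints the rows and returns None,
# so it is called separately to keep the DataFrame in the variable
sales_table_execution_plan.show(5, True)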