@aialenti
Last active September 20, 2020 14:06
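from pyspark.sql import SparkSession
from pyspark.sql.functions import col

# Reuse the active Spark session, or create one if none exists
spark = SparkSession.builder.getOrCreate()

# Read the source table in Parquet format
sales_table = spark.read.parquet("./data/sales_parquet")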
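# Equivalent Spark SQL query, kept for reference (this string is not executed)
'''
SELECT *
FROM sales_table
WHERE bill_raw_text RLIKE '(ab[cd]{2,4})|(aa[abcde]{1,2})'
'''
# Keep only the rows whose bill_raw_text contains a match for the regex
sales_table_execution_plan = sales_table.where(
    col('bill_raw_text').rlike("(ab[cd]{2,4})|(aa[abcde]{1,2})")
)
sales_table_execution_plan.show()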
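
To see which strings the filter above keeps, the same pattern can be tried locally without Spark. This is a minimal sketch using Python's `re` module as a stand-in: `rlike` uses Java regular expressions and matches when the pattern is found anywhere in the string, which for this simple pattern behaves like `re.search`. The helper `matches` and the sample strings are hypothetical and not taken from the sales data.

```python
import re

# Same pattern as in the rlike() filter above
PATTERN = re.compile(r"(ab[cd]{2,4})|(aa[abcde]{1,2})")


def matches(text: str) -> bool:
    # rlike succeeds on a partial match, so use search() rather than fullmatch()
    return PATTERN.search(text) is not None


# Hypothetical sample values, chosen to exercise both alternatives
samples = ["xxabcdxx", "aab", "abc", "abx", "aaz"]
for s in samples:
    print(s, matches(s))
```

Only the first two samples match: `abcd` satisfies `ab[cd]{2,4}` and `aab` satisfies `aa[abcde]{1,2}`, while `abc` has just one character from `[cd]` after `ab`.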