@dmateusp
Last active June 5, 2019 19:16
DataFrame.transform - Spark Function Composition - Functions pre refactor
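import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{col, regexp_extract, sum}

// Sums the "amount" column, grouped by the given columns.
def sumAmounts(df: DataFrame, by: Column*): DataFrame =
  df.groupBy(by: _*).agg(sum(col("amount")))

// Adds "<columnName>_payer" and "<columnName>_beneficiary" columns, each holding
// the single capital letter captured after "paid by " and "to " respectively.
// regexp_extract yields an empty string when the pattern does not match.
def extractPayerBeneficiary(columnName: String, df: DataFrame): DataFrame =
  df.withColumn(
    s"${columnName}_payer",
    regexp_extract(
      col(columnName),
      "paid by ([A-Z])",
      1
    )
  ).withColumn(
    s"${columnName}_beneficiary",
    regexp_extract(
      col(columnName),
      "to ([A-Z])",
      1
    )
  )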
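
A minimal sketch of how the two pre-refactor functions might be called together. The sample rows, the `details` column name, the `PreRefactorUsage` object, and the local `SparkSession` settings are all assumptions made for illustration; they do not come from the gist. Because each function takes the `DataFrame` as an ordinary argument (and in a different position), the calls have to be nested inside-out, which is the shape a later refactor towards `DataFrame.transform` would flatten.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, regexp_extract, sum}

// Hypothetical wrapper object, used only to make the sketch runnable.
object PreRefactorUsage {

  // Same definitions as in the gist, repeated so the sketch is self-contained.
  def sumAmounts(df: DataFrame, by: Column*): DataFrame =
    df.groupBy(by: _*).agg(sum(col("amount")))

  def extractPayerBeneficiary(columnName: String, df: DataFrame): DataFrame =
    df.withColumn(
      s"${columnName}_payer",
      regexp_extract(col(columnName), "paid by ([A-Z])", 1)
    ).withColumn(
      s"${columnName}_beneficiary",
      regexp_extract(col(columnName), "to ([A-Z])", 1)
    )

  def main(args: Array[String]): Unit = {
    // Assumed local session; adjust master/appName for a real cluster.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pre-refactor-usage")
      .getOrCreate()
    import spark.implicits._

    // Assumed sample data: a free-text column plus an amount.
    val payments = Seq(
      ("paid by A to B", 10.0),
      ("paid by A to B", 5.0),
      ("paid by C to A", 7.5)
    ).toDF("details", "amount")

    // Nested, inside-out call: extract first, then aggregate by the new columns.
    val totals = sumAmounts(
      extractPayerBeneficiary("details", payments),
      col("details_payer"),
      col("details_beneficiary")
    )

    totals.show()
    spark.stop()
  }
}
```

Reading the pipeline requires starting at the innermost call and working outwards, and the differing position of the `df` parameter in the two signatures means the functions cannot be chained uniformly as written.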