@1ambda
Created December 21, 2021 15:37
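from pyspark.sql import Window
from pyspark.sql.functions import col, desc, lit, rank, sum  # note: this sum shadows Python's built-in sum

# Assumes `df` is an existing DataFrame with brand, category_code, and price columns.
# Total price per (brand, category_code), ignoring rows where either key is null.
dfCalculated = df\
    .select(
        col("brand"),
        col("category_code"),
        col("price"),
    )\
    .where(col("brand").isNotNull() & col("category_code").isNotNull())\
    .groupBy("brand", "category_code")\
    .agg(sum("price").alias("price_sum"))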
# Rank the categories within each brand by total price, highest first.
dfRanked = dfCalculated\
    .select(
        col("brand"),
        col("category_code"),
        col("price_sum"),
        rank().over(Window.partitionBy(col("brand")).orderBy(desc("price_sum"))).alias("rank")
    )
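# Keep each brand's top category (ties share rank 1) and show the results by total price, descending.
dfRanked\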
    .where(col("rank") == lit(1))\
    .orderBy(desc("price_sum"))\
    .show(truncate=False)
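For readers without a Spark session at hand, the same logic can be sketched in plain Python. This is only an illustration of what the pipeline above computes, not part of the original gist: the sample rows and the `top_categories` helper are hypothetical, and prices are assumed to be non-null.

```python
from collections import defaultdict

# Hypothetical sample rows (brand, category_code, price); not from the original data.
rows = [
    ("acme", "phones", 100.0),
    ("acme", "phones", 50.0),
    ("acme", "laptops", 120.0),
    ("zeta", "tv", 300.0),
    (None, "tv", 10.0),       # dropped: brand is null
    ("zeta", None, 999.0),    # dropped: category_code is null
]

def top_categories(rows):
    # Step 1: sum price per (brand, category_code), skipping null keys.
    sums = defaultdict(float)
    for brand, category, price in rows:
        if brand is not None and category is not None:
            sums[(brand, category)] += price
    # Step 2: find the largest total within each brand.
    best = {}
    for (brand, _), total in sums.items():
        best[brand] = max(best.get(brand, total), total)
    # Step 3: keep every category tied for first place (like rank() == 1),
    # ordered by total price, descending.
    result = [(b, c, t) for (b, c), t in sums.items() if t == best[b]]
    return sorted(result, key=lambda r: r[2], reverse=True)

print(top_categories(rows))
```

As with `rank()` in the Spark version, a brand whose top categories tie on total price yields one row per tied category.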