@1ambda
Created December 21, 2021 15:37
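# Assumes an active SparkSession named `spark` and a table or temp view named PURCHASE.
spark.sql("""
-- Sum of price per (brand, category_code), skipping rows where either key is NULL
WITH CALCULATED AS (
    SELECT
        brand,
        category_code,
        sum(price) AS price_sum

    FROM PURCHASE

    WHERE
        brand IS NOT NULL
        AND category_code IS NOT NULL

    GROUP BY brand, category_code
),
-- Rank each brand's categories by price_sum, highest first
RANKED AS (
    SELECT
        brand,
        category_code,
        price_sum,
        rank() OVER (PARTITION BY brand ORDER BY price_sum DESC) AS rank

    FROM CALCULATED
)
-- Keep the top category per brand (ties share rank 1, so a brand can appear more than once)
SELECT *
FROM RANKED
WHERE rank = 1
ORDER BY price_sum DESC
""").show(truncate=False)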
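To make the query's semantics easier to check by hand, here is a minimal pure-Python sketch of the same three steps (aggregate, rank within brand, keep rank 1). It does not use Spark. The sample rows, the `purchases` list, and the `top_categories` helper are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: (brand, category_code, price). None models SQL NULL.
purchases = [
    ("acme", "phones", 100.0),
    ("acme", "phones", 50.0),
    ("acme", "laptops", 120.0),
    ("zeta", "audio", 30.0),
    ("zeta", "tv", 30.0),
    (None, "tv", 999.0),
    ("zeta", None, 999.0),
]


def top_categories(rows):
    # CALCULATED: drop rows with a NULL key, then sum price per (brand, category_code).
    sums = defaultdict(float)
    for brand, category, price in rows:
        if brand is not None and category is not None:
            sums[(brand, category)] += price

    # RANKED + "WHERE rank = 1": rank() gives tied rows the same rank,
    # so every category matching the brand's maximum sum is kept.
    best = {}
    for (brand, _), total in sums.items():
        best[brand] = max(best.get(brand, total), total)
    result = [(b, c, t) for (b, c), t in sums.items() if t == best[b]]

    # ORDER BY price_sum DESC (brand and category are added only for a stable order).
    return sorted(result, key=lambda r: (-r[2], r[0], r[1]))


result = top_categories(purchases)
for row in result:
    print(row)
```

With these rows, `zeta` appears twice because its two categories tie on the summed price. The SQL version behaves the same way, since `rank()` assigns rank 1 to both; swapping in `row_number()` would be one way to force a single row per brand.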