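spark.sql("""
-- Total purchase value per (brand, category_code) pair
WITH CALCULATED AS (
    SELECT
        brand,
        category_code,
        sum(price) AS price_sum
    FROM PURCHASE
    WHERE
        brand IS NOT NULL
        AND category_code IS NOT NULL
    GROUP BY brand, category_code
),
-- Rank each brand's categories by total purchase value, highest first
RANKED AS (
    SELECT
        brand,
        category_code,
        price_sum,
        rank() OVER (PARTITION BY brand ORDER BY price_sum DESC) AS rank
    FROM CALCULATED
)
-- Keep each brand's top category; tied categories all get rank 1,
-- so a brand can appear more than once
SELECT *
FROM RANKED
WHERE rank = 1
ORDER BY price_sum DESC
""").show(truncate=False)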