Skip to content

Instantly share code, notes, and snippets.

@garystafford
Last active May 16, 2022 06:42
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save garystafford/e05de9db6ae79617702e53ab9a314e71 to your computer and use it in GitHub Desktop.
Save garystafford/e05de9db6ae79617702e53ab9a314e71 to your computer and use it in GitHub Desktop.
WITH credit_demographics AS (
SELECT
(year (now()) - c_birth_year) AS age,
cd_credit_rating AS credit_rating,
cd_gender AS gender,
count(cd_gender) AS gender_count
FROM
postgresql.public.customer
LEFT JOIN hive.default.customer_demographics ON c_current_cdemo_sk = cd_demo_sk
WHERE
c_birth_year IS NOT NULL
AND cd_credit_rating IS NOT NULL
AND lower(cd_credit_rating) != 'unknown'
AND cd_gender IS NOT NULL
GROUP BY
cd_credit_rating,
c_birth_year,
cd_gender
)
SELECT
age,
credit_rating,
gender,
gender_count
FROM
credit_demographics
WHERE
age BETWEEN 21 AND 65
ORDER BY
age,
credit_rating,
gender;
@WanjohiChristopher
Copy link

thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment