Skip to content

Instantly share code, notes, and snippets.

@robert8138
Last active July 11, 2022 12:08
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save robert8138/23b6d78e78a90423ab0a45df20d4a714 to your computer and use it in GitHub Desktop.
Save robert8138/23b6d78e78a90423ab0a45df20d4a714 to your computer and use it in GitHub Desktop.
Demonstrate how to do incremental load
-- Not Recommended Approach: Scan the entire table and rebuild everyday
INSERT OVERWRITE TABLE dim_total_bookings PARTITION (ds = '{{ ds }}')
SELECT
dim_market
, SUM(m_bookings) AS m_bookings
FROM
fct_bookings
WHERE
ds <= '{{ ds }}' -- this is expensive, and can quickly run into scale issue
GROUP BY
dim_market
;
-- Recommended Approach: Incremental Load
INSERT OVERWRITE TABLE dim_total_bookings PARTITION (ds = '{{ ds }}')
SELECT
dim_market
, SUM(m_bookings) AS m_bookings
FROM (
SELECT
dim_market
, m_bookings
FROM
dim_total_bookings -- a dim table
WHERE
ds = DATE_SUB('{{ ds }}', 1) -- from the previous ds
UNION
SELECT
dim_market
, SUM(m_bookings) AS m_bookings
FROM
fct_bookings -- a fct table
WHERE
ds = '{{ ds }}' -- from the current ds
GROUP BY
dim_market
) a
GROUP BY
dim_market
;
@Meena1118
Copy link

Hello,
Can you explain what is ds here and also can you share your datasets

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment