stephenleo/01_bigquery.md

## 01_bigquery.md

      
    Raw
  

              01_bigquery.md
            
          
    Big Query and SQL Hacks


Use QUALIFY to filter rows after a window function. It's way more computationally efficient than using a CTE.
SELECT RANK() OVER (PARTITION BY product_id ORDER BY rating DESC) AS rank 
FROM table
QUALIFY rank = 1


ARRAY_AGG() doesn't preserve row order, so you need to specify the ordering within the function:
ARRAY_AGG(
    STRUCT(rev.createdAt, rev.rating, rev.title, rev.content) 
    ORDER BY rev.createdAt DESC
    ) AS PRODUCT_REVIEWS


BigQuery must NEVER be used in a for loop kind of query where you run the same query multiple times with different parameters. It's very expensive. Instead, try to combine multiple queries into one such as using the IN clause in a WHERE clause.
-- DO NOT DO THIS
SELECT * FROM table WHERE id = 1

UNION ALL

SELECT * FROM table WHERE id = 2

-- DO THIS INSTEAD
SELECT * FROM table WHERE id IN (1, 2)


BQ Table Value Functions (TVF) get executed whenever each row is created. This is troublesome if you're using TVF's to calculate something like ingestion time and have to recover a table using time travel. You're ingestion time for all the rows in the time travel recovered table will be the time that you recovered the table at, not the time that the row was originally ingested.


## 02_DE_concepts.md

      
    Raw
  

              02_DE_concepts.md
            
          
    Important Data Engineering concepts


Idenpotency: The same operation applied multiple times should have the same effect as applying it once. This ensures you can safely rerun pipeline without worry about the result changing.
Reverse ETL: Extracting data from a data warehouse and loading it into a source system like SFMC or SAP. Popular choices are: Census, Hightouch and Rudderstack.
Semantic Layer: A layer of abstraction between the data warehouse and the data mart that allows for consistent definitions of business KPIs in yaml files. Popular choices are: DBT Metrics and Cube.