Parameters | ETL | ELT |
---|---|---|
process | data is transformed at staging server and then transferred to Datawarehouse DB | data remains in the DB of the Datawarehouse |
code usage | compute-intensive transformations and for small amount of data | high amounts of data |
transformation | done in server/staging area | performed in the target system |
time-load | data first loaded into staging and later loaded into target system (time intensive) | data loaded into target system only once (faster) |
time-transformation | needs to wait for transformation to complete. As data size grows, transformation time increases | speed is never dependant on the size of the data |
time- maintenance | needs high maintenance for selecting data to load and transform | low maintenance as data is always available |
implementation complexity | at an early stage, easier to implement | requires deep knowledge of tools and expert skills |
support for data warehouse | used for on-premises, relational and structured data | used in scalable cloud infrastructure which supports structured, unstructured data sources |
support for lake/mart/lakehouse | no | yes |
complexity | oads only the important data, as identified at design time | involves development from the output-backward and loading only relevant data |
cost | high costs for small and medium businesses | low entry costs using online SasS Platforms |
lookups | both facts and dimensions need to be available in staging area | all data will be available because extract and load occur in one single action |
aggregations | complexity increase with the additional amount of data in the dataset | power of the target platform can process significant amount of data quickly |
calculations | overwrites existing column or need to append the dataset and push to the target platform | easily add the calculated column to the existing table |
maturity | mature and well documented | relatively new concept and complex to implement |
hardware | most tools have unique hardware requirements that are expensive | being Saas hardware cost is not an issue |
support unstructured data | mostly supports relational data | supports |
size/type of data set | best for small relational data sets with complex transformations relevant to analysis goals | can process any big data, structured or unstructured |
compliance/privacy | better for GDPR, HIPAA, and CCPA compliance since sensitive data can be omitted before loading | having more risk of exposing private data |
data output | tipically structured | structured, semi-structured, unstructured |
Created
June 19, 2023 06:30
-
-
Save EliFuzz/43aa4fe25acbd7d306c2f5f6a1cea344 to your computer and use it in GitHub Desktop.
Comparison table of ETL vs ELT
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment