This page has the resources for my Azure Data Lake Design Patterns talk.
Data lakes have been around for several years, and there is still much hype and hyperbole surrounding their use. This session covers the basic design patterns and architectural principles that help you use the data lake and its underlying technologies effectively.
We will cover best practices for data ingestion and recommendations on file formats, as well as designing effective zones and folder hierarchies to prevent the dreaded data swamp. We’ll also discuss how to consume and process data from a data lake, and we will cover the often-overlooked areas of governance and security.
This session goes beyond corny puns and broken metaphors and provides real-world guidance from dozens of successful implementations in Azure.
The latest PDF copy of the slides
https://docs.microsoft.com/en-us/azure/architecture/data-guide/scenarios/data-lake
https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-best-practices
https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-performance-tuning-guidance
https://github.com/Azure/AzureDataLake
https://github.com/rukmani-msft/adlsguidancedoc/blob/master/Hitchhikers_Guide_to_the_Datalake.md
This is a list of the recorded versions of this talk, in reverse chronological order:
SQLBITS 2020 (TBD)
SQLSaturday LA (TBD)
SQL PASS Cloud Virtual Chapter 2020 http://bit.ly/datalakecloudvc
SQLBITS 2019 http://bit.ly/datalakesqlbits2019