import boto3
import pandas as pd
import s3fs  # lets pandas read s3:// paths directly
import time

GLUE_DATABASE = 'sample_db'
ATHENA_S3_OUTPUT_PATH = 's3://athena-query-results-bucket/etl-queries-temp-output-folder'

# Athena treats double-quoted tokens as identifiers, so string literals use single quotes.
athena_query = """SELECT * FROM table_name WHERE column_1 = 'value1' AND column_2 = 'value2'"""
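The constants above are typically passed to the Athena client to start the query and then wait for it to finish. The following is a minimal sketch of that flow, not the author's exact code: the helper name `run_athena_query` and its `poll_seconds` and `max_polls` parameters are hypothetical, and the client is passed in as an argument so the function can be exercised without AWS credentials.

```python
import time


def run_athena_query(athena_client, query, database, output_path,
                     poll_seconds=1.0, max_polls=60):
    """Start an Athena query and poll until it reaches a terminal state.

    Returns the S3 location of the result file reported by Athena.
    `athena_client` is assumed to be a boto3 Athena client (or a stand-in
    exposing the same two methods).
    """
    response = athena_client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={'Database': database},
        ResultConfiguration={'OutputLocation': output_path},
    )
    query_execution_id = response['QueryExecutionId']

    for _ in range(max_polls):
        execution = athena_client.get_query_execution(
            QueryExecutionId=query_execution_id
        )['QueryExecution']
        state = execution['Status']['State']

        if state == 'SUCCEEDED':
            # Athena reports the full path of the result file here.
            return execution['ResultConfiguration']['OutputLocation']
        if state in ('FAILED', 'CANCELLED'):
            reason = execution['Status'].get('StateChangeReason', 'unknown reason')
            raise RuntimeError(f'Athena query {state}: {reason}')

        # QUEUED or RUNNING: wait before checking again.
        time.sleep(poll_seconds)

    raise TimeoutError(f'Athena query {query_execution_id} did not finish in time')
```

With real credentials, this could be called as `run_athena_query(boto3.client('athena'), athena_query, GLUE_DATABASE, ATHENA_S3_OUTPUT_PATH)`, and the returned path could then be loaded with `pd.read_csv(...)`, assuming `s3fs` is installed.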
| | Method 1 (Glue Crawler) | Method 2 (MSCK Repair) | Method 3 (ALTER TABLE Command) | Method 4 (Boto3 SDK) |
|---|---|---|---|---|
| Cost | Paid | Free | Free | Very low |
| Time taken | Minutes | Minutes | Seconds | Less than 2 seconds |
| Schema change detection | Yes | No | No | No |
| Limitations (current use case) | - | Athena service quota limits | Athena service quota limits | - |
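To make Methods 2 and 3 in the table concrete, the sketch below builds the Athena DDL statements they rely on. The helper names and the example table, partition keys, and S3 path are hypothetical; the statements follow Athena's documented `MSCK REPAIR TABLE` and `ALTER TABLE ... ADD PARTITION` syntax, and partition values are assumed to come from a trusted source (no quoting or escaping is applied).

```python
def build_msck_repair_query(table):
    """Method 2: ask Athena to scan the table location for new partitions."""
    return f"MSCK REPAIR TABLE {table}"


def build_add_partition_query(table, partition_values, location):
    """Method 3: register one known partition explicitly.

    `partition_values` maps partition column names to their values,
    in the order the columns are declared on the table.
    """
    spec = ", ".join(f"{key}='{value}'" for key, value in partition_values.items())
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({spec}) LOCATION '{location}'"
    )


# Hypothetical example values, for illustration only.
example_query = build_add_partition_query(
    "table_name",
    {"year": "2021", "month": "06"},
    "s3://example-bucket/table_name/year=2021/month=06/",
)
```

Both statements are submitted as ordinary Athena queries, which is why the table lists Athena service quota limits against Methods 2 and 3.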
import boto3
import urllib.parse
import os
import copy

def create_glue_partition_handler(event, context):
    for record in event['Records']:
        try:
            source_bucket = record['s3']['bucket']['name']