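def extract_numbers_from_string(text):
    # Join every digit character in the input and parse the result as a single integer.
    # Raises ValueError when the input contains no digits, because int('') is invalid.
    return int(''.join(ch for ch in text if ch.isdigit()))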
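A short usage sketch for the helper above. It shows that all digits are concatenated into one number rather than returned separately. `extract_numbers_or_none` is a hypothetical variant, not part of the original gist, for callers who prefer `None` over an exception when the input has no digits.

```python
def extract_numbers_from_string(text):
    """Concatenate all digit characters in `text` and parse them as one int."""
    return int(''.join(ch for ch in text if ch.isdigit()))


def extract_numbers_or_none(text):
    """Hypothetical variant: return None instead of raising when no digits exist."""
    digits = ''.join(ch for ch in text if ch.isdigit())
    return int(digits) if digits else None


# The digits "12" and "34" are joined before parsing, giving one integer.
print(extract_numbers_from_string("order-12, item 34"))
# No digits at all: the variant returns None instead of raising ValueError.
print(extract_numbers_or_none("no digits here"))
```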
from datetime import datetime, timedelta
from holidayskr import today_is_holiday

def check_if_business_day():
    # Current UTC time (note: datetime.utcnow() is deprecated since Python 3.12)
    now_utc = datetime.utcnow()
    # Korea Standard Time (UTC+9)
    kst_offset = timedelta(hours=9)
    now_kst = now_utc + kst_offset
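The preview above is cut off before the actual check, so the following is only a sketch of how a KST business-day test could be completed. It assumes a business day means "Monday to Friday and not a holiday". The `is_holiday` parameter is a hypothetical callable that stands in for whatever holiday lookup the original uses, and `now_utc` is an added parameter that makes the function testable.

```python
from datetime import datetime, timedelta, timezone

# Korea does not observe daylight saving time, so a fixed UTC+9 offset is enough.
KST = timezone(timedelta(hours=9), name="KST")


def is_business_day(now_utc=None, is_holiday=lambda day: False):
    """Return True when the current KST date is a weekday and not a holiday.

    now_utc:    timezone-aware datetime (assumption: defaults to the current time).
    is_holiday: hypothetical callable taking a `date` and returning bool.
    """
    if now_utc is None:
        now_utc = datetime.now(timezone.utc)
    today_kst = now_utc.astimezone(KST).date()
    # weekday() returns 5 for Saturday and 6 for Sunday.
    if today_kst.weekday() >= 5:
        return False
    return not is_holiday(today_kst)
```

Converting to KST before taking the date matters near midnight: late Friday afternoon in UTC is already Saturday in Korea.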
import requests
from bs4 import BeautifulSoup
from concurrent.futures import ThreadPoolExecutor, as_completed
import pandas as pd
import threading
import datetime
import boto3
from pytz import timezone
import pickle
from datetime import datetime, timedelta

def get_kst_dates():
    """
    Return today's and yesterday's dates in Korea Standard Time (KST) as strings, based on the current UTC time.
    Returns:
        today_str (str): Today's date in KST, formatted as 'YYYY-MM-DD'.
        yesterday_str (str): Yesterday's date in KST, formatted as 'YYYY-MM-DD'.
    """
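The `get_kst_dates` docstring above describes the behavior, but the body is not shown. A minimal sketch of one way to satisfy that docstring follows. The optional `now_utc` parameter is an assumption added here for testability; the original function takes no arguments.

```python
from datetime import datetime, timedelta, timezone

KST = timezone(timedelta(hours=9))  # fixed offset; Korea has no DST


def get_kst_dates(now_utc=None):
    """Return (today_str, yesterday_str) for the KST calendar date.

    now_utc is a timezone-aware datetime (assumption: defaults to the current time).
    """
    if now_utc is None:
        now_utc = datetime.now(timezone.utc)
    today = now_utc.astimezone(KST).date()
    yesterday = today - timedelta(days=1)
    return today.strftime('%Y-%m-%d'), yesterday.strftime('%Y-%m-%d')
```

Subtracting the day from a `date` object, rather than editing the string, handles month and leap-year boundaries correctly.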
import json
import requests
import traceback
import time

def send_slack_notification(blocks, thread_ts=None):
    """
    Send a message to a Slack channel.
    Args:
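The rest of `send_slack_notification` is not visible in this preview. The sketch below shows one plausible shape, assuming the message is posted through Slack's `chat.postMessage` Web API method with a bot token. The `token` and `channel` parameters, the `build_slack_payload` helper, and the use of `urllib` instead of `requests` are all assumptions made to keep the example self-contained.

```python
import json
import urllib.request

SLACK_POST_MESSAGE_URL = "https://slack.com/api/chat.postMessage"


def build_slack_payload(channel, blocks, thread_ts=None):
    """Build the JSON body; thread_ts is only included when replying in a thread."""
    payload = {"channel": channel, "blocks": blocks}
    if thread_ts is not None:
        payload["thread_ts"] = thread_ts
    return payload


def send_slack_notification(token, channel, blocks, thread_ts=None, timeout=10):
    """Post the blocks to Slack and return the decoded JSON response."""
    body = json.dumps(build_slack_payload(channel, blocks, thread_ts)).encode("utf-8")
    request = urllib.request.Request(
        SLACK_POST_MESSAGE_URL,
        data=body,
        headers={
            "Content-Type": "application/json; charset=utf-8",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The response of a successful post includes a `ts` value, which can be passed back as `thread_ts` to reply in the same thread.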
import time

def measure_runtime(func):
    """
    Decorator that measures a function's execution time and reports it in hours, minutes, and seconds.
    Args:
        func (function): The function whose execution time is measured.
    Returns:
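The decorator body is truncated above. As a sketch under the assumption that the elapsed time is simply printed, a timing decorator of this kind could look like the following. `format_duration` is a hypothetical helper, and the output format is illustrative only.

```python
import functools
import time


def format_duration(seconds):
    """Split a duration in seconds into whole hours, minutes, and seconds."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes}m {secs}s"


def measure_runtime(func):
    """Print how long each call to `func` takes, then return its result."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are timed too.
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} ran for {format_duration(elapsed)}")
    return wrapper
```

`time.perf_counter()` is used here because it is monotonic, so the measurement is not affected by system clock adjustments.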
import pandas as pd
from sqlalchemy import create_engine

def query_to_dataframe(host, port, username, password, database, query):
    """
    Execute the given SQL query and return the result as a pandas DataFrame.
    Parameters:
        host (str): Database server host address.
        port (int): Database server port number.
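The preview ends before the function body, and it does not say which database or driver is used. The sketch below assumes MySQL through the `mysql+pymysql` dialect purely for illustration. `build_connection_url` is a hypothetical helper; it URL-escapes the credentials so that characters such as `@` or `/` in a password do not break the connection string.

```python
from urllib.parse import quote_plus


def build_connection_url(host, port, username, password, database,
                         driver="mysql+pymysql"):  # driver is an assumption
    """Build a SQLAlchemy-style URL with escaped credentials."""
    return (
        f"{driver}://{quote_plus(username)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


def query_to_dataframe(host, port, username, password, database, query):
    """Run `query` and return the rows as a pandas DataFrame."""
    # Imported lazily so the URL helper can be used without these packages.
    import pandas as pd
    from sqlalchemy import create_engine, text

    engine = create_engine(
        build_connection_url(host, port, username, password, database)
    )
    try:
        with engine.connect() as connection:
            return pd.read_sql(text(query), connection)
    finally:
        engine.dispose()  # release pooled connections for this one-off query
```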
import pandas as pd
import boto3
from io import BytesIO
import time

def save_dataframe_to_parquet_on_s3(s3_bucket_name, df, s3_path):
    """
    Save a pandas DataFrame to an S3 bucket as a Parquet file.
    Converts the given DataFrame to Parquet format and stores it at the specified S3 path.
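The two steps in the docstring above (convert to Parquet, then store at the S3 path) can be sketched as follows. This is not the gist's actual body, which is not shown. The optional `s3_client` parameter and the byte-count return value are assumptions added here so that the function can be exercised without AWS access.

```python
from io import BytesIO


def save_dataframe_to_parquet_on_s3(s3_bucket_name, df, s3_path, s3_client=None):
    """Serialize `df` to Parquet in memory and upload it to S3.

    s3_client is an assumption for testability; by default a boto3 client
    is created from the ambient AWS credentials.
    """
    if s3_client is None:
        import boto3  # imported lazily so tests can inject a stub client
        s3_client = boto3.client("s3")

    # Write the Parquet bytes into memory instead of a temporary file.
    buffer = BytesIO()
    df.to_parquet(buffer, index=False)
    body = buffer.getvalue()

    s3_client.put_object(Bucket=s3_bucket_name, Key=s3_path, Body=body)
    return len(body)  # number of bytes uploaded
```

Note that `DataFrame.to_parquet` needs a Parquet engine such as `pyarrow` or `fastparquet` to be installed.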
"""
This file defines the S3ParquetHandler class, which reads Parquet files from AWS S3 efficiently and saves pandas DataFrames to S3 in Parquet format. Its main features are:
1. Class initialization (__init__ method):
- Takes AWS S3 credentials and a region setting and initializes a boto3 session.
- Lets the caller set whether to use multithreading (use_multithreading) and the maximum number of threads (max_threads). Multithreading can speed up file loading.
2. Reading a single Parquet file (read_parquet_from_s3 method):
- Reads a single Parquet file using the given S3 bucket and key and returns it as a pandas DataFrame.
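The class itself is not shown in this preview, so the following is only a sketch of how the `use_multithreading` and `max_threads` options described above might drive parallel loading. `load_many` and its `load_one` argument are hypothetical names; `load_one` stands for any callable that loads one object by key, for example a bound `read_parquet_from_s3` call.

```python
from concurrent.futures import ThreadPoolExecutor


def load_many(keys, load_one, use_multithreading=True, max_threads=8):
    """Load every key with `load_one`, optionally in parallel threads.

    Results are returned in the same order as `keys`.
    """
    keys = list(keys)
    if not use_multithreading or len(keys) <= 1:
        return [load_one(key) for key in keys]
    # executor.map preserves input order even when tasks finish out of order.
    with ThreadPoolExecutor(max_workers=max_threads) as executor:
        return list(executor.map(load_one, keys))
```

Threads help here because S3 downloads are I/O-bound, so most of the time is spent waiting on the network rather than holding the interpreter lock.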
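import pandas as pd
from pymongo import MongoClient
import json

# MONGO_URI, DATABASE_NAME, and COLLECTION_NAME must be defined by the caller.
# Load the CSV and convert each row to a JSON-compatible dict.
data = json.loads(pd.read_csv('csv.csv').to_json(orient='records'))
coll = MongoClient(MONGO_URI)[DATABASE_NAME][COLLECTION_NAME]
# Collection.remove() was removed in PyMongo 4; delete_many({}) clears all documents.
coll.delete_many({})
coll.insert_many(data)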
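The script above clears the collection and reloads it from a CSV. A hedged sketch of the same idea as reusable functions follows. `records_from_csv` and `replace_collection` are hypothetical names. The standard `csv` module is used instead of pandas to keep the example dependency-free, so every value is loaded as a string, unlike `pd.read_csv`, which infers column types.

```python
import csv


def records_from_csv(lines):
    """Parse CSV text (any iterable of lines, e.g. an open file) into dicts."""
    return list(csv.DictReader(lines))


def replace_collection(coll, records):
    """Replace the contents of a MongoDB collection with `records`.

    `coll` is expected to behave like a pymongo Collection.
    """
    coll.delete_many({})  # an empty filter matches every document
    if records:
        # Guard against an empty list, which insert_many does not accept.
        coll.insert_many(records)
    return len(records)
```

Parsing the CSV before calling `delete_many` means a malformed file fails early, before any existing documents are removed.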