@sean-azoci
sean-azoci / test.py
Last active December 14, 2021 02:28
test
# Query syntax check
from google.cloud import bigquery

client = get_bigquery_client()
# dry_run=True asks BigQuery to validate the query without executing it
job_config = bigquery.QueryJobConfig(dry_run=True, use_legacy_sql=False)
try:
    query_job = client.query(
        query=query,
        job_config=job_config)
except Exception as e:
    raise Exception(f'{file_name} SQL Syntax is not valid.') from e
@sean-azoci
sean-azoci / ci2.py
Last active December 14, 2021 06:03
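Team blog - CI check 2
from collections import defaultdict

graph = defaultdict(list)    # node -> list of nodes it depends on
visited = defaultdict(int)   # 0 means the node has not been visited yet
for node in list(graph):
    if not visited[node]:
        # dfs returns a falsy value when it finds a cycle
        if not dfs(graph, visited, node):
            raise Exception(f'{node} has cyclic dependencies.')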
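The snippet above calls a `dfs` helper that the preview does not show. The following is a minimal sketch of what such a helper might look like, assuming a three-state depth-first search in which a node that is reached again while still on the current path signals a cycle. The state constants and the `find_cycle_node` wrapper are hypothetical names introduced for illustration; they are not taken from the gist.

```python
from collections import defaultdict

# Visit states for the three-state DFS (hypothetical names, not from the gist).
UNVISITED, IN_PROGRESS, DONE = 0, 1, 2


def dfs(graph, visited, node):
    """Return True if no cycle is reachable from `node`, False otherwise."""
    visited[node] = IN_PROGRESS
    # graph.get avoids inserting missing keys (such as 'start') into the graph.
    for parent in graph.get(node, []):
        if visited[parent] == IN_PROGRESS:
            return False  # `parent` is still on the current path: a cycle
        if visited[parent] == UNVISITED and not dfs(graph, visited, parent):
            return False
    visited[node] = DONE
    return True


def find_cycle_node(graph):
    """Return the first start node whose traversal hits a cycle, or None."""
    visited = defaultdict(int)  # missing nodes default to UNVISITED (0)
    for node in list(graph):
        if not visited[node]:
            if not dfs(graph, visited, node):
                return node
    return None
```

A CI check could call `find_cycle_node` on the dependency dictionary and fail the build whenever the result is not `None`.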
@sean-azoci
sean-azoci / ci.py
Last active December 14, 2021 06:03
Team blog - CI check
# Query syntax check
from google.cloud import bigquery

client = get_bigquery_client()
# dry_run=True asks BigQuery to validate the query without executing it
job_config = bigquery.QueryJobConfig(dry_run=True, use_legacy_sql=False)
try:
    query_job = client.query(
        query=query,
        job_config=job_config)
except Exception as e:
    raise Exception(f'{file_name} SQL Syntax is not valid.') from e
@sean-azoci
sean-azoci / gist:85fbf1e08e91b3d65b6c196fe1540865
Last active December 14, 2021 06:01
Team blog - automatic task generation
# Load the table specs from an Airflow Variable and keep only the enabled ones
json_array = Variable.get(variable, deserialize_json=True)
filtered = [x for x in json_array if x['is_enabled'] == 'true']
for item in filtered:
    ecs_dict = get_ecs_operator_dict(service_name, item, date_kst)
    ecs_task = ECSOperator(
        **ecs_dict
    )
    gcs_del_dict = get_gcs_delete_prefix_dict(service_name, item, date_kst)
    gcs_del_task = GoogleCloudStorageDeleteOperator(
@sean-azoci
sean-azoci / client.json
Last active December 13, 2021 08:26
Team blog - 2. Client log collection spec declaration
{
  "question": [
    {
      "event_name": "question"
    }
  ],
  "question_payment": [
    {
      "event_name": "question_payment"
    }
@sean-azoci
sean-azoci / gist:def7a9688ffa20cad8eeddd96790e1f3
Last active December 14, 2021 05:53
Team blog - ECSOperator parameters
command = [
    "spark-2.4.5-bin-hadoop2.7/bin/spark-submit",
    "--class",
    get_class_name(item_json["etl_type"]),
    "--master",
    "local[*]",
    "--driver-memory",
    resource_type["driver"],
    "qanda-data-ecs-task-assembly-0.1.0-SNAPSHOT.jar",
    service_name,
@sean-azoci
sean-azoci / gist:ab2c97c3ab03714bbe238a8c5835477e
Created October 2, 2021 06:14
Checking mart dependencies
# Dependency code
{
    'mart_a' : ['start', 'mart_d'],
    'mart_b' : ['start'],
    'mart_c' : ['mart_a', 'mart_b'],
    'mart_d' : ['mart_c', 'mart_b']
}
# Example code for finding cycles
@sean-azoci
sean-azoci / gist:983d3dc18e6f13c4ed00d7a387a7d68a
Created October 2, 2021 06:12
Checking syntax with BigQuery
from google.cloud import bigquery

client = get_bigquery_client()
# dry_run=True asks BigQuery to validate the query without executing it
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False, use_legacy_sql=False)
try:
    query_job = client.query(
        query=query,
        job_config=job_config)
except Exception as e:
    raise Exception(f'{file_name} SQL Syntax is not valid.') from e
@sean-azoci
sean-azoci / gist:d831fd6efeef6cd78ac08dc9421e6f60
Created October 2, 2021 06:06
Eliminating mart task code #1
# File name
{dataset}/{table_name}.sql
# Task name
{dataset}_{table_name}
# Table name
{dataset}.{table_name}
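The naming convention above lets the task name and table name be derived from the SQL file path alone. A minimal sketch of that derivation follows, assuming paths always end in `{dataset}/{table_name}.sql`; the function name `names_from_sql_path` is hypothetical and does not appear in the gist.

```python
from pathlib import PurePosixPath


def names_from_sql_path(path):
    """Derive (task_name, table_name) from a '{dataset}/{table_name}.sql' path."""
    p = PurePosixPath(path)
    if p.suffix != ".sql" or len(p.parts) < 2:
        raise ValueError(f"expected '{{dataset}}/{{table_name}}.sql', got {path!r}")
    # The parent directory is the dataset; the file stem is the table name.
    dataset, table_name = p.parts[-2], p.stem
    return f"{dataset}_{table_name}", f"{dataset}.{table_name}"
```

With this in place, adding a new mart would only require dropping a new `.sql` file into the right dataset directory, with no per-table task code.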
@sean-azoci
sean-azoci / gist:6b3f6707f7be09bcb884435dc42174d0
Created September 27, 2021 23:51
ETL code elimination (Airflow)
# Example execution code inside the DAG
variable = '{service_name}'
tables = Variable.get(variable, deserialize_json=True)
for table in tables:
    spark_task = Spark_job_Operator(task_id = f'spark_{table}' ...)
    load_task = Python_Operator(task_id = f'load_{table}' ...)
    spark_task >> load_task