@mohittalele
Created September 7, 2022 13:24
*** Reading local file: /opt/airflow/logs/dag_id=astro-sdk-test/run_id=scheduled__2019-01-01T11:00:00+00:00/task_id=load_file/attempt=2.log
[2022-09-07, 13:18:11 UTC] {taskinstance.py:1179} INFO - Dependencies all met for <TaskInstance: astro-sdk-test.load_file scheduled__2019-01-01T11:00:00+00:00 [queued]>
[2022-09-07, 13:18:11 UTC] {taskinstance.py:1179} INFO - Dependencies all met for <TaskInstance: astro-sdk-test.load_file scheduled__2019-01-01T11:00:00+00:00 [queued]>
[2022-09-07, 13:18:11 UTC] {taskinstance.py:1376} INFO -
--------------------------------------------------------------------------------
[2022-09-07, 13:18:11 UTC] {taskinstance.py:1377} INFO - Starting attempt 2 of 2
[2022-09-07, 13:18:11 UTC] {taskinstance.py:1378} INFO -
--------------------------------------------------------------------------------
[2022-09-07, 13:18:11 UTC] {taskinstance.py:1397} INFO - Executing <Task(LoadFileOperator): load_file> on 2019-01-01 11:00:00+00:00
[2022-09-07, 13:18:11 UTC] {standard_task_runner.py:52} INFO - Started process 243 to run task
[2022-09-07, 13:18:11 UTC] {standard_task_runner.py:79} INFO - Running: ['airflow', 'tasks', 'run', 'astro-sdk-test', 'load_file', 'scheduled__2019-01-01T11:00:00+00:00', '--job-id', '312', '--raw', '--subdir', 'DAGS_FOLDER/astro-sdk-dag.py', '--cfg-path', '/tmp/tmp0rvmby8_', '--error-file', '/tmp/tmprnvnx0d3']
[2022-09-07, 13:18:11 UTC] {standard_task_runner.py:80} INFO - Job 312: Subtask load_file
[2022-09-07, 13:18:11 UTC] {task_command.py:371} INFO - Running <TaskInstance: astro-sdk-test.load_file scheduled__2019-01-01T11:00:00+00:00 [running]> on host airflow-worker-0.airflow-worker.airflow.svc.cluster.local
[2022-09-07, 13:18:11 UTC] {taskinstance.py:1591} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=astro-sdk-test
AIRFLOW_CTX_TASK_ID=load_file
AIRFLOW_CTX_EXECUTION_DATE=2019-01-01T11:00:00+00:00
AIRFLOW_CTX_TRY_NUMBER=2
AIRFLOW_CTX_DAG_RUN_ID=scheduled__2019-01-01T11:00:00+00:00
[2022-09-07, 13:18:11 UTC] {base.py:68} INFO - Using connection ID 's3_conn' for task execution.
[2022-09-07, 13:18:11 UTC] {load_file.py:75} INFO - Loading s3://tmp9/data/falcon_heartbeat.csv into None ...
[2022-09-07, 13:18:11 UTC] {base.py:68} INFO - Using connection ID 's3_conn' for task execution.
[2022-09-07, 13:18:11 UTC] {base_aws.py:206} INFO - Credentials retrieved from login
[2022-09-07, 13:18:11 UTC] {base.py:68} INFO - Using connection ID 's3_conn' for task execution.
[2022-09-07, 13:18:11 UTC] {base_aws.py:206} INFO - Credentials retrieved from login
[2022-09-07, 13:18:16 UTC] {taskinstance.py:1909} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.7/site-packages/urllib3/connection.py", line 175, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/home/airflow/.local/lib/python3.7/site-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/home/airflow/.local/lib/python3.7/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/httpsession.py", line 457, in send
    chunked=self._chunked(request.headers),
  File "/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 786, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/home/airflow/.local/lib/python3.7/site-packages/urllib3/util/retry.py", line 525, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/home/airflow/.local/lib/python3.7/site-packages/urllib3/packages/six.py", line 770, in reraise
    raise value
  File "/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 710, in urlopen
    chunked=chunked,
  File "/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 1040, in _validate_conn
    conn.connect()
  File "/home/airflow/.local/lib/python3.7/site-packages/urllib3/connection.py", line 358, in connect
    self.sock = conn = self._new_conn()
  File "/home/airflow/.local/lib/python3.7/site-packages/urllib3/connection.py", line 187, in _new_conn
    self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <botocore.awsrequest.AWSHTTPSConnection object at 0x7f126ddb0c90>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.7/site-packages/astro/sql/operators/load_file.py", line 71, in execute
    return self.load_data(input_file=self.input_file)
  File "/home/airflow/.local/lib/python3.7/site-packages/astro/sql/operators/load_file.py", line 84, in load_data
    return self.load_data_to_dataframe(input_file)
  File "/home/airflow/.local/lib/python3.7/site-packages/astro/sql/operators/load_file.py", line 136, in load_data_to_dataframe
    columns_names_capitalization=self.columns_names_capitalization
  File "/home/airflow/.local/lib/python3.7/site-packages/astro/files/base.py", line 108, in export_to_dataframe
    self._convert_remote_file_to_byte_stream(), **kwargs
  File "/home/airflow/.local/lib/python3.7/site-packages/astro/files/base.py", line 93, in _convert_remote_file_to_byte_stream
    self.path, mode=mode, transport_params=self.location.transport_params
  File "/home/airflow/.local/lib/python3.7/site-packages/smart_open/smart_open_lib.py", line 224, in open
    binary = _open_binary_stream(uri, binary_mode, transport_params)
  File "/home/airflow/.local/lib/python3.7/site-packages/smart_open/smart_open_lib.py", line 400, in _open_binary_stream
    fobj = submodule.open_uri(uri, mode, transport_params)
  File "/home/airflow/.local/lib/python3.7/site-packages/smart_open/s3.py", line 224, in open_uri
    return open(parsed_uri['bucket_id'], parsed_uri['key_id'], mode, **kwargs)
  File "/home/airflow/.local/lib/python3.7/site-packages/smart_open/s3.py", line 298, in open
    client_kwargs=client_kwargs,
  File "/home/airflow/.local/lib/python3.7/site-packages/smart_open/s3.py", line 574, in __init__
    self.seek(0)
  File "/home/airflow/.local/lib/python3.7/site-packages/smart_open/s3.py", line 666, in seek
    self._current_pos = self._raw_reader.seek(offset, whence)
  File "/home/airflow/.local/lib/python3.7/site-packages/smart_open/s3.py", line 417, in seek
    self._open_body(start, stop)
  File "/home/airflow/.local/lib/python3.7/site-packages/smart_open/s3.py", line 443, in _open_body
    range_string,
  File "/home/airflow/.local/lib/python3.7/site-packages/smart_open/s3.py", line 330, in _get
    return client.get_object(Bucket=bucket, Key=key, Range=range_string)
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/client.py", line 508, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/client.py", line 899, in _make_api_call
    operation_model, request_dict, request_context
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/client.py", line 921, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/endpoint.py", line 119, in make_request
    return self._send_request(request_dict, operation_model)
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/endpoint.py", line 207, in _send_request
    exception,
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/endpoint.py", line 361, in _needs_retry
    request_dict=request_dict,
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/hooks.py", line 412, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/hooks.py", line 256, in emit
    return self._emit(event_name, kwargs)
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/hooks.py", line 239, in _emit
    response = handler(**kwargs)
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/retryhandler.py", line 207, in __call__
    if self._checker(**checker_kwargs):
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/retryhandler.py", line 285, in __call__
    attempt_number, response, caught_exception
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/retryhandler.py", line 320, in _should_retry
    return self._checker(attempt_number, response, caught_exception)
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/retryhandler.py", line 364, in __call__
    attempt_number, response, caught_exception
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/retryhandler.py", line 248, in __call__
    attempt_number, caught_exception
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/retryhandler.py", line 416, in _check_caught_exception
    raise caught_exception
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/endpoint.py", line 281, in _do_get_response
    http_response = self._send(request)
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/endpoint.py", line 377, in _send
    return self.http_session.send(request)
  File "/home/airflow/.local/lib/python3.7/site-packages/botocore/httpsession.py", line 477, in send
    raise EndpointConnectionError(endpoint_url=request.url, error=e)
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "https://tmp9.s3.amazonaws.com/data/falcon_heartbeat.csv"
[2022-09-07, 13:18:16 UTC] {taskinstance.py:1420} INFO - Marking task as FAILED. dag_id=astro-sdk-test, task_id=load_file, execution_date=20190101T110000, start_date=20220907T131811, end_date=20220907T131816
[2022-09-07, 13:18:16 UTC] {standard_task_runner.py:97} ERROR - Failed to execute job 312 for task load_file (Could not connect to the endpoint URL: "https://tmp9.s3.amazonaws.com/data/falcon_heartbeat.csv"; 243)
[2022-09-07, 13:18:16 UTC] {local_task_job.py:156} INFO - Task exited with return code 1
[2022-09-07, 13:18:16 UTC] {local_task_job.py:273} INFO - 0 downstream tasks scheduled from follow-on schedule check
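
---

For context, here is a minimal sketch of the kind of DAG that would produce this log, assuming the astro-sdk 1.x API. The dag id, default task id (`load_file`), connection id (`s3_conn`), and S3 path are taken from the log above; the schedule is only inferred from the `scheduled__2019-01-01T11:00:00+00:00` run id, and everything else is hypothetical:

```python
# Hypothetical reconstruction of the failing DAG (astro-sdk 1.x API assumed).
from datetime import datetime

from airflow import DAG
from astro import sql as aql
from astro.files import File

with DAG(
    dag_id="astro-sdk-test",
    start_date=datetime(2019, 1, 1),
    schedule_interval="@hourly",  # assumed; the run id only shows an 11:00 scheduled run
    catchup=True,
) as dag:
    # No output_table is passed, so LoadFileOperator falls back to the
    # dataframe path -- hence "Loading s3://... into None ..." in the log.
    raw_df = aql.load_file(
        input_file=File(path="s3://tmp9/data/falcon_heartbeat.csv", conn_id="s3_conn"),
    )
```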
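The failure itself is not an astro-sdk or credentials problem: `[Errno 111] Connection refused` happens before TLS or auth, so the worker pod's TCP connection to `https://tmp9.s3.amazonaws.com` was refused outright. That usually points at blocked egress from the cluster, a required proxy, or an `endpoint_url` override in the connection (e.g. MinIO/LocalStack) aimed at a service that is down. A plain-boto3 probe to run from inside the worker, so it fails fast instead of sitting through botocore's retries (the bucket and key come from the log; the commented endpoint is an assumption to adapt):

```python
# Hypothetical connectivity probe -- checks whether the S3 endpoint is
# reachable from the worker at all, independent of Airflow/astro-sdk.
import boto3
from botocore.config import Config

s3 = boto3.client(
    "s3",
    config=Config(connect_timeout=5, retries={"max_attempts": 1}),
    # If the s3_conn Airflow connection carries an endpoint_url extra
    # (e.g. MinIO or LocalStack), mirror it here -- assumed value:
    # endpoint_url="http://minio:9000",
)
# HEAD the exact object from the log; Errno 111 here reproduces the failure.
s3.head_object(Bucket="tmp9", Key="data/falcon_heartbeat.csv")
```

If the probe only succeeds with an explicit `endpoint_url`, make sure the same endpoint is set in the `s3_conn` connection extras, not just in the environment the probe ran in.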