@ferryvg
Created February 11, 2021 09:10
Apache Beam "Must provide one of client or options" (Flink, external env)
beam-worker | Starting worker with command ['/opt/apache/beam/boot', '--id=1-1', '--logging_endpoint=localhost:45389', '--artifact_endpoint=localhost:34731', '--provision_endpoint=localhost:43605', '--control_endpoint=localhost:41783']
beam-worker | 2021/02/11 08:27:51 Provision info:
beam-worker | pipeline_options:{fields:{key:"beam:option:app_name:v1" value:{null_value:NULL_VALUE}} fields:{key:"beam:option:artifact_port:v1" value:{string_value:"0"}} fields:{key:"beam:option:beam_services:v1" value:{struct_value:{}}} fields:{key:"beam:option:dataflow_endpoint:v1" value:{string_value:"https://dataflow.googleapis.com"}} fields:{key:"beam:option:direct_num_workers:v1" value:{string_value:"1"}} fields:{key:"beam:option:direct_runner_bundle_repeat:v1" value:{string_value:"0"}} fields:{key:"beam:option:direct_runner_use_stacked_bundle:v1" value:{bool_value:true}} fields:{key:"beam:option:direct_running_mode:v1" value:{string_value:"in_memory"}} fields:{key:"beam:option:dry_run:v1" value:{bool_value:false}} fields:{key:"beam:option:enable_streaming_engine:v1" value:{bool_value:false}} fields:{key:"beam:option:environment_cache_millis:v1" value:{string_value:"0"}} fields:{key:"beam:option:environment_config:v1" value:{string_value:"127.0.0.1:50000"}} fields:{key:"beam:option:environment_type:v1" value:{string_value:"EXTERNAL"}} fields:{key:"beam:option:expansion_port:v1" value:{string_value:"0"}} fields:{key:"beam:option:experiments:v1" value:{list_value:{values:{string_value:"beam_fn_api"}}}} fields:{key:"beam:option:extra_packages:v1" value:{list_value:{values:{string_value:"/tmp/tmpvy0mx3pm/build/tfx/dist/tfx_ephemeral-0.27.0.tar.gz"}}}} fields:{key:"beam:option:flink_master:v1" value:{string_value:"[auto]"}} fields:{key:"beam:option:flink_submit_uber_jar:v1" value:{bool_value:true}} fields:{key:"beam:option:flink_version:v1" value:{string_value:"1.10"}} fields:{key:"beam:option:hdfs_full_urls:v1" value:{bool_value:false}} fields:{key:"beam:option:job_name:v1" value:{string_value:"BeamApp-orbitadserving-0211082747-ebedf113"}} fields:{key:"beam:option:job_port:v1" value:{string_value:"0"}} fields:{key:"beam:option:job_server_java_launcher:v1" value:{string_value:"java"}} fields:{key:"beam:option:job_server_timeout:v1" value:{string_value:"60"}} fields:{key:"beam:option:labels:v1" value:{list_value:{values:{string_value:"tfx_executor=tfx-components-example_gen-import_example_gen-executor-executor"} values:{string_value:"tfx_py_version=3-8"} values:{string_value:"tfx_runner=local"} values:{string_value:"tfx_version=0-27-0"}}}} fields:{key:"beam:option:no_auth:v1" value:{bool_value:false}} fields:{key:"beam:option:options_id:v1" value:{number_value:1}} fields:{key:"beam:option:performance_runtime_type_check:v1" value:{bool_value:false}} fields:{key:"beam:option:pipeline_type_check:v1" value:{bool_value:true}} fields:{key:"beam:option:profile_cpu:v1" value:{bool_value:false}} fields:{key:"beam:option:profile_memory:v1" value:{bool_value:false}} fields:{key:"beam:option:profile_sample_rate:v1" value:{number_value:1}} fields:{key:"beam:option:runtime_type_check:v1" value:{bool_value:false}} fields:{key:"beam:option:s3_access_key_id:v1" value:{string_value:"valid-access-key"}} fields:{key:"beam:option:s3_disable_ssl:v1" value:{bool_value:false}} fields:{key:"beam:option:s3_endpoint_url:v1" value:{string_value:"http://192.168.0.117:49000"}} fields:{key:"beam:option:s3_region_name:v1" value:{string_value:"us-east-1"}} fields:{key:"beam:option:s3_secret_access_key:v1" value:{string_value:"valid-secret-key"}} fields:{key:"beam:option:s3_verify:v1" value:{string_value:"false"}} fields:{key:"beam:option:save_main_session:v1" value:{bool_value:false}} fields:{key:"beam:option:sdk_location:v1" value:{string_value:"container"}} fields:{key:"beam:option:sdk_worker_parallelism:v1" 
value:{string_value:"1"}} fields:{key:"beam:option:spark_master_url:v1" value:{string_value:"local[4]"}} fields:{key:"beam:option:spark_submit_uber_jar:v1" value:{bool_value:false}} fields:{key:"beam:option:streaming:v1" value:{bool_value:false}} fields:{key:"beam:option:type_check_additional:v1" value:{string_value:""}} fields:{key:"beam:option:type_check_strictness:v1" value:{string_value:"DEFAULT_TO_ANY"}} fields:{key:"beam:option:update:v1" value:{bool_value:false}}} retrieval_token:"__no_artifacts_staged__" logging_endpoint:{url:"localhost:45389"} artifact_endpoint:{url:"localhost:34731"} control_endpoint:{url:"localhost:41783"} dependencies:{type_urn:"beam:artifact:type:file:v1" type_payload:"\n\xb2\x01classpath://BEAM-PIPELINE/pipeline/artifacts/job-43df62c0-1fc5-4803-9fd1-f79edae75dbb/e8fb0840cad8d39c94306ae72f60eaef875fa8494e831c359990191c82ee2fa3-tfx_ephemeral-0.27.0.tar.gz" role_urn:"beam:artifact:role:staging_to:v1" role_payload:"\n\x1btfx_ephemeral-0.27.0.tar.gz"} dependencies:{type_urn:"beam:artifact:type:file:v1" type_payload:"\n\xa9\x01classpath://BEAM-PIPELINE/pipeline/artifacts/job-43df62c0-1fc5-4803-9fd1-f79edae75dbb/9d0ea0019a043f9037a05efac4457cb449632fec335676550dbbc67e7e3bf83f-extra_packages.txt" role_urn:"beam:artifact:role:staging_to:v1" role_payload:"\n\x12extra_packages.txt"}
beam-worker | 2021/02/11 08:27:51 Initializing python harness: /opt/apache/beam/boot --id=1-1 --logging_endpoint=localhost:45389 --artifact_endpoint=localhost:34731 --provision_endpoint=localhost:43605 --control_endpoint=localhost:41783
beam-worker | 2021/02/11 08:27:51 Found artifact: tfx_ephemeral-0.27.0.tar.gz
beam-worker | 2021/02/11 08:27:51 Found artifact: extra_packages.txt
beam-worker | 2021/02/11 08:27:51 Installing setup packages ...
beam-worker | 2021/02/11 08:27:51 Installing extra package: tfx_ephemeral-0.27.0.tar.gz
beam-worker | WARNING: You are using pip version 20.3.3; however, version 21.0.1 is available.
beam-worker | You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.
beam-worker | DEPRECATION: Source distribution is being reinstalled despite an installed package having the same name and version as the installed package. pip 21.1 will remove support for this functionality. A possible replacement is use --force-reinstall. You can find discussion regarding this at https://github.com/pypa/pip/issues/8711.
beam-worker | WARNING: You are using pip version 20.3.3; however, version 21.0.1 is available.
beam-worker | You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.
beam-worker | 2021/02/11 08:30:01 Executing: python -m apache_beam.runners.worker.sdk_worker_main
beam-worker | Traceback (most recent call last):
beam-worker | File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
beam-worker | File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
beam-worker | File "apache_beam/runners/common.py", line 1374, in apache_beam.runners.common._OutputProcessor.process_outputs
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1425, in process
beam-worker | initial_restriction = self.restriction_provider.initial_restriction(
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/io/iobase.py", line 1545, in initial_restriction
beam-worker | range_tracker = element_source.get_range_tracker(None, None)
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 210, in get_range_tracker
beam-worker | return self._get_concat_source().get_range_tracker(
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/options/value_provider.py", line 200, in _f
beam-worker | return fnc(self, *args, **kwargs)
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 145, in _get_concat_source
beam-worker | match_result = FileSystems.match([pattern])[0]
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystems.py", line 209, in match
beam-worker | return filesystem.match(patterns, limits)
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystem.py", line 765, in match
beam-worker | raise BeamIOError("Match operation failed", exceptions)
beam-worker | apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'s3://bucket_name/data_path/*': BeamIOError("List operation failed with exceptions {'s3://bucket_name/data_path/': ValueError('Must provide one of client or options')}")}
beam-worker |
beam-worker | During handling of the above exception, another exception occurred:
beam-worker |
beam-worker | Traceback (most recent call last):
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 289, in _execute
beam-worker | response = task()
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 362, in <lambda>
beam-worker | lambda: self.create_worker().do_instruction(request), request)
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 606, in do_instruction
beam-worker | return getattr(self, request_type)(
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 644, in process_bundle
beam-worker | bundle_processor.process_bundle(instruction_id))
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 999, in process_bundle
beam-worker | input_op_by_transform_id[element.transform_id].process_encoded(
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 228, in process_encoded
beam-worker | self.output(decoded_value)
beam-worker | File "apache_beam/runners/worker/operations.py", line 357, in apache_beam.runners.worker.operations.Operation.output
beam-worker | File "apache_beam/runners/worker/operations.py", line 359, in apache_beam.runners.worker.operations.Operation.output
beam-worker | File "apache_beam/runners/worker/operations.py", line 221, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
beam-worker | File "apache_beam/runners/worker/operations.py", line 718, in apache_beam.runners.worker.operations.DoOperation.process
beam-worker | File "apache_beam/runners/worker/operations.py", line 719, in apache_beam.runners.worker.operations.DoOperation.process
beam-worker | File "apache_beam/runners/common.py", line 1241, in apache_beam.runners.common.DoFnRunner.process
beam-worker | File "apache_beam/runners/common.py", line 1306, in apache_beam.runners.common.DoFnRunner._reraise_augmented
beam-worker | File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
beam-worker | File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
beam-worker | File "apache_beam/runners/common.py", line 1401, in apache_beam.runners.common._OutputProcessor.process_outputs
beam-worker | File "apache_beam/runners/worker/operations.py", line 221, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
beam-worker | File "apache_beam/runners/worker/operations.py", line 718, in apache_beam.runners.worker.operations.DoOperation.process
beam-worker | File "apache_beam/runners/worker/operations.py", line 719, in apache_beam.runners.worker.operations.DoOperation.process
beam-worker | File "apache_beam/runners/common.py", line 1241, in apache_beam.runners.common.DoFnRunner.process
beam-worker | File "apache_beam/runners/common.py", line 1321, in apache_beam.runners.common.DoFnRunner._reraise_augmented
beam-worker | File "/usr/local/lib/python3.8/site-packages/future/utils/__init__.py", line 446, in raise_with_traceback
beam-worker | raise exc.with_traceback(traceback)
beam-worker | File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
beam-worker | File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
beam-worker | File "apache_beam/runners/common.py", line 1374, in apache_beam.runners.common._OutputProcessor.process_outputs
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1425, in process
beam-worker | initial_restriction = self.restriction_provider.initial_restriction(
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/io/iobase.py", line 1545, in initial_restriction
beam-worker | range_tracker = element_source.get_range_tracker(None, None)
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 210, in get_range_tracker
beam-worker | return self._get_concat_source().get_range_tracker(
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/options/value_provider.py", line 200, in _f
beam-worker | return fnc(self, *args, **kwargs)
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 145, in _get_concat_source
beam-worker | match_result = FileSystems.match([pattern])[0]
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystems.py", line 209, in match
beam-worker | return filesystem.match(patterns, limits)
beam-worker | File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystem.py", line 765, in match
beam-worker | raise BeamIOError("Match operation failed", exceptions)
beam-worker | apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'s3://bucket_name/data_path/*': BeamIOError("List operation failed with exceptions {'s3://bucket_name/data_path/': ValueError('Must provide one of client or options')}")} [while running 'InputToRecord/_ImportSerializedRecord({'input_base': 's3://bucket_name/data_path', 'input_config': '{\n "splits": [\n {\n "name": "single_split",\n "pattern": "*"\n }\n ]\n}', 'output_config': '{\n "split_config": {\n "splits": [\n {\n "hash_buckets": 2,\n "name": "train"\n },\n {\n "hash_buckets": 1,\n "name": "eval"\n }\n ]\n }\n}', 'output_data_format': 6, 'custom_config': None, 'range_config': None, 'span': 0, 'version': None, 'input_fingerprint': 'split:single_split,num_files:1,total_bytes:24321563,xor_checksum:1612792982,sum_checksum:1612792982', '_beam_pipeline_args': ['--runner=FlinkRunner', '--flink_master=stand7:8081', '--environment_type=EXTERNAL', '--environment_config=127.0.0.1:50000', '--flink_version=1.10', '--flink_submit_uber_jar', '--s3_access_key=valid-access-key', '--s3_secret_access_key=valid-secret-key', '--s3_endpoint_url=http://192.168.0.117:49000', '--s3_region_name=us-east-1', '--s3_verify=false', '--extra_package=/tmp/tmpvy0mx3pm/build/tfx/dist/tfx_ephemeral-0.27.0.tar.gz', '--labels', 'tfx_executor=tfx-components-example_gen-import_example_gen-executor-executor', '--labels', 'tfx_py_version=3-8', '--labels', 'tfx_runner=local', '--labels', 'tfx_version=0-27-0']})/ReadFromTFRecord/Read/SDFBoundedSourceReader/ParDo(SDFBoundedSourceDoFn)/PairWithRestriction0'] with exceptions None
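That is the end of the worker-side (SDK harness) log. Before the driver-side log, a note on the inner exception: the nested shape above — a "Match operation failed" BeamIOError wrapping a "List operation failed" BeamIOError that carries ValueError('Must provide one of client or options') — appears to be what Beam's S3 filesystem produces when it is instantiated on the worker without any pipeline options, so the s3_* settings submitted with the job never reach the boto3 client. A hypothetical illustration follows (not from the gist, and relying on Beam-internal behaviour that may differ across versions); it assumes apache-beam[aws] is installed.

# Hypothetical illustration: creating the S3 filesystem with no pipeline
# options at all and matching any s3:// pattern yields the same nested error
# seen in the worker log above.
from apache_beam.io.aws.s3filesystem import S3FileSystem
from apache_beam.io.filesystem import BeamIOError

fs = S3FileSystem(pipeline_options=None)
try:
    fs.match(['s3://bucket_name/data_path/*'])
except BeamIOError as e:
    # Expected: Match operation failed ... ValueError('Must provide one of client or options')
    print(e)

The driver-side log from the TFX LocalDagRunner for the same run follows.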
DEBUG:absl:Processing input s3://bucket_name/data_path.
INFO:absl:select span and version = (0, None)
INFO:absl:latest span and version = (0, None)
DEBUG:absl:Resolved input artifacts are: {}
DEBUG:absl:Run context pipe-name already exists.
DEBUG:absl:ID of run context pipe-name is 1.
DEBUG:absl:Pipeline context [pipe-name : 1]
DEBUG:absl:ID of run context pipe-name.2021-02-11T11:27:15.999211 is 575.
DEBUG:absl:Pipeline run context [pipe-name.2021-02-11T11:27:15.999211 : 575]
DEBUG:absl:Prepared EXECUTION:
type_id: 3
last_known_state: RUNNING
properties {
key: "component_id"
value {
string_value: "ImportExampleGen"
}
}
properties {
key: "custom_config"
value {
string_value: "None"
}
}
properties {
key: "input_base"
value {
string_value: "s3://bucket_name/data_path"
}
}
properties {
key: "input_config"
value {
string_value: "{\n \"splits\": [\n {\n \"name\": \"single_split\",\n \"pattern\": \"*\"\n }\n ]\n}"
}
}
properties {
key: "input_fingerprint"
value {
string_value: "split:single_split,num_files:1,total_bytes:24321563,xor_checksum:1612792982,sum_checksum:1612792982"
}
}
properties {
key: "output_config"
value {
string_value: "{\n \"split_config\": {\n \"splits\": [\n {\n \"hash_buckets\": 2,\n \"name\": \"train\"\n },\n {\n \"hash_buckets\": 1,\n \"name\": \"eval\"\n }\n ]\n }\n}"
}
}
properties {
key: "output_data_format"
value {
string_value: "6"
}
}
properties {
key: "pipeline_name"
value {
string_value: "pipe-name"
}
}
properties {
key: "pipeline_root"
value {
string_value: "s3://bucket_name/pipeline_output/pipe-name"
}
}
properties {
key: "range_config"
value {
string_value: "None"
}
}
properties {
key: "run_id"
value {
string_value: "2021-02-11T11:27:15.999211"
}
}
properties {
key: "span"
value {
string_value: "0"
}
}
properties {
key: "state"
value {
string_value: "new"
}
}
properties {
key: "version"
value {
string_value: "None"
}
}
DEBUG:absl:Cached results not found, move on to new execution
DEBUG:absl:Creating output artifact uri s3://bucket_name/pipeline_output/pipe-name/ImportExampleGen/examples/192 as directory
DEBUG:absl:Output artifacts skeleton for the upcoming execution are: {'examples': [Artifact(artifact: uri: "s3://bucket_name/pipeline_output/pipe-name/ImportExampleGen/examples/192"
custom_properties {
key: "input_fingerprint"
value {
string_value: "split:single_split,num_files:1,total_bytes:24321563,xor_checksum:1612792982,sum_checksum:1612792982"
}
}
custom_properties {
key: "span"
value {
string_value: "0"
}
}
, artifact_type: name: "Examples"
properties {
key: "span"
value: INT
}
properties {
key: "split_names"
value: STRING
}
properties {
key: "version"
value: INT
}
)]}
DEBUG:absl:Execution properties for the upcoming execution are: {'input_base': 's3://bucket_name/data_path', 'input_config': '{\n "splits": [\n {\n "name": "single_split",\n "pattern": "*"\n }\n ]\n}', 'output_config': '{\n "split_config": {\n "splits": [\n {\n "hash_buckets": 2,\n "name": "train"\n },\n {\n "hash_buckets": 1,\n "name": "eval"\n }\n ]\n }\n}', 'output_data_format': 6, 'custom_config': None, 'range_config': None, 'span': 0, 'version': None, 'input_fingerprint': 'split:single_split,num_files:1,total_bytes:24321563,xor_checksum:1612792982,sum_checksum:1612792982'}
INFO:absl:Running executor for ImportExampleGen
INFO:absl:Attempting to infer TFX Python dependency for beam
INFO:absl:Copying all content from install dir /home/local/projects/py/project_name/venv/lib/python3.8/site-packages/tfx to temp dir /tmp/tmpvy0mx3pm/build/tfx
INFO:absl:Generating a temp setup file at /tmp/tmpvy0mx3pm/build/tfx/setup.py
INFO:absl:Creating temporary sdist package, logs available at /tmp/tmpvy0mx3pm/build/tfx/setup.log
INFO:absl:Added --extra_package=/tmp/tmpvy0mx3pm/build/tfx/dist/tfx_ephemeral-0.27.0.tar.gz to beam args
DEBUG:absl:Starting Executor execution.
DEBUG:absl:Inputs for Executor are: {}
DEBUG:absl:Outputs for Executor are: {"examples": [{"artifact": {"id": "233", "type_id": "5", "uri": "s3://bucket_name/pipeline_output/pipe-name/ImportExampleGen/examples/192", "custom_properties": {"span": {"string_value": "0"}, "input_fingerprint": {"string_value": "split:single_split,num_files:1,total_bytes:24321563,xor_checksum:1612792982,sum_checksum:1612792982"}}}, "artifact_type": {"id": "5", "name": "Examples", "properties": {"version": "INT", "split_names": "STRING", "span": "INT"}}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "Examples"}]}
DEBUG:absl:Execution properties for Executor are: {"input_base": "s3://bucket_name/data_path", "input_config": "{\n \"splits\": [\n {\n \"name\": \"single_split\",\n \"pattern\": \"*\"\n }\n ]\n}", "output_config": "{\n \"split_config\": {\n \"splits\": [\n {\n \"hash_buckets\": 2,\n \"name\": \"train\"\n },\n {\n \"hash_buckets\": 1,\n \"name\": \"eval\"\n }\n ]\n }\n}", "output_data_format": 6, "custom_config": null, "range_config": null, "span": 0, "version": null, "input_fingerprint": "split:single_split,num_files:1,total_bytes:24321563,xor_checksum:1612792982,sum_checksum:1612792982"}
INFO:absl:Generating examples.
INFO:absl:Reading input TFRecord data s3://bucket_name/data_path/*.
WARNING:root:Make sure that locally built Python SDK docker image has Python 3.8 interpreter.
ERROR:root:java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error received from SDK harness for instruction 7: Traceback (most recent call last):
File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1374, in apache_beam.runners.common._OutputProcessor.process_outputs
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1425, in process
initial_restriction = self.restriction_provider.initial_restriction(
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/iobase.py", line 1545, in initial_restriction
range_tracker = element_source.get_range_tracker(None, None)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 210, in get_range_tracker
return self._get_concat_source().get_range_tracker(
File "/usr/local/lib/python3.8/site-packages/apache_beam/options/value_provider.py", line 200, in _f
return fnc(self, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 145, in _get_concat_source
match_result = FileSystems.match([pattern])[0]
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystems.py", line 209, in match
return filesystem.match(patterns, limits)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystem.py", line 765, in match
raise BeamIOError("Match operation failed", exceptions)
apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'s3://bucket_name/data_path/*': BeamIOError("List operation failed with exceptions {'s3://bucket_name/data_path/': ValueError('Must provide one of client or options')}")}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 289, in _execute
response = task()
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 362, in <lambda>
lambda: self.create_worker().do_instruction(request), request)
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 606, in do_instruction
return getattr(self, request_type)(
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 644, in process_bundle
bundle_processor.process_bundle(instruction_id))
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 999, in process_bundle
input_op_by_transform_id[element.transform_id].process_encoded(
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 228, in process_encoded
self.output(decoded_value)
File "apache_beam/runners/worker/operations.py", line 357, in apache_beam.runners.worker.operations.Operation.output
File "apache_beam/runners/worker/operations.py", line 359, in apache_beam.runners.worker.operations.Operation.output
File "apache_beam/runners/worker/operations.py", line 221, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 718, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 719, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 1241, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 1306, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1401, in apache_beam.runners.common._OutputProcessor.process_outputs
File "apache_beam/runners/worker/operations.py", line 221, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 718, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 719, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 1241, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 1321, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "/usr/local/lib/python3.8/site-packages/future/utils/__init__.py", line 446, in raise_with_traceback
raise exc.with_traceback(traceback)
File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1374, in apache_beam.runners.common._OutputProcessor.process_outputs
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1425, in process
initial_restriction = self.restriction_provider.initial_restriction(
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/iobase.py", line 1545, in initial_restriction
range_tracker = element_source.get_range_tracker(None, None)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 210, in get_range_tracker
return self._get_concat_source().get_range_tracker(
File "/usr/local/lib/python3.8/site-packages/apache_beam/options/value_provider.py", line 200, in _f
return fnc(self, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 145, in _get_concat_source
match_result = FileSystems.match([pattern])[0]
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystems.py", line 209, in match
return filesystem.match(patterns, limits)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystem.py", line 765, in match
raise BeamIOError("Match operation failed", exceptions)
apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'s3://bucket_name/data_path/*': BeamIOError("List operation failed with exceptions {'s3://bucket_name/data_path/': ValueError('Must provide one of client or options')}")} [while running 'InputToRecord/_ImportSerializedRecord({'input_base': 's3://bucket_name/data_path', 'input_config': '{\n "splits": [\n {\n "name": "single_split",\n "pattern": "*"\n }\n ]\n}', 'output_config': '{\n "split_config": {\n "splits": [\n {\n "hash_buckets": 2,\n "name": "train"\n },\n {\n "hash_buckets": 1,\n "name": "eval"\n }\n ]\n }\n}', 'output_data_format': 6, 'custom_config': None, 'range_config': None, 'span': 0, 'version': None, 'input_fingerprint': 'split:single_split,num_files:1,total_bytes:24321563,xor_checksum:1612792982,sum_checksum:1612792982', '_beam_pipeline_args': ['--runner=FlinkRunner', '--flink_master=flink-cluster:8081', '--environment_type=EXTERNAL', '--environment_config=127.0.0.1:50000', '--flink_version=1.10', '--flink_submit_uber_jar', '--s3_access_key=valid-access-key', '--s3_secret_access_key=valid-secret-key', '--s3_endpoint_url=http://192.168.0.117:49000', '--s3_region_name=us-east-1', '--s3_verify=false', '--extra_package=/tmp/tmpvy0mx3pm/build/tfx/dist/tfx_ephemeral-0.27.0.tar.gz', '--labels', 'tfx_executor=tfx-components-example_gen-import_example_gen-executor-executor', '--labels', 'tfx_py_version=3-8', '--labels', 'tfx_runner=local', '--labels', 'tfx_version=0-27-0']})/ReadFromTFRecord/Read/SDFBoundedSourceReader/ParDo(SDFBoundedSourceDoFn)/PairWithRestriction0'] with exceptions None
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at org.apache.beam.sdk.util.MoreFutures.get(MoreFutures.java:60)
at org.apache.beam.runners.fnexecution.control.SdkHarnessClient$BundleProcessor$ActiveBundle.close(SdkHarnessClient.java:504)
at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory$1.close(DefaultJobBundleFactory.java:555)
at org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.$closeResource(FlinkExecutableStageFunction.java:268)
at org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.mapPartition(FlinkExecutableStageFunction.java:268)
at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:504)
at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:369)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:708)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:533)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Error received from SDK harness for instruction 7: Traceback (most recent call last):
File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1374, in apache_beam.runners.common._OutputProcessor.process_outputs
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1425, in process
initial_restriction = self.restriction_provider.initial_restriction(
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/iobase.py", line 1545, in initial_restriction
range_tracker = element_source.get_range_tracker(None, None)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 210, in get_range_tracker
return self._get_concat_source().get_range_tracker(
File "/usr/local/lib/python3.8/site-packages/apache_beam/options/value_provider.py", line 200, in _f
return fnc(self, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 145, in _get_concat_source
match_result = FileSystems.match([pattern])[0]
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystems.py", line 209, in match
return filesystem.match(patterns, limits)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystem.py", line 765, in match
raise BeamIOError("Match operation failed", exceptions)
apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'s3://bucket_name/data_path/*': BeamIOError("List operation failed with exceptions {'s3://bucket_name/data_path/': ValueError('Must provide one of client or options')}")}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 289, in _execute
response = task()
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 362, in <lambda>
lambda: self.create_worker().do_instruction(request), request)
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 606, in do_instruction
return getattr(self, request_type)(
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 644, in process_bundle
bundle_processor.process_bundle(instruction_id))
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 999, in process_bundle
input_op_by_transform_id[element.transform_id].process_encoded(
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 228, in process_encoded
self.output(decoded_value)
File "apache_beam/runners/worker/operations.py", line 357, in apache_beam.runners.worker.operations.Operation.output
File "apache_beam/runners/worker/operations.py", line 359, in apache_beam.runners.worker.operations.Operation.output
File "apache_beam/runners/worker/operations.py", line 221, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 718, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 719, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 1241, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 1306, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1401, in apache_beam.runners.common._OutputProcessor.process_outputs
File "apache_beam/runners/worker/operations.py", line 221, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 718, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 719, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 1241, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 1321, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "/usr/local/lib/python3.8/site-packages/future/utils/__init__.py", line 446, in raise_with_traceback
raise exc.with_traceback(traceback)
File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1374, in apache_beam.runners.common._OutputProcessor.process_outputs
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1425, in process
initial_restriction = self.restriction_provider.initial_restriction(
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/iobase.py", line 1545, in initial_restriction
range_tracker = element_source.get_range_tracker(None, None)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 210, in get_range_tracker
return self._get_concat_source().get_range_tracker(
File "/usr/local/lib/python3.8/site-packages/apache_beam/options/value_provider.py", line 200, in _f
return fnc(self, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 145, in _get_concat_source
match_result = FileSystems.match([pattern])[0]
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystems.py", line 209, in match
return filesystem.match(patterns, limits)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystem.py", line 765, in match
raise BeamIOError("Match operation failed", exceptions)
apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'s3://bucket_name/data_path/*': BeamIOError("List operation failed with exceptions {'s3://bucket_name/data_path/': ValueError('Must provide one of client or options')}")} [while running 'InputToRecord/_ImportSerializedRecord({'input_base': 's3://bucket_name/data_path', 'input_config': '{\n "splits": [\n {\n "name": "single_split",\n "pattern": "*"\n }\n ]\n}', 'output_config': '{\n "split_config": {\n "splits": [\n {\n "hash_buckets": 2,\n "name": "train"\n },\n {\n "hash_buckets": 1,\n "name": "eval"\n }\n ]\n }\n}', 'output_data_format': 6, 'custom_config': None, 'range_config': None, 'span': 0, 'version': None, 'input_fingerprint': 'split:single_split,num_files:1,total_bytes:24321563,xor_checksum:1612792982,sum_checksum:1612792982', '_beam_pipeline_args': ['--runner=FlinkRunner', '--flink_master=flink-cluster:8081', '--environment_type=EXTERNAL', '--environment_config=127.0.0.1:50000', '--flink_version=1.10', '--flink_submit_uber_jar', '--s3_access_key=valid-access-key', '--s3_secret_access_key=valid-secret-key', '--s3_endpoint_url=http://192.168.0.117:49000', '--s3_region_name=us-east-1', '--s3_verify=false', '--extra_package=/tmp/tmpvy0mx3pm/build/tfx/dist/tfx_ephemeral-0.27.0.tar.gz', '--labels', 'tfx_executor=tfx-components-example_gen-import_example_gen-executor-executor', '--labels', 'tfx_py_version=3-8', '--labels', 'tfx_runner=local', '--labels', 'tfx_version=0-27-0']})/ReadFromTFRecord/Read/SDFBoundedSourceReader/ParDo(SDFBoundedSourceDoFn)/PairWithRestriction0'] with exceptions None
at org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onNext(FnApiControlClient.java:180)
at org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onNext(FnApiControlClient.java:160)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:251)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:309)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:292)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:782)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
... 1 more
Traceback (most recent call last):
File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "/home/local/projects/py/project_name/venv/lib/python3.8/site-packages/tfx/utils/telemetry_utils.py", line 51, in scoped_labels
yield
File "/home/local/projects/py/project_name/venv/lib/python3.8/site-packages/tfx/orchestration/local/local_dag_runner.py", line 92, in run
component_launcher.launch()
File "/home/local/projects/py/project_name/venv/lib/python3.8/site-packages/tfx/orchestration/launcher/base_component_launcher.py", line 206, in launch
self._run_executor(execution_decision.execution_id,
File "/home/local/projects/py/project_name/venv/lib/python3.8/site-packages/tfx/orchestration/launcher/in_process_component_launcher.py", line 71, in _run_executor
executor.Do(
File "/home/local/projects/py/project_name/venv/lib/python3.8/site-packages/tfx/components/example_gen/base_example_gen_executor.py", line 300, in Do
(example_split
File "/home/local/projects/py/project_name/venv/lib/python3.8/site-packages/apache_beam/pipeline.py", line 583, in __exit__
self.result.wait_until_finish()
File "/home/local/projects/py/project_name/venv/lib/python3.8/site-packages/apache_beam/runners/portability/portable_runner.py", line 581, in wait_until_finish
raise self._runtime_exception
RuntimeError: Pipeline job-43df62c0-1fc5-4803-9fd1-f79edae75dbb failed in state FAILED: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error received from SDK harness for instruction 7: Traceback (most recent call last):
File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1374, in apache_beam.runners.common._OutputProcessor.process_outputs
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1425, in process
initial_restriction = self.restriction_provider.initial_restriction(
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/iobase.py", line 1545, in initial_restriction
range_tracker = element_source.get_range_tracker(None, None)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 210, in get_range_tracker
return self._get_concat_source().get_range_tracker(
File "/usr/local/lib/python3.8/site-packages/apache_beam/options/value_provider.py", line 200, in _f
return fnc(self, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 145, in _get_concat_source
match_result = FileSystems.match([pattern])[0]
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystems.py", line 209, in match
return filesystem.match(patterns, limits)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystem.py", line 765, in match
raise BeamIOError("Match operation failed", exceptions)
apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'s3://bucket_name/data_path/*': BeamIOError("List operation failed with exceptions {'s3://bucket_name/data_path/': ValueError('Must provide one of client or options')}")}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 289, in _execute
response = task()
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 362, in <lambda>
lambda: self.create_worker().do_instruction(request), request)
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 606, in do_instruction
return getattr(self, request_type)(
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 644, in process_bundle
bundle_processor.process_bundle(instruction_id))
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 999, in process_bundle
input_op_by_transform_id[element.transform_id].process_encoded(
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 228, in process_encoded
self.output(decoded_value)
File "apache_beam/runners/worker/operations.py", line 357, in apache_beam.runners.worker.operations.Operation.output
File "apache_beam/runners/worker/operations.py", line 359, in apache_beam.runners.worker.operations.Operation.output
File "apache_beam/runners/worker/operations.py", line 221, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 718, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 719, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 1241, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 1306, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1401, in apache_beam.runners.common._OutputProcessor.process_outputs
File "apache_beam/runners/worker/operations.py", line 221, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 718, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 719, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 1241, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 1321, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "/usr/local/lib/python3.8/site-packages/future/utils/__init__.py", line 446, in raise_with_traceback
raise exc.with_traceback(traceback)
File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1374, in apache_beam.runners.common._OutputProcessor.process_outputs
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1425, in process
initial_restriction = self.restriction_provider.initial_restriction(
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/iobase.py", line 1545, in initial_restriction
range_tracker = element_source.get_range_tracker(None, None)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 210, in get_range_tracker
return self._get_concat_source().get_range_tracker(
File "/usr/local/lib/python3.8/site-packages/apache_beam/options/value_provider.py", line 200, in _f
return fnc(self, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 145, in _get_concat_source
match_result = FileSystems.match([pattern])[0]
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystems.py", line 209, in match
return filesystem.match(patterns, limits)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystem.py", line 765, in match
raise BeamIOError("Match operation failed", exceptions)
apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'s3://bucket_name/data_path/*': BeamIOError("List operation failed with exceptions {'s3://bucket_name/data_path/': ValueError('Must provide one of client or options')}")} [while running 'InputToRecord/_ImportSerializedRecord({'input_base': 's3://bucket_name/data_path', 'input_config': '{\n "splits": [\n {\n "name": "single_split",\n "pattern": "*"\n }\n ]\n}', 'output_config': '{\n "split_config": {\n "splits": [\n {\n "hash_buckets": 2,\n "name": "train"\n },\n {\n "hash_buckets": 1,\n "name": "eval"\n }\n ]\n }\n}', 'output_data_format': 6, 'custom_config': None, 'range_config': None, 'span': 0, 'version': None, 'input_fingerprint': 'split:single_split,num_files:1,total_bytes:24321563,xor_checksum:1612792982,sum_checksum:1612792982', '_beam_pipeline_args': ['--runner=FlinkRunner', '--flink_master=flink-cluster:8081', '--environment_type=EXTERNAL', '--environment_config=127.0.0.1:50000', '--flink_version=1.10', '--flink_submit_uber_jar', '--s3_access_key=valid-access-key', '--s3_secret_access_key=valid-secret-key', '--s3_endpoint_url=http://192.168.0.117:49000', '--s3_region_name=us-east-1', '--s3_verify=false', '--extra_package=/tmp/tmpvy0mx3pm/build/tfx/dist/tfx_ephemeral-0.27.0.tar.gz', '--labels', 'tfx_executor=tfx-components-example_gen-import_example_gen-executor-executor', '--labels', 'tfx_py_version=3-8', '--labels', 'tfx_runner=local', '--labels', 'tfx_version=0-27-0']})/ReadFromTFRecord/Read/SDFBoundedSourceReader/ParDo(SDFBoundedSourceDoFn)/PairWithRestriction0'] with exceptions None
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at org.apache.beam.sdk.util.MoreFutures.get(MoreFutures.java:60)
at org.apache.beam.runners.fnexecution.control.SdkHarnessClient$BundleProcessor$ActiveBundle.close(SdkHarnessClient.java:504)
at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory$1.close(DefaultJobBundleFactory.java:555)
at org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.$closeResource(FlinkExecutableStageFunction.java:268)
at org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.mapPartition(FlinkExecutableStageFunction.java:268)
at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:504)
at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:369)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:708)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:533)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Error received from SDK harness for instruction 7: Traceback (most recent call last):
File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1374, in apache_beam.runners.common._OutputProcessor.process_outputs
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1425, in process
initial_restriction = self.restriction_provider.initial_restriction(
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/iobase.py", line 1545, in initial_restriction
range_tracker = element_source.get_range_tracker(None, None)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 210, in get_range_tracker
return self._get_concat_source().get_range_tracker(
File "/usr/local/lib/python3.8/site-packages/apache_beam/options/value_provider.py", line 200, in _f
return fnc(self, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 145, in _get_concat_source
match_result = FileSystems.match([pattern])[0]
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystems.py", line 209, in match
return filesystem.match(patterns, limits)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystem.py", line 765, in match
raise BeamIOError("Match operation failed", exceptions)
apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'s3://bucket_name/data_path/*': BeamIOError("List operation failed with exceptions {'s3://bucket_name/data_path/': ValueError('Must provide one of client or options')}")}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 289, in _execute
response = task()
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 362, in <lambda>
lambda: self.create_worker().do_instruction(request), request)
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 606, in do_instruction
return getattr(self, request_type)(
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 644, in process_bundle
bundle_processor.process_bundle(instruction_id))
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 999, in process_bundle
input_op_by_transform_id[element.transform_id].process_encoded(
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 228, in process_encoded
self.output(decoded_value)
File "apache_beam/runners/worker/operations.py", line 357, in apache_beam.runners.worker.operations.Operation.output
File "apache_beam/runners/worker/operations.py", line 359, in apache_beam.runners.worker.operations.Operation.output
File "apache_beam/runners/worker/operations.py", line 221, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 718, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 719, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 1241, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 1306, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1401, in apache_beam.runners.common._OutputProcessor.process_outputs
File "apache_beam/runners/worker/operations.py", line 221, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 718, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 719, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 1241, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 1321, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "/usr/local/lib/python3.8/site-packages/future/utils/__init__.py", line 446, in raise_with_traceback
raise exc.with_traceback(traceback)
File "apache_beam/runners/common.py", line 1239, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 587, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1374, in apache_beam.runners.common._OutputProcessor.process_outputs
File "/usr/local/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1425, in process
initial_restriction = self.restriction_provider.initial_restriction(
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/iobase.py", line 1545, in initial_restriction
range_tracker = element_source.get_range_tracker(None, None)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 210, in get_range_tracker
return self._get_concat_source().get_range_tracker(
File "/usr/local/lib/python3.8/site-packages/apache_beam/options/value_provider.py", line 200, in _f
return fnc(self, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py", line 145, in _get_concat_source
match_result = FileSystems.match([pattern])[0]
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystems.py", line 209, in match
return filesystem.match(patterns, limits)
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/filesystem.py", line 765, in match
raise BeamIOError("Match operation failed", exceptions)
apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'s3://bucket_name/data_path/*': BeamIOError("List operation failed with exceptions {'s3://bucket_name/data_path/': ValueError('Must provide one of client or options')}")} [while running 'InputToRecord/_ImportSerializedRecord({'input_base': 's3://bucket_name/data_path', 'input_config': '{\n "splits": [\n {\n "name": "single_split",\n "pattern": "*"\n }\n ]\n}', 'output_config': '{\n "split_config": {\n "splits": [\n {\n "hash_buckets": 2,\n "name": "train"\n },\n {\n "hash_buckets": 1,\n "name": "eval"\n }\n ]\n }\n}', 'output_data_format': 6, 'custom_config': None, 'range_config': None, 'span': 0, 'version': None, 'input_fingerprint': 'split:single_split,num_files:1,total_bytes:24321563,xor_checksum:1612792982,sum_checksum:1612792982', '_beam_pipeline_args': ['--runner=FlinkRunner', '--flink_master=flink-cluster:8081', '--environment_type=EXTERNAL', '--environment_config=127.0.0.1:50000', '--flink_version=1.10', '--flink_submit_uber_jar', '--s3_access_key=valid-access-key', '--s3_secret_access_key=valid-secret-key', '--s3_endpoint_url=http://192.168.0.117:49000', '--s3_region_name=us-east-1', '--s3_verify=false', '--extra_package=/tmp/tmpvy0mx3pm/build/tfx/dist/tfx_ephemeral-0.27.0.tar.gz', '--labels', 'tfx_executor=tfx-components-example_gen-import_example_gen-executor-executor', '--labels', 'tfx_py_version=3-8', '--labels', 'tfx_runner=local', '--labels', 'tfx_version=0-27-0']})/ReadFromTFRecord/Read/SDFBoundedSourceReader/ParDo(SDFBoundedSourceDoFn)/PairWithRestriction0'] with exceptions None
at org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onNext(FnApiControlClient.java:180)
at org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onNext(FnApiControlClient.java:160)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:251)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:309)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:292)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:782)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
... 1 more
python-BaseException
Process finished with exit code 130 (interrupted by signal 2: SIGINT)
--runner=FlinkRunner
--flink_master=flink-cluster:8081
--environment_type=EXTERNAL
--environment_config=127.0.0.1:50000 # at least one beam worker pool on each flink taskmanager
--flink_version=1.10
--flink_submit_uber_jar
--s3_access_key=valid-access-key
--s3_secret_access_key=valid-secret-key
--s3_endpoint_url=http://192.168.0.117:49000
--s3_region_name=us-east-1
--s3_verify=false
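These are the Beam pipeline args the TFX driver passed to the component; they appear verbatim in the _beam_pipeline_args entry of the failing transform's name above. The comment on --environment_config means a Beam Python SDK worker pool (the beam-worker container whose log opens this gist) should be listening on port 50000 on every Flink TaskManager host. For context, here is a hypothetical sketch of how such flags are typically wired into a TFX 0.27 pipeline through beam_pipeline_args and run with the LocalDagRunner seen in the driver traceback; the component list, metadata store path and exact keyword names are assumptions, not taken from the gist.

# Hypothetical wiring sketch (not from the gist): passing these flags to a
# TFX 0.27 pipeline so that ImportExampleGen executes on the Flink-backed
# Beam runner. Placeholder names (pipe-name, bucket_name, the MLMD path)
# either mirror the log above or are assumptions.
from tfx.components import ImportExampleGen
from tfx.orchestration import metadata, pipeline
from tfx.orchestration.local.local_dag_runner import LocalDagRunner

beam_pipeline_args = [
    '--runner=FlinkRunner',
    '--flink_master=flink-cluster:8081',
    '--environment_type=EXTERNAL',
    '--environment_config=127.0.0.1:50000',
    '--flink_version=1.10',
    '--flink_submit_uber_jar',
    '--s3_access_key=valid-access-key',
    '--s3_secret_access_key=valid-secret-key',
    '--s3_endpoint_url=http://192.168.0.117:49000',
    '--s3_region_name=us-east-1',
    '--s3_verify=false',
]

example_gen = ImportExampleGen(input_base='s3://bucket_name/data_path')

tfx_pipeline = pipeline.Pipeline(
    pipeline_name='pipe-name',
    pipeline_root='s3://bucket_name/pipeline_output/pipe-name',
    components=[example_gen],
    metadata_connection_config=metadata.sqlite_metadata_connection_config(
        '/tmp/tfx_metadata.db'),  # assumption: local SQLite MLMD store
    beam_pipeline_args=beam_pipeline_args,
)

LocalDagRunner().run(tfx_pipeline)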