Skip to content

Instantly share code, notes, and snippets.

@yuwtennis
Created August 15, 2021 16:23
Show Gist options
  • Save yuwtennis/2e4c13c79f71e8f713e947955115b3e2 to your computer and use it in GitHub Desktop.
Save yuwtennis/2e4c13c79f71e8f713e947955115b3e2 to your computer and use it in GitHub Desktop.
SDK client for spark runner
export SDK_HARNESS_ENDPOINT=localhost:50000
export JOB_SERVICE_ENDPOINT=localhost:8099
import os
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.pipeline_options import SetupOptions
from apache_beam.io.textio import WriteToText
def main():
sdk_harness_endpoint = os.getenv('SDK_HARNESS_ENDPOINT')
job_service_endpoint = os.getenv('JOB_SERVICE_ENDPOINT')
options = PipelineOptions([
"--runner=PortableRunner",
"--environment_config="+sdk_harness_endpoint,
"--environment_type=EXTERNAL",
"--job_endpoint="+job_service_endpoint
])
with beam.Pipeline(options=options) as p:
lines = ( p | beam.Create(['Hello World.', 'Apache beam']) )
# Write to local filesystem
( lines | WriteToText('/tmp/sample.txt') )
if __name__ == "__main__":
main()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment