@aahmed-se
Last active February 16, 2019 00:22
Bigquery

Introduction

This is a usage and design summary of the pulsar-io-bigquery sink.

Parameters

This is the current list of parameters.

param name                      description
credentials_file_path           Path to the BigQuery JSON key file
project_id                      BigQuery project ID
topic_data_set                  Topic-to-dataset map, e.g. "topic1:dataset1,topic2:dataset2"
topic_table_set                 Topic-to-table map, e.g. "topic1:table_tag1,topic2:table_tag2"
add_insert_timestamp            Adds a timestamp column to each row
time_stamp_column_name          Name of the timestamp column; default is "sink_timestamp"
useMessageTimeDatePartitioning  Use time/date partitioning

Design

The sink requires a GCP JSON credentials file to initialize. It can also route messages to different datasets and tables based on the topic maps (topic_data_set and topic_table_set).
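The topic map format used by topic_data_set and topic_table_set can be illustrated with a small parser. This is a minimal sketch, not the sink's actual code: the class name TopicMapParser and its parse method are hypothetical, and splitting each entry on its last colon is an assumption made so that fully qualified topic names such as persistent://public/default/bigquery-data still parse.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not taken from the sink's source.
final class TopicMapParser {
    private TopicMapParser() {}

    // Parses "topic1:dataset1,topic2:dataset2" into an insertion-ordered map
    // from topic name to target (dataset or table).
    static Map<String, String> parse(String spec) {
        Map<String, String> result = new LinkedHashMap<>();
        if (spec == null || spec.trim().isEmpty()) {
            return result;
        }
        for (String rawEntry : spec.split(",")) {
            String entry = rawEntry.trim();
            // Assumption: split on the LAST colon so topic names that contain
            // colons (e.g. "persistent://...") stay intact.
            int idx = entry.lastIndexOf(':');
            if (idx <= 0 || idx == entry.length() - 1) {
                throw new IllegalArgumentException(
                        "Expected 'topic:target' but got '" + entry + "'");
            }
            String topic = entry.substring(0, idx).trim();
            String target = entry.substring(idx + 1).trim();
            if (topic.isEmpty() || target.isEmpty()) {
                throw new IllegalArgumentException(
                        "Expected 'topic:target' but got '" + entry + "'");
            }
            result.put(topic, target);
        }
        return result;
    }
}
```

With such a helper, routing a message would amount to looking up its topic in the two parsed maps to find the target dataset and table. How the real sink handles a topic missing from either map is not described here.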

Sample local run command

pulsar-admin sink localrun \
--archive ./pulsar-google-nar-0.0.1.nar \
--tenant public \
--namespace default \
--name bigquery-sink \
--inputs bigquery-data \
--sinkConfigFile ~/bigquery-sink.yaml

Sample config yaml

configs:
  credentials_file_path: "/tmp/kubernetes-34c5c20a8e3e.json"
  project_id: "sample-project-170720"
  topic_data_set: "bigquery-data:test1"
  topic_table_set: "bigquery-data:test_table1"
  add_insert_timestamp: "true"
  time_stamp_column_name: "inserted_timestamp"
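The entries under configs reach a Pulsar sink as a map of keys to values. The sketch below shows one way such a map could be turned into typed settings. It is an illustration under assumptions, not the sink's implementation: the class SinkConfigSketch is hypothetical, treating project_id as required is an assumption, and so is defaulting add_insert_timestamp to false when it is absent.

```java
import java.util.Map;

// Hypothetical typed view over the "configs" map; names are illustrative.
final class SinkConfigSketch {
    final String credentialsFilePath;
    final String projectId;
    final boolean addInsertTimestamp;
    final String timestampColumnName;

    private SinkConfigSketch(String credentialsFilePath, String projectId,
                             boolean addInsertTimestamp, String timestampColumnName) {
        this.credentialsFilePath = credentialsFilePath;
        this.projectId = projectId;
        this.addInsertTimestamp = addInsertTimestamp;
        this.timestampColumnName = timestampColumnName;
    }

    static SinkConfigSketch fromMap(Map<String, Object> configs) {
        return new SinkConfigSketch(
                required(configs, "credentials_file_path"),
                required(configs, "project_id"),
                // YAML values such as "true" arrive as strings, so parse them.
                Boolean.parseBoolean(optional(configs, "add_insert_timestamp", "false")),
                optional(configs, "time_stamp_column_name", "sink_timestamp"));
    }

    // Returns the trimmed value or fails fast when the key is missing or blank.
    private static String required(Map<String, Object> configs, String key) {
        Object value = configs.get(key);
        if (value == null || value.toString().trim().isEmpty()) {
            throw new IllegalArgumentException("Missing required config: " + key);
        }
        return value.toString().trim();
    }

    // Returns the trimmed value, or the fallback when the key is missing or blank.
    private static String optional(Map<String, Object> configs, String key, String fallback) {
        Object value = configs.get(key);
        if (value == null || value.toString().trim().isEmpty()) {
            return fallback;
        }
        return value.toString().trim();
    }
}
```

Failing fast on a missing credentials path matches the design note that the sink needs the credentials file to initialize.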

Schema

No schema validation is currently performed, and there is no integration with the Pulsar or BigQuery schema registry at this time.

An option is provided to add a timestamp column. When add_insert_timestamp is enabled, the sink adds a column to each row (named by time_stamp_column_name) holding a UTC timestamp generated in Java before the insertion request is made.
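The timestamp step can be sketched as follows. This is a hedged illustration, not the sink's code: the class InsertTimestamp and the method withTimestamp are hypothetical, the row is modeled as a plain map, and encoding the value as an ISO-8601 string is an assumption, since the representation the sink actually sends to BigQuery is not stated here.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper showing how a UTC timestamp column could be added to a row.
final class InsertTimestamp {
    static final String DEFAULT_COLUMN = "sink_timestamp";

    private InsertTimestamp() {}

    // Returns a copy of the row with the timestamp column added; the input
    // row is left unmodified. Falls back to the default column name when
    // none is configured.
    static Map<String, Object> withTimestamp(Map<String, Object> row,
                                             String columnName,
                                             Instant now) {
        String column = (columnName == null || columnName.isEmpty())
                ? DEFAULT_COLUMN
                : columnName;
        Map<String, Object> copy = new HashMap<>(row);
        // Instant.toString() yields ISO-8601 in UTC, e.g. 2019-02-16T00:22:00Z.
        copy.put(column, now.toString());
        return copy;
    }
}
```

Passing the Instant in as a parameter, rather than calling Instant.now() inside the method, keeps the helper deterministic in tests; a caller would pass Instant.now() just before building the insert request.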

Error Management

TODO
