@jonstefansson
Created January 7, 2019 22:01
Flatten BigQuery JSON schema
import json

import click


@click.command()
@click.argument("input_source", type=click.File("rt"))
def flatten(input_source):
    """
    Creates a flattened representation of a BigQuery JSON schema.

    \b
    :param input_source: A JSON schema file, or "-" for STDIN
    :return: Streams flattened data to STDOUT
    """
    schema = json.load(input_source)
    _process_array(schema)


def _process_array(ary, parent=None):
    # Walk the fields depth first, printing one line per field.
    for element in ary:
        # Prefix nested fields with the dotted path of their parent record.
        name = '.'.join([i for i in [parent, element["name"]] if i is not None])
        # "mode" is optional in a BigQuery schema and defaults to NULLABLE.
        mode = element.get("mode", "NULLABLE")
        print("{0:<65s} {1:<12s} {2:s}".format(name, element["type"], mode))
        if element["type"] == "RECORD":
            _process_array(element["fields"], parent=name)


if __name__ == "__main__":
    flatten()
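
The recursion above can be tried without click or a schema file. The following is a minimal sketch, not part of the original gist: `SAMPLE_SCHEMA` is an invented schema and `iter_flattened` is a hypothetical helper that yields rows instead of printing them, which makes the dotted-name logic easier to inspect.

```python
import json

# Hypothetical sample schema, invented purely for illustration.
SAMPLE_SCHEMA = json.loads("""
[
  {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
  {"name": "user", "type": "RECORD", "mode": "NULLABLE", "fields": [
    {"name": "email", "type": "STRING", "mode": "NULLABLE"},
    {"name": "tags", "type": "STRING", "mode": "REPEATED"}
  ]}
]
""")


def iter_flattened(fields, parent=None):
    """Yield (dotted_name, type, mode) for every field, depth first."""
    for field in fields:
        # Nested fields get their parent's dotted path as a prefix.
        name = field["name"] if parent is None else parent + "." + field["name"]
        # Assumption: a missing "mode" is treated as NULLABLE.
        yield name, field["type"], field.get("mode", "NULLABLE")
        if field["type"] == "RECORD":
            yield from iter_flattened(field.get("fields", []), parent=name)


rows = list(iter_flattened(SAMPLE_SCHEMA))
for name, field_type, mode in rows:
    # Same column layout as the script above.
    print("{0:<65s} {1:<12s} {2:s}".format(name, field_type, mode))
```

Each nested field appears under its full path (for example `user.email`), directly after the RECORD that contains it.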