
KristoR / databricks_all_columns.py
Created April 28, 2021 11:44
This script creates a DataFrame listing all databases, tables, and columns in Databricks, similar to the INFORMATION_SCHEMA views in regular SQL databases.
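For orientation, here is a minimal sketch of how such a listing can be assembled from the Spark catalog API. The loop structure and column names below are illustrative assumptions, not the gist's own code; nested struct fields additionally need the recursive helper that the script defines next.

# Illustrative sketch (assumption): enumerate databases, tables and top-level
# columns via the Spark catalog API, then collect them into a DataFrame.
rows = []
for db in spark.catalog.listDatabases():
    for tbl in spark.catalog.listTables(db.name):
        for field in spark.table(f"{db.name}.{tbl.name}").schema.fields:
            rows.append((db.name, tbl.name, field.name))

columns_df = spark.createDataFrame(rows, ["database", "table", "column"])

The gist's implementation begins below with a recursive helper that flattens nested struct fields into dot-notation column names.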
from pyspark.sql.types import StructType

# get field name from schema (recursive for getting nested values)
def get_schema_field_name(field, parent=None):
    if type(field.dataType) == StructType:
        if parent is None:
            prt = field.name
        else:
            prt = parent + "." + field.name  # using dot notation
        res = []