@nmukerje
Created October 31, 2020 02:22
## Convert a StructType column to a MapType column.
## Useful when you want to move all dynamic fields of a schema within a StructType column into a single MapType column.
import json

from pyspark.sql.functions import to_json, udf
from pyspark.sql.types import MapType, StringType


def toMap(d):
    """Parse a JSON string into a dict; return None for null or empty input."""
    if d:
        return json.loads(d)
    return None


# UDF returns a map of string keys to string values
map_udf = udf(toMap, MapType(StringType(), StringType()))

# `df` is an existing DataFrame with a StructType column named 'structtype_col'.
# Serialize the struct to JSON, parse it into a map, then drop the intermediate column.
df = df.withColumn("structtype_json_col", to_json("structtype_col"))
df = df.withColumn("maptype_col", map_udf(df.structtype_json_col)).drop("structtype_json_col")
df.printSchema()
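
## Note on value types: json.loads returns ints, floats, bools, and nested dicts/lists
## for non-string struct fields, while the UDF above declares string values.
## A minimal sketch of a stricter helper, assuming you want every value serialized
## explicitly as a string before it reaches Spark. The name `to_string_map` is
## hypothetical and not part of the original gist; it is plain Python and could be
## passed to udf(...) in place of toMap.

```python
import json


def to_string_map(json_str):
    """Parse a JSON object string into a dict whose values are all strings (or None)."""
    if not json_str:
        return None
    parsed = json.loads(json_str)
    result = {}
    for key, value in parsed.items():
        if value is None or isinstance(value, str):
            # Strings and nulls pass through unchanged.
            result[key] = value
        else:
            # Numbers, booleans, lists, and nested objects are re-serialized as JSON text.
            result[key] = json.dumps(value)
    return result


example = to_string_map('{"name": "a", "count": 1, "nested": {"flag": true}}')
print(example)
```

## Nested structs end up as JSON text in the map, so they can still be parsed later if needed.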