@colbyford
Created June 18, 2019 21:19
Rename all columns in a Spark Dataframe
########################################
## Title: Spark Script for Renaming All Columns in a Dataframe
## Language: PySpark
## Authors: Colby T. Ford, Ph.D.
########################################
from pyspark.sql.functions import col

## `data` is assumed to be an existing Spark DataFrame
column_list = data.columns
prefix = "my_prefix"
new_column_list = [prefix + s for s in column_list]
#new_column_list = [prefix + s if s != "ID" else s for s in column_list] ## Use if you plan on joining on an ID later
column_mapping = list(zip(column_list, new_column_list))
# print(column_mapping)
## Select every column under its new name
data = data.select([col(old).alias(new) for old, new in column_mapping])
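
The name-mapping step can be checked without a Spark session. The sketch below is a minimal illustration, not part of the original script: `build_column_mapping` and its `keep` parameter are hypothetical names introduced here, and the sample column names are made up.

```python
def build_column_mapping(columns, prefix, keep=("ID",)):
    """Return (old, new) name pairs, prefixing every column not listed in `keep`.

    Hypothetical helper for illustration; `keep` mirrors the "ID" exception
    in the commented-out line of the script above.
    """
    return [(c, c if c in keep else prefix + c) for c in columns]


# Made-up column names standing in for `data.columns`
mapping = build_column_mapping(["ID", "age", "city"], "my_prefix")
new_names = [new for _, new in mapping]
print(mapping)

# With a Spark DataFrame named `data`, the new names could then be applied
# positionally, for example: data = data.toDF(*new_names)
```

The prefix is concatenated directly onto each name, so a separator such as an underscore has to be part of the prefix itself (for example `"my_prefix_"`).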