PySpark deep copy dataframe
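import copy
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()  # reuses the active session if one exists
X = spark.createDataFrame([[1, 2], [3, 4]], ['a', 'b'])
# Deep-copy the schema so the new DataFrame shares no StructType objects with X.
_schema = copy.deepcopy(X.schema)
# Rebuild a DataFrame from the underlying RDD of Rows using the copied schema.
_X = X.rdd.toDF(_schema)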