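from pyspark.ml import Pipeline
from pyspark.sql import SparkSession

# reuse the active SparkSession (e.g. in a notebook or shell), or start one
spark = SparkSession.builder.getOrCreate()

# create a sample dataframe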
sample_df = spark.createDataFrame([
    (1, 'L101', 'R'),
    (2, 'L201', 'C'),
    (3, 'D111', 'R'),
    (4, 'F210', 'R'),
    (5, 'D110', 'C')
], ['id', 'category_1', 'category_2'])
sample_df.show()