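from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegression

# get or create a Spark session (already defined as `spark` in the pyspark shell and most notebooks)
spark = SparkSession.builder.getOrCreate()

# create a sample DataFrame with 4 feature columns and 1 label column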
sample_data_train = spark.createDataFrame([
    (2.0, 'A', 'S10', 40, 1.0),
    (1.0, 'X', 'E10', 25, 1.0),
    (4.0, 'X', 'S20', 10, 0.0),
    (3.0, 'Z', 'S10', 20, 0.0),
    (4.0, 'A', 'E10', 30, 1.0),
    (2.0, 'Z', 'S10', 40, 0.0),
    (5.0, 'X', 'D10', 10, 1.0),
], ['feature_1', 'feature_2', 'feature_3', 'feature_4', 'label'])
# view the data
sample_data_train.show()