@oneryalcin
Created September 23, 2019 23:15
11 Sparkify Vector Assembler
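from pyspark.ml.feature import VectorAssembler

# Assemble the engineered feature columns into a single, not yet scaled, vector column
joined_vector = VectorAssembler(
    inputCols=['gender_dummy', 'level_dummy', 'logSessionCount',
               'sqrtMeanSongCount', 'sqrtSessionsFreqDay'],
    outputCol='nonScaledFeatures').transform(joined)
# Cast the churn flag to an integer label column
joined_vector = joined_vector.withColumn('label', joined_vector.churned.cast('integer'))
# Preview the result without the raw and intermediate columns
joined_vector.drop('userId', 'level', 'gender', 'sessionCount', 'meanSongCount',
                   'sessionsFreqDay', 'gender_idx', 'level_idx', 'churned').show(4)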