@j450h1
Created June 12, 2020 06:29
from pyspark.sql.functions import col, lit, when


def get_churned_users(df):
    """
    Flag users who cancelled so that churned users can be identified.

    Returns the DataFrame with an added "Churn" column.
    """
    # Distinct IDs of users who reached the "Cancellation Confirmation" page
    cancelled_ids = df.filter('page == "Cancellation Confirmation"').select("userId").distinct()
    # Collect to a Python list on the driver so it can be passed to isin() below
    cancelled_ids = cancelled_ids.toPandas()['userId'].tolist()
    # '1' when a user churned and '0' when they did not (string labels, not integers)
    df = df.withColumn("Churn", when(col("userId").isin(cancelled_ids), lit('1')).otherwise(lit('0')))
    return df
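
To make the labelling rule concrete, here is a minimal pure-Python sketch of the same logic applied to a few hand-made event rows. The sample rows and the helper name `label_churn` are illustrative assumptions and not part of the gist; only the `userId` and `page` fields and the "Cancellation Confirmation" value come from the function above, which works on a Spark DataFrame rather than a list of dicts.

```python
# Hypothetical sample events; only the "userId" and "page" fields mirror the gist.
events = [
    {"userId": "1", "page": "NextSong"},
    {"userId": "1", "page": "Cancellation Confirmation"},
    {"userId": "2", "page": "NextSong"},
    {"userId": "2", "page": "Home"},
]


def label_churn(rows):
    """Return copies of rows with a "Churn" field: '1' if the user ever cancelled, else '0'."""
    # Set of users with at least one cancellation event
    cancelled_ids = {r["userId"] for r in rows if r["page"] == "Cancellation Confirmation"}
    # Every row of a cancelled user is flagged, not only the cancellation row itself
    return [{**r, "Churn": "1" if r["userId"] in cancelled_ids else "0"} for r in rows]


labelled = label_churn(events)
for row in labelled:
    print(row["userId"], row["page"], row["Churn"])
```

Note that the flag is per user rather than per event: all of a churned user's rows receive '1', including those logged before the cancellation.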