@sanjurm16
Last active January 20, 2019 19:34
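# Summary statistics for age; the count row covers non-null values only.
train_df.select("age").describe().show()
# Number of rows with a missing age.
train_df.where("age is null").count()
# 177 age values are null; describe() above counts only the 714 non-null ones.
# Replace the null ages with 29, the approximate mean age (hardcoded from describe() above).
train_avg_age_df = train_df.na.fill({'age': 29})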