@ayodeji-AA
Created January 4, 2022 20:37
Missing data in column
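# SPARK
# Count the null entries in each column. This launches one Spark job per column
# and counts nulls only; isNull() does not flag NaN values in float columns.
Spark_missing_values = {col: titanic_sp.filter(titanic_sp[col].isNull()).count() for col in titanic_sp.columns}
Spark_missing_values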
# PANDAS
# get the number of missing data points per column (NaN and None both count)
missing_values_count = titanic_pd.isnull().sum()
# show the missing-value counts for the first ten columns
missing_values_count[0:10]
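The pandas count above can be tried without the Titanic data. The following is a minimal sketch, assuming a small made-up frame: `toy_pd` and its values are hypothetical stand-ins for `titanic_pd`, which this snippet does not define. It also shows the share of missing entries per column, which is often easier to compare than raw counts.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for titanic_pd; the values are made up for illustration.
toy_pd = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, np.nan],
    "Cabin": [None, "C85", None, None],
    "Fare": [7.25, 71.28, 7.92, 8.05],
})

# Number of missing entries per column (NaN and None both count as missing).
missing_counts = toy_pd.isnull().sum()

# Share of missing entries per column, as a percentage of the row count.
missing_pct = toy_pd.isnull().mean() * 100

print(missing_counts)
print(missing_pct)
```

Taking the mean of the boolean mask works because `True` is treated as 1 and `False` as 0, so the mean is the fraction of missing rows.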