@suhaskv
Last active December 23, 2020 11:31
Check presence of Null and NaN values in the train data
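import numpy as np
import pandas as pd

# metadata_train is assumed to be loaded already, with a 'signal_id' column.
null_arr = []
nan_arr = []
for sig in metadata_train['signal_id'].values:
    # Read only this signal's column from the parquet file.
    sig_data = pd.read_parquet('/content/train.parquet',
                               engine='fastparquet', columns=[str(sig)])
    # isnull() is an alias of isna() in pandas, so the two counts always match.
    null_arr.append(sig_data.isnull().sum().sum())
    nan_arr.append(sig_data.isna().sum().sum())
print(f"Number of Null values in train data: {np.sum(null_arr)}")
print(f"Number of NaN values in train data: {np.sum(nan_arr)}")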
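In pandas, `isnull` is an alias of `isna`, so the two totals printed above will always agree. A minimal sketch of this on a small, made-up DataFrame (the column names and values here are illustrative only and are not taken from the VSB data):

```python
import numpy as np
import pandas as pd

# Toy stand-in for a few signal columns; the values are invented for illustration.
toy = pd.DataFrame({
    "0": [1.0, np.nan, 3.0],
    "1": [4.0, 5.0, None],   # None in a float column is stored as NaN
    "2": [7.0, 8.0, 9.0],
})

# Per-column counts summed over all columns give one total for the frame.
null_count = int(toy.isnull().sum().sum())
nan_count = int(toy.isna().sum().sum())

print(f"isnull count: {null_count}")
print(f"isna count:   {nan_count}")
```

Since both methods flag the same cells, a single `isna()` pass is likely enough when checking the real training data.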

suhaskv commented Dec 23, 2020

VSB Power Line Blog - the presence of null and NaN values in the train data
