@finlytics-hub
Last active July 6, 2020 06:23
Practical demonstration of using Z-scores to drop outlier rows
# Import NumPy (needed for np.abs) and the Z-score function
import numpy as np
from scipy.stats import zscore
# Define the SD threshold
thresh = 3
# Boolean mask with one entry per row: `True` if any column's absolute Z-score exceeds the threshold, `False` otherwise
SD_outliers = X_train.apply(lambda x: np.abs(zscore(x, nan_policy = 'omit')) > thresh).any(axis=1)
# Drop (in place) the rows flagged True in SD_outliers
X_train.drop(X_train.index[SD_outliers], inplace = True)
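The snippet above assumes `X_train` already exists and modifies it in place. As a rough, self-contained sketch of the same idea, the version below returns a filtered copy instead. The helper name `drop_zscore_outliers` and the toy DataFrame are hypothetical, made up for illustration, and the helper assumes every column is numeric.

```python
import numpy as np
import pandas as pd
from scipy.stats import zscore


def drop_zscore_outliers(df, thresh=3.0):
    """Return a copy of df without rows where any column has |Z-score| > thresh.

    Hypothetical helper for illustration; assumes all columns are numeric.
    """
    # Column-wise Z-scores; NaNs are ignored when computing each mean and SD
    z = np.abs(zscore(df.to_numpy(dtype=float), axis=0, nan_policy="omit"))
    # A row is an outlier if at least one of its columns exceeds the threshold
    # (comparisons against NaN evaluate to False, so NaN cells never flag a row)
    outlier_rows = (z > thresh).any(axis=1)
    return df.loc[~outlier_rows].copy()


# Toy data (made up): 20 rows, with one extreme value planted in column "a"
df = pd.DataFrame({
    "a": [10.0, 11.0, 9.0, 10.0, 12.0, 8.0, 10.0, 11.0, 9.0, 10.0] * 2,
    "b": np.arange(20, dtype=float),
})
df.loc[19, "a"] = 500.0

cleaned = drop_zscore_outliers(df, thresh=3)
print(df.shape, cleaned.shape)
```

Returning a copy leaves the original frame untouched, which makes it easier to compare row counts before and after filtering. Note that with very few rows a single extreme value may never reach a Z-score of 3, because the outlier itself inflates the standard deviation.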