@hendra-herviawan
Last active June 22, 2018 02:43
# Print the rows where column 'cat_1' has a missing value
#https://stackoverflow.com/questions/37366717/pandas-print-column-name-with-missing-values
train_df[train_df['cat_1'].isnull()]
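# A minimal, self-contained sketch of the missing-value checks above, run on a
# hypothetical toy frame (toy_df and its columns are assumptions, not the
# original train_df). It also lists the column names that contain missing
# values, which is what the linked Stack Overflow question asks about.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for train_df
toy_df = pd.DataFrame({
    "cat_1": ["a", None, "c", None],
    "num_1": [1.0, 2.0, np.nan, 4.0],
    "num_2": [10, 20, 30, 40],
})

# Rows where 'cat_1' is missing (boolean-mask indexing)
rows_missing_cat_1 = toy_df[toy_df["cat_1"].isnull()]

# Names of the columns that contain at least one missing value
cols_with_missing = toy_df.columns[toy_df.isnull().any()].tolist()

# Number of missing values per column
missing_counts = toy_df.isnull().sum()

print(rows_missing_cat_1)
print(cols_with_missing)
print(missing_counts)
```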
#
#outliyer_percentile.py
#https://www.kaggle.com/sudalairajkumar/simple-exploration-notebook-zillow-prize
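import numpy as np
# Cap logerror at its 1st and 99th percentiles.
# .loc replaces .ix, which was removed in pandas 1.0, and avoids chained assignment.
ulimit = np.percentile(train_df.logerror.values, 99)
llimit = np.percentile(train_df.logerror.values, 1)
train_df.loc[train_df['logerror'] > ulimit, 'logerror'] = ulimit
train_df.loc[train_df['logerror'] < llimit, 'logerror'] = llimit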
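# The same percentile capping can probably be written more compactly with
# Series.clip. This is a sketch on a hypothetical toy frame (toy_df is an
# assumption, not the Zillow data), not the method used in the linked notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 201 evenly spaced values between -1 and 1
toy_df = pd.DataFrame({"logerror": np.linspace(-1.0, 1.0, 201)})

# 1st and 99th percentiles of the column
llimit, ulimit = np.percentile(toy_df["logerror"].values, [1, 99])

# Values below llimit become llimit, values above ulimit become ulimit;
# the number of rows is unchanged
toy_df["logerror"] = toy_df["logerror"].clip(lower=llimit, upper=ulimit)

print(llimit, ulimit)
print(toy_df["logerror"].describe())
```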
# https://www.kaggle.com/danieleewww/xgboost-without-outliers-lb-0-06463/code
# Alternative: drop the rows whose logerror lies outside (-0.4, 0.42) instead of capping them
train_df = train_df[train_df.logerror > -0.4]
train_df = train_df[train_df.logerror < 0.42]
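# A small sketch of the fixed-threshold filter as a reusable function that also
# shows how many rows are removed. drop_outside_range and toy_df are
# hypothetical names introduced for illustration; the thresholds are the ones
# used above.

```python
import pandas as pd

def drop_outside_range(df, column, lower, upper):
    """Hypothetical helper: keep rows with lower < df[column] < upper (strict bounds)."""
    mask = (df[column] > lower) & (df[column] < upper)
    return df[mask].copy()

# Hypothetical toy data standing in for train_df
toy_df = pd.DataFrame({"logerror": [-0.9, -0.4, -0.1, 0.0, 0.3, 0.42, 1.5]})

filtered_df = drop_outside_range(toy_df, "logerror", -0.4, 0.42)
n_dropped = len(toy_df) - len(filtered_df)

print(filtered_df)
print("rows dropped:", n_dropped)
```

# Unlike percentile capping, this shrinks the frame, and because the bounds are
# strict, rows exactly equal to -0.4 or 0.42 are dropped as well.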