@egemenzeytinci
Created December 25, 2019 15:11
Outlier detection with IQR
cleaned = df.copy()

columns = [
    'lead_time',
    'stays_in_weekend_nights',
    'stays_in_week_nights',
    'adults',
    'children',
    'babies',
    'adr',
]
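
for col in columns:
    # Quartiles are taken from the original df, so rows already
    # dropped for another column do not shift these bounds.
    q1 = df[col].quantile(0.25)
    q3 = df[col].quantile(0.75)
    iqr = q3 - q1

    # Fences at 1.5 * IQR beyond the quartiles
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr

    print(f'Lower point: {round(lower, 2)} \t upper point: {round(upper, 2)} \t {col}')

    # IQR is zero (the fences collapse to one value): skip the column
    # instead of dropping every row that differs from that value.
    if lower == upper:
        continue

    # Keep rows inside the fences, plus rows where the value is missing
    cond1 = (cleaned[col] >= lower) & (cleaned[col] <= upper)
    cond2 = cleaned[col].isnull()
    cleaned = cleaned[cond1 | cond2]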
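
The snippet above assumes a DataFrame named df already exists. As a rough, self-contained sketch of the same filter on a single column, the example below uses a small made-up frame. The helper name iqr_bounds, the parameter k, and the toy values are assumptions for illustration only and are not part of the original gist.

```python
import pandas as pd


def iqr_bounds(series, k=1.5):
    """Return the (lower, upper) IQR fences for a numeric Series.

    Hypothetical helper; k=1.5 mirrors the multiplier used in the gist.
    """
    q1 = series.quantile(0.25)  # NaN values are skipped by quantile()
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Toy data (made up): one extreme value and one missing value.
toy = pd.DataFrame({'adr': [80.0, 95.0, 100.0, 105.0, 120.0, 5000.0, None]})

lower, upper = iqr_bounds(toy['adr'])

# between() is inclusive on both ends by default and is False for NaN,
# so missing values are kept explicitly, as in the loop above.
in_range = toy['adr'].between(lower, upper)
kept = toy[in_range | toy['adr'].isnull()]

print(f'Lower point: {lower} \t upper point: {upper}')
print(kept)
```

With these toy values the fences work out to 66.25 and 146.25, so only the 5000.0 row is dropped and the row with the missing value is kept.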