Created April 29, 2020 11:06
clean data function
def CleanData(df, drop_columns, target_name):
    # Drops the unused features listed in drop_columns,
    # then duplicate rows,
    # then rows where the target (points) is missing.
    # Returns the cleaned df.
    interim_df = df.drop(columns=drop_columns)
    interim_df_2 = (interim_df
                    .drop_duplicates(ignore_index=True))
    cleaned_df = (interim_df_2
                  .dropna(subset=[target_name], how="any")
                  .reset_index(drop=True))
    return cleaned_df
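
A minimal usage sketch, assuming pandas 1.0 or later (needed for `ignore_index` in `drop_duplicates`). The toy DataFrame and its column names (`id`, `country`, `points`) are hypothetical and only illustrate the three steps: dropping columns, dropping duplicate rows, and dropping rows with a missing target. The function is restated in chained form so the example runs on its own.

```python
import pandas as pd


def CleanData(df, drop_columns, target_name):
    # Same steps as the function above, restated so this example is self-contained.
    return (df.drop(columns=drop_columns)
              .drop_duplicates(ignore_index=True)
              .dropna(subset=[target_name], how="any")
              .reset_index(drop=True))


# Hypothetical toy data; the column names are illustrative only.
raw = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "country": ["Italy", "Italy", "France", "Spain"],
    "points": [90.0, 90.0, None, 85.0],
})

# "id" is dropped first, which makes the two Italy rows identical,
# so one of them is removed as a duplicate; the France row is removed
# because its target value is missing.
cleaned = CleanData(raw, drop_columns=["id"], target_name="points")
print(cleaned)
```

Note that duplicates are detected after the unused columns are removed, so rows that differ only in a dropped column are treated as duplicates.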