@mzaradzki
Created July 2, 2017 11:46
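# Before filling the missing values, record which rows had them
# (in this column a construction_year of 0 means "unknown").
dfX['construction_year_missing'] = (dfX['construction_year'] == 0).astype(int)
dates.append('construction_year_missing')  # dates: list of date-related fields
# To fill missing years we can use the mean, the median, or the oldest known year;
# here we use the mean of the known (non-zero) years, truncated to an integer.
known_years = dfX['construction_year'] > 0
mean_year = dfX.loc[known_years, 'construction_year'].mean()
dfX.loc[dfX['construction_year'] == 0, 'construction_year'] = int(mean_year)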
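
The snippet above mentions three possible fill values (mean, median, oldest) but only applies the mean. The following is a minimal sketch of how the three could be compared on a toy frame. The helper name `fill_missing_year`, its `strategy` parameter, and the toy data are hypothetical and not part of the original gist; it assumes, as the snippet does, that a year of 0 marks a missing value and that at least one year is known.

```python
import pandas as pd


def fill_missing_year(df, column="construction_year", strategy="mean"):
    """Return a copy of df where zeros in `column` are flagged and filled.

    Hypothetical helper: `strategy` is one of "mean", "median", "oldest".
    """
    out = df.copy()
    missing = out[column] == 0
    # Keep track of which rows were missing before overwriting them.
    out[column + "_missing"] = missing.astype(int)

    known = out.loc[~missing, column]
    if strategy == "mean":
        fill_value = int(known.mean())    # truncated, as in the snippet above
    elif strategy == "median":
        fill_value = int(known.median())
    elif strategy == "oldest":
        fill_value = int(known.min())
    else:
        raise ValueError("unknown strategy: " + repr(strategy))

    out.loc[missing, column] = fill_value
    return out


# Toy data (made up for illustration): two of five years are missing.
toy = pd.DataFrame({"construction_year": [1980, 0, 2000, 2010, 0]})

filled_mean = fill_missing_year(toy, strategy="mean")
filled_median = fill_missing_year(toy, strategy="median")
filled_oldest = fill_missing_year(toy, strategy="oldest")

print(filled_mean)
```

Working on a copy leaves the input frame untouched, so the three strategies can be compared side by side before one is chosen for `dfX`.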