@satkr7
Created July 24, 2020 06:31
# Install DataWig first (shell command, not Python): pip install datawig
import pandas as pd
import datawig
data = pd.read_csv("train.csv")
df_train, df_test = datawig.utils.random_split(data)
# Initialize a SimpleImputer model
imputer = datawig.SimpleImputer(
    input_columns=['Pclass', 'SibSp', 'Parch'],  # column(s) containing information about the column we want to impute
    output_column='Age',  # the column we'd like to impute values for
    output_path='imputer_model'  # stores model data and metrics
)
# Fit an imputer model on the train data
imputer.fit(train_df=df_train, num_epochs=50)
# Impute missing values and return original dataframe with predictions
imputed = imputer.predict(df_test)
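
The snippet above stops once the predictions are returned. A possible follow-up, sketched here with a small hand-made frame standing in for `imputed`, is to keep the observed ages and use the prediction only where `Age` is missing. The `Age_imputed` column name is an assumption about how DataWig labels its predictions, and `Age_filled` and the sample values are purely illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by imputer.predict(df_test).
# The "Age_imputed" column name is an assumption, not confirmed by the gist.
imputed = pd.DataFrame({
    "Pclass": [3, 1, 2],
    "Age": [22.0, None, 35.0],
    "Age_imputed": [23.1, 38.4, 33.9],
})

# Keep observed ages; fall back to the prediction only where Age is missing.
imputed["Age_filled"] = imputed["Age"].fillna(imputed["Age_imputed"])

# Count how many ages are still missing after the fill.
remaining_missing = int(imputed["Age_filled"].isna().sum())
print(imputed[["Age", "Age_imputed", "Age_filled"]])
```

Writing to a separate column leaves the original `Age` values untouched, so imputed and observed rows can still be told apart later.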