@tommydangerous
Created June 11, 2021 05:46
prepare_test_data.py
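# Work on a copy so the raw test data is left untouched.
# age_imputer, scaler, categorical_encoder, categorical_columns,
# new_column_names, and features_to_use are not defined here; they are
# assumed to come from the training-data preparation step.
X_test = X_test_raw.copy()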
# Add columns
X_test['can_vote'] = X_test['Age'].apply(lambda age: 1 if age >= 18 else 0)
X_test.loc[:, 'cabin_letter'] = X_test['Cabin'].apply(
    lambda cabin: cabin[0] if cabin and isinstance(cabin, str) else None,
)
# Remove columns
X_test = X_test.drop(columns=['Name', 'PassengerId'])
# Impute values
X_test.loc[X_test['Cabin'].isna(), 'Cabin'] = 'somewhere out of sight'
X_test.loc[X_test['cabin_letter'].isna(), 'cabin_letter'] = 'ZZZ'
X_test.loc[:, ['Age']] = age_imputer.transform(X_test[['Age']])
X_test.loc[X_test['Embarked'].isna(), 'Embarked'] = 'no idea'
# Scale columns
X_test.loc[:, ['Age']] = scaler.transform(X_test[['Age']])
# Encode values
X_test.loc[:, new_column_names] = categorical_encoder.transform(
    X_test[categorical_columns],
).toarray()
# Select features
X_test = X_test[features_to_use].copy()
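The script above only calls `transform`, so the imputer, scaler, and encoder must already be fitted on the training data. A minimal sketch of what that fitting step might look like follows. It assumes scikit-learn's `SimpleImputer`, `StandardScaler`, and `OneHotEncoder`; the toy `X_train` frame, the median strategy, the column-naming scheme, and the choice of `features_to_use` are all hypothetical and not taken from the gist.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy training frame standing in for the real training data.
X_train = pd.DataFrame({
    'Age': [22.0, np.nan, 35.0, 4.0],
    'cabin_letter': ['C', 'ZZZ', 'E', 'C'],
    'Embarked': ['S', 'C', 'no idea', 'S'],
})

# Assumption: these are the columns that get one-hot encoded.
categorical_columns = ['cabin_letter', 'Embarked']

# Fit the imputer on training ages only (median strategy is an assumption).
age_imputer = SimpleImputer(strategy='median')
X_train['Age'] = age_imputer.fit_transform(X_train[['Age']]).ravel()

# Fit the scaler on the imputed training ages.
scaler = StandardScaler()
X_train['Age'] = scaler.fit_transform(X_train[['Age']]).ravel()

# OneHotEncoder returns a sparse matrix by default, hence .toarray().
# handle_unknown='ignore' avoids errors on categories seen only at test time.
categorical_encoder = OneHotEncoder(handle_unknown='ignore')
encoded = categorical_encoder.fit_transform(
    X_train[categorical_columns],
).toarray()

# One output column per (column, category) pair, in encoder order.
new_column_names = [
    f'{column}_{category}'
    for column, categories in zip(
        categorical_columns, categorical_encoder.categories_,
    )
    for category in categories
]

# Hypothetical feature selection: scaled age plus the encoded columns.
features_to_use = ['Age'] + new_column_names
```

Fitting on the training data and reusing the same fitted objects on the test data keeps test-set statistics from leaking into the preprocessing.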