@staticor
Created January 10, 2016 05:34
import numpy as np
import pandas as pd

df_train = pd.read_csv('../data/titanic/train.csv')


def clean_data(df):
    # Get the sorted unique values of Sex
    sexes = np.sort(df['Sex'].unique())
    # Map each Sex string to an integer code (0, 1, ...)
    genders_mapping = dict(zip(sexes, range(len(sexes))))
    # Add Sex_Val: Sex converted from a string to its integer code (modifies df in place)
    df['Sex_Val'] = df['Sex'].map(genders_mapping).astype(int)