@socratesk
Last active November 3, 2018 18:46
import pandas as pd
from sklearn import preprocessing

# Sample data: one categorical feature ('vehicle') and a binary target ('label')
vehicleDF = pd.DataFrame({'id': [101, 102, 103, 104, 105, 106, 107, 108],
                          'vehicle': ['Car', 'Minivan', 'SUV', 'Car', 'Car', 'Minivan', 'Car', 'Minivan'],
                          'label': ['Yes', 'Yes', 'Yes', 'No', 'Yes', 'No', 'Yes', 'No']})

# Encode label (target); LabelEncoder sorts the classes, so 'No' -> 0 and 'Yes' -> 1
labelEncode = preprocessing.LabelEncoder()
vehicleDF['label'] = labelEncode.fit_transform(vehicleDF['label'])

# Group by category and compute the mean of the encoded target per category,
# i.e. the share of 'Yes' rows: Car = 3/4, Minivan = 1/3, SUV = 1/1
means = vehicleDF.groupby('vehicle')['label'].mean()

# Map each row's category to that category's mean target value
vehicleDF['vehicleTargetRatio'] = vehicleDF['vehicle'].map(means)

# Drop the original categorical column now that it has been encoded
vehicleDF = vehicleDF.drop(columns=['vehicle'])

print(vehicleDF)
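
The snippet above encodes the same rows it computes the means from. As a minimal sketch of how the learned mapping might be reused on separate data, the helper below applies the per-category means to a second frame. The function name `target_mean_encode`, the `test` frame, and the fallback to the overall target mean for unseen categories are assumptions added for illustration; they are not part of the original gist.

```python
import pandas as pd


def target_mean_encode(train, test, column, target):
    """Hypothetical helper: encode test[column] with per-category target means from train."""
    # Mean of the (already numeric) target for each category seen in train
    means = train.groupby(column)[target].mean()
    # Overall target mean, used as an assumed fallback for unseen categories
    global_mean = train[target].mean()
    # Series.map yields NaN for categories missing from `means`; fill those in
    return test[column].map(means).fillna(global_mean)


# Same vehicle data as above, with the label already encoded as 0/1
train = pd.DataFrame({
    'vehicle': ['Car', 'Minivan', 'SUV', 'Car', 'Car', 'Minivan', 'Car', 'Minivan'],
    'label': [1, 1, 1, 0, 1, 0, 1, 0],
})
# 'Truck' never appears in train, so it receives the overall mean
test = pd.DataFrame({'vehicle': ['Car', 'SUV', 'Truck']})

encoded = target_mean_encode(train, test, 'vehicle', 'label')
print(encoded.tolist())
```

Falling back to the overall mean is only one possible choice for unseen categories; leaving them as NaN or using a separate indicator column would be equally reasonable, depending on the model.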