Created June 6, 2017 07:29
Multiclass classification (softmax regression) via xgboost custom objective
import numpy as np
from sklearn import datasets
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import OneHotEncoder
import xgboost as xgb

def softmax(z):
    # Subtract the max for numerical stability; the shift cancels in the ratio.
    z -= np.max(z)
    sm = (np.exp(z).T / np.sum(np.exp(z), axis=1)).T
    return sm
def softmaxobj(preds, dtrain):
    """Softmax objective.

    Args:
        preds: (N, K) array, N = #data, K = #classes.
        dtrain: DMatrix object with training data.

    Returns:
        grad: N*K array with gradient values.
        hess: N*K array with second-order gradient values.
    """
    # Label is a vector of class indices for each input example
    labels = dtrain.get_label()
    # When objective=softprob, preds has shape (N, K)
    labels = OneHotEncoder(sparse=False).fit_transform(labels.reshape(-1, 1))
    grad = preds - labels
    hess = 2.0 * preds * (1.0 - preds)
    # Return as 1-d vectors
    return grad.flatten(), hess.flatten()
# Iris dataset
iris = datasets.load_iris()
X, Ymc = iris.data, iris.target
Y = OneHotEncoder(sparse=False).fit_transform(Ymc.reshape(-1, 1))

"""xgboost softmax regression"""
dtrain = xgb.DMatrix(X, label=Ymc)
params = {'max_depth': 2, 'eta': 0.1, 'silent': 1,
          'objective': 'multi:softprob', 'num_class': len(np.unique(Ymc))}
# Fit
model = xgb.train(params, dtrain, 100)
# Evaluate
yhat = model.predict(dtrain)
yhat_labels = np.argmax(yhat, axis=1)
confusion_matrix(Ymc, yhat_labels)
"""xgboost softmax regression via custom loss""" | |
dtrain = xgb.DMatrix(X, label=Ymc) | |
params = {'max_depth': 2, 'eta': 0.1, 'silent': 1, | |
'objective': 'multi:softprob', 'num_class': len(np.unique(Ymc))} | |
# Fit | |
model = xgb.Booster(params, [dtrain]) | |
for _ in range(100): | |
pred = model.predict(dtrain) | |
g, h = softmaxobj(pred, dtrain) | |
model.boost(dtrain, g, h) | |
# Evalute | |
yhat1 = model.predict(dtrain) | |
yhat1_labels = np.argmax(yhat, axis=1) | |
print(confusion_matrix(Ymc, yhat1_labels)) | |
# Compare the two approaches | |
print(confusion_matrix(yhat_labels, yhat1_labels)) | |
np.sum((yhat-yhat1)**2) | |
It seems that something is off in the custom softmax objective functions that I have seen online. I tried to replicate the result of the built-in multi:softprob objective, but it does not work. Here is a reproducible example similar to the one above. Does anyone see anything wrong with my implementation? I was able to get the binary custom objective function to work with no issues.

I figured out the issue with my softmaxobj: switching grad.flatten() and hess.flatten() to grad.flatten('F') and hess.flatten('F') greatly impacts the results.
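For reference, here is a minimal sketch of the variant that comment describes: the same objective, but with the gradient and hessian flattened in column-major (Fortran) order before being handed back to xgboost. The function name softmaxobj_fortran is just illustrative, and whether 'C' or 'F' order is the right layout depends on how the installed xgboost version stores multiclass gradients internally, so the result should be checked against the built-in multi:softprob run, as the script above does with the confusion matrices and the squared difference of the predictions.

def softmaxobj_fortran(preds, dtrain):
    """Softmax objective that returns gradients in column-major order.

    Assumes preds is an (N, K) array of class probabilities, as produced
    by predict() when the booster objective is multi:softprob.
    """
    labels = dtrain.get_label()
    labels = OneHotEncoder(sparse=False).fit_transform(labels.reshape(-1, 1))
    grad = preds - labels
    hess = 2.0 * preds * (1.0 - preds)
    # 'F' order flattens column by column: all rows of class 0, then class 1, ...
    return grad.flatten('F'), hess.flatten('F')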