
# Nikolay-Lysenko/xgb_quantile_loss.py

Last active Mar 1, 2021
Customized loss function for quantile regression with XGBoost
```python
import numpy as np


def xgb_quantile_eval(preds, dmatrix, quantile=0.2):
    """
    Customized evaluation metric that equals quantile regression loss
    (also known as pinball loss).

    Quantile regression is regression that estimates a specified quantile
    of target's distribution conditional on given features.

    @type preds: numpy.ndarray
    @type dmatrix: xgboost.DMatrix
    @type quantile: float
    @rtype: float
    """
    labels = dmatrix.get_label()
    return ('q{}_loss'.format(quantile),
            np.nanmean((preds >= labels) * (1 - quantile) * (preds - labels) +
                       (preds < labels) * quantile * (labels - preds)))


def xgb_quantile_obj(preds, dmatrix, quantile=0.2):
    """
    Computes first-order derivative of quantile regression loss and a
    non-degenerate substitute for the second-order derivative.

    The substitute is returned instead of zeros, because XGBoost requires
    non-zero second-order derivatives. See this page:
    https://github.com/dmlc/xgboost/issues/1825
    to see why it is possible to use this trick. However, be sure that the
    hyperparameter named `max_delta_step` is small enough to satisfy:
    `0.5 * max_delta_step <= min(quantile, 1 - quantile)`.

    @type preds: numpy.ndarray
    @type dmatrix: xgboost.DMatrix
    @type quantile: float
    @rtype: tuple(numpy.ndarray)
    """
    try:
        assert 0 <= quantile <= 1
    except AssertionError:
        raise ValueError("Quantile value must be a float between 0 and 1.")

    labels = dmatrix.get_label()
    errors = preds - labels

    left_mask = errors < 0
    right_mask = errors > 0

    grad = -quantile * left_mask + (1 - quantile) * right_mask
    hess = np.ones_like(preds)

    return grad, hess


# Example of usage:
# bst = xgb.train(hyperparams, train, num_rounds,
#                 obj=xgb_quantile_obj, feval=xgb_quantile_eval)
```
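For completeness, here is a minimal runnable sketch of the usage pattern from the closing comment above, on synthetic data. The hyperparameter values are illustrative, and it assumes an XGBoost version whose `xgb.train` still accepts the `obj` and `feval` arguments (newer releases have moved toward `custom_metric`):

```python
import numpy as np
import xgboost as xgb

# Synthetic regression data, purely for illustration.
rng = np.random.RandomState(0)
X = rng.uniform(size=(500, 3))
y = X.sum(axis=1) + rng.normal(scale=0.1, size=500)
dtrain = xgb.DMatrix(X, label=y)

# `max_delta_step` is chosen so that 0.5 * max_delta_step <= min(0.2, 0.8),
# the condition stated in the docstring of `xgb_quantile_obj`.
hyperparams = {'max_depth': 3, 'eta': 0.1, 'max_delta_step': 0.3}
num_rounds = 100

# Both callbacks default to quantile=0.2, so this fits the 0.2 quantile.
bst = xgb.train(hyperparams, dtrain, num_rounds,
                obj=xgb_quantile_obj, feval=xgb_quantile_eval)
quantile_preds = bst.predict(dtrain)
```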

### Silverneo commented May 31, 2017

I saw this post, which may be interesting to you: http://www.bigdatarepublic.nl/regression-prediction-intervals-with-xgboost/ :)

### JoshuaC3 commented Jan 13, 2018

```python
hess = np.ones_like(preds)
```

Does this not reduce XGBoost to being a classical GBM? What made you choose all ones as the hessian?

### Nikolay-Lysenko commented Jan 16, 2018

@JoshuaC3, yes, you are right that replacing the hessian with all ones makes `xgboost` close to classical GBM. Loosely speaking, GBM can be compared with gradient descent, whereas `xgboost` can be compared with Newton's method: replacing the hessian with ones in Newton's update formula turns it into the gradient descent update formula. However, there are other differences between `xgboost` and implementations of gradient boosting such as `sklearn.GradientBoostingRegressor`, and keeping those benefits is exactly why it is worthwhile to port quantile regression loss to `xgboost`.

Finally, a brief explanation of why all ones are chosen as the placeholder. The second-order derivative of quantile regression loss is equal to 0 at every point except the one where it is not defined, so a "fair" implementation of quantile regression with `xgboost` is impossible due to division by zero. Thus, a non-zero placeholder for the hessian is needed. There is nothing wrong with assuming that this placeholder consists of all ones, provided the `max_delta_step` hyperparameter is chosen appropriately (see the docstring and the link given there). Moreover, since there is no general information that would justify varying values in the placeholder, all valid choices are equivalent to `np.ones_like(preds)` up to multiplication by a non-zero constant.
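To make the Newton's-method analogy concrete, here is a small illustrative sketch (not part of the gist) based on the optimal leaf weight formula from the XGBoost paper, `w* = -sum(g) / (sum(h) + lambda)`. With an all-ones hessian, the weight degenerates to a scaled mean of the gradients, i.e. a gradient-descent-style step:

```python
import numpy as np

def leaf_weight(grad, hess, reg_lambda=1.0):
    # XGBoost's optimal leaf weight: w* = -sum(g) / (sum(h) + lambda).
    return -grad.sum() / (hess.sum() + reg_lambda)

grad = np.array([0.8, -0.2, 0.8])        # per-example first derivatives
hess_newton = np.array([0.5, 2.0, 1.0])  # true second derivatives (generic loss)
hess_ones = np.ones_like(grad)           # placeholder used in xgb_quantile_obj

# With hess = 1 for every example, the weight is the negative mean gradient
# scaled by n / (n + lambda), so each leaf takes a gradient-descent-style step.
print(leaf_weight(grad, hess_newton))
print(leaf_weight(grad, hess_ones))
```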

### asquare commented Apr 3, 2018 • edited

 @Nikolay-Lysenko: would it be possible to attach a license to this Gist? Thanks in advance!

### manuelsh commented Sep 21, 2018

 This loss function makes all my predictions 0 for quantile 0.5... anyone having the same issue?

### lelemi1031 commented Jan 11, 2019

> This loss function makes all my predictions 0 for quantile 0.5... anyone having the same issue?

I also have this issue. Did you manage to solve it?

### Ludecan commented Mar 19, 2019

Here: http://jmarkhou.com/lgbqr/#mjx-eqn-quantileloss is a post about LightGBM's quantile regression that shows some issues they found with this approach, and how they improved on it by replacing the second-order approximation of the loss function with its actual value.

### Shafi2016 commented Mar 26, 2020

 How can we find the lower and upper for the prediction interval using the above function?

### Nikolay-Lysenko commented Mar 26, 2020

@Shafi2016, this can be done like this:

```python
from functools import partial

lower_quantile = 0.2  # Any other value can be placed here.
upper_quantile = 0.8

xgb_quantile_lower_eval = partial(xgb_quantile_eval, quantile=lower_quantile)
xgb_quantile_lower_obj = partial(xgb_quantile_obj, quantile=lower_quantile)
lower_model = xgb.train(hyperparams, dtrain, num_rounds,
                        obj=xgb_quantile_lower_obj,
                        feval=xgb_quantile_lower_eval)

xgb_quantile_upper_eval = partial(xgb_quantile_eval, quantile=upper_quantile)
xgb_quantile_upper_obj = partial(xgb_quantile_obj, quantile=upper_quantile)
upper_model = xgb.train(hyperparams, dtrain, num_rounds,
                        obj=xgb_quantile_upper_obj,
                        feval=xgb_quantile_upper_eval)

lower_bound = lower_model.predict(dtest)
upper_bound = upper_model.predict(dtest)
```

However, this gist is quite old, and there are better solutions now. I recommend looking at CatBoost or LightGBM, because these tools have native support for quantile regression as well as performance comparable to that of XGBoost.
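As a sanity check on the resulting interval, one can measure its empirical coverage on held-out data (a sketch; `y_test` is an assumed array of true targets aligned with `dtest`):

```python
import numpy as np

# Fraction of test targets falling inside [lower_bound, upper_bound];
# for the 0.2 and 0.8 quantiles this should be roughly 0.6 if the models
# are well calibrated.
coverage = np.mean((y_test >= lower_bound) & (y_test <= upper_bound))
print('empirical coverage: {:.3f}'.format(coverage))
```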

### Shafi2016 commented Mar 26, 2020

Thanks for the prompt response! I have checked both LightGBM and CatBoost. There is no doubt that their interval levels are very stable. However, I could not get an improved forecast; in fact, I get a much better forecast with H2O's XGBoost, yet H2O does not provide support for quantile regression. I tried to produce prediction intervals using functions from this link (https://towardsdatascience.com/regression-prediction-intervals-with-xgboost-428e0a018b). However, the interval range gets very narrow, and when the interval is widened the upper limits go flat while the lower interval is unaffected. I am thinking I could get a better interval from your function and then wrap it around the predictions of H2O's XGBoost. I hope this can be done.

### Nikolay-Lysenko commented Jan 28, 2021

There have been some questions about the license. This gist is released under the MIT License, so you can use it in your projects.