joshlk/confidence_intervals.md

## confidence_intervals.md

      
    Raw
  

              confidence_intervals.md
            
          
    Confidence/prediction intervals to quantiles

quantile_higher = (interval_width / 2) + 0.5
Quantile metrics:

def pinball_loss(y_true: np.ndarray, target_quantile: float, y_quantile: np.ndarray) -> np.ndarray:
    """
    The pinball loss function is a metric used to assess the accuracy of a quantile forecast.
    Calculates pinball loss element wise (does not aggregate).
    Smaller is better.

    Refs:
    * https://www.lokad.com/pinball-loss-function-definition
    """

    return (
            ((y_true - y_quantile) * target_quantile) * (y_true >= y_quantile)
            + ((y_quantile - y_true) * (1 - target_quantile)) * (y_true < y_quantile)
    )
    
def scaled_pinball_loss(
        y_true: np.ndarray, target_quantile: float, y_quantile: np.ndarray, y_train: np.ndarray) -> float:
    """
    Scaled pinball loss. Scaled using training ts_data.

    Roughly: mean pinball loss divided by mean gradient (or diff) of training ts_data

    Refs:
    * § Probabilistic Forecasts: https://github.com/Mcompetitions/M5-methods/blob/master/M5-Competitors-Guide.pdf
    * https://www.kaggle.com/chrisrichardmiles/m5u-wsplevaluator-weighted-scaled-pinball-loss
    """

    n_training = len(y_train)
    assert n_training > 1

    return (
            pinball_loss(y_true, target_quantile, y_quantile).mean()
            / ((1 / (n_training - 1)) * np.abs(np.diff(y_train)).sum())
    )
    
def interval_score(y_true: np.ndarray, y_upper: np.ndarray, y_lower: np.ndarray, interval_width: float) -> float:
    """
    Mean of the "interval score"

    Refs:
    * R implementation: https://rdrr.io/cran/scoringutils/man/interval_score.html
    * Paper. §6.2, page 21: https://apps.dtic.mil/sti/pdfs/ADA454828.pdf
    """
    alpha = 1 - interval_width
    return (
            (y_upper - y_lower)
            + 2 / alpha * (y_lower - y_true) * (y_true < y_lower)
            + 2 / alpha * (y_true - y_upper) * (y_true > y_upper)
    ).mean()
Standard deviation from quantile prediction (assume normally distributed)


 and  are the lower and upper predictions
 and  is the mean and standard deviation
 is the inverse of the cumulative distribution function with probability . (AKA quantile function or percent point function). Scipy inpmentation, Matlab implmentation


and

To take the average of the two values:

from scipy.stats import norm
def quantil_interval_to_std(lower, upper, interval_width=0.8):
    quantile_higher = (interval_width / 2) + 0.5
    std = 0.5*(upper-lower)/norm.ppf(quantile_higher)
    return std