Over- and Underestimate Loss Metrics

For a model, metrics like MAE and RMSE quantify how accurate a prediction is. However, the vast majority of these metrics are symmetric: they care equally about overestimates and underestimates.

So far, I have found only one relatively common metric that cares about direction: Mean Logarithmic Error (MLE). I also created its logical opposite, inverse MLE (iMLE). Straight from my AutoTS documentation: MLE penalizes an under-predicted forecast more heavily than an over-predicted forecast. iMLE is the inverse, and penalizes an over-prediction more.

import numpy as np

full_errors = forecast - actual  # positive = overestimate, negative = underestimate
full_mae_errors = np.abs(full_errors)
le = np.log1p(full_mae_errors)  # log-dampened version of the absolute error

# MLE: overestimates get the dampened log error, underestimates keep the full error
MLE = np.nanmean(np.where(full_errors > 0, le, full_mae_errors), axis=0)
# iMLE: the reverse, underestimates get the dampened error, overestimates keep the full error
iMLE = np.nanmean(np.where(-full_errors > 0, le, full_mae_errors), axis=0)

The basic purpose of this metric is to give a big penalty for crossing the boundary (erring in the undesired direction), and only a small penalty for errors in the desired direction. Every approach here comes down to a where statement (either in NumPy or TensorFlow).
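For illustration, here is the same where pattern sketched as a TensorFlow training loss (my translation of the NumPy code above, not code from AutoTS; the function name is mine):

import tensorflow as tf

def mle_loss(actual, forecast):
    # same asymmetry as MLE above: under-predictions keep the full
    # absolute error, over-predictions get the dampened log1p error
    errors = forecast - actual
    abs_err = tf.abs(errors)
    log_err = tf.math.log1p(abs_err)
    return tf.reduce_mean(tf.where(errors > 0, log_err, abs_err))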

But a point came up today: sometimes you want a threshold first. Perhaps you don't mind an overestimate of less than 10% too much, but anything greater than that you really want to avoid.

One way to add a threshold would be to replace the > 0 comparison in the code above with an array of threshold values.
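For example, here is a sketch of that idea (reusing full_mae_errors and le from the code above; the variable names are mine), applying the full penalty only to overestimates beyond 10% of the actual:

# full penalty past a 10% overshoot, dampened log error otherwise
overshoot = forecast - actual * 1.1  # positive where forecast > 110% of actual
thresholded_mle = np.nanmean(
    np.where(overshoot > 0, full_mae_errors, le), axis=0
)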

However, inspired by Pinball/Quantile loss, I wanted a function that could handle either direction, adjusted solely by the input threshold. Thus, we have the threshold_loss function.

def threshold_loss(actual, forecast, threshold, penalty_threshold=None):
    """Run once for overestimate, then again for underestimate. Add both for a combined view.
    From Colin Catlin at syllepsis.live

    Args:
        actual/forecast: 2D wide style pd.DataFrame or np.array
        threshold: float in (0, 2); 0.9 penalizes underestimates of 10% or more,
            1.1 penalizes overestimates of more than 10%
        penalty_threshold: defaults to threshold; adjusts the strength of the penalty
    """
    if penalty_threshold is None:
        penalty_threshold = threshold
    actual_threshold = actual * threshold
    abs_err = np.abs(actual - forecast)
    # with threshold < 1, forecasts at or below actual * threshold (bad
    # underestimates) get the heavier 1 / penalty_threshold weight;
    # with threshold > 1, forecasts above it (bad overestimates) get the
    # heavier penalty_threshold weight
    ls = np.where(
        actual_threshold >= forecast,
        (1 / penalty_threshold) * abs_err,
        penalty_threshold * abs_err,
    )
    return np.nanmean(ls, axis=0)
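A quick usage sketch (the numbers are arbitrary, chosen for illustration):

actual = np.array([[100.0], [100.0], [100.0]])
forecast = np.array([[85.0], [95.0], [112.0]])

under = threshold_loss(actual, forecast, threshold=0.9)  # the 85 crosses the floor
over = threshold_loss(actual, forecast, threshold=1.1)   # the 112 crosses the ceiling
combined = under + over  # combined view, as the docstring suggests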

As written, this is essentially an adjusted MAE. However, it would be easy enough to replace abs_err with a squared error or some such.
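For instance, the squared-error variant just swaps the error term inside the function (sqr_err is my name for it, not part of the original):

# inside threshold_loss, replace abs_err with a squared error
sqr_err = (actual - forecast) ** 2
ls = np.where(
    actual_threshold >= forecast,
    (1 / penalty_threshold) * sqr_err,
    penalty_threshold * sqr_err,
)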

A quick diagram of how the threshold_loss penalty looks, here for an underestimate threshold: the weighted error is steeper below actual * threshold than above it.

Accordingly, when threshold is < 1, a larger (worse) value means more underestimates crossing the minimum tolerance threshold. When threshold is > 1, a larger (worse) value means more overestimates above the threshold. In both cases, a lower score represents a better forecast (lower MAE) and/or better conformance to the threshold.

If you only care about threshold crossing and not forecast accuracy inside the bounds, then none of these metrics are needed; a simple classification-style metric (1 for inside bounds, 0 for outside) works quite well.

# returns % inside bounds for wide style data (records/dates * features/time series)
np.count_nonzero(
    (upper_bound >= forecast) & (lower_bound <= forecast), axis=0
) / forecast.shape[0]
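A hypothetical usage, with the bounds built from the same ±10% tolerance discussed above (the bound definitions are mine, for illustration):

lower_bound = actual * 0.9
upper_bound = actual * 1.1
pct_in_bounds = np.count_nonzero(
    (upper_bound >= forecast) & (lower_bound <= forecast), axis=0
) / forecast.shape[0]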
