Residual

yohou.metrics.conformity.Residual

Bases: BaseConformityScorer

Residual-based conformity scorer using signed prediction errors.

Computes conformity scores as the signed difference between the true and predicted values:

\[s = y - \hat{y}\]

Because the residuals keep their sign, the resulting prediction intervals can be asymmetric: the lower and upper bounds may lie at different distances from the point prediction.
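As a minimal sketch of the formula above, the snippet below computes signed residuals in plain Python. It does not call yohou or polars, and `signed_residuals` is a hypothetical helper introduced here for illustration only. A positive score means the model under-predicted; a negative score means it over-predicted.

```python
def signed_residuals(y_true, y_pred):
    """Return s = y - y_hat for each pair of values (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return [t - p for t, p in zip(y_true, y_pred)]


y_true = [3.0, 5.0, 4.0]
y_pred = [2.5, 4.0, 6.0]
scores = signed_residuals(y_true, y_pred)
print(scores)  # [0.5, 1.0, -2.0]
```

The last score is negative because the prediction (6.0) overshoots the true value (4.0). An absolute-residual scorer would discard that sign.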

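See Also

  • AbsoluteResidual : Symmetric variant using absolute residuals.
  • GammaResidual : Scale-dependent variant using relative errors.
  • SplitConformalForecaster : Conformal prediction forecaster that uses conformity scorers.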

Examples

>>> import polars as pl
>>> from datetime import date
>>> from yohou.metrics.conformity import Residual
>>> scorer = Residual().fit(
...     pl.DataFrame({"time": [date(2020, 1, 1), date(2020, 1, 2)], "y": [1.0, 2.0]})
... )
>>> y_truth = pl.DataFrame({"time": [date(2020, 1, 3), date(2020, 1, 4)], "y": [3.0, 5.0]})
>>> y_pred = pl.DataFrame({"time": [date(2020, 1, 3), date(2020, 1, 4)], "y": [2.5, 4.0]})
>>> scores = scorer.score(y_truth, y_pred)
>>> scores.drop("time").to_series().to_list()
[0.5, 1.0]

Source Code

class Residual(BaseConformityScorer):
    r"""Residual-based conformity scorer using signed prediction errors.

    Computes conformity scores as the signed difference between the true
    and predicted values:

    $$s = y - \hat{y}$$

    Because the residuals keep their sign, the resulting prediction
    intervals can be **asymmetric**: the lower and upper bounds may lie
    at different distances from the point prediction.

    See Also
    --------
    - [`AbsoluteResidual`][yohou.metrics.conformity.AbsoluteResidual] : Symmetric variant using absolute residuals.
    - [`GammaResidual`][yohou.metrics.conformity.GammaResidual] : Scale-dependent variant using relative errors.
    - [`SplitConformalForecaster`][yohou.interval.split_conformal.SplitConformalForecaster] :
        Conformal prediction forecaster that uses conformity scorers.

    Examples
    --------
    >>> import polars as pl
    >>> from datetime import date
    >>> from yohou.metrics.conformity import Residual
    >>> scorer = Residual().fit(
    ...     pl.DataFrame({"time": [date(2020, 1, 1), date(2020, 1, 2)], "y": [1.0, 2.0]})
    ... )
    >>> y_truth = pl.DataFrame({"time": [date(2020, 1, 3), date(2020, 1, 4)], "y": [3.0, 5.0]})
    >>> y_pred = pl.DataFrame({"time": [date(2020, 1, 3), date(2020, 1, 4)], "y": [2.5, 4.0]})
    >>> scores = scorer.score(y_truth, y_pred)
    >>> scores.drop("time").to_series().to_list()
    [0.5, 1.0]

    """

    def score(self, y_truth: pl.DataFrame, y_pred: pl.DataFrame, /, **score_params) -> pl.DataFrame:
        """Compute signed residual conformity scores.

        Parameters
        ----------
        y_truth : pl.DataFrame
            True target values.

        y_pred : pl.DataFrame
            Predicted values.

        Returns
        -------
        pl.DataFrame
            Conformity scores (y_truth - y_pred) with "time" column preserved.

        """
        check_is_fitted(self, ["_is_fitted"])

        # Filter out scorer from score_params to avoid conflict with explicit scorer=self
        score_params_filtered = {k: v for k, v in score_params.items() if k != "scorer"}

        # Validate and align (time dropped, returned as context)
        y_truth, y_pred, context = validate_scorer_data(
            self,
            y_truth,
            y_pred,
            **score_params_filtered,
        )

        # Compute scores and reconstruct with time
        scores_values = y_truth - y_pred
        scores = pl.DataFrame({"time": context.time_values}).hstack(scores_values)

        return scores

    def inverse_score(
        self, y_pred: pl.DataFrame, conformity_scores: pl.DataFrame, coverage_rate: float
    ) -> pl.DataFrame:
        """Construct prediction intervals from conformity scores.

        Parameters
        ----------
        y_pred : pl.DataFrame
            Point predictions, optionally with "time" column.

        conformity_scores : pl.DataFrame
            Computed conformity scores from calibration set, optionally with "time" column.

        coverage_rate : float
            Desired coverage probability (e.g., 0.9 for 90% intervals).

        Returns
        -------
        pl.DataFrame
            Prediction intervals with lower and upper bounds, and time columns if input had them.

        """
        check_is_fitted(self, ["_is_fitted"])

        # Validate and align inputs (time dropped, returned as context for reconstruction)
        y_pred, conformity_scores, context = validate_scorer_data(
            self, y_true=None, y_pred=y_pred, scores=conformity_scores, inverse=True
        )

        # Compute intervals
        lower_quantile, upper_quantile = self._compute_assymetric_quantiles(conformity_scores, coverage_rate)
        lower_bound, upper_bound = y_pred + lower_quantile, y_pred + upper_quantile

        y_pred_interval = self._format_y_pred_interval(lower_bound, upper_bound, coverage_rate)

        # Add time column back
        y_pred_interval = pl.DataFrame({"time": context.time_values}).hstack(y_pred_interval)

        return y_pred_interval

Methods

score(y_truth, y_pred, /, **score_params)

Compute signed residual conformity scores.

Parameters

| Name | Type | Description | Default |
|------|------|-------------|---------|
| y_truth | DataFrame | True target values. | required |
| y_pred | DataFrame | Predicted values. | required |

Returns

| Type | Description |
|------|-------------|
| DataFrame | Conformity scores (y_truth - y_pred) with "time" column preserved. |
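The pattern used by `score` (set the time column aside, subtract values, reattach time) can be sketched without polars. This is an illustration only: `score_with_time` is a hypothetical helper, rows are modelled as `(time, value)` tuples, and the strict time-equality check is an assumption standing in for whatever `validate_scorer_data` actually does.

```python
from datetime import date


def score_with_time(truth_rows, pred_rows):
    """Subtract predictions from truths row by row, keeping the time key.

    Each argument is a list of (time, value) tuples; this stands in for a
    DataFrame with a "time" column and one target column.
    """
    truth_times = [t for t, _ in truth_rows]
    pred_times = [t for t, _ in pred_rows]
    # Assumed validation step: both inputs must cover the same time stamps.
    if truth_times != pred_times:
        raise ValueError("truth and prediction time stamps must match")
    # Compute on values only, then reattach the time stamps.
    values = [yt - yp for (_, yt), (_, yp) in zip(truth_rows, pred_rows)]
    return list(zip(truth_times, values))


truth = [(date(2020, 1, 3), 3.0), (date(2020, 1, 4), 5.0)]
pred = [(date(2020, 1, 3), 2.5), (date(2020, 1, 4), 4.0)]
result = score_with_time(truth, pred)
print(result)
```

The values match the `[0.5, 1.0]` scores in the class example, each paired with its original date.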

inverse_score(y_pred, conformity_scores, coverage_rate)

Construct prediction intervals from conformity scores.

Parameters

| Name | Type | Description | Default |
|------|------|-------------|---------|
| y_pred | DataFrame | Point predictions, optionally with "time" column. | required |
| conformity_scores | DataFrame | Computed conformity scores from calibration set, optionally with "time" column. | required |
| coverage_rate | float | Desired coverage probability (e.g., 0.9 for 90% intervals). | required |

Returns

| Type | Description |
|------|-------------|
| DataFrame | Prediction intervals with lower and upper bounds, and time columns if input had them. |
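To make the interval construction concrete, here is a rough sketch of how signed scores can be inverted into bounds. It rests on assumptions not confirmed by this page: the lower and upper quantile levels are taken as (1 - coverage_rate) / 2 and (1 + coverage_rate) / 2, quantiles are linearly interpolated, and no finite-sample correction is applied. The library's own quantile helper may differ. `empirical_quantile` and `residual_interval` are hypothetical names.

```python
def empirical_quantile(values, q):
    """Linearly interpolated empirical quantile for 0 <= q <= 1."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower_index = int(position)
    upper_index = min(lower_index + 1, len(ordered) - 1)
    fraction = position - lower_index
    return ordered[lower_index] + fraction * (ordered[upper_index] - ordered[lower_index])


def residual_interval(y_pred, calibration_scores, coverage_rate):
    """Shift each point prediction by the lower and upper score quantiles."""
    alpha = 1.0 - coverage_rate
    # Assumed quantile levels: split the miscoverage equally between tails.
    lower_q = empirical_quantile(calibration_scores, alpha / 2)
    upper_q = empirical_quantile(calibration_scores, 1.0 - alpha / 2)
    return [(p + lower_q, p + upper_q) for p in y_pred]


calibration_scores = [-1.0, 0.0, 1.0, 2.0, 3.0]
intervals = residual_interval([10.0], calibration_scores, coverage_rate=0.5)
print(intervals)  # [(10.0, 12.0)]
```

Here the calibration scores are skewed toward positive values (the model tended to under-predict), so the interval extends 2.0 above the point prediction and 0.0 below it. This is the asymmetry that signed residuals allow.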

Tutorials

The following example notebooks use this component:

  • How to Use Conformity Scorers (Evaluation-Search)

    Compare Residual, AbsoluteResidual, GammaResidual, and AbsoluteGammaResidual conformity scorers with coverage/width analysis and DistanceSimilarity interaction.

  • How to Search Interval Forecaster Hyperparameters (Evaluation-Search)

    Tune interval forecaster parameters directly with interval metrics in GridSearchCV, including mixed point+interval multimetric search.

  • Conformal Prediction Intervals (Getting-Started)

    Build distribution-free prediction intervals with SplitConformalForecaster using calibration holdouts and configurable conformity scoring functions.