MeanDirectionalAccuracy

yohou.metrics.point.MeanDirectionalAccuracy

Bases: BasePointScorer

Mean Directional Accuracy metric for point forecasts.

Computes the proportion of time steps where the predicted direction of change matches the actual direction of change. This metric evaluates whether the forecast correctly predicts upward or downward movements.

The MDA is defined as:

\[\text{MDA} = \frac{1}{n-1}\sum_{i=2}^{n}\mathbf{1}[\text{sign}(\Delta y_i) = \text{sign}(\Delta \hat{y}_i)]\]

where \(\Delta y_i = y_i - y_{i-1}\) and \(\Delta \hat{y}_i = \hat{y}_i - \hat{y}_{i-1}\).
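
As a rough illustration of the formula, the NumPy sketch below computes MDA for a single series. The function name `mean_directional_accuracy` is illustrative and not part of yohou; it only mirrors the sign comparison in the definition above.

```python
import numpy as np


def mean_directional_accuracy(y_true, y_pred):
    """Share of consecutive steps whose predicted and actual changes share a sign."""
    # Differences between consecutive observations: n values give n - 1 changes.
    d_true = np.diff(np.asarray(y_true, dtype=np.float64))
    d_pred = np.diff(np.asarray(y_pred, dtype=np.float64))
    # The indicator is 1 where both changes have the same sign (+1, 0, or -1).
    return float(np.mean(np.sign(d_true) == np.sign(d_pred)))


y_true = [10.0, 15.0, 12.0, 18.0, 20.0]  # changes: +5, -3, +6, +2
y_pred = [10.0, 14.0, 15.0, 17.0, 19.0]  # changes: +4, +1, +2, +2
print(mean_directional_accuracy(y_true, y_pred))  # 0.75 (3 of 4 directions match)
```

Only the second step disagrees (the truth falls while the forecast rises), so 3 of the 4 comparisons match.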

Parameters

aggregation_method : list of str or str, default="all"
    Dimensions to aggregate over. Options:
    - "stepwise": Aggregate across forecasting steps.
    - "vintagewise": Aggregate across vintages (observed times).
    - "componentwise": Aggregate across components, return per-timestep DataFrame.
    - "groupwise": Aggregate across panel groups (panel data only).
    - "all": Aggregate across all dimensions (returns scalar). Same as ["stepwise", "vintagewise", "componentwise", "groupwise"].
groups : list of str, dict of str to float, or None, default=None
    Panel group filter (list) or filter with weights (dict).
components : list of str, dict of str to float, or None, default=None
    Component filter (list) or filter with weights (dict).

Attributes

lower_is_better : bool
    Always False for MDA. Higher values indicate better directional prediction.
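
One way generic model-selection code could use this flag is to decide whether a score should be maximized or minimized. The helper `pick_best` and the candidate names and scores below are hypothetical and not part of yohou.

```python
def pick_best(scores: dict[str, float], lower_is_better: bool) -> str:
    """Return the name of the best candidate given the metric's orientation."""
    # Minimize error-style metrics; maximize accuracy-style metrics such as MDA.
    chooser = min if lower_is_better else max
    return chooser(scores, key=scores.get)


mda_scores = {"naive": 0.50, "model_a": 0.75, "model_b": 0.60}  # made-up values
best = pick_best(mda_scores, lower_is_better=False)
print(best)  # model_a
```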

Examples

>>> import polars as pl
>>> from datetime import datetime
>>> from yohou.metrics import MeanDirectionalAccuracy
>>> y_true = pl.DataFrame({
...     "time": [
...         datetime(2020, 1, 1),
...         datetime(2020, 1, 2),
...         datetime(2020, 1, 3),
...         datetime(2020, 1, 4),
...         datetime(2020, 1, 5),
...     ],
...     "value": [10.0, 15.0, 12.0, 18.0, 20.0],
... })
>>> y_pred = pl.DataFrame({
...     "vintage_time": [datetime(2019, 12, 31)] * 5,
...     "time": [
...         datetime(2020, 1, 1),
...         datetime(2020, 1, 2),
...         datetime(2020, 1, 3),
...         datetime(2020, 1, 4),
...         datetime(2020, 1, 5),
...     ],
...     "value": [10.0, 14.0, 15.0, 17.0, 19.0],
... })
>>> mda = MeanDirectionalAccuracy()
>>> _ = mda.fit(y_true)
>>> mda.score(y_true, y_pred)
0.75

Notes

  • MDA = 1.0 means every directional change was predicted correctly
  • MDA = 0.5 is roughly what random guessing of the direction achieves
  • MDA = 0.0 means every directional prediction was wrong
  • A step with no change counts as a match only when the prediction is also unchanged, because the comparison is made on the sign of the difference
  • Requires at least 2 time steps (n time steps give n-1 comparisons from .diff())
  • Returns 0.0 when fewer than 2 rows are available
  • Overrides score() because computing direction requires .diff() on the full columns, not per-row errors
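
The boundary cases in these notes can be checked with a small NumPy sketch. `mda_or_zero` is an illustrative stand-in, not the yohou implementation; it assumes a single series and reproduces only the documented fallback and the sign comparison.

```python
import numpy as np


def mda_or_zero(y_true, y_pred):
    """Directional accuracy for one series, 0.0 when no change can be formed."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    if len(y_true) < 2:
        return 0.0  # mirrors the documented fallback for fewer than 2 rows
    matches = np.sign(np.diff(y_true)) == np.sign(np.diff(y_pred))
    return float(np.mean(matches))


print(mda_or_zero([1.0], [1.0]))                      # 0.0
print(mda_or_zero([1.0, 2.0, 3.0], [0.0, 5.0, 9.0]))  # 1.0
print(mda_or_zero([1.0, 2.0, 3.0], [9.0, 5.0, 0.0]))  # 0.0
# A flat step matches only a flat prediction, because sign(0) == 0.
print(mda_or_zero([1.0, 1.0, 2.0], [3.0, 3.0, 2.0]))  # 0.5

# Two independent random walks: the score should land close to 0.5.
rng = np.random.default_rng(0)
truth = np.cumsum(rng.normal(size=10_000))
guess = np.cumsum(rng.normal(size=10_000))
baseline = mda_or_zero(truth, guess)
```

Because a value near 0.5 is what an uninformed forecast tends to produce, scores are usually read relative to 0.5 rather than relative to 0.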

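See Also

  • MeanAbsoluteError (yohou.metrics.point.MeanAbsoluteError): Error magnitude metric (not directional)
  • R2Score (yohou.metrics.point.R2Score): Variance explained metric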

Source Code

class MeanDirectionalAccuracy(BasePointScorer):
    r"""Mean Directional Accuracy metric for point forecasts.

    Computes the proportion of time steps where the predicted direction of
    change matches the actual direction of change. This metric evaluates
    whether the forecast correctly predicts upward or downward movements.

    The MDA is defined as:

    $$\text{MDA} = \frac{1}{n-1}\sum_{i=2}^{n}\mathbf{1}[\text{sign}(\Delta y_i) = \text{sign}(\Delta \hat{y}_i)]$$

    where $\Delta y_i = y_i - y_{i-1}$ and $\Delta \hat{y}_i = \hat{y}_i - \hat{y}_{i-1}$.

    Parameters
    ----------
    aggregation_method : list of str or str, default="all"
        Dimensions to aggregate over. Options:
        - "stepwise": Aggregate across forecasting steps.
        - "vintagewise": Aggregate across vintages (observed times).
        - "componentwise": Aggregate across components, return per-timestep DataFrame
        - "groupwise": Aggregate across panel groups (panel data only)
        - "all": Aggregate across all dimensions (returns scalar). Same as
          ["stepwise", "vintagewise", "componentwise", "groupwise"].
    groups : list of str, dict of str to float, or None, default=None
        Panel group filter (list) or filter with weights (dict).
    components : list of str, dict of str to float, or None, default=None
        Component filter (list) or filter with weights (dict).

    Attributes
    ----------
    lower_is_better : bool
        Always False for MDA. Higher values indicate better directional prediction.

    Examples
    --------
    >>> import polars as pl
    >>> from datetime import datetime
    >>> from yohou.metrics import MeanDirectionalAccuracy
    >>> y_true = pl.DataFrame({
    ...     "time": [
    ...         datetime(2020, 1, 1),
    ...         datetime(2020, 1, 2),
    ...         datetime(2020, 1, 3),
    ...         datetime(2020, 1, 4),
    ...         datetime(2020, 1, 5),
    ...     ],
    ...     "value": [10.0, 15.0, 12.0, 18.0, 20.0],
    ... })
    >>> y_pred = pl.DataFrame({
    ...     "vintage_time": [datetime(2019, 12, 31)] * 5,
    ...     "time": [
    ...         datetime(2020, 1, 1),
    ...         datetime(2020, 1, 2),
    ...         datetime(2020, 1, 3),
    ...         datetime(2020, 1, 4),
    ...         datetime(2020, 1, 5),
    ...     ],
    ...     "value": [10.0, 14.0, 15.0, 17.0, 19.0],
    ... })
    >>> mda = MeanDirectionalAccuracy()
    >>> _ = mda.fit(y_true)
    >>> mda.score(y_true, y_pred)
    0.75

    Notes
    -----
    - MDA = 1.0 means all directional changes were predicted correctly
    - MDA = 0.5 is equivalent to random guessing for direction
    - MDA = 0.0 means all directional predictions were wrong
    - Requires at least 2 time steps (N-1 comparisons from ``.diff()``)
    - Returns 0.0 when fewer than 2 rows are available
    - Overrides ``score()`` because computing direction requires ``.diff()``
      on the full columns, not per-row errors

    See Also
    --------
    - [`MeanAbsoluteError`][yohou.metrics.point.MeanAbsoluteError] : Error magnitude metric (not directional)
    - [`R2Score`][yohou.metrics.point.R2Score] : Variance explained metric

    """

    _metric_name = "mda"

    lower_is_better = False

    def __init__(
        self,
        aggregation_method: list[str] | str = "all",
        groups: list[str] | dict[str, float] | None = None,
        components: list[str] | dict[str, float] | None = None,
    ) -> None:
        super().__init__(
            aggregation_method=aggregation_method,
            groups=groups,
            components=components,
        )

    def _compute_raw_errors(self, y_truth: pl.DataFrame, y_pred: pl.DataFrame) -> pl.DataFrame:
        """Not used directly. MDA overrides score()."""
        return (y_truth - y_pred).select(pl.all().abs())

    def score(  # type: ignore
        self,
        y_truth: pl.DataFrame,
        y_pred: pl.DataFrame,
        /,
        vintage_weight: Callable | pl.DataFrame | dict | None = None,
        **params,
    ) -> float | pl.DataFrame:
        """Compute Mean Directional Accuracy.

        Parameters
        ----------
        y_truth : pl.DataFrame
            True values with "time" column.
        y_pred : pl.DataFrame
            Predicted values with "time" column.
        vintage_weight : callable, pl.DataFrame, dict, or None, default=None
            Per-vintage weights for cross-vintage aggregation.
        **params : dict
            Metadata to route to nested estimators.

        Returns
        -------
        float or pl.DataFrame
            MDA score between 0 and 1. 1.0 for perfect directional prediction.

        Raises
        ------
        TypeError
            If time_weight or step_weight are passed.

        """
        self._reject_weights(**params)
        check_is_fitted(self, ["_is_fitted"])

        y_truth, y_pred, context = validate_scorer_data(
            self,
            y_truth,
            y_pred,
        )

        if len(y_truth) < 2:
            return 0.0

        # Resolve vintage_weight into context
        context = self._resolve_vintage_weight_to_context(context, vintage_weight)

        def _compute_mda(yt_slice: pl.DataFrame, yp_slice: pl.DataFrame) -> pl.DataFrame | None:
            """Compute per-column mean directional accuracy."""
            if len(yt_slice) < 2:
                return None
            mda_values = {}
            for col in yt_slice.columns:
                # Step-to-step changes; n rows yield n - 1 differences.
                truth_diff = np.diff(yt_slice[col].to_numpy().astype(np.float64))
                pred_diff = np.diff(yp_slice[col].to_numpy().astype(np.float64))
                # A step matches when both changes share a sign (two zero changes also match).
                matches = (np.sign(truth_diff) == np.sign(pred_diff)).astype(np.float64)
                mda_values[col] = float(np.mean(matches))
            return pl.DataFrame(mda_values).select(yt_slice.columns)

        result = self._map_per_vintage(y_truth, y_pred, context, _compute_mda)
        return self._aggregate_per_vintage_scores(result, context)

    def __sklearn_tags__(self):
        """Get estimator tags.

        Returns
        -------
        Tags
            Estimator tags with lower_is_better=False.

        """
        tags = super().__sklearn_tags__()
        if tags.scorer_tags is not None:
            tags.scorer_tags.lower_is_better = False
        return tags

Methods

score(y_truth, y_pred, /, vintage_weight=None, **params)

Compute Mean Directional Accuracy.

Parameters
y_truth : DataFrame, required
    True values with "time" column.
y_pred : DataFrame, required
    Predicted values with "time" column.
vintage_weight : callable, pl.DataFrame, dict, or None, default=None
    Per-vintage weights for cross-vintage aggregation.
**params : dict, default={}
    Metadata to route to nested estimators.

Returns

float or DataFrame
    MDA score between 0 and 1. 1.0 for perfect directional prediction.

Raises

TypeError
    If time_weight or step_weight are passed.
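
To make the role of per-vintage weights concrete, here is a minimal sketch of one plausible aggregation: compute MDA separately for each vintage, then take a normalized weighted mean. This page does not show how yohou's internal helpers combine vintages, so the weighted-mean rule, the function names, and the dict-based inputs are all assumptions made for illustration.

```python
import numpy as np


def directional_accuracy(y_true, y_pred):
    """MDA for one series: share of steps whose signs of change agree."""
    d_true = np.diff(np.asarray(y_true, dtype=np.float64))
    d_pred = np.diff(np.asarray(y_pred, dtype=np.float64))
    return float(np.mean(np.sign(d_true) == np.sign(d_pred)))


def weighted_vintage_mda(truth_by_vintage, pred_by_vintage, weights=None):
    """Average per-vintage MDA (assumed rule: normalized weighted mean)."""
    scores, used_weights = [], []
    for vintage, y_pred in pred_by_vintage.items():
        y_true = truth_by_vintage[vintage]
        if len(y_true) < 2:
            continue  # a single-step vintage has no direction to compare
        scores.append(directional_accuracy(y_true, y_pred))
        used_weights.append(1.0 if weights is None else weights[vintage])
    if not scores:
        return 0.0
    return float(np.average(scores, weights=used_weights))


# Hypothetical data: two forecast vintages and the matching observed values.
truth = {"v1": [10.0, 15.0, 12.0, 18.0, 20.0], "v2": [12.0, 18.0, 20.0]}
preds = {"v1": [10.0, 14.0, 15.0, 17.0, 19.0], "v2": [13.0, 12.0, 16.0]}

print(weighted_vintage_mda(truth, preds))                         # 0.625
print(weighted_vintage_mda(truth, preds, {"v1": 3.0, "v2": 1.0}))  # 0.6875
```

Vintage `v1` scores 0.75 and `v2` scores 0.5, so weighting `v1` three times as heavily moves the aggregate from 0.625 toward 0.75.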


__sklearn_tags__()

Get estimator tags.

Returns
Tags
    Estimator tags with lower_is_better=False.
