Precision

yohou.metrics.classification.Precision

Bases: BaseHardLabelScorer

Precision from class-probability forecasts.

Converts predicted class probabilities into hard labels by taking the argmax, then computes precision (positive predictive value) for each class k as the ratio of true positives to all predicted positives.

\[\text{Precision}_k = \frac{TP_k}{TP_k + FP_k}\]
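As a rough illustration of the two steps described above (argmax to hard labels, then TP and FP counts per class), here is a minimal plain-Python sketch. It does not use yohou: the helper names `argmax_labels` and `per_class_precision` and the two-class toy data are hypothetical, and the tie-breaking rule (first listed class wins) is an assumption that this page does not specify.

```python
from collections import Counter


def argmax_labels(proba):
    """Map {class_name: [p_1, ..., p_n]} to a list of n hard labels."""
    classes = list(proba)
    n_rows = len(proba[classes[0]])
    # For each row, pick the class with the highest probability.
    # Assumption: on ties, max() keeps the first class listed.
    return [max(classes, key=lambda c: proba[c][i]) for i in range(n_rows)]


def per_class_precision(y_true, y_pred, zero_division=0.0):
    """Return {class: TP / (TP + FP)} for every class seen in either list."""
    tp, fp = Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[pred] += 1  # predicted class k and it was correct
        else:
            fp[pred] += 1  # predicted class k but the truth was another class
    classes = sorted(set(y_true) | set(y_pred))
    return {
        k: tp[k] / (tp[k] + fp[k]) if tp[k] + fp[k] else zero_division
        for k in classes
    }


# Toy data (hypothetical): two classes, four time steps.
proba = {"a": [0.9, 0.4, 0.6, 0.2], "b": [0.1, 0.6, 0.4, 0.8]}
y_true = ["a", "a", "a", "b"]

y_pred = argmax_labels(proba)                 # ['a', 'b', 'a', 'b']
scores = per_class_precision(y_true, y_pred)  # {'a': 1.0, 'b': 0.5}
```

Class "b" is predicted twice but is correct only once, so its precision is 1/2. Class "a" is never predicted incorrectly, so its precision is 1.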

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| average | str | Class averaging strategy: "macro" (unweighted mean across classes), "micro" (aggregate counts first), or "weighted" (support-weighted mean). | "macro" |
| zero_division | float | Value substituted for the precision when the denominator TP + FP is zero (no predicted positives). | 0.0 |
| aggregation_method | list of str or str | Dimensions to aggregate over. | "all" |
| groups | list of str, dict of str to float, or None | Panel group filter or filter with weights. | None |
| components | list of str, dict of str to float, or None | Component filter or filter with weights. | None |
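The three `average` options can be compared with a small plain-Python sketch. It follows the common definitions of macro, micro, and weighted averaging as the parameter description states them. The function name `averaged_precision` and the `(tp, fp, support)` tuple layout are hypothetical, and yohou's internal implementation is not shown on this page, so treat this as an approximation of the behavior.

```python
def averaged_precision(counts, average="macro", zero_division=0.0):
    """Average per-class precision.

    counts maps class name -> (tp, fp, support), where support is the
    number of true instances of that class (hypothetical layout).
    """

    def ratio(tp, fp):
        # Fall back to zero_division when nothing was predicted positive.
        return tp / (tp + fp) if tp + fp else zero_division

    per_class = {k: ratio(tp, fp) for k, (tp, fp, _) in counts.items()}

    if average == "macro":
        # Unweighted mean: every class counts equally.
        return sum(per_class.values()) / len(per_class)
    if average == "micro":
        # Pool the counts across classes first, then take one ratio.
        total_tp = sum(tp for tp, _, _ in counts.values())
        total_fp = sum(fp for _, fp, _ in counts.values())
        return ratio(total_tp, total_fp)
    if average == "weighted":
        # Mean of per-class precision weighted by class support.
        total_support = sum(s for _, _, s in counts.values())
        return sum(per_class[k] * counts[k][2] for k in counts) / total_support
    raise ValueError(f"unknown average: {average!r}")


# Counts for a three-class problem: (tp, fp, support) per class.
counts = {"sunny": (1, 0, 2), "rainy": (2, 0, 2), "cloudy": (1, 1, 1)}

macro = averaged_precision(counts, "macro")        # (1 + 1 + 0.5) / 3
micro = averaged_precision(counts, "micro")        # 4 / (4 + 1)
weighted = averaged_precision(counts, "weighted")  # (1*2 + 1*2 + 0.5*1) / 5
```

With these counts the three strategies give roughly 0.833, 0.8, and 0.9. The choice matters most when class frequencies are unbalanced.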

Attributes

| Name | Type | Description |
| --- | --- | --- |
| lower_is_better | bool | Always False (higher precision is better). |

Examples

>>> import polars as pl
>>> from datetime import datetime
>>> from yohou.metrics.classification import Precision
>>> y_true = pl.DataFrame({
...     "time": [datetime(2020, 1, i) for i in range(1, 6)],
...     "weather": ["sunny", "rainy", "cloudy", "sunny", "rainy"],
... })
>>> y_pred = pl.DataFrame({
...     "vintage_time": [datetime(2019, 12, 31)] * 5,
...     "time": [datetime(2020, 1, i) for i in range(1, 6)],
...     "weather_proba_sunny": [0.7, 0.1, 0.2, 0.2, 0.1],
...     "weather_proba_rainy": [0.2, 0.8, 0.1, 0.1, 0.8],
...     "weather_proba_cloudy": [0.1, 0.1, 0.7, 0.7, 0.1],
... })
>>> scorer = Precision()
>>> _ = scorer.fit(y_true)
>>> scorer.score(y_true, y_pred)
0.833...
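
The result can be checked by hand. Taking the argmax of each row of `y_pred` gives the labels sunny, rainy, cloudy, cloudy, rainy, against the true labels sunny, rainy, cloudy, sunny, rainy. The only mistake is the fourth step, where cloudy is predicted but the truth is sunny, which adds one false positive to cloudy:

\[\text{Precision}_{\text{sunny}} = \frac{1}{1 + 0} = 1, \qquad \text{Precision}_{\text{rainy}} = \frac{2}{2 + 0} = 1, \qquad \text{Precision}_{\text{cloudy}} = \frac{1}{1 + 1} = 0.5\]

With the default average="macro", the score is the unweighted mean of the three values:

\[\frac{1 + 1 + 0.5}{3} \approx 0.833\]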

See Also

  • Recall : Recall (sensitivity).
  • FBetaScore : Weighted harmonic mean of precision and recall.

Source Code

class Precision(BaseHardLabelScorer):
    r"""Precision from class-probability forecasts.

    Converts predicted class probabilities into hard labels by taking the
    argmax, then computes precision (positive predictive value) for each
    class k as the ratio of true positives to all predicted positives.

    $$\text{Precision}_k = \frac{TP_k}{TP_k + FP_k}$$

    Parameters
    ----------
    average : str, default="macro"
        Class averaging strategy: ``"macro"`` (unweighted mean across
        classes), ``"micro"`` (aggregate counts first), or ``"weighted"``
        (support-weighted mean).
    zero_division : float, default=0.0
        Value substituted for the precision when the denominator
        ``TP + FP`` is zero (no predicted positives).
    aggregation_method : list of str or str, default="all"
        Dimensions to aggregate over.
    groups : list of str, dict of str to float, or None, default=None
        Panel group filter or filter with weights.
    components : list of str, dict of str to float, or None, default=None
        Component filter or filter with weights.

    Attributes
    ----------
    lower_is_better : bool
        Always False (higher precision is better).

    Examples
    --------
    >>> import polars as pl
    >>> from datetime import datetime
    >>> from yohou.metrics.classification import Precision
    >>> y_true = pl.DataFrame({
    ...     "time": [datetime(2020, 1, i) for i in range(1, 6)],
    ...     "weather": ["sunny", "rainy", "cloudy", "sunny", "rainy"],
    ... })
    >>> y_pred = pl.DataFrame({
    ...     "vintage_time": [datetime(2019, 12, 31)] * 5,
    ...     "time": [datetime(2020, 1, i) for i in range(1, 6)],
    ...     "weather_proba_sunny": [0.7, 0.1, 0.2, 0.2, 0.1],
    ...     "weather_proba_rainy": [0.2, 0.8, 0.1, 0.1, 0.8],
    ...     "weather_proba_cloudy": [0.1, 0.1, 0.7, 0.7, 0.1],
    ... })
    >>> scorer = Precision()
    >>> _ = scorer.fit(y_true)
    >>> scorer.score(y_true, y_pred)  # doctest: +ELLIPSIS
    0.833...

    See Also
    --------
    - [`Recall`][yohou.metrics.classification.Recall] : Recall (sensitivity).
    - [`FBetaScore`][yohou.metrics.classification.FBetaScore] : Weighted harmonic mean of precision and recall.

    """

    _metric_name = "precision"
    _lower_is_better = False

    def _compute_metric_from_counts(self, counts: pl.DataFrame) -> pl.DataFrame:
        """Compute precision, TP / (TP + FP), from confusion counts."""
        predicted_positives = pl.col("tp") + pl.col("fp")
        return counts.select(
            # Guard against 0 / 0 when nothing was predicted positive.
            pl.when(predicted_positives == 0)
            .then(self.zero_division)
            .otherwise(pl.col("tp") / predicted_positives)
            .alias("value")
        )