PatternSeasonalityForecaster

yohou.stationarity.seasonality.PatternSeasonalityForecaster

Bases: _BaseSeasonalityForecaster

Forecast using seasonal pattern extraction and repetition.

Learns seasonal patterns from historical data and repeats them into the future. Suitable for time series with strong periodic behavior.
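As a rough illustration of what "repeating a pattern into the future" means, the sketch below looks up each forecast step by its phase within the cycle. It is plain Python rather than the library's implementation: `repeat_pattern` and `n_observed` are hypothetical names, and it assumes time steps are counted from the first training observation.

```python
def repeat_pattern(pattern, n_observed, horizon):
    """Repeat a seasonal pattern for `horizon` steps after `n_observed` points."""
    seasonality = len(pattern)
    # Phase of each future step, counted from the start of the training data
    # (assumption: the first forecast step has index n_observed).
    phases = [(n_observed + step) % seasonality for step in range(horizon)]
    return [pattern[phase] for phase in phases]


pattern = [10, 12, 15, 13]  # one learned cycle, seasonality = 4
forecast = repeat_pattern(pattern, n_observed=10, horizon=6)
print(forecast)  # [15, 13, 10, 12, 15, 13]
```

Because 10 observations end halfway through a cycle, the forecast starts at phase 2 of the pattern rather than at its first value.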

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| seasonality | int | Seasonal period length (e.g., 12 for monthly data with yearly seasonality, 7 for daily data with weekly seasonality). | required |
| method | {"naive", "average", "median"} | Method for aggregating seasonal patterns: "naive" uses the last complete cycle; "average" takes the mean across all cycles; "median" takes the median across all cycles (robust to outliers). | "average" |
| target_transformer | BaseTransformer | Transformer for target variable. | None |
| panel_strategy | {"global", "multivariate"} | How to handle panel data. See BaseForecaster for details. | "global" |

Attributes

| Name | Type | Description |
| --- | --- | --- |
| seasonal_pattern_ | DataFrame | Learned seasonal pattern (length = seasonality). |
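To make the three aggregation methods concrete, here is a minimal pure-Python sketch that mirrors the logic visible in the source listing on this page: drop any trailing incomplete cycle, then either take the last complete cycle or aggregate each position across cycles. The function name `extract_pattern` and the sample values are illustrative only; the library itself operates on polars DataFrames.

```python
from statistics import mean, median


def extract_pattern(values, seasonality, method="average"):
    """Return one seasonal cycle (length = seasonality) from a list of values."""
    n_cycles = len(values) // seasonality
    # Keep complete cycles only; a trailing partial cycle is ignored.
    complete = values[: n_cycles * seasonality]
    if method == "naive":
        return complete[-seasonality:]
    aggregate = mean if method == "average" else median
    # complete[pos::seasonality] collects position `pos` from every cycle.
    return [aggregate(complete[pos::seasonality]) for pos in range(seasonality)]


# Three cycles of length 3 (one outlier, 101) plus one trailing observation.
values = [1, 2, 3, 3, 4, 5, 101, 6, 7, 9]

naive_pattern = extract_pattern(values, 3, "naive")      # [101, 6, 7]
average_pattern = extract_pattern(values, 3, "average")  # equals [35, 4, 5]
median_pattern = extract_pattern(values, 3, "median")    # [3, 4, 5]
```

The outlier dominates the "naive" and "average" patterns at position 0, while "median" is unaffected. Every result has exactly `seasonality` entries.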

Examples

>>> import polars as pl
>>> from datetime import datetime
>>> from yohou.stationarity import PatternSeasonalityForecaster
>>>
>>> # Create time series with monthly seasonality
>>> pattern = [10, 12, 15, 13, 11, 9, 8, 10, 12, 15, 13, 11]
>>> y = pl.DataFrame({
...     "time": pl.datetime_range(
...         start=datetime(2020, 1, 1), end=datetime(2022, 12, 1), interval="1mo", eager=True
...     ),
...     "value": pattern * 3,
... })
>>>
>>> # Fit seasonal forecaster
>>> forecaster = PatternSeasonalityForecaster(seasonality=12, method="average")
>>> forecaster.fit(y, forecasting_horizon=6)
PatternSeasonalityForecaster(seasonality=12)
>>>
>>> # Forecast next 6 months
>>> y_pred = forecaster.predict(forecasting_horizon=6)

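See Also

  • FourierSeasonalityForecaster (yohou.stationarity.seasonality.FourierSeasonalityForecaster): Fourier-based seasonality for smooth curves.
  • PolynomialTrendForecaster (yohou.stationarity.trend.PolynomialTrendForecaster): Polynomial trend estimation.
  • DecompositionPipeline (yohou.compose.decomposition_pipeline.DecompositionPipeline): Combines trend + seasonality + residual forecasters.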

Notes

  • Requires at least 2 complete seasonal cycles for "average"/"median" methods
  • "naive" method only requires 1 complete cycle
  • Works best with detrended data (consider using with differencing transformers)
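The data requirements in the first two notes can be sketched as a small standalone check. This is an illustration in plain Python, modeled on the validation shown in the source listing; `min_required_observations` and `check_enough_data` are hypothetical helper names, not part of the library.

```python
def min_required_observations(seasonality, method):
    """Minimum number of observations needed for a given method."""
    if method not in {"naive", "average", "median"}:
        raise ValueError(f"Unknown method: {method!r}")
    # "naive" needs one complete cycle; "average" and "median" need two.
    cycles = 1 if method == "naive" else 2
    return cycles * seasonality


def check_enough_data(n_observations, seasonality, method):
    """Raise ValueError if there are too few observations for the method."""
    required = min_required_observations(seasonality, method)
    if n_observations < required:
        raise ValueError(
            f"Insufficient data for method='{method}': need at least "
            f"{required} observations, got {n_observations}"
        )


check_enough_data(36, seasonality=12, method="average")  # passes silently
check_enough_data(18, seasonality=12, method="naive")    # passes silently
```

With 18 monthly observations, "naive" is accepted but "average" and "median" would raise, since they need at least 24.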

Source Code

class PatternSeasonalityForecaster(_BaseSeasonalityForecaster):
    """Forecast using seasonal pattern extraction and repetition.

    Learns seasonal patterns from historical data and repeats them into the
    future. Suitable for time series with strong periodic behavior.

    Parameters
    ----------
    seasonality : int
        Seasonal period length (e.g., 12 for monthly data with yearly seasonality,
        7 for daily data with weekly seasonality).
    method : {"naive", "average", "median"}, default="average"
        Method for aggregating seasonal patterns:
        - "naive": Use last complete cycle
        - "average": Mean across all cycles
        - "median": Median across all cycles (robust to outliers)
    target_transformer : BaseTransformer, optional
        Transformer for target variable.
    panel_strategy : {"global", "multivariate"}, default="global"
        How to handle panel data.  See `BaseForecaster` for details.

    Attributes
    ----------
    seasonal_pattern_ : pl.DataFrame
        Learned seasonal pattern (length = seasonality).

    Examples
    --------
    >>> import polars as pl
    >>> from datetime import datetime
    >>> from yohou.stationarity import PatternSeasonalityForecaster
    >>>
    >>> # Create time series with monthly seasonality
    >>> pattern = [10, 12, 15, 13, 11, 9, 8, 10, 12, 15, 13, 11]
    >>> y = pl.DataFrame({
    ...     "time": pl.datetime_range(
    ...         start=datetime(2020, 1, 1), end=datetime(2022, 12, 1), interval="1mo", eager=True
    ...     ),
    ...     "value": pattern * 3,
    ... })
    >>>
    >>> # Fit seasonal forecaster
    >>> forecaster = PatternSeasonalityForecaster(seasonality=12, method="average")
    >>> forecaster.fit(y, forecasting_horizon=6)
    PatternSeasonalityForecaster(seasonality=12)
    >>>
    >>> # Forecast next 6 months
    >>> y_pred = forecaster.predict(forecasting_horizon=6)

    See Also
    --------
    - [`FourierSeasonalityForecaster`][yohou.stationarity.seasonality.FourierSeasonalityForecaster] : Fourier-based seasonality for smooth curves.
    - [`PolynomialTrendForecaster`][yohou.stationarity.trend.PolynomialTrendForecaster] : Polynomial trend estimation.
    - [`DecompositionPipeline`][yohou.compose.decomposition_pipeline.DecompositionPipeline] : Combines trend + seasonality + residual forecasters.

    Notes
    -----
    - Requires at least 2 complete seasonal cycles for "average"/"median" methods
    - "naive" method only requires 1 complete cycle
    - Works best with detrended data (consider using with differencing transformers)

    """

    _parameter_constraints: dict = {
        **_BaseSeasonalityForecaster._parameter_constraints,
        "method": [StrOptions({"naive", "average", "median"})],
    }

    def __init__(
        self,
        seasonality: StrictInt,
        method: Literal["naive", "average", "median"] = "average",
        target_transformer=None,
        panel_strategy="global",
    ):
        super().__init__(seasonality=seasonality, target_transformer=target_transformer, panel_strategy=panel_strategy)
        self.method = method

    def _fit(
        self,
        y_t: pl.DataFrame | dict[str, pl.DataFrame],
        X_t: pl.DataFrame | dict[str, pl.DataFrame] | None,
        forecasting_horizon: StrictInt,
    ) -> None:
        """Extract seasonal pattern from transformed data.

        Parameters
        ----------
        y_t : pl.DataFrame or dict[str, pl.DataFrame]
            Transformed target time series.
        X_t : pl.DataFrame or dict[str, pl.DataFrame] or None
            Transformed features (unused).
        forecasting_horizon : int
            Number of steps ahead to forecast.

        Raises
        ------
        ValueError
            If insufficient data for specified method.

        """
        # Validate sufficient data for seasonality
        self._validate_method_requirements(y_t)

        # Extract seasonal pattern
        self.seasonal_pattern_ = self._extract_pattern(y_t)

    def _validate_method_requirements(self, y_t: pl.DataFrame | dict[str, pl.DataFrame]) -> None:
        """Validate sufficient data for the specified method.

        Parameters
        ----------
        y_t : pl.DataFrame or dict[str, pl.DataFrame]
            Transformed target time series.

        Raises
        ------
        ValueError
            If insufficient data for method.

        """
        min_required = self.seasonality
        if self.method in ["average", "median"]:
            min_required = 2 * self.seasonality

        # Handle panel data (dict of DataFrames)
        if isinstance(y_t, dict):
            for panel_group_name, y_t_group in y_t.items():
                assert isinstance(y_t_group, pl.DataFrame)
                if len(y_t_group) < min_required:
                    raise ValueError(
                        f"Insufficient data for group '{panel_group_name}' with method='{self.method}': "
                        f"need at least {min_required} observations "
                        f"({min_required // self.seasonality} complete cycles), got {len(y_t_group)}"
                    )
        # Handle global data (single DataFrame)
        elif len(y_t) < min_required:
            raise ValueError(
                f"Insufficient data for method='{self.method}': "
                f"need at least {min_required} observations "
                f"({min_required // self.seasonality} complete cycles), got {len(y_t)}"
            )

    def _extract_pattern(self, y_t: pl.DataFrame | dict[str, pl.DataFrame]) -> pl.DataFrame:
        """Extract seasonal pattern from data.

        Parameters
        ----------
        y_t : pl.DataFrame or dict[str, pl.DataFrame]
            Transformed target time series.

        Returns
        -------
        pl.DataFrame
            Seasonal pattern with length = seasonality.

        """
        # Non-panel data
        if self.groups_ is None:
            assert isinstance(y_t, pl.DataFrame)
            patterns = self._extract_pattern_one(y_t)

        # Panel data with pooled pattern (concatenate all groups)
        else:
            # Concatenate all panel group data vertically
            # In panel mode, y_t is a dict and subscript returns DataFrame
            assert isinstance(y_t, dict)
            all_groups_data = [y_t[group_name] for group_name in self.groups_]
            y_t_pooled = pl.concat(all_groups_data, how="vertical")
            patterns = self._extract_pattern_one(y_t_pooled)

        return patterns

    def _extract_pattern_one(self, y_t: pl.DataFrame) -> pl.DataFrame:
        """Extract seasonal pattern from a single DataFrame.

        Parameters
        ----------
        y_t : pl.DataFrame
            Transformed target time series.

        Returns
        -------
        pl.DataFrame
            Seasonal pattern with length = seasonality.

        """
        # Calculate number of complete cycles
        n_cycles = len(y_t) // self.seasonality

        if self.method == "naive":
            # Return last complete cycle
            start_idx = (n_cycles - 1) * self.seasonality
            end_idx = n_cycles * self.seasonality
            pattern = y_t[start_idx:end_idx]

        else:
            # Reshape into cycles and aggregate
            # Truncate to complete cycles only
            truncated_length = n_cycles * self.seasonality
            assert isinstance(truncated_length, int)
            y_truncated = y_t[:truncated_length]

            # Add cycle and position indices
            cycle_indices = [i // self.seasonality for i in range(truncated_length)]
            position_indices = [i % self.seasonality for i in range(truncated_length)]

            y_with_indices = y_truncated.with_columns(
                pl.Series("cycle", cycle_indices),
                pl.Series("position", position_indices),
            )

            # Group by position and aggregate
            if self.method == "average":
                pattern = y_with_indices.group_by("position").agg([
                    pl.col(c).mean() for c in y_t.columns if c != "time"
                ])
            else:  # median
                pattern = y_with_indices.group_by("position").agg([
                    pl.col(c).median() for c in y_t.columns if c != "time"
                ])

            # Sort by position to maintain order
            pattern = pattern.sort("position").select(cs.all().exclude(["position", "cycle"]))

        return pattern

    def _predict_one(
        self,
        groups: list[str],
        **params,
    ) -> pl.DataFrame:
        """Predict `fit_forecasting_horizon_` steps from the observation horizon.

        Parameters
        ----------
        groups : list of str
            Panel group names to predict for.
        **params : dict
            Metadata to route to nested estimators.

        Returns
        -------
        pl.DataFrame
            Predicted time series.

        """
        y_t_columns = list(self.local_y_t_schema_.keys())

        # Non-panel data
        if self.groups_ is None:
            # Get phase indices for predictions
            X_phases = self._get_time_indices(self.fit_forecasting_horizon_) % self.seasonality

            # Look up values from pattern
            assert isinstance(self.seasonal_pattern_, pl.DataFrame)
            y_pred = {}
            for col_name in self.seasonal_pattern_.columns:
                if col_name == "time":
                    continue

                # Extract values at specified phases
                pattern_values = self.seasonal_pattern_[col_name].to_list()
                pred_values = [pattern_values[phase] for phase in X_phases.to_list()]
                y_pred[col_name] = pred_values

            y_pred = pl.DataFrame(y_pred)

        # Panel data
        else:
            y_pred = []
            for panel_group_name in groups:
                # Get phase indices for this group
                X_phases = (
                    self._get_time_indices(self.fit_forecasting_horizon_, panel_group_name=panel_group_name)
                    % self.seasonality
                )

                # Get shared pooled pattern
                pattern = self.seasonal_pattern_

                # Look up values from pattern
                y_pred_group = {}
                for col_name in y_t_columns:
                    # Extract values at specified phases
                    pattern_values = pattern[col_name].to_list()
                    pred_values = [pattern_values[phase] for phase in X_phases.to_list()]
                    y_pred_group[f"{panel_group_name}__{col_name}"] = pred_values

                y_pred.append(pl.DataFrame(y_pred_group))

            y_pred = pl.concat(y_pred, how="horizontal")

        return self._add_time_columns(y_pred)
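For panel data, the listing above pools all groups by vertical concatenation and learns a single shared pattern. The following pure-Python sketch illustrates that idea for the "average" method; `pooled_average_pattern` and the group names are hypothetical, and the real class works on polars DataFrames rather than lists.

```python
from statistics import mean


def pooled_average_pattern(groups, seasonality):
    """Average pattern over all groups after stacking them end to end."""
    pooled = [value for series in groups.values() for value in series]
    n_cycles = len(pooled) // seasonality
    complete = pooled[: n_cycles * seasonality]
    return [mean(complete[pos::seasonality]) for pos in range(seasonality)]


# Each group holds two complete cycles of length 3.
aligned = {"north": [1, 2, 3, 1, 2, 3], "south": [3, 4, 5, 3, 4, 5]}
shared_pattern = pooled_average_pattern(aligned, seasonality=3)
print(shared_pattern)  # equals [2, 3, 4]

# In this sketch, positions are counted on the stacked series, so a group
# whose length is not a multiple of the seasonality shifts the phase of
# every group stacked after it.
ragged = {"north": [1, 2, 3, 1], "south": [1, 2, 3, 1]}
shifted_pattern = pooled_average_pattern(ragged, seasonality=3)
print(shifted_pattern)  # equals [1, 1.5, 2.5], not [1, 2, 3]
```

Under this reading of the pooling step, the shared pattern is most meaningful when every group contributes whole cycles that start at the same phase.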

Tutorials

The following example notebooks use this component:

  • Decomposition


    Data-Features

    Chain PolynomialTrendForecaster, PatternSeasonalityForecaster, and FourierSeasonalityForecaster inside DecompositionPipeline with component visualisation.


  • How to Tune Fourier Seasonality Terms


    Data-Features

    Explore how Fourier harmonic count affects seasonal fit quality, compare Fourier vs Pattern seasonality, and tune harmonics jointly with GridSearchCV.


  • How to Build Panel Feature Pipelines


    Panel-Data

    Combine ColumnForecaster, FeaturePipeline, FeatureUnion, and DecompositionPipeline on panel data with per-group scoring on KDD Cup air quality.


  • Forecast Visualization


    Visualization

    Visualise point forecasts from single and multiple models, decomposition pipeline components, and time weight decay functions with interactive Plotly.
