Forecaster Composition¶
Yohou provides four classes that compose forecasters into larger forecasting
structures. Each class is itself a full forecaster with the
fit/predict/observe/rewind lifecycle, not a transformer or preprocessing
step. They address situations where a single forecaster cannot handle the full
problem: additive components in the data, target columns with different dynamics,
features that must be forecast before the target, or panel groups with
fundamentally different patterns.
All four classes support panel data, integrate with hyperparameter search and cross-validation, and can be nested inside each other or wrapped by ensemble voters.
For composing transformers (feature pipelines, scaling chains, lag features), see Feature Pipelines.
DecompositionPipeline¶
DecompositionPipeline decomposes a time series into additive components by fitting forecasters in
sequence. Each forecaster models the residuals left by all previous forecasters,
and the final prediction is the sum of all component predictions:
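In symbols, using notation introduced here for illustration rather than taken from the library: let $y_t$ be the target, $k$ the number of forecasters, and $\hat{c}^{(i)}_t$ the output of the $i$-th forecaster. Assuming each forecaster is fit on the residual left by its predecessors, the chain reads:

$$
r^{(0)}_t = y_t, \qquad
r^{(i)}_t = r^{(i-1)}_t - \hat{c}^{(i)}_t \quad (i = 1, \dots, k), \qquad
\hat{y}_{T+h} = \sum_{i=1}^{k} \hat{c}^{(i)}_{T+h}
$$

Here $T$ is the last observed time step and $h$ the forecast step.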
The forecasters parameter takes a list of (name, forecaster) tuples. All
entries must be point forecasters (interval or class probability forecasters are
not supported because residuals from probabilistic outputs are not well defined):
```python
from yohou.compose import DecompositionPipeline
from yohou.stationarity import PolynomialTrendForecaster
from yohou.point import SeasonalNaive

# Forecasters are fitted in list order; each one models what the previous ones left over.
pipeline = DecompositionPipeline(forecasters=[
    ("trend", PolynomialTrendForecaster(degree=1)),
    ("seasonality", SeasonalNaive(seasonality=12)),
])
```
The first forecaster fits the raw data and produces a trend forecast. The second receives the residuals (original minus trend) and models what remains. Ordering matters: placing a trend model first, then a seasonal model, then a residual model follows the classical decompose-forecast-recompose pattern.
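The residual chain can be illustrated without the library. The following is a minimal sketch in plain Python on synthetic data. The helpers `fit_linear_trend` and `fit_seasonal_naive` are hypothetical stand-ins for the two components and are not how Yohou implements them:

```python
def fit_linear_trend(y):
    """Least-squares line through (t, y[t]); returns a function t -> trend value."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return lambda t: intercept + slope * t


def fit_seasonal_naive(values, period):
    """Repeat the last observed season; returns a function t -> seasonal value."""
    n = len(values)
    last_season = values[-period:]  # last_season[0] belongs to time index n - period
    return lambda t: last_season[(t - (n - period)) % period]


# Synthetic series: linear trend plus a repeating pattern of length 4.
pattern = [3.0, -1.0, -3.0, 1.0]
y = [10.0 + 2.0 * t + pattern[t % 4] for t in range(12)]

# Stage 1: the trend model sees the raw series.
trend = fit_linear_trend(y)

# Stage 2: the seasonal model sees only what the trend left over.
residuals = [v - trend(t) for t, v in enumerate(y)]
seasonal = fit_seasonal_naive(residuals, period=4)

# Recompose: the forecast is the sum of the component forecasts.
horizon = range(len(y), len(y) + 4)
forecast = [trend(t) + seasonal(t) for t in horizon]
```

Swapping the order would hand the raw, trending series to the seasonal model, which is why the trend component usually comes first.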
Multiplicative decomposition¶
For multiplicative relationships, pass target_transformer=LogTransformer().
This transforms the target into log-space where multiplication becomes addition,
applies the additive pipeline, and back-transforms the result:
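The identity behind this can be checked in plain Python. The sketch below uses only the standard library and made-up numbers; it does not call `LogTransformer` or any other Yohou API:

```python
import math

# Made-up multiplicative components: y = trend * season.
trend = [100.0, 110.0, 121.0, 133.1]
season = [1.2, 0.8, 1.2, 0.8]
y = [a * b for a, b in zip(trend, season)]

# In log-space the product becomes a sum, so an additive pipeline applies.
log_y = [math.log(v) for v in y]
log_sum = [math.log(a) + math.log(b) for a, b in zip(trend, season)]

# Back-transforming the additive result recovers the original scale.
recomposed = [math.exp(v) for v in log_sum]
```

A log transform requires strictly positive target values.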
Feature transformation¶
The optional feature_transformer parameter applies a transformer to exogenous
features once at the pipeline level before any forecaster receives them. All
component forecasters share the same transformed features, so feature
preprocessing does not need to be duplicated inside each component.
Diagnostic residuals¶
Setting store_residuals=True saves the intermediate residuals after each
component in pipeline.residuals_, a dictionary mapping forecaster name to a
Polars DataFrame. This is useful for inspecting whether a component successfully
captured its intended pattern or whether signal remains for downstream
components to model.
ColumnForecaster¶
ColumnForecaster assigns different forecasters to different target columns, then concatenates
predictions horizontally. Each entry in the forecasters list is a
(name, forecaster, columns) tuple, where columns is a string or list of
strings identifying which target columns that forecaster is responsible for.
This is useful when target columns have fundamentally different characteristics. A slow-moving trend variable might work best with a linear model while a volatile signal needs gradient boosting. Forcing a single model to handle both can produce mediocre predictions for each.
Remainder handling¶
Columns not claimed by any forecaster are handled by the remainder parameter:
- "drop" (default): unclaimed columns are excluded from predictions.
- "passthrough": unclaimed columns are passed through unchanged.
- A forecaster instance: unclaimed columns are forecast by that model.
Each column may be assigned to only one forecaster. Overlapping assignments raise an error.
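The assignment rules can be sketched as a small validation routine. This is an illustration only: `resolve_columns` is a hypothetical helper, the tuples omit the forecaster element for brevity, and the exception types are assumptions rather than what the library raises:

```python
def resolve_columns(assignments, all_columns):
    """Map each claimed column to its forecaster name and list the unclaimed ones.

    assignments: list of (name, columns) pairs, where columns is a str or list of str.
    """
    claimed = {}
    for name, columns in assignments:
        if isinstance(columns, str):
            columns = [columns]  # a single column name is allowed
        for column in columns:
            if column not in all_columns:
                raise KeyError(f"unknown target column: {column!r}")
            if column in claimed:
                raise ValueError(
                    f"column {column!r} is claimed by both "
                    f"{claimed[column]!r} and {name!r}"
                )
            claimed[column] = name
    # Whatever is left is what the remainder setting would have to deal with.
    unclaimed = [c for c in all_columns if c not in claimed]
    return claimed, unclaimed


all_columns = ["revenue", "units", "price", "returns"]
claimed, unclaimed = resolve_columns(
    [("sales_model", ["revenue", "units"]), ("price_model", "price")],
    all_columns,
)
```

Here `returns` is the only unclaimed column, so it is the one the remainder setting would drop, pass through, or forecast.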
Exogenous features and forecaster types¶
All forecasters receive the full exogenous data (X_actual, X_future,
X_forecast), but each sees only its assigned target columns in y. This means
a feature that is relevant to multiple targets only needs to appear once.
Because ColumnForecaster wraps arbitrary forecasters, it supports point
predictions, interval predictions, and class probability predictions. The
available methods depend on the capabilities of the inner forecasters. Setting
n_jobs enables parallel fitting across column groups.
When verbose_feature_names_out=True, output columns are prefixed with the
forecaster name (for example, sales_model__revenue), which avoids ambiguity
when multiple forecasters produce columns with the same name.
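The naming scheme amounts to joining the forecaster name and the column name with a double underscore. A minimal sketch, with `output_names` as a hypothetical helper rather than a library function:

```python
def output_names(forecaster_name, columns, verbose=True):
    """Prefix each column with '<forecaster_name>__' when verbose is True."""
    if not verbose:
        return list(columns)
    return [f"{forecaster_name}__{column}" for column in columns]


verbose_names = output_names("sales_model", ["revenue"])
plain_names = output_names("sales_model", ["revenue"], verbose=False)
```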
ForecastedFeatureForecaster¶
ForecastedFeatureForecaster is a two-stage forecaster for scenarios where exogenous features (X_actual)
are available during training but not at prediction time. It chains a
feature_forecaster that predicts future feature values with a
target_forecaster that uses those predicted features to forecast y. The class
requires X_actual at fit time and raises a ValueError if it is not provided.
```mermaid
graph LR
    subgraph fit
        direction TB
        A["X_actual"] --> B["feature_fcstr"]
        B -->|strategy| C["target_fcstr"]
        D["y"] --> C
    end
    subgraph predict
        direction TB
        E["target_fcstr"] --> F["ŷ_pred"]
    end
    fit ~~~ predict
```
The distribution shift problem¶
The core challenge is a training/prediction mismatch. At prediction time the
target forecaster receives forecasted (imperfect) feature values, but during
training the real feature values are available. Training on real features and
predicting with forecasted ones can degrade accuracy. The strategy parameter
controls how this is handled.
Training strategies¶
"actual" (default) fits the feature forecaster on the full X_actual, then
fits the target forecaster on the full y with the real X_actual. This is the
simplest approach but creates a distribution mismatch: the target forecaster
trains on perfect features and predicts with imperfect ones.
"predicted" splits the training data at position
int(len(y) * split_ratio). The feature forecaster trains on the first portion
and predicts features for the second. The target forecaster then trains on the
second portion using those predicted (imperfect) features. This avoids the
distribution shift but sacrifices some training data. The split_ratio
parameter (default 0.5) controls the split point; setting it lower gives the
target forecaster more training data at the cost of a less accurate feature
forecaster.
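The split arithmetic for the "predicted" strategy can be sketched as follows. `split_for_predicted` is a hypothetical helper that only reproduces the `int(len(y) * split_ratio)` rule described above, and the range check is an assumption made for this sketch:

```python
def split_for_predicted(n_samples, split_ratio=0.5):
    """Return index ranges for the feature-forecaster and target-forecaster portions."""
    if not 0.0 < split_ratio < 1.0:
        raise ValueError("split_ratio must lie strictly between 0 and 1")
    split = int(n_samples * split_ratio)
    feature_part = range(0, split)         # feature forecaster trains here
    target_part = range(split, n_samples)  # target forecaster trains here
    return feature_part, target_part


feature_half, target_half = split_for_predicted(100, split_ratio=0.5)
feature_low, target_low = split_for_predicted(100, split_ratio=0.25)
```

With 100 samples, lowering the ratio from 0.5 to 0.25 moves 25 samples from the feature forecaster to the target forecaster.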
"rewind" fits the feature forecaster on all data, rewinds it to the
observation horizon, then predicts features from the rewind point onward. The
target forecaster trains on those predicted features. This approach uses all data
for feature learning while still exposing the target forecaster to imperfect
features, balancing data efficiency with distribution alignment.
Prediction capabilities¶
ForecastedFeatureForecaster delegates all prediction calls to the target
forecaster. If the target forecaster supports interval predictions or class
probability predictions, those methods become available on the composite. The
feature forecaster always produces point predictions regardless.
For the data-shaping perspective on exogenous features (the three types X_actual,
X_future, and X_forecast, plus step-indexed columns), see
Exogenous Features.
LocalPanelForecaster¶
LocalPanelForecaster fits a separate clone of a forecaster per panel group rather than a single
global model. The input must be panel data (columns with the group__column
naming convention). Each clone sees unprefixed, single-series data: a group named
store_a with column store_a__sales receives a DataFrame with a plain sales
column.
This is appropriate when groups have fundamentally different dynamics (for example, products with unrelated demand patterns) and a global model would blur the distinctions. The trade-off is that each group trains on only its own data, which can be a problem for groups with short histories. Global models share information across groups at the cost of missing group-specific patterns.
Exogenous feature routing¶
Exogenous features can be panel-specific (prefixed, like store_a__temperature)
or global (unprefixed, like holiday_flag). LocalPanelForecaster extracts
each group's prefixed columns, strips the prefixes, and combines them with any
global columns. Each clone therefore receives a clean, unprefixed feature set
tailored to its group.
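The routing rule can be sketched on column names alone. `features_for_group` is a hypothetical helper, and it assumes that a column is global exactly when its name contains no `__` separator:

```python
def features_for_group(column_names, group):
    """Map original column names to the unprefixed names one group's clone would see."""
    prefix = f"{group}__"
    routed = {}
    for name in column_names:
        if name.startswith(prefix):
            routed[name] = name[len(prefix):]  # panel-specific: strip the prefix
        elif "__" not in name:
            routed[name] = name                # global: shared by every group
        # columns prefixed with another group's name are skipped
    return routed


columns = ["store_a__temperature", "store_b__temperature", "holiday_flag"]
store_a = features_for_group(columns, "store_a")
store_b = features_for_group(columns, "store_b")
```

Both groups end up with the same unprefixed names (`temperature`, `holiday_flag`), which is what lets one forecaster definition be cloned across groups.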
Parallel fitting and prediction types¶
Setting n_jobs enables parallel fitting across groups, which is helpful when the
number of groups is large. The class supports whatever prediction types the wrapped
forecaster supports: point predictions are always available, and interval
predictions are available if the inner forecaster provides them.
After fitting, the per-group clones are accessible through the forecasters_
attribute, a dictionary mapping group names to fitted forecaster instances.
State Propagation Through Composite Forecasters¶
When you call observe() on a composite forecaster, the new data flows through
to each sub-component in a pattern that mirrors fit and predict. Understanding
this flow helps predict what will happen when new data arrives in a production
observe/predict loop.
DecompositionPipeline processes observations in the same order as training.
Each forecaster in the chain predicts its component, subtracts it from the
incoming data, and passes the residual to the next. This preserves the additive
decomposition: calling observe() then predict() produces the same result as
re-fitting on the extended data, as long as the components remain stable. The
observe_predict() method handles this residual decomposition internally,
ensuring rolling evaluation produces correct multi-component predictions.
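The residual hand-off during observation can be illustrated with a toy chain. Everything in this sketch is hypothetical: `RunningMean` and `observe_chain` are not Yohou classes, and the choice to predict before updating each component is an assumption about ordering:

```python
class RunningMean:
    """Toy component that forecasts the mean of everything it has seen."""

    def fit(self, values):
        self.history = list(values)
        return self

    def predict(self, steps):
        mean = sum(self.history) / len(self.history)
        return [mean] * steps

    def observe(self, values):
        self.history.extend(values)


def observe_chain(components, new_values):
    """Feed new data through the chain, passing residuals downstream."""
    incoming = list(new_values)
    for component in components:
        predicted = component.predict(len(incoming))  # forecast made before updating
        component.observe(incoming)
        incoming = [v - p for v, p in zip(incoming, predicted)]
    return incoming


level = RunningMean().fit([10.0, 10.0, 10.0])
remainder = RunningMean().fit([0.0, 0.0, 0.0])  # fitted on the first stage's residuals
leftover = observe_chain([level, remainder], [13.0, 13.0])
```

The first component sees the raw observations, while the second sees only the part the first failed to predict, mirroring the order used at fit time.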
ColumnForecaster routes each target column to its assigned forecaster.
All forecasters receive the full exogenous data, but each observes only its own
target columns. Calling observe() independently updates each column's model
without cross-contamination.
ForecastedFeatureForecaster chains observations in two stages. The feature
forecaster observes X_actual columns as its target. The target forecaster then
observes y together with the actual feature values. This maintains the
two-stage contract at observation time, not just during initial fitting.
LocalPanelForecaster dispatches observe() to each group's clone
with only the columns belonging to that group. Each group maintains independent
state, so observing new data for one group does not affect others.
Calling rewind() reverses these operations across all composite types,
restoring each sub-component to its previous observation window. This is useful
for what-if analysis: observe new data, predict, rewind, try different data,
predict again.
For the metadata routing infrastructure that enables these operations to flow through search and cross-validation objects, see Metadata Routing.
Connections¶
Feature Pipelines covers composing transformers rather than forecasters. Exogenous Features explains the three exogenous parameter types and step-indexed columns that ForecastedFeatureForecaster is designed around. Ensemble Forecasting describes combining forecasters by voting rather than by decomposition or column assignment. For how parameters like time_weight flow through pipelines and search objects, see Metadata Routing.
For practical recipes, see How to Compose Feature Pipelines and How to Combine Forecasters with Ensembles. The compose API is documented in the yohou.compose reference.