How to Combine Forecasters with Ensembles¶
This guide shows you how to combine multiple forecasters into a voting ensemble for more stable predictions.
Prerequisites¶
- yohou installed (Installation)
- Familiarity with fitting and predicting with point or interval forecasters (Getting Started)
Try it interactively
- Build classification ensembles with VotingClassProbaForecaster using soft and hard voting strategies.
- Build interval ensembles with VotingIntervalForecaster using envelope, mean, and median aggregation strategies.
- Build point ensembles with VotingPointForecaster using mean, weighted, and median aggregation strategies.
1. Create a Point Ensemble¶
Pass named (name, forecaster) tuples to VotingPointForecaster:
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor
from yohou.ensemble import VotingPointForecaster
from yohou.point import PointReductionForecaster
from yohou.datasets import fetch_electricity_demand
# Load the example dataset
data = fetch_electricity_demand()
y = data.frame
# Wrap each scikit-learn regressor as a point forecaster
ridge = PointReductionForecaster(estimator=Ridge())
rf = PointReductionForecaster(estimator=RandomForestRegressor(n_estimators=50))
# Combine the base forecasts by averaging them
ensemble = VotingPointForecaster(
    forecasters=[("ridge", ridge), ("rf", rf)],
    method="mean",
)
# Fit for a 24-step horizon, then forecast
ensemble.fit(y, forecasting_horizon=24)
y_pred = ensemble.predict()
To favor one model over another, pass weights:
weighted = VotingPointForecaster(
    forecasters=[("ridge", ridge), ("rf", rf)],
    method="mean",
    weights=[0.3, 0.7],  # favor random forest
)
Set method="median" for robustness against outlier predictions (weights
are ignored with median aggregation).
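To make the difference between the aggregation methods concrete, here is a minimal standalone sketch in plain Python. It does not call yohou: the aggregate helper, the model names, and the numbers are hypothetical, and dividing by the sum of the weights is an assumption about how weights are normalized.

```python
from statistics import mean, median

# Hypothetical predictions for a 3-step horizon from three base models.
# The "outlier" model is badly wrong at the first step.
predictions = {
    "ridge": [100.0, 102.0, 104.0],
    "rf": [110.0, 108.0, 106.0],
    "outlier": [400.0, 105.0, 105.0],
}


def aggregate(preds, method="mean", weights=None):
    """Combine per-model forecasts step by step (illustrative helper)."""
    steps = list(zip(*preds.values()))  # one tuple of model outputs per step
    if method == "median":
        # Weights are deliberately unused for the median.
        return [median(step) for step in steps]
    if weights is None:
        return [mean(step) for step in steps]
    total = sum(weights)  # assumption: weights are normalized by their sum
    return [sum(w * v for w, v in zip(weights, step)) / total for step in steps]


mean_pred = aggregate(predictions, "mean")
median_pred = aggregate(predictions, "median")
weighted_pred = aggregate(predictions, "mean", weights=[0.3, 0.7, 0.0])
```

In this toy data the single bad first-step forecast pulls the plain mean above 200, while the median stays at 110, which is the kind of robustness the median method is meant to provide.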
2. Ensemble Interval Forecasters¶
VotingIntervalForecaster combines prediction intervals from
multiple interval forecasters such as SplitConformalForecaster:
from yohou.ensemble import VotingIntervalForecaster
from yohou.interval import SplitConformalForecaster
interval_ensemble = VotingIntervalForecaster(
    forecasters=[
        ("conf_ridge", SplitConformalForecaster(
            point_forecaster=PointReductionForecaster(estimator=Ridge()),
        )),
        ("conf_rf", SplitConformalForecaster(
            point_forecaster=PointReductionForecaster(estimator=RandomForestRegressor()),
        )),
    ],
    method="envelope",  # most conservative: min of lowers, max of uppers
)
interval_ensemble.fit(y, forecasting_horizon=24, coverage_rates=[0.9])
y_interval = interval_ensemble.predict_interval()
Available method values: "envelope" (default, most conservative), "mean", "median".
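The three interval strategies can be illustrated with a small standalone sketch in plain Python. It does not call yohou: combine_interval and the bounds are hypothetical. The envelope rule follows the description above (minimum of the lower bounds, maximum of the upper bounds); applying mean and median to the lower and upper bounds separately is an assumption.

```python
from statistics import mean, median

# Hypothetical (lower, upper) bounds at one forecast step from three base models.
intervals = [(90.0, 110.0), (95.0, 120.0), (85.0, 105.0)]


def combine_interval(bounds, method="envelope"):
    """Combine prediction intervals from several models (illustrative helper)."""
    lowers = [lo for lo, _ in bounds]
    uppers = [hi for _, hi in bounds]
    if method == "envelope":
        # Widest possible interval: covers every base interval.
        return min(lowers), max(uppers)
    if method == "mean":
        return mean(lowers), mean(uppers)
    if method == "median":
        return median(lowers), median(uppers)
    raise ValueError(f"unknown method: {method!r}")


envelope = combine_interval(intervals, "envelope")
mean_interval = combine_interval(intervals, "mean")
median_interval = combine_interval(intervals, "median")
```

Because the envelope contains every base interval, it is never narrower than the mean or median interval, which is why it is the most conservative choice.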
3. Ensemble Classification Forecasters¶
VotingClassProbaForecaster combines class probability predictions:
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from yohou.ensemble import VotingClassProbaForecaster
from yohou.class_proba import ClassProbaReductionForecaster
from yohou.datasets import fetch_air_quality_classification
# Load the example classification target
class_data = fetch_air_quality_classification()
y_class = class_data.y
class_ensemble = VotingClassProbaForecaster(
    forecasters=[
        ("lr", ClassProbaReductionForecaster(estimator=LogisticRegression())),
        ("rf", ClassProbaReductionForecaster(estimator=RandomForestClassifier())),
    ],
    method="soft",  # average (optionally weighted) of class probabilities
)
class_ensemble.fit(y_class, forecasting_horizon=24)
y_proba = class_ensemble.predict_class_proba()
Use method="hard" for majority voting: each base model votes for its most probable class (argmax), and the ensemble takes the most common vote (mode).
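Soft and hard voting can disagree, as the following standalone sketch in plain Python shows. It does not call yohou: the class labels, the probabilities, and the helper functions are hypothetical, and how the library breaks ties or represents the hard-voting result is not covered here.

```python
from collections import Counter

classes = ["good", "moderate", "poor"]

# Hypothetical class probabilities at one forecast step from three base models.
probas = [
    [0.45, 0.40, 0.15],
    [0.45, 0.40, 0.15],
    [0.05, 0.90, 0.05],
]


def soft_vote(probas):
    """Average the probability of each class across models."""
    n = len(probas)
    return [sum(col) / n for col in zip(*probas)]


def hard_vote(probas):
    """Take each model's argmax, then return the most common class index."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in probas]
    return Counter(votes).most_common(1)[0][0]


soft_proba = soft_vote(probas)
soft_label = classes[max(range(len(classes)), key=soft_proba.__getitem__)]
hard_label = classes[hard_vote(probas)]
```

Here two models lean slightly toward "good" while one is very confident in "moderate". Hard voting counts heads and picks "good"; soft voting takes confidence into account and picks "moderate".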
4. Speed Up with Parallel Fitting¶
All voting forecasters accept n_jobs to fit base models in parallel:
ensemble = VotingPointForecaster(
    forecasters=[("ridge", ridge), ("rf", rf)],
    n_jobs=-1,  # use all available cores
)
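Parallel fitting works because the base models are independent of one another. The sketch below illustrates the idea with the standard library only. It is not how yohou is implemented: the two model classes and fit_all are hypothetical, and a thread pool is used purely to keep the example self-contained (scikit-learn's own voting estimators, for comparison, delegate to joblib).

```python
import os
from concurrent.futures import ThreadPoolExecutor


class LastValueModel:
    """Hypothetical stand-in for a base forecaster: remembers the last value."""

    def fit(self, y):
        self.value_ = y[-1]
        return self


class MeanModel:
    """Hypothetical stand-in for a base forecaster: remembers the mean."""

    def fit(self, y):
        self.value_ = sum(y) / len(y)
        return self


def fit_all(named_models, y, n_jobs=1):
    """Fit independent models concurrently, preserving their order."""
    # Mirror the n_jobs=-1 convention: use every available core.
    workers = (os.cpu_count() or 1) if n_jobs == -1 else max(1, n_jobs)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        fitted = list(pool.map(lambda item: item[1].fit(y), named_models))
    return [(name, model) for (name, _), model in zip(named_models, fitted)]


y = [1.0, 2.0, 3.0, 6.0]
fitted = fit_all([("last", LastValueModel()), ("mean", MeanModel())], y, n_jobs=-1)
fitted_values = {name: model.value_ for name, model in fitted}
```

The speed-up from parallel fitting is largest when the base models take similar time to fit; with one slow model and several fast ones, total time is still bounded by the slowest fit.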
5. Use with Panel Data¶
All voting forecasters support panel data without extra configuration. Pass a
DataFrame whose panel columns use the __ separator (for example,
T187__tourists); each base forecaster receives the full panel, and
aggregation happens per group:
from yohou.datasets import fetch_tourism_monthly
# Keep the time column and three panel series, dropping rows with nulls
bunch = fetch_tourism_monthly()
y_panel = bunch.frame.select(
    ["time", "T187__tourists", "T188__tourists", "T189__tourists"]
).drop_nulls()
# Fresh, unfitted base forecasters for the panel ensemble
ridge = PointReductionForecaster(estimator=Ridge())
rf = PointReductionForecaster(estimator=RandomForestRegressor(n_estimators=50))
panel_ensemble = VotingPointForecaster(
    forecasters=[("ridge", ridge), ("rf", rf)],
)
panel_ensemble.fit(y_panel, forecasting_horizon=12)
y_pred_panel = panel_ensemble.predict()
The output contains one column per group, each with the ensemble's aggregated prediction.
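As a rough illustration of the naming convention, the following standalone sketch groups column names by the text before the __ separator. It does not call yohou: group_panel_columns is a hypothetical helper, and treating the prefix as the group identifier is an assumption based on the column names used in this example.

```python
from collections import defaultdict

columns = ["time", "T187__tourists", "T188__tourists", "T189__tourists"]


def group_panel_columns(columns, sep="__"):
    """Map each group prefix to its columns; columns without sep are skipped."""
    groups = defaultdict(list)
    for col in columns:
        if sep in col:
            group, _variable = col.split(sep, 1)
            groups[group].append(col)
    return dict(groups)


groups = group_panel_columns(columns)
```

The time column has no separator, so it is left out of the grouping; each remaining column lands in its own group (T187, T188, T189).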
See Working with Panel Data for panel data preparation and forecasting.
See Also¶
- Ensemble Forecasting: theory and aggregation formulas
- How to Evaluate Forecast Accuracy
- API Reference: yohou.ensemble