How to Tune Forecaster Hyperparameters¶
This guide shows you how to find optimal hyperparameters for any yohou forecaster using cross-validated search with temporal splitters.
Prerequisites¶
- yohou installed (Installation)
- Familiarity with fitting and predicting (Getting Started)
Try it interactively
- Tune forecaster hyperparameters with GridSearchCV and RandomizedSearchCV using temporal cross-validation splitters and result scatter visualisation.
- Evaluate hyperparameter configurations against multiple metrics simultaneously with dict-of-scorers, refit strategies, and Pareto-optimal selection.
- Time series cross-validation on panel data with GridSearchCV, selective group observation, rewind operations, and groupwise performance comparison.
1. Define a Forecaster and Parameter Grid¶
Use double underscore (__) to refer to nested parameters inside a
forecaster (following scikit-learn convention):
from sklearn.linear_model import Ridge
from yohou.point import PointReductionForecaster
from yohou.metrics import MeanAbsoluteError
from yohou.model_selection import GridSearchCV, ExpandingWindowSplitter, train_test_split
from yohou.datasets import fetch_electricity_demand
data = fetch_electricity_demand()
y = data.frame
y_train, y_test = train_test_split(y, test_size=48)
forecaster = PointReductionForecaster(estimator=Ridge())
param_grid = {"estimator__alpha": [0.01, 0.1, 1.0, 10.0]}
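To make the double-underscore convention and the size of a grid concrete, here is a minimal standard-library sketch. It is not yohou's or scikit-learn's implementation: the helpers expand_grid and split_nested are hypothetical names introduced for illustration, and the second grid entry (Ridge's fit_intercept) is added only to show how combinations multiply.

```python
from itertools import product

# Illustrative grid; "estimator__fit_intercept" is added here only to show
# how a second parameter multiplies the number of candidates.
param_grid = {
    "estimator__alpha": [0.01, 0.1, 1.0, 10.0],
    "estimator__fit_intercept": [True, False],
}


def expand_grid(grid):
    """Yield one dict per combination of the listed values (hypothetical helper)."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def split_nested(name):
    """Split 'component__param' into ('component', 'param') (hypothetical helper)."""
    component, _, param = name.partition("__")
    return component, param


candidates = list(expand_grid(param_grid))
print(len(candidates))  # 4 alpha values x 2 intercept settings
print(split_nested("estimator__alpha"))
```

Every extra parameter multiplies the number of candidates, and each candidate is fitted once per cross-validation split, so the total number of fits grows quickly.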
2. Choose a Splitter¶
Use ExpandingWindowSplitter to simulate accumulating historical data.
Use SlidingWindowSplitter if you want a fixed training window instead:
from yohou.model_selection import ExpandingWindowSplitter
splitter = ExpandingWindowSplitter(n_splits=5, test_size=24)
Set test_size to match your forecasting horizon. n_splits controls
how many train/test windows are evaluated per parameter combination.
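The difference between the two splitters is easiest to see as index ranges. The sketch below is an illustration under stated assumptions, not yohou's actual splitting logic: it assumes test windows are consecutive, non-overlapping, and packed against the end of the series, and temporal_splits with its train_size argument is a hypothetical helper.

```python
def temporal_splits(n_samples, n_splits, test_size, train_size=None):
    """Return (train_range, test_range) pairs in time order.

    Assumption: test windows are consecutive and end at the last sample.
    train_size=None gives an expanding window; an integer gives a sliding one.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("series too short for the requested splits")
    splits = []
    for i in range(n_splits):
        test_start = first_test_start + i * test_size
        if train_size is None:
            train_start = 0  # expanding: always train from the beginning
        else:
            train_start = max(0, test_start - train_size)  # sliding: fixed length
        train = range(train_start, test_start)
        test = range(test_start, test_start + test_size)
        splits.append((train, test))
    return splits


expanding = temporal_splits(n_samples=240, n_splits=5, test_size=24)
sliding = temporal_splits(n_samples=240, n_splits=5, test_size=24, train_size=48)
for (exp_train, test), (sl_train, _) in zip(expanding, sliding):
    print(
        f"test=[{test.start}, {test.stop})  "
        f"expanding train=[{exp_train.start}, {exp_train.stop})  "
        f"sliding train=[{sl_train.start}, {sl_train.stop})"
    )
```

In both variants the training indices always precede the test indices, which is what keeps the evaluation free of look-ahead.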
3. Run Grid Search¶
Pass the forecaster, parameter grid, scorer, and splitter to
GridSearchCV:
search = GridSearchCV(
    forecaster=forecaster,
    param_grid=param_grid,
    scoring=MeanAbsoluteError(),
    cv=splitter,
    n_jobs=-1,
    refit=True,
)
search.fit(y_train, forecasting_horizon=24)
Setting refit=True refits the best forecaster on the full training set so
search.predict() is ready immediately after fit.
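Conceptually, an exhaustive search scores every candidate on every temporal fold, averages the fold scores, and keeps the best candidate. The following standard-library sketch shows that loop with a toy moving-average forecaster. It is an illustration only: the toy forecaster, the grid_search helper, and the end-anchored fold layout are assumptions, and the real GridSearchCV additionally handles cloning, parallelism, and refitting.

```python
from statistics import mean


def moving_average_forecast(history, window, horizon):
    """Toy forecaster: repeat the mean of the last `window` observations."""
    level = mean(history[-window:])
    return [level] * horizon


def mean_absolute_error(y_true, y_pred):
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def grid_search(y, windows, n_splits, test_size):
    """Score each candidate on each expanding fold; lowest mean MAE wins."""
    first_test_start = len(y) - n_splits * test_size
    results = {}
    for window in windows:
        fold_scores = []
        for i in range(n_splits):
            start = first_test_start + i * test_size
            train, test = y[:start], y[start:start + test_size]
            pred = moving_average_forecast(train, window, test_size)
            fold_scores.append(mean_absolute_error(test, pred))
        results[window] = mean(fold_scores)
    best = min(results, key=results.get)
    return best, results


# A linear trend: shorter windows track the latest level more closely.
y = [float(t) for t in range(40)]
best_window, scores = grid_search(y, windows=[1, 2, 4], n_splits=3, test_size=4)
print(best_window, scores)
```

On this trending series the shortest window has the lowest average error, so it is selected. A refit step would then fit that configuration once more on the whole training series.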
4. Predict with the Best Model¶
When refit=True, the search object acts as a fitted forecaster, so you can call search.predict() on it directly.
Access the winning configuration through the best_params_ and best_score_ attributes.
5. Inspect and Visualize Results¶
cv_results_ contains per-fold scores for every parameter combination:
import polars as pl
results = pl.DataFrame(search.cv_results_)
print(results.select(["params", "mean_test_score", "rank_test_score"]))
Use plot_cv_results_scatter
to visualize how score changes across parameter values:
from yohou.plotting import plot_cv_results_scatter
fig = plot_cv_results_scatter(search.cv_results_, param_name="estimator__alpha")
fig.show()
Look for parameter values where the score flattens or reaches a minimum to identify the best operating region.
6. Use RandomizedSearchCV for Large Spaces¶
When the grid has many dimensions or continuous ranges, switch to
RandomizedSearchCV,
which samples n_iter random combinations instead of evaluating every one:
from scipy.stats import loguniform
from yohou.model_selection import RandomizedSearchCV
param_distributions = {
    "estimator__alpha": loguniform(1e-3, 1e3),
}
search = RandomizedSearchCV(
    forecaster=forecaster,
    param_distributions=param_distributions,
    scoring=MeanAbsoluteError(),
    cv=splitter,
    n_iter=20,
    n_jobs=-1,
    refit=True,
    random_state=42,
)
search.fit(y_train, forecasting_horizon=24)
n_iter controls the number of parameter settings sampled. Use
GridSearchCV when total combinations are small (< 50) and
RandomizedSearchCV when the space is large or continuous.
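A log-uniform distribution suits regularisation strengths such as alpha because it spreads samples evenly across orders of magnitude rather than across raw values. The sketch below reproduces the idea with the standard library only. It does not call scipy or yohou: sample_loguniform is a hypothetical helper, and the seeded random.Random instance merely plays the role of random_state.

```python
import math
import random


def sample_loguniform(low, high, rng):
    """Draw one value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(42)  # fixed seed so the draws are reproducible
alphas = [sample_loguniform(1e-3, 1e3, rng) for _ in range(20)]

# Count how many of the 20 draws land in each decade between 1e-3 and 1e3.
per_decade = {}
for alpha in alphas:
    decade = math.floor(math.log10(alpha))
    per_decade[decade] = per_decade.get(decade, 0) + 1

print(sorted(per_decade.items()))
```

With a plain uniform distribution on the same interval, almost every draw would fall between 100 and 1000, and small alpha values would rarely be tried.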
7. Evaluate with Multiple Metrics¶
Pass a dict of scorers to scoring and set refit to the scorer name used
for selecting the best model. For example, combine MeanAbsoluteError and RootMeanSquaredError:
from yohou.metrics import RootMeanSquaredError
search = GridSearchCV(
    forecaster=forecaster,
    param_grid=param_grid,
    scoring={"mae": MeanAbsoluteError(), "rmse": RootMeanSquaredError()},
    cv=splitter,
    refit="mae",
)
search.fit(y_train, forecasting_horizon=24)
results = pl.DataFrame(search.cv_results_)
print(results.select(["params", "mean_test_mae", "mean_test_rmse"]))
All scorers are evaluated on every fold, but only the one named in refit
determines which parameters are selected as best.
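The choice of refit metric matters because different metrics can rank the same candidates differently. The following standard-library sketch uses two made-up forecasts to show this. The data, the candidate names, and the select_best helper are hypothetical, and it assumes that lower is better for both error metrics.

```python
import math


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


scorers = {"mae": mean_absolute_error, "rmse": root_mean_squared_error}

y_true = [0.0, 0.0, 0.0, 0.0]
# Hypothetical forecasts from two candidate configurations.
predictions = {
    "candidate_a": [2.0, 2.0, 2.0, 2.0],  # consistently off by 2
    "candidate_b": [0.0, 0.0, 0.0, 6.0],  # mostly exact, one large miss
}

# Every scorer is evaluated for every candidate.
results = {
    name: {metric: scorer(y_true, y_pred) for metric, scorer in scorers.items()}
    for name, y_pred in predictions.items()
}


def select_best(results, refit):
    """Pick the candidate with the lowest value of the metric named by `refit`."""
    return min(results, key=lambda name: results[name][refit])


print(select_best(results, refit="mae"))
print(select_best(results, refit="rmse"))
```

RMSE penalises the single large miss more heavily than MAE does, so the two metrics disagree here. Pick the refit metric that reflects the cost of errors in your application.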
See Also¶
- Choose a Forecasting Method: select a forecaster before tuning
- Evaluate Forecast Accuracy: understand the metrics used for scoring
- Extensions: yohou-optuna provides OptunaSearchCV for Bayesian hyperparameter search
- yohou.model_selection API reference