How to Tune Forecaster Hyperparameters

This guide shows you how to find optimal hyperparameters for any yohou forecaster using cross-validated search with temporal splitters.

Prerequisites

Try it interactively

How to Run Hyperparameter Search

Tune forecaster hyperparameters with GridSearchCV and RandomizedSearchCV using temporal cross-validation splitters and result scatter visualisation.

How to Search with Multiple Metrics

Evaluate hyperparameter configurations against multiple metrics simultaneously with dict-of-scorers, refit strategies, and Pareto-optimal selection.

How to Run Panel Cross-Validation

Time series cross-validation on panel data with GridSearchCV, selective group observation, rewind operations, and groupwise performance comparison.


1. Define a Forecaster and Parameter Grid

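Use a double underscore (__) to address nested parameters inside a forecaster, following the scikit-learn convention. Here, estimator__alpha targets the alpha parameter of the Ridge model wrapped by the forecaster: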

from sklearn.linear_model import Ridge
from yohou.point import PointReductionForecaster
from yohou.metrics import MeanAbsoluteError
from yohou.model_selection import GridSearchCV, ExpandingWindowSplitter, train_test_split
from yohou.datasets import fetch_electricity_demand

data = fetch_electricity_demand()
y = data.frame

y_train, y_test = train_test_split(y, test_size=48)

forecaster = PointReductionForecaster(estimator=Ridge())

param_grid = {"estimator__alpha": [0.01, 0.1, 1.0, 10.0]}
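
To see how a grid grows as parameters are added, the sketch below expands a dict-of-lists grid into explicit combinations using only the standard library. It is an illustration of the idea, not yohou's internal code; expand_grid is a hypothetical helper, and the second parameter (estimator__fit_intercept, a Ridge option) is added only to show the multiplication.

```python
from itertools import product


def expand_grid(param_grid):
    """Return every combination in a dict-of-lists grid as a list of dicts."""
    names = list(param_grid)
    value_lists = (param_grid[name] for name in names)
    return [dict(zip(names, values)) for values in product(*value_lists)]


# The grid from this guide: one parameter, four candidates.
single = expand_grid({"estimator__alpha": [0.01, 0.1, 1.0, 10.0]})

# Illustrative two-parameter grid (hypothetical addition): 4 x 2 = 8 candidates.
double = expand_grid({
    "estimator__alpha": [0.01, 0.1, 1.0, 10.0],
    "estimator__fit_intercept": [True, False],
})

print(len(single), len(double))  # prints: 4 8
```

Each candidate is fitted once per cross-validation split, so the total number of fits is the number of combinations multiplied by the number of splits.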

2. Choose a Splitter

Use ExpandingWindowSplitter to simulate accumulating historical data. Use SlidingWindowSplitter if you want a fixed training window instead:

from yohou.model_selection import ExpandingWindowSplitter

splitter = ExpandingWindowSplitter(n_splits=5, test_size=24)

Set test_size to match your forecasting horizon. n_splits controls how many train/test windows are evaluated per parameter combination.
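
The following sketch shows what an expanding-window scheme looks like in terms of row indices. It is a simplified stand-in, not yohou's implementation: expanding_window_splits is a hypothetical function, and it assumes test windows are contiguous, non-overlapping, and packed against the end of the series. The actual splitter may place or space its windows differently.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs as range objects.

    Assumption for this sketch: test windows are contiguous, non-overlapping,
    and aligned with the end of the series. Training always starts at index 0.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("Not enough samples for the requested splits.")
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        yield range(0, test_start), range(test_start, test_start + test_size)


splits = list(expanding_window_splits(n_samples=240, n_splits=5, test_size=24))
for train, test in splits:
    print(f"train=[0, {train.stop})  test=[{test.start}, {test.stop})")
```

The training range grows by test_size rows at each split while the test window keeps a fixed length, which is the behaviour the term "expanding" refers to. A sliding variant would instead move the training start forward so the training length stays constant.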

3. Run the Grid Search

Pass the forecaster, parameter grid, scorer, and splitter to GridSearchCV:

search = GridSearchCV(
    forecaster=forecaster,
    param_grid=param_grid,
    scoring=MeanAbsoluteError(),
    cv=splitter,
    n_jobs=-1,
    refit=True,
)
search.fit(y_train, forecasting_horizon=24)

Setting refit=True refits the best forecaster on the full training set so search.predict() is ready immediately after fit.

4. Predict with the Best Model

When refit=True, the search object acts as a fitted forecaster:

y_pred = search.predict()

Access the winning configuration through best_params_ and best_score_:

print("Best params:", search.best_params_)
print("Best score:", search.best_score_)

5. Inspect and Visualize Results

cv_results_ contains per-fold scores for every parameter combination, together with aggregates such as mean_test_score and rank_test_score:

import polars as pl

results = pl.DataFrame(search.cv_results_)
print(results.select(["params", "mean_test_score", "rank_test_score"]))

Use plot_cv_results_scatter to visualize how score changes across parameter values:

from yohou.plotting import plot_cv_results_scatter

fig = plot_cv_results_scatter(search.cv_results_, param_name="estimator__alpha")
fig.show()

Look for parameter values where the score flattens or reaches a minimum to identify the best operating region.

6. Use RandomizedSearchCV for Large Spaces

When the grid has many dimensions or continuous ranges, switch to RandomizedSearchCV, which samples n_iter random combinations instead of evaluating every one:

from scipy.stats import loguniform
from yohou.model_selection import RandomizedSearchCV

param_distributions = {
    "estimator__alpha": loguniform(1e-3, 1e3),
}

search = RandomizedSearchCV(
    forecaster=forecaster,
    param_distributions=param_distributions,
    scoring=MeanAbsoluteError(),
    cv=splitter,
    n_iter=20,
    n_jobs=-1,
    refit=True,
    random_state=42,
)
search.fit(y_train, forecasting_horizon=24)

n_iter controls the number of parameter settings sampled. Use GridSearchCV when total combinations are small (< 50) and RandomizedSearchCV when the space is large or continuous.
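
To make the idea of log-uniform sampling concrete, here is a minimal standard-library sketch. It is not how scipy or yohou draw samples internally; sample_loguniform is a hypothetical helper that only illustrates the principle that the logarithm of each draw is uniformly distributed, so every decade between the bounds is equally likely.

```python
import math
import random


def sample_loguniform(low, high, n_iter, seed=None):
    """Draw n_iter values whose logarithm is uniform on [log(low), log(high)]."""
    rng = random.Random(seed)
    log_low, log_high = math.log(low), math.log(high)
    return [math.exp(rng.uniform(log_low, log_high)) for _ in range(n_iter)]


# Same bounds and number of draws as the search above.
alphas = sample_loguniform(1e-3, 1e3, n_iter=20, seed=42)

# With a log-uniform prior, roughly half of the draws should fall below 1.0,
# since 1.0 is the geometric midpoint of the range.
below_one = sum(a < 1.0 for a in alphas)
print(f"{len(alphas)} draws, {below_one} below 1.0")
```

A plain uniform draw on the same interval would place almost all samples above 1.0, which is why a log-uniform distribution is the usual choice for regularisation strengths that span several orders of magnitude.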

7. Evaluate with Multiple Metrics

Pass a dict of scorers to scoring and set refit to the scorer name used for selecting the best model. For example, combine MeanAbsoluteError and RootMeanSquaredError:

from yohou.metrics import RootMeanSquaredError

search = GridSearchCV(
    forecaster=forecaster,
    param_grid=param_grid,
    scoring={"mae": MeanAbsoluteError(), "rmse": RootMeanSquaredError()},
    cv=splitter,
    refit="mae",
)
search.fit(y_train, forecasting_horizon=24)

results = pl.DataFrame(search.cv_results_)
print(results.select(["params", "mean_test_mae", "mean_test_rmse"]))

All scorers are evaluated on every fold, but only the one named in refit determines which parameters are selected as best.
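
The role of refit can be sketched without the library. In the example below, pick_best is a hypothetical helper and the scores are placeholder numbers invented for illustration, not real results. The sketch also assumes scores are raw errors, so lower is better; whether yohou stores or ranks scores with a different sign convention is not covered here.

```python
def pick_best(candidates, refit_metric):
    """Return the candidate with the lowest mean value of refit_metric.

    Assumption for this sketch: scores are raw errors, so lower is better.
    """
    return min(candidates, key=lambda c: c["scores"][refit_metric])


# Placeholder numbers, invented for illustration only (not real results).
candidates = [
    {"params": {"estimator__alpha": 0.1}, "scores": {"mae": 2.0, "rmse": 3.5}},
    {"params": {"estimator__alpha": 1.0}, "scores": {"mae": 2.2, "rmse": 3.0}},
]

best_by_mae = pick_best(candidates, "mae")
best_by_rmse = pick_best(candidates, "rmse")
print(best_by_mae["params"], best_by_rmse["params"])
```

The two metrics disagree on the winner in this toy case, which is why refit must name exactly one scorer when several are supplied.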

See Also