validate_search_data
yohou.utils.validation.validate_search_data(y, X_actual)
Validate input data for hyperparameter search (GridSearchCV, RandomizedSearchCV).
Performs comprehensive validation of time series data for cross-validation:

- Checks that y is not None
- Validates time column presence, dtype, nulls, and sorting
- Validates panel data internal consistency
- Validates panel data group matching between y and X_actual
- Validates consistent time intervals across DataFrames
This function is designed for SearchCV contexts where we validate data without modifying forecaster state (unlike validate_forecaster_data).
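The time column checks listed above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the yohou implementation: `check_time_values` and `common_interval` are hypothetical names, plain lists of `datetime` objects stand in for the "time" columns, "sorted" is taken to mean strictly increasing, and the interval is returned as a `timedelta` rather than a string such as "1d". Panel data checks are left out.

```python
from datetime import datetime, timedelta


def check_time_values(times, name):
    """Check one time column and return its constant step (hypothetical helper)."""
    if times is None:
        raise ValueError(f"{name} is None")
    if len(times) < 2:
        raise ValueError(f"{name} needs at least two timestamps to infer an interval")
    if any(t is None for t in times):
        raise ValueError(f"{name} contains null timestamps")
    # Assumption: "sorted" means strictly increasing, so duplicates are rejected too.
    steps = [later - earlier for earlier, later in zip(times, times[1:])]
    if any(step <= timedelta(0) for step in steps):
        raise ValueError(f"{name} is not sorted in strictly increasing order")
    if len(set(steps)) != 1:
        raise ValueError(f"{name} has irregular time steps")
    return steps[0]


def common_interval(y_times, x_times=None):
    """Return the step shared by y and the optional exogenous series."""
    interval = check_time_values(y_times, "y")
    if x_times is not None:
        x_interval = check_time_values(x_times, "X_actual")
        if x_interval != interval:
            raise ValueError("y and X_actual have different time intervals")
    return interval


days = [datetime(2020, 1, d) for d in range(1, 6)]
daily = common_interval(days, days)
print(daily)  # 1 day, 0:00:00
```

Like the documented function, the sketch only reads its inputs and returns a value, so it is safe to call before a search without touching any forecaster. A fixed-width `timedelta` cannot represent calendar intervals such as "1mo", which is one reason a real implementation would need a different interval representation.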
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| y | DataFrame | Target time series with "time" column. | required |
| X_actual | DataFrame or None | Exogenous feature time series with "time" column, or None. | required |
Returns
| Type | Description |
|---|---|
| str | The common time interval shared by all provided DataFrames (e.g., "1d", "1mo"). |
Raises
| Type | Description |
|---|---|
| ValueError | If y is None, time columns are invalid, panel data is inconsistent, or intervals don't match across DataFrames. |
Examples
>>> import polars as pl
>>> from datetime import datetime
>>> from yohou.utils.validation import validate_search_data
>>> time_index = pl.datetime_range(
... start=datetime(2020, 1, 1), end=datetime(2020, 1, 5), interval="1d", eager=True
... )
>>> y = pl.DataFrame({"time": time_index, "sales": [100, 110, 120, 130, 140]})
>>> X_actual = pl.DataFrame({"time": time_index, "holiday": [0, 0, 1, 0, 0]})
>>> interval = validate_search_data(y, X_actual)
>>> interval
'1d'
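The failure path can be handled by catching the documented `ValueError` before an expensive search starts. The sketch below is hedged: `validate_or_report` is a hypothetical helper, and `stub_validate` is a stand-in that mimics only the documented "y is None" failure so the snippet runs without yohou installed. In real use, `validate_search_data` would be passed in its place.

```python
def validate_or_report(validate, y, X_actual=None):
    """Return (interval, None) on success or (None, message) on ValueError."""
    try:
        return validate(y, X_actual), None
    except ValueError as exc:
        return None, str(exc)


def stub_validate(y, X_actual):
    # Hypothetical stand-in, NOT the yohou implementation: it reproduces only
    # the documented "y is None" failure and returns a fixed interval string.
    if y is None:
        raise ValueError("y is None")
    return "1d"


ok_interval, ok_error = validate_or_report(stub_validate, y=[100, 110, 120])
bad_interval, bad_error = validate_or_report(stub_validate, y=None)
print(ok_interval, ok_error)    # 1d None
print(bad_interval, bad_error)  # None y is None
```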
See Also
- validate_forecaster_data: Data validation with forecaster state management
- check_inputs: Validates consistent time intervals
- check_time_column: Validates time column properties