yohou.utils
Validation, panel data, weighting, tags, discovery, and other utility functions.
Discovery

| Name | Description |
|---|---|
| all_estimators | Get a list of all estimators from yohou. |
| all_displays | Get a list of all displays from yohou. |
| all_functions | Get a list of all functions from yohou. |

Panel data

| Name | Description |
|---|---|
| inspect_panel | Inspect DataFrame columns to distinguish global and local (panel) data. |
| get_group_df | Extract and rename columns for a specific panel group. |
| dict_to_panel | Convert a dict of group DataFrames to a single DataFrame with prefixed columns. |
| select_panel_columns | Select panel group columns and optionally global columns of a DataFrame. |
| panel_aware_rename | Apply a rename function to a column name while preserving the panel group prefix. |
| panel_aware_prefix | Add a prefix to a column name while preserving the panel group prefix. |
| panel_aware_suffix | Add a suffix to a column name while preserving the panel group prefix. |
| check_groups | Validate and normalize panel group names for forecaster operations. |
| check_groups_exist | Validate all requested panel groups exist in fitted forecaster. |
| check_panel_groups_match | Validate that y and X have matching panel group structures. |
| check_panel_internal_consistency | Validate that all panel groups in a DataFrame have the same local column structure. |

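
The panel helpers above all work with a group prefix on column names. The following is a minimal sketch of that naming convention, assuming columns follow a `<group>__<column>` pattern (the `__` separator is mentioned under `validate_column_names`). The helper names `split_panel_name`, `rename_keep_group`, and `flatten_groups` are hypothetical and work on plain strings and dicts; they do not reproduce yohou's implementation or signatures.

```python
SEP = "__"  # assumed separator between panel group name and local column name


def split_panel_name(name):
    """Split 'group__column' into (group, column); group is None for global columns."""
    group, sep, local = name.partition(SEP)
    if not sep:
        return None, name
    return group, local


def rename_keep_group(name, func):
    """Apply func to the local column name only, leaving any group prefix intact."""
    group, local = split_panel_name(name)
    renamed = func(local)
    return renamed if group is None else f"{group}{SEP}{renamed}"


def flatten_groups(groups):
    """Flatten {group: {column: values}} into {'group__column': values}."""
    return {
        f"{group}{SEP}{column}": values
        for group, columns in groups.items()
        for column, values in columns.items()
    }
```

For example, `rename_keep_group("store_a__sales", lambda c: c + "_lag1")` renames only the local part and yields `store_a__sales_lag1`, while a global column such as `temperature` is renamed as a whole.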
Data validation

| Name | Description |
|---|---|
| validate_forecaster_data | Validate data for forecasters. |
| validate_transformer_data | Validate data for transformers. |
| validate_scorer_data | Validate and prepare scorer input data. |
| validate_splitter_data | Validate data for splitters. |
| validate_plotting_data | Validate a DataFrame for plotting and resolve columns. |
| validate_plotting_params | Validate common plotting function parameters. |
| validate_search_data | Validate input data for hyperparameter search (GridSearchCV, RandomizedSearchCV). |
| validate_time_weight | Validate time_weight parameter for forecasters and scorers. |
| validate_column_names | Validate that __ separator is used only for panel data group names. |

Time series validation

| Name | Description |
|---|---|
| check_time_column | Validate that time column exists, has proper dtype, no nulls, and is sorted. |
| check_interval_consistency | Validate that a time series has uniform time spacing. |
| check_continuity | Validate temporal continuity between consecutive DataFrames. |
| check_sufficient_rows | Validate DataFrame has sufficient rows for operation. |
| check_inputs | Validate that target and feature DataFrames have consistent time intervals. |
| check_schema | Validate DataFrame schema and return with proper column ordering. |
| check_X_actual_required | Validate X_actual is provided when required for recursive prediction. |
| check_forecasting_horizon_positive | Validate forecasting horizon is positive. |
| check_scorer_column_selection | Subselect columns based on scorer configuration. |

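
To make the checks above concrete, here is a rough sketch of what "sorted, no nulls" and "uniform time spacing" can mean for a list of timestamps. The function names are hypothetical, the code uses only the standard library rather than DataFrames, and treating "sorted" as strictly increasing is an assumption. It illustrates the idea only and is not how yohou implements `check_time_column` or `check_interval_consistency`.

```python
from datetime import datetime, timedelta


def is_sorted_without_nulls(times):
    """Return True when no timestamp is None and the sequence is strictly increasing."""
    if any(t is None for t in times):
        return False
    return all(earlier < later for earlier, later in zip(times, times[1:]))


def has_uniform_spacing(times):
    """Return True when all consecutive timestamps are separated by the same delta."""
    if len(times) < 2:
        return True
    deltas = {later - earlier for earlier, later in zip(times, times[1:])}
    return len(deltas) == 1


# Hourly series: uniform. Dropping one point creates a gap and breaks uniformity.
start = datetime(2024, 1, 1)
hourly = [start + timedelta(hours=i) for i in range(5)]
with_gap = hourly[:2] + hourly[3:]
```

Comparing fixed deltas like this only works for fixed-length intervals; calendar intervals such as months need a calendar-aware comparison.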
Weighting

| Name | Description |
|---|---|
| exponential_decay_weight | Generate exponential decay weights giving more weight to recent times. |
| linear_decay_weight | Generate linear decay weights giving more weight to recent times. |
| seasonal_emphasis_weight | Generate weights emphasizing specific seasonal positions. |
| compose_weights | Compose multiple weight functions by multiplication. |
| validate_callable_signature | Validate that callable has valid signature for time weighting. |
| normalize_weights | Normalize weights so they sum to the number of elements. |
| validate_weight_array | Validate a resolved weight array for NaN, negatives, infinities, and all-zero. |
| resolve_dict_weights | Map a {key: weight} dict to an aligned numpy array. |
| combine_weight_vectors | Combine weight vectors multiplicatively and normalize. |
| resolve_weight_to_array | Resolve a weight specification (callable, DataFrame, or dict) to a numpy array. |

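
The weighting entries above describe three ideas: decay toward older observations, normalization so the weights sum to the number of elements, and composition by multiplication. The sketch below illustrates them on plain Python lists. The function names, the `half_life` and `min_weight` parameters, and the exact decay formulas are assumptions chosen for illustration; yohou's own functions may be parameterized differently.

```python
import math


def exponential_decay(n, half_life):
    """Weights that halve every `half_life` steps back from the most recent point."""
    return [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]


def linear_decay(n, min_weight=0.0):
    """Weights rising linearly from min_weight (oldest) to 1.0 (most recent)."""
    if n == 1:
        return [1.0]
    return [min_weight + (1.0 - min_weight) * i / (n - 1) for i in range(n)]


def normalize(weights):
    """Rescale so the weights sum to the number of elements (mean weight of 1)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w * len(weights) / total for w in weights]


def compose(*vectors):
    """Multiply weight vectors element-wise, then normalize the product."""
    product = [math.prod(ws) for ws in zip(*vectors)]
    return normalize(product)
```

Normalizing to a sum of `n` instead of 1 keeps the average weight at 1, so a weighted mean of errors stays on the same scale as the unweighted one.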
Time intervals

| Name | Description |
|---|---|
| add_interval | Add n intervals to a datetime (handles variable-length intervals). |
| interval_to_timedelta | Convert fixed interval to timedelta, or None for variable intervals. |
| parse_interval | Parse interval string into (multiplier, unit). |

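
The distinction between fixed and variable-length intervals can be sketched as follows. This example assumes Polars-style interval strings such as `15m`, `1d`, or `3mo`; the accepted units, the function names, and the end-of-month clamping rule are assumptions made for illustration and are not taken from yohou's source. Days and weeks are treated as fixed-length, which holds for naive datetimes.

```python
import calendar
import re
from datetime import datetime, timedelta

# Assumed unit set: fixed-length units map to seconds; "mo" and "y" vary in length.
_FIXED_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}
_PATTERN = re.compile(r"(\d+)(mo|[smhdwy])")


def parse(interval):
    """Parse e.g. '15m' into (15, 'm')."""
    match = _PATTERN.fullmatch(interval)
    if match is None:
        raise ValueError(f"unrecognized interval: {interval!r}")
    return int(match.group(1)), match.group(2)


def to_timedelta(interval):
    """Return a timedelta for fixed-length intervals, or None for months and years."""
    count, unit = parse(interval)
    if unit not in _FIXED_SECONDS:
        return None
    return timedelta(seconds=count * _FIXED_SECONDS[unit])


def add(start, interval, n=1):
    """Add n intervals to start, using calendar arithmetic for months and years."""
    delta = to_timedelta(interval)
    if delta is not None:
        return start + n * delta
    count, unit = parse(interval)
    months = count * n * (12 if unit == "y" else 1)
    year, month0 = divmod(start.year * 12 + (start.month - 1) + months, 12)
    # Clamp the day so that e.g. Jan 31 + 1mo lands on the last day of February.
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return start.replace(year=year, month=month0 + 1, day=min(start.day, last_day))
```

Returning `None` for months and years forces callers to take the calendar-aware path explicitly rather than silently approximating a month as 30 days.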
Polars helpers

| Name | Description |
|---|---|
| cast | Cast columns according to schema with integer rounding. |
| get_numeric_columns | Get list of numeric column names from a DataFrame. |
| tabularize | Convert time series to tabular format using lags. |
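
Converting a series to tabular format with lags means turning each time step into a row that holds the current value plus its lagged predecessors. The sketch below shows the idea on a plain list. The function name `lag_table` and the `y_lag_<k>` column naming are hypothetical, and the real `tabularize` operates on Polars DataFrames with a signature not shown here.

```python
def lag_table(values, lags):
    """Build one row per time step holding the target and its lagged values.

    Rows start at max(lags), since earlier steps lack a full set of lags.
    """
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(values)):
        row = {"y": values[t]}
        for lag in lags:
            row[f"y_lag_{lag}"] = values[t - lag]
        rows.append(row)
    return rows


table = lag_table([10, 20, 30, 40], lags=[1, 2])
```

With four observations and lags 1 and 2, the first two steps are dropped and `table` holds two rows, the first being `{"y": 30, "y_lag_1": 20, "y_lag_2": 10}`.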