
check_panel_internal_consistency

yohou.utils.validation.check_panel_internal_consistency(df, df_name='DataFrame')

Validate that all panel groups in a DataFrame have the same local column structure.

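Panel columns are named `<group>__<local>`: for example, sales__store_1 and sales__store_2 belong to the group sales and have the local column names store_1 and store_2. This check takes the first group as the reference and verifies that every other group has the same set of local column names, regardless of column order. A DataFrame without panel columns passes without further checks.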
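As a rough illustration of the naming convention, the standalone sketch below groups column names by their prefix. `group_panel_columns` is a hypothetical helper, not part of yohou, and it assumes that the text before the first `__` is the group name and the remainder is the local column name.

```python
from collections import defaultdict


def group_panel_columns(columns, sep="__"):
    """Hypothetical helper: map each group prefix to its local column names.

    Columns without the separator (such as "time") are treated as
    non-panel columns and ignored.
    """
    groups = defaultdict(list)
    for name in columns:
        if sep in name:
            # Split only on the first separator, so the local name may
            # itself contain "__".
            prefix, local = name.split(sep, 1)
            groups[prefix].append(local)
    return dict(groups)


# Both groups expose the same local columns: store_1 and store_2.
consistent = group_panel_columns(
    ["time", "sales__store_1", "sales__store_2",
     "revenue__store_1", "revenue__store_2"]
)

# The revenue group lacks store_2, so the local structures differ.
inconsistent = group_panel_columns(
    ["time", "sales__store_1", "sales__store_2", "revenue__store_1"]
)
```

Comparing the sorted local names of each group is then enough to tell the two cases apart.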

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame to validate. Must have "time" column. | required |
| df_name | str | Name of DataFrame in error message (e.g., "y", "X_actual", "y_pred"). | "DataFrame" |

Raises

| Type | Description |
|---|---|
| ValueError | If panel groups have mismatched local column structures. |

Examples

>>> import polars as pl
>>> from datetime import datetime
>>> from yohou.utils.validation import check_panel_internal_consistency
>>> # Valid panel data - both groups (sales, revenue) have the same local columns
>>> df = pl.DataFrame({
...     "time": [datetime(2020, 1, 1), datetime(2020, 1, 2)],
...     "sales__store_1": [10, 20],
...     "sales__store_2": [30, 40],
...     "revenue__store_1": [100, 200],
...     "revenue__store_2": [300, 400],
... })
>>> check_panel_internal_consistency(df, "y")  # No error
>>> # Invalid - store_2 is missing from the revenue group
>>> df_bad = pl.DataFrame({
...     "time": [datetime(2020, 1, 1)],
...     "sales__store_1": [10],
...     "sales__store_2": [20],
...     "revenue__store_1": [100],
... })
>>> check_panel_internal_consistency(df_bad, "y")
Traceback (most recent call last):
    ...
ValueError: Panel structure mismatch in `y`...

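See Also

- check_panel_groups_match (yohou.utils.validation.check_panel_groups_match): Validate y and X_actual have matching panel groups.
- check_groups (yohou.utils.validation.check_groups): Validate panel group names for forecaster operations.
- inspect_panel (yohou.utils.panel.inspect_panel): Detect panel groups in a DataFrame.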

Source Code

def check_panel_internal_consistency(df: pl.DataFrame, df_name: str = "DataFrame") -> None:
    """Validate that all panel groups in a DataFrame have the same local column structure.

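    Panel columns are named ``<group>__<local>``: for example, sales__store_1 and
    sales__store_2 belong to the group sales and have the local column names
    store_1 and store_2. This check takes the first group as the reference and
    verifies that every other group has the same set of local column names,
    regardless of column order. A DataFrame without panel columns passes
    without further checks.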

    Parameters
    ----------
    df : pl.DataFrame
        DataFrame to validate. Must have "time" column.
    df_name : str, default="DataFrame"
        Name of DataFrame in error message (e.g., "y", "X_actual", "y_pred").

    Raises
    ------
    ValueError
        If panel groups have mismatched local column structures.

    Examples
    --------
    >>> import polars as pl
    >>> from datetime import datetime
    >>> from yohou.utils.validation import check_panel_internal_consistency
    >>> # Valid panel data - both groups (sales, revenue) have the same local columns
    >>> df = pl.DataFrame({
    ...     "time": [datetime(2020, 1, 1), datetime(2020, 1, 2)],
    ...     "sales__store_1": [10, 20],
    ...     "sales__store_2": [30, 40],
    ...     "revenue__store_1": [100, 200],
    ...     "revenue__store_2": [300, 400],
    ... })
    >>> check_panel_internal_consistency(df, "y")  # No error

    >>> # Invalid - store_2 is missing from the revenue group
    >>> df_bad = pl.DataFrame({
    ...     "time": [datetime(2020, 1, 1)],
    ...     "sales__store_1": [10],
    ...     "sales__store_2": [20],
    ...     "revenue__store_1": [100],
    ... })
    >>> check_panel_internal_consistency(df_bad, "y")  # doctest: +SKIP
    Traceback (most recent call last):
        ...
    ValueError: Panel structure mismatch in `y`...

    See Also
    --------
    - [`check_panel_groups_match`][yohou.utils.validation.check_panel_groups_match] : Validate y and X_actual have matching panel groups.
    - [`check_groups`][yohou.utils.validation.check_groups] : Validate panel group names for forecaster operations.
    - [`inspect_panel`][yohou.utils.panel.inspect_panel] : Detect panel groups in a DataFrame.

    """
    _, groups = inspect_panel(df)
    if not groups:
        return  # No panel data, nothing to check

    # Get reference group (first one)
    ref_grp = next(iter(groups))
    ref_cols = sorted(c.split("__", 1)[1] for c in groups[ref_grp])

    # Check all other groups match reference
    for grp, cols in groups.items():
        curr_cols = sorted(c.split("__", 1)[1] for c in cols)
        if curr_cols != ref_cols:
            raise ValueError(
                f"Panel structure mismatch in `{df_name}`. Group '{ref_grp}' has local "
                f"columns {ref_cols}, but group '{grp}' has {curr_cols}."
            )
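
The comparison step can also be tried in isolation, without polars or yohou. The sketch below is a minimal standalone variant, not the library's implementation: `check_local_columns_match` is a hypothetical name, and the shape of `groups` (group prefix mapped to a list of full column names) is an assumption inferred from how the function above uses the result of `inspect_panel`.

```python
def check_local_columns_match(groups, df_name="DataFrame"):
    """Raise ValueError if a group's local column names differ from the first group's.

    `groups` is assumed to map a group prefix to its full column names,
    e.g. {"sales": ["sales__store_1", "sales__store_2"]}.
    """
    if not groups:
        return  # No panel groups, nothing to check

    def local_names(cols):
        # Drop the "<group>__" prefix and sort so column order is ignored.
        return sorted(c.split("__", 1)[1] for c in cols)

    items = iter(groups.items())
    ref_grp, ref_full = next(items)  # The first group is the reference
    ref_cols = local_names(ref_full)

    for grp, cols in items:
        curr_cols = local_names(cols)
        if curr_cols != ref_cols:
            raise ValueError(
                f"Panel structure mismatch in `{df_name}`. Group '{ref_grp}' has local "
                f"columns {ref_cols}, but group '{grp}' has {curr_cols}."
            )


# Matching structures pass silently, even when column order differs.
check_local_columns_match(
    {
        "sales": ["sales__store_1", "sales__store_2"],
        "revenue": ["revenue__store_2", "revenue__store_1"],
    },
    "y",
)

# A group that lacks store_2 triggers the error.
message = None
try:
    check_local_columns_match(
        {
            "sales": ["sales__store_1", "sales__store_2"],
            "revenue": ["revenue__store_1"],
        },
        "y",
    )
except ValueError as exc:
    message = str(exc)
```

Unlike the loop above, this variant skips comparing the reference group with itself. The result is the same, since that comparison can never fail.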