
check_transform_output_structure

yohou.testing.transformer.check_transform_output_structure(transformer, X, y=None)

Check transform() output has "time" column and valid structure.

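The transformer is cloned and fitted on X and y, then transform(X) is called. The output must be a polars DataFrame with a "time" column of type Datetime or Date and at least one feature column besides "time".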

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| transformer | BaseTransformer | Unfitted transformer | required |
| X | DataFrame | Training data | required |
| y | DataFrame | Target data | None |

Raises

| Type | Description |
| --- | --- |
| AssertionError | If output structure is invalid |

Source Code

def check_transform_output_structure(transformer, X: pl.DataFrame, y: pl.DataFrame | None = None) -> None:
    """Check transform() output has "time" column and valid structure.

    Transform output must be a polars DataFrame with a "time" column
    and valid data types.

    Parameters
    ----------
    transformer : BaseTransformer
        Unfitted transformer
    X : pl.DataFrame
        Training data
    y : pl.DataFrame, optional
        Target data

    Raises
    ------
    AssertionError
        If output structure is invalid

    """
    transformer_clone = clone(transformer)
    transformer_clone.fit(X, y)

    X_trans = transformer_clone.transform(X)

    # Check it's a DataFrame
    assert isinstance(X_trans, pl.DataFrame), f"transform() must return pl.DataFrame, got {type(X_trans)}"

    # Check time column exists
    assert "time" in X_trans.columns, "transform() output must contain 'time' column"

    # Check time column is a temporal type (Datetime or Date)
    assert X_trans["time"].dtype in [pl.Datetime, pl.Date], (
        f"'time' column must be datetime type, got {X_trans['time'].dtype}"
    )

    # Check output has at least one feature column
    feature_cols = [col for col in X_trans.columns if col != "time"]
    assert len(feature_cols) > 0, "transform() output must have at least one feature column besides 'time'"