PowerTransformer¶

`yohou.preprocessing.sklearn_wrappers.PowerTransformer` ¶

Bases: SklearnTransformer

Apply a power transform featurewise to make data more Gaussian-like.

Power transforms are a family of parametric, monotonic transformations that are applied to make data more Gaussian-like. This is useful for modeling issues related to heteroscedasticity (non-constant variance), or other situations where normality is desired.

Currently, PowerTransformer supports the Box-Cox transform and the Yeo-Johnson transform. The optimal parameter for stabilizing variance and minimizing skewness is estimated through maximum likelihood.
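
As a rough illustration of what maximum-likelihood estimation means here, the sketch below implements the Box-Cox transform for a fixed lambda and then picks lambda by a grid search over the profile log-likelihood under a normal model. The grid search and the helper names (`box_cox`, `box_cox_log_likelihood`, `estimate_lambda`) are assumptions made for this example. scikit-learn uses a numerical optimizer rather than a grid, so its fitted values need not match this sketch exactly.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a single strictly positive value."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def box_cox_log_likelihood(values, lam):
    """Profile log-likelihood of lam under a normal model (constants dropped)."""
    n = len(values)
    transformed = [box_cox(v, lam) for v in values]
    mean = sum(transformed) / n
    variance = sum((t - mean) ** 2 for t in transformed) / n
    log_sum = sum(math.log(v) for v in values)
    return -0.5 * n * math.log(variance) + (lam - 1.0) * log_sum


def estimate_lambda(values, grid=None):
    """Pick the grid point with the highest log-likelihood (illustrative only)."""
    if grid is None:
        # Hypothetical search range: -2.00 to 2.00 in steps of 0.01.
        grid = [i / 100 for i in range(-200, 201)]
    return max(grid, key=lambda lam: box_cox_log_likelihood(values, lam))


data = [1.0, 4.0, 9.0, 16.0, 25.0]
best_lambda = estimate_lambda(data)
```

The chosen lambda is, by construction, at least as likely as the identity-like choice `lam=1.0`, which is the sense in which the transform makes the data "more Gaussian-like".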

Box-Cox requires input data to be strictly positive, while Yeo-Johnson supports both positive and negative data.
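
The difference in supported domains comes from how the two transforms are defined. A minimal sketch of the Yeo-Johnson transform for a fixed lambda follows; the function name `yeo_johnson` is chosen for this example and is not part of Yohou. Negative inputs are handled by a mirrored branch with exponent `2 - lam`, which is why no positivity requirement is needed.

```python
import math


def yeo_johnson(x, lam):
    """Yeo-Johnson transform of a single value for a fixed lambda."""
    if x >= 0:
        if lam == 0:
            return math.log1p(x)
        return ((x + 1.0) ** lam - 1.0) / lam
    # Negative branch: mirrored power transform with exponent (2 - lam).
    if lam == 2:
        return -math.log1p(-x)
    return -((-x + 1.0) ** (2.0 - lam) - 1.0) / (2.0 - lam)


# With lam = 1 the transform is the identity on both sides of zero.
identity_check = [yeo_johnson(v, 1.0) for v in (-3.0, 0.0, 3.0)]

# With lam = 0.5 the transform stays monotonic across zero.
ordered = [yeo_johnson(v, 0.5) for v in (-1.0, 0.0, 1.0)]
```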

This Yohou wrapper around the scikit-learn PowerTransformer preserves the polars DataFrame structure and the "time" column.

Parameters¶

Name	Type	Description	Default
`method`	`('yeo-johnson', 'box-cox')`	The power transform method. `'yeo-johnson'` works with positive and negative values; `'box-cox'` only works with strictly positive values.	`'yeo-johnson'`
`standardize`	`bool`	Set to True to apply zero-mean, unit-variance normalization to the transformed output.	`True`
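
To make the effect of `standardize=True` concrete, the sketch below applies zero-mean, unit-variance scaling to a list of already-transformed values. It assumes the population standard deviation (no degrees-of-freedom correction), which is what scikit-learn's StandardScaler uses; the helper name `standardize` is only for illustration.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Shift to zero mean and scale to unit (population) variance."""
    mean = fmean(values)
    std = pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical output of a power transform before standardization.
transformed = [0.0, 2.0, 4.0, 6.0, 8.0]
standardized = standardize(transformed)
```

With `standardize=False`, this step is skipped and the raw power-transformed values are returned.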

Attributes¶

Name	Type	Description
`instance_`	`PowerTransformer`	The fitted sklearn PowerTransformer instance.
`lambdas_`	`ndarray of float, shape (n_features,)`	The parameters of the power transform for the selected features.

Examples¶

>>> import polars as pl
>>> from datetime import datetime
>>> from yohou.preprocessing import PowerTransformer
>>> X = pl.DataFrame({
...     "time": [datetime(2024, 1, i) for i in range(1, 6)],
...     "value": [1.0, 4.0, 9.0, 16.0, 25.0],  # Skewed data
... })
>>> pt = PowerTransformer(method="yeo-johnson")
>>> pt.fit(X)
PowerTransformer(...)
>>> X_transformed = pt.transform(X)
>>> # Data is transformed to be more Gaussian-like
>>> "time" in X_transformed.columns
True
>>> # Can be inverted
>>> X_inv = pt.inverse_transform(X_transformed)
>>> abs(X_inv["value"][0] - 1.0) < 1e-10
True
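
The round trip above works because the transform is monotonic and therefore invertible. As a sketch under that assumption, the inverse of the Yeo-Johnson transform for a fixed lambda can be written as below; `inverse_yeo_johnson` is a name chosen for this example. When `standardize=True`, the standardization has to be undone before this step.

```python
import math


def inverse_yeo_johnson(y, lam):
    """Invert the Yeo-Johnson transform for a fixed lambda."""
    if y >= 0:
        if lam == 0:
            return math.expm1(y)
        return (lam * y + 1.0) ** (1.0 / lam) - 1.0
    # Negative outputs come from negative inputs, so use the mirrored branch.
    if lam == 2:
        return 1.0 - math.exp(-y)
    return 1.0 - (-(2.0 - lam) * y + 1.0) ** (1.0 / (2.0 - lam))


# The forward transform of x = 1 at lam = 0.5 is 2 * (sqrt(2) - 1).
recovered = inverse_yeo_johnson(2.0 * (math.sqrt(2.0) - 1.0), 0.5)
```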

Source Code¶

class PowerTransformer(SklearnTransformer):
    """Apply a power transform featurewise to make data more Gaussian-like.

    Power transforms are a family of parametric, monotonic transformations that
    are applied to make data more Gaussian-like. This is useful for modeling
    issues related to heteroscedasticity (non-constant variance), or other
    situations where normality is desired.

    Currently, PowerTransformer supports the Box-Cox transform and the
    Yeo-Johnson transform. The optimal parameter for stabilizing variance and
    minimizing skewness is estimated through maximum likelihood.

    Box-Cox requires input data to be strictly positive, while Yeo-Johnson
    supports both positive and negative data.

    This is a Yohou wrapper that preserves the polars DataFrame structure and
    "time" column.

    Parameters
    ----------
    method : {'yeo-johnson', 'box-cox'}, default='yeo-johnson'
        The power transform method. Available methods are:

        - 'yeo-johnson': Works with positive and negative values.
        - 'box-cox': Only works with strictly positive values.

    standardize : bool, default=True
        Set to True to apply zero-mean, unit-variance normalization to the
        transformed output.

    Attributes
    ----------
    instance_ : sklearn.preprocessing.PowerTransformer
        The fitted sklearn PowerTransformer instance.

    lambdas_ : ndarray of float, shape (n_features,)
        The parameters of the power transform for the selected features.

    Examples
    --------
    >>> import polars as pl
    >>> from datetime import datetime
    >>> from yohou.preprocessing import PowerTransformer
    >>> X = pl.DataFrame({
    ...     "time": [datetime(2024, 1, i) for i in range(1, 6)],
    ...     "value": [1.0, 4.0, 9.0, 16.0, 25.0],  # Skewed data
    ... })
    >>> pt = PowerTransformer(method="yeo-johnson")
    >>> pt.fit(X)  # doctest: +ELLIPSIS
    PowerTransformer(...)
    >>> X_transformed = pt.transform(X)
    >>> # Data is transformed to be more Gaussian-like
    >>> "time" in X_transformed.columns
    True
    >>> # Can be inverted
    >>> X_inv = pt.inverse_transform(X_transformed)
    >>> abs(X_inv["value"][0] - 1.0) < 1e-10
    True

    See Also
    --------
    - [`QuantileTransformer`][yohou.preprocessing.sklearn_wrappers.QuantileTransformer] : Transform features using quantiles information.

    """

    _estimator_default_class = sklearn_PowerTransformer

    def __init__(self, method="yeo-johnson", standardize=True, copy=True, **kwargs):
        super().__init__(method=method, standardize=standardize, copy=copy, **kwargs)

    @property
    def lambdas_(self) -> np.ndarray:
        """The parameters of the power transform for the selected features."""
        check_is_fitted(self, ["instance_"])
        return self.instance_.lambdas_

Methods¶

`lambdas_` `property` ¶

The parameters of the power transform for the selected features.
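
The property forwards the attribute from the fitted inner estimator and fails if `fit` has not been called. The pattern can be sketched without any dependencies as follows. Every class here (`NotFittedError`, `InnerEstimator`, `Wrapper`) is a hypothetical stand-in rather than the actual Yohou or scikit-learn code, and the stored lambdas are placeholders.

```python
class NotFittedError(ValueError, AttributeError):
    """Stand-in for sklearn.exceptions.NotFittedError."""


class InnerEstimator:
    """Hypothetical inner estimator that learns one lambda per feature."""

    def fit(self, columns):
        # Placeholder values; a real estimator would estimate these.
        self.lambdas_ = [1.0 for _ in columns]
        return self


class Wrapper:
    """Hypothetical wrapper exposing a fitted attribute of the inner estimator."""

    def fit(self, columns):
        self.instance_ = InnerEstimator().fit(columns)
        return self

    @property
    def lambdas_(self):
        # Guard equivalent in spirit to check_is_fitted(self, ["instance_"]).
        if "instance_" not in vars(self):
            raise NotFittedError("Call fit before accessing lambdas_.")
        return self.instance_.lambdas_


fitted = Wrapper().fit({"value": [1.0, 4.0, 9.0]})
per_feature_lambdas = fitted.lambdas_
```

Because the stand-in error also derives from `AttributeError`, `hasattr(Wrapper(), "lambdas_")` is `False` before fitting.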

Tutorials¶

The following example notebooks use this component:

How to Use Scikit-learn Scalers

Data-Features

Wrap sklearn scalers (StandardScaler, MinMaxScaler, RobustScaler, PowerTransformer, PolynomialFeatures) for polars DataFrames with inverse transforms.
