
SplineTransformer

yohou.preprocessing.sklearn_wrappers.SplineTransformer

Bases: SklearnTransformer

Generate univariate B-spline bases for features.

Generate a new feature matrix consisting of n_splines=n_knots + degree - 1 spline basis functions (B-splines) of polynomial order degree for each feature.
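The count n_knots + degree - 1 can be checked by hand. The following is a minimal pure-Python sketch, not the wrapper's or scikit-learn's implementation: it evaluates a B-spline basis with the Cox-de Boor recursion on a uniformly extended knot vector. The helper names `extended_uniform_knots` and `bspline_basis` are hypothetical and exist only for this illustration.

```python
def extended_uniform_knots(lo, hi, n_knots, degree):
    """Place n_knots evenly on [lo, hi], plus `degree` extra knots on each side."""
    step = (hi - lo) / (n_knots - 1)
    return [lo + (i - degree) * step for i in range(n_knots + 2 * degree)]


def bspline_basis(x, knots, degree):
    """Evaluate every B-spline basis function at x (Cox-de Boor recursion)."""
    # Degree 0: indicator functions of the half-open knot intervals.
    basis = [
        1.0 if knots[i] <= x < knots[i + 1] else 0.0
        for i in range(len(knots) - 1)
    ]
    # Raise the degree one step at a time.
    for d in range(1, degree + 1):
        raised = []
        for i in range(len(knots) - d - 1):
            left_width = knots[i + d] - knots[i]
            right_width = knots[i + d + 1] - knots[i + 1]
            left = (x - knots[i]) / left_width * basis[i] if left_width > 0 else 0.0
            right = (
                (knots[i + d + 1] - x) / right_width * basis[i + 1]
                if right_width > 0
                else 0.0
            )
            raised.append(left + right)
        basis = raised
    return basis


knots = extended_uniform_knots(0.0, 9.0, n_knots=4, degree=3)
values = bspline_basis(2.5, knots, degree=3)

n_splines = len(values)  # 4 + 3 - 1 basis functions
total = sum(values)      # close to 1.0 inside the data range

print(n_splines)  # 6
```

Inside the data range the basis values sum to one, which is why the full basis already contains an implicit intercept column.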

This is a Yohou wrapper that preserves the polars DataFrame structure and "time" column.

Parameters

Name Type Description Default
n_knots int

Number of knots of the splines if knots equals one of {'uniform', 'quantile'}. Must be larger than or equal to 2.

5
degree int

The polynomial degree of the spline basis. Must be a non-negative integer.

3
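knots ('uniform', 'quantile') or array-like of shape (n_knots, n_features)

Knot positions. If 'uniform', n_knots knots are spaced evenly between the minimum and maximum of each feature. If 'quantile', they are placed at evenly spaced quantiles of each feature. In both cases the first and last knots are the minimum and maximum of the data. If an array-like is given, it specifies the sorted knot positions directly, including the boundary knots.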

'uniform'
extrapolation ('error', 'constant', 'linear', 'continue', 'periodic')

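If 'error', values outside the min and max values of the training features raise a ValueError. If 'constant', the value of the splines at the minimum and maximum of the features is used as constant extrapolation. If 'linear', a linear extrapolation is used. If 'continue', the splines are extrapolated as is. If 'periodic', periodic splines with a periodicity equal to the distance between the first and last knot are used.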

'constant'
include_bias bool

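If False, the last spline element inside the data range of each feature is dropped. Because the B-spline basis functions sum to one at every data point, the full basis implicitly contains a bias term (a column of ones).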

True
order ('C', 'F')

Order of output array.

'C'
sparse_output bool

If True, transform will return sparse CSC format. Otherwise, transform will return dense array.

False
handle_missing ('error', 'missing-as-zero')

How to handle missing values during transform. If 'error', a ValueError is raised if missing values are present. If 'missing-as-zero', missing values are treated as zeros in the spline basis.

'error'
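To make the two string options of knots concrete, here is a small pure-Python sketch of how uniform and quantile placement differ on skewed data. It is an illustration under stated assumptions, not the library's code: the helpers `uniform_knots` and `quantile_knots` are hypothetical, and the quantile rule used is plain linear interpolation between order statistics.

```python
def uniform_knots(values, n_knots):
    """Evenly spaced knots from the minimum to the maximum of the data."""
    lo, hi = min(values), max(values)
    return [lo + (hi - lo) * i / (n_knots - 1) for i in range(n_knots)]


def quantile_knots(values, n_knots):
    """Knots at evenly spaced quantiles (linear interpolation between points)."""
    ordered = sorted(values)
    last = len(ordered) - 1
    knots = []
    for i in range(n_knots):
        pos = last * i / (n_knots - 1)  # fractional index into the sorted data
        below = int(pos)
        above = min(below + 1, last)
        frac = pos - below
        knots.append(ordered[below] + frac * (ordered[above] - ordered[below]))
    return knots


# One large outlier stretches the range of the data.
data = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0]

u = uniform_knots(data, n_knots=4)
q = quantile_knots(data, n_knots=4)

print(q)  # [0.0, 3.0, 6.0, 100.0]
```

Both placements start at the minimum and end at the maximum. With 'uniform' the two interior knots land in the empty stretch between 8 and 100, while 'quantile' keeps them where the data is dense, which is usually the better choice for skewed features.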

Attributes

Name Type Description
instance_ SplineTransformer

The fitted sklearn SplineTransformer instance.

bsplines_ list of shape (n_features,)

List of BSplines objects, one for each feature.

n_features_out_ int

Number of output features.

Examples

>>> import polars as pl
>>> from datetime import datetime
>>> from yohou.preprocessing import SplineTransformer
>>> X = pl.DataFrame({
...     "time": [datetime(2024, 1, i) for i in range(1, 11)],
...     "value": [float(i) for i in range(10)],
... })
>>> spline = SplineTransformer(n_knots=4, degree=3)
>>> spline.fit(X)
SplineTransformer(...)
>>> X_spline = spline.transform(X)
>>> # Generates spline basis features
>>> len(X_spline.columns) > len(X.columns)
True

See Also

PolynomialFeatures: Generate polynomial and interaction features.

Source Code

class SplineTransformer(SklearnTransformer):
    """Generate univariate B-spline bases for features.

    Generate a new feature matrix consisting of ``n_splines=n_knots + degree - 1``
    spline basis functions (B-splines) of polynomial order ``degree`` for each
    feature.

    This is a Yohou wrapper that preserves the polars DataFrame structure and
    "time" column.

    Parameters
    ----------
    n_knots : int, default=5
        Number of knots of the splines if ``knots`` equals one of
        {'uniform', 'quantile'}. Must be larger than or equal to 2.

    degree : int, default=3
        The polynomial degree of the spline basis. Must be a non-negative
        integer.

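    knots : {'uniform', 'quantile'} or array-like of shape (n_knots, n_features), default='uniform'
        Knot positions. If 'uniform', ``n_knots`` knots are spaced evenly
        between the minimum and maximum of each feature. If 'quantile', they
        are placed at evenly spaced quantiles of each feature. In both cases
        the first and last knots are the minimum and maximum of the data. If
        an array-like is given, it specifies the sorted knot positions
        directly, including the boundary knots.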

    extrapolation : {'error', 'constant', 'linear', 'continue', 'periodic'}, default='constant'
        If 'error', values outside the min and max values of the training
        features will raise an error.

    include_bias : bool, default=True
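        If False, the last spline element inside the data range of each
        feature is dropped. Because the B-spline basis functions sum to one
        at every data point, the full basis implicitly contains a bias term.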

    order : {'C', 'F'}, default='C'
        Order of output array.

    sparse_output : bool, default=False
        If True, transform will return sparse CSC format. Otherwise,
        transform will return dense array.

    handle_missing : {'error', 'missing-as-zero'}, default='error'
        How to handle missing values during transform. If 'error', a ValueError
        is raised if missing values are present. If 'missing-as-zero', missing
        values are treated as zeros in the spline basis.

    Attributes
    ----------
    instance_ : sklearn.preprocessing.SplineTransformer
        The fitted sklearn SplineTransformer instance.

    bsplines_ : list of shape (n_features,)
        List of BSplines objects, one for each feature.

    n_features_out_ : int
        Number of output features.

    Examples
    --------
    >>> import polars as pl
    >>> from datetime import datetime
    >>> from yohou.preprocessing import SplineTransformer
    >>> X = pl.DataFrame({
    ...     "time": [datetime(2024, 1, i) for i in range(1, 11)],
    ...     "value": [float(i) for i in range(10)],
    ... })
    >>> spline = SplineTransformer(n_knots=4, degree=3)
    >>> spline.fit(X)  # doctest: +ELLIPSIS
    SplineTransformer(...)
    >>> X_spline = spline.transform(X)
    >>> # Generates spline basis features
    >>> len(X_spline.columns) > len(X.columns)
    True

    See Also
    --------
    - [`PolynomialFeatures`][yohou.preprocessing.sklearn_wrappers.PolynomialFeatures] : Generate polynomial and interaction features.

    """

    _estimator_default_class = sklearn_SplineTransformer

    def __init__(
        self,
        n_knots=5,
        degree=3,
        knots="uniform",
        extrapolation="constant",
        include_bias=True,
        order="C",
        sparse_output=False,
        handle_missing="error",
        **kwargs,
    ):
        params = _filter_estimator_params(
            sklearn_SplineTransformer,
            {
                "n_knots": n_knots,
                "degree": degree,
                "knots": knots,
                "extrapolation": extrapolation,
                "include_bias": include_bias,
                "order": order,
                "sparse_output": sparse_output,
                "handle_missing": handle_missing,
            },
        )
        super().__init__(**params, **kwargs)

    @property
    def bsplines_(self) -> list:
        """List of BSplines objects, one for each feature."""
        check_is_fitted(self, ["instance_"])
        return self.instance_.bsplines_

    @property
    def n_features_out_(self) -> int:
        """Number of output features."""
        check_is_fitted(self, ["instance_"])
        return self.instance_.n_features_out_

Methods

bsplines_ property

List of BSplines objects, one for each feature.

n_features_out_ property

Number of output features.