FeaturePipeline

`yohou.compose.feature_pipeline.FeaturePipeline`

Bases: BaseTransformer, _BaseComposition

A sequence of time series transformers.

FeaturePipeline allows you to sequentially apply a list of time series transformers to preprocess the data.

Steps of the pipeline must be transformers, that is, they must implement the `fit`, `transform`, and `observe` methods.

The purpose of the pipeline is to assemble several steps that can be cross-validated together while setting different parameters. To support this, a parameter of any step can be set using the step name and the parameter name joined by `__` (for example, `lags__lag` for the `lag` parameter of a step named `lags`). A step's estimator can be replaced entirely by setting the parameter with the step's name to another estimator, and a transformer can be removed by setting it to `'passthrough'` or `None`.
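
The `<step>__<parameter>` naming convention can be illustrated without the library itself. The sketch below is a minimal pure-Python stand-in, not the FeaturePipeline implementation: `ToyStep`, `ToyPipeline`, and their attributes are hypothetical names chosen for illustration. It shows a nested key updating one parameter of a step, and a bare step name replacing the whole step.

```python
class ToyStep:
    """Hypothetical stand-in for a transformer with constructor parameters."""

    def __init__(self, **params):
        self.params = dict(params)


class ToyPipeline:
    """Hypothetical stand-in that mimics the `step__param` convention."""

    def __init__(self, steps):
        self.steps = list(steps)  # list of (name, step) tuples

    def set_params(self, **params):
        names = [name for name, _ in self.steps]
        for key, value in params.items():
            # Split "lags__lag" into ("lags", "__", "lag"); a bare name has no separator
            step_name, sep, param_name = key.partition("__")
            if step_name not in names:
                raise ValueError(f"Unknown step: {step_name!r}")
            idx = names.index(step_name)
            if sep:
                # Nested key: update one parameter of the named step
                self.steps[idx][1].params[param_name] = value
            else:
                # Bare step name: replace (or disable) the whole step
                self.steps[idx] = (step_name, value)
        return self


pipe = ToyPipeline([
    ("deseason", ToyStep(seasonality=4)),
    ("lags", ToyStep(lag=[1, 2, 3])),
])
pipe.set_params(lags__lag=[1, 2], deseason="passthrough")
print(pipe.steps[1][1].params)  # {'lag': [1, 2]}
print(pipe.steps[0])            # ('deseason', 'passthrough')
```

With the real class, the equivalent call would presumably go through `set_params(**params)` listed among the methods of this page; the exact behaviour for disabled steps should be checked against the library.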

Parameters

Name	Type	Description	Default
`steps`	`list of tuples`	List of (name of step, estimator) tuples that are to be chained in sequential order. To be compatible with the scikit-learn API, all steps must define `fit`. All non-last steps must also define `transform`. See Combining Estimators for more details.	required
`memory`	`str or object with the joblib.Memory interface`	Used to cache the fitted transformers of the pipeline. The last step will never be cached, even if it is a transformer. By default, no caching is performed. If a string is given, it is the path to the caching directory. Enabling caching triggers a clone of the transformers before fitting. Therefore, the transformer instance given to the pipeline cannot be inspected directly. Use the attribute `named_steps` or `steps` to inspect estimators within the pipeline. Caching the transformers is advantageous when fitting is time consuming.	`None`
`verbose`	`bool`	If True, the time elapsed while fitting each step will be printed as it is completed.	`False`

Attributes

Name	Type	Description
`named_steps`	`Bunch`	Read-only, dictionary-like object for accessing any step by its user-given name. Keys are step names and values are the corresponding steps.
`n_features_in_`	`int`	Number of features seen during `fit`. Only defined if the underlying first estimator in `steps` exposes such an attribute when fit.
`feature_names_in_`	ndarray of shape (`n_features_in_`,)	Names of features seen during `fit`. Only defined if the underlying estimator exposes such an attribute when fit.

Notes

All input data must include a time column with datetime values. The time column is preserved through all transformations.

The observation_horizon property accumulates across all steps, returning the sum of all transformer observation horizons. This indicates the total amount of historical data required by the pipeline.
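
As a rough illustration of the summation described above, the following sketch adds up per-step horizons. It is not the library's code: `ToyTransformer` and `pipeline_observation_horizon` are hypothetical names, the horizon values 4 and 3 are assumed for illustration, and skipping disabled steps is an assumption as well.

```python
class ToyTransformer:
    """Hypothetical transformer exposing only an observation horizon."""

    def __init__(self, observation_horizon):
        self.observation_horizon = observation_horizon


def pipeline_observation_horizon(steps):
    """Sum the horizons of all steps, skipping disabled ones (assumed behaviour)."""
    return sum(
        step.observation_horizon
        for _, step in steps
        if step is not None and step != "passthrough"
    )


steps = [
    ("deseason", ToyTransformer(observation_horizon=4)),  # assumed: needs 4 past rows
    ("lags", ToyTransformer(observation_horizon=3)),      # assumed: needs 3 past rows
    ("unused", "passthrough"),                            # disabled step adds nothing
]
total = pipeline_observation_horizon(steps)
print(total)  # 7
```

The sum reflects that each step consumes its own history from the output of the step before it, so the first step must receive enough rows to cover every later step.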

The pipeline supports the time series-specific `observe()` method for incremental learning, which lets it incorporate new observations without full retraining.
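
To make the idea of observing without refitting concrete, here is a minimal sketch of a stateful lag-1 transformer. `ToyLag` is a hypothetical class written for this illustration; how the library's transformers actually store and use observed state is an assumption here.

```python
class ToyLag:
    """Hypothetical lag-1 transformer that remembers the most recent value."""

    def fit(self, values):
        self.last_ = values[-1]  # state learned from the training data
        return self

    def observe(self, values):
        self.last_ = values[-1]  # update state only; nothing is refit
        return self

    def transform(self, values):
        # Shift by one step, using the remembered value for the first row
        return [self.last_] + list(values[:-1])


lag = ToyLag().fit([1, 2, 3])
first = lag.transform([4, 5])    # lag of 4 is 3 (remembered), lag of 5 is 4
lag.observe([4, 5])              # incorporate the new rows into the state
second = lag.transform([6, 7])   # lag of 6 is now 5
print(first, second)  # [3, 4] [5, 6]
```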

The final step can be a forecaster, enabling end-to-end forecasting pipelines that transform features and generate predictions.

Examples

>>> import polars as pl
>>> from datetime import datetime, timedelta
>>> from yohou.compose import FeaturePipeline
>>> from yohou.stationarity import SeasonalDifferencing
>>> from yohou.preprocessing import LagTransformer
>>>
>>> # Create sample weekly time series data (52 weeks)
>>> time = pl.datetime_range(
...     start=datetime(2023, 1, 1),
...     end=datetime(2023, 1, 1) + timedelta(weeks=51),
...     interval="1w",
...     eager=True,
... )
>>> data = pl.DataFrame({"time": time, "sales": range(1, 53)})
>>>
>>> # Example 1: Create a sequential preprocessing pipeline
>>> pipe = FeaturePipeline([
...     ("deseason", SeasonalDifferencing(seasonality=4)),
...     ("lags", LagTransformer(lag=[1, 2, 3])),
... ])
>>>
>>> # Example 2: Access individual steps by name
>>> pipe.named_steps["deseason"]
SeasonalDifferencing(...)
>>>
>>> # Example 3: Access individual steps by position
>>> pipe[0]
SeasonalDifferencing(...)

`fit(X, y=None, **params)`

Name	Type	Description	Default
`X`	`iterable`	Training data. Must fulfill input requirements of first step of the pipeline.	required
`y`	`iterable`	Training targets. Must fulfill label requirements for all steps of the pipeline.	`None`
`**params`	`dict of str -> object`	If `enable_metadata_routing=False` (default): Parameters passed to the `fit` method of each step, where each parameter name is prefixed such that parameter `p` for step `s` has key `s__p`. If `enable_metadata_routing=True`: Parameters requested and accepted by steps. Each step must have requested certain metadata for these parameters to be forwarded to them.	`{}`

`transform(X, **params)`

Name	Type	Description	Default
`X`	`iterable`	Data to transform. Must fulfill input requirements of first step of the pipeline.	required
`**params`	`dict of str -> object`	Parameters requested and accepted by steps. Each step must have requested certain metadata for these parameters to be forwarded to them.	`{}`

`observe_transform(X, **params)`

Name	Type	Description	Default
`X`	`DataFrame`	New data to observe and transform. Must fulfill input requirements of first step of the pipeline.	required
`**params`	`dict of str -> object`	Parameters routed to the `transform` methods of the steps. Each step must have requested certain metadata via `set_transform_request()` for these parameters to be forwarded to them.	`{}`

`rewind_transform(X, **params)`

Name	Type	Description	Default
`X`	`DataFrame`	Data to transform and use for rewinding state. Must fulfill input requirements of first step of the pipeline.	required
`**params`	`dict of str -> object`	Parameters routed to the `rewind_transform` methods of the steps. Each step must have requested certain metadata via `set_rewind_transform_request()` for these parameters to be forwarded to them.	`{}`

`inverse_transform(X_t, X_p, **params)`

Name	Type	Description	Default
`X_t`	`DataFrame`	Transformed data to inverse-transform. Must fulfill input requirements of the last step's `inverse_transform` method.	required
`X_p`	`DataFrame`	Untransformed data corresponding to at least `observation_horizon` immediately previous time stamps. Used by stateful steps to reconstruct original-space values during inverse transformation. When `observation_horizon == 0`, this is unused but still required.	required
`**params`	`dict of str -> object`	Parameters requested and accepted by steps. Each step must have requested certain metadata for these parameters to be forwarded to them.	`{}`
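
The role of `X_p` is easiest to see with differencing, where the transformed values alone cannot be mapped back to the original scale. The sketch below is a pure-Python illustration on plain lists, not the library's implementation; `seasonal_difference` and `inverse_seasonal_difference` are hypothetical helper names, and the data is made up for the example.

```python
def seasonal_difference(values, seasonality):
    """Return values[t] - values[t - seasonality]; the first rows are dropped."""
    return [
        values[t] - values[t - seasonality]
        for t in range(seasonality, len(values))
    ]


def inverse_seasonal_difference(x_t, x_p, seasonality):
    """Rebuild original values from differences `x_t` and prior originals `x_p`."""
    if len(x_p) < seasonality:
        raise ValueError("x_p must cover at least `seasonality` previous rows")
    history = list(x_p[-seasonality:])
    restored = []
    for diff in x_t:
        # Add back the value observed one season earlier
        value = diff + history[-seasonality]
        restored.append(value)
        history.append(value)
    return restored


original = [10, 12, 15, 11, 14, 17, 21, 16]
s = 4
x_t = seasonal_difference(original, s)   # transformed data
x_p = original[:s]                       # untransformed rows just before x_t
restored = inverse_seasonal_difference(x_t, x_p, s)
print(x_t)       # [4, 5, 6, 5]
print(restored)  # [14, 17, 21, 16]
```

Here `x_p` plays the part of the `observation_horizon` preceding rows: without them, the differences in `x_t` have no reference level to be added to.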
