Normalizer¶
yohou.preprocessing.sklearn_wrappers.Normalizer
Bases: SklearnTransformer
Normalize samples individually to unit norm.
Each sample (i.e. each row of the data matrix) with at least one non-zero component is rescaled independently of other samples so that its norm (l1, l2 or max) equals one.
This normalizer can be useful as a preprocessing step for classifiers or other algorithms that rely on the angle between vectors, such as cosine similarity for document classification.
This is a Yohou wrapper that preserves the polars DataFrame structure and "time" column.
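As a rough illustration of the row-wise rescaling described above, the following plain-Python sketch normalizes one sample at a time. The helper `normalize_row` is hypothetical and is not part of Yohou or scikit-learn; it only mirrors the documented behavior (each non-zero row is divided by its l1, l2, or max norm, and all-zero rows are left unchanged).

```python
import math


def normalize_row(row, norm="l2"):
    """Hypothetical helper: rescale one sample so its chosen norm equals one."""
    if norm == "l1":
        scale = sum(abs(v) for v in row)
    elif norm == "l2":
        scale = math.sqrt(sum(v * v for v in row))
    elif norm == "max":
        scale = max((abs(v) for v in row), default=0.0)
    else:
        raise ValueError(f"unsupported norm: {norm!r}")
    # Rows with no non-zero component cannot be rescaled; return them as-is.
    if scale == 0:
        return list(row)
    return [v / scale for v in row]


# Same values as columns "a" and "b" in the Examples section, one list per row.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
l2_rows = [normalize_row(r, "l2") for r in rows]
```

Because the three rows point in the same direction, they all map to approximately the same unit vector after l2 normalization, which is why this kind of scaling suits methods that depend only on the angle between samples.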
Parameters¶
| Name | Type | Description | Default |
|---|---|---|---|
| norm | ('l1', 'l2', 'max') | The norm used to normalize each non-zero sample. If norm='max' is used, values are rescaled by the maximum of the absolute values. | 'l1' |
Attributes¶
| Name | Type | Description |
|---|---|---|
| instance_ | Normalizer | The fitted sklearn Normalizer instance. |
Examples¶
>>> import polars as pl
>>> from datetime import datetime
>>> from yohou.preprocessing import Normalizer
>>> X = pl.DataFrame({
... "time": [datetime(2024, 1, i) for i in range(1, 4)],
... "a": [1.0, 2.0, 3.0],
... "b": [2.0, 4.0, 6.0],
... })
>>> normalizer = Normalizer(norm="l2")
>>> normalizer.fit(X)
Normalizer(...)
>>> X_norm = normalizer.transform(X)
>>> # Each row normalized to unit L2 norm
>>> "time" in X_norm.columns
True
See Also¶
StandardScaler: Standardize features by removing mean and scaling to unit variance.