
NumericalDifferentiator

yohou.preprocessing.signal.NumericalDifferentiator

Bases: BaseTransformer

Numerical differentiation transformer for time series signals.

Differentiates each feature column using np.gradient, which uses second-order accurate central differences at interior points and one-sided differences at the boundaries (first- or second-order accurate, depending on order).
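As a minimal sketch of what this means numerically, the snippet below calls np.gradient directly on a small array and reproduces the interior and boundary values by hand. The sample values and the spacing dt are made up for illustration, and the snippet does not use the transformer itself.

```python
import numpy as np

# Illustrative samples of y = t**2 on a uniform grid (values chosen for the example).
dt = 0.5
t = np.arange(5) * dt
y = t**2

# Default edge_order=1, which corresponds to the transformer's default order=1.
dy = np.gradient(y, dt)

# Interior points: central difference (y[i+1] - y[i-1]) / (2 * dt).
central = (y[2:] - y[:-2]) / (2 * dt)

# Boundaries with edge_order=1: forward and backward first differences.
left = (y[1] - y[0]) / dt
right = (y[-1] - y[-2]) / dt

print(dy)
```

The output array has one value per input sample, which is why the transformed series keeps the same length as the input.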

Parameters

Name Type Description Default
order {1, 2}

Accuracy order of the one-sided differences that np.gradient uses at the boundaries:

  • 1: first-order accurate (uses 2 points at each boundary)
  • 2: second-order accurate (uses 3 points at each boundary)

1
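To illustrate the effect of order, the sketch below compares edge_order=1 and edge_order=2 in np.gradient on a quadratic signal. The signal and grid are assumptions chosen so that the exact derivative is known; only the two end points differ between the settings.

```python
import numpy as np

# y = t**2 on an integer grid, so the exact derivative is 2 * t.
t = np.arange(5, dtype=float)
y = t**2
exact = 2 * t

# One-sided differences at the boundaries: 2 points vs. 3 points.
first = np.gradient(y, 1.0, edge_order=1)
second = np.gradient(y, 1.0, edge_order=2)

print("edge_order=1:", first)
print("edge_order=2:", second)
print("exact:       ", exact)
```

For this quadratic, the three-point boundary formula matches the exact derivative at both ends, while the two-point formula is off by one sample spacing. On noisy data the difference may be less clear-cut.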

Attributes

Name Type Description
interval_ str

Detected time interval string (e.g., '1d', '1h', '1s').

sampling_interval_ float

Sampling interval in seconds derived from interval_.
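The relationship between a fixed time step and a value in seconds can be sketched with the standard library alone. The helper fixed_interval_seconds below is hypothetical and is not how yohou detects the interval internally; it only shows the idea of requiring one constant step and converting it with timedelta.total_seconds().

```python
from datetime import datetime, timedelta


def fixed_interval_seconds(times):
    """Hypothetical helper: return the constant spacing of `times` in seconds."""
    # Collect the distinct steps between consecutive timestamps.
    steps = {b - a for a, b in zip(times, times[1:])}
    if len(steps) != 1:
        raise ValueError(f"expected one fixed interval, found {len(steps)}")
    return steps.pop().total_seconds()


start = datetime(2020, 1, 1)
times = [start + timedelta(milliseconds=i) for i in range(100)]
dt = fixed_interval_seconds(times)
print(dt)
```

A series with irregular steps raises ValueError in this sketch, which mirrors the requirement that the differentiator needs a fixed-length interval.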

Examples

>>> import polars as pl
>>> from datetime import datetime, timedelta
>>> from yohou.preprocessing.signal import NumericalDifferentiator
>>> time = [datetime(2020, 1, 1) + timedelta(seconds=i * 0.001) for i in range(100)]
>>> X = pl.DataFrame({"time": time, "signal": [float(i) for i in range(100)]})
>>> transformer = NumericalDifferentiator(order=1)
>>> transformer.fit(X)
NumericalDifferentiator(...)
>>> X_t = transformer.transform(X)
>>> "time" in X_t.columns
True

Notes

  • Output has the same length as the input
  • Interior points use central differences, which are second-order accurate
  • Boundary points use one-sided differences; order sets their accuracy
  • Inverse transform uses cumulative trapezoidal integration with initial=0.0, so each reconstructed column starts at 0
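The last note has a practical consequence that a small NumPy-only sketch can show: differentiating and then integrating from 0 recovers the shape of a signal but not its starting offset. The signal below is an assumption for illustration, and the trapezoid sum is written out by hand; it is intended to mirror what scipy.integrate.cumulative_trapezoid computes with dx=dt and initial=0.0.

```python
import numpy as np

dt = 0.5
t = np.arange(6) * dt
y = 5.0 + 2.0 * t  # linear signal with a nonzero offset

# Forward step: numerical derivative on the uniform grid.
dy = np.gradient(y, dt)

# Inverse step: cumulative trapezoid starting at 0, written out with NumPy only.
areas = 0.5 * (dy[1:] + dy[:-1]) * dt
recovered = np.concatenate(([0.0], np.cumsum(areas)))

# The difference is the constant that integration cannot restore.
offset = y - recovered
print(offset)
```

If the original level matters, the first observed value would have to be added back separately after the inverse transform.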

See Also

  • NumericalIntegrator : Numerical integration transformer.
  • numpy.gradient : NumPy gradient function.

Source Code

class NumericalDifferentiator(BaseTransformer):
    """Numerical differentiation transformer for time series signals.

    Differentiates each feature column using np.gradient, which uses
    second-order accurate central differences in the interior and
    one-sided differences at the boundaries (first- or second-order
    accurate, depending on ``order``).

    Parameters
    ----------
    order : {1, 2}, default=1
        Gradient is calculated using N-th order accurate differences at
        the boundaries:
        - 1: First-order accurate (uses 2 points at boundary)
        - 2: Second-order accurate (uses 3 points at boundary)

    Attributes
    ----------
    interval_ : str
        Detected time interval string (e.g., '1d', '1h', '1s').

    sampling_interval_ : float
        Sampling interval in seconds derived from interval_.

    Examples
    --------
    >>> import polars as pl
    >>> from datetime import datetime, timedelta
    >>> from yohou.preprocessing.signal import NumericalDifferentiator
    >>> time = [datetime(2020, 1, 1) + timedelta(seconds=i * 0.001) for i in range(100)]
    >>> X = pl.DataFrame({"time": time, "signal": [float(i) for i in range(100)]})
    >>> transformer = NumericalDifferentiator(order=1)
    >>> transformer.fit(X)
    NumericalDifferentiator(...)
    >>> X_t = transformer.transform(X)
    >>> "time" in X_t.columns
    True

    Notes
    -----
    - Output has the same length as input
    - Uses central differences in the interior (second-order accurate)
    - Uses one-sided differences at boundaries (order controls accuracy)
    - Inverse transform uses cumulative trapezoidal integration with
      ``initial=0.0``, so each reconstructed column starts at 0

    See Also
    --------
    - [`NumericalIntegrator`][yohou.preprocessing.signal.NumericalIntegrator] : Numerical integration transformer.
    - `numpy.gradient` : NumPy gradient function.

    """

    _parameter_constraints: dict = {
        "order": [Interval(numbers.Integral, 1, 2, closed="both")],
    }

    _tags = {"invertible": True}

    def __init__(self, order: StrictInt = 1):
        self.order = order

    def _fit(self, X: pl.DataFrame, y: pl.DataFrame | None = None) -> None:
        """Fit the internal model."""
        # Detect interval using utility function
        self.interval_ = check_interval_consistency(X)
        td = interval_to_timedelta(self.interval_)
        if td is None:
            raise ValueError(
                f"NumericalDifferentiator requires fixed-length intervals, but got variable interval: {self.interval_}"
            )
        self.sampling_interval_ = td.total_seconds()

    def _transform(self, X: pl.DataFrame) -> pl.DataFrame:
        """Differentiate each feature column.

        Parameters
        ----------
        X : pl.DataFrame
            Validated input time series.

        Returns
        -------
        pl.DataFrame
            Differentiated time series.

        """
        time = X.select(cs.by_name("time"))
        data = X.select(~cs.by_name("time"))

        dt = self.sampling_interval_

        # Differentiate each column
        result_cols = {}
        for col_name in data.columns:
            col_values = data[col_name].to_numpy()
            # Cast order to Literal[1, 2] for numpy typing (validated in parameter_constraints)
            differentiated = np.gradient(col_values, dt, edge_order=cast(Literal[1, 2], self.order))
            result_cols[col_name] = differentiated

        X_t = pl.DataFrame(result_cols)
        feature_names = self.get_feature_names_out()
        X_t = X_t.rename(dict(zip(X_t.columns, feature_names, strict=False)))
        X_t = pl.concat([time, X_t], how="horizontal")

        return X_t

    def _inverse_transform(self, X_t: pl.DataFrame, X_p: pl.DataFrame | None = None) -> pl.DataFrame:
        """Integrate to reverse differentiation.

        Parameters
        ----------
        X_t : pl.DataFrame
            Differentiated time series.
        X_p : pl.DataFrame or None
            Not used for this stateless transformer.

        Returns
        -------
        pl.DataFrame
            Inverse-transformed time series.

        """
        X_t, _ = validate_transformer_data(
            self,
            X=X_t,
            reset=False,
            inverse=True,
            X_p=X_p,
            observation_horizon=0,
        )

        time = X_t.select(cs.by_name("time"))
        data = X_t.select(~cs.by_name("time"))

        dt = self.sampling_interval_

        # Integrate each column
        result_cols = {}
        for col_name in data.columns:
            col_values = data[col_name].to_numpy()
            integrated = scipy.integrate.cumulative_trapezoid(col_values, x=None, dx=dt, initial=0.0)
            result_cols[col_name] = integrated

        X = pl.DataFrame(result_cols)
        X = X.rename(dict(zip(X.columns, self.feature_names_in_, strict=False)))
        X = pl.concat([time, X], how="horizontal")

        return X

    def get_feature_names_out(self, input_features: list[str] | None = None) -> list[str]:
        """Get output feature names for transformation.

        Parameters
        ----------
        input_features : array-like of str or None, default=None
            Column names of the input features.  If ``None``, uses the
            feature names seen during ``fit``.

        Returns
        -------
        list of str
            Output feature names after transformation.

        """
        input_features = _check_feature_names_in(self, input_features)
        return [f"{col}_differentiated" for col in input_features]

Methods

get_feature_names_out(input_features=None)

Get output feature names for transformation.

Parameters
Name Type Description Default
input_features array-like of str or None

Column names of the input features. If None, uses the feature names seen during fit.

None
Returns
Type Description
list of str

Output feature names after transformation.
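The naming rule can be sketched in plain Python. The column names below are hypothetical; the suffix is the one used in the source of this method, and the reverse mapping imitates the positional rename that the inverse transform applies.

```python
# Hypothetical input column names, excluding the "time" column.
feature_names_in = ["signal", "temperature"]

# Forward direction: append the suffix used by get_feature_names_out.
feature_names_out = [f"{col}_differentiated" for col in feature_names_in]

# Inverse direction: map output names back to the input names by position.
rename_back = dict(zip(feature_names_out, feature_names_in))

print(feature_names_out)
print(rename_back)
```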

Source Code
def get_feature_names_out(self, input_features: list[str] | None = None) -> list[str]:
    """Get output feature names for transformation.

    Parameters
    ----------
    input_features : array-like of str or None, default=None
        Column names of the input features.  If ``None``, uses the
        feature names seen during ``fit``.

    Returns
    -------
    list of str
        Output feature names after transformation.

    """
    input_features = _check_feature_names_in(self, input_features)
    return [f"{col}_differentiated" for col in input_features]

Tutorials

The following example notebooks use this component:

  • How to Apply Signal Processing Filters


    Data-Features

    Apply NumericalFilter (Butterworth, Chebyshev, Bessel), NumericalDifferentiator, and NumericalIntegrator for signal smoothing and rate-of-change extraction.
