plot_resampling_comparison

yohou.plotting.exploration.plot_resampling_comparison(df_original, df_resampled, *, columns=None, original_label='Original', resampled_label='Resampled', groups=None, facet_by='member', facet_n_cols=2, color_palette=None, show_legend=True, title=None, x_label=None, y_label=None, width=None, height=None, connect_gaps=False, resampler=None, original_line_width=1.0, original_line_opacity=0.4, original_line_dash='solid', resampled_line_width=2.5, resampled_line_opacity=1.0, resampled_line_dash='solid')

Plot original vs resampled time series for comparison.

Overlays two versions of the same series at different temporal resolutions to visually assess information loss from aggregation.

Parameters

df_original : DataFrame, required
    Original (higher-frequency) DataFrame with 'time' column.
df_resampled : DataFrame, required
    Resampled (lower-frequency) DataFrame with 'time' column.
columns : str | list[str] | None, default=None
    Column(s) to compare. If None, uses all numeric columns except 'time' from df_resampled.
original_label : str, default="Original"
    Legend label for the original series.
resampled_label : str, default="Resampled"
    Legend label for the resampled series.
groups : list[str] | None, default=None
    Panel group prefixes to plot.
facet_by : Literal['group', 'member'] | None, default="member"
    Faceting axis for panel data. "group" creates one subplot per group, "member" one per member. None disables faceting. Ignored for non-panel data.
facet_n_cols : int, default=2
    Number of columns in facet grid.
color_palette : list[str] | None, default=None
    Custom color palette.
show_legend : bool, default=True
    Whether to show the legend.
title : str | None, default=None
    Plot title.
x_label : str | None, default=None
    X-axis label.
y_label : str | None, default=None
    Y-axis label.
width : int | None, default=None
    Plot width in pixels.
height : int | None, default=None
    Plot height in pixels.
connect_gaps : bool, default=False
    Whether to connect gaps in the data with lines.
resampler : bool | Literal['widget'] | None, default=None
    Enable plotly-resampler for large datasets. True or "widget" creates a FigureWidgetResampler; False or None uses a plain go.Figure.
original_line_width : float, default=1.0
    Width of the original series line in pixels.
original_line_opacity : float, default=0.4
    Opacity of the original series line.
original_line_dash : str, default="solid"
    Dash style for the original series line.
resampled_line_width : float, default=2.5
    Width of the resampled series line in pixels.
resampled_line_opacity : float, default=1.0
    Opacity of the resampled series line.
resampled_line_dash : str, default="solid"
    Dash style for the resampled series line.

Returns

Figure
    Plotly figure object.

Raises

TypeError
    If either DataFrame is not a Polars DataFrame.
ValueError
    If a requested column is missing from either DataFrame.
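
The ValueError condition for the non-panel case can be sketched in isolation. The helper below is hypothetical (`check_columns` is not part of yohou); it only mirrors the loop in the source listing that checks each selected column against the original DataFrame's columns, using plain lists in place of DataFrames.

```python
def check_columns(columns, original_columns):
    """Hypothetical helper: raise if a requested column is missing from the original frame."""
    for col in columns:
        if col not in original_columns:
            raise ValueError(
                f"Column '{col}' not found in original DataFrame. "
                f"Available: {original_columns}"
            )


# A column present in the original frame passes silently.
check_columns(["y"], ["time", "y"])

# A column that is absent triggers the error.
try:
    check_columns(["z"], ["time", "y"])
except ValueError as exc:
    message = str(exc)
    print(message)
```

Catching the error this way is mainly useful when `columns` is built programmatically and the two frames may have diverged after aggregation.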

Examples

>>> import polars as pl
>>> from yohou.plotting import plot_resampling_comparison
>>> hourly = pl.DataFrame({
...     "time": pl.datetime_range(
...         pl.datetime(2020, 1, 1),
...         pl.datetime(2020, 1, 2, 23),
...         "1h",
...         eager=True,
...     ),
...     "y": list(range(48)),
... })
>>> daily = hourly.group_by_dynamic("time", every="1d").agg(pl.col("y").mean())
>>> fig = plot_resampling_comparison(hourly, daily, columns="y")
>>> len(fig.data)
2
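
The plot shows information loss visually. As a rough numeric companion, the sketch below uses plain Python (no yohou or Polars calls) on the same 48 hourly values: it aggregates them to daily means, holds each mean constant across its day, and measures the largest deviation from the original. The variable names are illustrative only.

```python
from statistics import mean

# 48 hourly observations over two days, matching the "y" column above.
hourly = list(range(48))

# Aggregate to daily means (24 observations per day).
daily = [mean(hourly[i:i + 24]) for i in range(0, len(hourly), 24)]

# Hold each daily mean constant over its 24 hours and compare with the original.
reconstructed = [daily[i // 24] for i in range(len(hourly))]
max_abs_error = max(abs(h - r) for h, r in zip(hourly, reconstructed))

print(daily)          # [11.5, 35.5]
print(max_abs_error)  # 11.5
```

A large gap between the original and the step-wise reconstruction is what appears in the figure as the faint original line straying far from the bold resampled line.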

See Also

plot_time_series : Plot basic time series.
plot_rolling_statistics : Plot rolling window statistics.

Source Code

def plot_resampling_comparison(
    df_original: pl.DataFrame,
    df_resampled: pl.DataFrame,
    *,
    columns: str | list[str] | None = None,
    original_label: str = "Original",
    resampled_label: str = "Resampled",
    groups: list[str] | None = None,
    facet_by: Literal["group", "member"] | None = "member",
    facet_n_cols: int = 2,
    color_palette: list[str] | None = None,
    show_legend: bool = True,
    title: str | None = None,
    x_label: str | None = None,
    y_label: str | None = None,
    width: int | None = None,
    height: int | None = None,
    connect_gaps: bool = False,
    resampler: bool | Literal["widget"] | None = None,
    original_line_width: float = 1.0,
    original_line_opacity: float = 0.4,
    original_line_dash: str = "solid",
    resampled_line_width: float = 2.5,
    resampled_line_opacity: float = 1.0,
    resampled_line_dash: str = "solid",
) -> go.Figure:
    """
    Plot original vs resampled time series for comparison.

    Overlays two versions of the same series at different temporal
    resolutions to visually assess information loss from aggregation.

    Parameters
    ----------
    df_original : pl.DataFrame
        Original (higher-frequency) DataFrame with 'time' column.
    df_resampled : pl.DataFrame
        Resampled (lower-frequency) DataFrame with 'time' column.
    columns : str | list[str] | None, default=None
        Column(s) to compare. If None, uses all numeric columns except 'time'
        from df_resampled.
    original_label : str, default="Original"
        Legend label for the original series.
    resampled_label : str, default="Resampled"
        Legend label for the resampled series.
    groups : list[str] | None, default=None
        Panel group prefixes to plot.
    facet_by : Literal["group", "member"] | None, default="member"
        Faceting axis for panel data.  ``"group"`` creates one subplot per
        group, ``"member"`` one per member.  ``None`` disables faceting.
        Ignored for non-panel data.
    facet_n_cols : int, default=2
        Number of columns in facet grid.
    color_palette : list[str] | None, default=None
        Custom color palette.
    show_legend : bool, default=True
        Whether to show the legend.
    title : str | None, default=None
        Plot title.
    x_label : str | None, default=None
        X-axis label.
    y_label : str | None, default=None
        Y-axis label.
    width : int | None, default=None
        Plot width in pixels.
    height : int | None, default=None
        Plot height in pixels.
    connect_gaps : bool, default=False
        Whether to connect gaps in the data with lines.
    resampler : bool | Literal["widget"] | None, default=None
        Enable plotly-resampler for large datasets.  ``True`` or
        ``"widget"`` creates a ``FigureWidgetResampler``; ``False`` or
        ``None`` uses a plain ``go.Figure``.
    original_line_width : float, default=1.0
        Width of the original series line in pixels.
    original_line_opacity : float, default=0.4
        Opacity of the original series line.
    original_line_dash : str, default="solid"
        Dash style for the original series line.
    resampled_line_width : float, default=2.5
        Width of the resampled series line in pixels.
    resampled_line_opacity : float, default=1.0
        Opacity of the resampled series line.
    resampled_line_dash : str, default="solid"
        Dash style for the resampled series line.

    Returns
    -------
    go.Figure
        Plotly figure object.

    Raises
    ------
    TypeError
        If either DataFrame is not a Polars DataFrame.
    ValueError
        If columns don't exist in both DataFrames.

    Examples
    --------
    >>> import polars as pl
    >>> from yohou.plotting import plot_resampling_comparison

    >>> hourly = pl.DataFrame({
    ...     "time": pl.datetime_range(
    ...         pl.datetime(2020, 1, 1),
    ...         pl.datetime(2020, 1, 2, 23),
    ...         "1h",
    ...         eager=True,
    ...     ),
    ...     "y": list(range(48)),
    ... })
    >>> daily = hourly.group_by_dynamic("time", every="1d").agg(pl.col("y").mean())
    >>> fig = plot_resampling_comparison(hourly, daily, columns="y")
    >>> len(fig.data)
    2

    See Also
    --------
    [`plot_time_series`][yohou.plotting.plot_time_series] : Plot basic time series.
    [`plot_rolling_statistics`][yohou.plotting.plot_rolling_statistics] : Plot rolling window statistics.
    """
    # Validate both DataFrames
    validate_plotting_data(df_original)
    validate_plotting_data(df_resampled)
    validate_plotting_params(width=width, height=height)

    # Get styling parameters
    original_width = original_line_width
    original_opacity = original_line_opacity
    original_dash = original_line_dash
    resampled_width = resampled_line_width
    resampled_opacity = resampled_line_opacity
    resampled_dash = resampled_line_dash

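    # Panel data detected with no explicit selection: setting ``groups`` to a
    # non-None value routes execution into the panel branch below.
    if groups is None and columns is None and _auto_detect_panel(df_resampled):
        groups = []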

    if groups is not None:

        def _render_resampling(ctx: RenderContext) -> None:
            """Render original and resampled traces for a single panel."""
            base = [c for c in ctx.sub_df.columns if c != "time"][0]
            _colors = resolve_color_palette(color_palette, 1)
            # Original (from df_original, matching the same full column name)
            full_col = [c for c in df_original.columns if c.endswith(f"__{base}") or c == base]
            if full_col:
                orig_col = full_col[0]
                ctx.fig.add_trace(
                    go.Scatter(
                        x=df_original["time"],
                        y=df_original[orig_col],
                        mode="lines",
                        line={"color": _colors[0], "width": original_width, "dash": original_dash},
                        opacity=original_opacity,
                        name=original_label,
                        showlegend=False,
                    ),
                    row=ctx.row,
                    col=ctx.col,
                )
            # Resampled
            ctx.fig.add_trace(
                go.Scatter(
                    x=ctx.sub_df["time"],
                    y=ctx.sub_df[base],
                    mode="lines+markers",
                    line={"color": _colors[0], "width": resampled_width, "dash": resampled_dash},
                    opacity=resampled_opacity,
                    name=resampled_label,
                    showlegend=False,
                ),
                row=ctx.row,
                col=ctx.col,
            )

        effective_facet_by = facet_by or "member"
        fig = facet_figure(
            df_resampled,
            _render_resampling,
            groups=groups,
            columns=columns,
            facet_by=effective_facet_by,
            facet_n_cols=facet_n_cols,
            title=title or f"{original_label} vs {resampled_label}",
            x_label=x_label or "Time",
            y_label=y_label,
            width=width,
            height=height,
            resampler=resampler,
        )
        fig.update_layout(showlegend=show_legend)
        return fig

    # Non-panel case: column-mode facet_figure
    plot_columns = validate_plotting_data(df_resampled, columns=columns, exclude=["time"])
    for col in plot_columns:
        if col not in df_original.columns:
            msg = f"Column '{col}' not found in original DataFrame. Available: {df_original.columns}"
            raise ValueError(msg)

    _colors = resolve_color_palette(color_palette, len(plot_columns))
    _col_colors = dict(zip(plot_columns, _colors, strict=False))

    def _render_resampling(ctx: RenderContext) -> None:
        """Render original vs resampled for one column into a subplot."""
        base = ctx.display_name
        col_color = _col_colors[base]
        ctx.fig.add_trace(
            go.Scatter(
                x=df_original["time"],
                y=df_original[base],
                mode="lines",
                name=f"{base} ({original_label})",
                line={"color": col_color, "width": original_width, "dash": original_dash},
                opacity=original_opacity,
                connectgaps=connect_gaps,
                hovertemplate=f"<b>{base} ({original_label})</b><br>%{{x}}<br>%{{y:.2f}}<extra></extra>",
            ),
            row=ctx.row,
            col=ctx.col,
        )
        ctx.fig.add_trace(
            go.Scatter(
                x=ctx.sub_df["time"],
                y=ctx.sub_df[base],
                mode="lines+markers",
                name=f"{base} ({resampled_label})",
                line={"color": col_color, "width": resampled_width, "dash": resampled_dash},
                opacity=resampled_opacity,
                connectgaps=connect_gaps,
                hovertemplate=f"<b>{base} ({resampled_label})</b><br>%{{x}}<br>%{{y:.2f}}<extra></extra>",
            ),
            row=ctx.row,
            col=ctx.col,
        )

    fig = facet_figure(
        df_resampled,
        _render_resampling,
        columns=plot_columns,
        facet_n_cols=facet_n_cols,
        title=title or f"{original_label} vs {resampled_label}",
        x_label=x_label or "Time",
        y_label=y_label,
        width=width,
        height=height,
        resampler=resampler,
    )
    fig.update_layout(showlegend=show_legend)

    return fig
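
In the panel branch of the listing above, the original column for each subplot is found by suffix matching. A minimal sketch of that rule follows, assuming panel columns are named `group__member` (an assumption inferred from the `endswith(f"__{base}")` test). The helper name and the sample column names are hypothetical.

```python
def match_original_column(base, original_columns):
    """Hypothetical helper mirroring the matching rule in the panel renderer."""
    matches = [c for c in original_columns if c.endswith(f"__{base}") or c == base]
    # The first match in column order wins, as in the listing.
    return matches[0] if matches else None


cols = ["time", "sales__store_1", "sales__store_2", "returns__store_1"]

print(match_original_column("store_1", cols))      # sales__store_1
print(match_original_column("y", ["time", "y"]))   # y
print(match_original_column("missing", cols))      # None
```

Note that `returns__store_1` also ends with `__store_1` but is not selected. Under this rule, when several groups share a member name, only the first matching column in the original DataFrame is used.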

Tutorials

The following example notebooks use this component:

  • Exploratory Visualization


    Visualization

    Exploratory time series visualisation with raw series plots, rolling statistics overlays, seasonal overlays, subseries diagnostics, distribution boxplots, missing data pattern auditing, outlier detection, and resampling comparison.
