plot_resampling_comparison

yohou.plotting.exploration.plot_resampling_comparison(df_original, df_resampled, *, columns=None, original_label='Original', resampled_label='Resampled', groups=None, facet_by='member', facet_n_cols=2, color_palette=None, show_legend=True, title=None, x_label=None, y_label=None, width=None, height=None, connect_gaps=False, resampler=None, original_line_width=1.0, original_line_opacity=0.4, original_line_dash='solid', resampled_line_width=2.5, resampled_line_opacity=1.0, resampled_line_dash='solid')

Plot original vs resampled time series for comparison.

Overlays two versions of the same series at different temporal resolutions to visually assess information loss from aggregation.
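
To make "information loss from aggregation" concrete, here is a minimal standard-library sketch. It uses the same 48-value hourly ramp as the Examples section and assumes a daily-mean aggregation. The names `reconstructed` and `max_abs_error` are illustrative and are not part of yohou.

```python
from statistics import mean

# 48 hourly values, the same ramp used in the Examples section.
hourly = list(range(48))

# Aggregate to daily means, 24 observations per day.
daily = [mean(hourly[i:i + 24]) for i in range(0, len(hourly), 24)]

# Hold each daily mean constant over its 24 hours, then measure how far
# the aggregate sits from the original values.
reconstructed = [daily[i // 24] for i in range(len(hourly))]
max_abs_error = max(abs(h - r) for h, r in zip(hourly, reconstructed))

print(daily)          # [11.5, 35.5]
print(max_abs_error)  # 11.5
```

The plot shows the same gap visually: the faint original line moves within each day while the resampled line keeps only one value per day.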

Parameters

Name	Type	Description	Default
`df_original`	`DataFrame`	Original (higher-frequency) DataFrame with 'time' column.	required
`df_resampled`	`DataFrame`	Resampled (lower-frequency) DataFrame with 'time' column.	required
`columns`	`str \| list[str] \| None`	Column(s) to compare. If None, uses all numeric columns except 'time' from df_resampled.	`None`
`original_label`	`str`	Legend label for the original series.	`"Original"`
`resampled_label`	`str`	Legend label for the resampled series.	`"Resampled"`
`groups`	`list[str] \| None`	Panel group prefixes to plot.	`None`
`facet_by`	`Literal['group', 'member'] \| None`	Faceting axis for panel data. `"group"` creates one subplot per group, `"member"` one per member. In the source listed below, `None` falls back to `"member"` for panel data (`facet_by or "member"`). Ignored for non-panel data.	`"member"`
`facet_n_cols`	`int`	Number of columns in facet grid.	`2`
`color_palette`	`list[str] \| None`	Custom color palette.	`None`
`show_legend`	`bool`	Whether to show the legend.	`True`
`title`	`str \| None`	Plot title.	`None`
`x_label`	`str \| None`	X-axis label.	`None`
`y_label`	`str \| None`	Y-axis label.	`None`
`width`	`int \| None`	Plot width in pixels.	`None`
`height`	`int \| None`	Plot height in pixels.	`None`
`connect_gaps`	`bool`	Whether to connect gaps in the data with lines.	`False`
`resampler`	`bool \| Literal['widget'] \| None`	Enable plotly-resampler for large datasets. `True` or `"widget"` creates a `FigureWidgetResampler`; `False` or `None` uses a plain `go.Figure`.	`None`
`original_line_width`	`float`	Width of the original series line in pixels.	`1.0`
`original_line_opacity`	`float`	Opacity of the original series line.	`0.4`
`original_line_dash`	`str`	Dash style for the original series line.	`"solid"`
`resampled_line_width`	`float`	Width of the resampled series line in pixels.	`2.5`
`resampled_line_opacity`	`float`	Opacity of the resampled series line.	`1.0`
`resampled_line_dash`	`str`	Dash style for the resampled series line.	`"solid"`
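
The styling parameters can be collected into one dictionary and forwarded as keyword arguments. The sketch below takes its parameter names from the table above. The values and the helper `plot_styled_comparison` are illustrative assumptions, and `"dot"` is assumed to be accepted because it is a standard Plotly dash style.

```python
# Keyword arguments named in the parameter table; the values are illustrative.
STYLE = {
    "original_label": "Hourly",
    "resampled_label": "Daily mean",
    "original_line_dash": "dot",
    "original_line_opacity": 0.3,
    "resampled_line_width": 3.0,
}


def plot_styled_comparison(df_original, df_resampled, columns=None):
    """Hypothetical helper: forward the style overrides to plot_resampling_comparison."""
    # Imported lazily so the style dictionary can be inspected without yohou installed.
    from yohou.plotting import plot_resampling_comparison

    return plot_resampling_comparison(
        df_original, df_resampled, columns=columns, **STYLE
    )
```

With the `hourly` and `daily` frames from the Examples section, `plot_styled_comparison(hourly, daily, columns="y")` should return a figure with a dotted, fainter original line and a thicker resampled line.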

Returns

Type	Description
`Figure`	Plotly figure object.

Raises

Type	Description
`TypeError`	If either DataFrame is not a Polars DataFrame.
`ValueError`	If a requested column is missing from either DataFrame.
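
To avoid the `ValueError`, the column sets can be compared before plotting. The following is a rough sketch working on plain lists of column names, such as `df.columns` from a Polars DataFrame. The helper `missing_in_original` is hypothetical, and it skips the numeric-dtype filtering that the library applies when `columns` is `None`.

```python
def missing_in_original(original_columns, resampled_columns, columns=None):
    """Hypothetical helper: list requested columns absent from the original frame."""
    if columns is None:
        # Simplification: every non-time column, without the numeric-dtype filter.
        requested = [c for c in resampled_columns if c != "time"]
    elif isinstance(columns, str):
        requested = [columns]
    else:
        requested = list(columns)
    return [c for c in requested if c not in original_columns]


print(missing_in_original(["time", "y"], ["time", "y", "y_max"]))  # ['y_max']
```

An empty result means every requested column is present in the original frame as well.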

Examples

>>> import polars as pl
>>> from yohou.plotting import plot_resampling_comparison

>>> hourly = pl.DataFrame({
...     "time": pl.datetime_range(
...         pl.datetime(2020, 1, 1),
...         pl.datetime(2020, 1, 2, 23),
...         "1h",
...         eager=True,
...     ),
...     "y": list(range(48)),
... })
>>> daily = hourly.group_by_dynamic("time", every="1d").agg(pl.col("y").mean())
>>> fig = plot_resampling_comparison(hourly, daily, columns="y")
>>> len(fig.data)
2

Source Code

def plot_resampling_comparison(
    df_original: pl.DataFrame,
    df_resampled: pl.DataFrame,
    *,
    columns: str | list[str] | None = None,
    original_label: str = "Original",
    resampled_label: str = "Resampled",
    groups: list[str] | None = None,
    facet_by: Literal["group", "member"] | None = "member",
    facet_n_cols: int = 2,
    color_palette: list[str] | None = None,
    show_legend: bool = True,
    title: str | None = None,
    x_label: str | None = None,
    y_label: str | None = None,
    width: int | None = None,
    height: int | None = None,
    connect_gaps: bool = False,
    resampler: bool | Literal["widget"] | None = None,
    original_line_width: float = 1.0,
    original_line_opacity: float = 0.4,
    original_line_dash: str = "solid",
    resampled_line_width: float = 2.5,
    resampled_line_opacity: float = 1.0,
    resampled_line_dash: str = "solid",
) -> go.Figure:
    """
    Plot original vs resampled time series for comparison.

    Overlays two versions of the same series at different temporal
    resolutions to visually assess information loss from aggregation.

    Parameters
    ----------
    df_original : pl.DataFrame
        Original (higher-frequency) DataFrame with 'time' column.
    df_resampled : pl.DataFrame
        Resampled (lower-frequency) DataFrame with 'time' column.
    columns : str | list[str] | None, default=None
        Column(s) to compare. If None, uses all numeric columns except 'time'
        from df_resampled.
    original_label : str, default="Original"
        Legend label for the original series.
    resampled_label : str, default="Resampled"
        Legend label for the resampled series.
    groups : list[str] | None, default=None
        Panel group prefixes to plot.
    facet_by : Literal["group", "member"] | None, default="member"
        Faceting axis for panel data.  ``"group"`` creates one subplot per
        group, ``"member"`` one per member.  ``None`` disables faceting.
        Ignored for non-panel data.
    facet_n_cols : int, default=2
        Number of columns in facet grid.
    color_palette : list[str] | None, default=None
        Custom color palette.
    show_legend : bool, default=True
        Whether to show the legend.
    title : str | None, default=None
        Plot title.
    x_label : str | None, default=None
        X-axis label.
    y_label : str | None, default=None
        Y-axis label.
    width : int | None, default=None
        Plot width in pixels.
    height : int | None, default=None
        Plot height in pixels.
    connect_gaps : bool, default=False
        Whether to connect gaps in the data with lines.
    resampler : bool | Literal["widget"] | None, default=None
        Enable plotly-resampler for large datasets.  ``True`` or
        ``"widget"`` creates a ``FigureWidgetResampler``; ``False`` or
        ``None`` uses a plain ``go.Figure``.
    original_line_width : float, default=1.0
        Width of the original series line in pixels.
    original_line_opacity : float, default=0.4
        Opacity of the original series line.
    original_line_dash : str, default="solid"
        Dash style for the original series line.
    resampled_line_width : float, default=2.5
        Width of the resampled series line in pixels.
    resampled_line_opacity : float, default=1.0
        Opacity of the resampled series line.
    resampled_line_dash : str, default="solid"
        Dash style for the resampled series line.

    Returns
    -------
    go.Figure
        Plotly figure object.

    Raises
    ------
    TypeError
        If either DataFrame is not a Polars DataFrame.
    ValueError
        If columns don't exist in both DataFrames.

    Examples
    --------
    >>> import polars as pl
    >>> from yohou.plotting import plot_resampling_comparison

    >>> hourly = pl.DataFrame({
    ...     "time": pl.datetime_range(
    ...         pl.datetime(2020, 1, 1),
    ...         pl.datetime(2020, 1, 2, 23),
    ...         "1h",
    ...         eager=True,
    ...     ),
    ...     "y": list(range(48)),
    ... })
    >>> daily = hourly.group_by_dynamic("time", every="1d").agg(pl.col("y").mean())
    >>> fig = plot_resampling_comparison(hourly, daily, columns="y")
    >>> len(fig.data)
    2

    See Also
    --------
    [`plot_time_series`][yohou.plotting.plot_time_series] : Plot basic time series.
    [`plot_rolling_statistics`][yohou.plotting.plot_rolling_statistics] : Plot rolling window statistics.
    """
    # Validate both DataFrames
    validate_plotting_data(df_original)
    validate_plotting_data(df_resampled)
    validate_plotting_params(width=width, height=height)

    # Get styling parameters
    original_width = original_line_width
    original_opacity = original_line_opacity
    original_dash = original_line_dash
    resampled_width = resampled_line_width
    resampled_opacity = resampled_line_opacity
    resampled_dash = resampled_line_dash

    if groups is None and columns is None and _auto_detect_panel(df_resampled):
        groups = []

    if groups is not None:

        def _render_resampling(ctx: RenderContext) -> None:
            """Render original and resampled traces for a single panel."""
            base = [c for c in ctx.sub_df.columns if c != "time"][0]
            _colors = resolve_color_palette(color_palette, 1)
            # Original (from df_original, matching the same full column name)
            full_col = [c for c in df_original.columns if c.endswith(f"__{base}") or c == base]
            if full_col:
                orig_col = full_col[0]
                ctx.fig.add_trace(
                    go.Scatter(
                        x=df_original["time"],
                        y=df_original[orig_col],
                        mode="lines",
                        line={"color": _colors[0], "width": original_width, "dash": original_dash},
                        opacity=original_opacity,
                        name=original_label,
                        showlegend=False,
                    ),
                    row=ctx.row,
                    col=ctx.col,
                )
            # Resampled
            ctx.fig.add_trace(
                go.Scatter(
                    x=ctx.sub_df["time"],
                    y=ctx.sub_df[base],
                    mode="lines+markers",
                    line={"color": _colors[0], "width": resampled_width, "dash": resampled_dash},
                    opacity=resampled_opacity,
                    name=resampled_label,
                    showlegend=False,
                ),
                row=ctx.row,
                col=ctx.col,
            )

        effective_facet_by = facet_by or "member"
        fig = facet_figure(
            df_resampled,
            _render_resampling,
            groups=groups,
            columns=columns,
            facet_by=effective_facet_by,
            facet_n_cols=facet_n_cols,
            title=title or f"{original_label} vs {resampled_label}",
            x_label=x_label or "Time",
            y_label=y_label,
            width=width,
            height=height,
            resampler=resampler,
        )
        fig.update_layout(showlegend=show_legend)
        return fig

    # Non-panel case: column-mode facet_figure
    plot_columns = validate_plotting_data(df_resampled, columns=columns, exclude=["time"])
    for col in plot_columns:
        if col not in df_original.columns:
            msg = f"Column '{col}' not found in original DataFrame. Available: {df_original.columns}"
            raise ValueError(msg)

    _colors = resolve_color_palette(color_palette, len(plot_columns))
    _col_colors = dict(zip(plot_columns, _colors, strict=False))

    def _render_resampling(ctx: RenderContext) -> None:
        """Render original vs resampled for one column into a subplot."""
        base = ctx.display_name
        col_color = _col_colors[base]
        ctx.fig.add_trace(
            go.Scatter(
                x=df_original["time"],
                y=df_original[base],
                mode="lines",
                name=f"{base} ({original_label})",
                line={"color": col_color, "width": original_width, "dash": original_dash},
                opacity=original_opacity,
                connectgaps=connect_gaps,
                hovertemplate=f"<b>{base} ({original_label})</b><br>%{{x}}<br>%{{y:.2f}}<extra></extra>",
            ),
            row=ctx.row,
            col=ctx.col,
        )
        ctx.fig.add_trace(
            go.Scatter(
                x=ctx.sub_df["time"],
                y=ctx.sub_df[base],
                mode="lines+markers",
                name=f"{base} ({resampled_label})",
                line={"color": col_color, "width": resampled_width, "dash": resampled_dash},
                opacity=resampled_opacity,
                connectgaps=connect_gaps,
                hovertemplate=f"<b>{base} ({resampled_label})</b><br>%{{x}}<br>%{{y:.2f}}<extra></extra>",
            ),
            row=ctx.row,
            col=ctx.col,
        )

    fig = facet_figure(
        df_resampled,
        _render_resampling,
        columns=plot_columns,
        facet_n_cols=facet_n_cols,
        title=title or f"{original_label} vs {resampled_label}",
        x_label=x_label or "Time",
        y_label=y_label,
        width=width,
        height=height,
        resampler=resampler,
    )
    fig.update_layout(showlegend=show_legend)

    return fig
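
In the panel branch of the listing above, the original-series column is found by matching on a `__{base}` suffix or an exact name, and the first match is used. The sketch below isolates that lookup on plain lists. The helper name `match_original_column` and the sample column names are illustrative, and the value `base` takes in practice depends on `facet_figure`, which is not shown here.

```python
def match_original_column(original_columns, base):
    """Return the first column equal to `base` or ending in `__{base}`, else None."""
    matches = [c for c in original_columns if c.endswith(f"__{base}") or c == base]
    return matches[0] if matches else None


panel_columns = ["time", "north__sales", "south__sales"]
print(match_original_column(panel_columns, "sales"))   # north__sales
print(match_original_column(["time", "y"], "y"))       # y
print(match_original_column(["time", "y"], "volume"))  # None
```

When no column matches, the panel branch skips the original trace and draws only the resampled one.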

Tutorials

The following example notebooks use this component:

Exploratory Visualization

Visualization

Exploratory time series visualisation with raw series plots, rolling statistics overlays, seasonal overlays, subseries diagnostics, distribution boxplots, missing data pattern auditing, outlier detection, and resampling comparison.