fetch_hospital

yohou.datasets._fetchers.fetch_hospital(*, n_series=None, data_home=None, download_if_missing=True, n_retries=3, delay=1.0)

Fetch the Hospital dataset from Monash/Zenodo.

767 monthly time series representing patient counts related to medical products from January 2000 to December 2006.
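As a quick sanity check on the stated date range, the sketch below counts the monthly timestamps between January 2000 and December 2006. It assumes each series has exactly one observation per month over the full span and that timestamps fall on the first day of the month. Neither assumption is confirmed by this page, and `month_starts` is a hypothetical helper, not part of yohou.

```python
from datetime import date


def month_starts(start: date, end: date) -> list[date]:
    """List the first day of every month from start to end, inclusive."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append(date(year, month, 1))
        # Advance one month, rolling over into the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months


# Assumed index: one month-start timestamp per month of the documented range.
index = month_starts(date(2000, 1, 1), date(2006, 12, 1))
print(len(index))  # 7 years * 12 months per year
```

Under these assumptions the frame would hold 84 rows; compare against `bunch.frame.height` on the real data before relying on that number.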

Parameters

n_series : int or None, default=None
    Maximum number of series to include. None loads all 767 series. A smaller value reduces memory usage and speeds up parsing.

data_home : str, PathLike, or None, default=None
    Specify another download and cache folder for the datasets. By default all yohou data is stored in ~/yohou_data/.

download_if_missing : bool, default=True
    If False, raise an OSError if the data is not locally available instead of trying to download it.

n_retries : int, default=3
    Number of retries when HTTP errors are encountered.

delay : float, default=1.0
    Number of seconds between retries.
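To illustrate how `n_retries` and `delay` could interact, here is a minimal retry loop. It is a sketch under stated assumptions, not yohou's actual implementation: `call_with_retries` and `flaky_download` are hypothetical names, and the sketch assumes that the first attempt does not count as a retry (so up to `n_retries + 1` calls are made) and that HTTP failures surface as `OSError` subclasses, as `urllib.error.HTTPError` does.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    *,
    n_retries: int = 3,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying up to n_retries times after an OSError."""
    attempts_left = n_retries
    while True:
        try:
            return func()
        except OSError:
            if attempts_left == 0:
                raise  # Retries exhausted: propagate the last error.
            attempts_left -= 1
            sleep(delay)  # Fixed pause between attempts, no backoff.


# Simulated download that fails twice before succeeding.
calls = {"count": 0}


def flaky_download() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise OSError("simulated HTTP error")
    return "payload"


# Record the waits instead of sleeping so the demo runs instantly.
waits: list[float] = []
result = call_with_retries(flaky_download, n_retries=3, delay=1.0, sleep=waits.append)
```

Injecting `sleep` keeps the example fast to run; in real use the default `time.sleep` would pause for `delay` seconds between attempts.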

Returns

Bunch
    Dictionary-like object with the following attributes:

    frame : pl.DataFrame
        DataFrame with "time" (Datetime) and up to 767 series columns using the __ separator convention (e.g. "T1__patients").
    feature_names : list of str
        Non-time column names.
    DESCR : str
        Full description of the dataset.
    frequency : str
        "1mo".
    n_series : int
        Number of series actually loaded.
    filename : str
        Path to the cached parquet file.
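The `__` separator convention can be unpacked with plain string handling. The sketch below works on a hand-written list of column names rather than a real frame; `"T2__patients"` is an assumed name that follows the documented `"T1__patients"` pattern, and `split_column` is a hypothetical helper, not a yohou function.

```python
def split_column(name: str, sep: str = "__") -> tuple[str, str]:
    """Split 'T1__patients' into ('T1', 'patients') at the first separator."""
    series_id, _, value_name = name.partition(sep)
    return series_id, value_name


# Stand-in for bunch.frame.columns; only the first two names appear in the docs.
columns = ["time", "T1__patients", "T2__patients"]

# Mirrors the documented meaning of feature_names: every non-time column.
feature_names = [c for c in columns if c != "time"]

# Recover the series identifiers from the prefixed column names.
series_ids = [split_column(c)[0] for c in feature_names]
print(series_ids)
```

With a real `Bunch`, the same comprehension could run over `bunch.feature_names` directly.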

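See Also

fetch_tourism_monthly : Monthly tourism series.
fetch_dominick : Weekly retail profit series.
get_data_home : Return the path of the data directory.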

References

[1] Godahewa, R., Bergmeir, C., Webb, G. I., Hyndman, R. J., & Montero-Manso, P. (2021). "Monash Time Series Forecasting Archive." Neural Information Processing Systems Track on Datasets and Benchmarks. https://doi.org/10.5281/zenodo.4656014

Examples

>>> from yohou.datasets import fetch_hospital
>>> bunch = fetch_hospital()
>>> bunch.frame.columns[:2]
['time', 'T1__patients']
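In offline settings such as CI, it can be useful to read only from the local cache and handle a missing dataset gracefully. The sketch below relies on the documented behavior that `download_if_missing=False` raises an `OSError` when the data is not available locally. `load_cached_or_none` is a hypothetical wrapper, and `fake_fetch` is a stand-in so the example runs without yohou installed or network access.

```python
from typing import Any, Callable, Optional


def load_cached_or_none(fetch: Callable[..., Any], **kwargs: Any) -> Optional[Any]:
    """Return the cached dataset, or None if it has not been downloaded yet."""
    try:
        # Never touch the network: rely on the local cache only.
        return fetch(download_if_missing=False, **kwargs)
    except OSError:
        return None


def fake_fetch(
    *,
    download_if_missing: bool = True,
    n_series: Optional[int] = None,
    cached: bool = False,
) -> dict:
    """Stand-in fetcher; `cached` simulates whether a local copy exists."""
    if not cached and not download_if_missing:
        raise OSError("data not found locally")
    return {"n_series": n_series}


missing = load_cached_or_none(fake_fetch, n_series=10)
present = load_cached_or_none(fake_fetch, n_series=10, cached=True)
```

With the real library, the call would be `load_cached_or_none(fetch_hospital, n_series=10)`; a `None` result then signals that the dataset still needs a one-time download.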

Source Code

def fetch_hospital(
    *,
    n_series: int | None = None,
    data_home: str | os.PathLike | None = None,
    download_if_missing: bool = True,
    n_retries: int = 3,
    delay: float = 1.0,
) -> Bunch:
    """Fetch the Hospital dataset from Monash/Zenodo.

    767 monthly time series representing patient counts related to
    medical products from January 2000 to December 2006.

    Parameters
    ----------
    n_series : int or None, default=None
        Maximum number of series to include.  ``None`` loads all 767
        series.  A smaller value reduces memory usage and speeds up
        parsing.
    data_home : str, PathLike, or None, default=None
        Specify another download and cache folder for the datasets.
        By default all yohou data is stored in ``~/yohou_data/``.
    download_if_missing : bool, default=True
        If ``False``, raise an ``OSError`` if the data is not locally
        available instead of trying to download it.
    n_retries : int, default=3
        Number of retries when HTTP errors are encountered.
    delay : float, default=1.0
        Number of seconds between retries.

    Returns
    -------
    Bunch
        Dictionary-like object with the following attributes:

        frame : pl.DataFrame
            DataFrame with ``"time"`` (Datetime) and up to 767 series
            columns using the ``__`` separator convention
            (e.g. ``"T1__patients"``).
        feature_names : list of str
            Non-time column names.
        DESCR : str
            Full description of the dataset.
        frequency : str
            ``"1mo"``.
        n_series : int
            Number of series actually loaded.
        filename : str
            Path to the cached parquet file.

    See Also
    --------
    - [`fetch_tourism_monthly`][yohou.datasets._fetchers.fetch_tourism_monthly] : Monthly tourism series.
    - [`fetch_dominick`][yohou.datasets._fetchers.fetch_dominick] : Weekly retail profit series.
    - [`get_data_home`][yohou.datasets._fetchers.get_data_home] : Return the path of the data directory.

    References
    ----------
    [1] Godahewa, R., Bergmeir, C., Webb, G. I., Hyndman, R. J., &
        Montero-Manso, P. (2021). "Monash Time Series Forecasting Archive."
        Neural Information Processing Systems Track on Datasets and
        Benchmarks. https://doi.org/10.5281/zenodo.4656014

    Examples
    --------
    >>> from yohou.datasets import fetch_hospital
    >>> bunch = fetch_hospital()  # doctest: +SKIP
    >>> bunch.frame.columns[:2]  # doctest: +SKIP
    ['time', 'T1__patients']

    """
    return _fetch_dataset(
        metadata=HOSPITAL,
        dataset_name="hospital",
        value_column_name="patients",
        n_series=n_series,
        data_home=data_home,
        download_if_missing=download_if_missing,
        n_retries=n_retries,
        delay=delay,
    )

Tutorials

The following example notebooks use this component:

  • How to Use Lagged Forecasts as Features


    Forecasting-Models

    Compare ForecastedFeatureForecaster strategies (actual, predicted, rewind) and split ratio tuning for chaining feature and target forecasters.
