yohou.datasets

Remote time series dataset fetchers and related utilities.

User guide: See the Core Concepts section for data format details.

Loaders

Each loader downloads its data from Monash/Zenodo (CC BY 4.0) and returns a sklearn.utils.Bunch whose .frame attribute holds a polars.DataFrame with a "time" column. The data is cached locally after the first download.

Name                              Description
fetch_dominick                    Fetch the Dominick dataset from Monash/Zenodo.
fetch_electricity_demand          Fetch the Australian Electricity Demand dataset from Monash/Zenodo.
fetch_hospital                    Fetch the Hospital dataset from Monash/Zenodo.
fetch_kdd_cup                     Fetch the KDD Cup 2018 air quality dataset from Monash/Zenodo.
fetch_pedestrian_counts           Fetch the Melbourne Pedestrian Counts dataset from Monash/Zenodo.
fetch_sunspot                     Fetch the Sunspot dataset (without missing values) from Monash/Zenodo.
fetch_tourism_monthly             Fetch the Tourism Monthly dataset from Monash/Zenodo.
fetch_tourism_quarterly           Fetch the Tourism Quarterly dataset from Monash/Zenodo.
fetch_air_quality_classification  Fetch a categorical air quality dataset derived from KDD Cup 2018.
fetch_demand_classification       Fetch a categorical electricity demand dataset from Monash/Zenodo.
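
The download-once behaviour described above can be illustrated with a minimal, standard-library-only sketch. It is not yohou's implementation: cached_fetch, fake_download, the file name "sunspot.csv", and the directory name "yohou_data" are all hypothetical names chosen for illustration, and the real loaders may organize their cache differently.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def cached_fetch(name: str, data_home: Path, download: Callable[[], bytes]) -> Path:
    """Return the local path for `name`, downloading only if it is not cached."""
    data_home.mkdir(parents=True, exist_ok=True)
    target = data_home / name
    if not target.exists():
        # First call: fetch the payload and store it in the cache directory.
        target.write_bytes(download())
    return target


calls: List[int] = []


def fake_download() -> bytes:
    # Stand-in for a real HTTP request (hypothetical); records each invocation.
    calls.append(1)
    return b"time,value\n2020-01-01,1.0\n"


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "yohou_data"  # hypothetical cache location
    first = cached_fetch("sunspot.csv", home, fake_download)
    second = cached_fetch("sunspot.csv", home, fake_download)
    download_count = len(calls)  # the second call is served from the cache
    same_path = first == second
```

Passing the downloader in as a callable keeps the caching logic separate from the network code, which makes the pattern easy to exercise without a connection.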

Utilities

Name             Description
clear_data_home  Delete all the content of the data home cache.
get_data_home    Return the path of the yohou data directory.
parse_tsf        Parse a Monash .tsf file into a wide polars DataFrame.
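
To give a feel for what parsing a Monash .tsf file involves, here is a simplified sketch that uses only the standard library. It is not yohou's parse_tsf: the function name parse_tsf_text is hypothetical, it returns plain dictionaries rather than a wide polars DataFrame, and it assumes that the first colon-separated field of each data row is the series name, that the last field holds the comma-separated values, and that "?" marks a missing value.

```python
from typing import Dict, List, Optional, Tuple


def parse_tsf_text(text: str) -> Tuple[Dict[str, str], Dict[str, List[Optional[float]]]]:
    """Parse simplified .tsf content into (metadata, {series_name: values})."""
    metadata: Dict[str, str] = {}
    attributes: List[str] = []
    series: Dict[str, List[Optional[float]]] = {}
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if not in_data:
            if line.lower() == "@data":
                in_data = True
            elif line.startswith("@attribute"):
                attributes.append(line.split()[1])  # keep the attribute name
            elif line.startswith("@"):
                key, _, value = line[1:].partition(" ")
                metadata[key] = value.strip()
            continue
        # Data row: attribute fields first, values in the last field.
        fields = line.split(":")
        name = fields[0]  # assumption: the first attribute is the series name
        series[name] = [None if v == "?" else float(v) for v in fields[-1].split(",")]
    return metadata, series


sample = """\
# Toy file for illustration only
@relation toy
@attribute series_name string
@attribute start_timestamp date
@frequency monthly
@missing true
@data
T1:2020-01-01 00-00-00:1.0,2.5,?
T2:2020-01-01 00-00-00:4,5
"""

metadata, series = parse_tsf_text(sample)
```

In this toy input the two series have different lengths, which is one reason a wide, time-indexed layout needs explicit handling of missing values.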

Synthetic generators

Parameterized generators that create synthetic time series with all three exogenous feature types (X_actual, X_future, X_forecast). No download required.

Name                           Description
make_exogenous_regression      Generate synthetic regression data with exogenous features.
make_exogenous_classification  Generate synthetic classification data with exogenous features.