fetch_kdd_cup
yohou.datasets._fetchers.fetch_kdd_cup(*, n_groups=5, data_home=None, download_if_missing=True, n_retries=3, delay=1.0)
Fetch the KDD Cup 2018 air quality dataset from Monash/Zenodo.
Hourly time series of air quality measurements (PM2.5, PM10, NO2, CO, O3, SO2) from 59 monitoring stations in Beijing and London. This is a multivariate panel dataset: each station (panel group) contains multiple measurement columns.
Column names use yohou's __ separator convention, with the station as the group prefix and the measurement as the member suffix, e.g. "beijing_dongsi_aq__pm2.5".
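The naming convention above can be illustrated with a small, standalone sketch that splits column names into their station (group) and measurement (member) parts. The helpers `split_column` and `columns_by_group` are hypothetical names introduced for illustration and are not part of yohou; the sketch also assumes that station names never contain the `__` separator themselves.

```python
def split_column(name: str, sep: str = "__") -> tuple[str, str]:
    """Split a 'group__member' column name into (group, member).

    Hypothetical helper, not part of yohou. Splits on the first
    occurrence of the separator.
    """
    group, found, member = name.partition(sep)
    if not found:
        raise ValueError(f"{name!r} does not contain the separator {sep!r}")
    return group, member


def columns_by_group(columns: list[str], sep: str = "__") -> dict[str, list[str]]:
    """Map each station (group) to the measurements (members) it provides."""
    groups: dict[str, list[str]] = {}
    for col in columns:
        if sep not in col:
            continue  # skip non-panel columns such as 'time'
        group, member = split_column(col, sep)
        groups.setdefault(group, []).append(member)
    return groups


# Column names in the style shown on this page (a hand-written sample,
# not the output of an actual download).
columns = [
    "time",
    "beijing_aotizhongxin_aq__pm2.5",
    "beijing_aotizhongxin_aq__pm10",
    "beijing_dongsi_aq__pm2.5",
]
print(columns_by_group(columns))
# {'beijing_aotizhongxin_aq': ['pm2.5', 'pm10'], 'beijing_dongsi_aq': ['pm2.5']}
```

Grouping by prefix in this way is one simple route to per-station analysis when only the flat column list is available.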
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| n_groups | int or None | Maximum number of station groups to include. Each station has 6 measurement series (PM2.5, PM10, NO2, CO, O3, SO2), so … | 5 |
| data_home | str, PathLike, or None | Specify another download and cache folder for the datasets. By default all yohou data is stored in … | None |
| download_if_missing | bool | If … | True |
| n_retries | int | Number of retries when HTTP errors are encountered. | 3 |
| delay | float | Number of seconds between retries. | 1.0 |
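To make the interplay of n_retries and delay concrete, here is a minimal sketch of one plausible retry loop. It is not yohou's actual implementation: the helper `call_with_retries` is hypothetical, and it assumes that n_retries counts additional attempts after the first failure and that delay is a fixed pause between attempts. The sketch retries on OSError, of which `urllib.error.HTTPError` is a subclass.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    *,
    n_retries: int = 3,
    delay: float = 1.0,
    retry_on: tuple[type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Call func(); on a retryable error, wait `delay` seconds and try again.

    Hypothetical helper: assumes `n_retries` extra attempts after the
    first one, so func() runs at most n_retries + 1 times.
    """
    for attempt in range(n_retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == n_retries:
                raise  # retries exhausted, surface the last error
            sleep(delay)
    raise AssertionError("unreachable")


# Simulated download that fails twice before succeeding.
attempts = []


def flaky_download() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("simulated transient failure")
    return "ok"


waits: list[float] = []  # record pauses instead of actually sleeping
result = call_with_retries(flaky_download, n_retries=3, delay=1.0, sleep=waits.append)
print(result, len(attempts), waits)
# ok 3 [1.0, 1.0]
```

Injecting the sleep function keeps the example fast and makes the pauses observable; a real fetcher would simply use the default `time.sleep`.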
Returns
| Type | Description |
|---|---|
| Bunch | Dictionary-like object with the following attributes: frame (pl.DataFrame): DataFrame with … |
See Also
- fetch_electricity_demand: Half-hourly electricity demand series.
- fetch_pedestrian_counts: Hourly pedestrian sensor series.
- get_data_home: Return the path of the data directory.
References
[1] Godahewa, R., Bergmeir, C., Webb, G. I., Hyndman, R. J., & Montero-Manso, P. (2021). "Monash Time Series Forecasting Archive." Neural Information Processing Systems Track on Datasets and Benchmarks. https://doi.org/10.5281/zenodo.4656756
Examples
>>> from yohou.datasets import fetch_kdd_cup
>>> bunch = fetch_kdd_cup()
>>> bunch.frame.columns[:3]
['time', 'beijing_aotizhongxin_aq__pm2.5', 'beijing_aotizhongxin_aq__pm10']
Tutorials
The following example notebooks use this component:
- How to Aggregate Scorer Results (Evaluation-Search): Demonstrates all scorer aggregation strategies (stepwise, vintagewise, componentwise, groupwise, coveragewise, all) on panel data with weighted group aggregation.
- How to Forecast Panel Prediction Intervals (Panel-Data): Combines conformal and quantile regression intervals on panel data with per-group coverage analysis, calibration plots, and groupwise interval scoring.