train_test_split
yohou.model_selection.split.train_test_split(*arrays, test_size, X_forecast=None)
Split time series data into temporal train and test sets.
A time series counterpart to sklearn.model_selection.train_test_split.
Data is always split in temporal order (no shuffling): the earliest rows
form the training set and the most recent rows form the test set.
Row-indexed arrays (y, X_actual) are split by position.
X_forecast, when provided, is split by vintage_time range using
the cutoff time inferred from the first positional array.
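The two splitting rules above can be sketched in plain Python. This is an illustration only, not yohou's implementation: rows are plain dicts rather than Polars DataFrames, the helper name `temporal_split` is hypothetical, and the boundary convention (the cutoff is the first test timestamp, and forecasts whose vintage_time falls before it go to the training side) is an assumption.

```python
from datetime import date, timedelta


def temporal_split(rows, test_size, forecasts=None):
    """Hypothetical sketch: positional split plus a vintage_time cutoff split."""
    n_train = len(rows) - test_size
    # Earliest rows form the training set, most recent rows the test set.
    train, test = rows[:n_train], rows[n_train:]
    if forecasts is None:
        return train, test
    # ASSUMPTION: the cutoff is the first timestamp of the test set.
    cutoff = test[0]["time"]
    f_train = [f for f in forecasts if f["vintage_time"] < cutoff]
    f_test = [f for f in forecasts if f["vintage_time"] >= cutoff]
    return train, test, f_train, f_test


start = date(2020, 1, 1)
y = [{"time": start + timedelta(days=i), "value": i} for i in range(10)]
X_forecast = [
    {"vintage_time": start + timedelta(days=i), "forecast": 10 * i}
    for i in range(10)
]

y_train, y_test, f_train, f_test = temporal_split(y, 2, X_forecast)
print(len(y_train), len(y_test), len(f_train), len(f_test))  # 8 2 8 2
```

Splitting the forecasts by time range rather than by position matters because X_forecast need not have one row per row of y.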
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| *arrays | DataFrame | One or more Polars DataFrames to split by row index. All must have the same number of rows. The first array must contain a … | () |
| test_size | int or float | If … | required |
| X_forecast | DataFrame or None | External forecasts with … | None |
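Returns
| Type | Description |
|---|---|
| list of pl.DataFrame | Alternating train/test pairs for each positional array, followed by the X_forecast train/test pair if X_forecast is provided. With one array: its train/test pair. With two arrays: the first array's train/test pair, then the second's. With X_forecast: its train/test pair comes last. |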
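To make the return order concrete, here is a small sketch that mimics the alternating layout with plain lists. The helper `split_all` is hypothetical and stands in for the real function only to show how the results unpack.

```python
def split_all(*arrays, test_size):
    """Hypothetical stand-in returning [a_train, a_test, b_train, b_test, ...]."""
    out = []
    for arr in arrays:
        cut = len(arr) - test_size
        # Each input contributes its train part, then its test part.
        out.extend([arr[:cut], arr[cut:]])
    return out


y = list(range(10))
X_actual = [v * 0.5 for v in range(10)]

y_train, y_test, X_train, X_test = split_all(y, X_actual, test_size=2)
print(y_test, X_test)  # [8, 9] [4.0, 4.5]
```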
Raises
| Type | Description |
|---|---|
| ValueError | If no arrays are provided, arrays have different lengths, … |
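The two error conditions that are legible above can be illustrated with a short validation sketch. The helper `check_arrays` and its error messages are hypothetical; the real function may check more conditions and word its errors differently.

```python
def check_arrays(*arrays):
    """Hypothetical validation mirroring the two documented conditions."""
    if not arrays:
        raise ValueError("at least one array is required")
    lengths = {len(a) for a in arrays}
    if len(lengths) > 1:
        raise ValueError(f"arrays have different lengths: {sorted(lengths)}")


# Both calls below are invalid and raise ValueError.
for bad_args in [(), ([1, 2, 3], [1, 2])]:
    try:
        check_arrays(*bad_args)
    except ValueError as exc:
        print("ValueError:", exc)
```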
Examples
Split y with an integer test_size (8 training rows, 2 test rows):
>>> import polars as pl
>>> from yohou.model_selection.split import train_test_split
>>> y = pl.DataFrame({
... "time": pl.date_range(pl.date(2020, 1, 1), pl.date(2020, 1, 10), eager=True),
... "value": list(range(10)),
... })
>>> y_train, y_test = train_test_split(y, test_size=2)
>>> len(y_train), len(y_test)
(8, 2)
Split with a fractional test_size:
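A fractional test_size has to be converted to a row count before the positional split. The sketch below assumes the rounding used by scikit-learn's train_test_split, where the test set gets ceil(test_size * n_rows) rows. Whether yohou rounds the same way is not stated here, and `resolve_test_size` is a hypothetical helper.

```python
import math


def resolve_test_size(test_size, n_rows):
    """Hypothetical helper: turn an int or float test_size into a row count."""
    if isinstance(test_size, float):
        if not 0.0 < test_size < 1.0:
            raise ValueError("fractional test_size must be in (0, 1)")
        # ASSUMPTION: scikit-learn-style rounding (round the test set up).
        return math.ceil(test_size * n_rows)
    return test_size


rows = list(range(10))
n_test = resolve_test_size(0.2, len(rows))
train, test = rows[:-n_test], rows[-n_test:]
print(n_test, train[-1], test)  # 2 7 [8, 9]
```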
Tutorials
The following example notebooks use this component:
- How to Tune Fourier Seasonality Terms (Data-Features): Explore how Fourier harmonic count affects seasonal fit quality, compare Fourier vs Pattern seasonality, and tune harmonics jointly with GridSearchCV.
- How to Aggregate Scorer Results (Evaluation-Search): Demonstrate all scorer aggregation strategies (stepwise, vintagewise, componentwise, groupwise, coveragewise, all) on panel data with weighted group aggregation.
- How to Use Lagged Forecasts as Features (Forecasting-Models): Compare ForecastedFeatureForecaster strategies (actual, predicted, rewind) and split ratio tuning for chaining feature and target forecasters.
- How to Configure LocalPanelForecaster (Panel-Data): Wrap any forecaster with LocalPanelForecaster for fully independent per-group clones, parallel fitting via n_jobs, and selective group operations.
- How to Forecast Panel Prediction Intervals (Panel-Data): Combine conformal and quantile regression intervals on panel data with per-group coverage analysis, calibration plots, and groupwise interval scoring.
- How to Apply Stationarity to Panel Data (Panel-Data): Apply per-group stationarity transforms on panel data with SeasonalDifferencing, DecompositionPipeline (polynomial trend + pattern seasonality), and residuals.