macrosynergy.learning.splitters.kfold_splitters
Panel K-Fold cross-validator classes.
- class ExpandingKFoldPanelSplit(n_splits=5, min_n_splits=2)
Bases: KFoldPanelSplit
Time-respecting K-Fold cross-validator for panel data.
- Parameters:
n_splits (int) – Number of folds, i.e. (training set, test set) pairs. Default is 5. Must be at least 2.
Notes
This splitter is a panel-data analogue of the TimeSeriesSplit splitter provided by scikit-learn.
Unique dates in the panel are divided into ‘n_splits + 1’ sequential and non-overlapping intervals, resulting in ‘n_splits’ pairs of training and test sets. The ‘i’th training set is the union of the first ‘i’ intervals, and the ‘i’th test set is the ‘i+1’th interval, so every training set ends before its test set begins.
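The interval logic in the notes above can be sketched in plain Python. This is an illustration, not the library's implementation: the helper name expanding_panel_splits, the representation of the panel index as (cross_section, date) pairs, and the rule that leftover dates go to the earliest intervals are all assumptions made for the example.

```python
def expanding_panel_splits(index, n_splits=5):
    """Illustrative only: yield (train_rows, test_rows) for an expanding split.

    `index` is a list of (cross_section, date) pairs, one per panel row.
    """
    dates = sorted({d for _, d in index})
    n_intervals = n_splits + 1
    if len(dates) < n_intervals:
        raise ValueError("need at least n_splits + 1 unique dates")

    # Cut the sorted unique dates into n_splits + 1 contiguous intervals.
    # Assumption: when the dates do not divide evenly, the earliest
    # intervals each receive one extra date.
    base, extra = divmod(len(dates), n_intervals)
    bounds = [0]
    for k in range(n_intervals):
        bounds.append(bounds[-1] + base + (1 if k < extra else 0))

    for i in range(1, n_intervals):
        train_dates = set(dates[: bounds[i]])               # first i intervals
        test_dates = set(dates[bounds[i] : bounds[i + 1]])  # interval i + 1
        train_rows = [r for r, (_, d) in enumerate(index) if d in train_dates]
        test_rows = [r for r, (_, d) in enumerate(index) if d in test_dates]
        yield train_rows, test_rows


# Two cross-sections observed on eight dates (integers stand in for timestamps).
panel_index = [(cid, d) for cid in ("AUD", "CAD") for d in range(8)]
for train_rows, test_rows in expanding_panel_splits(panel_index, n_splits=3):
    print(len(train_rows), len(test_rows))
```

Splitting on unique dates rather than on rows keeps every cross-section's observation for a given date on the same side of the split.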
- class RollingKFoldPanelSplit(n_splits=5, min_n_splits=2)
Bases: KFoldPanelSplit
Unshuffled K-Fold cross-validator for panel data.
- Parameters:
n_splits (int) – Number of folds. Default is 5. Must be at least 2.
Notes
This splitter is a panel-data analogue of the KFold splitter provided by scikit-learn with shuffle=False, with splits determined along the time dimension.
Unique dates in the panel are divided into ‘n_splits’ sequential and non-overlapping intervals of equal size, resulting in ‘n_splits’ pairs of training and test sets. The ‘i’th test set is the ‘i’th interval, and the ‘i’th training set is all other intervals, including those that come after the test interval.
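As a rough sketch of the scheme described above, the following plain-Python helper holds out each date interval in turn. It is not the library's code: the name rolling_panel_splits, the (cross_section, date) index representation, and the handling of dates that do not divide evenly into n_splits intervals are assumptions for illustration.

```python
def rolling_panel_splits(index, n_splits=5):
    """Illustrative only: yield (train_rows, test_rows) for an unshuffled K-Fold.

    `index` is a list of (cross_section, date) pairs, one per panel row.
    """
    dates = sorted({d for _, d in index})
    if len(dates) < n_splits:
        raise ValueError("need at least n_splits unique dates")

    # Assumption: leftover dates are spread over the earliest intervals.
    base, extra = divmod(len(dates), n_splits)
    start = 0
    for k in range(n_splits):
        stop = start + base + (1 if k < extra else 0)
        test_dates = set(dates[start:stop])  # the k-th interval is held out
        test_rows = [r for r, (_, d) in enumerate(index) if d in test_dates]
        # Every other interval, earlier or later, goes into training.
        train_rows = [r for r, (_, d) in enumerate(index) if d not in test_dates]
        yield train_rows, test_rows
        start = stop


# Two cross-sections observed on eight dates (integers stand in for timestamps).
panel_index = [(cid, d) for cid in ("AUD", "CAD") for d in range(8)]
for train_rows, test_rows in rolling_panel_splits(panel_index, n_splits=4):
    print(train_rows, test_rows)
```

In this sketch each panel row appears in exactly one test set, and for every fold except the last the training rows include dates later than the test interval.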
- class RecencyKFoldPanelSplit(n_splits=5, n_periods=252)
Bases: KFoldPanelSplit
Time-respecting K-Fold panel cross-validator that creates training and test sets based on the most recent samples in the panel.
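- Parameters:
n_splits (int) – Number of folds, i.e. (training set, test set) pairs. Default is 5.
n_periods (int) – Number of timestamps in each test set. Default is 252.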
Notes
This splitter is similar to ExpandingKFoldPanelSplit, except that the sorted unique timestamps are not divided into equal intervals. Instead, the last n_periods * n_splits timestamps in the panel are divided into n_splits non-overlapping intervals, each of which is used as a test set. The corresponding training set consists of all samples with timestamps earlier than its test set. Consequently, this is a K-Fold walk-forward cross-validator, but with test folds concentrated on the most recent information.
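The walk-forward scheme described above can be sketched as follows. This is a plain-Python illustration rather than the library's implementation: the helper name recency_panel_splits, the (cross_section, date) index representation, the assumption that each test interval holds exactly n_periods timestamps, and the error raised when no training dates remain are all choices made for the example.

```python
def recency_panel_splits(index, n_splits=5, n_periods=252):
    """Illustrative only: yield (train_rows, test_rows) with recent test folds.

    `index` is a list of (cross_section, date) pairs, one per panel row.
    """
    dates = sorted({d for _, d in index})
    n_test = n_splits * n_periods
    # Assumption: require at least one date before the first test interval.
    if n_test >= len(dates):
        raise ValueError("not enough unique dates for the requested test folds")

    first_test = len(dates) - n_test  # position of the first test date
    for k in range(n_splits):
        start = first_test + k * n_periods
        test_dates = set(dates[start : start + n_periods])
        cutoff = dates[start]  # training uses strictly earlier dates only
        train_rows = [r for r, (_, d) in enumerate(index) if d < cutoff]
        test_rows = [r for r, (_, d) in enumerate(index) if d in test_dates]
        yield train_rows, test_rows


# Two cross-sections observed on ten dates (integers stand in for timestamps).
panel_index = [(cid, d) for cid in ("AUD", "CAD") for d in range(10)]
for train_rows, test_rows in recency_panel_splits(panel_index, n_splits=2, n_periods=3):
    print(train_rows, test_rows)
```

With ten dates, two folds, and three periods per fold, the last six dates form the two test intervals and the first four dates are always in training.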