macrosynergy.learning.cv_tools

A set of tools for cross-validation of panel data.

panel_cv_scores(X, y, splitter, estimators, scoring, show_longbias=True, show_std=False, verbose=1, n_jobs=-1)

Returns a dataframe of cross-validation scores for a collection of models, evaluated over the folds produced by a panel cross-validation splitter and measured with a set of scorers.

Parameters:
  • X (pd.DataFrame) – Input feature matrix.

  • y (pd.DataFrame or pd.Series) – Target variable.

  • splitter (BasePanelSplit) – Panel cross-validation splitter.

  • estimators (dict) – Dictionary of models.

  • scoring (dict) – Dictionary of scorers.

  • show_longbias (bool, optional, default=True) – Whether to show the proportion of times a model predicts a positive return.

  • show_std (bool, optional, default=False) – Whether to show the standard deviation of the cross-validation scores over folds.

  • verbose (int, optional, default=1) – Verbosity level.

  • n_jobs (int, optional, default=-1) – Number of jobs to run in parallel.
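A minimal sketch of how the two dictionary arguments might be assembled, assuming that estimators maps display names to scikit-learn-compatible models and that scoring maps display names to scorers built with scikit-learn's make_scorer. The names "OLS", "Ridge", "R2" and "SIGN_ACC", and the helper sign_accuracy, are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import make_scorer, r2_score


def sign_accuracy(y_true, y_pred):
    """Hypothetical metric: share of predictions whose sign matches the target."""
    return float(np.mean(np.sign(y_true) == np.sign(y_pred)))


# Assumed layout: display name -> unfitted scikit-learn-compatible model.
estimators = {
    "OLS": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
}

# Assumed layout: display name -> scorer callable with signature (estimator, X, y).
scoring = {
    "R2": make_scorer(r2_score),
    "SIGN_ACC": make_scorer(sign_accuracy),
}
```

Under these assumptions, the dictionary keys would be the labels that appear in the returned dataframe: estimator names as columns and scorer names as metrics.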

Returns:

Dataframe of cross-validation scores.

Return type:

pd.DataFrame

Notes

The returned dataframe has a two-level row index. The outer level is the metric, and the inner level is the statistic computed over validation splits: the mean and, if show_std is True, the standard deviation. Each column corresponds to one estimator.
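The layout described in this note can be mimicked with plain pandas to show how such a result might be queried. This is only a sketch: the level names "metric" and "statistic", the labels "mean" and "std", the metric and estimator names, and all numbers are placeholder assumptions, not output of panel_cv_scores.

```python
import pandas as pd

# Hypothetical metric and statistic labels; the real labels may differ.
metrics = ["accuracy", "sharpe"]
statistics = ["mean", "std"]
index = pd.MultiIndex.from_product(
    [metrics, statistics], names=["metric", "statistic"]
)

# Placeholder values only, arranged as (metric, statistic) rows x estimator columns.
scores = pd.DataFrame(
    {
        "OLS": [0.52, 0.03, 0.41, 0.20],
        "Ridge": [0.54, 0.02, 0.47, 0.18],
    },
    index=index,
)

# Keep only the mean row of each metric (drops the inner index level).
mean_scores = scores.xs("mean", level="statistic")

# For each metric, the estimator with the highest mean score.
best_by_metric = mean_scores.idxmax(axis=1)
```

Selecting on the inner level with xs keeps one row per metric, which makes it straightforward to compare estimators column by column.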