macrosynergy.learning.model_evaluation.metrics.metrics

Scikit-learn compatible performance metrics for model evaluation.

create_panel_metric(y_true, y_pred, sklearn_metric, type='panel')

Evaluation with a scikit-learn metric, respecting the panel structure.

Parameters:
  • y_true (pd.Series of shape (n_samples,)) – True regression labels.

  • y_pred (array-like of shape (n_samples,)) – Predicted regression labels.

  • sklearn_metric (callable) – A scikit-learn metric function. This function must accept two arguments, y_true and y_pred, which have the same meaning as the arguments passed to this function.

  • type (str, default="panel") – The panel dimension over which to compute the metric. Options are “panel”, “cross_section” and “time_periods”.

Returns:

metric – The computed metric.

Return type:

float

Notes

This function wraps a scikit-learn metric so that it can be evaluated over different panel axes. For instance, the \(R^2\) metric can be evaluated over the whole panel, across cross-sections or across time periods. This avoids re-implementing every scikit-learn metric for panel data: any metric that accepts y_true and y_pred can be passed to create_panel_metric.
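The wrapper idea can be sketched as follows. This is a minimal illustration, not the library source: `panel_metric_sketch` and `rmse` are hypothetical names, `rmse` stands in for any scikit-learn metric with a `(y_true, y_pred)` signature, and the index layout (level 0 for cross-sections, level 1 for dates, named `cid` and `real_date`) is an assumption. Averaging per-group scores for the non-panel options is also an assumption, borrowed from how the other metrics on this page describe those options.

```python
import numpy as np
import pandas as pd


def rmse(y_true, y_pred):
    """Stand-in metric with the (y_true, y_pred) signature of scikit-learn metrics."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))


def panel_metric_sketch(y_true, y_pred, metric, type="panel"):
    """Hypothetical re-creation of the wrapper idea; not the library source."""
    if type not in ("panel", "cross_section", "time_periods"):
        raise ValueError(f"Unknown type: {type!r}")
    frame = pd.DataFrame(
        {"true": y_true.values, "pred": np.asarray(y_pred)}, index=y_true.index
    )
    if type == "panel":
        # Pool all samples, ignoring the panel structure.
        return float(metric(frame["true"].values, frame["pred"].values))
    # Assumption: index level 0 holds cross-sections, level 1 holds dates.
    level = 0 if type == "cross_section" else 1
    # Assumption: per-group scores are combined with a simple mean.
    scores = [
        metric(group["true"].values, group["pred"].values)
        for _, group in frame.groupby(level=level)
    ]
    return float(np.mean(scores))


index = pd.MultiIndex.from_product(
    [["AUD", "CAD"], pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"])],
    names=["cid", "real_date"],
)
y_true = pd.Series([1.0, -2.0, 3.0, -1.0, 2.0, -3.0], index=index)
y_pred = np.array([0.5, -1.0, 1.0, 1.0, 2.0, -1.0])

panel_score = panel_metric_sketch(y_true, y_pred, rmse, type="panel")
cross_section_score = panel_metric_sketch(y_true, y_pred, rmse, type="cross_section")
time_period_score = panel_metric_sketch(y_true, y_pred, rmse, type="time_periods")
```

Because the root mean squared error is not linear in the samples, the three scores differ even on this balanced toy panel.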

regression_accuracy(y_true, y_pred, type='panel')

Accuracy of signs between regression labels and predictions.

Parameters:
  • y_true (pd.Series of shape (n_samples,)) – True regression labels.

  • y_pred (array-like of shape (n_samples,)) – Predicted regression labels.

  • type (str, default="panel") – The panel dimension over which to compute the accuracy. Options are “panel”, “cross_section” and “time_periods”.

Returns:

accuracy – The accuracy between signs of prediction-target pairs.

Return type:

float

Notes

Accuracy can be calculated over the whole panel, considering all samples irrespective of cross-section or time period. It can be beneficial, however, to estimate the expected accuracy for a cross-section or time period instead.

When type = “cross_section”, the returned accuracy is the mean accuracy across cross-sections, an empirical estimate of the expected accuracy for a cross-section of interest.

When type = “time_periods”, the returned accuracy is the mean accuracy across time periods, an empirical estimate of the expected accuracy for a time period of interest.
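The three options can be illustrated on a small unbalanced panel. This is a sketch under stated assumptions rather than the library implementation: the index level names `cid` and `real_date` are assumed, and a sign is treated as "negative" when the value is strictly below zero, so zeros count as non-negative.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"])
# AUD has four observations, CAD only two, so group means differ from the pooled mean.
index = pd.MultiIndex.from_tuples(
    [("AUD", d) for d in dates] + [("CAD", d) for d in dates[:2]],
    names=["cid", "real_date"],
)
y_true = pd.Series([1.0, -1.0, 2.0, -2.0, 1.0, -1.0], index=index)
y_pred = np.array([0.5, -0.2, 0.1, -0.3, -0.1, -0.4])

# 1.0 where the sign of the prediction matches the sign of the target, else 0.0.
hits = pd.Series(
    ((y_true.values < 0) == (y_pred < 0)).astype(float), index=y_true.index
)

panel_accuracy = hits.mean()  # pooled over all samples
cross_section_accuracy = hits.groupby(level="cid").mean().mean()
time_period_accuracy = hits.groupby(level="real_date").mean().mean()
```

Here the pooled accuracy is 5/6, while the cross-section average is 0.75, because the single miss falls in the smaller CAD group and therefore carries more weight.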

regression_balanced_accuracy(y_true, y_pred, type='panel')

Balanced accuracy of signs between regression labels and predictions.

Parameters:
  • y_true (pd.Series of shape (n_samples,)) – True regression labels.

  • y_pred (array-like of shape (n_samples,)) – Predicted regression labels.

  • type (str, default="panel") – The panel dimension over which to compute the balanced accuracy. Options are “panel”, “cross_section” and “time_periods”.

Returns:

balanced_accuracy – The balanced accuracy between signs of prediction-target pairs.

Return type:

float

Notes

Balanced accuracy can be calculated over the whole panel, considering all samples irrespective of cross-section or time period. It can be beneficial, however, to estimate expected balanced accuracy for a cross-section or time period instead.

When type = “cross_section”, the returned balanced accuracy score is the mean balanced accuracy across cross-sections, an empirical estimate of the expected balanced accuracy for a cross-section of interest.

When type = “time_periods”, the returned balanced accuracy score is the mean balanced accuracy across time periods, an empirical estimate of the expected balanced accuracy for a time period of interest.
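To show why balanced accuracy can differ from plain accuracy, the sketch below computes both on the signs of a small sample. It uses the standard definition of balanced accuracy (the mean of per-class recall); `sign_balanced_accuracy` is a hypothetical helper, and treating values strictly below zero as the negative class is an assumption.

```python
import numpy as np


def sign_balanced_accuracy(y_true, y_pred):
    """Mean recall over the negative and non-negative sign classes (hypothetical helper)."""
    true_neg = np.asarray(y_true) < 0
    pred_neg = np.asarray(y_pred) < 0
    recalls = []
    for cls in (True, False):
        mask = true_neg == cls
        if mask.any():  # skip a class that never occurs in y_true
            recalls.append(np.mean(pred_neg[mask] == cls))
    return float(np.mean(recalls))


# Targets are mostly positive; the model always predicts a positive value.
y_true = np.array([1.0, 2.0, 3.0, 4.0, -1.0])
y_pred = np.array([1.0, 1.0, 1.0, 1.0, 1.0])

plain_accuracy = float(np.mean((y_true < 0) == (y_pred < 0)))
balanced_accuracy = sign_balanced_accuracy(y_true, y_pred)
```

The always-positive predictor reaches a plain accuracy of 0.8 but a balanced accuracy of only 0.5, since it never identifies a negative target.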

panel_significance_probability(y_true, y_pred)

One minus the p-value, \(1 - \text{pval}\), of the Macrosynergy panel (MAP) test for the significance of a correlation, accounting for cross-sectional correlations.

Parameters:
  • y_true (pd.Series of shape (n_samples,)) – True regression labels.

  • y_pred (pd.Series of shape (n_samples,)) – Predicted regression labels.

Returns:

prob_significance – The probability of significance of the relation between predictions and targets.

Return type:

float

Notes

The (Ma)crosynergy (p)anel (MAP) test is a hypothesis test for the significance of a relation between two variables, accounting for cross-sectional correlations. A period-specific random effects model is estimated and a Wald test is performed on the coefficient of interest. Since the test requires a panel structure, both inputs must be pd.Series objects, multi-indexed by cross-section and real date.
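The required input layout can be sketched as follows. The level names `cid` and `real_date`, the simulated data and the helper `is_panel_series` are assumptions made for illustration; the test itself is not reproduced here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
cids = ["AUD", "CAD", "GBP"]
dates = pd.date_range("2024-01-01", periods=12, freq="D")

# Assumption: level 0 is the cross-section, level 1 is the real date.
index = pd.MultiIndex.from_product([cids, dates], names=["cid", "real_date"])
y_true = pd.Series(rng.normal(size=len(index)), index=index, name="target")
y_pred = pd.Series(
    0.5 * y_true.values + rng.normal(size=len(index)),
    index=index,
    name="prediction",
)


def is_panel_series(s):
    """Hypothetical check: a pd.Series with a (cross-section, date) MultiIndex."""
    return (
        isinstance(s, pd.Series)
        and isinstance(s.index, pd.MultiIndex)
        and s.index.nlevels == 2
        and pd.api.types.is_datetime64_any_dtype(s.index.get_level_values(1))
    )


# With macrosynergy installed, series shaped like these could then be passed as
# panel_significance_probability(y_true, y_pred).
inputs_ok = is_panel_series(y_true) and is_panel_series(y_pred)
```

A plain NumPy array of predictions, which the other metrics on this page accept, would not carry the index needed here.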

sharpe_ratio(y_true, y_pred, binary=True, thresh=None, type='panel')

Sharpe ratio of a strategy where the trader goes long by a single unit when the predictions are positive and short by a single unit when the predictions are negative.

Parameters:
  • y_true (pd.Series of shape (n_samples,)) – True regression labels.

  • y_pred (array-like of shape (n_samples,)) – Predicted regression labels.

  • binary (bool, default=True) – Whether to consider only directional returns. If True, the portfolio returns only consider the sign of the predictions. If False, naive portfolio weights are determined. See Notes for more information on their calculation.

  • thresh (float, default=None) – The clipping threshold for portfolio weights when binary = False. If set, weights are clipped to the range [-thresh, thresh].

  • type (str, default="panel") – The panel dimension over which to compute the Sharpe ratio. Options are “panel”, “cross_section” and “time_periods”.

Returns:

sharpe_ratio – The Sharpe ratio of the strategy.

Return type:

float

Notes

A Sharpe ratio can be calculated over the whole panel, considering the mean and standard deviation of the returns irrespective of cross-section or time period. It can be beneficial, however, to estimate the expected Sharpe ratio for a cross-section or time period instead.

When type = “cross_section”, the returned Sharpe ratio is the mean Sharpe ratio across cross-sections, an empirical estimate of the expected Sharpe ratio for a cross-section of interest.

When type = “time_periods”, the returned Sharpe ratio is the mean Sharpe ratio across time periods, an empirical estimate of the expected Sharpe ratio for a time period of interest.

This metric can calculate the Sharpe ratio of either binary or non-binary strategies. When binary = False, predictions are normalized by their standard deviation in each time period. If thresh is not None, the resulting weights are clipped to the range [-thresh, thresh]. The resulting portfolio returns are the product of these derived weights and the true returns.
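The weight construction described above can be sketched as follows. This is an illustration under stated assumptions, not the library source: `strategy_returns` and `sharpe` are hypothetical helpers, index level 1 is assumed to hold the dates, the sample standard deviation (ddof=1) is used throughout, a zero prediction is assumed to give a flat position, and no annualization is applied.

```python
import numpy as np
import pandas as pd


def strategy_returns(y_true, y_pred, binary=True, thresh=None):
    """Hypothetical helper: portfolio returns as weights times true returns."""
    preds = pd.Series(np.asarray(y_pred, dtype=float), index=y_true.index)
    if binary:
        # One unit long for positive predictions, one unit short for negative ones.
        weights = np.sign(preds)
    else:
        # Scale each prediction by the standard deviation of predictions in its period.
        period_std = preds.groupby(level=1).transform("std")
        weights = preds / period_std
        if thresh is not None:
            weights = weights.clip(lower=-thresh, upper=thresh)
    return weights * y_true


def sharpe(returns):
    """Mean return over sample standard deviation (no annualization)."""
    return float(returns.mean() / returns.std())


index = pd.MultiIndex.from_product(
    [["AUD", "CAD"], pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"])],
    names=["cid", "real_date"],
)
y_true = pd.Series([1.0, -2.0, 3.0, -1.0, 2.0, -3.0], index=index)
y_pred = np.array([0.5, -1.0, 1.0, 1.0, 2.0, -1.0])

binary_returns = strategy_returns(y_true, y_pred, binary=True)
clipped_returns = strategy_returns(y_true, y_pred, binary=False, thresh=1.0)

binary_sharpe = sharpe(binary_returns)    # panel-wide Sharpe ratio, binary strategy
clipped_sharpe = sharpe(clipped_returns)  # panel-wide Sharpe ratio, clipped weights
```

The per-period scaling needs at least two cross-sections in each period, since the sample standard deviation of a single prediction is undefined.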

sortino_ratio(y_true, y_pred, binary=True, thresh=None, type='panel')

Sortino ratio of a strategy where the trader goes long by a single unit when the predictions are positive and short by a single unit when the predictions are negative.

Parameters:
  • y_true (pd.Series of shape (n_samples,)) – True regression labels.

  • y_pred (array-like of shape (n_samples,)) – Predicted regression labels.

  • binary (bool, default=True) – Whether to consider only directional returns. If True, the portfolio returns only consider the sign of the predictions. If False, naive portfolio weights are determined. See Notes for more information on their calculation.

  • thresh (float, default=None) – The clipping threshold for portfolio weights when binary = False. If set, weights are clipped to the range [-thresh, thresh].

  • type (str, default="panel") – The panel dimension over which to compute the Sortino ratio. Options are “panel”, “cross_section” and “time_periods”.

Returns:

sortino_ratio – The Sortino ratio of the strategy.

Return type:

float

Notes

A Sortino ratio can be calculated over the whole panel, considering the mean and downside standard deviation of the returns irrespective of cross-section or time period. It can be beneficial, however, to estimate the expected Sortino ratio for a cross-section or time period instead.

When type = “cross_section”, the returned Sortino ratio is the mean Sortino ratio across cross-sections, an empirical estimate of the expected Sortino ratio for a cross-section of interest.

When type = “time_periods”, the returned Sortino ratio is the mean Sortino ratio across time periods, an empirical estimate of the expected Sortino ratio for a time period of interest.

This metric can calculate the Sortino ratio of either binary or non-binary strategies. When binary = False, predictions are normalized by their standard deviation in each time period. If thresh is not None, the resulting weights are clipped to the range [-thresh, thresh]. The resulting portfolio returns are the product of these derived weights and the true returns.
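The difference from the Sharpe ratio lies in the denominator. The sketch below uses one common convention for the downside deviation, namely the root mean square of returns below a zero target taken over all observations. That convention, the zero target and the helper name `sortino` are assumptions; the library may define the downside standard deviation differently.

```python
import numpy as np


def sortino(returns, target=0.0):
    """Hypothetical helper: mean excess return over downside deviation."""
    r = np.asarray(returns, dtype=float)
    # Keep only shortfalls below the target; gains contribute zero.
    shortfall = np.minimum(r - target, 0.0)
    downside_dev = np.sqrt(np.mean(shortfall ** 2))
    if downside_dev == 0.0:
        return float("nan")  # undefined when no return falls below the target
    return float((r.mean() - target) / downside_dev)


# Returns of a binary strategy: five gains and a single loss.
strategy_returns = np.array([1.0, 2.0, 3.0, -1.0, 2.0, 3.0])
sortino_value = sortino(strategy_returns)
```

Only the single loss enters the denominator, so large gains do not lower the ratio as they would with an ordinary standard deviation.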

correlation_coefficient(y_true, y_pred, correlation_type='pearson', type='panel')

Correlation coefficient between true and predicted regression labels.

Parameters:
  • y_true (array-like of shape (n_samples,)) – True regression labels.

  • y_pred (array-like of shape (n_samples,)) – Predicted regression labels.

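  • correlation_type (str, default="pearson") – The type of correlation coefficient to compute. Options are “pearson”, “spearman” and “kendall”.

  • type (str, default="panel") – The panel dimension over which to compute the correlation coefficient. Options are “panel”, “cross_section” and “time_periods”.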

Returns:

correlation – The correlation coefficient between true and predicted regression labels.

Return type:

float

Notes

A correlation coefficient can be calculated over the whole panel, considering all samples irrespective of cross-section or time period. It can be beneficial, however, to estimate the expected correlation coefficient for a cross-section or time period instead.

When type = “cross_section”, the returned correlation coefficient is the mean correlation coefficient across cross-sections, an empirical estimate of the expected correlation coefficient for a cross-section of interest.

When type = “time_periods”, the returned correlation coefficient is the mean correlation coefficient across time periods, an empirical estimate of the expected correlation coefficient for a time period of interest.
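The gap between a pooled correlation and a mean of per-cross-section correlations can be sketched as follows. This is an illustration, not the library implementation: `group_correlation` is a hypothetical helper, the level names `cid` and `real_date` are assumed, and the Spearman coefficient is computed as the Pearson correlation of ranks. Kendall's tau is omitted to keep the example free of extra dependencies.

```python
import numpy as np
import pandas as pd


def group_correlation(frame, method="pearson"):
    """Hypothetical helper: Pearson or Spearman correlation of two columns."""
    a, b = frame["true"], frame["pred"]
    if method == "spearman":
        a, b = a.rank(), b.rank()  # Spearman is Pearson applied to ranks
    return float(a.corr(b))


index = pd.MultiIndex.from_product(
    [["AUD", "CAD"], pd.date_range("2024-01-01", periods=4, freq="D")],
    names=["cid", "real_date"],
)
frame = pd.DataFrame(
    {
        "true": [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0],
        # AUD predictions are on a much larger scale than CAD predictions.
        "pred": [10.0, 20.0, 30.0, 40.0, 2.0, 1.0, 4.0, 3.0],
    },
    index=index,
)

panel_corr = group_correlation(frame)
cross_section_corr = float(
    np.mean([group_correlation(g) for _, g in frame.groupby(level="cid")])
)
cross_section_spearman = float(
    np.mean([group_correlation(g, "spearman") for _, g in frame.groupby(level="cid")])
)
```

The per-cross-section correlations are 1.0 for AUD and 0.6 for CAD, giving a mean of 0.8, whereas the pooled Pearson correlation is lower because the two cross-sections sit on different prediction scales.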