macrosynergy.learning.forecasting.meta_estimators#

class DataFrameTransformer(transformer, column_names=None)[source]#

Bases: BaseEstimator, TransformerMixin, MetaEstimatorMixin

Meta estimator that converts the numpy array produced by a transformer back into a multi-indexed pandas DataFrame, preserving the panel structure.

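Parameters:

transformer (TransformerMixin) – A scikit-learn transformer with fit and transform methods.
column_names (optional, default=None) – Column names for the transformed DataFrame. If None, default names of the form “Factor_0”, “Factor_1”, etc. are used.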

Notes

Many scikit-learn compatible transformers convert pandas DataFrames to numpy arrays. This can be problematic when working with panel models that require knowledge of the panel structure. This class wraps around such transformers to ensure that the output is a pandas DataFrame, preserving the original index.

When no column names are provided, default names of the form “Factor_0”, “Factor_1”, etc. are used for the transformed DataFrame. If column names are provided, they will be used instead.
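The behaviour described in these notes can be illustrated with a minimal sketch. The helper `transform_to_frame` and the toy panel are hypothetical and written only for illustration; this is not the library's implementation, and it assumes the transformer keeps the number of rows unchanged.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA


# Hypothetical stand-in for the wrapper's behaviour; not the library's code.
def transform_to_frame(transformer, X, column_names=None):
    """Apply a fitted transformer and re-attach X's index to the result."""
    values = np.asarray(transformer.transform(X))
    if column_names is None:
        # Default naming scheme described in the notes above.
        column_names = [f"Factor_{i}" for i in range(values.shape[1])]
    return pd.DataFrame(values, index=X.index, columns=column_names)


# Toy panel: two cross-sections, three dates each (made-up data).
index = pd.MultiIndex.from_product(
    [["AUD", "CAD"], pd.date_range("2020-01-31", periods=3, freq="D")],
    names=["cid", "real_date"],
)
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(6, 3)), index=index, columns=["a", "b", "c"])

pca = PCA(n_components=2).fit(X)
X_factors = transform_to_frame(pca, X)  # DataFrame with the original MultiIndex
```

Passing an explicit list such as `column_names=["pc1", "pc2"]` to the helper would replace the default names.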

fit(X, y=None)[source]#

Fit the underlying transformer.

Parameters:

X (pd.DataFrame) – Pandas dataframe of input features.
y (pd.Series or pd.DataFrame or np.ndarray) – Pandas series, dataframe or numpy array of targets associated with each sample in X.

transform(X)[source]#

Transform the input data based on the underlying transformer, but return a pandas DataFrame instead of a numpy array.

Parameters:: X (pd.DataFrame or numpy array) – Input feature matrix.
Returns:: Transformed data as a pandas DataFrame, preserving the original index and using either provided column names or default names.
Return type:: pd.DataFrame

get_feature_names_out(input_features=None)[source]#

Get output feature names produced by the wrapped transformer.

Parameters:: input_features (None) – This parameter has no effect and is included for compatibility with the scikit-learn API.

class ProbabilityEstimator(classifier)[source]#

Bases: BaseEstimator, MetaEstimatorMixin, ClassifierMixin

Meta estimator to create trading signals based on the probability of going long.

Parameters:: classifier (ClassifierMixin) – A scikit-learn classifier.

Notes

This class exposes the feature importances of the base estimator as its own and defines a create_signal method that returns the probability of going long in excess of 0.5. The SignalOptimizer class in this package takes create_signal into account when it uses this estimator.
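As a rough illustration of "probability of going long in excess of 0.5", the sketch below computes such a signal directly from a plain scikit-learn classifier. It assumes that the long class is labelled 1, which the documentation above does not state, and it does not use the ProbabilityEstimator class itself.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
# Assumed encoding for this sketch: 1 = long, 0 = short.
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

clf = LogisticRegression().fit(X, y)

# Locate the predict_proba column that corresponds to the "long" class.
long_col = list(clf.classes_).index(1)

# Signal in [-0.5, 0.5]: positive values lean long, negative values lean short.
signal = clf.predict_proba(X)[:, long_col] - 0.5
```

Centering the probability at zero gives the signal a sign that matches the predicted direction and a magnitude that reflects the classifier's confidence.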

fit(X, y)[source]#

Fit the underlying classifier.

Parameters:

X (pd.DataFrame or np.ndarray) – Pandas dataframe or numpy array of input features.
y (pd.Series or pd.DataFrame or np.ndarray) – Pandas series, dataframe or numpy array of targets associated with each sample in X.

predict(X)[source]#

Predict the class labels for the provided data.

Parameters:: X (pd.DataFrame or numpy array) – Input feature matrix.
Returns:: y_pred – Numpy array of predictions.
Return type:: np.ndarray

create_signal(X)[source]#

Create a trading signal based on the probability of going long.

Parameters:: X (pd.DataFrame or numpy array) – Input feature matrix.
Returns:: y_pred – Numpy array of signals.
Return type:: np.ndarray

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → ProbabilityEstimator#

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

class FIExtractor(estimator)[source]#

Bases: BaseEstimator, MetaEstimatorMixin, RegressorMixin

fit(X, y)[source]#

Fit the underlying estimator and store normalized feature importances.

Parameters:

X (pd.DataFrame or np.ndarray) – Pandas dataframe or numpy array of input features.
y (pd.Series or pd.DataFrame or np.ndarray) – Pandas series, dataframe or numpy array of targets associated with each sample in X.
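The source does not say how the importances are normalized. One common convention, sketched below under that assumption, takes `feature_importances_` if available, otherwise absolute coefficients, and rescales them to sum to one. The helper `normalized_importances` is hypothetical and is not part of the library.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def normalized_importances(fitted_estimator):
    """Return non-negative importances scaled to sum to one (hypothetical helper)."""
    if hasattr(fitted_estimator, "feature_importances_"):
        raw = np.asarray(fitted_estimator.feature_importances_, dtype=float)
    elif hasattr(fitted_estimator, "coef_"):
        raw = np.abs(np.ravel(fitted_estimator.coef_))
    else:
        raise AttributeError("estimator exposes neither feature_importances_ nor coef_")
    total = raw.sum()
    # Guard against an all-zero vector to avoid dividing by zero.
    return raw / total if total > 0 else raw


# Made-up regression data: the third feature is irrelevant by construction.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=100)

model = LinearRegression().fit(X, y)
importances = normalized_importances(model)
```

Normalizing makes importances comparable across estimators whose raw coefficients live on different scales.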

predict(X)[source]#

Predict the target values for the provided data.

Parameters:: X (pd.DataFrame or numpy array) – Input feature matrix.
Returns:: y_pred – Numpy array of predictions.
Return type:: np.ndarray

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → FIExtractor#

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

class CountryByCountryRegression(estimator, min_xs_samples=32)[source]#

Bases: BaseEstimator, MetaEstimatorMixin, RegressorMixin

Meta estimator that fits a scikit-learn-compatible regressor on each country’s slice of a panel. If a country has fewer samples than min_xs_samples, a global model is used for prediction instead.

Parameters:

estimator (object) – A scikit-learn compatible regressor that will be cloned for each country.
min_xs_samples (int, default=32) – Minimum number of samples required for fitting a country-specific model. If a country has fewer samples, the global model will be used for predictions.

Notes

Country-by-country regressions model a panel through a “bottom-up” approach, treating each country as a separate regression problem. This is useful when a panel is particularly heterogeneous or when each time series in the panel is long. Short time series result in low-bias, high-variance models that tend to underperform a global forecasting model. Regularization of each country-specific model can help improve performance.
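The bottom-up idea with a global fallback can be sketched as follows. The two helper functions and the toy panel are hypothetical; the sketch assumes that the first index level holds the country identifier and that the global model is fitted on the full panel, neither of which is confirmed by the text above.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.linear_model import LinearRegression


def fit_by_cross_section(estimator, X, y, min_xs_samples=32):
    """Fit a global model plus one clone per country with enough samples."""
    global_model = clone(estimator).fit(X, y)
    cids = X.index.get_level_values(0)
    models = {}
    for cid in cids.unique():
        mask = cids == cid
        if mask.sum() >= min_xs_samples:
            models[cid] = clone(estimator).fit(X.loc[mask], y.loc[mask])
    return global_model, models


def predict_by_cross_section(global_model, models, X):
    """Predict each row with its country's model, else with the global one."""
    cids = X.index.get_level_values(0)
    preds = np.empty(len(X))
    for cid in cids.unique():
        mask = cids == cid
        preds[mask] = models.get(cid, global_model).predict(X.loc[mask])
    return preds


# Toy panel: AUD has 40 samples, CAD only 10 (made-up data).
dates = pd.date_range("2020-01-01", periods=40, freq="D")
index = pd.MultiIndex.from_tuples(
    [("AUD", d) for d in dates] + [("CAD", d) for d in dates[:10]],
    names=["cid", "real_date"],
)
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(50, 2)), index=index, columns=["x1", "x2"])
y = pd.Series(X["x1"].to_numpy() + 0.1 * rng.normal(size=50), index=index)

global_model, models = fit_by_cross_section(LinearRegression(), X, y)
preds = predict_by_cross_section(global_model, models, X)
```

With the default threshold of 32, only AUD receives its own model here, and CAD predictions come from the global model.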

fit(X, y)[source]#

predict(X)[source]#

Predict the target values for the given input data.

Parameters:: X (pd.DataFrame) – Input features for prediction.
Returns:: Predicted target values.
Return type:: np.ndarray

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → CountryByCountryRegression#

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

class TimeWeightedWrapper(model, half_life)[source]#

Bases: BaseEstimator, RegressorMixin

Meta-estimator that applies time-based weighting to samples during model fitting.

Parameters:

model (BaseEstimator) – An instance of a scikit-learn compatible regression model.
half_life (float) – The half-life parameter for the exponential decay weighting.

fit(X, y)[source]#

Fit the underlying model with time weights applied.

Parameters:

X (pandas.DataFrame or np.ndarray) – The feature matrix.
y (pandas.Series or np.ndarray) – The target vector.
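The exact weighting formula is not given above. A common form of exponential decay with a half-life, assumed here purely for illustration, halves the weight every `half_life` periods going back from the most recent sample. The helper `exponential_time_weights` is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def exponential_time_weights(n_periods, half_life):
    """Weights that halve every `half_life` periods back from the latest one."""
    age = np.arange(n_periods - 1, -1, -1)  # oldest sample first, newest has age 0
    weights = 0.5 ** (age / half_life)
    return weights / weights.sum()  # rescale so the weights sum to one


# Made-up time series with 24 periods, ordered from oldest to newest.
n_periods, half_life = 24, 6
rng = np.random.default_rng(0)
X = rng.normal(size=(n_periods, 1))
y = 1.5 * X[:, 0] + 0.1 * rng.normal(size=n_periods)

weights = exponential_time_weights(n_periods, half_life)

# Recent samples influence the fit more than older ones.
model = LinearRegression().fit(X, y, sample_weight=weights)
```

In a panel, the age of a sample would typically be measured in unique dates rather than rows, so that all countries observed on the same date share one weight.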

predict(X)[source]#

Predict using the underlying model.

Parameters:: X (pandas.DataFrame or np.ndarray) – The feature matrix.
Returns:: predictions – The predicted values.
Return type:: np.ndarray

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → TimeWeightedWrapper#

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object
