macrosynergy.learning.forecasting.meta_estimators
- class DataFrameTransformer(transformer, column_names=None)
Bases: BaseEstimator, TransformerMixin, MetaEstimatorMixin
Meta estimator to reconvert a transformed numpy array back to a multi-indexed pandas DataFrame. This maintains the multi-indexed panel structure.
- Parameters:
transformer (TransformerMixin) – A scikit-learn transformer with a fit and transform method.
column_names (optional) – Column names for the transformed DataFrame. Defaults to None, in which case names of the form “Factor_0”, “Factor_1”, etc. are used.
Notes
Many scikit-learn compatible transformers convert pandas DataFrames to numpy arrays. This can be problematic when working with panel models that require knowledge of the panel structure. This class wraps around such transformers to ensure that the output is a pandas DataFrame, preserving the original index.
When no column names are provided, default names of the form “Factor_0”, “Factor_1”, etc. are used for the transformed DataFrame. If column names are provided, they will be used instead.
- fit(X, y=None)
Fit the underlying transformer.
- Parameters:
X (pd.DataFrame) – Pandas dataframe of input features.
y (pd.Series or pd.DataFrame or np.ndarray) – Pandas series, dataframe or numpy array of targets associated with each sample in X.
- transform(X)
Transform the input data based on the underlying transformer, but return a pandas DataFrame instead of a numpy array.
- Parameters:
X (pd.DataFrame or numpy array) – Input feature matrix.
- Returns:
Transformed data as a pandas DataFrame, preserving the original index and using either provided column names or default names.
- Return type:
pd.DataFrame
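The wrapping pattern described in the Notes can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name `IndexPreservingTransformer` and the index level names `cid` and `real_date` are assumptions made for the example, and PCA is used only as a convenient transformer that returns a numpy array.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.decomposition import PCA


class IndexPreservingTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in that mimics the behaviour described above."""

    def __init__(self, transformer, column_names=None):
        self.transformer = transformer
        self.column_names = column_names

    def fit(self, X, y=None):
        self.transformer.fit(X, y)
        return self

    def transform(self, X):
        values = self.transformer.transform(X)  # plain numpy array
        names = self.column_names
        if names is None:
            # Fall back to default names: Factor_0, Factor_1, ...
            names = [f"Factor_{i}" for i in range(values.shape[1])]
        # Re-attach the original (multi-)index to the transformed values.
        return pd.DataFrame(values, index=X.index, columns=names)


# Toy panel with two cross-sections and four dates each (assumed layout).
index = pd.MultiIndex.from_product(
    [["AUD", "CAD"], pd.date_range("2024-01-31", periods=4, freq="D")],
    names=["cid", "real_date"],
)
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(8, 3)), index=index, columns=["x1", "x2", "x3"])

factors = IndexPreservingTransformer(PCA(n_components=2)).fit_transform(X)
```

Here `factors` is a DataFrame that shares the multi-index of `X`, so downstream panel-aware estimators can still identify each cross-section and date.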
- class ProbabilityEstimator(classifier)
Bases: BaseEstimator, MetaEstimatorMixin, ClassifierMixin
Meta estimator to create trading signals based on the probability of going long.
- Parameters:
classifier (ClassifierMixin) – A scikit-learn classifier.
Notes
This class exposes the feature importances of the base estimator as its own and defines a create_signal method that returns the probability of going long in excess of 0.5. The SignalOptimizer class in this package takes this method into account.
- fit(X, y)
Fit the underlying classifier.
- Parameters:
X (pd.DataFrame or np.ndarray) – Pandas dataframe or numpy array of input features.
y (pd.Series or pd.DataFrame or np.ndarray) – Pandas series, dataframe or numpy array of targets associated with each sample in X.
- predict(X)
Predict the class labels for the provided data.
- Parameters:
X (pd.DataFrame or numpy array) – Input feature matrix.
- Returns:
y_pred – Numpy array of predictions.
- Return type:
np.ndarray
- create_signal(X)
Create a trading signal based on the probability of going long.
- Parameters:
X (pd.DataFrame or numpy array) – Input feature matrix.
- Returns:
y_pred – Numpy array of signals.
- Return type:
np.ndarray
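As a rough illustration of the signal described above, the sketch below subtracts 0.5 from the predicted probability of the positive class. It assumes that the label 1 denotes a long position and uses LogisticRegression purely as an example classifier. The helper name `long_probability_signal` is hypothetical and not part of the package.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def long_probability_signal(classifier, X, long_label=1):
    """Return P(long) - 0.5 for a fitted classifier (illustrative helper)."""
    # Locate the probability column that corresponds to the "long" class.
    long_col = list(classifier.classes_).index(long_label)
    return classifier.predict_proba(X)[:, long_col] - 0.5


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)  # 1 = long, 0 = short

clf = LogisticRegression().fit(X, y)
signal = long_probability_signal(clf, X)
```

The resulting values lie between -0.5 and 0.5, so both the sign and the magnitude carry information: a value near zero indicates little conviction in either direction.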
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> ProbabilityEstimator
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
- class FIExtractor(estimator)
Bases: BaseEstimator, MetaEstimatorMixin, RegressorMixin
- fit(X, y)
Fit the underlying estimator and store normalized feature importances.
- Parameters:
X (pd.DataFrame or np.ndarray) – Pandas dataframe or numpy array of input features.
y (pd.Series or pd.DataFrame or np.ndarray) – Pandas series, dataframe or numpy array of targets associated with each sample in X.
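The normalization scheme is not spelled out here. One plausible scheme, shown purely as an assumption, is to take the absolute coefficients (or tree-based importances) of the fitted estimator and rescale them to sum to one. The function name `normalized_importances` is hypothetical.

```python
import numpy as np


def normalized_importances(raw):
    """Scale absolute importances so that they sum to one (assumed scheme)."""
    magnitudes = np.abs(np.asarray(raw, dtype=float))
    total = magnitudes.sum()
    # Guard against an all-zero vector to avoid division by zero.
    return magnitudes / total if total > 0 else magnitudes


coefficients = [2.0, -1.0, 1.0]  # e.g. coef_ of a fitted linear model
importances = normalized_importances(coefficients)  # [0.5, 0.25, 0.25]
```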
- predict(X)
Predict target values for the provided data using the underlying estimator.
- Parameters:
X (pd.DataFrame or numpy array) – Input feature matrix.
- Returns:
y_pred – Numpy array of predictions.
- Return type:
np.ndarray
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> FIExtractor
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
- class CountryByCountryRegression(estimator, min_xs_samples=32)
Bases: BaseEstimator, MetaEstimatorMixin, RegressorMixin
Meta estimator to fit a scikit-learn-compatible regressor on each country’s data slice in a panel. If a country has fewer samples than min_xs_samples, a global model is used for the sake of prediction.
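- Parameters:
estimator – A scikit-learn-compatible regressor that is fitted on each country’s data slice.
min_xs_samples (int) – Minimum number of samples a country needs for its own model to be used; countries with fewer samples are predicted with the global model. Default is 32.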
Notes
Country-by-country regressions model a panel through a “bottom-up” approach, treating each country as a separate regression problem. This is useful when a panel is particularly heterogeneous or when each time series in the panel is long. Short time series tend to produce low-bias, high-variance country models that underperform a global forecasting model. Regularizing each country-specific model can help improve performance.
- predict(X)
Predict the target values for the given input data.
- Parameters:
X (pd.DataFrame) – Input features for prediction.
- Returns:
Predicted target values.
- Return type:
np.ndarray
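The fallback logic described for this class can be sketched as below. This is a simplified illustration under stated assumptions, not the package's code: it assumes the first index level identifies the country, that the global model is fitted on the full panel, and the function names `fit_by_country` and `predict_by_country` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.linear_model import LinearRegression


def fit_by_country(X, y, estimator, min_xs_samples=32):
    """Fit one model per country plus a pooled fallback (illustrative)."""
    cids = X.index.get_level_values(0)
    global_model = clone(estimator).fit(X, y)
    country_models = {}
    for cid in cids.unique():
        mask = cids == cid
        # Only countries with enough history get their own model.
        if mask.sum() >= min_xs_samples:
            country_models[cid] = clone(estimator).fit(X[mask], y[mask])
    return global_model, country_models


def predict_by_country(X, global_model, country_models):
    """Use the country model where one exists, else the pooled model."""
    cids = X.index.get_level_values(0)
    preds = np.empty(len(X))
    for cid in cids.unique():
        mask = cids == cid
        model = country_models.get(cid, global_model)
        preds[mask] = model.predict(X[mask])
    return preds


# Toy panel: "AAA" has 40 observations, "BBB" only 5.
rng = np.random.default_rng(0)
index = pd.MultiIndex.from_arrays(
    [
        ["AAA"] * 40 + ["BBB"] * 5,
        list(pd.date_range("2020-01-31", periods=40, freq="D"))
        + list(pd.date_range("2020-01-31", periods=5, freq="D")),
    ],
    names=["cid", "real_date"],
)
X = pd.DataFrame({"x": rng.normal(size=45)}, index=index)
y = pd.Series(2.0 * X["x"] + 0.1 * rng.normal(size=45), index=index)

global_model, country_models = fit_by_country(X, y, LinearRegression())
preds = predict_by_country(X, global_model, country_models)
```

With the default threshold of 32, only "AAA" receives its own model here, while predictions for "BBB" come from the pooled model.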
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> CountryByCountryRegression
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
- class TimeWeightedWrapper(model, half_life)
Bases: BaseEstimator, RegressorMixin
Meta-estimator that applies time-based weighting to samples during model fitting.
- Parameters:
model (BaseEstimator) – An instance of a scikit-learn compatible regression model.
half_life (float) – The half-life parameter for the exponential decay weighting.
- fit(X, y)
Fit the underlying model with time weights applied.
- Parameters:
X (pandas.DataFrame or np.ndarray) – The feature matrix.
y (pandas.Series or np.ndarray) – The target vector.
- predict(X)
Predict using the underlying model.
- Parameters:
X (pandas.DataFrame or np.ndarray) – The feature matrix.
- Returns:
predictions – The predicted values.
- Return type:
np.ndarray
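The exact weighting formula is not given here. A common form of half-life weighting, assumed for this sketch, gives the most recent period a weight of 1 and halves the weight for every half_life periods of age. The function name `half_life_weights` is hypothetical.

```python
def half_life_weights(n_periods, half_life):
    """Exponential-decay weights, oldest period first (assumed formula)."""
    # The most recent period has age 0 and weight 1; a period that is
    # `half_life` periods older carries half that weight.
    return [0.5 ** ((n_periods - 1 - t) / half_life) for t in range(n_periods)]


weights = half_life_weights(n_periods=5, half_life=2)
```

One common way to apply such weights is through the sample_weight argument that many scikit-learn regressors accept in fit. Whether the wrapper uses exactly this mechanism is not stated in this reference.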
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> TimeWeightedWrapper
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Submodules
- macrosynergy.learning.forecasting.meta_estimators.country_by_country_regressions
- macrosynergy.learning.forecasting.meta_estimators.dataframe_transformer
- macrosynergy.learning.forecasting.meta_estimators.feature_importances
- macrosynergy.learning.forecasting.meta_estimators.probability
- macrosynergy.learning.forecasting.meta_estimators.weighted_predictors