macrosynergy.learning.forecasting.linear_model.global_local

class GlobalLocalRegression(local_lambda=1, global_lambda=1, positive=False, fit_intercept=True, min_xs_samples=36)

Bases: BaseEstimator, RegressorMixin

Linear panel model with hierarchical shrinkage of country-specific (local) coefficients towards unknown global coefficients. Both the country-specific and the global coefficients are estimated from the data.

Parameters:

local_lambda (float, default=1) – Regularization strength to pull local coefficients towards global coefficients.
global_lambda (float, default=1) – Regularization strength to pull global coefficients towards zero.
positive (bool, default=False) – Whether to constrain all coefficients to be positive.
fit_intercept (bool, default=True) – Whether to fit an intercept term.
min_xs_samples (int, default=36) – Minimum number of samples a group must have for it to contribute to the mean squared error component of the loss function.
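To illustrate the min_xs_samples threshold, the following sketch counts samples per group and keeps only the groups that meet it. It assumes that a group is a cid level of the index and that the check is a simple count. The library's internal handling may differ, and the data here is made up.

```python
import pandas as pd

min_xs_samples = 36  # same value as the documented default

# Hypothetical panel: AUD has 40 monthly samples, CAD has only 12.
index = pd.MultiIndex.from_tuples(
    [("AUD", d) for d in pd.date_range("2020-01-01", periods=40, freq="MS")]
    + [("CAD", d) for d in pd.date_range("2020-01-01", periods=12, freq="MS")],
    names=["cid", "real_date"],
)
X = pd.DataFrame({"feat": range(len(index))}, index=index)

# Count samples per group and keep the groups that reach the threshold.
# Assumption: only these groups would enter the mean squared error term.
counts = X.groupby(level="cid").size()
eligible = counts[counts >= min_xs_samples].index.tolist()
```

In this toy panel only AUD reaches the threshold, so under the stated assumption CAD would not contribute to the data-fit part of the loss.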

Notes

A panel can be modelled from a global perspective, where the time series of all countries are “pooled” (stacked together) and samples from different countries are treated as independent. This is called a pooled regression. Because a single model is fit on all countries’ data, it is a high-bias, low-variance model.

Alternatively, country-by-country regressions can be fit, with a separate model for each country. This is low-bias but high-variance, since each model sees less data.

A balance can be struck between these two extremes of the bias-variance trade-off. Introducing some bias into the country-by-country models can lead to a potentially substantial reduction in variance. Mathematically, the fit is found by minimizing the average of the per-country mean squared residuals plus a term that penalizes deviation of the country-specific coefficients from the global coefficients. The global coefficients are also penalized to prevent them from growing too large.

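The loss function is as follows, where \(C\) is the number of countries, \(n_{i}\) is the number of samples for country \(i\), \(\beta_i\) are the local coefficients of country \(i\) and \(\beta\) are the global coefficients: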

\[L(\{\beta_i\}_{i=1}^{C}, \beta) = \frac{1}{C} \sum_{i = 1}^{C} \left [ \frac{1}{n_{i}} \sum_{t=1}^{n_{i}} (y_{it} - x_{it}^{\intercal} \beta_{i})^2 \right ] + \lambda_{\text{local}} \sum_{i=1}^{C} ||\beta_i - \beta||_{2}^{2} + \lambda_{\text{global}} ||\beta||_{2}^{2}\]
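The loss above can be evaluated directly with NumPy. The following is a minimal sketch, not the library's implementation: the function name global_local_loss, its argument layout and the toy data are assumptions made for illustration, and no intercept is included.

```python
import numpy as np


def global_local_loss(local_betas, global_beta, Xs, ys,
                      local_lambda=1.0, global_lambda=1.0):
    """Evaluate the global-local loss for per-country data.

    local_betas: array of shape (C, p), one coefficient row per country.
    global_beta: array of shape (p,).
    Xs, ys: lists of length C holding arrays of shape (n_i, p) and (n_i,).
    """
    C = len(Xs)
    # Average over countries of each country's mean squared error.
    mse = sum(
        np.mean((y - X @ b) ** 2) for X, y, b in zip(Xs, ys, local_betas)
    ) / C
    # Penalty pulling each country's coefficients towards the global ones.
    local_penalty = local_lambda * np.sum((local_betas - global_beta) ** 2)
    # Ridge-style penalty on the global coefficients.
    global_penalty = global_lambda * np.sum(global_beta ** 2)
    return mse + local_penalty + global_penalty


# Toy data: two countries, two features, different sample sizes, no noise.
rng = np.random.default_rng(0)
Xs = [rng.normal(size=(40, 2)), rng.normal(size=(60, 2))]
true_beta = np.array([1.0, -0.5])
ys = [X @ true_beta for X in Xs]

# With every coefficient vector equal to true_beta, the data-fit term and the
# local penalty vanish, leaving global_lambda * ||beta||^2 = 1 * 1.25.
local = np.tile(true_beta, (2, 1))
loss_at_truth = global_local_loss(local, true_beta, Xs, ys)

# Setting all coefficients to zero removes both penalties but not the misfit.
loss_at_zero = global_local_loss(np.zeros((2, 2)), np.zeros(2), Xs, ys)
```

Because each country's error is averaged over its own n_i before averaging across countries, a country with few samples carries the same weight in the data-fit term as one with many.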

fit(X, y, sample_weight=None)

Fit the global-local model.

Parameters:

X (pd.DataFrame) – Input feature matrix, multi-indexed by cid and real_date.
y (pd.DataFrame or pd.Series) – Target vector associated with each sample in X, multi-indexed by cid and real_date.
sample_weight (np.ndarray, optional) – Sample weights for each sample in X. If provided, it should be a 1D array with the same length as the number of samples in X. If None, all samples are treated equally.

Returns:

Fitted estimator.

Return type:

self
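As a usage sketch, the snippet below builds a small synthetic panel in the documented input format (a DataFrame multi-indexed by cid and real_date) and fits the model if the package is importable. The cross-section names, feature names and data are hypothetical. The import path is taken from the module name at the top of this page, and the fit is skipped when macrosynergy is not installed.

```python
import numpy as np
import pandas as pd

# Hypothetical cross-sections and monthly dates: 48 samples per country,
# which is above the default min_xs_samples of 36.
cids = ["AUD", "CAD", "GBP"]
dates = pd.date_range("2015-01-01", periods=48, freq="MS")
index = pd.MultiIndex.from_product([cids, dates], names=["cid", "real_date"])

rng = np.random.default_rng(0)
X = pd.DataFrame(
    rng.normal(size=(len(index), 2)),
    index=index,
    columns=["feat_a", "feat_b"],  # made-up feature names
)
# Synthetic target: linear in the features plus a little noise.
noise = rng.normal(scale=0.1, size=len(index))
y = pd.Series(0.5 * X["feat_a"] - 0.2 * X["feat_b"] + noise, name="target")

try:
    from macrosynergy.learning.forecasting.linear_model.global_local import (
        GlobalLocalRegression,
    )
except ImportError:
    GlobalLocalRegression = None  # package not available; skip the fit

if GlobalLocalRegression is not None:
    model = GlobalLocalRegression(local_lambda=1, global_lambda=1)
    model.fit(X, y)           # returns the fitted estimator
    preds = model.predict(X)  # predicted target values
```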

loss(weights)

Loss function for the global-local regression model.

Parameters:

weights (np.ndarray) – Flattened array of weights, where the last n_features_ elements correspond to the global coefficients and the rest correspond to the local coefficients for each country.
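The flattened layout described above can be unpacked as in the sketch below. It assumes the local coefficients are stored country by country (row-major) and that no intercept is included. Both are assumptions for illustration, and the sizes are hypothetical.

```python
import numpy as np

n_countries, n_features = 3, 2  # hypothetical sizes

# Flattened vector: local coefficients first, global coefficients last.
weights = np.arange((n_countries + 1) * n_features, dtype=float)

# The last n_features entries are the global coefficients.
global_beta = weights[-n_features:]
# The remaining entries are reshaped to one row per country (assumed order).
local_betas = weights[:-n_features].reshape(n_countries, n_features)
```

With this layout, local_betas[i] holds the coefficients of the i-th country and local_betas - global_beta gives the deviations that the local penalty acts on.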

loss_derivative(weights)

Derivative of the loss function with respect to the weights.

Parameters:

weights (np.ndarray) – Flattened array of weights, where the last n_features_ elements correspond to the global coefficients and the rest correspond to the local coefficients for each country.
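For reference, differentiating the loss in the Notes gives the gradients below. The sketch that follows implements them and checks them against central finite differences. The functions loss and loss_grad are stand-alone illustrations rather than the class's methods, and they assume the flattened layout described above with no intercept.

\[\frac{\partial L}{\partial \beta_i} = -\frac{2}{C n_{i}} \sum_{t=1}^{n_{i}} x_{it} (y_{it} - x_{it}^{\intercal} \beta_{i}) + 2 \lambda_{\text{local}} (\beta_i - \beta), \qquad \frac{\partial L}{\partial \beta} = -2 \lambda_{\text{local}} \sum_{i=1}^{C} (\beta_i - \beta) + 2 \lambda_{\text{global}} \beta\]

```python
import numpy as np


def loss(weights, Xs, ys, lam_local, lam_global):
    """Global-local loss on a flattened [local..., global] weight vector."""
    C, p = len(Xs), Xs[0].shape[1]
    g = weights[-p:]
    B = weights[:-p].reshape(C, p)
    mse = sum(np.mean((y - X @ b) ** 2) for X, y, b in zip(Xs, ys, B)) / C
    return mse + lam_local * np.sum((B - g) ** 2) + lam_global * np.sum(g ** 2)


def loss_grad(weights, Xs, ys, lam_local, lam_global):
    """Analytic gradient of `loss`, in the same flattened layout."""
    C, p = len(Xs), Xs[0].shape[1]
    g = weights[-p:]
    B = weights[:-p].reshape(C, p)
    grad_B = np.empty_like(B)
    for i, (X, y) in enumerate(zip(Xs, ys)):
        resid = y - X @ B[i]
        # Data-fit term for country i plus the pull towards the global vector.
        grad_B[i] = -2.0 / (C * len(y)) * (X.T @ resid) + 2.0 * lam_local * (B[i] - g)
    # The local penalty enters with the opposite sign, plus the ridge term.
    grad_g = -2.0 * lam_local * np.sum(B - g, axis=0) + 2.0 * lam_global * g
    return np.concatenate([grad_B.ravel(), grad_g])


# Random toy problem: two countries, two features.
rng = np.random.default_rng(1)
Xs = [rng.normal(size=(30, 2)), rng.normal(size=(50, 2))]
ys = [rng.normal(size=30), rng.normal(size=50)]
w = rng.normal(size=(2 + 1) * 2)

# Central finite differences along each coordinate direction.
eps = 1e-6
numeric = np.array([
    (loss(w + eps * e, Xs, ys, 0.5, 0.1) - loss(w - eps * e, Xs, ys, 0.5, 0.1))
    / (2 * eps)
    for e in np.eye(w.size)
])
analytic = loss_grad(w, Xs, ys, 0.5, 0.1)
max_abs_error = np.max(np.abs(numeric - analytic))
```

Since the loss is quadratic in the weights, the two gradients should agree up to floating-point rounding.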

predict(X)

Predict the target values for the given input data.

Parameters:

X (pd.DataFrame) – Input features for prediction.

Returns:

Predicted target values.

Return type:

np.ndarray

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → GlobalLocalRegression

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → GlobalLocalRegression

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object