macrosynergy.learning.forecasting.linear_model.global_local#
- class GlobalLocalRegression(local_lambda=1, global_lambda=1, positive=False, fit_intercept=True, min_xs_samples=36)[source]#
Bases:
BaseEstimator, RegressorMixin
Linear panel model with hierarchical shrinkage of country-specific (local) coefficients towards unknown global coefficients. Both the country-specific and the global coefficients are learned, that is, estimated from data.
- Parameters:
local_lambda (float, default=1) – Regularization strength to pull local coefficients towards global coefficients.
global_lambda (float, default=1) – Regularization strength to pull global coefficients towards zero.
positive (bool, default=False) – Whether to constrain all coefficients to be positive. Default is False.
fit_intercept (bool, default=True) – Whether to fit an intercept term. Default is True.
min_xs_samples (int, default=36) – Minimum number of samples a group (cross-section) must have in order to contribute to the mean squared error component of the loss function.
Notes
A panel can be modelled from a global perspective, where time series of all countries are “pooled” or stacked together, meaning that samples from different countries are treated as independent. This is called a pooled regression. With one model fit on all countries’ data, this is a high-bias, low-variance model.
Alternatively, country-by-country regressions can be fit, with a separate model for each country. This is low-bias but high-variance, since each model sees less data.
A balance between these two extremes can be found by trading off bias against variance: introducing some bias into the country-by-country models can lead to a potentially substantial reduction in variance. Mathematically, this fit is found by minimizing the average of the country-level mean squared errors, plus a term that penalizes deviation of country-specific coefficients from a global coefficient. The global coefficient is also penalized to prevent it from growing too large.
The loss function is as follows:
\[L(\{\beta_i\}_{i=1}^{C}, \beta) = \frac{1}{C} \sum_{i=1}^{C} \left[ \frac{1}{n_{i}} \sum_{t=1}^{n_{i}} (y_{it} - x_{it}^{\intercal} \beta_{i})^2 \right] + \lambda_{\text{local}} \sum_{i=1}^{C} ||\beta_i - \beta||_{2}^{2} + \lambda_{\text{global}} ||\beta||_{2}^{2}\]
- fit(X, y, sample_weight=None)[source]#
Fit the global-local model.
- Parameters:
X (pd.DataFrame) – Input feature matrix, multi-indexed by cid and real_date.
y (pd.DataFrame or pd.Series) – Target vector associated with each sample in X, multi-indexed by cid and real_date.
sample_weight (np.ndarray, optional) – Sample weights for each sample in X. If provided, it should be a 1D array with the same length as the number of samples in X. If None, all samples are treated equally.
- Returns:
Fitted estimator.
- Return type:
self
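As an illustration of the input layout described above, the following is a minimal sketch of how a panel multi-indexed by cid and real_date could be built with pandas. The cross-section identifiers, feature names, dates and random values are all hypothetical placeholders, not data from the package.

```python
import numpy as np
import pandas as pd

# Hypothetical cross-sections and monthly dates, for illustration only.
cids = ["AUD", "CAD", "GBP"]
dates = pd.date_range("2020-01-01", periods=48, freq="MS")

# Every (cid, real_date) pair becomes one row of the panel.
index = pd.MultiIndex.from_product([cids, dates], names=["cid", "real_date"])

rng = np.random.default_rng(0)
X = pd.DataFrame(
    rng.normal(size=(len(index), 2)),
    index=index,
    columns=["feature_1", "feature_2"],  # placeholder feature names
)
y = pd.Series(rng.normal(size=len(index)), index=index, name="target")

# Number of samples per cross-section; each has 48, which exceeds the
# default min_xs_samples=36 from the class signature.
samples_per_cid = X.groupby(level="cid").size()

# With macrosynergy installed, X and y could then be passed to the estimator,
# e.g. GlobalLocalRegression(local_lambda=1, global_lambda=1).fit(X, y)
```

The features and the target share the same index, so each sample in X lines up with its target value in y.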
- loss(weights)[source]#
Loss function for the global-local regression model.
- Parameters:
weights (np.ndarray) – Flattened array of weights, where the last n_features_ elements correspond to the global coefficients and the rest correspond to the local coefficients for each country.
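The flattened layout and the objective from the Notes can be sketched in NumPy as follows. This is an illustrative re-implementation, not the library's code: the function name global_local_loss, the groups argument (a list of per-country (X_i, y_i) pairs) and the country-by-country ordering of the local coefficients are assumptions, and the sketch ignores the intercept, sample weights, the positivity constraint and min_xs_samples.

```python
import numpy as np

def global_local_loss(weights, groups, local_lambda=1.0, global_lambda=1.0):
    """Evaluate the penalized objective for a flattened weight vector.

    groups: list of (X_i, y_i) pairs, one per country (hypothetical layout).
    weights: local coefficients stacked country by country (assumed order),
             followed by the global coefficients in the last n_features slots.
    """
    n_features = groups[0][0].shape[1]
    n_groups = len(groups)
    local = weights[:-n_features].reshape(n_groups, n_features)
    global_coefs = weights[-n_features:]

    # Average of the country-level mean squared errors.
    mse = np.mean([np.mean((y - X @ b) ** 2) for (X, y), b in zip(groups, local)])
    # Pull each local coefficient vector towards the global one.
    local_penalty = local_lambda * np.sum((local - global_coefs) ** 2)
    # Shrink the global coefficients towards zero.
    global_penalty = global_lambda * np.sum(global_coefs ** 2)
    return mse + local_penalty + global_penalty

# Two countries, one feature, chosen so each term can be checked by hand.
groups = [
    (np.array([[1.0], [2.0]]), np.array([1.0, 2.0])),  # exact fit with beta_1 = 1
    (np.array([[1.0], [1.0]]), np.array([3.0, 3.0])),  # exact fit with beta_2 = 3
]
weights = np.array([1.0, 3.0, 2.0])  # [beta_1, beta_2, global beta]
value = global_local_loss(weights, groups)
print(value)
```

Here both countries are fit exactly, so the value is made up of the local penalty, (1 - 2)^2 + (3 - 2)^2 = 2, and the global penalty, 2^2 = 4.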
- loss_derivative(weights)[source]#
Derivative of the loss function with respect to the weights.
- Parameters:
weights (np.ndarray) – Flattened array of weights, where the last n_features_ elements correspond to the global coefficients and the rest correspond to the local coefficients for each country.
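Differentiating the objective from the Notes gives a gradient block for each country's local coefficients and one for the global coefficients. The sketch below writes these out and checks them against central finite differences. It is not the library's implementation: the function names, the groups layout and the ordering of the flattened weights are assumptions, and the intercept, sample weights, positivity constraint and min_xs_samples are ignored.

```python
import numpy as np

def loss(weights, groups, local_lambda, global_lambda):
    # Same objective as in the Notes, for a flattened weight vector.
    p = groups[0][0].shape[1]
    local = weights[:-p].reshape(len(groups), p)
    glob = weights[-p:]
    mse = np.mean([np.mean((y - X @ b) ** 2) for (X, y), b in zip(groups, local)])
    return mse + local_lambda * np.sum((local - glob) ** 2) + global_lambda * np.sum(glob ** 2)

def loss_gradient(weights, groups, local_lambda, global_lambda):
    p = groups[0][0].shape[1]
    C = len(groups)
    local = weights[:-p].reshape(C, p)
    glob = weights[-p:]
    grad_local = np.empty_like(local)
    for i, (X, y) in enumerate(groups):
        resid = y - X @ local[i]
        # d/d beta_i: MSE term scaled by 1/C, plus the pull towards the global vector.
        grad_local[i] = (-2.0 / (C * len(y))) * (X.T @ resid) + 2.0 * local_lambda * (local[i] - glob)
    # d/d beta: opposite pull from every local vector, plus shrinkage to zero.
    grad_glob = -2.0 * local_lambda * np.sum(local - glob, axis=0) + 2.0 * global_lambda * glob
    return np.concatenate([grad_local.ravel(), grad_glob])

# Synthetic data: three countries with different sample counts, two features.
rng = np.random.default_rng(0)
groups = [(rng.normal(size=(n, 2)), rng.normal(size=n)) for n in (40, 50, 60)]
w = rng.normal(size=3 * 2 + 2)

analytic = loss_gradient(w, groups, 0.5, 0.1)
eps = 1e-6
numeric = np.array([
    (loss(w + eps * e, groups, 0.5, 0.1) - loss(w - eps * e, groups, 0.5, 0.1)) / (2 * eps)
    for e in np.eye(w.size)
])
max_abs_diff = np.max(np.abs(analytic - numeric))
```

Setting the global block to zero shows that, in this unconstrained sketch, the global coefficients equal the sum of the local coefficients scaled by local_lambda / (C * local_lambda + global_lambda), that is, a shrunken average of the local coefficients.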
- predict(X)[source]#
Predict the target values for the given input data.
- Parameters:
X (pd.DataFrame) – Input features for prediction.
- Returns:
Predicted target values.
- Return type:
np.ndarray
- set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') GlobalLocalRegression#
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') GlobalLocalRegression#
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.