macrosynergy.panel.panel_imputer

class BasePanelImputer(df, xcats, cids, start=None, end=None, min_cids=None, postfix='F')

Bases: object

Base class for imputing missing values in a panel DataFrame. It defines the overall structure of the imputation but not the imputation method itself. Each imputation method should be implemented as a separate subclass that defines its own technique in get_impute_function.

Parameters:
  • df (DataFrame) – DataFrame containing the panel data.

  • xcats (List[str]) – List of extended categories.

  • cids (List[str]) – List of cross sections.

  • start (str) – Start date in ISO format.

  • end (str) – End date in ISO format.

  • min_cids (int) – Minimum number of cross sections required to perform imputation on a specific real date. Default is len(cids) // 2.

  • postfix (str) – Postfix to add to the extended categories after imputation. Default is “F”.

impute()

Returns the imputed DataFrame.

get_impute_function(group)

Abstract method that should be implemented in a subclass. Defines the imputation technique to be used on a group of values.
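The split between a shared impute step and a subclass-specific technique can be sketched as follows. This is a minimal illustration, not the library's code: the names SketchImputer and SketchZeroImputer, the zero-fill technique, and the column names real_date, cid, xcat and value are all assumptions made for the example, and missing observations are assumed to be present as NaN rows.

```python
from abc import ABC, abstractmethod

import numpy as np
import pandas as pd


class SketchImputer(ABC):
    """Hypothetical stand-in for a base imputer (not the library's class)."""

    def __init__(self, min_cids: int, postfix: str = "F"):
        self.min_cids = min_cids
        self.postfix = postfix

    @abstractmethod
    def get_impute_function(self, group: pd.Series) -> pd.Series:
        """Return the group with its missing values filled in."""

    def impute(self, df: pd.DataFrame) -> pd.DataFrame:
        out = df.copy()
        # One group per (date, category): the values across cross sections.
        out["value"] = out.groupby(["real_date", "xcat"])["value"].transform(
            self.get_impute_function
        )
        # Mark the imputed categories with the postfix.
        out["xcat"] = out["xcat"] + self.postfix
        return out


class SketchZeroImputer(SketchImputer):
    """Placeholder technique: fill with 0.0 when enough values are present."""

    def get_impute_function(self, group: pd.Series) -> pd.Series:
        if group.count() < self.min_cids:
            return group  # too few observations: leave the group as is
        return group.fillna(0.0)


df = pd.DataFrame(
    {
        "real_date": pd.to_datetime(["2024-01-01"] * 3 + ["2024-01-02"] * 3),
        "cid": ["AUD", "CAD", "GBP"] * 2,
        "xcat": ["CPI"] * 6,
        "value": [1.0, 3.0, np.nan, np.nan, np.nan, 5.0],
    }
)
result = SketchZeroImputer(min_cids=2).impute(df)
```

In this sketch only the first date is filled, because the second date has a single non-missing value, which is below min_cids=2.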

generate_blacklist(diff)

Generates a dictionary of the cross sections and dates that have been imputed, in the same format as a blacklist dictionary. For each cross section, it stores a date range over which imputation has been performed.

Parameters:
  • diff (DataFrame) – DataFrame containing the differences between the original and imputed DataFrames.
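As a rough illustration of the idea, the sketch below maps each cross section to the first and last date on which a value was imputed. The function name sketch_blacklist, the column names cid and real_date, and the (start, end) tuple layout are assumptions made for the example; the library's actual output format may differ, and a single range per cross section can also span dates that were not imputed.

```python
import pandas as pd


def sketch_blacklist(diff: pd.DataFrame) -> dict:
    """Map each cross section to its (first, last) imputed date."""
    ranges = diff.groupby("cid")["real_date"].agg(["min", "max"])
    return {
        cid: (start, end)
        for cid, start, end in zip(ranges.index, ranges["min"], ranges["max"])
    }


# Hypothetical input: one row per imputed observation.
diff = pd.DataFrame(
    {
        "cid": ["GBP", "GBP", "USD"],
        "real_date": pd.to_datetime(["2024-01-01", "2024-01-03", "2024-01-02"]),
    }
)
blacklist = sketch_blacklist(diff)
```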

return_blacklist(xcat=None)

class MeanPanelImputer(df, xcats, cids, start=None, end=None, min_cids=None, postfix='F')

Bases: BasePanelImputer

Imputer class that fills missing values with the global cross-sectional mean. If a group has fewer than min_cids non-missing values, it is left as is.
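The rule described above can be sketched on a wide DataFrame with one row per date and one column per cross section. This is an illustration under assumptions, not the class's implementation: the function name fill_with_cross_sectional_mean and the wide layout are hypothetical.

```python
import numpy as np
import pandas as pd


def fill_with_cross_sectional_mean(wide: pd.DataFrame, min_cids: int) -> pd.DataFrame:
    """Fill NaNs on each date with that date's mean across cross sections."""
    row_means = wide.mean(axis=1)  # NaNs are skipped by default
    enough = wide.notna().sum(axis=1) >= min_cids
    # Dates with too few observations get NaN as the fill value,
    # so their missing entries stay missing.
    fill = row_means.where(enough)
    return wide.apply(lambda col: col.fillna(fill))


wide = pd.DataFrame(
    {"AUD": [1.0, np.nan], "CAD": [3.0, np.nan], "GBP": [np.nan, 5.0]},
    index=pd.to_datetime(["2024-01-01", "2024-01-02"]),
)
filled = fill_with_cross_sectional_mean(wide, min_cids=2)
```

Here the first date has two observations, so the missing GBP value becomes their mean. The second date has only one observation and is left unchanged.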

get_impute_function(group)

Implements the abstract method of BasePanelImputer. Defines the imputation technique applied to a group of values, here filling missing values with the cross-sectional mean.

class MedianPanelImputer(df, xcats, cids, start=None, end=None, min_cids=None, postfix='F')

Bases: BasePanelImputer

Imputer class that fills missing values with the global cross-sectional median. If a group has fewer than min_cids non-missing values, it is left as is.
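One reason to prefer the median is that it is less sensitive to a single extreme cross section. The short example below compares the two fill values on one hypothetical group (the values of one category on one date, indexed by cross section). The data are invented for illustration.

```python
import numpy as np
import pandas as pd

# One group: a single category on one date, with USD as an outlier
# and JPY missing.
group = pd.Series(
    {"AUD": 1.0, "CAD": 2.0, "GBP": 3.0, "USD": 100.0, "JPY": np.nan}
)

mean_filled = group.fillna(group.mean())      # JPY filled with 26.5
median_filled = group.fillna(group.median())  # JPY filled with 2.5
```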

get_impute_function(group)

Implements the abstract method of BasePanelImputer. Defines the imputation technique applied to a group of values, here filling missing values with the cross-sectional median.