macrosynergy.panel.panel_calculator

Implementation of panel calculation functions for quantamental data. The functionality allows applying mathematical operations on time-series data.

panel_calculator(df, calcs=None, cids=None, start=None, end=None, blacklist=None, external_func={})

Calculates new data panels by applying the given formulas to existing panels.

Parameters:
  • df (Dataframe) – standardized dataframe with the following required columns: ‘cid’, ‘xcat’, ‘real_date’ and ‘value’.

  • calcs (List[str]) – list of formulas denoting operations on panels of categories. Words in capital letters denote category panels. Otherwise the formulas can include numpy functions and standard binary operators. See notes below.

  • cids (List[str]) – cross sections over which the panels are defined.

  • start (str) – earliest date in ISO format. Default is None, in which case the earliest date in df is used.

  • end (str) – latest date in ISO format. Default is None, in which case the latest date in df is used.

  • blacklist (dict) – cross sections with date ranges that should be excluded from the dataframe. If one cross section has several blacklist periods, append numbers to the cross-section code.

  • external_func (dict) – dictionary of external functions to be used in the panel calculation. The key is the name of the function and the value is the function object itself, e.g. {"my_func": my_func}.

Returns:

standardized dataframe with all new categories in standard format, i.e. the columns ‘cid’, ‘xcat’, ‘real_date’ and ‘value’.

Return type:

Dataframe
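
The blacklist argument can be pictured with a small standalone sketch. The dictionary layout below (date-range tuples as values, and an assumed "_1"/"_2" suffix for a cross section with several periods) and the helper is_blacklisted are illustrative assumptions, not the library's own code.

```python
from datetime import date

# Hypothetical blacklist: cross-section code -> (start, end) of the excluded period.
# The "_1"/"_2" suffixes are an assumed way of "appending numbers" to the code.
blacklist = {
    "CHF": (date(2011, 9, 1), date(2015, 1, 31)),
    "TRY_1": (date(2000, 1, 1), date(2003, 12, 31)),
    "TRY_2": (date(2020, 1, 1), date(2021, 6, 30)),
}


def is_blacklisted(cid, day, blacklist):
    """Return True if `day` falls inside any excluded period of `cid`."""
    for key, (start, end) in blacklist.items():
        base = key.split("_")[0]  # strip the assumed numeric suffix
        if base == cid and start <= day <= end:
            return True
    return False


print(is_blacklisted("TRY", date(2002, 5, 1), blacklist))
print(is_blacklisted("TRY", date(2010, 1, 1), blacklist))
```

Splitting several periods of one cross section across separate keys keeps every value a single date range, which is why a numbered key per period is needed.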

Notes

Panel calculation strings can use numpy functions and unary/binary operators on category panels. The category is indicated by capital letters, underscores and numbers. Panel category names that are not at the beginning or end of the string must always have a space before and after the name. Calculated category and panel operations must be separated by ‘=’.

Examples:

NEWCAT = ( OLDCAT1 + 0.5) * OLDCAT2

or

NEWCAT = np.log( OLDCAT1 ) - np.abs( OLDCAT2 ) ** (1/2)

Panel calculations can also involve an individual indicator series, which is applied to all series in the panel. Such a series is marked by the prefix ‘i’, as in:

NEWCAT = OLDCAT1 - np.sqrt( iUSD_OLDCAT2 )

These strings are passed as a list of strings (calcs) to the function.
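
To illustrate what applying a single series to a whole panel means, here is a toy sketch in plain Python. It mirrors the formula OLDCAT1 - np.sqrt( iUSD_OLDCAT2 ) using dictionaries of lists as stand-ins for panels. The data and the helper subtract_sqrt_of_single are hypothetical and only show the idea, not how the library operates on dataframes.

```python
import math

# Toy panels: cross-section code -> values on a shared set of dates (made-up numbers).
oldcat1 = {"EUR": [4.0, 5.0], "USD": [9.0, 10.0], "JPY": [1.0, 2.0]}
oldcat2 = {"EUR": [1.0, 1.0], "USD": [4.0, 9.0], "JPY": [16.0, 25.0]}


def subtract_sqrt_of_single(panel, single):
    """Subtract sqrt(single) from every cross section of the panel, date by date."""
    return {
        cid: [value - math.sqrt(s) for value, s in zip(values, single)]
        for cid, values in panel.items()
    }


# "iUSD_OLDCAT2" picks the USD series of OLDCAT2 and uses it for all cross sections.
usd_series = oldcat2["USD"]
newcat = subtract_sqrt_of_single(oldcat1, usd_series)
print(newcat)
```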

If more than one new category is calculated, the resulting panels can be used sequentially in the calculations, such as:

["NEWCAT1 = 1 + OLDCAT1 / 100", "NEWCAT2 = OLDCAT2 * NEWCAT1"]

calcs = [
    "NEWCAT = OLDCAT1 + OLDCAT2",
    "NEWCAT2 = CAT_A * CAT_B - CAT_C * 0.5",
    "NEWCAT3 = OLDCAT1 - np.sqrt(iUSD_OLDCAT2)",
]

df = panel_calculator(df=df, calcs=calcs, ...)
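
The sequential use of formulas can be sketched with a toy evaluator. The function run_calcs below is hypothetical: it treats each panel as a single float rather than a dataframe, and it only demonstrates how a result on the left of ‘=’ becomes available to later formulas and how a dictionary of external functions can be exposed by name. It is not the library's implementation.

```python
def run_calcs(calcs, panels, external_func=None):
    """Evaluate formulas in order; each result is visible to later formulas."""
    namespace = dict(panels)              # existing "panels" (floats in this toy)
    namespace.update(external_func or {})  # user-supplied functions, keyed by name
    results = {}
    for calc in calcs:
        # Split "NEWCAT = expression" into the new name and the expression.
        lhs, rhs = (part.strip() for part in calc.split("=", 1))
        value = eval(rhs, {"__builtins__": {}}, namespace)
        namespace[lhs] = value
        results[lhs] = value
    return results


calcs = [
    "NEWCAT1 = 1 + OLDCAT1 / 100",
    "NEWCAT2 = OLDCAT2 * NEWCAT1",
    "NEWCAT3 = double( NEWCAT2 )",
]
results = run_calcs(
    calcs,
    panels={"OLDCAT1": 50.0, "OLDCAT2": 2.0},
    external_func={"double": lambda x: 2 * x},
)
print(results)
```

Because the formulas run in list order, NEWCAT2 can only reference NEWCAT1 if NEWCAT1 appears earlier in the list.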

time_series_check(formula, index)

Determines whether a time-series method is applied to a category in the formula. If a time-series conversion is applied, the function returns the terminal index of the respective category. A boolean flag is also returned to confirm the presence of a time-series operation.

Parameters:
  • formula (str) – panel calculation formula to check.

  • index (int) – starting index to iterate over.

Return type:

Tuple[int, bool]

is_valid_xcat(xcat_str)

Heuristic to determine if a string is a valid category (xcat). Conditions:

  • Only composed of alphanumeric characters and underscores

  • Must contain at least one uppercase letter

  • If it starts with “i”, it must be a ticker, i.e. contain an underscore

Parameters:

xcat_str (str) – The string to check.

Returns:

True if the string is a valid category (xcat), False otherwise.

Return type:

bool
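
The three conditions listed above can be written down as a short check. The function name looks_like_xcat is hypothetical and the code is a minimal sketch of the stated rules, not the library's own source.

```python
import re


def looks_like_xcat(token: str) -> bool:
    """Apply the three stated conditions to a candidate category string."""
    # 1. Only alphanumeric characters and underscores.
    if re.fullmatch(r"[A-Za-z0-9_]+", token) is None:
        return False
    # 2. At least one uppercase letter.
    if not any(ch.isupper() for ch in token):
        return False
    # 3. A leading "i" marks a single ticker, which must contain an underscore.
    if token.startswith("i") and "_" not in token:
        return False
    return True


for candidate in ["OLDCAT1", "iUSD_OLDCAT2", "iUSD", "np.log", "sqrt", "0.5"]:
    print(candidate, looks_like_xcat(candidate))
```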

xcat_isolator(calc_rhs_str)

Extracts the categories from the right-hand side (RHS) of the panel calculation formula. The function returns a list of the categories found in the RHS string.

Parameters:

calc_rhs_str (str) – right-hand side of the panel calculation formula.

Return type:

List[str]
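
One way to picture the extraction of categories from an RHS string is a tokenizer followed by a category check. The names isolate_xcats and looks_like_xcat are hypothetical, and the regular expressions are assumptions chosen for this sketch. It does not handle time-series methods attached to a category and is not the library's actual parsing logic.

```python
import re


def looks_like_xcat(token):
    """Category heuristic: word characters only, one uppercase letter, 'i' needs '_'."""
    return (
        re.fullmatch(r"[A-Za-z0-9_]+", token) is not None
        and any(ch.isupper() for ch in token)
        and not (token.startswith("i") and "_" not in token)
    )


def isolate_xcats(rhs):
    """Return the distinct category names in an RHS string, in order of appearance."""
    # Keep dots inside tokens so that "np.log" stays whole and is rejected below.
    tokens = re.findall(r"[A-Za-z0-9_.]+", rhs)
    found = []
    for token in tokens:
        if looks_like_xcat(token) and token not in found:
            found.append(token)
    return found


print(isolate_xcats("( OLDCAT1 + 0.5 ) * OLDCAT2"))
print(isolate_xcats("OLDCAT1 - np.sqrt( iUSD_OLDCAT2 )"))
```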