macrosynergy.management.utils.sparse

create_delta_data(df, return_density_stats=False, score_by='diff')

Creates a dictionary of dataframes with the changes in the information state for each ticker in the QuantamentalDataFrame. Optionally, returns a DataFrame with the statistics for change frequency, density and date range for each ticker.

Parameters:

df (QuantamentalDataFrame) – The QuantamentalDataFrame to calculate the changes for.
return_density_stats (bool) – If True, returns a DataFrame with the density stats for each ticker.
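score_by (str) – The method to use for scoring. If “diff” (default), the score is calculated based on the difference between the information state changes. If “level”, the score is calculated based on the value (‘level’) of the information state change.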

Returns:

A dictionary of DataFrames with the changes in the information state for each ticker.

Return type:

Union[Dict[str, pd.DataFrame], pd.DataFrame]

class SubscriptableMeta

Bases: type

Convenience metaclass to allow subscripting of methods on a class.
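A minimal sketch of how a metaclass can make a class subscriptable by method name. This is an illustration of the general technique only, not the library's implementation; the names MethodLookupMeta and Estimators are hypothetical.

```python
class MethodLookupMeta(type):
    """Hypothetical metaclass: Cls["name"] returns the callable attribute `name`."""

    def __getitem__(cls, name):
        method = getattr(cls, name, None)
        if not callable(method):
            raise KeyError(f"{name!r} is not a method of {cls.__name__}")
        return method


class Estimators(metaclass=MethodLookupMeta):
    """Hypothetical holder of static estimation methods."""

    @staticmethod
    def mean(values):
        return sum(values) / len(values)

    @staticmethod
    def spread(values):
        return max(values) - min(values)


chosen = Estimators["spread"]  # look a method up by its string name
result = chosen([1.0, 4.0, 2.5])
```

Subscripting the class this way lets a string argument such as std='std' be resolved to a function without a separate lookup table.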

class VolatilityEstimationMethods

Bases: object

Class to hold methods for calculating standard deviations. Each method must comply with the following signature: func(s: pd.Series, **kwargs) -> pd.Series

Currently supported methods are:

- std: Standard deviation
- abs: Mean absolute deviation
- exp: Exponentially weighted standard deviation
- exp_abs: Exponentially weighted mean absolute deviation

static std(s, min_periods, **kwargs)

Calculate the expanding standard deviation of a Series.

Parameters:

s (pd.Series) – The Series to calculate the standard deviation for.
min_periods (int) – The minimum number of periods required for the calculation.

Returns:

The standard deviation of the Series.

Return type:

pd.Series

static abs(s, min_periods, **kwargs)

Calculate the expanding mean absolute deviation of a Series.

Parameters:

s (pd.Series) – The Series to calculate the mean absolute deviation for.
min_periods (int) – The minimum number of periods required for the calculation.

Returns:

The expanding mean absolute deviation of the Series.

Return type:

pd.Series
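The two expanding estimators can be sketched with pandas as follows. This is an illustration under stated assumptions, not the library's code: the function names are hypothetical, and the mean absolute deviation is assumed to be measured around zero (the mean of absolute values), which may differ from the actual implementation.

```python
import pandas as pd


def expanding_std(s: pd.Series, min_periods: int, **kwargs) -> pd.Series:
    # Sample standard deviation over all observations up to and including each row.
    return s.expanding(min_periods=min_periods).std()


def expanding_mean_abs(s: pd.Series, min_periods: int, **kwargs) -> pd.Series:
    # Assumption: deviations are taken from zero, so this is the
    # expanding mean of the absolute values.
    return s.abs().expanding(min_periods=min_periods).mean()


changes = pd.Series([1.0, -1.0, 2.0, -2.0])
std_est = expanding_std(changes, min_periods=2)
abs_est = expanding_mean_abs(changes, min_periods=2)
```

Both functions follow the func(s: pd.Series, **kwargs) -> pd.Series signature, and rows before min_periods observations are available come out as NaN.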

static exp(s, halflife, min_periods, **kwargs)

Calculate the exponentially weighted standard deviation of a Series.

Parameters:

s (pd.Series) – The Series to calculate the exponentially weighted standard deviation for.
halflife (int) – The halflife of the exponential weighting.
min_periods (int) – The minimum number of periods required for the calculation.

Returns:

The exponentially weighted standard deviation of the Series.

Return type:

pd.Series

static exp_abs(s, halflife, min_periods, **kwargs)

Calculate the exponentially weighted mean absolute deviation of a Series.

Parameters:

s (pd.Series) – The Series to calculate the exponentially weighted mean absolute deviation for.
halflife (int) – The halflife of the exponential weighting.
min_periods (int) – The minimum number of periods required for the calculation.

Returns:

The exponentially weighted mean absolute deviation of the Series.

Return type:

pd.Series
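To make the role of halflife concrete, the sketch below computes an exponentially weighted mean of absolute values in plain Python. It uses the weighting convention that pandas documents for ewm(halflife=...) with adjust=True; whether the library uses exactly this convention is an assumption, and the function names are hypothetical.

```python
import math


def halflife_to_alpha(halflife: float) -> float:
    # Decay chosen so that an observation's weight halves after `halflife` steps.
    return 1.0 - math.exp(-math.log(2.0) / halflife)


def ew_mean_abs(values, halflife, min_periods=1):
    alpha = halflife_to_alpha(halflife)
    out = []
    for t in range(len(values)):
        if t + 1 < min_periods:
            # Not enough history yet.
            out.append(float("nan"))
            continue
        # The latest observation has weight 1, the one before (1 - alpha), and so on.
        weights = [(1.0 - alpha) ** (t - i) for i in range(t + 1)]
        weighted = sum(w * abs(v) for w, v in zip(weights, values[: t + 1]))
        out.append(weighted / sum(weights))
    return out


estimates = ew_mean_abs([1.0, -3.0, 2.0], halflife=1, min_periods=2)
```

With halflife=1 the decay factor is 0.5, so the second estimate is (3 + 0.5 * 1) / 1.5.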

calculate_score_on_sparse_indicator(isc, std='std', halflife=None, min_periods=10, isc_version=0, iis=False, custom_method=None, custom_method_kwargs={}, volatility_forecast=True)

Calculate the score on a sparse indicator.

Parameters:

isc (Dict[str, pd.DataFrame]) – A dictionary of DataFrames with the changes in the information state for each ticker.
std (str) – The method to use for calculating the standard deviation. Supported methods are std, abs, exp and exp_abs. See the documentation for VolatilityEstimationMethods for more information.
halflife (int) – The halflife of the exponential weighting. Only used with exp and exp_abs methods. Default is None.
min_periods (int) – The minimum number of periods required for the calculation. Default is 10.
isc_version (int) – The version of the information state changes to use. If set to 0 (default), only the first version is used. If set to any other positive integer, all versions are used.
iis (bool) – If True, zn-scores are also calculated for the initial sample period defined by min_periods, on an in-sample basis, to avoid losing history. Default is False.
custom_method (Callable) – A custom method to use for calculating the standard deviation. Must have the signature custom_method(s: pd.Series, **kwargs) -> pd.Series.
custom_method_kwargs (Dict) – Keyword arguments to pass to the custom method.
volatility_forecast (bool) – If True (default), the volatility forecast is shifted one period forward to align with the information state changes.

Returns:

A dictionary of DataFrames with the changes in the information state for each ticker, including the calculated scores.

Return type:

Dict[str, pd.DataFrame]
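The interplay of custom_method and volatility_forecast can be sketched as below. This is a simplified illustration, not the library's scoring code: rolling_vol is a hypothetical custom method, and the reading that the shift makes each change use only volatility known before that release is an assumption.

```python
import pandas as pd


def rolling_vol(s: pd.Series, window: int = 3, **kwargs) -> pd.Series:
    """Hypothetical custom method: rolling sample standard deviation."""
    return s.rolling(window, min_periods=window).std()


dates = pd.to_datetime(
    ["2024-01-31", "2024-02-29", "2024-03-29", "2024-04-30", "2024-05-31"]
)
changes = pd.Series([0.5, -0.2, 0.1, 0.4, -0.3], index=dates)

vol = rolling_vol(changes, window=3)
# Shift by one release so each change is scaled by volatility
# estimated from earlier releases only.
vol_forecast = vol.shift(1)
zscore = changes / vol_forecast
```

Any callable with the signature custom_method(s: pd.Series, **kwargs) -> pd.Series could take the place of rolling_vol here.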

infer_frequency(df)

Infer the frequency of a QuantamentalDataFrame based on the most common eop_lag values.

Parameters:

df (QuantamentalDataFrame) – The QuantamentalDataFrame to infer the frequency for.

Returns:

A Series with the inferred frequency for each ticker in the QuantamentalDataFrame.

Return type:

pd.Series

weight_from_frequency(freq, base=252)

Weight from frequency.

sparse_to_dense(isc, value_column, min_period, max_period, postfix=None, metrics=['eop', 'grading'], thresh=None)

Convert a dictionary of DataFrames with changes in the information state to a dense DataFrame (QuantamentalDataFrame).

Parameters:

isc (Dict[str, pd.DataFrame]) – A dictionary of DataFrames with the changes in the information state for each ticker.
value_column (str) – The name of the column to use as the value.
min_period (pd.Timestamp) – The minimum period to include in the DataFrame.
max_period (pd.Timestamp) – The maximum period to include in the DataFrame.
postfix (str) – A postfix to append to the xcat column. Default is None.
metrics (Optional[List[str]]) – A list of metrics to include in the DataFrame. Default is [“eop”, “grading”]. Use metrics=None to include all available (non-value) metrics; use metrics=[] to include none (the value column only).
thresh (Union[Tuple[float, float], float]) – A float or a tuple of two floats to winsorise the data to. Default is None. If a single float is provided, it is used for both lower and upper bounds, as (-thresh, thresh). If a tuple is provided, it is used as (thresh[0], thresh[1]).

Returns:

A DataFrame with the dense information state.

Return type:

pd.DataFrame
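The thresh convention described above (a single float meaning symmetric bounds, a tuple meaning explicit bounds) can be sketched in plain Python. The helper names are hypothetical and the sketch works on a list of floats rather than a DataFrame.

```python
from typing import List, Tuple, Union

Thresh = Union[float, Tuple[float, float]]


def normalise_thresh(thresh: Thresh) -> Tuple[float, float]:
    # A tuple is used as (lower, upper); a scalar becomes (-thresh, thresh).
    if isinstance(thresh, tuple):
        return (float(thresh[0]), float(thresh[1]))
    return (-float(thresh), float(thresh))


def winsorise(values: List[float], thresh: Thresh) -> List[float]:
    lower, upper = normalise_thresh(thresh)
    # Clip every value into the closed interval [lower, upper].
    return [min(max(v, lower), upper) for v in values]


clipped_sym = winsorise([-5.0, 0.3, 4.2], 3.0)
clipped_asym = winsorise([-5.0, 0.3, 4.2], (-1.0, 2.0))
```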

temporal_aggregator_exponential(df, halflife=5, winsorise=None)

Temporal aggregator using exponential moving average.

Parameters:

df (QuantamentalDataFrame) – The QuantamentalDataFrame to aggregate.
halflife (int) – The halflife of the exponential moving average.
winsorise (float) – The value to winsorise the data to. Default is None.

Returns:

A QuantamentalDataFrame with the aggregated values.

Return type:

QuantamentalDataFrame

temporal_aggregator_mean(df, window=21, winsorise=None)

Temporal aggregator using a rolling mean.

Parameters:

df (QuantamentalDataFrame) – The QuantamentalDataFrame to aggregate.
window (int) – The window size for the rolling mean.
winsorise (float) – The value to winsorise the data to. Default is None.

Returns:

A QuantamentalDataFrame with the aggregated values.

Return type:

QuantamentalDataFrame
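The two aggregation styles (exponential moving average and rolling mean) can be sketched with pandas on a small long-format frame. This is an illustration only: the column layout (real_date, cid, xcat, value) is assumed, the new column names are hypothetical, and the library functions may differ in detail, for example in how they apply winsorisation.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "real_date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"] * 2),
        "cid": ["USD"] * 3 + ["GBP"] * 3,
        "xcat": ["GDP"] * 6,
        "value": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    }
)

# Aggregate each (cid, xcat) series separately, in date order.
grouped = df.sort_values("real_date").groupby(["cid", "xcat"])["value"]

# Rolling mean over a fixed window of observations.
df["rolling_mean"] = grouped.transform(lambda s: s.rolling(window=2).mean())

# Exponential moving average with a given halflife.
df["ewm_mean"] = grouped.transform(lambda s: s.ewm(halflife=1).mean())
```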

temporal_aggregator_period(isc, start, end, winsorise=10, postfix='_NCSUM')

Temporal aggregator over periods of changes in the information state.

Parameters:

isc (Dict[str, pd.DataFrame]) – A dictionary of DataFrames with the changes in the information state for each ticker.
start (pd.Timestamp) – The start date of the period to aggregate.
end (pd.Timestamp) – The end date of the period to aggregate.
winsorise (int) – The value to winsorise the data to. Default is 10.
postfix (str) – A postfix to append to the xcat column. Default is “_NCSUM”.

Returns:

A QuantamentalDataFrame with the aggregated values.

Return type:

QuantamentalDataFrame

class InformationStateChanges(min_period=None, max_period=None)

Bases: object

Class to hold information state changes for a set of tickers. InformationStateChanges show only data releases where there is an update in the indicator’s value, grading or eop_lag. This offers a more compact representation of the data, where only releases which add information are retained.
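The idea of retaining only releases that add information can be sketched in plain Python. This is an illustration, not the library's algorithm: the Release type is hypothetical, and the sketch compares the end-of-period date (eop) instead of the raw eop_lag, which is an assumption made to keep the example small.

```python
from typing import List, NamedTuple


class Release(NamedTuple):
    real_date: str
    value: float
    grading: float
    eop: str  # end of the observation period the value refers to


def information_state_changes(rows: List[Release]) -> List[Release]:
    changes = []
    previous = None
    for row in rows:
        state = (row.value, row.grading, row.eop)
        # Keep a row only if value, grading or eop differs from the last kept state.
        if state != previous:
            changes.append(row)
            previous = state
    return changes


daily = [
    Release("2024-01-29", 2.1, 1.0, "2023-09-30"),
    Release("2024-01-30", 2.1, 1.0, "2023-09-30"),
    Release("2024-01-31", 2.4, 1.0, "2023-12-31"),
    Release("2024-02-01", 2.4, 1.0, "2023-12-31"),
    Release("2024-02-02", 2.5, 1.0, "2023-12-31"),
]
sparse = information_state_changes(daily)
```

Five daily rows collapse to three: the first observation, the new period's release, and a later revision of the same period.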

Initialize using the from_qdf class method to create an InformationStateChanges object from a QuantamentalDataFrame. The calculate_score method can be used to calculate scores for the information state changes.

Example initialization:

from macrosynergy.download import JPMaQSDownload
from macrosynergy.management import InformationStateChanges

tickers = ["USD_GDPPC_SA", "GBP_GDPPC_SA"]

with JPMaQSDownload(client_id="cl_id", client_secret="cl_secret") as jpmaqs:
    df = jpmaqs.download(tickers=tickers, metrics="all")

isc = InformationStateChanges.from_qdf(df)
usd_gdppc_isc = isc["USD_GDPPC_SA"]

Parameters:

min_period (pd.Timestamp) – The minimum period to include in the InformationStateChanges object.
max_period (pd.Timestamp) – The maximum period to include in the InformationStateChanges object.

Note

Instantiate using the from_qdf or from_isc_df class methods. This class is subscriptable, i.e. isc["ticker"] will return the DataFrame for the given ticker.

keys()

The tickers in the InformationStateChanges object.

Returns:

A view of the tickers in the InformationStateChanges object.

Return type:

KeysView

values()

Extract the DataFrames from the InformationStateChanges object.

Returns:

A view of the DataFrames in the InformationStateChanges object.

Return type:

ValuesView

items()

Iterate through (ticker, DataFrame) pairs in the InformationStateChanges object.

Returns:

A view of the (ticker, DataFrame) pairs in the InformationStateChanges object.

Return type:

ItemsView
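The dictionary-like access pattern (subscripting plus keys, values and items views) can be sketched with the standard library's Mapping base class. TickerStore is a hypothetical stand-in, not the InformationStateChanges class, and plain lists stand in for the per-ticker DataFrames.

```python
from collections.abc import KeysView, Mapping


class TickerStore(Mapping):
    """Hypothetical read-only container keyed by ticker."""

    def __init__(self, frames):
        self._frames = dict(frames)

    def __getitem__(self, ticker):
        return self._frames[ticker]

    def __iter__(self):
        return iter(self._frames)

    def __len__(self):
        return len(self._frames)


store = TickerStore({"USD_GDPPC_SA": [2.1, 2.4], "GBP_GDPPC_SA": [0.3]})

tickers = list(store.keys())      # keys() returns a KeysView
pairs = dict(store.items())       # items() yields (ticker, data) pairs
usd = store["USD_GDPPC_SA"]       # subscript access by ticker
```

Implementing only __getitem__, __iter__ and __len__ is enough here, because Mapping supplies keys, values and items as views.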

classmethod from_qdf(df, norm=True, annualize_by_release_frequency=None, score_by='diff', zscore_freq_window=3, zscore_freqs_allowed=('D', 'W', 'M', 'Q', 'A'), **kwargs)

Create an InformationStateChanges object from a QuantamentalDataFrame.

Parameters:

df (QuantamentalDataFrame) – The QuantamentalDataFrame to create the InformationStateChanges object from. This dataframe must contain a value column. Additionally, the eop_lag column is required to calculate the correct eop and version information. If not provided, the information state is assumed to be based on the value only. The grading column is optional and will be preserved in the output if provided.
norm (bool) – If True, calculate the score for the information state changes.
annualize_by_release_frequency (bool) – If True, annualize the score by the inferred release frequency. Default is None, where it follows the behaviour of norm (i.e. annualize_by_release_frequency is set to True if norm is True and False otherwise).
score_by (str) – The method to use for scoring. If “diff” (default), the score is calculated based on the difference between the information state changes. If “level”, the score is calculated based on the value (‘level’) of the information state change.
zscore_freq_window (int) – The rolling-median window passed to infer_release_frequency as part of annualize_by_release_frequency. Default is 3.
zscore_freqs_allowed (Tuple[str, ...]) – The candidate frequency labels for infer_release_frequency as part of annualize_by_release_frequency. Default is (“D”, “W”, “M”, “Q”, “A”).
**kwargs (Any) – Additional keyword arguments to pass to the calculate_score method. Please refer to InformationStateChanges.calculate_score() for more information.

Returns:

An InformationStateChanges object.

Return type:

InformationStateChanges

classmethod from_isc_df(df, ticker, value_column='value', eop_column='eop', grading_column='grading', real_date_column='real_date', norm=True, **kwargs)

Create an InformationStateChanges object from a DataFrame.

Parameters:

df (pd.DataFrame) – The DataFrame to create the InformationStateChanges object from.
ticker (str) – The ticker to create the InformationStateChanges object for.
value_column (str) – The name of the column to use as the value.
eop_column (str) – The name of the column to use as the end of period date.
grading_column (str) – The name of the column to use as the grading.
real_date_column (str) – The name of the column to use as the real date.
norm (bool) – If True, calculate the score for the information state changes.
**kwargs (Any) – Additional keyword arguments to pass to the calculate_score method. Please refer to InformationStateChanges.calculate_score() for more information.

Returns:

An InformationStateChanges object.

Return type:

InformationStateChanges

to_qdf(value_column='value', postfix=None, metrics=['eop', 'grading'], thresh=None)

Convert the InformationStateChanges object to a QuantamentalDataFrame.

Parameters:

value_column (str) – The name of the column to use as the value.
postfix (str) – A postfix to append to the xcat column. Default is None.
metrics (List[str]) – A list of metrics to include in the DataFrame. Default is [“eop”, “grading”]. Use metrics=None to include all available (non-value) metrics; use metrics=[] to include none (the value column only).
thresh (Union[Tuple[float, float], float]) – A float or a tuple of two floats to winsorise the data to. Default is None. If a single float is provided, it is used for both lower and upper bounds, as (-thresh, thresh). If a tuple is provided, it is used as (thresh[0], thresh[1]).

Returns:

A DataFrame with the information state changes.

Return type:

pd.DataFrame

annualize_by_release_frequency(zscore_freq_window=3, zscore_freqs_allowed=('D', 'W', 'M', 'Q', 'A'), thresh=None)

Annualize each value by a time-varying weight inferred from its release cadence.

Multiplies each value by sqrt(1 / ANNUALIZATION_FACTORS[freq]), where freq is the contemporaneous release frequency inferred per observation from the eop cadence (see infer_release_frequency()). The weight is time-varying: a series whose cadence changes (e.g. quarterly -> monthly) is weighted quarterly before the break and monthly after it.
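The weighting formula can be illustrated with a short plain-Python sketch. The factor values in ASSUMED_FACTORS are an assumption (releases per year for each label) and are not taken from the library's ANNUALIZATION_FACTORS; the helper name release_weight is hypothetical.

```python
import math

# Assumed releases per year for each frequency label (illustrative values only).
ASSUMED_FACTORS = {"D": 252, "W": 52, "M": 12, "Q": 4, "A": 1}


def release_weight(freq: str) -> float:
    # Weight applied to a value released at the given frequency.
    return math.sqrt(1.0 / ASSUMED_FACTORS[freq])


# A series whose cadence changes from quarterly to monthly:
scores = [1.0, -2.0, 0.5, 1.5]
freqs = ["Q", "Q", "M", "M"]

# The weight is looked up per observation, so it varies over time.
weighted = [z * release_weight(f) for z, f in zip(scores, freqs)]
```

Under these assumed factors a quarterly release is weighted by 0.5 and a monthly release by roughly 0.29, so more frequent releases contribute less per observation.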

Parameters:

zscore_freq_window (int) – The rolling-median window passed to infer_release_frequency. Default is 3.
zscore_freqs_allowed (Tuple[str, ...]) – The candidate frequency labels. Default is (“D”, “W”, “M”, “Q”, “A”).
thresh (Union[Tuple[float, float], float]) – Winsorise the zscore before weighting. Default None (no winsorisation). A scalar clips to (-thresh, thresh); a tuple clips to its (min, max), so order does not matter.

Notes

Tickers without a zscore column, or whose release frequency cannot be inferred (fewer than two distinct eop dates), are warned about and skipped rather than raising - so this stays safe on the default from_qdf(norm=True) path even when some tickers have too few releases to weight.

Return type:

QuantamentalDataFrame

to_dict(ticker)

Return type:

Dict[str, Union[List[Tuple[str, float, str, float]], Tuple[str, str, str], str]]

to_json(ticker)

Return type:

str

get_releases(from_date=Timestamp('2026-07-09 00:00:00'), to_date=Timestamp('2026-07-10 00:00:00'), excl_xcats=None, latest_only=True)

Get the latest releases for the InformationStateChanges object.

Parameters:

from_date (pd.Timestamp) – The start date of the period to get releases for.
to_date (pd.Timestamp) – The end date of the period to get releases for.
excl_xcats (List[str]) – A list of xcats to exclude from the releases.
latest_only (bool) – If True, only the latest release for each ticker is returned. Default is True.

Returns:

A DataFrame with the latest releases for each ticker. If latest_only is False, all releases within the date range are returned.

Return type:

pd.DataFrame
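The effect of latest_only can be sketched in plain Python over (ticker, real_date, value) tuples. This is an illustration, not the method's implementation: select_releases is a hypothetical helper, and treating both date bounds as inclusive is an assumption.

```python
from datetime import date


def select_releases(rows, from_date, to_date, latest_only=True):
    """rows: iterable of (ticker, real_date, value) tuples."""
    # Assumption: both bounds are inclusive.
    in_range = [r for r in rows if from_date <= r[1] <= to_date]
    if not latest_only:
        return in_range
    latest = {}
    for row in in_range:
        ticker, real_date = row[0], row[1]
        # Keep the most recent release seen so far for each ticker.
        if ticker not in latest or real_date > latest[ticker][1]:
            latest[ticker] = row
    return list(latest.values())


rows = [
    ("USD_GDPPC_SA", date(2024, 1, 25), 2.1),
    ("USD_GDPPC_SA", date(2024, 1, 30), 2.4),
    ("GBP_GDPPC_SA", date(2024, 1, 29), 0.3),
    ("GBP_GDPPC_SA", date(2024, 2, 15), 0.1),
]
latest = select_releases(rows, date(2024, 1, 24), date(2024, 1, 31))
everything = select_releases(
    rows, date(2024, 1, 24), date(2024, 1, 31), latest_only=False
)
```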

temporal_aggregator_period(winsorise=10, start=None, end=None)

Temporal aggregator over periods of changes in the information state.

Parameters:

winsorise (int) – The value to winsorise the data to. Default is 10.
start (pd.Timestamp) – The start date of the period to aggregate.
end (pd.Timestamp) – The end date of the period to aggregate.

Returns:

A QuantamentalDataFrame with the aggregated values.

Return type:

QuantamentalDataFrame

calculate_score(std='std', halflife=None, min_periods=10, isc_version=0, iis=False, custom_method=None, custom_method_kwargs={}, volatility_forecast=True, score_by='diff')

Calculate score on sparse indicator for the InformationStateChanges object.

Parameters:

std (str) – The method to use for calculating the standard deviation. Supported methods are std, abs, exp and exp_abs. See the documentation for VolatilityEstimationMethods for more information.
halflife (int) – The halflife of the exponential weighting. Only used with exp and exp_abs methods. Default is None.
min_periods (int) – The minimum number of periods required for the calculation. Default is 10.
isc_version (int) – The version of the information state changes to use. If set to 0 (default), only the first version is used. If set to any other positive integer, all versions are used.
iis (bool) – If True, zn-scores are also calculated for the initial sample period defined by min_periods, on an in-sample basis, to avoid losing history. Default is False.
custom_method (Callable) – A custom method to use for calculating the standard deviation. Must have the signature custom_method(s: pd.Series, **kwargs) -> pd.Series.
custom_method_kwargs (Dict) – Keyword arguments to pass to the custom method.
volatility_forecast (bool) – If True (default), the volatility forecast is shifted one period forward to align with the information state changes.
score_by (str) – The method to use for scoring. If “diff” (default), the score is calculated based on the difference between the information state changes. If “level”, the score is calculated based on the value (‘level’) of the information state change.

Returns:

The InformationStateChanges object with the scores.

Return type:

InformationStateChanges