macrosynergy.management.types.qdf.methods

Module hosting custom types and meta-classes for use with Quantamental DataFrames.

get_col_sort_order(df)

Get the column names of a QuantamentalDataFrame in a consistent sort order.

Parameters: df (QuantamentalDataFrame) – DataFrame to return the sorted columns of.
Returns: List of sorted column names.
Return type: List[str]

change_column_format(df, cols, dtype)

Change the format of columns in a DataFrame.

Parameters:

df (QuantamentalDataFrame) – DataFrame to change the format of.
cols (List[str]) – List of column names to change the format of.
dtype (Any) – Data type to change the columns to.

Returns:

DataFrame with the columns changed to the specified format

Return type:

QuantamentalDataFrame

Raises:

TypeError – If df is not a QuantamentalDataFrame.
TypeError – If cols is not a list of strings.
ValueError – If a column in cols is not found in the DataFrame.

qdf_to_categorical(df)

Convert the index columns (“cid”, “xcat”) of a DataFrame to categorical format.

Parameters: df (QuantamentalDataFrame) – DataFrame to convert the index columns of.
Raises: TypeError – If df is not a QuantamentalDataFrame.
Returns: DataFrame with the index columns converted to categorical format.
Return type: QuantamentalDataFrame

qdf_to_string_index(df)

Convert the index columns (“cid”, “xcat”) of a DataFrame to string format.

Parameters: df (QuantamentalDataFrame) – DataFrame to convert the index columns of.
Raises: TypeError – If df is not a QuantamentalDataFrame.
Returns: DataFrame with the index columns converted to string format.
Return type: QuantamentalDataFrame

check_is_categorical(df)

Check if the index columns of a DataFrame are categorical.

Parameters: df (QuantamentalDataFrame) – DataFrame to check the index columns of.
Returns: True if the required index columns (“cid”, “xcat”) are categorical, False otherwise.
Return type: bool
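
The three helpers above deal with how the “cid” and “xcat” columns are stored. The sketch below is not the library code. It uses plain pandas on a made-up frame to show roughly what a categorical conversion, the reverse conversion, and the check amount to. The function names ending in `_sketch` and the `real_date` column name are assumptions for illustration only.

```python
import pandas as pd

INDEX_COLS = ("cid", "xcat")

# Made-up long-format data; "real_date" is an assumed column name.
df = pd.DataFrame(
    {
        "real_date": pd.to_datetime(["2020-01-01", "2020-01-02"] * 2),
        "cid": ["USD", "USD", "EUR", "EUR"],
        "xcat": ["FXXR", "FXXR", "FXXR", "FXXR"],
        "value": [0.1, 0.2, 0.3, 0.4],
    }
)


def to_categorical_sketch(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with "cid" and "xcat" stored as pandas categoricals."""
    out = frame.copy()
    for col in INDEX_COLS:
        out[col] = out[col].astype("category")
    return out


def to_string_sketch(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with "cid" and "xcat" stored as plain strings."""
    out = frame.copy()
    for col in INDEX_COLS:
        out[col] = out[col].astype(str)
    return out


def is_categorical_sketch(frame: pd.DataFrame) -> bool:
    """True only if both index columns have a categorical dtype."""
    return all(
        isinstance(frame[col].dtype, pd.CategoricalDtype) for col in INDEX_COLS
    )


cat_df = to_categorical_sketch(df)
str_df = to_string_sketch(cat_df)
```

Categorical storage keeps one copy of each distinct label plus integer codes, which is typically why long frames with few distinct cids and xcats are kept in this form.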

apply_blacklist(df, blacklist)

Apply a blacklist to a list of cids and xcats. The blacklisted data ranges are removed from the DataFrame. This is useful for removing data that is known to be incorrect or unreliable.

Parameters:

df (QuantamentalDataFrame) – DataFrame to apply the blacklist to.
blacklist (dict) –
Dictionary with keys as cids and values as a list of start and end dates to blacklist. Example:
```
{"cid": ["2020-01-01", "2020-12-31"]}
```
This can be extended to cover multiple periods for the same cid by appending an additional label to the end of the cid key. Example:
```
{
    "usd_1": ["2020-01-01", "2020-12-31"],
    "usd_2": ["2020-01-01", "2020-12-31"],
    "eur": ["2020-01-01", "2020-12-31"],
}
```

Returns:

DataFrame with the blacklist applied.

Return type:

QuantamentalDataFrame
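
To make the blacklist format concrete, here is a minimal standard-library sketch of how such a dictionary could be interpreted. It is not the library implementation. It assumes that the cid is the part of the key before the first underscore and that both range endpoints are inclusive. The helper name `is_blacklisted`, the sample rows, and the date ranges are made up for illustration.

```python
from datetime import date

# Keys are cids, optionally followed by a label such as "_1";
# values are [start, end] date strings.
blacklist = {
    "usd_1": ["2020-01-01", "2020-03-31"],
    "usd_2": ["2020-10-01", "2020-12-31"],
    "eur": ["2020-01-01", "2020-12-31"],
}


def is_blacklisted(cid: str, day: date, blacklist: dict) -> bool:
    """Return True if `day` falls inside any blacklisted range for `cid`.

    Assumption: the label after the first underscore only distinguishes
    multiple periods for the same cid and is otherwise ignored.
    """
    for key, (start, end) in blacklist.items():
        key_cid = key.split("_")[0]
        in_range = date.fromisoformat(start) <= day <= date.fromisoformat(end)
        if key_cid == cid and in_range:
            return True
    return False


# Hypothetical (cid, date) observations.
rows = [
    ("usd", date(2020, 2, 15)),
    ("usd", date(2020, 6, 15)),
    ("usd", date(2020, 11, 15)),
    ("eur", date(2020, 6, 15)),
    ("gbp", date(2020, 6, 15)),
]

# Keep only observations outside every blacklisted range.
kept = [row for row in rows if not is_blacklisted(row[0], row[1], blacklist)]
```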

reduce_df(df, cids=None, xcats=None, start=None, end=None, blacklist=None, out_all=False, intersect=False)

Filter DataFrame by cids, xcats, and start & end dates.

Parameters:

df (QuantamentalDataFrameBase) – The DataFrame to be filtered.
cids (Optional[List[str]], optional) – List of cid values to filter by. If None, all cid values are included.
xcats (Optional[List[str]], optional) – List of xcat values to filter by. If None, all xcat values are included.
start (Optional[str], optional) – Start date for filtering. If None, no start date filtering is applied.
end (Optional[str], optional) – End date for filtering. If None, no end date filtering is applied.
blacklist (dict, optional) – Dictionary specifying blacklist criteria. If None, no blacklist filtering is applied.
out_all (bool, optional) – If True, returns the filtered DataFrame along with the lists of xcats and cids; i.e. (df, xcats, cids).
intersect (bool, optional) – If True, only includes cid values that are present for all xcat values.

Returns:

The filtered DataFrame. If out_all is True, also returns the lists of xcats and cids.

Return type:

Union[QuantamentalDataFrameBase, Tuple[QuantamentalDataFrameBase, List[str], List[str]]]
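
The `intersect` option is the least obvious of the filters above. The following standard-library sketch shows one way to read it: keep only the cids that have data for every requested xcat. The helper name `intersect_cids` and the sample pairs are hypothetical, and the real function operates on a DataFrame rather than on a list of tuples.

```python
# Hypothetical (cid, xcat) pairs for which data exists.
available = [
    ("USD", "FXXR"), ("USD", "EQXR"),
    ("EUR", "FXXR"), ("EUR", "EQXR"),
    ("GBP", "FXXR"),
]


def intersect_cids(pairs, xcats):
    """Return the sorted cids that appear with every xcat in `xcats`."""
    cids_per_xcat = [
        {cid for cid, xcat in pairs if xcat == wanted} for wanted in xcats
    ]
    if not cids_per_xcat:
        return []
    return sorted(set.intersection(*cids_per_xcat))


# GBP has no "EQXR" data, so it drops out of the intersection.
common = intersect_cids(available, ["FXXR", "EQXR"])
```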

reduce_df_by_ticker(df, tickers, start=None, end=None, blacklist=None)

Filters the given QuantamentalDataFrameBase based on tickers, date range, and blacklist.

Parameters:

df (QuantamentalDataFrameBase) – The DataFrame to be filtered.
tickers (List[str]) – List of tickers to filter by.
start (Optional[str], optional) – Start date for filtering. If None, no start date filtering is applied.
end (Optional[str], optional) – End date for filtering. If None, no end date filtering is applied.
blacklist (dict, optional) – Dictionary specifying blacklist criteria. If None, no blacklist filtering is applied.

Raises:

TypeError – If df is not a QuantamentalDataFrame.

Returns:

The filtered DataFrame.

Return type:

QuantamentalDataFrameBase

update_df(df, df_add, xcat_replace=False)

Append a standard DataFrame to a standard base DataFrame with ticker replacement on the intersection.

Parameters:

df (QuantamentalDataFrame) – Base DataFrame to append to.
df_add (QuantamentalDataFrame) – DataFrame to append.
xcat_replace (bool, optional) – If True, replace the xcats in the base DataFrame with the xcats in the DataFrame to append. Default is False.

Return type:

QuantamentalDataFrameBase

update_tickers(df, df_add)

Update the aggregate DataFrame at the ticker level.

Parameters:

df (pd.DataFrame) – DataFrame to update.
df_add (pd.DataFrame) – DataFrame to add to the base DataFrame.

Returns:

Updated DataFrame.

Return type:

QuantamentalDataFrame

update_categories(df, df_add)

Update the DataFrame at the category level.

Return type: QuantamentalDataFrameBase

qdf_to_wide_df(df, value_column='value')

Pivot the DataFrame to wide format in a memory-efficient way.

Return type: DataFrame
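
For readers unfamiliar with the long and wide layouts, the plain-pandas sketch below shows the general idea of such a pivot. It makes no claim about how the library achieves memory efficiency. The `real_date` column name, the `cid_xcat` ticker format, and the sample values are assumptions for illustration.

```python
import pandas as pd

# Made-up long-format data: one row per (date, cid, xcat).
long_df = pd.DataFrame(
    {
        "real_date": pd.to_datetime(
            ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02"]
        ),
        "cid": ["USD", "EUR", "USD", "EUR"],
        "xcat": ["FXXR"] * 4,
        "value": [0.1, 0.3, 0.2, 0.4],
    }
)

# Assumed ticker convention: "<cid>_<xcat>".
long_df["ticker"] = long_df["cid"] + "_" + long_df["xcat"]

# Wide format: one row per date, one column per ticker.
wide_df = long_df.pivot(index="real_date", columns="ticker", values="value")
```

Passing a different column name as `values` corresponds to choosing another metric through `value_column`.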

add_ticker_column(df)

Get the list of tickers from the DataFrame.

Parameters: df (QuantamentalDataFrame) – DataFrame to extract the tickers from.
Raises: TypeError – If df is not a QuantamentalDataFrame.
Returns: List of tickers.
Return type: List[str]

rename_xcats(df, xcat_map=None, select_xcats=None, postfix=None, prefix=None, name_all=None, fmt_string=None)

Rename the xcats in a DataFrame based on a mapping or a format string. Only one of xcat_map and select_xcats may be provided. If name_all is provided, all xcats will be renamed to this value.

NOTE: This function maintains the datatype of the xcat column as a categorical.

Parameters:

df (QuantamentalDataFrame) – DataFrame to rename the xcats in.
xcat_map (dict, optional) – Dictionary mapping the old xcats to new xcats. Default is None.
select_xcats (List[str], optional) – List of xcats to rename. Default is None.
postfix (str, optional) – Postfix to add to the xcats. Default is None.
prefix (str, optional) – Prefix to add to the xcats. Default is None.
name_all (str, optional) – Name to rename all xcats to. Default is None.
fmt_string (str, optional) – Format string to rename xcats. Default is None.

Raises:

TypeError – If df is not a QuantamentalDataFrame.
ValueError – If both xcat_map and select_xcats are provided.
TypeError – If xcat_map is not a dictionary with string keys and values.
ValueError – If none of postfix, prefix, name_all, or fmt_string is provided.
ValueError – If fmt_string does not contain exactly one pair of curly braces.

Returns:

DataFrame with the xcats renamed.

Return type:

QuantamentalDataFrame
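
The renaming options can be pictured as different ways of building an old-to-new mapping. The sketch below is a standard-library illustration under that assumption, not the library's code. The helper name `build_xcat_map` and the sample xcat names are hypothetical, and only the prefix, postfix, and fmt_string cases are covered.

```python
def build_xcat_map(xcats, prefix=None, postfix=None, fmt_string=None):
    """Build a {old_xcat: new_xcat} mapping from the naming options.

    Assumption: fmt_string takes precedence when given, and must contain
    exactly one "{}" placeholder, which receives the old xcat name.
    """
    if fmt_string is not None:
        if fmt_string.count("{}") != 1:
            raise ValueError(
                "fmt_string must contain exactly one pair of curly braces."
            )
        return {xcat: fmt_string.format(xcat) for xcat in xcats}
    return {xcat: f"{prefix or ''}{xcat}{postfix or ''}" for xcat in xcats}


fmt_map = build_xcat_map(["FXXR", "EQXR"], fmt_string="{}_NSA")
affix_map = build_xcat_map(["FXXR"], prefix="ADJ_", postfix="_VT10")
```

A mapping like this has the same shape as the xcat_map argument described above: a dictionary with string keys and string values.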

create_empty_categorical_qdf(cid=None, xcat=None, ticker=None, metrics=['value'], date_range=None, start=None, end=None, categorical=True)

Create an empty QuantamentalDataFrame with categorical index columns. This is useful for creating a DataFrame for a given ticker with the required metrics. The ticker can be specified using cid and xcat or directly using ticker. The date range can be specified using date_range or start and end.

Parameters:

cid (str, optional) – cid value to use. Must be passed with xcat. Default is None.
xcat (str, optional) – xcat value to use. Must be passed with cid. Default is None.
ticker (str, optional) – Ticker to use. Must not be passed with cid and xcat. Default is None.
metrics (List[str], optional) – List of metrics to create columns for. Default is [“value”].
date_range (pd.DatetimeIndex, optional) – Date range to create the DataFrame for. Must not be passed with start and end. Default is None.
start (str, optional) – Start date for the DataFrame. Default is None.
end (str, optional) – End date for the DataFrame. Default is None.

Raises:

TypeError – If metrics is not a list of strings.
ValueError – If date_range is None and start and end are not provided.
ValueError – If only one of cid and xcat is provided.
ValueError – If ticker is provided with cid and xcat.

Returns:

Empty DataFrame with the required index columns and metrics.

Return type:

QuantamentalDataFrame

add_nan_series(df, ticker=None, cid=None, xcat=None, start=None, end=None)

Add a NaN series to the DataFrame for a given ticker.

Parameters:

df (QuantamentalDataFrame) – DataFrame to add the NaN series to.
ticker (str, optional) – Ticker to add the NaN series for. Must not be passed with cid and xcat. Default is None.
cid (str, optional) – cid value to use. Must be passed with xcat. Default is None.
xcat (str, optional) – xcat value to use. Must be passed with cid. Default is None.
start (str or pd.Timestamp, optional) – Start date for the NaN series. Default is None.
end (str or pd.Timestamp, optional) – End date for the NaN series. Default is None.

Raises:

TypeError – If df is not a QuantamentalDataFrame.
ValueError – If ticker is provided with cid and xcat.

Returns:

DataFrame with the NaN series added.

Return type:

QuantamentalDataFrame

drop_nan_series(df, column='value', raise_warning=False)

Drops any series that are entirely NaN. If raise_warning is True, a user warning is issued when any series are dropped.

Parameters:

df (QuantamentalDataFrame) – DataFrame to drop the NaN series from.
column (str, optional) – Column to check for NaNs. Default is “value”.
raise_warning (bool, optional) – If True, raises a warning if any series are dropped. Default is False.

Raises:

TypeError – If df is not a QuantamentalDataFrame.
ValueError – If column is not found in the DataFrame.

Returns:

DataFrame with the NaN series dropped.

Return type:

QuantamentalDataFrame
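
The behaviour described above can be sketched with the standard library alone. This is an illustration of the idea, not the library implementation: series are represented as a plain dictionary from ticker to a list of floats, and the helper name `drop_all_nan_series` and the sample tickers are hypothetical.

```python
import math
import warnings


def drop_all_nan_series(series_by_ticker, raise_warning=False):
    """Return only the series that contain at least one non-NaN value.

    If `raise_warning` is True and anything was dropped, a UserWarning
    naming the dropped tickers is issued.
    """
    kept = {
        ticker: values
        for ticker, values in series_by_ticker.items()
        if not all(math.isnan(v) for v in values)
    }
    dropped = sorted(set(series_by_ticker) - set(kept))
    if dropped and raise_warning:
        warnings.warn(f"Dropped all-NaN series: {dropped}", UserWarning)
    return kept


nan = float("nan")
data = {
    "USD_FXXR": [0.1, nan, 0.3],  # partly NaN: kept
    "EUR_FXXR": [nan, nan, nan],  # entirely NaN: dropped
}

kept = drop_all_nan_series(data)
```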

qdf_from_timeseries(timeseries, cid=None, xcat=None, ticker=None, metric='value')

Create a QuantamentalDataFrame from a time series.

Parameters:

timeseries (pd.Series) – Time series to create the QuantamentalDataFrame from.
cid (str, optional) – cid value to use. Must be passed with xcat. Default is None.
xcat (str, optional) – xcat value to use. Must be passed with cid. Default is None.
ticker (str, optional) – Ticker to use. Must not be passed with cid and xcat. Default is None.
metric (str, optional) – Metric name to use. Default is “value”.

Raises:

TypeError – If timeseries is not a pandas Series.
TypeError – If metric is not a string.
ValueError – If timeseries does not have a datetime index.
ValueError – If only one of cid and xcat is provided.
ValueError – If ticker is provided with cid and xcat.

Returns:

DataFrame created from the time series.

Return type:

QuantamentalDataFrame
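
As a rough picture of the conversion, the plain-pandas sketch below turns a date-indexed Series into a long-format frame with one metric column. It is not the library code: the helper name `from_timeseries_sketch`, the `real_date` column name, and the sample data are assumptions, and only the cid and xcat form of the arguments is shown.

```python
import pandas as pd

# Made-up daily series with a DatetimeIndex.
ts = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
)


def from_timeseries_sketch(timeseries, cid, xcat, metric="value"):
    """Build a long-format frame from a date-indexed Series."""
    if not isinstance(timeseries, pd.Series):
        raise TypeError("timeseries must be a pandas Series.")
    if not isinstance(timeseries.index, pd.DatetimeIndex):
        raise ValueError("timeseries must have a datetime index.")
    return pd.DataFrame(
        {
            "real_date": timeseries.index,  # assumed date column name
            "cid": cid,
            "xcat": xcat,
            metric: timeseries.to_numpy(),
        }
    )


out = from_timeseries_sketch(ts, cid="USD", xcat="FXXR")
```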

concat_qdfs(qdf_list)

Concatenate a list of QuantamentalDataFrames into a single QuantamentalDataFrame. Converts the index columns to categorical format, if not already categorical.

Parameters: qdf_list (List[QuantamentalDataFrame]) – List of QuantamentalDataFrames to concatenate.
Raises: TypeError – If qdf_list is not a list of QuantamentalDataFrames.
Returns: DataFrame with the QuantamentalDataFrames concatenated.
Return type: QuantamentalDataFrame