macrosynergy.management.types.qdf.methods
Module hosting custom types and meta-classes for use with Quantamental DataFrames.
- get_col_sort_order(df)
Return the column names of a QuantamentalDataFrame in a consistent sort order.
- Parameters:
df (QuantamentalDataFrame) – DataFrame to return the sorted columns of.
- Returns:
List of sorted column names.
- Return type:
List[str]
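As an illustration of what a consistent column order can look like, the plain-Python sketch below puts the index columns first and the remaining metric columns after them in alphabetical order. Both the ordering rule and the names `INDEX_COLS` and `sketch_col_sort_order` are assumptions made for this example; they are not taken from the library.

```python
from typing import List

# Assumed index columns of a quantamental DataFrame (illustrative constant).
INDEX_COLS: List[str] = ["real_date", "cid", "xcat"]


def sketch_col_sort_order(columns: List[str]) -> List[str]:
    """Index columns first, then the remaining (metric) columns sorted by name.

    Assumes all index columns are present in `columns`.
    """
    metric_cols = sorted(set(columns) - set(INDEX_COLS))
    return INDEX_COLS + metric_cols


order = sketch_col_sort_order(
    ["value", "xcat", "grading", "cid", "real_date", "eop_lag"]
)
print(order)
```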
- change_column_format(df, cols, dtype)
Change the format of columns in a DataFrame.
- Parameters:
df (QuantamentalDataFrame) – DataFrame to change the format of.
cols (List[str]) – List of column names to change the format of.
dtype (Any) – Data type to change the columns to.
- Returns:
DataFrame with the columns changed to the specified format
- Return type:
QuantamentalDataFrame
- Raises:
TypeError – If df is not a QuantamentalDataFrame.
TypeError – If cols is not a list of strings.
ValueError – If a column in cols is not found in the DataFrame.
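The documented checks and the cast itself can be approximated with plain pandas. The sketch below is not the library function: `sketch_change_column_format` is a hypothetical helper that works on an ordinary `pd.DataFrame` and therefore omits the QuantamentalDataFrame type check.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "real_date": pd.to_datetime(["2020-01-01", "2020-01-02"]),
        "cid": ["USD", "USD"],
        "xcat": ["FXXR", "FXXR"],
        "value": [1, 2],
    }
)


def sketch_change_column_format(frame, cols, dtype):
    """Cast the listed columns to `dtype`, mirroring the documented checks."""
    if not isinstance(cols, list) or not all(isinstance(c, str) for c in cols):
        raise TypeError("`cols` must be a list of strings.")
    missing = [c for c in cols if c not in frame.columns]
    if missing:
        raise ValueError(f"Columns not found in the DataFrame: {missing}")
    # astype with a mapping returns a new frame and leaves the input untouched.
    return frame.astype({c: dtype for c in cols})


converted = sketch_change_column_format(df, ["value"], "float64")
print(converted.dtypes)
```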
- qdf_to_categorical(df)
Convert the index columns (“cid”, “xcat”) of a DataFrame to categorical format.
- Parameters:
df (QuantamentalDataFrame) – DataFrame to convert the index columns of.
- Raises:
TypeError – If df is not a QuantamentalDataFrame.
- Returns:
DataFrame with the index columns converted to categorical format.
- Return type:
QuantamentalDataFrame
- qdf_to_string_index(df)
Convert the index columns (“cid”, “xcat”) of a DataFrame to string format.
- Parameters:
df (QuantamentalDataFrame) – DataFrame to convert the index columns of.
- Raises:
TypeError – If df is not a QuantamentalDataFrame.
- Returns:
DataFrame with the index columns converted to string format.
- Return type:
QuantamentalDataFrame
- check_is_categorical(df)
Check if the index columns of a DataFrame are categorical.
- Parameters:
df (QuantamentalDataFrame) – DataFrame to check the index columns of.
- Returns:
True if the required index columns (“cid”, “xcat”) are categorical, False otherwise.
- Return type:
bool
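The categorical and string conversions and the categorical check can be pictured with plain pandas dtypes. The following is a minimal sketch on an ordinary `pd.DataFrame`; `sketch_is_categorical` is a hypothetical stand-in for the check, not the library implementation.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "real_date": pd.to_datetime(["2020-01-01", "2020-01-01"]),
        "cid": ["USD", "EUR"],
        "xcat": ["FXXR", "FXXR"],
        "value": [0.1, 0.2],
    }
)

INDEX_COLS = ["cid", "xcat"]  # the index columns named in the text


def sketch_is_categorical(frame):
    """True only if every one of the index columns has a categorical dtype."""
    return all(
        isinstance(frame[c].dtype, pd.CategoricalDtype) for c in INDEX_COLS
    )


# String columns -> categorical columns, and back again.
as_categorical = df.astype({c: "category" for c in INDEX_COLS})
as_strings = as_categorical.astype({c: str for c in INDEX_COLS})

print(sketch_is_categorical(df), sketch_is_categorical(as_categorical))
```

Categorical columns store each distinct label once and keep integer codes per row, which is why the conversion tends to pay off on long panels with few distinct cids and xcats.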
- apply_blacklist(df, blacklist)
Apply a blacklist to a list of cids and xcats. The blacklisted data ranges are removed from the DataFrame. This is useful for removing data that is known to be incorrect or unreliable.
- Parameters:
df (QuantamentalDataFrame) – DataFrame to apply the blacklist to.
blacklist (dict) –
Dictionary with keys as cids and values as a list of start and end dates to blacklist. Example:
{"cid": ["2020-01-01", "2020-12-31"]}
This can be extended to cover multiple periods for the same cid by appending an additional label to the end of the cid key. Example:
{
    "usd_1": ["2020-01-01", "2020-12-31"],
    "usd_2": ["2020-01-01", "2020-12-31"],
    "eur": ["2020-01-01", "2020-12-31"],
}
- Returns:
DataFrame with the blacklist applied.
- Return type:
QuantamentalDataFrame
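To make the blacklist format concrete, here is a hedged pandas sketch of how such a dictionary could be applied. It assumes that the cid is the part of the key before the first underscore (so `"usd_1"` refers to `"usd"`) and that both end points of a range are removed; `sketch_apply_blacklist` is a hypothetical helper, not the library function.

```python
import pandas as pd

dates = pd.to_datetime(["2019-12-31", "2020-06-30", "2021-01-04"])
df = pd.DataFrame(
    {
        "real_date": list(dates) * 2,
        "cid": ["usd"] * 3 + ["eur"] * 3,
        "xcat": ["FXXR"] * 6,
        "value": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    }
)

blacklist = {"usd_1": ["2020-01-01", "2020-12-31"]}


def sketch_apply_blacklist(frame, blacklist):
    """Drop rows whose cid and date fall inside any blacklisted range."""
    keep = pd.Series(True, index=frame.index)
    for key, (start, end) in blacklist.items():
        cid = key.split("_")[0]  # assumption: "usd_1" -> "usd"
        in_range = frame["real_date"].between(
            pd.Timestamp(start), pd.Timestamp(end)
        )  # inclusive on both ends (assumption)
        keep &= ~((frame["cid"] == cid) & in_range)
    return frame[keep].reset_index(drop=True)


filtered = sketch_apply_blacklist(df, blacklist)
print(filtered)
```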
- reduce_df(df, cids=None, xcats=None, start=None, end=None, blacklist=None, out_all=False, intersect=False)
Filter DataFrame by cids, xcats, and start & end dates.
- Parameters:
df (QuantamentalDataFrameBase) – The DataFrame to be filtered.
cids (Optional[List[str]], optional) – List of cid values to filter by. If None, all cid values are included.
xcats (Optional[List[str]], optional) – List of xcat values to filter by. If None, all xcat values are included.
start (Optional[str], optional) – Start date for filtering. If None, no start date filtering is applied.
end (Optional[str], optional) – End date for filtering. If None, no end date filtering is applied.
blacklist (dict, optional) – Dictionary specifying blacklist criteria. If None, no blacklist filtering is applied.
out_all (bool, optional) – If True, returns the filtered DataFrame along with the lists of xcats and cids; i.e. (df, xcats, cids).
intersect (bool, optional) – If True, only includes cid values that are present for all xcat values.
- Returns:
The filtered DataFrame. If out_all is True, also returns the lists of xcats and cids.
- Return type:
Union[QuantamentalDataFrameBase, Tuple[QuantamentalDataFrameBase, List[str], List[str]]]
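The filtering steps, and in particular the effect of `intersect`, can be sketched with plain pandas. The helper `sketch_reduce_df` below is hypothetical; it leaves out `blacklist` and `out_all` and assumes that the start and end dates are inclusive.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "real_date": pd.to_datetime(
            [
                "2020-01-01", "2020-01-02", "2020-01-01",
                "2020-01-01", "2020-01-02", "2020-01-02",
            ]
        ),
        "cid": ["USD", "USD", "USD", "EUR", "EUR", "GBP"],
        "xcat": ["FX", "FX", "EQ", "FX", "FX", "EQ"],
        "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    }
)


def sketch_reduce_df(frame, cids=None, xcats=None, start=None, end=None,
                     intersect=False):
    """Filter by dates, xcats and cids; optionally keep only common cids."""
    out = frame
    if start is not None:
        out = out[out["real_date"] >= pd.Timestamp(start)]
    if end is not None:
        out = out[out["real_date"] <= pd.Timestamp(end)]
    if xcats is not None:
        out = out[out["xcat"].isin(xcats)]
    if cids is not None:
        out = out[out["cid"].isin(cids)]
    if intersect and not out.empty:
        # Keep only cids that are present for every remaining xcat.
        cid_sets = [
            set(out.loc[out["xcat"] == x, "cid"]) for x in out["xcat"].unique()
        ]
        common = set.intersection(*cid_sets)
        out = out[out["cid"].isin(common)]
    return out.reset_index(drop=True)


both = sketch_reduce_df(df, xcats=["FX", "EQ"], intersect=True)
late = sketch_reduce_df(df, start="2020-01-02")
print(both)
print(late)
```

In this toy panel only USD has both FX and EQ, so the intersect call keeps the three USD rows and drops EUR and GBP.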
- reduce_df_by_ticker(df, tickers, start=None, end=None, blacklist=None)
Filters the given QuantamentalDataFrameBase based on tickers, date range, and blacklist.
- Parameters:
df (QuantamentalDataFrameBase) – The DataFrame to be filtered.
tickers (List[str]) – List of tickers to filter by.
start (Optional[str], optional) – Start date for filtering. If None, no start date filtering is applied.
end (Optional[str], optional) – End date for filtering. If None, no end date filtering is applied.
blacklist (dict, optional) – Dictionary specifying blacklist criteria. If None, no blacklist filtering is applied.
- Raises:
TypeError – If df is not a QuantamentalDataFrame.
- Returns:
The filtered DataFrame.
- Return type:
QuantamentalDataFrame
- update_df(df, df_add, xcat_replace=False)
Append a standard DataFrame to a standard base DataFrame with ticker replacement on the intersection.
- Parameters:
df (QuantamentalDataFrame) – Base DataFrame to append to.
df_add (QuantamentalDataFrame) – DataFrame to append.
xcat_replace (bool, optional) – If True, replace the xcats in the base DataFrame with the xcats in the DataFrame to append. Default is False.
- Return type:
QuantamentalDataFrame
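One reading of "ticker replacement on the intersection" is that any ticker present in both frames is taken entirely from `df_add`, while all other tickers are kept from the base frame. The pandas sketch below illustrates that reading only; the ticker format `cid_xcat`, the final sort order, and the helpers `make` and `sketch_update_on_tickers` are assumptions for this example.

```python
import pandas as pd


def make(cid, xcat, values, start="2020-01-01"):
    """Build a small long-format frame for one ticker (illustrative helper)."""
    dates = pd.bdate_range(start, periods=len(values))
    return pd.DataFrame(
        {"real_date": dates, "cid": cid, "xcat": xcat, "value": values}
    )


base = pd.concat(
    [make("USD", "FX", [1.0, 2.0]), make("EUR", "FX", [3.0, 4.0])],
    ignore_index=True,
)
add = make("USD", "FX", [9.0])


def sketch_update_on_tickers(df, df_add):
    """Replace every ticker of `df` that also appears in `df_add`."""
    tickers_add = set(df_add["cid"] + "_" + df_add["xcat"])
    tickers_base = df["cid"] + "_" + df["xcat"]
    kept = df[~tickers_base.isin(tickers_add)]
    out = pd.concat([kept, df_add], ignore_index=True)
    return out.sort_values(["xcat", "cid", "real_date"]).reset_index(drop=True)


updated = sketch_update_on_tickers(base, add)
print(updated)
```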
- update_tickers(df, df_add)
Method used to update aggregate DataFrame on the ticker level.
- Parameters:
df (pd.DataFrame) – DataFrame to update.
df_add (pd.DataFrame) – DataFrame to add to the base DataFrame.
- Returns:
Updated DataFrame.
- Return type:
QuantamentalDataFrame
- update_categories(df, df_add)
Method used to update the DataFrame on the category level.
- Return type:
QuantamentalDataFrame
- qdf_to_wide_df(df, value_column='value')
Pivot the DataFrame to a wide format in a memory-efficient way.
- Return type:
DataFrame
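For orientation, a long-to-wide pivot of this kind can be sketched with plain `pandas.DataFrame.pivot`. The example assumes that the wide columns are tickers of the form `cid_xcat` and that the index is `real_date`; it makes no attempt at the memory optimisations the library function refers to.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "real_date": pd.to_datetime(
            ["2020-01-01", "2020-01-02", "2020-01-01", "2020-01-02"]
        ),
        "cid": ["USD", "USD", "EUR", "EUR"],
        "xcat": ["FX", "FX", "FX", "FX"],
        "value": [1.0, 2.0, 3.0, 4.0],
    }
)

# Build a ticker label per row (assumed format: "<cid>_<xcat>").
long_df = df.assign(ticker=df["cid"] + "_" + df["xcat"])

# One row per date, one column per ticker.
wide = long_df.pivot(index="real_date", columns="ticker", values="value")
print(wide)
```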
- add_ticker_column(df)
Get the list of tickers from the DataFrame.
- Parameters:
df (QuantamentalDataFrame) – DataFrame to extract the tickers from.
- Raises:
TypeError – If df is not a QuantamentalDataFrame.
- Returns:
List of tickers.
- Return type:
List[str]
- rename_xcats(df, xcat_map=None, select_xcats=None, postfix=None, prefix=None, name_all=None, fmt_string=None)
Rename the xcats in a DataFrame based on a mapping or a format string. xcat_map and select_xcats are mutually exclusive: at most one of them may be provided. If name_all is provided, all xcats will be renamed to this value.
NOTE: This function maintains the datatype of the xcat column as a categorical.
- Parameters:
df (QuantamentalDataFrame) – DataFrame to rename the xcats in.
xcat_map (dict, optional) – Dictionary mapping the old xcats to new xcats. Default is None.
select_xcats (List[str], optional) – List of xcats to rename. Default is None.
postfix (str, optional) – Postfix to add to the xcats. Default is None.
prefix (str, optional) – Prefix to add to the xcats. Default is None.
name_all (str, optional) – Name to rename all xcats to. Default is None.
fmt_string (str, optional) – Format string to rename xcats. Default is None.
- Raises:
TypeError – If df is not a QuantamentalDataFrame.
ValueError – If both xcat_map and select_xcats are provided.
TypeError – If xcat_map is not a dictionary with string keys and values.
ValueError – If none of postfix, prefix, name_all, or fmt_string is provided.
ValueError – If fmt_string does not contain exactly one pair of curly braces.
- Returns:
DataFrame with the xcats renamed.
- Return type:
QuantamentalDataFrame
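The naming options can be illustrated with a small plain-Python helper that computes the new name for a single xcat. The precedence used here (name_all, then fmt_string, then prefix and postfix) and the placeholder check are assumptions for this sketch, and `sketch_new_name` is a hypothetical function, not part of the library.

```python
def sketch_new_name(xcat, prefix=None, postfix=None, name_all=None,
                    fmt_string=None):
    """Return the renamed xcat under the assumed precedence of options."""
    if name_all is not None:
        return name_all
    if fmt_string is not None:
        # Approximates "exactly one pair of curly braces".
        if fmt_string.count("{}") != 1:
            raise ValueError("fmt_string must contain exactly one '{}'.")
        return fmt_string.format(xcat)
    if prefix is None and postfix is None:
        raise ValueError("No renaming option was provided.")
    return f"{prefix or ''}{xcat}{postfix or ''}"


# Build a mapping from old to new names for a selection of xcats.
xcat_map = {x: sketch_new_name(x, postfix="_NSA") for x in ["FXXR", "EQXR"]}
print(xcat_map)
print(sketch_new_name("FXXR", fmt_string="Z_{}_ADJ"))
```

A mapping built this way has the same shape as the `xcat_map` argument: string keys for the old names and string values for the new ones.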
- create_empty_categorical_qdf(cid=None, xcat=None, ticker=None, metrics=['value'], date_range=None, start=None, end=None, categorical=True)
Create an empty QuantamentalDataFrame with categorical index columns. This is useful for creating a DataFrame for a given ticker with the required metrics. The ticker can be specified using cid and xcat or directly using ticker. The data range can be specified using date_range or start and end.
- Parameters:
cid (str, optional) – cid value to use. Must be passed with xcat. Default is None.
xcat (str, optional) – xcat value to use. Must be passed with cid. Default is None.
ticker (str, optional) – Ticker to use. Must not be passed with cid and xcat. Default is None.
metrics (List[str], optional) – List of metrics to create columns for. Default is [“value”].
date_range (pd.DatetimeIndex, optional) – Date range to create the DataFrame for. Must not be passed with start and end. Default is None.
start (str, optional) – Start date for the DataFrame. Default is None.
end (str, optional) – End date for the DataFrame. Default is None.
- Raises:
TypeError – If metrics is not a list of strings.
ValueError – If date_range is None and start and end are not provided.
ValueError – If only one of cid and xcat is provided.
ValueError – If ticker is provided with cid and xcat.
- Returns:
Empty DataFrame with the required index columns and metrics.
- Return type:
QuantamentalDataFrame
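A frame of this shape can be sketched with plain pandas. The example assumes that "empty" means one row per date with NaN in every metric column and that `start` and `end` span business days; `sketch_empty_qdf` is a hypothetical helper that only covers the cid/xcat and start/end variants.

```python
import pandas as pd


def sketch_empty_qdf(cid, xcat, metrics=("value",), start=None, end=None):
    """Build a NaN-filled long-format frame for one cid/xcat pair."""
    if start is None or end is None:
        raise ValueError("Both `start` and `end` are needed in this sketch.")
    dates = pd.bdate_range(start=start, end=end)  # business days (assumption)
    frame = pd.DataFrame({"real_date": dates})
    frame["cid"] = pd.Categorical([cid] * len(dates))
    frame["xcat"] = pd.Categorical([xcat] * len(dates))
    for metric in metrics:
        frame[metric] = float("nan")
    return frame


empty = sketch_empty_qdf("USD", "FXXR", start="2020-01-01", end="2020-01-07")
print(empty)
```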
- add_nan_series(df, ticker=None, cid=None, xcat=None, start=None, end=None)
Add a NaN series to the DataFrame for a given ticker.
- Parameters:
df (QuantamentalDataFrame) – DataFrame to add the NaN series to.
ticker (str, optional) – Ticker to add the NaN series for. Must not be passed with cid and xcat. Default is None.
cid (str, optional) – cid value to use. Must be passed with xcat. Default is None.
xcat (str, optional) – xcat value to use. Must be passed with cid. Default is None.
start (str or pd.Timestamp, optional) – Start date for the NaN series. Default is None.
end (str or pd.Timestamp, optional) – End date for the NaN series. Default is None.
- Raises:
TypeError – If df is not a QuantamentalDataFrame.
ValueError – If ticker is provided with cid and xcat.
- Returns:
DataFrame with the NaN series added.
- Return type:
QuantamentalDataFrame
- drop_nan_series(df, column='value', raise_warning=False)
Drop any series that consist entirely of NaNs. If raise_warning is True, a user warning is issued when any series are dropped.
- Parameters:
df (QuantamentalDataFrame) – DataFrame to drop the NaN series from.
column (str, optional) – Column to check for NaNs. Default is “value”.
raise_warning (bool, optional) – If True, raises a warning if any series are dropped. Default is False.
- Raises:
TypeError – If df is not a QuantamentalDataFrame.
ValueError – If column is not found in the DataFrame.
- Returns:
DataFrame with the NaN series dropped.
- Return type:
QuantamentalDataFrame
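The distinction between a series that is entirely NaN and one that merely contains NaNs is easy to miss, so a small pandas sketch may help. It treats each (cid, xcat) pair as one series; `sketch_drop_nan_series` is a hypothetical helper and the warning text is made up for the example.

```python
import warnings

import pandas as pd

df = pd.DataFrame(
    {
        "real_date": pd.to_datetime(["2020-01-01", "2020-01-02"] * 2),
        "cid": ["USD", "USD", "EUR", "EUR"],
        "xcat": ["FX"] * 4,
        "value": [1.0, float("nan"), float("nan"), float("nan")],
    }
)


def sketch_drop_nan_series(frame, column="value", raise_warning=False):
    """Drop every (cid, xcat) series whose `column` holds only NaNs."""
    if column not in frame.columns:
        raise ValueError(f"Column `{column}` not found in the DataFrame.")
    # Number of non-NaN observations per series, broadcast to each row.
    non_nan = frame.groupby(["cid", "xcat"])[column].transform("count")
    keep = non_nan > 0
    dropped = frame.loc[~keep, ["cid", "xcat"]].drop_duplicates()
    if raise_warning and not dropped.empty:
        names = (dropped["cid"] + "_" + dropped["xcat"]).tolist()
        warnings.warn(f"Dropped all-NaN series: {names}", UserWarning)
    return frame[keep].reset_index(drop=True)


cleaned = sketch_drop_nan_series(df)
print(cleaned)
```

The USD series survives with its single NaN row intact, because only series without any valid observation are removed.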
- qdf_from_timeseries(timeseries, cid=None, xcat=None, ticker=None, metric='value')
Create a QuantamentalDataFrame from a time series.
- Parameters:
timeseries (pd.Series) – Time series to create the QuantamentalDataFrame from.
cid (str, optional) – cid value to use. Must be passed with xcat. Default is None.
xcat (str, optional) – xcat value to use. Must be passed with cid. Default is None.
ticker (str, optional) – Ticker to use. Must not be passed with cid and xcat. Default is None.
metric (str, optional) – Metric name to use. Default is “value”.
- Raises:
TypeError – If timeseries is not a pandas Series.
TypeError – If metric is not a string.
ValueError – If timeseries does not have a datetime index.
ValueError – If only one of cid and xcat is provided.
ValueError – If ticker is provided with cid and xcat.
- Returns:
DataFrame created from the time series.
- Return type:
QuantamentalDataFrame
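The argument rules listed above can be mirrored in a short pandas sketch that turns a date-indexed series into a long-format frame. It assumes that a ticker splits into cid and xcat at the first underscore; `sketch_qdf_from_timeseries` is a hypothetical helper and returns an ordinary `pd.DataFrame`.

```python
import pandas as pd

ts = pd.Series(
    [1.0, 2.0], index=pd.to_datetime(["2020-01-01", "2020-01-02"])
)


def sketch_qdf_from_timeseries(timeseries, cid=None, xcat=None, ticker=None,
                               metric="value"):
    """Convert a date-indexed Series into real_date/cid/xcat/metric rows."""
    if not isinstance(timeseries, pd.Series):
        raise TypeError("`timeseries` must be a pandas Series.")
    if not isinstance(metric, str):
        raise TypeError("`metric` must be a string.")
    if not isinstance(timeseries.index, pd.DatetimeIndex):
        raise ValueError("`timeseries` must have a datetime index.")
    if ticker is not None:
        if cid is not None or xcat is not None:
            raise ValueError("Pass either `ticker` or `cid` and `xcat`.")
        cid, xcat = ticker.split("_", 1)  # assumption: "<cid>_<xcat>"
    elif cid is None or xcat is None:
        raise ValueError("`cid` and `xcat` must be passed together.")
    return pd.DataFrame(
        {
            "real_date": timeseries.index,
            "cid": cid,
            "xcat": xcat,
            metric: timeseries.to_numpy(),
        }
    )


qdf = sketch_qdf_from_timeseries(ts, ticker="USD_FXXR_NSA")
print(qdf)
```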
- concat_qdfs(qdf_list)
Concatenate a list of QuantamentalDataFrames into a single QuantamentalDataFrame. Converts the index columns to categorical format, if not already categorical.
- Parameters:
qdf_list (List[QuantamentalDataFrame]) – List of QuantamentalDataFrames to concatenate.
- Raises:
TypeError – If qdf_list is not a list of QuantamentalDataFrames.
- Returns:
DataFrame with the QuantamentalDataFrames concatenated.
- Return type:
QuantamentalDataFrame
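One reason a dedicated concatenation helper is useful is that plain `pd.concat` generally does not preserve categorical columns when the inputs carry different category sets. The sketch below shows the re-casting step with ordinary pandas frames; `make_qdf` is a hypothetical helper and this is not the library's implementation.

```python
import pandas as pd


def make_qdf(cid, xcat, value):
    """Build a one-row frame with categorical index columns (illustrative)."""
    return pd.DataFrame(
        {
            "real_date": pd.to_datetime(["2020-01-01"]),
            "cid": pd.Categorical([cid]),
            "xcat": pd.Categorical([xcat]),
            "value": [value],
        }
    )


parts = [make_qdf("USD", "FXXR", 0.1), make_qdf("EUR", "EQXR", 0.2)]

combined = pd.concat(parts, ignore_index=True)
# The category sets differ between the parts, so cast the index columns
# back to categorical after concatenating.
combined = combined.astype({"cid": "category", "xcat": "category"})
print(combined.dtypes)
```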