macrosynergy.download.jpmaqs
JPMaQS Download Interface
- deconstruct_expression(expression)
Deconstruct an expression into a list of cid, xcat, and metric. Achieves the inverse of construct_expressions(). For non-JPMaQS expressions, the returned list will be [expression, expression, ‘value’]. The metric is set to ‘value’ to ensure the reported metric is consistent with the standard JPMaQS metrics (JPMaQSDownload.valid_metrics).
- Parameters:
expression (str) – expression to deconstruct. If a list is provided, each element will be deconstructed and returned as a list of lists.
- Raises:
TypeError – if expression is not a string or a list of strings.
ValueError – if expression is an empty list.
- Returns:
list of cid, xcat, and metric.
- Return type:
List[str], or List[List[str]] if a list of expressions is provided
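The behaviour described above can be illustrated with a minimal sketch. It is not the library implementation: `deconstruct_expression_sketch` is a hypothetical stand-in, and it assumes that JPMaQS expressions look like `DB(JPMAQS,<cid>_<xcat>,<metric>)` (the shape used in the examples further down this page) and that the cid is everything before the first underscore of the ticker. The non-JPMaQS string is made up for illustration.

```python
from typing import List


def deconstruct_expression_sketch(expression: str) -> List[str]:
    """Illustrative stand-in for deconstruct_expression (not the library code)."""
    if not isinstance(expression, str):
        raise TypeError("expression must be a string")

    prefix, suffix = "DB(JPMAQS,", ")"
    if expression.startswith(prefix) and expression.endswith(suffix):
        # Strip the wrapper, leaving "<ticker>,<metric>".
        parts = expression[len(prefix):-len(suffix)].split(",")
        if len(parts) == 2 and "_" in parts[0]:
            ticker, metric = parts
            # Assumption: the cid is the text before the first underscore.
            cid, xcat = ticker.split("_", 1)
            return [cid, xcat, metric]

    # Non-JPMaQS expressions fall back to [expression, expression, "value"].
    return [expression, expression, "value"]


jpmaqs_parts = deconstruct_expression_sketch("DB(JPMAQS,USD_EQXR_NSA,value)")
other_expression = "DB(OTHER,SOME_SERIES,value)"  # hypothetical non-JPMaQS string
other_parts = deconstruct_expression_sketch(other_expression)
print(jpmaqs_parts)
print(other_parts)
```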
- check_attributes_in_sync(ts_list)
Checks whether the attributes in the response are in sync with the time-series data. This check is needed because a ticker may have just been calculated for a new date while only some pods have received the update. As a result, the attributes on a specific time series can be out of sync.
- construct_expressions(tickers=None, cids=None, xcats=None, metrics=None)
Construct expressions from the provided arguments.
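As a rough sketch of what constructing expressions could look like, the snippet below combines tickers, or the cross product of cids and xcats, with each metric. `construct_expressions_sketch` is a hypothetical stand-in; the `DB(JPMAQS,<ticker>,<metric>)` template is taken from the examples on this page, while the fallback to the "value" metric and the de-duplication step are assumptions.

```python
from itertools import product
from typing import List, Optional


def construct_expressions_sketch(
    tickers: Optional[List[str]] = None,
    cids: Optional[List[str]] = None,
    xcats: Optional[List[str]] = None,
    metrics: Optional[List[str]] = None,
) -> List[str]:
    """Illustrative stand-in for construct_expressions (not the library code)."""
    all_tickers = list(tickers or [])
    if cids and xcats:
        # Every cid is paired with every xcat to form a ticker.
        all_tickers += [f"{cid}_{xcat}" for cid, xcat in product(cids, xcats)]

    # Assumption: fall back to the "value" metric when none is given.
    metrics = metrics or ["value"]

    # Drop duplicate tickers while keeping the original order.
    all_tickers = list(dict.fromkeys(all_tickers))
    return [f"DB(JPMAQS,{t},{m})" for t in all_tickers for m in metrics]


expressions = construct_expressions_sketch(
    cids=["USD", "GBP"], xcats=["EQXR_NSA"], metrics=["value", "grading"]
)
print(expressions)
```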
- timeseries_to_qdf(timeseries)
Converts a dictionary containing a time-series to a QuantamentalDataFrame.
- Parameters:
timeseries (Dict[str, Any]) – A dictionary containing a time-series.
- Returns:
The converted DataFrame.
- Return type:
- timeseries_to_column(timeseries, errors='ignore')
Converts a dictionary of time series to a DataFrame with a single column.
- Parameters:
- Returns:
The converted DataFrame.
- Return type:
pd.DataFrame
- concat_column_dfs(df_list, errors='ignore')
Concatenates a list of DataFrames into a single DataFrame.
- Parameters:
df_list (List[pd.DataFrame]) – A list of DataFrames.
errors (str) – The error handling method to use. If ‘raise’, then invalid items in the list will raise an error. If ‘ignore’, then invalid items will be ignored. Default is ‘ignore’.
- Returns:
The concatenated DataFrame.
- Return type:
pd.DataFrame
- validate_downloaded_df(data_df, expected_expressions, found_expressions, start_date=None, end_date=None, verbose=True)
Validate the downloaded data in the provided dataframe.
- Parameters:
data_df (pd.DataFrame) – dataframe containing the downloaded data.
expected_expressions (list[str]) – list of expressions that were expected to be downloaded.
found_expressions (list[str]) – list of expressions that were actually downloaded.
start_date (str) – start date of the downloaded data.
end_date (str) – end date of the downloaded data.
verbose (bool) – whether to print the validation results.
- Raises:
TypeError – if data_df is not a dataframe.
- Returns:
True if the downloaded data is valid, False otherwise.
- Return type:
bool
- get_expressions_from_file(file_path, as_dataframe=True, dataframe_format='qdf')
Loads the expressions found in a downloaded timeseries file (either JSON or CSV).
- validate_downloaded_data(path, expected_expressions, as_dataframe=True, dataframe_format='qdf', show_progress=True)
Validate the downloaded data in the provided path.
- Parameters:
path (str) – path to the downloaded data.
expected_expressions (list[str]) – list of expressions that were expected to be downloaded.
as_dataframe (bool) – whether to load the files as dataframes.
dataframe_format (str) – the format of the dataframe. Must be one of ‘qdf’ or ‘wide’.
show_progress (bool) – whether to show a progress bar.
- Returns:
list of expressions that are missing from the downloaded data.
- Return type:
List[str]
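The core of this check, reporting which expected expressions are absent from what was found, can be sketched without touching the file system. `missing_expressions_sketch` is a hypothetical helper; how the library actually discovers the found expressions in the files under `path` is not shown here.

```python
from typing import Iterable, List


def missing_expressions_sketch(
    expected_expressions: Iterable[str], found_expressions: Iterable[str]
) -> List[str]:
    """Return expected expressions that were not found, in their original order."""
    found = set(found_expressions)
    return [expr for expr in expected_expressions if expr not in found]


expected = ["DB(JPMAQS,USD_EQXR_NSA,value)", "DB(JPMAQS,GBP_EQXR_NSA,value)"]
found = ["DB(JPMAQS,USD_EQXR_NSA,value)"]
missing = missing_expressions_sketch(expected, found)
print(missing)
```

An empty result would mean that every expected expression was found.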
- class JPMaQSDownload(oauth=True, client_id=None, client_secret=None, crt=None, key=None, username=None, password=None, check_connection=True, proxy=None, suppress_warning=True, debug=False, print_debug_data=False, dq_download_kwargs={}, *args, **kwargs)
Bases:
DataQueryInterface
JPMaQSDownload Object. This object is used to download JPMaQS data via the DataQuery API. It can be extended to include the use of proxies, and even request generic DataQuery expressions.
- Parameters:
oauth (bool) – True if using oauth, False if using username/password with crt/key.
client_id (Optional[str]) – oauth client_id, required if oauth=True.
client_secret (Optional[str]) – oauth client_secret, required if oauth=True.
crt (Optional[str]) – path to crt file.
key (Optional[str]) – path to key file.
username (Optional[str]) – username for certificate based authentication.
password (Optional[str]) – password paired with username for certificate based authentication.
debug (bool) – True if debug mode, False if not.
suppress_warning (bool) – True if suppressing warnings, False if not.
check_connection (bool) – True if the interface should check the connection to the server before sending requests, False if not. True by default.
proxy (Optional[dict]) – proxy to use for requests, None if not using proxy (default).
print_debug_data (bool) – True if debug data should be printed, False if not (default).
dq_download_kwargs (dict) – additional arguments to pass to the DataQuery API object, such as calendar and frequency. For more fine-grained usage, initialize the DataQueryInterface object explicitly.
kwargs (dict) – any other keyword arguments.
- Raises:
TypeError – if provided arguments are not of the correct type.
ValueError – if provided arguments are invalid or semantically incorrect.
- validate_download_args(tickers, cids, xcats, metrics, start_date, end_date, get_catalogue, expressions, show_progress, as_dataframe, dataframe_format, report_time_taken)
Validate the arguments passed to the download function.
- Raises:
TypeError – If any of the arguments are not of the correct type.
ValueError – If any of the arguments are semantically incorrect.
- Returns:
True if valid.
- Return type:
bool
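The split between TypeError (wrong type) and ValueError (right type, invalid content) can be sketched for a few of the arguments. `validate_args_sketch` is a hypothetical, much reduced stand-in; the specific checks and messages are assumptions based on the parameter descriptions on this page (ISO dates, and a dataframe format of 'qdf' or 'wide').

```python
from datetime import date
from typing import Optional

VALID_DATAFRAME_FORMATS = ("qdf", "wide")


def validate_args_sketch(
    start_date: Optional[str],
    end_date: Optional[str],
    as_dataframe: bool,
    dataframe_format: str,
) -> bool:
    """Illustrative subset of argument validation (not the library code)."""
    if not isinstance(as_dataframe, bool):
        raise TypeError("as_dataframe must be a bool")
    if not isinstance(dataframe_format, str):
        raise TypeError("dataframe_format must be a str")
    if dataframe_format not in VALID_DATAFRAME_FORMATS:
        raise ValueError("dataframe_format must be one of 'qdf' or 'wide'")

    for name, value in (("start_date", start_date), ("end_date", end_date)):
        if value is None:
            continue
        if not isinstance(value, str):
            raise TypeError(f"{name} must be a str")
        try:
            date.fromisoformat(value)
        except ValueError as exc:
            raise ValueError(f"{name} must be an ISO date (YYYY-MM-DD)") from exc

    return True


is_valid = validate_args_sketch("2000-01-01", None, True, "qdf")
print(is_valid)
```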
- filter_expressions_from_catalogue(expressions, verbose=True)
Method to filter a list of expressions against the JPMaQS catalogue. This avoids requesting data for expressions that are not in the catalogue, and provides the user with the complete list of expressions that are in the catalogue.
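A minimal sketch of such a filter is shown below, assuming the catalogue is a list of tickers (as returned by get_catalogue) and that expressions have the form `DB(JPMAQS,<ticker>,<metric>)`. `filter_expressions_sketch` is a hypothetical helper; dropping anything that does not match that form is a simplification made here, not documented library behaviour.

```python
from typing import Iterable, List


def filter_expressions_sketch(
    expressions: Iterable[str], catalogue_tickers: Iterable[str]
) -> List[str]:
    """Keep only expressions whose ticker appears in the catalogue."""
    catalogue = set(catalogue_tickers)
    kept = []
    for expr in expressions:
        # Expected shape: DB(JPMAQS,<ticker>,<metric>)
        parts = expr[:-1].split(",") if expr.endswith(")") else []
        if len(parts) == 3 and parts[0] == "DB(JPMAQS" and parts[1] in catalogue:
            kept.append(expr)
    return kept


catalogue_tickers = ["USD_EQXR_NSA", "GBP_EQXR_NSA"]
requested = [
    "DB(JPMAQS,USD_EQXR_NSA,value)",
    "DB(JPMAQS,XXX_NOT_A_TICKER,value)",  # made-up ticker, absent from the catalogue
]
available = filter_expressions_sketch(requested, catalogue_tickers)
print(available)
```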
- get_catalogue(group_id='JPMAQS', page_size=1000, verbose=True)
Get the JPMaQS catalogue.
- Returns:
list of tickers in the JPMaQS catalogue.
- Return type:
List[str]
- download_all_to_disk(path, expressions=None, as_dataframe=True, dataframe_format='qdf', show_progress=True, delay_param=0.25, batch_size=None, retry=3, overwrite=True, *args, **kwargs)
Downloads all JPMaQS data to disk.
- Parameters:
path (str) – path to the directory where the data will be saved.
expressions (Optional[List[str]]) – Default is None, meaning all expressions in the JPMaQS catalogue will be downloaded. If provided, only the expressions in the list will be downloaded.
as_dataframe (bool) – Default is True, meaning the data will be saved as a DataFrame (either in the Quantamental Data Format (‘qdf’) or wide format (‘wide’)). If False, the data will be saved as JSON files, with one expression per file.
dataframe_format (str) – Default is ‘qdf’. If as_dataframe is True, this parameter specifies the format of the DataFrame. Must be one of ‘qdf’ or ‘wide’.
show_progress (bool) – Default is True, meaning the progress of the download will be displayed. If False, the progress will not be displayed.
delay_param (float) – The delay, in seconds, to use when making requests to the DataQuery API. The default in the signature is 0.25; 0.2 seconds is the fastest allowed by the DataQuery API. Ideally, this should not be changed.
batch_size (int) – Default is None, meaning the batch size will be set to the default size (20). If provided, this parameter specifies the number of expressions to download in each batch.
retry (int) – Default is 3, meaning the download will be retried 3 times for any expressions that fail to download. If set to 0, no retries will be attempted.
overwrite (bool) – Default is True, meaning the data will be overwritten if it already exists. If False, the data will not be overwritten.
kwargs (dict) – any other keyword arguments.
- Returns:
The data is saved to disk.
- Return type:
None
Examples
Download all JPMaQS data to disk.
>>> import os
>>> from macrosynergy.download.jpmaqs import JPMaQSDownload
>>> with JPMaQSDownload(
...     client_id=os.getenv("DQ_CLIENT_ID"),
...     client_secret=os.getenv("DQ_CLIENT_SECRET"),
... ) as jpmaqs:
...     jpmaqs.download_all_to_disk(path="./jpmaqs-data")
Alternatively, download only a custom list of expressions.
>>> expressions = ['DB(JPMAQS,USD_EQXR_NSA,value)', 'DB(JPMAQS,GBP_EQXR_NSA,value)']
>>> with JPMaQSDownload(
...     client_id=os.getenv("DQ_CLIENT_ID"),
...     client_secret=os.getenv("DQ_CLIENT_SECRET"),
... ) as jpmaqs:
...     jpmaqs.download_all_to_disk(path="./jpmaqs-data", expressions=expressions)
Save each expression as a JSON file.
>>> with JPMaQSDownload(
...     client_id=os.getenv("DQ_CLIENT_ID"),
...     client_secret=os.getenv("DQ_CLIENT_SECRET"),
... ) as jpmaqs:
...     jpmaqs.download_all_to_disk(path="./jpmaqs-data", as_dataframe=False)
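How batch_size, retry and delay_param might interact can be sketched with a stand-in fetch function. Everything here is hypothetical: `download_in_batches_sketch` and `flaky_fetch` are not part of the library, the retry loop is one plausible reading of "retried 3 times for any expressions that fail", and the returned data points are made up.

```python
import time
from typing import Callable, Dict, List, Tuple


def download_in_batches_sketch(
    expressions: List[str],
    fetch_batch: Callable[[List[str]], Dict[str, list]],
    batch_size: int = 20,
    retry: int = 3,
    delay_param: float = 0.25,
) -> Tuple[Dict[str, list], List[str]]:
    """Fetch expressions in batches, retrying only the ones that failed."""
    results: Dict[str, list] = {}
    pending = list(expressions)
    # One initial pass plus up to `retry` further passes over the failures.
    for _ in range(retry + 1):
        if not pending:
            break
        failed: List[str] = []
        for start in range(0, len(pending), batch_size):
            batch = pending[start:start + batch_size]
            fetched = fetch_batch(batch)
            results.update(fetched)
            failed.extend(e for e in batch if e not in fetched)
            time.sleep(delay_param)  # pause between requests
        pending = failed
    return results, pending


attempts: Dict[str, int] = {}


def flaky_fetch(batch: List[str]) -> Dict[str, list]:
    """Hypothetical fetcher that drops the GBP expression on its first attempt."""
    out = {}
    for expr in batch:
        attempts[expr] = attempts.get(expr, 0) + 1
        if "GBP" in expr and attempts[expr] == 1:
            continue  # simulate a transient failure
        out[expr] = [("2000-01-03", 1.0)]  # made-up data point
    return out


data, still_missing = download_in_batches_sketch(
    ["DB(JPMAQS,USD_EQXR_NSA,value)", "DB(JPMAQS,GBP_EQXR_NSA,value)"],
    flaky_fetch,
    batch_size=1,
    delay_param=0.0,  # no delay in this offline demonstration
)
print(sorted(data), still_missing)
```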
- download(tickers=None, cids=None, xcats=None, metrics=['value'], start_date='2000-01-01', end_date=None, expressions=None, get_catalogue=False, show_progress=False, debug=False, suppress_warning=False, as_dataframe=True, dataframe_format='qdf', report_time_taken=False, categorical_dataframe=False, *args, **kwargs)
Driver function to download data from JPMaQS via the DataQuery API. Timeseries data can be requested using tickers with metrics, or passing formed DataQuery expressions. cids and xcats (along with metrics) are used to construct expressions, which are ultimately passed to the DataQuery Interface.
- Parameters:
metrics (list[str]) – list of metrics. Available metrics are “value” (default), “grading”, “eop_lag”, “mop_lag”, and “last_updated”. If “all” is provided, all available metrics are used. The available metrics are defined in macrosynergy.download.jpmaqs.JPMAQS_METRICS.
start_date (str) – start date of the data to download, in the ISO format - YYYY-MM-DD.
end_date (str) – end date of the data to download in the ISO format - YYYY-MM-DD.
get_catalogue (bool) – If True, the JPMaQS catalogue is downloaded and used to filter the list of tickers. Default is False.
show_progress (bool) – True if progress bar should be shown, False if not (default).
suppress_warning (bool) – True if suppressing warnings. Default is False, as in the signature above.
debug (bool) – Override the debug behaviour of the JPMaQSDownload class. If True, debug mode is enabled.
print_debug_data (bool) – True if debug data should be printed, False if not (default). If debug=True, this is set to True.
as_dataframe (bool) – Return a dataframe if True (default), a list of dictionaries if False.
dataframe_format (str) – Format of the dataframe to return, one of “qdf” or “wide”. QDF is the Quantamental Dataframe format, and wide is the wide format with each expression as a column, and a single date column.
report_time_taken (bool) – If True, the time taken to download and apply data transformations is reported.
categorical_dataframe (bool) – If True, the dataframe returned will use the pandas Categorical data type for the cid and xcat columns. Default is False.
kwargs (dict) – any other keyword arguments.
- Raises:
ValueError – if provided arguments are invalid or semantically incorrect (see macrosynergy.download.jpmaqs.JPMaQSDownload.validate_download_args()).
- Returns:
dataframe of data if as_dataframe is True, list of dictionaries if False.
- Return type:
pd.DataFrame|list[Dict]
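The difference between the two dataframe formats can be illustrated with plain Python records instead of a DataFrame. This is only a sketch: `qdf_rows_to_wide_sketch` is a hypothetical helper, the `real_date` column name is an assumption, and the values are made up. Long (QDF-style) rows hold one observation each, while wide rows hold one date with one column per expression.

```python
from collections import defaultdict
from typing import Dict, List


def qdf_rows_to_wide_sketch(rows: List[dict], metric: str = "value") -> List[dict]:
    """Pivot long rows (cid, xcat, real_date, metric) into one row per date."""
    by_date: Dict[str, dict] = defaultdict(dict)
    for row in rows:
        expression = f"DB(JPMAQS,{row['cid']}_{row['xcat']},{metric})"
        by_date[row["real_date"]][expression] = row[metric]
    # One dict per date, with the date first and one key per expression.
    return [{"real_date": d, **cols} for d, cols in sorted(by_date.items())]


long_rows = [  # made-up values, for illustration only
    {"cid": "USD", "xcat": "EQXR_NSA", "real_date": "2000-01-03", "value": 0.1},
    {"cid": "GBP", "xcat": "EQXR_NSA", "real_date": "2000-01-03", "value": 0.2},
    {"cid": "USD", "xcat": "EQXR_NSA", "real_date": "2000-01-04", "value": 0.3},
]
wide_rows = qdf_rows_to_wide_sketch(long_rows)
print(wide_rows)
```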