macrosynergy.download.fusion_interface

class FusionOAuth(client_id, client_secret, resource='JPMC:URI:RS-93742-Fusion-PROD', application_name='fusion', root_url='https://fusion.jpmorgan.com/api/v1', auth_url='https://authe.jpmorgan.com/as/token.oauth2', proxies=None)

Bases: JPMorganOAuth

A class to handle OAuth authentication for the JPMorgan Fusion API. Uses the JPMorganOAuth class as a base.

cache_decorator(ttl=60, *, maxsize=None)

Decorator that caches a function's results for up to ttl seconds. Entries do not expire individually: the first call made at least ttl seconds after the last clear flushes the ENTIRE cache before proceeding.

Parameters:
  • ttl (int) – Time-to-live for the cache in seconds. After this time, the cache will be cleared. Default is 60 seconds.

  • maxsize (Optional[int]) – Maximum size of the cache. If None, the default size is used.

Return type:

Callable[[TypeVar(CachedType, bound= Callable[..., Any])], TypeVar(CachedType, bound= Callable[..., Any])]
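
The flush-everything behaviour described for cache_decorator can be illustrated with a minimal sketch. This is not the library's implementation: the name ttl_cache_sketch and the clock parameter are hypothetical, and maxsize=None is passed straight to functools.lru_cache here (meaning unbounded), which may differ from how the library treats it.

```python
import functools
import time
from typing import Any, Callable, Optional


def ttl_cache_sketch(
    ttl: int = 60,
    *,
    maxsize: Optional[int] = None,
    clock: Callable[[], float] = time.monotonic,  # hypothetical hook, eases testing
):
    """Cache results and flush the whole cache once `ttl` seconds have passed
    since the last flush. Illustrative only."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        cached = functools.lru_cache(maxsize=maxsize)(func)
        last_clear = clock()

        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            nonlocal last_clear
            now = clock()
            if now - last_clear >= ttl:
                # Drop every cached entry, not only the stale ones.
                cached.cache_clear()
                last_clear = now
            return cached(*args, **kwargs)

        return wrapper

    return decorator
```

Because the flush is global, a result cached one second before the deadline is discarded together with much older entries; that keeps the bookkeeping to a single timestamp.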

request_wrapper(method, url, headers=None, params=None, data=None, json_payload=None, proxies=None, as_json=None, as_bytes=None, as_text=None, api_delay=1.0, timeout=None, verify_ssl=True)

A wrapper function for making API requests to the JPMorgan Fusion API.

Return type:

Union[Dict[str, Any], str, bytes]

request_wrapper_stream_bytes_to_disk(filename, url, method='GET', headers=None, params=None, data=None, json_payload=None, proxies=None, chunk_size=None, api_delay=1.0, timeout=None, verify_ssl=True)

Stream a request’s response bytes directly to disk, chunk by chunk.

Parameters:
  • filename (str) – The file path to write the streamed bytes to.

  • url (str) – The URL to request.

  • method (str) – HTTP method. Only GET is allowed for streaming to disk.

  • headers (dict, optional) – HTTP headers.

  • params (dict, optional) – Query parameters.

  • data (any, optional) – Data to send in the body.

  • json_payload (dict, optional) – JSON data to send in the body.

  • proxies (dict, optional) – Proxies to use for the request.

  • chunk_size (int, optional) – Size of each chunk to write. Defaults to None in the signature; the documented default size is 8192.

  • api_delay (float) – Delay between API calls (defaults to 1.0 seconds).

  • timeout (float, optional) – Timeout for the request (defaults to None).

  • verify_ssl (bool) – Whether to verify SSL certificates (defaults to True).

Return type:

None
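
A minimal sketch of the chunk-by-chunk write described above, assuming the response body is available as an iterable of byte chunks. Both function names are hypothetical. With the requests library, for instance, such chunks typically come from response.iter_content(chunk_size=...) on a request made with stream=True; whether this method does exactly that is not stated here.

```python
from typing import Iterable, Iterator


def write_chunks_to_disk(chunks: Iterable[bytes], filename: str) -> int:
    """Write byte chunks to `filename` one at a time; return bytes written."""
    written = 0
    with open(filename, "wb") as fh:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                fh.write(chunk)
                written += len(chunk)
    return written


def iter_chunks(payload: bytes, chunk_size: int = 8192) -> Iterator[bytes]:
    """Stand-in for a streamed HTTP body: yield `payload` in fixed-size pieces."""
    for start in range(0, len(payload), chunk_size):
        yield payload[start:start + chunk_size]
```

Only one chunk is held in memory at a time, which is the point of streaming large Parquet files to disk rather than loading the full response first.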

class SimpleFusionAPIClient(oauth_handler, base_url='https://fusion.jpmorgan.com/api/v1', proxies=None)

Bases: object

get_common_catalog(**kwargs)

Get the common catalog from the JPMorgan Fusion API.

Equivalent cURL request:

curl -X GET "https://fusion.jpmorgan.com/api/v1/catalogs/common" \
    -H "Authorization: Bearer <ACCESS_TOKEN>"
Returns:

API response containing the common catalog.

Return type:

Dict[str, Any]

get_products(**kwargs)

Get the list of products available in the JPMorgan Fusion API.

Equivalent cURL request:

curl -X GET "https://fusion.jpmorgan.com/api/v1/catalogs/common/products" \
    -H "Authorization: Bearer <ACCESS_TOKEN>"
Returns:

API response containing the list of products.

Return type:

Dict[str, Any]

get_product_details(product_id='JPMAQS', **kwargs)

Get the details of a specific product by its ID.

Equivalent cURL request:

curl -X GET "https://fusion.jpmorgan.com/api/v1/catalogs/common/products/{product_id}" \
    -H "Authorization: Bearer <ACCESS_TOKEN>"
Parameters:
  • product_id (str) – The ID of the product to retrieve details for. Default is “JPMAQS”.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

API response containing the product details.

Return type:

Dict[str, Any]

get_dataset(catalog, dataset, **kwargs)

Get the details of a specific dataset from a specified catalog.

Equivalent cURL request:

curl -X GET "https://fusion.jpmorgan.com/api/v1/catalogs/{catalog}/datasets/{dataset}" \
    -H "Authorization: Bearer <ACCESS_TOKEN>"
Parameters:
  • catalog (str) – The catalog from which to retrieve the dataset.

  • dataset (str) – The identifier of the dataset to retrieve.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

API response containing the dataset details.

Return type:

Dict[str, Any]

get_dataset_series(catalog, dataset, **kwargs)

Get the series available for a specific dataset in a specified catalog.

Equivalent cURL request:

curl -X GET "https://fusion.jpmorgan.com/api/v1/catalogs/{catalog}/datasets/{dataset}/datasetseries" \
    -H "Authorization: Bearer <ACCESS_TOKEN>"
Parameters:
  • catalog (str) – The catalog from which to retrieve the dataset series.

  • dataset (str) – The identifier of the dataset for which to retrieve series.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

API response containing the dataset series details.

Return type:

Dict[str, Any]

get_dataset_seriesmember(catalog, dataset, seriesmember, **kwargs)

Get the details of a specific series member in a dataset from a specified catalog.

Equivalent cURL request:

curl -X GET "https://fusion.jpmorgan.com/api/v1/catalogs/{catalog}/datasets/{dataset}/datasetseries/{seriesmember}" \
    -H "Authorization: Bearer <ACCESS_TOKEN>"
Parameters:
  • catalog (str) – The catalog from which to retrieve the series member.

  • dataset (str) – The identifier of the dataset containing the series member.

  • seriesmember (str) – The identifier of the series member to retrieve.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

API response containing the details of the specified series member.

Return type:

Dict[str, Any]

get_seriesmember_distributions(catalog, dataset, seriesmember, **kwargs)

Get the distributions available for a specific series member in a dataset from a specified catalog.

Equivalent cURL request:

curl -X GET "https://fusion.jpmorgan.com/api/v1/catalogs/{catalog}/datasets/{dataset}/datasetseries/{seriesmember}/distributions" \
    -H "Authorization: Bearer <ACCESS_TOKEN>"
Parameters:
  • catalog (str) – The catalog from which to retrieve the series member distributions.

  • dataset (str) – The identifier of the dataset containing the series member.

  • seriesmember (str) – The identifier of the series member for which to retrieve distributions.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

API response containing the available distributions for the specified series member.

Return type:

Dict[str, Any]

get_seriesmember_distribution_details(catalog, dataset, seriesmember, distribution, **kwargs)

Get the details of a specific distribution for a series member in a dataset from a specified catalog.

Equivalent cURL request:

curl -X GET "https://fusion.jpmorgan.com/api/v1/catalogs/{catalog}/datasets/{dataset}/datasetseries/{seriesmember}/distributions/{distribution}" \
    -H "Authorization: Bearer <ACCESS_TOKEN>"
Parameters:
  • catalog (str) – The catalog from which to retrieve the series member distribution.

  • dataset (str) – The identifier of the dataset containing the series member.

  • seriesmember (str) – The identifier of the series member for which to retrieve the distribution.

  • distribution (str) – The identifier of the distribution to retrieve (e.g., “parquet”).

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

API response containing the distribution details (the actual data) for the specified series member.

Return type:

Union[Dict[str, Any], bytes, str]
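
The cURL requests above nest one path segment per level (catalog, dataset, series member, distribution). The helper below is a hypothetical illustration of that nesting, not part of the client; it covers only the single-item endpoints, and the placeholder identifiers in the usage note are invented.

```python
from typing import Optional

BASE_URL = "https://fusion.jpmorgan.com/api/v1"


def fusion_endpoint(
    catalog: str,
    dataset: Optional[str] = None,
    seriesmember: Optional[str] = None,
    distribution: Optional[str] = None,
    base_url: str = BASE_URL,
) -> str:
    """Build the URL for the deepest level that was supplied."""
    url = f"{base_url}/catalogs/{catalog}"
    if dataset is None:
        return url
    url += f"/datasets/{dataset}"
    if seriesmember is None:
        return url
    url += f"/datasetseries/{seriesmember}"
    if distribution is None:
        return url
    return url + f"/distributions/{distribution}"
```

For example, fusion_endpoint("common", "SOME_DATASET", "SOME_MEMBER", "parquet") yields the same path shape as the request shown for get_seriesmember_distribution_details.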

get_seriesmember_distribution_details_to_disk(filename, catalog, dataset, seriesmember, distribution='parquet', **kwargs)

Download the distribution for a specific series member in a dataset from a specified catalog and save it to disk.

Parameters:
  • filename (str) – The file path to save the downloaded distribution data.

  • catalog (str) – The catalog from which to retrieve the series member distribution.

  • dataset (str) – The identifier of the dataset containing the series member.

  • seriesmember (str) – The identifier of the series member for which to download the distribution.

  • distribution (str) – The identifier of the distribution to download (e.g., “parquet”).

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

The downloaded data is saved directly to the specified file.

Return type:

None

get_resources_df(response_dict, resources_key='resources', keep_fields=None, custom_sort_columns=True)

Extracts the ‘resources’ field from a response dictionary and returns it as a DataFrame.

Parameters:
  • response_dict (Dict[str, Any]) – The response dictionary containing the ‘resources’ field.

  • resources_key (str) – The key in the response dictionary that contains the resources data. Default is ‘resources’.

  • keep_fields (Optional[List[str]]) – A list of fields to keep in the DataFrame. If None, all fields are kept.

  • custom_sort_columns (bool) – If True, the DataFrame will be sorted with specific columns first. Default is True.

Returns:

A DataFrame containing the resources data.

Return type:

pd.DataFrame
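
A plain-Python analogue of the extraction step may help clarify what get_resources_df does with resources_key and keep_fields. It is a sketch only: extract_resources is a hypothetical name, the response shape beyond the ‘resources’ key is assumed, and the column sorting controlled by custom_sort_columns is not reproduced.

```python
from typing import Any, Dict, List, Optional


def extract_resources(
    response_dict: Dict[str, Any],
    resources_key: str = "resources",
    keep_fields: Optional[List[str]] = None,
) -> List[Dict[str, Any]]:
    """Return the records under `resources_key`, optionally restricted
    to `keep_fields` (missing fields become None)."""
    resources = response_dict.get(resources_key, [])
    if keep_fields is None:
        return [dict(item) for item in resources]
    return [
        {field: item.get(field) for field in keep_fields}
        for item in resources
    ]
```

A list of records like this can be handed to pandas.DataFrame(records) to obtain a table with one row per resource.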

convert_ticker_based_pandas_df_to_qdf(df, categorical=True)

Convert a pandas DataFrame with ticker entries (as read from a Parquet file) to a QDF with ‘cid’ and ‘xcat’ columns.

Parameters:
  • df (pd.DataFrame) – The DataFrame to convert, which should contain a ‘ticker’ column.

  • categorical (bool) – If True, converts the DataFrame to a QuantamentalDataFrame with categorical data.

Return type:

DataFrame
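
The ticker-to-QDF conversion amounts to splitting each ticker into a cross-section identifier and a category. The sketch below assumes the split happens at the first underscore, so that categories containing underscores stay intact; that detail, the helper names, and the ‘real_date’ and ‘value’ column names in the usage note are assumptions rather than documented behaviour.

```python
from typing import Any, Dict, List, Tuple


def split_ticker(ticker: str) -> Tuple[str, str]:
    """Split 'cid_xcat' at the first underscore (assumed convention)."""
    cid, xcat = ticker.split("_", 1)
    return cid, xcat


def rows_to_qdf_like(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Replace each row's 'ticker' with separate 'cid' and 'xcat' entries."""
    out = []
    for row in rows:
        cid, xcat = split_ticker(str(row["ticker"]))
        rest = {key: value for key, value in row.items() if key != "ticker"}
        out.append({"cid": cid, "xcat": xcat, **rest})
    return out
```

Under this assumption, a row such as {"ticker": "USD_EQXR_NSA", "real_date": "2024-01-02", "value": 0.5} becomes one with cid "USD" and xcat "EQXR_NSA", with the other columns unchanged.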

convert_ticker_based_parquet_file_to_qdf(filename, compression='zstd', as_csv=False, qdf=False, keep_raw_data=False)

Convert a Parquet file with ticker entries to a QDF or CSV format. This function reads a Parquet file, extracts the ‘ticker’ column, splits it into ‘cid’ and ‘xcat’, and writes the result to a new Parquet file or CSV file.

Parameters:
  • filename (str) – The path to the Parquet file to convert. The file must exist.

  • compression (str) – The compression algorithm to use for the output Parquet file. Default is ‘zstd’.

  • as_csv (bool) – If True, the output will be saved as a CSV file instead of a Parquet file. Default is False.

  • qdf (bool) – If True, the output will be saved as a Quantamental DataFrame, in parquet or CSV format depending on the as_csv parameter. Default is False.

  • keep_raw_data (bool) – If True, the original Parquet file will not be deleted after conversion. If False, the original file will be removed after conversion. Default is False.

Return type:

None

convert_ticker_based_pyarrow_table_to_qdf(table)

Convert a PyArrow Table with ticker entries to a Quantamental DataFrame (QDF) with ‘cid’ and ‘xcat’ columns, splitting on ‘_’ lazily via a Scanner.

Parameters:

table (pa.Table) – The PyArrow Table to convert, which should contain a ‘ticker’ column.

Returns:

A PyArrow Table with all original columns except ‘ticker’, plus new ‘cid’ and ‘xcat’ (string) columns. The split only happens when you call to_table().

Return type:

pa.Table

read_parquet_from_bytes_to_pandas_dataframe(response_bytes)

Read a Parquet file from bytes and return a DataFrame. This function is used to read Parquet files downloaded from the JPMaQS Fusion API.

Parameters:

response_bytes (bytes) – The bytes of the Parquet file to read.

Returns:

A DataFrame containing the data from the Parquet file.

Return type:

pd.DataFrame

read_parquet_from_bytes_to_pyarrow_table(response_bytes, **kwargs)

Read a Parquet file from bytes and return a PyArrow Table. This function is used to read Parquet files downloaded from the JPMaQS Fusion API.

Parameters:
  • response_bytes (bytes) – The bytes of the Parquet file to read.

  • **kwargs (dict) – Additional keyword arguments to pass to pyarrow.parquet.read_table.

Returns:

A PyArrow Table containing the data from the Parquet file.

Return type:

pa.Table

coerce_real_date(table)
Return type:

Table

filter_parquet_table_as_qdf(table, tickers=None, start_date=None, end_date=None, qdf=False)

Filter a PyArrow Table based on tickers and date range. Optionally converts the table from a ticker-based format to a Quantamental DataFrame (QDF).

Parameters:
  • table (pa.Table) – The PyArrow Table to filter.

  • tickers (List[str], optional) – A list of tickers to filter by. If None, no ticker filtering is applied.

  • start_date (Optional[str], optional) – The start date for filtering in ISO format (YYYY-MM-DD). If None, no start date filtering is applied.

  • end_date (Optional[str], optional) – The end date for filtering in ISO format (YYYY-MM-DD). If None, no end date filtering is applied.

  • qdf (bool, optional) – If True, converts the filtered table to a Quantamental DataFrame (QDF) format. Default is False.

Returns:

A filtered PyArrow Table. If qdf is True, the table is converted to a QDF.

Return type:

pa.Table
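
The filtering rules can be sketched on plain rows instead of a PyArrow Table. This is illustrative only: filter_rows is a hypothetical name, the ‘real_date’ column name is assumed, and treating both date bounds as inclusive is an assumption that the description above does not confirm.

```python
from datetime import date
from typing import Any, Dict, List, Optional


def filter_rows(
    rows: List[Dict[str, Any]],
    tickers: Optional[List[str]] = None,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
    date_key: str = "real_date",  # assumed column name
) -> List[Dict[str, Any]]:
    """Keep rows whose ticker is wanted and whose date lies within
    [start_date, end_date]; a None argument disables that filter."""
    wanted = set(tickers) if tickers is not None else None
    start = date.fromisoformat(start_date) if start_date else None
    end = date.fromisoformat(end_date) if end_date else None
    kept = []
    for row in rows:
        day = date.fromisoformat(row[date_key])
        if wanted is not None and row["ticker"] not in wanted:
            continue
        if start is not None and day < start:
            continue
        if end is not None and day > end:
            continue
        kept.append(row)
    return kept
```

Parsing the ISO strings into date objects once, before the loop, avoids repeated conversions of the bounds and makes the comparisons independent of string formatting.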

class JPMaQSFusionClient(oauth_handler, base_url='https://fusion.jpmorgan.com/api/v1', proxies=None)

Bases: object

A client for accessing the JPMaQS product on the JPMorgan Fusion API. This client is specific to the JPMaQS product and provides methods to fetch the data catalog, list datasets, and download distributions. It uses SimpleFusionAPIClient() to handle the API requests.

Parameters:
  • oauth_handler (FusionOAuth) – An instance of FusionOAuth to handle OAuth authentication.

  • base_url (str) – The base URL for the Fusion API. Default is FUSION_ROOT_URL (“https://fusion.jpmorgan.com/api/v1”).

  • proxies (Optional[Dict[str, str]]) – Optional proxies to use for the HTTP requests. Default is None.
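
A hedged sketch of wiring the two classes together, using only the constructor parameters listed on this page. The environment variable names are hypothetical, and the import sits inside the function so the snippet can be loaded even where macrosynergy is not installed.

```python
import os


def build_jpmaqs_client():
    """Construct a JPMaQSFusionClient from credentials in the environment."""
    # Imported lazily so that defining this helper does not require
    # the macrosynergy package to be present.
    from macrosynergy.download.fusion_interface import (
        FusionOAuth,
        JPMaQSFusionClient,
    )

    oauth_handler = FusionOAuth(
        client_id=os.environ["FUSION_CLIENT_ID"],  # hypothetical variable name
        client_secret=os.environ["FUSION_CLIENT_SECRET"],  # hypothetical variable name
    )
    # base_url and proxies are left at their documented defaults.
    return JPMaQSFusionClient(oauth_handler=oauth_handler)
```

With valid credentials, calling list_datasets() on the returned client should give the dataset listing documented for that method.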

list_datasets(product_id='JPMAQS', fields=['@id', 'identifier', 'title', 'description'], include_catalog=False, include_notifications=False, include_full_datasets=True, include_explorer_datasets=False, include_delta_datasets=False, **kwargs)

List datasets available in the JPMaQS product. Returns a DataFrame with the specified fields. By default, this excludes the metadata catalog, the notifications dataset, the Explorer datasets, and the Delta datasets.

Parameters:
  • product_id (str) – The product ID to filter datasets by. Default is “JPMAQS”.

  • fields (List[str]) – List of fields to include in the returned DataFrame.

  • include_catalog (bool) – If True, includes the metadata catalog dataset in the results.

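  • include_notifications (bool) – If True, includes the notifications dataset in the results.

  • include_full_datasets (bool) – If True, includes the full datasets in the results. Default is True.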

  • include_explorer_datasets (bool) – If True, includes the Explorer datasets in the results.

  • include_delta_datasets (bool) – If True, includes the Delta datasets in the results.

Returns:

A DataFrame containing information about the available datasets.

Return type:

pd.DataFrame

get_metadata_catalog(**kwargs)

Get the metadata catalog for JPMaQS. This is a special dataset that contains metadata such as dataset identifiers, ticker names, and descriptions. Returns a DataFrame with the metadata catalog.

Parameters:

**kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

A DataFrame containing the metadata catalog.

Return type:

pd.DataFrame

get_notifications_distribution(series_member=None, **kwargs)

Get the notifications distribution for JPMaQS. This dataset contains notifications about the update and refresh times of various series in the JPMaQS product.

Parameters:
  • series_member (Optional[str]) – The series member identifier for which to retrieve the distribution.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

A DataFrame containing the notifications distribution details.

Return type:

pd.DataFrame

list_tickers(**kwargs)

List all tickers available in the JPMaQS product. This method retrieves the metadata catalog and extracts the tickers from it.

Parameters:

**kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

A DataFrame containing the tickers and their metadata.

Return type:

pd.DataFrame

get_ticker_metadata(ticker, **kwargs)

Get metadata for a specific ticker in the JPMaQS product. This method retrieves the metadata catalog and filters it for the specified ticker.

Parameters:
  • ticker (str) – The ticker for which to retrieve metadata.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

A DataFrame containing the metadata for the specified ticker.

Return type:

pd.DataFrame

get_dataset_available_series(dataset, **kwargs)

Get the available series for a given dataset in the JPMaQS product. Typically, each JPMaQS dataset has one series for each business day (the JPMaQS release of that dataset for that day).

Parameters:
  • dataset (str) – The dataset identifier for which to retrieve series.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

A DataFrame containing the available series for the specified dataset.

Return type:

pd.DataFrame

get_seriesmember_distributions(dataset, seriesmember, **kwargs)

Get the available distributions for a given series member in a dataset.

Parameters:
  • dataset (str) – The dataset identifier for which to retrieve series member distributions.

  • seriesmember (str) – The series member identifier for which to retrieve distributions.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

A DataFrame containing the available distributions for the specified series member.

Return type:

pd.DataFrame

get_all_seriesmembers_for_all_datasets(include_catalog=False, include_notifications=False, include_full_datasets=True, include_explorer_datasets=False, include_delta_datasets=False, **kwargs)

Get all series members for all datasets in the JPMaQS product.

Parameters:
  • include_catalog (bool) – If True, includes the metadata catalog dataset in the results. Default is False.

  • include_notifications (bool) – If True, includes the notifications dataset in the results. Default is False.

  • include_explorer_datasets (bool) – If True, includes Explorer datasets in the results. Default is False.

  • include_delta_datasets (bool) – If True, includes Delta datasets in the results. Default is False.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

A DataFrame containing all series members for all datasets.

Return type:

pd.DataFrame

download_series_member_distribution(dataset, seriesmember, distribution='parquet', **kwargs)

Download the distribution for a given series member in a dataset.

Parameters:
  • dataset (str) – The dataset identifier for which to download the series member distribution.

  • seriesmember (str) – The series member identifier for which to download the distribution.

  • distribution (str) – The distribution format to download. Default is “parquet”.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

A DataFrame containing the distribution data for the specified series member.

Return type:

pd.DataFrame

download_series_member_distribution_to_disk(save_directory, dataset, seriesmember, distribution='parquet', qdf=False, as_csv=False, keep_raw_data=False, **kwargs)
Return type:

None

get_latest_seriesmember_identifier(dataset, **kwargs)

Get the identifier of the latest series member for a given dataset in the JPMaQS product.

Parameters:
  • dataset (str) – The dataset identifier for which to get the latest series member.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

The identifier of the latest series member for the specified dataset.

Return type:

str

download_latest_distribution(dataset, distribution='parquet', qdf=True, categorical=True, **kwargs)

Download the latest distribution for a given dataset in the JPMaQS product.

Parameters:
  • dataset (str) – The dataset identifier for which to download the latest distribution.

  • distribution (str) – The distribution format to download. Default is “parquet”.

  • qdf (bool) – If True, converts the DataFrame to a QuantamentalDataFrame.

  • categorical (bool) – If True, converts the DataFrame to a QuantamentalDataFrame with categorical data.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

A DataFrame containing the latest distribution for the specified dataset.

Return type:

pd.DataFrame

download_and_filter_series_member_distribution(dataset, seriesmember, tickers=None, start_date=None, end_date=None, qdf=False, distribution='parquet', **kwargs)

Download and filter the distribution for a given series member in a dataset.

Parameters:
  • dataset (str) – The dataset identifier for which to download the series member distribution.

  • seriesmember (str) – The series member identifier for which to download the distribution.

  • tickers (List[str]) – A list of tickers to filter the distribution by. If None, no filtering is done.

  • start_date (Optional[str]) – The start date to filter the distribution by (in ISO format). If None, no filtering is done.

  • end_date (Optional[str]) – The end date to filter the distribution by (in ISO format). If None, no filtering is done.

  • qdf (bool) – If True, converts the DataFrame to a QuantamentalDataFrame. Default is False.

  • distribution (str) – The distribution format to download. Default is “parquet”.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

A DataFrame containing the filtered distribution for the specified series member.

Return type:

pd.DataFrame

download_latest_distribution_to_disk(save_directory, dataset, distribution='parquet', qdf=False, as_csv=False, keep_raw_data=False, **kwargs)
Return type:

None

download_latest_delta_distribution(folder=None, qdf=False, as_csv=False, keep_raw_data=False, **kwargs)

Download the latest Delta distribution for all datasets in the JPMaQS product.

Parameters:
  • folder (str) – The folder where the Delta distribution will be saved. If None, a folder with the current date will be created in the current directory.

  • qdf (bool) – If True, converts the DataFrame to a QuantamentalDataFrame.

  • as_csv (bool) – If True, saves the downloaded datasets as CSV files. Default is False, with Parquet as the default format.

  • keep_raw_data (bool) – If True, keeps the raw data files after conversion. Default is False.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

A DataFrame containing the metadata catalog.

Return type:

pd.DataFrame

download_latest_full_snapshot(folder=None, qdf=False, include_catalog=False, include_notifications=False, include_explorer_datasets=False, include_delta_datasets=False, as_csv=False, keep_raw_data=False, datasets_list=None, **kwargs)

Download the latest full snapshot of all datasets in the JPMaQS product.

Parameters:
  • folder (str) – The folder where the snapshot will be saved. If None, a folder with the current date will be created in the current directory.

  • qdf (bool) – If True, converts the DataFrame to a QuantamentalDataFrame.

  • include_catalog (bool) – If True, includes the metadata catalog dataset in the snapshot. Default is False.

  • include_notifications (bool) – If True, includes notifications dataset in the snapshot. Default is False.

  • include_explorer_datasets (bool) – If True, includes Explorer datasets in the snapshot. Default is False.

  • include_delta_datasets (bool) – If True, includes Delta datasets in the snapshot. Default is False.

  • as_csv (bool) – If True, saves the downloaded datasets as CSV files. Default is False, with Parquet as the default format.

  • keep_raw_data (bool) – If True, keeps the raw data files after conversion. Default is False.

  • datasets_list (Optional[List[str]]) – A list of specific datasets to download. If None, all datasets specified using the include_* parameters will be downloaded.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

A DataFrame containing the metadata catalog.

Return type:

pd.DataFrame

download(folder=None, tickers=None, cids=None, xcats=None, metrics=['all'], start_date='2000-01-01', end_date=None, qdf=True, as_csv=False, **kwargs)

Download data for specified tickers, cids, or xcats from the JPMaQS product. This method downloads the latest full snapshots of the requested tickers’ respective datasets and filters them based on the provided parameters.

Parameters:
  • folder (str) – The folder where the downloaded data will be saved. If None, a dataframe will be returned without saving to disk.

  • tickers (Optional[List[str]]) – A list of tickers to download data for. This list will be concatenated with the tickers generated from the combination of cids and xcats.

  • cids (Optional[List[str]]) – A list of cids to download data for. This will be used to generate tickers in the format “cid_xcat”.

  • xcats (Optional[List[str]]) – A list of xcats to download data for. This will be used to generate tickers in the format “cid_xcat”.

  • metrics (List[str]) – A list of metrics to include in the downloaded data. Default is [“all”], which includes all available metrics.

  • start_date (str) – The start date for the data to be downloaded, in “YYYY-MM-DD” format. Default is “2000-01-01”.

  • end_date (Optional[str]) – The end date for the data to be downloaded, in “YYYY-MM-DD” format. If None, defaults to the current date.

  • qdf (bool) – If True, converts the DataFrame to a QuantamentalDataFrame. Default is True.

  • as_csv (bool) – If True, saves the downloaded datasets as CSV files. Default is False, with Parquet as the default format.

  • **kwargs (dict) – Additional keyword arguments to pass to the API request.

Returns:

A DataFrame containing the downloaded data for the specified tickers, cids, or xcats. If folder is specified, the data will also be saved to disk.

Return type:

pd.DataFrame
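
The way tickers, cids, and xcats combine can be sketched as follows. The expand_tickers helper is hypothetical, and two details are assumptions rather than documented behaviour: that every cid is paired with every xcat, and that duplicates are dropped while keeping first-seen order.

```python
from itertools import product
from typing import List, Optional


def expand_tickers(
    tickers: Optional[List[str]] = None,
    cids: Optional[List[str]] = None,
    xcats: Optional[List[str]] = None,
) -> List[str]:
    """Concatenate explicit tickers with every 'cid_xcat' combination."""
    combined = list(tickers or [])
    if cids and xcats:
        combined += [f"{cid}_{xcat}" for cid, xcat in product(cids, xcats)]
    # Remove duplicates, keeping the first occurrence of each ticker.
    return list(dict.fromkeys(combined))
```

For instance, expand_tickers(["USD_EQXR_NSA"], cids=["GBP", "EUR"], xcats=["FXXR_NSA"]) gives the three tickers USD_EQXR_NSA, GBP_FXXR_NSA, and EUR_FXXR_NSA.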