macrosynergy.management.simulate.simulate_vintage_data
Module with functionality for generating mock quantamental data vintages for testing purposes.
- class VintageData(ticker, cutoff='2020-12-31', release_lags=[15, 30], number_firsts=24, shortest=36, freq='M', start_value=100, trend_ar=5, sd_ar=3.4641016151377544, seasonal=None, added_dates=12)
Bases: object
Creates standardized dataframes of single-ticker vintages, covering both grade 1 and grade 2 vintage data.
- Parameters:
ticker (str) – ticker name
cutoff (str) – last possible release date. The format must be ‘%Y-%m-%d’. All other dates are calculated from this one. Default is end 2020.
release_lags (list) – list of integers in ascending order giving the lags, in calendar days, of the first, second, etc. release. Default is a first release after 15 days and a revision after 30 days. Release dates that fall on a weekend are delayed to the following Monday.
number_firsts (int) – number of first-release vintages in the simulated data set. Default is 24.
shortest (int) – number of observations in the first (shortest) vintage. Default is 36.
freq (str) – letter denoting the frequency of the vintage data. Must be one of ‘M’ (monthly, default), ‘Q’ (quarterly) or ‘W’ (weekly).
start_value (float) – expected first value of the random series. Default is 100.
trend_ar (float) – annualized trend. Default is a linear drift of 5% per year, applied as a percentage of the start value. If the start value is not positive, the linear trend is instead added as an absolute number.
sd_ar (float) – annualized standard deviation. Default is sqrt(12).
seasonal (float) – adds a seasonal pattern (a linear factor applied from low to high through the year), with the value denoting the average % seasonal factor through the year. Default is None. The seasonal pattern only makes sense for values that are strictly positive and are interpreted as indices.
added_dates (int) – number of added first release dates, used for grade 2 dataframe generation. Default is 12.
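The annualized parameters above can be related to per-period values with a short worked sketch. This is not the library's code: the helper `per_period_params`, the mapping of `freq` letters to 12, 4 and 52 periods per year, and the square-root-of-time scaling of the standard deviation are all assumptions made for illustration. Under those assumptions, the defaults give a monthly standard deviation of 1, which would explain the choice of sqrt(12).

```python
import math

# Assumed mapping from the documented `freq` letters to periods per year.
PERIODS_PER_YEAR = {"M": 12, "Q": 4, "W": 52}


def per_period_params(start_value=100.0, trend_ar=5.0,
                      sd_ar=math.sqrt(12), freq="M"):
    """Hypothetical helper: convert annualized trend and sd to per-period values."""
    n = PERIODS_PER_YEAR[freq]
    if start_value > 0:
        # Trend read as a percentage of the start value, spread over the year.
        drift = start_value * (trend_ar / 100) / n
    else:
        # Non-positive start value: trend read as an absolute number per year.
        drift = trend_ar / n
    # Assumption: standard deviation scales with the square root of time.
    sd = sd_ar / math.sqrt(n)
    return drift, sd


drift, sd = per_period_params()
print(round(drift, 4), round(sd, 4))  # 0.4167 1.0
```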
- static date_check(date_string)
Validates that the date passed is a valid timestamp expression and converts it to the required form ‘%Y-%m-%d’.
- Parameters:
date_string (str) – valid date expression, for instance “1st January, 2000”.
- Raises:
TypeError – if the date_string is not a string.
ValueError – if the date_string is not in the correct format.
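The validation contract described above can be sketched with the standard library alone. The function `date_check_sketch` and its short list of accepted formats are hypothetical stand-ins; the actual method may accept a wider range of timestamp expressions and may be implemented differently.

```python
import re
from datetime import datetime

# Formats this sketch accepts; the real method may accept more (assumption).
_FORMATS = ("%Y-%m-%d", "%d %B, %Y", "%d %B %Y")


def date_check_sketch(date_string):
    """Hypothetical stand-in for date_check: return the date as 'YYYY-MM-DD'."""
    if not isinstance(date_string, str):
        raise TypeError("date_string must be a string.")
    # Drop ordinal suffixes so '1st January, 2000' becomes '1 January, 2000'.
    cleaned = re.sub(r"(?<=\d)(st|nd|rd|th)\b", "", date_string.strip())
    for fmt in _FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"Cannot interpret {date_string!r} as a date.")


print(date_check_sketch("1st January, 2000"))  # 2000-01-01
```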
- seasonal_adj(obs_dates, seas_factors, values)
Seasonally adjusts the series. Economic data can vary according to the season.
- Parameters:
- Returns:
List of seasonally adjusted values.
- Return type:
List[float]
- make_graded(grading, upgrades=[])
Simulates an explicitly graded dataframe with a column ‘grading’.
- make_grade2()
Constructs a dataframe that pairs each observation date with its corresponding release date(s). The release dates are computed from the observation date and the lag(s) specified in the field “release_lags”.
- Returns:
The DataFrame with the additional release-date columns.
- Return type:
pd.DataFrame
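The pairing of observation dates with release dates can be illustrated as follows. This is a minimal sketch using plain tuples rather than a DataFrame; the helpers `release_date` and `release_table` are hypothetical, and it assumes each release date is the observation date plus the lag in calendar days, moved to the following Monday when it falls on a weekend.

```python
from datetime import date, timedelta


def release_date(obs_date, lag_days):
    """Observation date plus a calendar-day lag, rolled to Monday on weekends."""
    d = obs_date + timedelta(days=lag_days)
    if d.weekday() == 5:      # Saturday -> following Monday
        d += timedelta(days=2)
    elif d.weekday() == 6:    # Sunday -> following Monday
        d += timedelta(days=1)
    return d


def release_table(obs_dates, release_lags=(15, 30)):
    """Hypothetical helper: one (observation date, release date) row per lag."""
    rows = []
    for obs in obs_dates:
        for lag in release_lags:
            rows.append((obs, release_date(obs, lag)))
    return rows


rows = release_table([date(2020, 10, 31), date(2020, 11, 30)])
for obs, rel in rows:
    print(obs.isoformat(), rel.isoformat())
```

In this example the first release for the October 2020 observation would fall on Sunday 15 November and is therefore moved to Monday 16 November.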