data_describe.core.time

plot_time_series(df, col, decompose=False, model='additive', compute_backend=None, viz_backend=None, **kwargs)

Plots time series given a dataframe with datetime index. Statistics are computed using the statsmodels API.

stationarity_test(df, col, test='dickey-fuller', regression='c', compute_backend=None, **kwargs)

Perform stationarity tests to see if mean and variance are changing over time.

plot_autocorrelation(df, col, plot_type='acf', n_lags=40, fft=False, compute_backend=None, viz_backend=None, **kwargs)

Plots the autocorrelation or partial autocorrelation estimate for a column.

adf_test(timeseries, autolag: str = 'AIC', regression: str = 'c', **kwargs)

Compute the Augmented Dickey-Fuller (ADF) test for stationarity.

kpss_test(timeseries, regression: str = 'c', nlags: Optional[int] = None, **kwargs)

Compute the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test for stationarity.

figure_layout(title='Time Series', xlabel='Date', ylabel='Variable')

Generates the figure layout.

data_describe.core.time.plot_time_series(df, col, decompose=False, model='additive', compute_backend=None, viz_backend=None, **kwargs)

Plots time series given a dataframe with datetime index. Statistics are computed using the statsmodels API.

Parameters
  • df – The dataframe with datetime index

  • col (str or [str]) – Column of interest. Column datatype must be numerical

  • decompose – Set as True to decompose the timeseries with moving average. Defaults to False.

  • model – Specify seasonal component when decompose is True. Defaults to “additive”.

  • compute_backend – Select computing backend. Defaults to None (pandas).

  • viz_backend – Select visualization backend. Defaults to None (seaborn).

  • **kwargs – Keyword arguments

Raises
  • ValueError – Invalid input data type.

  • ValueError – `col` is not a list or string.

Returns

The visualization
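To make the `decompose` and `model` parameters concrete, the following is a minimal pure-Python sketch of a classical additive decomposition with a centered moving average (observed = trend + seasonal + residual). The function names `centered_moving_average` and `additive_decompose` are hypothetical and are not part of data_describe or statsmodels; the library's own computation goes through the statsmodels API and may differ in detail.

```python
from statistics import mean


def centered_moving_average(values, period):
    """Centered moving average; positions without a full window are None."""
    n = len(values)
    half = period // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = values[i - half:i + half + 1]
        if period % 2:
            # Odd period: plain average over the centered window.
            trend[i] = sum(window) / period
        else:
            # Even period: the two end points get half weight each.
            total = sum(window) - 0.5 * (window[0] + window[-1])
            trend[i] = total / period
    return trend


def additive_decompose(values, period):
    """Split values into trend, seasonal and residual parts (additive model).

    Assumes the series covers at least two full periods.
    """
    trend = centered_moving_average(values, period)
    detrended = [v - t if t is not None else None for v, t in zip(values, trend)]
    # Average the detrended values at each position within the cycle.
    seasonal_means = []
    for k in range(period):
        bucket = [d for d in detrended[k::period] if d is not None]
        seasonal_means.append(mean(bucket))
    # Center the seasonal component so it averages to zero over one period.
    offset = mean(seasonal_means)
    seasonal = [seasonal_means[i % period] - offset for i in range(len(values))]
    residual = [
        v - t - s if t is not None else None
        for v, t, s in zip(values, trend, seasonal)
    ]
    return trend, seasonal, residual


# Synthetic series: linear trend plus a repeating pattern of length 4.
pattern = [2, -1, -2, 1]
values = [i + pattern[i % 4] for i in range(16)]
trend, seasonal, residual = additive_decompose(values, period=4)
```

On this synthetic input the recovered trend is the underlying line and the residual is zero wherever the trend is defined. A multiplicative model would divide by the trend and seasonal parts instead of subtracting them.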

data_describe.core.time.stationarity_test(df, col, test='dickey-fuller', regression='c', compute_backend=None, **kwargs)

Perform stationarity tests to see if mean and variance are changing over time.

Backend uses statsmodels.tsa.stattools.adfuller or statsmodels.tsa.stattools.kpss.

Parameters
  • df – The dataframe. Must contain a datetime index

  • col – The feature of interest

  • test – Choice of stationarity test. “kpss” or “dickey-fuller”. Defaults to “dickey-fuller”.

  • regression – Constant and trend order to include in regression. Choose between ‘c’, ‘ct’, ‘ctt’, and ‘nc’. Defaults to ‘c’.

  • compute_backend – Select computing backend. Defaults to None (pandas).

  • **kwargs – Keyword arguments

Raises
  • ValueError – Invalid input data type.

  • ValueError – col not found in dataframe.

Returns

Pandas dataframe containing the statistics
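The two tests state opposite null hypotheses, which is easy to misread in the returned statistics: the ADF test's null is that the series has a unit root (non-stationary), while the KPSS test's null is that the series is stationary. The helper below is a hypothetical sketch for reading a p-value under each test. `interpret_stationarity` and the 0.05 threshold are assumptions for illustration, not part of data_describe.

```python
def interpret_stationarity(test, p_value, alpha=0.05):
    """Return a plain-language reading of a stationarity test p-value.

    `test` uses the same names as the `test` parameter documented above.
    """
    if test == "dickey-fuller":
        # ADF null hypothesis: the series has a unit root (non-stationary).
        if p_value < alpha:
            return "reject unit root: evidence of stationarity"
        return "cannot reject unit root: no evidence of stationarity"
    if test == "kpss":
        # KPSS null hypothesis: the series is stationary.
        if p_value < alpha:
            return "reject stationarity: evidence of non-stationarity"
        return "cannot reject stationarity"
    raise ValueError(f"unknown test: {test!r}")


print(interpret_stationarity("dickey-fuller", 0.01))
print(interpret_stationarity("kpss", 0.01))
```

A small p-value therefore points toward stationarity under ADF but away from it under KPSS, which is why the two tests are often run together.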

data_describe.core.time.plot_autocorrelation(df, col, plot_type='acf', n_lags=40, fft=False, compute_backend=None, viz_backend=None, **kwargs)

Plots the autocorrelation or partial autocorrelation estimate for a column.

Statistics are computed using the statsmodels API.

Parameters
  • df – The dataframe with datetime index

  • col – The feature of interest

  • plot_type – Choose between ‘acf’ and ‘pacf’. Defaults to ‘acf’.

  • n_lags – Number of lags to return autocorrelation for. Defaults to 40.

  • fft – If True, computes the ACF via the fast Fourier transform (FFT). Defaults to False.

  • compute_backend – Select computing backend. Defaults to None (pandas).

  • viz_backend – Select visualization backend. Defaults to None (seaborn).

  • **kwargs – Keyword arguments

Raises
  • ValueError – Invalid input data type.

  • ValueError – col not found in dataframe.

Returns

The visualization
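For readers unfamiliar with what the ACF plot displays, here is a minimal pure-Python sketch of the standard sample autocorrelation at lags 0 through `n_lags`. The function `sample_acf` is hypothetical and only illustrates the quantity. The library itself relies on statsmodels, which can also compute the same values via FFT.

```python
def sample_acf(values, n_lags=40):
    """Sample autocorrelation for lags 0..n_lags (capped at len(values) - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    n_lags = min(n_lags, n - 1)
    m = sum(values) / n
    dev = [v - m for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Lag-k autocovariance divided by the lag-0 autocovariance.
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(n_lags + 1)
    ]


acf = sample_acf([1, 2, 3, 4, 5])
```

The value at lag 0 is always 1. The partial autocorrelation (`plot_type='pacf'`) additionally removes the effect of intermediate lags and is not covered by this sketch.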

data_describe.core.time.adf_test(timeseries, autolag: str = 'AIC', regression: str = 'c', **kwargs)

Compute the Augmented Dickey-Fuller (ADF) test for stationarity.

Backend uses statsmodels.tsa.stattools.adfuller

Parameters
  • timeseries – The timeseries

  • autolag – Method to use when determining the number of lags. Choose between ‘AIC’, ‘BIC’, ‘t-stat’, and None. Defaults to ‘AIC’.

  • regression – Constant and trend order to include in regression. Choose between ‘c’, ‘ct’, ‘ctt’, and ‘nc’. Defaults to ‘c’.

  • **kwargs – Keyword arguments for adfuller

Returns

Pandas dataframe containing the statistics

data_describe.core.time.kpss_test(timeseries, regression: str = 'c', nlags: Optional[int] = None, **kwargs)

Compute the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test for stationarity.

Backend uses statsmodels.tsa.stattools.kpss

Parameters
  • timeseries – The timeseries

  • regression – The null hypothesis for the KPSS test. ‘c’ : The data is stationary around a constant (default). ‘ct’ : The data is stationary around a trend.

  • nlags – Indicates the number of lags to be used. Defaults to None.

  • **kwargs – Keyword arguments for kpss

Returns

Pandas dataframe containing the statistics

data_describe.core.time.figure_layout(title='Time Series', xlabel='Date', ylabel='Variable')

Generates the figure layout.

Parameters
  • title – Title of the plot. Defaults to “Time Series”.

  • xlabel – x-axis label. Defaults to “Date”.

  • ylabel – y-axis label. Defaults to “Variable”.

Returns

The plotly layout
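As a rough illustration of what such a layout carries, the sketch below builds a plain dictionary using plotly's standard layout keys (`title`, `xaxis`, `yaxis`). The function `layout_sketch` is hypothetical; the actual return type and contents of `figure_layout` are not shown in this reference and may differ.

```python
def layout_sketch(title="Time Series", xlabel="Date", ylabel="Variable"):
    """Build a dict in the shape plotly accepts for a figure layout."""
    return {
        "title": {"text": title},
        "xaxis": {"title": {"text": xlabel}},
        "yaxis": {"title": {"text": ylabel}},
    }


layout = layout_sketch(ylabel="Sales")
```

A dictionary of this shape can be passed as the `layout` argument when constructing a plotly figure.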