data_describe.core.time
plot_time_series – Plots time series given a dataframe with datetime index. Statistics are computed using the statsmodels API.
stationarity_test – Perform stationarity tests to see if mean and variance are changing over time.
plot_autocorrelation – Correlation estimate using partial autocorrelation or autocorrelation.
adf_test – Compute the Augmented Dickey-Fuller (ADF) test for stationarity.
kpss_test – Compute the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test for stationarity.
figure_layout – Generates the figure layout.
-
data_describe.core.time.
plot_time_series
(df, col, decompose=False, model='additive', compute_backend=None, viz_backend=None, **kwargs)
Plots time series given a dataframe with datetime index. Statistics are computed using the statsmodels API.
- Parameters
df – The dataframe with datetime index
col (str or [str]) – Column or list of columns of interest. The column datatype must be numeric.
decompose – Set to True to decompose the time series using moving averages. Defaults to False.
model – Type of seasonal component to use when decompose is True. Defaults to “additive”.
compute_backend – Select computing backend. Defaults to None (pandas).
viz_backend – Select visualization backend. Defaults to None (seaborn).
**kwargs – Keyword arguments
- Raises
ValueError – Invalid input data type.
ValueError – col is not a list or string.
- Returns
The visualization
-
data_describe.core.time.
stationarity_test
(df, col, test='dickey-fuller', regression='c', compute_backend=None, **kwargs)
Perform stationarity tests to see if mean and variance are changing over time.
Backend uses statsmodels.tsa.stattools.adfuller or statsmodels.tsa.stattools.kpss.
- Parameters
df – The dataframe. Must contain a datetime index
col – The feature of interest
test – Choice of stationarity test. “kpss” or “dickey-fuller”. Defaults to “dickey-fuller”.
regression – Constant and trend order to include in regression. Choose between ‘c’, ‘ct’, ‘ctt’, and ‘nc’. Defaults to ‘c’.
compute_backend – Select computing backend. Defaults to None (pandas).
**kwargs – Keyword arguments
- Raises
ValueError – Invalid input data type.
ValueError – col not found in dataframe.
- Returns
Pandas dataframe containing the statistics
-
data_describe.core.time.
plot_autocorrelation
(df, col, plot_type='acf', n_lags=40, fft=False, compute_backend=None, viz_backend=None, **kwargs)
Correlation estimate using partial autocorrelation or autocorrelation.
Statistics are computed using the statsmodels API.
- Parameters
df – The dataframe with datetime index
col – The feature of interest
plot_type – Choose between ‘acf’ and ‘pacf’. Defaults to ‘acf’.
n_lags – Number of lags to return autocorrelation for. Defaults to 40.
fft – If True, computes ACF via fast Fourier transform (FFT). Defaults to False.
compute_backend – Select computing backend. Defaults to None (pandas).
viz_backend – Select visualization backend. Defaults to None (seaborn).
**kwargs – Keyword arguments
- Raises
ValueError – Invalid input data type.
ValueError – col not found in dataframe.
- Returns
The visualization
-
data_describe.core.time.
adf_test
(timeseries, autolag: str = 'AIC', regression: str = 'c', **kwargs)
Compute the Augmented Dickey-Fuller (ADF) test for stationarity.
Backend uses statsmodels.tsa.stattools.adfuller
- Parameters
timeseries – The timeseries
autolag – Method to use when determining the number of lags. Choose between ‘AIC’, ‘BIC’, ‘t-stat’, and None. Defaults to ‘AIC’.
regression – Constant and trend order to include in regression. Choose between ‘c’, ‘ct’, ‘ctt’, and ‘nc’. Defaults to ‘c’.
**kwargs – Keyword arguments for adfuller
- Returns
Pandas dataframe containing the statistics
-
data_describe.core.time.
kpss_test
(timeseries, regression: str = 'c', nlags: Optional[int] = None, **kwargs)
Compute the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test for stationarity.
Backend uses statsmodels.tsa.stattools.kpss
- Parameters
timeseries – The timeseries
regression – The null hypothesis for the KPSS test. ‘c’ : The data is stationary around a constant (default). ‘ct’ : The data is stationary around a trend.
nlags – Indicates the number of lags to be used. Defaults to None.
**kwargs – Keyword arguments for kpss
- Returns
Pandas dataframe containing the statistics
-
data_describe.core.time.
figure_layout
(title='Time Series', xlabel='Date', ylabel='Variable')
Generates the figure layout.
- Parameters
title – Title of the plot. Defaults to “Time Series”.
xlabel – x-axis label. Defaults to “Date”.
ylabel – y-axis label. Defaults to “Variable”.
- Returns
The plotly layout