Time Series¶
[1]:
import pandas as pd
import matplotlib.pyplot as plt
import data_describe as dd
from data_describe.core.time import plot_autocorrelation, stationarity_test
[2]:
df = pd.read_csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-total-female-births.csv")
df['Date'] = pd.to_datetime(df['Date'])
df['Births_Multiplier'] = df['Births'] * 1.16
df.set_index("Date", inplace=True)
df.head(2)
[2]:
Date | Births | Births_Multiplier |
---|---|---|
1959-01-01 | 35 | 40.60 |
1959-01-02 | 32 | 37.12 |
Plot time series¶
[3]:
dd.plot_time_series(df, col="Births")
[3]:
<AxesSubplot:xlabel='Date', ylabel='Births'>
Plot interactive time series¶
[4]:
dd.plot_time_series(df, col=["Births", "Births_Multiplier"], viz_backend="plotly")
C:\workspace\data-describe\data_describe\compat\_notebook.py:32: JupyterPlotlyWarning:
Are you running in Jupyter Lab? The extension "jupyterlab-plotly" was not found and is required for Plotly visualizations in Jupyter Lab.
Plot decomposition¶
[5]:
dd.plot_time_series(df, col="Births", decompose=True)
[5]:
[6]:
dd.plot_time_series(df, col="Births", decompose=True, viz_backend="plotly")
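For intuition about what a decomposition plot shows, here is a minimal sketch of a classical additive decomposition (trend + seasonal + residual) in plain Python. It is an illustration under stated assumptions, not necessarily what `decompose=True` computes: the helper name `decompose_additive`, the weekly period of 7, and the toy series are all hypothetical, and the sketch only handles an odd period.

```python
def decompose_additive(values, period):
    """Hypothetical helper: classical additive decomposition (odd period only)."""
    if period % 2 == 0:
        raise ValueError("this sketch only handles an odd period")
    n = len(values)
    half = period // 2

    # Trend: centered moving average; undefined (None) at both edges.
    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = sum(values[t - half : t + half + 1]) / period

    # Seasonal: mean detrended value at each position in the cycle,
    # shifted so the seasonal component sums to zero over one period.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(values[t] - trend[t])
    phase_means = [sum(b) / len(b) for b in buckets]
    offset = sum(phase_means) / period
    seasonal = [phase_means[t % period] - offset for t in range(n)]

    # Residual: whatever trend and seasonal do not explain.
    resid = [
        None if trend[t] is None else values[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, resid


# Toy series (assumption): linear trend plus a zero-sum weekly pattern.
pattern = [3, -1, -2, 0, 1, -3, 2]
values = [10 + 0.5 * t + pattern[t % 7] for t in range(28)]
trend, seasonal, resid = decompose_additive(values, period=7)
print(seasonal[:7])
```

On the toy series the seasonal component recovers the weekly pattern and the residual is numerically zero; on real data such as `df["Births"].tolist()` the residual carries the unexplained variation.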
Perform Stationarity Tests¶
[7]:
stationarity_test(df, col='Births', test="dickey-fuller")
[7]:
 | stats |
---|---|
Test Statistic | -4.808291 |
p-value | 0.000052 |
Lags Used | 6.000000 |
Number of Observations Used | 358.000000 |
Critical Value (1%) | -3.448749 |
Critical Value (5%) | -2.869647 |
Critical Value (10%) | -2.571089 |
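To read the table above: the null hypothesis of a Dickey-Fuller test is that the series has a unit root (is non-stationary), so a test statistic below a critical value rejects that hypothesis at the corresponding level. A minimal sketch using the values printed above; the helper `rejects_unit_root` is a hypothetical name, not part of data-describe.

```python
# Values copied from the stationarity_test output above.
adf_result = {
    "Test Statistic": -4.808291,
    "p-value": 0.000052,
    "Critical Value (1%)": -3.448749,
    "Critical Value (5%)": -2.869647,
    "Critical Value (10%)": -2.571089,
}


def rejects_unit_root(result, level="5%"):
    """Hypothetical helper: True if the statistic is below the critical value."""
    return result["Test Statistic"] < result[f"Critical Value ({level})"]


for level in ("1%", "5%", "10%"):
    print(level, rejects_unit_root(adf_result, level))
```

Here the statistic is below all three critical values, which is consistent with the small p-value: the test gives no evidence of a unit root in `Births`.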
Plot ACF¶
[8]:
# The seaborn backend is used by default
plot_autocorrelation(df, col='Births', plot_type="acf")
[8]:
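As a rough guide to what the ACF plot displays, the sample autocorrelation at lag k is the lag-k autocovariance divided by the variance. The sketch below computes it in plain Python; `sample_acf` and the toy series are hypothetical, and the plotting function may normalise or estimate the values differently.

```python
def sample_acf(values, n_lags):
    """Hypothetical helper: sample autocorrelations r_0 .. r_{n_lags}."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)  # n times the sample variance
    acf = []
    for k in range(n_lags + 1):
        # Sum of products of observations that are k steps apart.
        num = sum(centered[t] * centered[t + k] for t in range(n - k))
        acf.append(num / denom)
    return acf


# Toy data (assumption); with the notebook's data, pass df["Births"].tolist().
series = [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0]
acf = sample_acf(series, n_lags=3)
print(acf)
```

The lag-0 value is always 1, and every other value lies between -1 and 1.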
Plot PACF¶
[9]:
plot_autocorrelation(df, col="Births", plot_type="pacf", n_lags=10, viz_backend="plotly")
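The partial autocorrelation at lag k measures the correlation that remains at that lag once the shorter lags are accounted for. One textbook way to obtain it from the autocorrelations is the Durbin-Levinson recursion, sketched below. The function name `pacf_durbin_levinson` is hypothetical, and the plotting call above may rely on a different estimator.

```python
def pacf_durbin_levinson(acf, n_lags):
    """Hypothetical helper: partial autocorrelations for lags 0 .. n_lags.

    `acf` must hold autocorrelations r_0 .. r_{n_lags}, with r_0 == 1.
    """
    pacf = [1.0]
    prev = []  # coefficients phi_{k-1,1} .. phi_{k-1,k-1}
    for k in range(1, n_lags + 1):
        num = acf[k] - sum(prev[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        current = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        current.append(phi_kk)
        pacf.append(phi_kk)
        prev = current
    return pacf


# Theoretical ACF of an AR(1) process with coefficient 0.5 (assumption).
ar1_acf = [0.5 ** k for k in range(5)]
pacf = pacf_durbin_levinson(ar1_acf, n_lags=4)
print(pacf)
```

For this AR(1) example the partial autocorrelation equals the coefficient at lag 1 and vanishes at higher lags, which is the cut-off pattern a PACF plot is typically used to spot.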