AlainOUYANG/Stationary Test (Augmented Dicky-Fuller Test & KPSS Test).md

## Stationary Test (Augmented Dicky-Fuller Test & KPSS Test).md

      
Source: https://www.statsmodels.org/dev/examples/notebooks/generated/stationarity_detrending_adf_kpss.html

Augmented Dickey-Fuller Test

import pandas as pd
from statsmodels.tsa.stattools import adfuller

def adf_test(timeseries):
    print("Dickey-Fuller Test:")
    print("\tNull hypothesis: The series has unit root.\n"
          "\tAlternate Hypothesis: The series has no unit root.\n")
    dftest = adfuller(timeseries, autolag="AIC")
    dfoutput = pd.Series(
        dftest[0:4],
        index=[
            "Test Statistic",
            "p-value",
            "#Lags Used",
            "Number of Observations Used",
        ],
    )
    for key, value in dftest[4].items():
        dfoutput["Critical Value (%s)" % key] = value

    print("Results of Dickey-Fuller Test:")
    print(dfoutput)

    if dfoutput.loc['p-value'] < (0.05 - 0.005):
        print(f'\np-value = {dfoutput.loc["p-value"]:.4f} < 0.05, reject null, series has no unit root, stationary.')
    elif dfoutput.loc['p-value'] > (0.05 + 0.005):
        print(f'\np-value = {dfoutput.loc["p-value"]:.4f} > 0.05, cannot reject null, series has unit root, non-stationary.')
    else:
        print(f'\np-value = {dfoutput.loc["p-value"]:.4f} is close to 0.05, so the 5% critical value is used.')
        # adfuller labels its critical values "1%", "5%" and "10%", so the key is "Critical Value (5%)".
        if dfoutput["Test Statistic"] < dfoutput["Critical Value (5%)"]:
            print(f'\nTest Statistic ({dfoutput["Test Statistic"]:.4f}) < Critical Value (5%) ({dfoutput["Critical Value (5%)"]:.4f}), '
                  f'reject null, series has no unit root, stationary.')
        else:
            print(f'\nTest Statistic ({dfoutput["Test Statistic"]:.4f}) >= Critical Value (5%) ({dfoutput["Critical Value (5%)"]:.4f}), '
                  f'cannot reject null, series has unit root, non-stationary.')
The ADF test checks for a unit root in the series, which helps in determining whether the series is stationary.
The null and alternate hypotheses of this test are:

Null Hypothesis: The series has a unit root.
Alternate Hypothesis: The series has no unit root.
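
For reference, the regression behind these hypotheses can be written out. This is the standard textbook form, not something stated in the original notes, and the symbols ($\alpha$, $\gamma$, $\delta_i$, $p$) are notation chosen here. It assumes the constant-only specification, which is what `adfuller` uses by default (`regression="c"`), with the lag order $p$ selected by `autolag="AIC"`.

```latex
\Delta y_t = \alpha + \gamma\, y_{t-1} + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \varepsilon_t,
\qquad
H_0 : \gamma = 0 \quad \text{(unit root)}
\qquad
H_1 : \gamma < 0 \quad \text{(no unit root)}
```

The test statistic is the t-ratio of $\hat{\gamma}$, and the test is left-tailed: sufficiently negative values reject the null.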

First consider the p-value: if the p-value > 0.05, the null cannot be rejected and the series is treated as non-stationary; if the p-value < 0.05, the null is rejected and the series is treated as stationary.
If the p-value is close to 0.05, compare the test statistic with the 5% critical value instead:

If the test statistic < critical value (5%), reject the null: the series is treated as stationary.
If the test statistic > critical value (5%), the null cannot be rejected and the series is treated as non-stationary.

Failing to reject the null is evidence, not proof, that the series is non-stationary.
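
The decision rule above can be isolated from the printing code as a small pure function. This is only a sketch: the name `adf_decision` and its parameters are hypothetical, the 0.005 band mirrors the threshold used in `adf_test`, and the numbers in the calls are made up for illustration rather than taken from a real series.

```python
def adf_decision(test_stat, p_value, crit_5pct, alpha=0.05, band=0.005):
    """Return True if the ADF null (unit root) is rejected.

    Hypothetical helper mirroring the rule in adf_test: use the p-value
    unless it lies within `band` of `alpha`, then use the 5% critical value.
    """
    if p_value < alpha - band:
        return True   # clearly significant: reject the unit-root null
    if p_value > alpha + band:
        return False  # clearly not significant: cannot reject the null
    # Borderline p-value: the ADF test is left-tailed, so a statistic
    # more negative than the critical value rejects the null.
    return test_stat < crit_5pct


# Illustrative inputs only (not output from a real series).
print(adf_decision(-3.90, 0.002, -2.86))  # True: treated as stationary
print(adf_decision(-1.20, 0.670, -2.86))  # False: treated as non-stationary
print(adf_decision(-2.87, 0.049, -2.86))  # True: decided by the critical value
```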

KPSS test

import pandas as pd
from statsmodels.tsa.stattools import kpss

def kpss_test(timeseries):
    print("KPSS Test:")
    print("\tNull hypothesis: The process is trend stationary.\n"
          "\tAlternate Hypothesis: The series has a unit root (series is not stationary).\n")
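    # regression="c" tests stationarity around a constant level; pass regression="ct"
    # to test stationarity around a deterministic trend.
    kpsstest = kpss(timeseries, regression="c", nlags="auto")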
    kpss_output = pd.Series(
        kpsstest[0:3], index=["Test Statistic", "p-value", "Lags Used"]
    )
    for key, value in kpsstest[3].items():
        kpss_output["Critical Value (%s)" % key] = value

    print("Results of KPSS Test:")
    print(kpss_output)

    if kpss_output.loc['p-value'] < (0.05 - 0.005):
        print(f'\np-value = {kpss_output.loc["p-value"]:.4f} < 0.05, reject null, series has unit root, non-stationary.')
    elif kpss_output.loc['p-value'] > (0.05 + 0.005):
        print(f'\np-value = {kpss_output.loc["p-value"]:.4f} > 0.05, cannot reject null, series has no unit root, trend stationary.')
    else:
        print(f'\np-value = {kpss_output.loc["p-value"]:.4f} is close to 0.05, so the 5% critical value is used.')
        # kpss labels its critical values "10%", "5%", "2.5%" and "1%".
        # The KPSS test is right-tailed: a statistic above the critical value rejects the null.
        if kpss_output["Test Statistic"] > kpss_output["Critical Value (5%)"]:
            print(f'\nTest Statistic ({kpss_output["Test Statistic"]:.4f}) > Critical Value (5%) ({kpss_output["Critical Value (5%)"]:.4f}), '
                  f'reject null, series has unit root, non-stationary.')
        else:
            print(f'\nTest Statistic ({kpss_output["Test Statistic"]:.4f}) <= Critical Value (5%) ({kpss_output["Critical Value (5%)"]:.4f}), '
                  f'cannot reject null, series has no unit root, trend stationary.')
KPSS is another test for checking the stationarity of a time series.
The null and alternate hypotheses of the KPSS test are the opposite of those of the ADF test:

Null Hypothesis: The process is trend stationary.
Alternate Hypothesis: The series has a unit root (series is not stationary).
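
For reference, the model behind these hypotheses can be written out. This is the standard formulation of the KPSS test, not something stated in the original notes, and the symbols are notation chosen here. The series is decomposed into a deterministic trend, a random walk $r_t$ and a stationary error:

```latex
y_t = \xi t + r_t + \varepsilon_t, \qquad r_t = r_{t-1} + u_t, \qquad u_t \sim (0, \sigma_u^2)

H_0 : \sigma_u^2 = 0 \quad \text{(no random-walk component)}
\qquad
H_1 : \sigma_u^2 > 0

\eta = \frac{1}{T^2\, \hat{\sigma}^2} \sum_{t=1}^{T} S_t^2, \qquad S_t = \sum_{s=1}^{t} e_s
```

Here $e_s$ are the residuals from regressing $y_t$ on a constant (`regression="c"`, which sets $\xi = 0$) or on a constant and a trend (`regression="ct"`), and $\hat{\sigma}^2$ is an estimate of their long-run variance. Large values of $\eta$ reject the null, so the test is right-tailed.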

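First consider the p-value: if the p-value > 0.05, the null cannot be rejected and the series is treated as stationary; if the p-value < 0.05, the null is rejected and the series is treated as non-stationary.
If the p-value is close to 0.05, compare the test statistic with the 5% critical value instead:

If the test statistic < critical value (5%), the null cannot be rejected and the series is treated as stationary.
If the test statistic > critical value (5%), reject the null: the series is treated as non-stationary.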
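
A sketch of the KPSS decision rule as a pure function makes the direction of the comparison explicit: it is the reverse of the ADF rule. The name `kpss_decision` and its parameters are hypothetical, and the inputs are made-up numbers. Note that `statsmodels` interpolates KPSS p-values from a table and reports a boundary value outside the range 0.01 to 0.1, so a reported p-value of exactly 0.01 or 0.1 should be read as "at most" or "at least" that value.

```python
def kpss_decision(test_stat, p_value, crit_5pct, alpha=0.05, band=0.005):
    """Return True if the KPSS null (stationarity) is rejected.

    Hypothetical helper: use the p-value unless it lies within `band`
    of `alpha`, then fall back to the 5% critical value.
    """
    if p_value < alpha - band:
        return True   # clearly significant: reject stationarity
    if p_value > alpha + band:
        return False  # clearly not significant: cannot reject stationarity
    # Borderline p-value: the KPSS test is right-tailed, so a statistic
    # larger than the critical value rejects the null.
    return test_stat > crit_5pct


# Illustrative inputs only (not output from a real series).
print(kpss_decision(0.80, 0.010, 0.463))  # True: treated as non-stationary
print(kpss_decision(0.10, 0.100, 0.463))  # False: treated as stationary
print(kpss_decision(0.47, 0.048, 0.463))  # True: decided by the critical value
```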


Notes on stationarity tests

It is better to apply both tests: their null hypotheses are opposite, so agreement between them gives stronger evidence about stationarity than either test alone.
Possible outcomes of applying both stationarity tests are as follows:

Case 1: Both tests conclude that the series is not stationary

The series is not stationary.


Case 2: Both tests conclude that the series is stationary

The series is stationary.


Case 3: KPSS indicates stationarity and ADF indicates non-stationarity

The series is trend stationary.
The trend needs to be removed to make the series strictly stationary.
The detrended series is checked for stationarity.
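
One simple way to remove a trend, assuming it is linear, is to fit a least-squares line against the time index and keep the residuals. The sketch below uses only the standard library; `detrend_linear` is a hypothetical helper name, and a linear trend is an assumption rather than something the notes specify.

```python
def detrend_linear(values):
    """Subtract the least-squares line a + b*t, with t = 0, 1, ..., n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    # Closed-form ordinary least squares for a single regressor.
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [y - (intercept + slope * t) for t, y in enumerate(values)]


# A perfectly linear series: every residual is zero up to rounding.
series = [2.0 + 0.5 * t for t in range(6)]
residuals = detrend_linear(series)
print(residuals)
```

The residuals would then be passed to the ADF and KPSS tests again in place of the original series.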


Case 4: KPSS indicates non-stationarity and ADF indicates stationarity

The series is difference stationary.
Differencing should be used to make the series stationary.
The differenced series is checked for stationarity.
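
The four cases can be summarized as a lookup on the two test outcomes, and the differencing step in Case 4 is a one-liner. Both functions below are hypothetical helpers written for illustration, using only the standard library. `classify` takes two booleans: whether the ADF test rejected its null (unit root) and whether the KPSS test rejected its null (stationarity).

```python
def classify(adf_rejects_unit_root, kpss_rejects_stationarity):
    """Map the two test outcomes to the four cases described above."""
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "not stationary"                                    # Case 1
    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"                                        # Case 2
    if not adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "trend stationary: detrend, then retest"            # Case 3
    return "difference stationary: difference, then retest"        # Case 4


def difference(values, lag=1):
    """Return values[t] - values[t - lag] for every t >= lag."""
    if lag < 1 or lag >= len(values):
        raise ValueError("lag must be between 1 and len(values) - 1")
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


print(classify(True, True))       # difference stationary: difference, then retest
print(difference([1, 3, 6, 10]))  # [2, 3, 4]
```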