
verticapy.machine_learning.model_selection.statistical_tests.tsa.het_arch¶
- verticapy.machine_learning.model_selection.statistical_tests.tsa.het_arch(input_relation: Annotated[str | vDataFrame, ''], eps: str, ts: str, by: Annotated[str | list[str], 'STRING representing one column or a list of columns'] | None = None, p: int = 1) → tuple[float, float, float, float] ¶
Engle’s Test for Autoregressive Conditional Heteroscedasticity (ARCH).
Parameters¶
- input_relation: SQLRelation
Input relation.
- eps: str
Input residual vDataColumn.
- ts: str
vDataColumn used as timeline to order the data. It can be a numerical or date-like type (date, datetime, timestamp…) vDataColumn.
- by: SQLColumns, optional
vDataColumns used in the partition.
- p: int, optional
Number of lags to consider in the test.
Returns¶
- tuple
Lagrange Multiplier statistic, LM p-value, F statistic, F p-value.
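To make the two returned statistics more concrete, here is a minimal NumPy sketch of the textbook form of Engle's test: regress the squared residuals on a constant and their first p lags, then derive the LM and F statistics from the R² of that regression. The function name arch_lm_sketch is hypothetical, and this is only an illustration under those assumptions; VerticaPy computes the test in-database and its internals may differ. The p-values are omitted here because they would require a chi-squared and an F distribution (for example from SciPy).

```python
import numpy as np

def arch_lm_sketch(eps, p=1):
    """Illustrative only: textbook Engle ARCH test statistics (hypothetical helper)."""
    sq = np.asarray(eps, dtype=float) ** 2      # squared residuals
    n = len(sq) - p                             # usable observations after lagging
    y = sq[p:]                                  # eps_t^2 for t = p .. N-1
    # Design matrix: intercept plus the lags eps_{t-1}^2 .. eps_{t-p}^2
    lags = [sq[p - k : len(sq) - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(n)] + lags)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
    lm = n * r2                                 # LM statistic, chi-squared with p dof under H0
    f = (r2 / p) / ((1.0 - r2) / (n - p - 1))   # F statistic with (p, n - p - 1) dof
    return lm, f

rng = np.random.default_rng(0)
lm_stat, f_stat = arch_lm_sketch(rng.normal(5, 1, 200), p=5)
print(lm_stat, f_stat)
```

Under the null hypothesis of no ARCH effects, the lagged squared residuals explain almost nothing, so R² and both statistics stay small.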
Examples¶
Initialization¶
Let’s try this test on a dummy dataset that has the following elements:
- A value of interest that has noise
- Time-stamp data
Before we begin we can import the necessary libraries:
import verticapy as vp
import numpy as np
Example 1: Random¶
Now we can create the dummy dataset:
# Initialization
N = 50 # Number of Rows.
days = list(range(N))
vals = [np.random.normal(5) for x in days]

# vDataFrame
vdf = vp.vDataFrame(
    {
        "day": days,
        "eps": vals,
    }
)
Let us plot the distribution of noise with respect to time:
vdf.scatter(["day", "eps"])
Test¶
Now we can apply Engle's ARCH test:
from verticapy.machine_learning.model_selection.statistical_tests import het_arch

het_arch(input_relation = vdf, ts = "day", eps = "eps", p = 5)
Out[8]: (5.792308772293665, 0.3269556338536383, 1.1523251435923334, 0.3497782532314971)
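Both p-values (about 0.33 for the LM statistic and 0.35 for the F statistic) are well above the usual 0.05 threshold, so we cannot reject the null hypothesis of no ARCH effects: the squared residuals show no relationship with any of the 5 lags beyond what chance would produce.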
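As a small sketch of how the returned tuple might be read in code, the snippet below unpacks the four values from the output above and compares the two p-values against a significance level. The 0.05 threshold and the variable names are assumptions chosen for illustration, not part of the VerticaPy API.

```python
# Values copied from the het_arch output shown above.
result = (5.792308772293665, 0.3269556338536383, 1.1523251435923334, 0.3497782532314971)
lm_stat, lm_pvalue, f_stat, f_pvalue = result

alpha = 0.05  # assumed significance level
# Reject "no ARCH effects" only if either variant of the test is significant.
reject_null = lm_pvalue < alpha or f_pvalue < alpha
print("ARCH effects detected" if reject_null else "No evidence of ARCH effects")
```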
Now let us contrast it with another example where the lags are related: