verticapy.machine_learning.model_selection.statistical_tests.tsa.cochrane_orcutt#
- verticapy.machine_learning.model_selection.statistical_tests.tsa.cochrane_orcutt(model: LinearModel, input_relation: str | vDataFrame, ts: str, prais_winsten: bool = False, drop_tmp_model: bool = True) LinearModel #
Performs a Cochrane-Orcutt estimation.
Parameters#
- model: LinearModel
Linear regression object.
- input_relation: SQLRelation
Input relation.
- ts: str
vDataColumn of numeric or date-like type (date, datetime, timestamp, etc.) used as the timeline and to order the data.
- prais_winsten: bool, optional
If True, retains the first observation of the time series, increasing precision and efficiency. This configuration is called the Prais–Winsten estimation.
- drop_tmp_model: bool, optional
If True, drops the temporary model.
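The effect of prais_winsten can be illustrated with the textbook AR(1) quasi-differencing step, y*_t = y_t - rho * y_(t-1). The sketch below is a standard-library illustration only, not VerticaPy's internal code; the helper name quasi_difference is hypothetical, and rho is assumed to be already estimated. Without Prais-Winsten the first observation is lost; with it, the first observation is kept and rescaled by sqrt(1 - rho^2).

```python
import math


def quasi_difference(values, rho, prais_winsten=False):
    """Quasi-difference a series that is already ordered by time.

    Hypothetical helper for illustration; not part of VerticaPy.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # Cochrane-Orcutt: y*_t = y_t - rho * y_(t-1), defined from the second row on.
    transformed = [values[t] - rho * values[t - 1] for t in range(1, len(values))]
    if prais_winsten:
        # Prais-Winsten: keep the first row, rescaled by sqrt(1 - rho^2).
        transformed.insert(0, math.sqrt(1.0 - rho ** 2) * values[0])
    return transformed


series = [1.0, 2.0, 3.0]
co_rows = quasi_difference(series, rho=0.5)
pw_rows = quasi_difference(series, rho=0.5, prais_winsten=True)
print(len(co_rows), len(pw_rows))
```

In a regression setting the same transformation would be applied to the response and to every predictor before refitting.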
Returns#
- model_tmp
A LinearModel with the following information stored as attributes:
- intercept_:
Model’s intercept.
- coef_:
Model’s coefficients.
- pho_:
Cochrane-Orcutt pho (rho), the estimated autocorrelation coefficient of the residuals.
- anova_table_:
ANOVA table.
- r2_:
R2 score.
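As a rough guide to what the pho_ attribute represents, the following sketch computes a common textbook estimator of the first-order autocorrelation of regression residuals: the regression through the origin of each residual on the previous one. This is an assumption about the general technique, not a statement of how VerticaPy computes the value; the function name estimate_rho is hypothetical.

```python
def estimate_rho(residuals):
    """Estimate the lag-1 autocorrelation of time-ordered residuals.

    Hypothetical helper for illustration; not part of VerticaPy.
    """
    pairs = list(zip(residuals[1:], residuals[:-1]))
    numerator = sum(current * previous for current, previous in pairs)
    denominator = sum(previous ** 2 for _, previous in pairs)
    if denominator == 0:
        raise ValueError("residuals have no variation; rho is undefined")
    return numerator / denominator


# Residuals that halve at every step have a lag-1 coefficient of 0.5.
rho = estimate_rho([1.0, 0.5, 0.25, 0.125])
print(rho)
```

A value near 0 suggests little first-order autocorrelation, while a value near 1 or -1 suggests strong positive or negative autocorrelation.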
Examples#
Initialization#
Let’s try this estimation on a dummy dataset that has the following elements:
- A value of interest whose noise is related to time
- Timestamp data
Before we begin we can import the necessary libraries:
import verticapy as vp
import numpy as np
Example 1: Trend#
Now we can create the dummy dataset:
# Initialization
N = 30 # Number of Rows.
days = list(range(N))
y_val = [2 * x + np.random.normal(scale = 4 * x) for x in days]

# vDataFrame
vdf = vp.vDataFrame(
    {
        "day": days,
        "y1": y_val,
    }
)
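In the dataset above, each noise term is drawn independently, with a scale that grows with the day. As a hedged alternative for experimentation, the sketch below generates noise that is correlated from one day to the next, using an AR(1) process e_t = rho * e_(t-1) + u_t. It relies only on the standard library; the helper ar1_noise, the variable y_val_ar1, and the chosen rho, scale, and seed are all illustrative assumptions.

```python
import random


def ar1_noise(n, rho, scale, seed=0):
    """Generate n AR(1) noise values: e_t = rho * e_(t-1) + u_t.

    Hypothetical helper for illustration; u_t is Gaussian with the given scale.
    """
    rng = random.Random(seed)  # seeded so the series is reproducible
    noise = []
    previous = 0.0
    for _ in range(n):
        current = rho * previous + rng.gauss(0.0, scale)
        noise.append(current)
        previous = current
    return noise


N = 30  # Number of rows.
days = list(range(N))
noise = ar1_noise(N, rho=0.8, scale=4.0, seed=42)
# Same linear trend as above, but with autocorrelated noise.
y_val_ar1 = [2 * day + e for day, e in zip(days, noise)]
```

The resulting list could be passed to vp.vDataFrame in place of y_val to build a series with autocorrelated errors.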
We can visually inspect the trend by drawing the appropriate graph:
vdf.scatter(["day", "y1"])
Model Fitting#
Next, we can fit a Linear Model. To do that, we first need to import the model and initialize it:
from verticapy.machine_learning.vertica.linear_model import LinearRegression
model = LinearRegression()
Next we can fit the model:
model.fit(vdf, X = "day", y = "y1")
Now we can apply the Cochrane-Orcutt estimation to get the new modified model:
from verticapy.machine_learning.model_selection.statistical_tests import cochrane_orcutt
new_model = cochrane_orcutt(model = model, input_relation = vdf, ts = "day")
Now we can compare the coefficients of both the models to see the difference.
model.coef_
Out[12]: array([-0.32161423])
new_model.coef_
Out[13]: array([-0.7573264])
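We can see that the new model's coefficient differs from the original one, because the Cochrane-Orcutt estimation adjusts the fit to account for autocorrelated noise. Since the dataset is generated randomly without a fixed seed, the exact values change from run to run.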