verticapy.machine_learning.model_selection.statistical_tests.tsa.cochrane_orcutt#

verticapy.machine_learning.model_selection.statistical_tests.tsa.cochrane_orcutt(model: LinearModel, input_relation: str | vDataFrame, ts: str, prais_winsten: bool = False, drop_tmp_model: bool = True) → LinearModel#

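Performs a Cochrane-Orcutt estimation, which re-estimates a linear regression to account for first-order autocorrelation in its residuals.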
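To make the idea concrete, here is a minimal NumPy sketch of one pass of the textbook Cochrane-Orcutt procedure for a single predictor. It is an illustration under stated assumptions, not VerticaPy's implementation: the helper names ols and cochrane_orcutt_sketch are hypothetical, and only one pass is run rather than iterating until rho stabilizes.

```python
import numpy as np


def ols(X, y):
    """Return least-squares coefficients [intercept, slope] for y ~ X."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def cochrane_orcutt_sketch(x, y):
    """One Cochrane-Orcutt pass for a single predictor (illustrative only)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Step 1: ordinary least squares on the original data.
    b0, b1 = ols(x, y)
    resid = y - (b0 + b1 * x)

    # Step 2: estimate rho by regressing e_t on e_{t-1} without an intercept.
    rho = np.dot(resid[1:], resid[:-1]) / np.dot(resid[:-1], resid[:-1])

    # Step 3: quasi-difference the data; the first observation is dropped.
    y_star = y[1:] - rho * y[:-1]
    x_star = x[1:] - rho * x[:-1]

    # Step 4: refit on the transformed data.
    a0, a1 = ols(x_star, y_star)

    # The transformed intercept equals intercept * (1 - rho); undo the scaling.
    return {"rho": rho, "intercept": a0 / (1 - rho), "slope": a1}
```

The sketch assumes the rows are already sorted by time, which is the role the ts parameter plays in the function documented here.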

Parameters#

model: LinearModel

Linear regression object.

input_relation: SQLRelation

Input relation.

ts: str

vDataColumn of numeric or date-like type (date, datetime, timestamp, etc.) used as the timeline and to order the data.

prais_winsten: bool, optional

If True, retains the first observation of the time series, increasing precision and efficiency. This configuration is called the Prais–Winsten estimation.

drop_tmp_model: bool, optional

If True, drops the temporary model.
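The effect of prais_winsten can be sketched in plain NumPy. In the textbook Prais-Winsten transformation, the first observation is kept and rescaled by sqrt(1 - rho^2) instead of being dropped. The function name prais_winsten_transform and the value rho = 0.5 are assumptions chosen for illustration; this is not taken from VerticaPy's source.

```python
import numpy as np


def prais_winsten_transform(x, y, rho):
    """Quasi-difference x and y, keeping a rescaled first observation."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.sqrt(1.0 - rho**2)  # weight applied to the first observation

    # Rows 2..n use the usual quasi-difference; row 1 is scaled by w.
    y_star = np.concatenate([[w * y[0]], y[1:] - rho * y[:-1]])
    x_star = np.concatenate([[w * x[0]], x[1:] - rho * x[:-1]])
    # The intercept column is transformed the same way.
    c_star = np.concatenate([[w], np.full(len(y) - 1, 1.0 - rho)])
    return c_star, x_star, y_star


# Noise-free data y = 1 + 2x with an assumed rho of 0.5.
x = np.arange(5)
y = 1.0 + 2.0 * x
c_star, x_star, y_star = prais_winsten_transform(x, y, rho=0.5)

# Regress on the transformed intercept column and predictor (no extra constant).
design = np.column_stack([c_star, x_star])
beta, *_ = np.linalg.lstsq(design, y_star, rcond=None)
```

The transformed series has the same length as the input, whereas the plain Cochrane-Orcutt transformation loses one row. On this noise-free data, beta recovers the intercept 1 and the slope 2.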

Returns#

model_tmp

A linear model with the estimation results stored as attributes (for example, coef_).

Examples#

Initialization#

Let’s apply this estimation to a dummy dataset that has the following elements:

  • A value of interest that has noise related to time

  • Time-stamp data

Before we begin, we import the necessary libraries:

import verticapy as vp

import numpy as np

Example 1: Trend#

Now we can create the dummy dataset:

# Initialization
N = 30 # Number of Rows.

days = list(range(N))

y_val = [2 * x + np.random.normal(scale = 4 * x) for x in days]

# vDataFrame
vdf = vp.vDataFrame(
    {
        "day": days,
        "y1": y_val,
    }
)

We can visually inspect the trend by drawing the appropriate graph:

vdf.scatter(["day", "y1"])
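Note that the noise in this dataset grows with the day index but is drawn independently for each row. As a hedged alternative, the following sketch builds a series whose noise is explicitly autocorrelated, assuming an AR(1) process. The coefficient rho_true = 0.7, the seed, and the name y_ar1 are hypothetical choices, not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the values are reproducible
N = 30  # Number of rows.
rho_true = 0.7  # hypothetical AR(1) coefficient

days = list(range(N))

# Each noise term depends on the previous one: e_t = rho * e_{t-1} + u_t.
noise = [0.0] * N
for t in range(1, N):
    noise[t] = rho_true * noise[t - 1] + rng.normal(scale=4)

# Linear trend with slope 2 plus the autocorrelated noise.
y_ar1 = [2 * d + e for d, e in zip(days, noise)]
```

The lists days and y_ar1 can be passed to vp.vDataFrame in the same way as days and y_val above.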

Model Fitting#

Next, we can fit a linear model. To do that, we first need to import the model and initialize it:

from verticapy.machine_learning.vertica.linear_model import LinearRegression

model = LinearRegression()

Next we can fit the model:

model.fit(vdf, X = "day", y = "y1")

Now we can apply the Cochrane-Orcutt estimation to get the new modified model:

from verticapy.machine_learning.model_selection.statistical_tests import cochrane_orcutt

new_model = cochrane_orcutt(model = model, input_relation = vdf, ts = "day")

Now we can compare the coefficients of both models to see the difference:

model.coef_
Out[12]: array([-0.32161423])
new_model.coef_
Out[13]: array([-0.7573264])

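The new model's coefficients differ from the original ones because the Cochrane-Orcutt estimation accounts for autocorrelated noise. Since the dataset is generated randomly without a fixed seed, the exact values vary from run to run.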