verticapy.machine_learning.model_selection.statistical_tests.tsa.cochrane_orcutt

verticapy.machine_learning.model_selection.statistical_tests.tsa.cochrane_orcutt(model: LinearModel, input_relation: Annotated[str | vDataFrame, ''], ts: str, prais_winsten: bool = False, drop_tmp_model: bool = True) -> LinearModel

Performs a Cochrane-Orcutt estimation.
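For intuition, the general Cochrane-Orcutt idea can be sketched in plain NumPy: fit an ordinary least-squares model, estimate the AR(1) coefficient of its residuals, quasi-difference the data with that coefficient, and refit. This is an illustrative sketch only and is not VerticaPy's implementation; the names ols and cochrane_orcutt_sketch, the fixed number of iterations, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y ~ X (X already holds an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def cochrane_orcutt_sketch(x, y, n_iter=10):
    """Hypothetical helper: iterative Cochrane-Orcutt for one predictor.

    Returns (beta, rho), where beta = [intercept, slope] on the original
    scale and rho is the estimated AR(1) coefficient of the errors.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    beta = ols(X, y)
    rho = 0.0
    for _ in range(n_iter):
        resid = y - X @ beta
        # AR(1) coefficient: regress e_t on e_{t-1} without an intercept.
        rho = float(resid[1:] @ resid[:-1] / (resid[:-1] @ resid[:-1]))
        # Quasi-difference the data; the first observation is dropped.
        y_star = y[1:] - rho * y[:-1]
        X_star = X[1:] - rho * X[:-1]
        # The intercept column becomes (1 - rho), so beta stays on the
        # original scale.
        beta = ols(X_star, y_star)
    return beta, rho


# Synthetic data (an assumption for this sketch): linear trend + AR(1) noise.
rng = np.random.default_rng(0)
n = 2000
eps = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + eps[t]
x = np.arange(n, dtype=float)
y = 5.0 + 2.0 * x + u

beta, rho = cochrane_orcutt_sketch(x, y)
print("intercept, slope:", beta)
print("estimated rho:", rho)
```

On data like this, the estimated slope and rho should land close to the values used to generate the series (2.0 and 0.7).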

Parameters

model: LinearModel

Linear regression object.

input_relation: SQLRelation

Input relation.

ts: str

vDataColumn of numeric or date-like type (date, datetime, timestamp, etc.) used as the timeline and to order the data.

prais_winsten: bool, optional

If True, retains the first observation of the time series, increasing precision and efficiency. This configuration is called the Prais–Winsten estimation.

drop_tmp_model: bool, optional

If True, drops the temporary model.
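To illustrate what the prais_winsten option changes, here is a minimal sketch of the textbook transformation, not VerticaPy's internal code. Plain Cochrane-Orcutt quasi-differences the series and loses the first row; Prais-Winsten keeps that row, rescaled by sqrt(1 - rho^2). The helper name quasi_difference and the sample values are hypothetical.

```python
import numpy as np


def quasi_difference(v, rho, prais_winsten=False):
    """Hypothetical helper: AR(1) quasi-differencing of a series or matrix.

    Without prais_winsten, the result has one row fewer than the input.
    With prais_winsten, the first row is kept, scaled by sqrt(1 - rho**2).
    """
    v = np.asarray(v, dtype=float)
    diffed = v[1:] - rho * v[:-1]
    if not prais_winsten:
        return diffed
    first = np.sqrt(1.0 - rho ** 2) * v[:1]
    return np.concatenate([first, diffed])


values = [1.0, 2.0, 3.0, 4.0]
co = quasi_difference(values, rho=0.5)
pw = quasi_difference(values, rho=0.5, prais_winsten=True)
print(len(co), len(pw))
```

The same function applies to both the response and the design matrix, since the slicing works along the first axis.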

Returns

model_tmp

A linear model with the estimation results stored as attributes.

Examples

Initialization

Let’s try this estimation on a dummy dataset that has the following elements:

  • A value of interest that has noise related to time

  • Time-stamp data

Before we begin, we import the necessary libraries:

import verticapy as vp

import numpy as np

Example 1: Trend

Now we can create the dummy dataset:

# Initialization
N = 30 # Number of Rows.

days = list(range(N))

y_val = [2 * x + np.random.normal(scale = 4 * x) for x in days]

# vDataFrame
vdf = vp.vDataFrame(
    {
        "day": days,
        "y1": y_val,
    }
)

We can visually inspect the trend by drawing the appropriate graph:

vdf.scatter(["day", "y1"])

Model Fitting

Next, we can fit a linear model. To do that, we first need to import the model and initialize it:

from verticapy.machine_learning.vertica.linear_model import LinearRegression

model = LinearRegression()

Next we can fit the model:

model.fit(vdf, X = "day", y = "y1")


=======
details
=======
predictor|coefficient|std_err |t_value |p_value 
---------+-----------+--------+--------+--------
Intercept|  7.87525  |18.44175| 0.42703| 0.67262
   day   |  1.28337  | 1.09208| 1.17516| 0.24983


==============
regularization
==============
type| lambda 
----+--------
none| 1.00000


===========
call_string
===========
linear_reg('"public"."_verticapy_tmp_linearregression_v_demo_209fd42c55a311ef880f0242ac120002_"', '"public"."_verticapy_tmp_view_v_demo_20acdaf055a311ef880f0242ac120002_"', '"y1"', '"day"'
USING PARAMETERS optimizer='newton', epsilon=1e-06, max_iterations=100, regularization='none', lambda=1, alpha=0.5, fit_intercept=true)

===============
Additional Info
===============
       Name       |Value
------------------+-----
 iteration_count  |  1  
rejected_row_count|  0  
accepted_row_count| 30  

Now we can apply the Cochrane-Orcutt estimation to get the new modified model:

from verticapy.machine_learning.model_selection.statistical_tests import cochrane_orcutt

new_model = cochrane_orcutt(model = model, input_relation = vdf, ts = "day")

Now we can compare the coefficients of both models to see the difference.

model.coef_
Out[12]: array([1.28336817])
new_model.coef_
Out[13]: array([1.1448507])

The new model's coefficient differs slightly from the original one (about 1.1449 versus 1.2834) because the Cochrane-Orcutt estimation accounts for the autocorrelated noise.
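One common way to judge whether such a correction is warranted is to check the residuals for first-order autocorrelation, for example with the Durbin-Watson statistic. The sketch below computes it in plain NumPy on synthetic series; the helper name durbin_watson_stat and the generated data are assumptions for illustration and are independent of the models fitted above.

```python
import numpy as np


def durbin_watson_stat(residuals):
    """Hypothetical helper: Durbin-Watson statistic of a residual series.

    Values near 2 suggest no first-order autocorrelation; values well
    below 2 suggest positive autocorrelation.
    """
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))


# Synthetic residuals (an assumption for this sketch).
rng = np.random.default_rng(42)
n = 5000
white = rng.normal(size=n)      # uncorrelated noise
ar1 = np.zeros(n)               # AR(1) noise with coefficient 0.7
for t in range(1, n):
    ar1[t] = 0.7 * ar1[t - 1] + white[t]

dw_white = durbin_watson_stat(white)
dw_ar1 = durbin_watson_stat(ar1)
print("white noise:", dw_white)
print("AR(1) noise:", dw_ar1)
```

The statistic is roughly 2 * (1 - rho), so strongly autocorrelated residuals give a value well below 2.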