
verticapy.machine_learning.model_selection.statistical_tests.tsa.cochrane_orcutt¶
- verticapy.machine_learning.model_selection.statistical_tests.tsa.cochrane_orcutt(model: LinearModel, input_relation: Annotated[str | vDataFrame, ''], ts: str, prais_winsten: bool = False, drop_tmp_model: bool = True) LinearModel ¶
Performs a Cochrane-Orcutt estimation.
Parameters¶
- model: LinearModel
Linear regression object.
- input_relation: SQLRelation
Input relation.
- ts: str
vDataColumn of numeric or date-like type (date, datetime, timestamp, etc.) used as the timeline and to order the data.
- prais_winsten: bool, optional
If True, retains the first observation of the time series, increasing precision and efficiency. This configuration is called the Prais–Winsten estimation.
- drop_tmp_model: bool, optional
If True, drops the temporary model.
Returns¶
- model_tmp
A linear model with the following information stored as attributes:
- intercept_:
Model’s intercept.
- coef_:
Model’s coefficients.
- pho_:
Cochrane-Orcutt pho, that is, the estimated autocorrelation coefficient of the residuals (usually written rho).
- anova_table_:
ANOVA table.
- r2_:
R2 score.
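For reference, the parameters and attributes above can be read against the standard textbook formulation of the method. The following is a sketch of that formulation, not a transcription of the library's implementation; the symbols $\alpha$, $\beta$, $\rho$, $\varepsilon_t$ and $e_t$ are introduced here for illustration only. The errors are assumed to follow a first-order autoregressive process, and the regression is run on quasi-differenced variables. In the Prais–Winsten variant, the first observation is kept after rescaling instead of being dropped.

```latex
\begin{aligned}
y_t &= \alpha + \beta x_t + \varepsilon_t,
  \qquad \varepsilon_t = \rho\,\varepsilon_{t-1} + e_t, \quad |\rho| < 1 \\
y_t - \rho\,y_{t-1} &= \alpha(1-\rho) + \beta\,(x_t - \rho\,x_{t-1}) + e_t,
  \qquad t = 2, \dots, N \\
\sqrt{1-\rho^2}\;y_1 &= \alpha\sqrt{1-\rho^2} + \beta\sqrt{1-\rho^2}\;x_1
  + \sqrt{1-\rho^2}\;\varepsilon_1
  \qquad \text{(Prais--Winsten only)}
\end{aligned}
```

Under this reading, pho_ would correspond to the estimate of $\rho$, which is typically obtained by regressing the residuals of the initial fit on their own first lag.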
Examples¶
Initialization¶
Let’s try this estimation on a dummy dataset that has the following elements:
- A value of interest that has noise related to time
- Time-stamp data
Before we begin we can import the necessary libraries:
import verticapy as vp
import numpy as np
Example 1: Trend¶
Now we can create the dummy dataset:
# Initialization
N = 30 # Number of Rows.
days = list(range(N))
y_val = [2 * x + np.random.normal(scale = 4 * x) for x in days]

# vDataFrame
vdf = vp.vDataFrame(
    {
        "day": days,
        "y1": y_val,
    }
)
We can visually inspect the trend by drawing the appropriate graph:
vdf.scatter(["day", "y1"])
Model Fitting¶
Next, we can fit a Linear Model. To do that, we first need to import the model and initialize it:
from verticapy.machine_learning.vertica.linear_model import LinearRegression

model = LinearRegression()
Next we can fit the model:
model.fit(vdf, X = "day", y = "y1")

=======
details
=======
predictor|coefficient|std_err |t_value |p_value
---------+-----------+--------+--------+--------
Intercept|  7.87525  |18.44175| 0.42703| 0.67262
   day   |  1.28337  | 1.09208| 1.17516| 0.24983

==============
regularization
==============
type| lambda
----+--------
none| 1.00000

===========
call_string
===========
linear_reg('"public"."_verticapy_tmp_linearregression_v_demo_209fd42c55a311ef880f0242ac120002_"', '"public"."_verticapy_tmp_view_v_demo_20acdaf055a311ef880f0242ac120002_"', '"y1"', '"day"'
USING PARAMETERS optimizer='newton', epsilon=1e-06, max_iterations=100, regularization='none', lambda=1, alpha=0.5, fit_intercept=true)

===============
Additional Info
===============
       Name       |Value
------------------+-----
 iteration_count  |  1
rejected_row_count|  0
accepted_row_count| 30
Now we can apply the Cochrane-Orcutt estimation to get the new modified model:
from verticapy.machine_learning.model_selection.statistical_tests import cochrane_orcutt

new_model = cochrane_orcutt(model = model, input_relation = vdf, ts = "day")
Now we can compare the coefficients of both models to see the difference:
model.coef_
Out[12]: array([1.28336817])

new_model.coef_
Out[13]: array([1.1448507])
We can see that the new model has slightly different coefficients, which account for the autocorrelated noise.
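To make the adjustment for autocorrelated noise more concrete, here is a minimal sketch of the textbook Cochrane-Orcutt procedure for a single predictor. It uses only the Python standard library, so it runs without a database connection. The helpers fit_two_columns and cochrane_orcutt_sketch are hypothetical names introduced for illustration. They are not part of VerticaPy, and the library's own implementation may differ, for example in the number of passes it performs.

```python
import math
import random


def fit_two_columns(c, x, y):
    """Least squares of y on the columns c and x, with no extra intercept.

    Solves the 2x2 normal equations directly and returns the coefficient
    on c and the coefficient on x.
    """
    scc = sum(ci * ci for ci in c)
    scx = sum(ci * xi for ci, xi in zip(c, x))
    sxx = sum(xi * xi for xi in x)
    scy = sum(ci * yi for ci, yi in zip(c, y))
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    det = scc * sxx - scx * scx
    coef_c = (scy * sxx - scx * sxy) / det
    coef_x = (scc * sxy - scx * scy) / det
    return coef_c, coef_x


def cochrane_orcutt_sketch(x, y, prais_winsten=False, n_iter=20):
    """Hypothetical helper: returns (intercept, slope, rho) for one predictor."""
    n = len(x)
    # Step 1: ordinary least squares on the raw data (constant column of ones).
    intercept, slope = fit_two_columns([1.0] * n, x, y)
    rho = 0.0
    for _ in range(n_iter):
        # Step 2: estimate rho by regressing residuals on their first lag.
        res = [y[t] - intercept - slope * x[t] for t in range(n)]
        rho = sum(res[t] * res[t - 1] for t in range(1, n)) / sum(
            r * r for r in res[:-1]
        )
        # Step 3: quasi-difference every column, dropping the first row.
        const = [1.0 - rho] * (n - 1)
        xs = [x[t] - rho * x[t - 1] for t in range(1, n)]
        ys = [y[t] - rho * y[t - 1] for t in range(1, n)]
        if prais_winsten:
            # Keep the first row, rescaled by sqrt(1 - rho^2); assumes |rho| < 1.
            w = math.sqrt(1.0 - rho * rho)
            const.insert(0, w)
            xs.insert(0, w * x[0])
            ys.insert(0, w * y[0])
        # Step 4: refit; the coefficient on the transformed constant column
        # is the intercept on the original scale.
        intercept, slope = fit_two_columns(const, xs, ys)
    return intercept, slope, rho


# Synthetic data with AR(1) noise:
# y = 1 + 2 * x + eps, where eps_t = 0.6 * eps_{t-1} + e_t.
random.seed(0)
x = [float(t) for t in range(500)]
eps, y = 0.0, []
for xt in x:
    eps = 0.6 * eps + random.gauss(0.0, 1.0)
    y.append(1.0 + 2.0 * xt + eps)

intercept, slope, rho = cochrane_orcutt_sketch(x, y)
print(intercept, slope, rho)
```

Passing prais_winsten=True to the sketch keeps the first observation, mirroring the role of the prais_winsten parameter described above. The returned rho plays the same role as the pho_ attribute, under the assumption that the library estimates it in a comparable way.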