verticapy.machine_learning.vertica.tsa.MA.report#
- MA.report(metrics: str | Literal[None, 'anova', 'details'] | list[Literal['aic', 'bic', 'r2', 'rsquared', 'mae', 'mean_absolute_error', 'mse', 'mean_squared_error', 'msle', 'mean_squared_log_error', 'max', 'max_error', 'median', 'median_absolute_error', 'var', 'explained_variance']] | None = None, start: int | None = None, npredictions: int | None = None, method: Literal['auto', 'forecast'] = 'auto') → float | TableSample
Computes a regression report using multiple metrics to evaluate the model (r2, mse, max error…).
Parameters#
- metrics: str | list, optional
The metrics used to compute the regression report.
- None:
Computes the model's different metrics.
- anova:
Computes the model ANOVA table.
- details:
Computes the model details.
It can also be a list of the metrics used to compute the final report.
- aic:
Akaike’s Information Criterion
\[AIC = 2k - 2\ln(\hat{L})\]
- bic:
Bayesian Information Criterion
\[BIC = -2\ln(\hat{L}) + k \ln(n)\]
- max:
Max Error.
\[ME = \max_{i=1}^{n} \left| y_i - \hat{y}_i \right|\]
- mae:
Mean Absolute Error.
\[MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|\]
- median:
Median Absolute Error.
\[MedAE = \text{median}_{i=1}^{n} \left| y_i - \hat{y}_i \right|\]
- mse:
Mean Squared Error.
\[MSE = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2\]
- msle:
Mean Squared Log Error.
\[MSLE = \frac{1}{n} \sum_{i=1}^{n} (\log(1 + y_i) - \log(1 + \hat{y}_i))^2\]
- r2:
R squared coefficient.
\[R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}\]
- r2a:
R2 adjusted
\[\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}\]
- qe:
Quantile error. The quantile must be included in the name. Example: qe50.1% returns the quantile error using q=0.501.
- rmse:
Root-mean-squared error
\[RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}\]
- var:
Explained Variance
\[\text{Explained Variance} = 1 - \frac{Var(y - \hat{y})}{Var(y)}\]
- start: int, optional
The behavior of the start parameter and its range of accepted values depend on whether you provide a timeseries-column (ts):
- No provided timeseries-column:
start must be an integer greater than or equal to 0, where zero indicates to start prediction at the end of the in-sample data. If start is a positive value, the function predicts the values between the end of the in-sample data and the start index, and then uses the predicted values as time series inputs for the subsequent npredictions.
- timeseries-column provided:
start must be an integer greater than or equal to 1 and identifies the index (row) of the timeseries-column at which to begin prediction. If the start index is greater than the number of rows, N, in the input data, the function predicts the values between N and start and uses the predicted values as time series inputs for the subsequent npredictions.
Default:
- No provided timeseries-column:
prediction begins from the end of the in-sample data.
- timeseries-column provided:
prediction begins from the end of the provided input data.
- npredictions: int, optional
Integer greater than or equal to 1, the number of predicted timesteps.
- method: str, optional
Forecasting method. One of the following:
- auto:
the model initially utilizes the true values at each step for forecasting. However, when it reaches a point where it can no longer rely on true values, it transitions to using its own predictions for further forecasting. This method is often referred to as “one step ahead” forecasting.
- forecast:
the model initiates forecasting from an initial value and entirely disregards any subsequent true values. This approach involves forecasting based solely on the model’s own predictions and does not consider actual observations after the start point.
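The difference between the two methods can be illustrated with a minimal sketch. It does not reproduce VerticaPy's implementation: it assumes a toy AR(1) rule with hypothetical coefficients (`c`, `phi`), and the helper names `predict_next` and `run` are made up for this illustration.

```python
def predict_next(prev, c=10.0, phi=0.5):
    """Toy AR(1) rule with hypothetical coefficients: next = c + phi * prev."""
    return c + phi * prev


def run(series, steps, method="auto"):
    """Predict `steps` values following series[0].

    "auto":     feed the true value into the next step while one is
                available, then fall back on the model's own predictions.
    "forecast": after the initial value, feed only the model's own
                predictions and ignore later true values.
    """
    preds = []
    prev = series[0]
    for t in range(1, steps + 1):
        pred = predict_next(prev)
        preds.append(pred)
        if method == "auto" and t < len(series):
            prev = series[t]  # a true value is still available
        else:
            prev = pred  # rely on the model's own prediction
    return preds


observed = [100.0, 120.0, 90.0]  # illustrative values, not real data
auto_preds = run(observed, 5, method="auto")
forecast_preds = run(observed, 5, method="forecast")
print(auto_preds)
print(forecast_preds)
```

Both runs agree on the first step, since it depends only on the initial value. They diverge afterwards because "auto" keeps consuming observed values until they run out, while "forecast" compounds its own predictions from the start.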
Returns#
- TableSample
report.
Examples#
We import verticapy:
import verticapy as vp
For this example, we will use the airline passengers dataset.
import verticapy.datasets as vpd
data = vpd.load_airline_passengers()
     date (Date)   passengers (Integer)
1    1949-01-01    112
2    1949-02-01    118
3    1949-03-01    132
4    1949-04-01    129
5    1949-05-01    121
6    1949-06-01    135
7    1949-07-01    148
8    1949-08-01    148
9    1949-09-01    136
10   1949-10-01    119
11   1949-11-01    104
12   1949-12-01    118
13   1950-01-01    115
14   1950-02-01    126
15   1950-03-01    141
16   1950-04-01    135
17   1950-05-01    125
18   1950-06-01    149
19   1950-07-01    170
20   1950-08-01    170
21   1950-09-01    158
22   1950-10-01    133
23   1950-11-01    114
24   1950-12-01    140
25   1951-01-01    145
26   1951-02-01    150
27   1951-03-01    178
28   1951-04-01    163
29   1951-05-01    172
30   1951-06-01    178
31   1951-07-01    199
32   1951-08-01    199
33   1951-09-01    184
34   1951-10-01    162
35   1951-11-01    146
36   1951-12-01    166
37   1952-01-01    171
38   1952-02-01    180
39   1952-03-01    193
40   1952-04-01    181
41   1952-05-01    183
42   1952-06-01    218
43   1952-07-01    230
44   1952-08-01    242
45   1952-09-01    209
46   1952-10-01    191
47   1952-11-01    172
48   1952-12-01    194
49   1953-01-01    196
50   1953-02-01    196
51   1953-03-01    236
52   1953-04-01    235
53   1953-05-01    229
54   1953-06-01    243
55   1953-07-01    264
56   1953-08-01    272
57   1953-09-01    237
58   1953-10-01    211
59   1953-11-01    180
60   1953-12-01    201
61   1954-01-01    204
62   1954-02-01    188
63   1954-03-01    235
64   1954-04-01    227
65   1954-05-01    234
66   1954-06-01    264
67   1954-07-01    302
68   1954-08-01    293
69   1954-09-01    259
70   1954-10-01    229
71   1954-11-01    203
72   1954-12-01    229
73   1955-01-01    242
74   1955-02-01    233
75   1955-03-01    267
76   1955-04-01    269
77   1955-05-01    270
78   1955-06-01    315
79   1955-07-01    364
80   1955-08-01    347
81   1955-09-01    312
82   1955-10-01    274
83   1955-11-01    237
84   1955-12-01    278
85   1956-01-01    284
86   1956-02-01    277
87   1956-03-01    317
88   1956-04-01    313
89   1956-05-01    318
90   1956-06-01    374
91   1956-07-01    413
92   1956-08-01    405
93   1956-09-01    355
94   1956-10-01    306
95   1956-11-01    271
96   1956-12-01    306
97   1957-01-01    315
98   1957-02-01    301
99   1957-03-01    356
100  1957-04-01    348
Rows: 1-100 | Columns: 2
First we import the model:
from verticapy.machine_learning.vertica.tsa import ARIMA
Then we can create the model:
model = ARIMA(order = (12, 1, 2))
We can now fit the model:
model.fit(data, "date", "passengers")
We can get the entire report using:
model.report()
                         value
explained_variance       0.843011800385913
max_error                108.703124575763
median_absolute_error    23.5457433749146
mean_absolute_error      31.195252646127
mean_squared_error       1692.48056292341
root_mean_squared_error  41.1397686299207
r2                       0.842975867228999
r2_adj                   0.841494507485876
aic                      807.057132402344
bic                      812.230918666116
Rows: 1-10 | Columns: 2
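To make the metric formulas from the Parameters section concrete, the following sketch recomputes several of them in plain Python. It runs locally and is not how VerticaPy computes the report (that happens in-database). The helper `regression_metrics` and the predicted values are hypothetical; the true values are the first five passenger counts from the dataset shown earlier.

```python
import math
from statistics import median


def regression_metrics(y_true, y_pred, k=1):
    """Recompute report-style metrics from the documented formulas.

    `k` is the number of predictors used in the adjusted R2 formula
    (an illustrative assumption here, not a fitted model's value).
    """
    n = len(y_true)
    errors = [y - p for y, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    y_mean = sum(y_true) / n
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)  # total sum of squares
    mse = ss_res / n
    r2 = 1 - ss_res / ss_tot
    return {
        "max_error": max(abs_errors),
        "median_absolute_error": median(abs_errors),
        "mean_absolute_error": sum(abs_errors) / n,
        "mean_squared_error": mse,
        "root_mean_squared_error": math.sqrt(mse),
        "r2": r2,
        "r2_adj": 1 - (1 - r2) * (n - 1) / (n - k - 1),
    }


y_true = [112, 118, 132, 129, 121]  # first five rows of the dataset
y_pred = [110, 120, 130, 125, 125]  # hypothetical predictions
metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

The dictionary keys mirror the row names of the report so the two can be compared side by side. The values will differ from the report, since the predictions here are invented for illustration.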