verticapy.machine_learning.vertica.tsa.ensemble.TimeSeriesByCategory.predict

TimeSeriesByCategory.predict(vdf: str | vDataFrame | None = None, ts: str | None = None, y: str | None = None, start: int | None = None, npredictions: int = 10, freq: Literal[None, 'm', 'months', 'y', 'year', 'infer'] = 'infer', filter_step: int | None = None, method: Literal['auto', 'forecast'] = 'auto') → vDataFrame

Predicts using the input relation.

Parameters

vdf: SQLRelation

Object used to run the prediction. You can also specify a customized relation, but you must enclose it in parentheses and give it an alias. For example, (SELECT 1) x is valid, whereas (SELECT 1) and SELECT 1 are invalid.

ts: str

TS (Time Series) vDataColumn used to order the data. The vDataColumn type must be a date type (date, datetime, timestamp…) or numerical.

y: str, optional

Response column.

start: int, optional

The behavior of the start parameter and its range of accepted values depend on whether you provide a timeseries-column (ts):

No provided timeseries-column:
start must be an integer greater than or equal to 0, where zero means that prediction starts at the end of the in-sample data. If start is a positive value, the function predicts the values between the end of the in-sample data and the start index, and then uses the predicted values as time series inputs for the subsequent npredictions.
timeseries-column provided:
start must be an integer greater than or equal to 1 and identifies the index (row) of the timeseries-column at which to begin prediction. If the start index is greater than the number of rows, N, in the input data, the function predicts the values between N and start and uses the predicted values as time series inputs for the subsequent npredictions.

Default:

No provided timeseries-column:
prediction begins from the end of the in-sample data.
timeseries-column provided:
prediction begins from the end of the provided input data.
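
The range rules for start can be summarized as a small validation helper. This is a minimal sketch that only encodes the constraints stated above; `validate_start` is a hypothetical name and is not part of VerticaPy.

```python
from typing import Optional


def validate_start(start: Optional[int], ts_provided: bool) -> None:
    """Check `start` against the documented ranges (hypothetical helper)."""
    if start is None:
        # Default: prediction begins at the end of the available data.
        return
    # With a timeseries-column, start is a row index and must be at least 1;
    # without one, 0 means "the end of the in-sample data".
    minimum = 1 if ts_provided else 0
    if start < minimum:
        raise ValueError(f"start must be >= {minimum}, got {start}")
```

For instance, `validate_start(0, ts_provided=False)` passes, while `validate_start(0, ts_provided=True)` raises a `ValueError`.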

npredictions: int, optional

Integer greater than or equal to 1: the number of timesteps to predict.

freq: str, optional

How to compute the delta.

m/months:
We assume that the data is organized on a monthly basis.
y/year:
We assume that the data is organized on a yearly basis.
infer:
The system attempts to identify the best option, which may require more computational resources.
None:
The delta is the average of the difference between ts and its lag.
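
The None option can be illustrated in plain Python. This is only a sketch of the idea (averaging the difference between each timestamp and its lag), not VerticaPy's implementation; `average_delta` and `extend` are hypothetical names.

```python
from datetime import datetime, timedelta
from typing import List


def average_delta(ts: List[datetime]) -> timedelta:
    """Mean of the differences between each timestamp and its lag."""
    diffs = [curr - prev for prev, curr in zip(ts, ts[1:])]
    return sum(diffs, timedelta(0)) / len(diffs)


def extend(ts: List[datetime], npredictions: int) -> List[datetime]:
    """Generate future timestamps by repeatedly adding the average delta."""
    delta = average_delta(ts)
    return [ts[-1] + delta * (i + 1) for i in range(npredictions)]


observed = [datetime(2024, 1, 1), datetime(2024, 1, 2), datetime(2024, 1, 4)]
print(average_delta(observed))  # 1 day, 12:00:00
print(extend(observed, 2))
```

On monthly data an average delta is not a whole number of days, which is one case where the m option may be the better fit.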

filter_step: int, optional

Integer parameter that determines the frequency of predictions. You can adjust it according to your specific requirements, such as setting it to 3 for predictions every third step.

Note

It is only utilized when output_estimated_ts=True.
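
As a rough illustration of "every third step", the sketch below thins a list of predictions. `apply_filter_step` is a hypothetical helper, and which offset is kept (here, the first step) is an assumption that the source does not confirm.

```python
from typing import List, Optional


def apply_filter_step(preds: List[float], filter_step: Optional[int]) -> List[float]:
    """Keep one prediction out of every `filter_step` (hypothetical helper)."""
    if filter_step is None:
        return preds
    # Assumption: the first step is kept; the real offset may differ.
    return preds[::filter_step]


print(apply_filter_step([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0], 3))
# [10.0, 13.0, 16.0]
```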

method: str, optional

Forecasting method. One of the following:

auto:
The model first uses the true values at each step to forecast. Once no true values remain, it switches to its own predictions for further forecasting. This method is often referred to as “one step ahead” forecasting.
forecast:
The model starts forecasting from an initial value and disregards all subsequent true values. Forecasts are based solely on the model’s own predictions; observations after the start point are not considered.
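
The difference between the two methods can be sketched with a toy AR(1) model. The coefficient `PHI` and both function names are hypothetical and chosen only for illustration; this is not how VerticaPy computes its predictions.

```python
from typing import List

PHI = 0.5  # hypothetical AR(1) coefficient, chosen for illustration


def predict_auto(y: List[float], npredictions: int) -> List[float]:
    """One-step-ahead on observed values, then recursive forecasting."""
    # In-sample: each prediction uses the true previous value.
    preds = [PHI * prev for prev in y[:-1]]
    # Out-of-sample: no true values left, so feed predictions back in.
    last = y[-1]
    for _ in range(npredictions):
        last = PHI * last
        preds.append(last)
    return preds


def predict_forecast(y0: float, steps: int) -> List[float]:
    """Recursive forecasting from an initial value only."""
    preds, last = [], y0
    for _ in range(steps):
        last = PHI * last
        preds.append(last)
    return preds


y = [8.0, 2.0, 4.0]
print(predict_auto(y, 2))         # [4.0, 1.0, 2.0, 1.0]
print(predict_forecast(y[0], 4))  # [4.0, 2.0, 1.0, 0.5]
```

In the auto case the in-sample predictions track the observed series because each one is anchored to a true value; in the forecast case the trajectory depends only on the initial value and the model.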

Returns

vDataFrame: a new object.

Examples

This model is built from multiple base models. See the base models for complete examples.

ARIMA; ARMA; AR; MA;