VerticaPy
Time Series
Time series models are regression models fitted to data in which each observation is labeled with a timestamp.
The following example builds a time series model that predicts the number of forest fires in Brazil, using the 'Amazon' dataset.
from verticapy.datasets import load_amazon
# Load the dataset and sum the number of fires for each date
amazon = load_amazon().groupby("date", "SUM(number) AS number")
display(amazon)
The 'date' column tells us that a time series model is appropriate here. To make predictions on a time series, we use previous values of the series, called 'lags'.
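To make the idea of lags concrete, here is a minimal sketch in plain Python, independent of VerticaPy. It pairs each value of a series with the values observed one and two steps earlier. The helper `make_lagged_rows` and the sample numbers are hypothetical and only serve as an illustration; they are not part of the VerticaPy API or the 'Amazon' dataset.

```python
def make_lagged_rows(values, lags):
    """Pair each target value with its lagged predecessors.

    Returns a list of (lagged_values, target) tuples. The first max(lags)
    observations are skipped because they lack a full set of lags.
    """
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(values)):
        features = [values[t - k] for k in lags]
        rows.append((features, values[t]))
    return rows


# Made-up counts, for illustration only
monthly_counts = [10, 12, 15, 11, 9, 14]
rows = make_lagged_rows(monthly_counts, lags=[1, 2])
for features, target in rows:
    print(features, "->", target)
```

Each row is what a regression on lags would see: the earlier values act as predictors and the current value is the response.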
To help visualize the seasonality of forest fires, we'll draw some autocorrelation plots.
amazon.acf(ts = "date",
           column = "number",
           p = 48)
amazon.pacf(ts = "date",
            column = "number",
            p = 48)
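As a rough illustration of what an autocorrelation plot measures, the sketch below computes the sample autocorrelation of a synthetic series with a period of 12. It is plain Python and does not reproduce how VerticaPy computes `acf` internally; the function `autocorrelation` and the sine-wave data are assumptions made for the example.

```python
import math


def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given positive lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    num = sum(
        (values[t] - mean) * (values[t - lag] - mean)
        for t in range(lag, n)
    )
    return num / denom


# Synthetic series that repeats every 12 steps (not the Amazon data)
series = [math.sin(2 * math.pi * t / 12) for t in range(120)]

acf_6 = autocorrelation(series, 6)    # half a period apart: strongly negative
acf_12 = autocorrelation(series, 12)  # one full period apart: strongly positive
print(round(acf_6, 2), round(acf_12, 2))
```

A series with a seasonal cycle shows peaks in its autocorrelation at multiples of the period, which is the kind of pattern to look for in the plots.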
Forest fires follow a regular seasonal pattern, so past values should be informative for predicting future ones.
VerticaPy offers several time series models, including one for multiple time series. For this example, let's use a SARIMAX model.
from verticapy.learn.tsa import SARIMAX
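# p, d, q are the non-seasonal orders; P, D, Q are the seasonal orders;
# s is the length of the seasonal period
model = SARIMAX("SARIMAX_amazon",
                p = 1,
                d = 0,
                q = 0,
                P = 4,
                D = 0,
                Q = 0,
                s = 12)
# Fit on the aggregated data, using "date" as the time axis
model.fit(amazon,
          y = "number",
          ts = "date")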
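To get a feel for what these orders mean, the sketch below lists the lags that appear in the autoregressive part of a textbook multiplicative seasonal model, where the non-seasonal polynomial (lags 1 to p) is multiplied by the seasonal polynomial (lags s, 2s, up to P·s). This is an assumption about the general formulation only; VerticaPy's internal implementation may select lags differently. The helper `ar_lags` is hypothetical.

```python
def ar_lags(p, P, s):
    """Lags that can appear when multiplying the non-seasonal AR polynomial
    (lags 1..p) by the seasonal AR polynomial (lags s, 2s, ..., P*s)."""
    nonseasonal = {0} | set(range(1, p + 1))
    seasonal = {0} | {s * i for i in range(1, P + 1)}
    # Multiplying polynomials adds exponents, so every pairwise sum is a lag
    lags = {a + b for a in nonseasonal for b in seasonal}
    lags.discard(0)  # lag 0 is the current value, not a predictor
    return sorted(lags)


lags = ar_lags(p=1, P=4, s=12)
print(lags)  # [1, 12, 13, 24, 25, 36, 37, 48, 49]
```

Under this formulation, P = 4 with s = 12 lets the model look back up to four seasonal periods, which is in line with the 48 lags inspected in the autocorrelation plots.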
Just like with other regression models, we'll evaluate our model with the report() method.
model.report()
We can also plot the model's one-step-ahead and dynamic forecasts.
model.plot(amazon,
           nlead = 150,
           dynamic = True)
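The difference between the two kinds of forecast can be sketched with a toy first-order autoregressive model in plain Python. One-step-ahead forecasting predicts each point from the observed previous value, while dynamic forecasting feeds each prediction back in as the input for the next step. The coefficient `phi`, the helper functions, and the data are all hypothetical; this illustrates the general concepts rather than what `model.plot` does internally.

```python
def one_step_ahead(observed, phi):
    """Predict each value from the *observed* previous value."""
    return [phi * observed[t - 1] for t in range(1, len(observed))]


def dynamic_forecast(last_observed, phi, steps):
    """Predict several steps ahead, reusing each prediction as the next input."""
    forecasts = []
    previous = last_observed
    for _ in range(steps):
        previous = phi * previous
        forecasts.append(previous)
    return forecasts


observed = [8.0, 4.0, 6.0, 3.0]  # made-up values
phi = 0.5                        # assumed AR(1) coefficient

one_step = one_step_ahead(observed, phi)               # [4.0, 2.0, 3.0]
dynamic = dynamic_forecast(observed[0], phi, steps=3)  # [4.0, 2.0, 1.0]
print(one_step)
print(dynamic)
```

Because dynamic forecasts never see new observations, their errors can accumulate over the horizon, whereas one-step-ahead forecasts are corrected by the data at every step.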
This concludes the fundamental lessons on machine learning algorithms in VerticaPy.
