verticapy.vDataFrame.acf#

vDataFrame.acf(column: str, ts: str, by: str | list[str] | None = None, p: int | list = 12, unit: str = 'rows', method: Literal['pearson', 'kendall', 'spearman', 'spearmand', 'biserial', 'cramer'] = 'pearson', confidence: bool = True, alpha: float = 0.95, show: bool = True, kind: Literal['line', 'heatmap', 'bar'] = 'bar', mround: int = 3, chart: PlottingBase | TableSample | Axes | mFigure | Highchart | Highstock | Figure | None = None, **style_kwargs) PlottingBase | TableSample | Axes | mFigure | Highchart | Highstock | Figure#

Computes the correlations between the specified vDataColumn and its time lags (the autocorrelation function, ACF). This is useful in time series analysis and forecasting because it shows how strongly each observation depends on earlier ones, which helps in choosing the lags to include in a predictive model.
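As a rough illustration of the idea, the sketch below computes Pearson correlations between a series and its row lags in plain Python. It is a conceptual sketch only, not VerticaPy's implementation (which runs in the database); the helper names `pearson` and `lag_correlations` are hypothetical, and the choice to start integer lags at 1 is an assumption.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lag_correlations(values, p=12):
    """Correlate a series (already ordered by time) with its row lags.

    p is either a maximum lag (int) or an explicit list of lags.
    Assumption: an integer p expands to the lags 1..p.
    """
    lags = range(1, p + 1) if isinstance(p, int) else p
    # Pair each value with the value `lag` rows earlier.
    return {lag: pearson(values[lag:], values[:-lag]) for lag in lags}


# An alternating series is negatively correlated with its lag 1
# and positively correlated with its lag 2.
print(lag_correlations([1, -1, 1, -1, 1, -1], p=2))
```

The sketch does not handle constant series (zero variance) or lags longer than the series, which a real implementation would need to guard against.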

Parameters#

column: str

Input vDataColumn used to compute the Auto Correlation Plot.

ts: str

Time series (TS) vDataColumn used to order the data. It can be a date or a numerical vDataColumn.

by: SQLColumns, optional

vDataColumns used in the partition.

p: int | list, optional

Either an integer giving the maximum number of lags to consider, or a list of the specific lags to include in the computation. p must be a positive integer or a list of positive integers.

unit: str, optional

Unit used to compute the lags.

  • rows:

    Natural lags.

  • else:

    Any time unit. For example, use ‘hour’ to compute lags in hours or ‘day’ to compute lags in days.

method: str, optional

Method used to compute the correlation.

  • pearson:

    Pearson’s correlation coefficient (linear).

  • spearman:

    Spearman’s correlation coefficient (monotonic - rank based).

  • spearmand:

    Spearman’s correlation coefficient using the DENSE RANK function instead of the RANK function.

  • kendall:

    Kendall’s correlation coefficient (similar trends). The method computes the Tau-B coefficient.

    Warning

    This method uses a CROSS JOIN during computation and is therefore computationally expensive at O(n * n), where n is the total count of the vDataFrame.

  • cramer:

    Cramer’s V (correlation between categories).

  • biserial:

    Point-biserial correlation (correlation between a binary and a numerical variable).
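The difference between the two Spearman variants comes down to how ties are ranked. The sketch below mimics the SQL RANK and DENSE RANK functions in plain Python to show where they diverge; the helper names `rank` and `dense_rank` are hypothetical and are not part of VerticaPy.

```python
def rank(values):
    """SQL-style RANK: ties share a rank and the following ranks are skipped."""
    ordered = sorted(values)
    # list.index returns the first position of v, which is its RANK minus one.
    return [ordered.index(v) + 1 for v in values]


def dense_rank(values):
    """SQL-style DENSE_RANK: ties share a rank and no ranks are skipped."""
    distinct = sorted(set(values))
    return [distinct.index(v) + 1 for v in values]


values = [10, 20, 20, 30]
print(rank(values))        # [1, 2, 2, 4]
print(dense_rank(values))  # [1, 2, 2, 3]
```

When the data contains no ties, both rankings coincide, so the two Spearman variants should give the same coefficient.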

confidence: bool, optional

If set to True, the confidence band is drawn.

alpha: float, optional

Significance level: the probability of accepting H0. Only used to compute the width of the confidence band.
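For intuition about what such a band represents, a common textbook approximation under the null hypothesis of no autocorrelation is ±z/√n, where n is the number of observations and z is a standard normal quantile. The sketch below uses that approximation and assumes the 0.95 default is read as a two-sided confidence level; both are assumptions, and the exact formula VerticaPy uses may differ. The function name `white_noise_band` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def white_noise_band(n, level=0.95):
    """Half-width of the two-sided band z / sqrt(n) for n observations."""
    # Two-sided quantile of the standard normal for the given level.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return z / sqrt(n)


print(round(white_noise_band(100), 3))  # 0.196
```

Under this approximation, coefficients that fall inside the band are not distinguishable from zero at the chosen level.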

show: bool, optional

If set to True, the Plotting object is returned.

kind: str, optional

ACF Type.

  • bar:

    Classical Autocorrelation Plot using bars.

  • heatmap:

    Draws the ACF heatmap.

  • line:

    Draws the ACF using a Line Plot.

mround: int, optional

Number of digits to which the coefficients are rounded. Used only when displaying the ACF matrix (kind must be set to ‘heatmap’).

chart: PlottingObject, optional

The chart object used to plot.

**style_kwargs

Any optional parameter to pass to the plotting functions.

Returns#

obj

Plotting Object.

Examples#

Import the amazon dataset from VerticaPy.

from verticapy.datasets import load_amazon

data = load_amazon()
     date        state                 number
     Date        Varchar(32)           Integer
  1  1998-01-01  ACRE                  0
  2  1998-01-01  ALAGOAS               0
  3  1998-01-01  AMAPÁ                 0
  4  1998-01-01  AMAZONAS              0
  5  1998-01-01  BAHIA                 0
  6  1998-01-01  CEARÁ                 0
  7  1998-01-01  DISTRITO FEDERAL      0
  8  1998-01-01  ESPÍRITO SANTO        0
  9  1998-01-01  GOIÁS                 0
 10  1998-01-01  MARANHÃO              0
 11  1998-01-01  MATO GROSSO           0
 12  1998-01-01  MATO GROSSO DO SUL    0
 13  1998-01-01  MINAS GERAIS          0
 14  1998-01-01  PARANÁ                0
 15  1998-01-01  PARAÍBA               0
 16  1998-01-01  PARÁ                  0
 17  1998-01-01  PERNAMBUCO            0
 18  1998-01-01  PIAUÍ                 0
 19  1998-01-01  RIO DE JANEIRO        0
 20  1998-01-01  RIO GRANDE DO NORTE   0
 21  1998-01-01  RIO GRANDE DO SUL     0
 22  1998-01-01  RONDÔNIA              0
 23  1998-01-01  RORAIMA               0
 24  1998-01-01  SANTA CATARINA        0
 25  1998-01-01  SERGIPE               0
 26  1998-01-01  SÃO PAULO             0
 27  1998-01-01  TOCANTINS             0
 28  1998-02-01  ACRE                  0
 29  1998-02-01  ALAGOAS               0
 30  1998-02-01  AMAPÁ                 0
 31  1998-02-01  AMAZONAS              0
 32  1998-02-01  BAHIA                 0
 33  1998-02-01  CEARÁ                 0
 34  1998-02-01  DISTRITO FEDERAL      0
 35  1998-02-01  ESPÍRITO SANTO        0
 36  1998-02-01  GOIÁS                 0
 37  1998-02-01  MARANHÃO              0
 38  1998-02-01  MATO GROSSO           0
 39  1998-02-01  MATO GROSSO DO SUL    0
 40  1998-02-01  MINAS GERAIS          0
 41  1998-02-01  PARANÁ                0
 42  1998-02-01  PARAÍBA               0
 43  1998-02-01  PARÁ                  0
 44  1998-02-01  PERNAMBUCO            0
 45  1998-02-01  PIAUÍ                 0
 46  1998-02-01  RIO DE JANEIRO        0
 47  1998-02-01  RIO GRANDE DO NORTE   0
 48  1998-02-01  RIO GRANDE DO SUL     0
 49  1998-02-01  RONDÔNIA              0
 50  1998-02-01  RORAIMA               0
 51  1998-02-01  SANTA CATARINA        0
 52  1998-02-01  SERGIPE               0
 53  1998-02-01  SÃO PAULO             0
 54  1998-02-01  TOCANTINS             0
 55  1998-03-01  ACRE                  0
 56  1998-03-01  ALAGOAS               0
 57  1998-03-01  AMAPÁ                 0
 58  1998-03-01  AMAZONAS              0
 59  1998-03-01  BAHIA                 0
 60  1998-03-01  CEARÁ                 0
 61  1998-03-01  DISTRITO FEDERAL      0
 62  1998-03-01  ESPÍRITO SANTO        0
 63  1998-03-01  GOIÁS                 0
 64  1998-03-01  MARANHÃO              0
 65  1998-03-01  MATO GROSSO           0
 66  1998-03-01  MATO GROSSO DO SUL    0
 67  1998-03-01  MINAS GERAIS          0
 68  1998-03-01  PARANÁ                0
 69  1998-03-01  PARAÍBA               0
 70  1998-03-01  PARÁ                  0
 71  1998-03-01  PERNAMBUCO            0
 72  1998-03-01  PIAUÍ                 0
 73  1998-03-01  RIO DE JANEIRO        0
 74  1998-03-01  RIO GRANDE DO NORTE   0
 75  1998-03-01  RIO GRANDE DO SUL     0
 76  1998-03-01  RONDÔNIA              0
 77  1998-03-01  RORAIMA               0
 78  1998-03-01  SANTA CATARINA        0
 79  1998-03-01  SERGIPE               0
 80  1998-03-01  SÃO PAULO             0
 81  1998-03-01  TOCANTINS             0
 82  1998-04-01  ACRE                  0
 83  1998-04-01  ALAGOAS               0
 84  1998-04-01  AMAPÁ                 0
 85  1998-04-01  AMAZONAS              0
 86  1998-04-01  BAHIA                 0
 87  1998-04-01  CEARÁ                 0
 88  1998-04-01  DISTRITO FEDERAL      0
 89  1998-04-01  ESPÍRITO SANTO        0
 90  1998-04-01  GOIÁS                 0
 91  1998-04-01  MARANHÃO              0
 92  1998-04-01  MATO GROSSO           0
 93  1998-04-01  MATO GROSSO DO SUL    0
 94  1998-04-01  MINAS GERAIS          0
 95  1998-04-01  PARANÁ                0
 96  1998-04-01  PARAÍBA               0
 97  1998-04-01  PARÁ                  0
 98  1998-04-01  PERNAMBUCO            0
 99  1998-04-01  PIAUÍ                 0
100  1998-04-01  RIO DE JANEIRO        0
Rows: 1-100 | Columns: 3

Draw the ACF Plot.

data.acf(
    column = "number",
    ts = "date",
    by = "state",
    method = "pearson",
    p = 24,
)

For more examples, please look at the Auto-Correlation Plot page of the Chart Gallery.

See also

vDataFrame.pacf() : Computes the partial autocorrelations.