verticapy.vDataFrame.acf#
- vDataFrame.acf(column: str, ts: str, by: str | list[str] | None = None, p: int | list = 12, unit: str = 'rows', method: Literal['pearson', 'kendall', 'spearman', 'spearmand', 'biserial', 'cramer'] = 'pearson', confidence: bool = True, alpha: float = 0.95, show: bool = True, kind: Literal['line', 'heatmap', 'bar'] = 'bar', mround: int = 3, chart: PlottingBase | TableSample | Axes | mFigure | Highchart | Highstock | Figure | None = None, **style_kwargs) -> PlottingBase | TableSample | Axes | mFigure | Highchart | Highstock | Figure
Calculates the correlations between the specified vDataColumn and its various time lags. This function is particularly useful for time series analysis and forecasting as it helps uncover relationships between data points at different time intervals. Understanding these correlations can be vital for making predictions and gaining insights into temporal data patterns.
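As a rough illustration of what a lagged correlation measures, the following pure-Python sketch pairs each value with the value k rows earlier and computes Pearson's coefficient over those pairs. It runs client-side on a small made-up list and is not VerticaPy's implementation, which runs in the database; the helper names `pearson` and `lag_correlation` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lag_correlation(series, k):
    """Correlate the series with itself shifted by k rows (k >= 1)."""
    return pearson(series[k:], series[:-k])


# A made-up toy series that repeats every 4 rows.
series = [1, 3, 2, 5] * 6
correlations = {k: round(lag_correlation(series, k), 3) for k in range(1, 9)}
print(correlations)
```

Because the toy series repeats every 4 rows, the correlation at lags 4 and 8 is 1.0, which is the kind of seasonal pattern an autocorrelation plot makes visible.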
Parameters#
- column: str
Input vDataColumn used to compute the Auto Correlation Plot.
- ts: str
TS (Time Series) vDataColumn used to order the data. It can be of type date or a numerical vDataColumn.
- by: SQLColumns, optional
vDataColumns used in the partition.
- p: int | list, optional
Either an integer giving the maximum number of lags to consider during the computation, or a list of the specific lags to include. p must be a positive integer or a list of positive integers.
- unit: str, optional
Unit used to compute the lags.
- rows:
Natural lags.
- else:
Any time unit. For example, you can write ‘hour’ to compute hourly lags or ‘day’ to compute daily lags.
- method: str, optional
Method used to compute the correlation.
- pearson:
Pearson’s correlation coefficient (linear).
- spearman:
Spearman’s correlation coefficient (monotonic - rank based).
- spearmanD:
Spearman’s correlation coefficient using the DENSE RANK function instead of the RANK function.
- kendall:
Kendall’s correlation coefficient (similar trends). The method computes the Tau-B coefficient.
Warning
This method uses a CROSS JOIN during computation and is therefore computationally expensive at O(n * n), where n is the total count of the vDataFrame.
- cramer:
Cramer’s V (correlation between categories).
- biserial:
Point-biserial correlation (correlation between a binary variable and a numerical variable).
- confidence: bool, optional
If set to True, the confidence band is drawn.
- alpha: float, optional
Significance Level. Probability to accept H0. Only used to compute the confidence band width.
- show: bool, optional
If set to True, the Plotting object is returned.
- kind: str, optional
ACF Type.
- bar:
Classical Autocorrelation Plot using bars.
- heatmap:
Draws the ACF heatmap.
- line:
Draws the ACF using a Line Plot.
- mround: int, optional
Round the coefficient using the input number of digits. It is used only to display the ACF Matrix (kind must be set to ‘heatmap’).
- chart: PlottingObject, optional
The chart object used to plot.
- **style_kwargs
Any optional parameter to pass to the plotting functions.
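The role of alpha can be illustrated with a common textbook construction: if the series is white noise, sample autocorrelations are approximately normal with standard deviation 1/sqrt(n), so a two-sided band at level alpha has half-width z/sqrt(n). The sketch below assumes that construction and treats alpha as the confidence level, in line with its default of 0.95; VerticaPy's exact band formula may differ, and `white_noise_band` is a hypothetical helper.

```python
import math
from statistics import NormalDist


def white_noise_band(n, alpha=0.95):
    """Half-width of a two-sided band for n observations.

    Assumes the white-noise approximation; this is an illustration,
    not necessarily the formula VerticaPy uses.
    """
    z = NormalDist().inv_cdf((1 + alpha) / 2)  # about 1.96 for alpha = 0.95
    return z / math.sqrt(n)


half_width = white_noise_band(n=100)
print(round(half_width, 3))  # → 0.196
```

Under this assumption, coefficients that fall outside the band are unlikely to come from an uncorrelated series, and the band narrows as the number of observations grows.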
Returns#
- obj
Plotting Object.
Examples#
Import the amazon dataset from VerticaPy.
from verticapy.datasets import load_amazon

data = load_amazon()
     date        state                number
     Date        Varchar(32)          Integer
1    1998-01-01  ACRE                 0
2    1998-01-01  ALAGOAS              0
3    1998-01-01  AMAPÁ                0
4    1998-01-01  AMAZONAS             0
5    1998-01-01  BAHIA                0
6    1998-01-01  CEARÁ                0
7    1998-01-01  DISTRITO FEDERAL     0
8    1998-01-01  ESPÍRITO SANTO       0
9    1998-01-01  GOIÁS                0
10   1998-01-01  MARANHÃO             0
11   1998-01-01  MATO GROSSO          0
12   1998-01-01  MATO GROSSO DO SUL   0
13   1998-01-01  MINAS GERAIS         0
14   1998-01-01  PARANÁ               0
15   1998-01-01  PARAÍBA              0
16   1998-01-01  PARÁ                 0
17   1998-01-01  PERNAMBUCO           0
18   1998-01-01  PIAUÍ                0
19   1998-01-01  RIO DE JANEIRO       0
20   1998-01-01  RIO GRANDE DO NORTE  0
21   1998-01-01  RIO GRANDE DO SUL    0
22   1998-01-01  RONDÔNIA             0
23   1998-01-01  RORAIMA              0
24   1998-01-01  SANTA CATARINA       0
25   1998-01-01  SERGIPE              0
26   1998-01-01  SÃO PAULO            0
27   1998-01-01  TOCANTINS            0
28   1998-02-01  ACRE                 0
29   1998-02-01  ALAGOAS              0
30   1998-02-01  AMAPÁ                0
31   1998-02-01  AMAZONAS             0
32   1998-02-01  BAHIA                0
33   1998-02-01  CEARÁ                0
34   1998-02-01  DISTRITO FEDERAL     0
35   1998-02-01  ESPÍRITO SANTO       0
36   1998-02-01  GOIÁS                0
37   1998-02-01  MARANHÃO             0
38   1998-02-01  MATO GROSSO          0
39   1998-02-01  MATO GROSSO DO SUL   0
40   1998-02-01  MINAS GERAIS         0
41   1998-02-01  PARANÁ               0
42   1998-02-01  PARAÍBA              0
43   1998-02-01  PARÁ                 0
44   1998-02-01  PERNAMBUCO           0
45   1998-02-01  PIAUÍ                0
46   1998-02-01  RIO DE JANEIRO       0
47   1998-02-01  RIO GRANDE DO NORTE  0
48   1998-02-01  RIO GRANDE DO SUL    0
49   1998-02-01  RONDÔNIA             0
50   1998-02-01  RORAIMA              0
51   1998-02-01  SANTA CATARINA       0
52   1998-02-01  SERGIPE              0
53   1998-02-01  SÃO PAULO            0
54   1998-02-01  TOCANTINS            0
55   1998-03-01  ACRE                 0
56   1998-03-01  ALAGOAS              0
57   1998-03-01  AMAPÁ                0
58   1998-03-01  AMAZONAS             0
59   1998-03-01  BAHIA                0
60   1998-03-01  CEARÁ                0
61   1998-03-01  DISTRITO FEDERAL     0
62   1998-03-01  ESPÍRITO SANTO       0
63   1998-03-01  GOIÁS                0
64   1998-03-01  MARANHÃO             0
65   1998-03-01  MATO GROSSO          0
66   1998-03-01  MATO GROSSO DO SUL   0
67   1998-03-01  MINAS GERAIS         0
68   1998-03-01  PARANÁ               0
69   1998-03-01  PARAÍBA              0
70   1998-03-01  PARÁ                 0
71   1998-03-01  PERNAMBUCO           0
72   1998-03-01  PIAUÍ                0
73   1998-03-01  RIO DE JANEIRO       0
74   1998-03-01  RIO GRANDE DO NORTE  0
75   1998-03-01  RIO GRANDE DO SUL    0
76   1998-03-01  RONDÔNIA             0
77   1998-03-01  RORAIMA              0
78   1998-03-01  SANTA CATARINA       0
79   1998-03-01  SERGIPE              0
80   1998-03-01  SÃO PAULO            0
81   1998-03-01  TOCANTINS            0
82   1998-04-01  ACRE                 0
83   1998-04-01  ALAGOAS              0
84   1998-04-01  AMAPÁ                0
85   1998-04-01  AMAZONAS             0
86   1998-04-01  BAHIA                0
87   1998-04-01  CEARÁ                0
88   1998-04-01  DISTRITO FEDERAL     0
89   1998-04-01  ESPÍRITO SANTO       0
90   1998-04-01  GOIÁS                0
91   1998-04-01  MARANHÃO             0
92   1998-04-01  MATO GROSSO          0
93   1998-04-01  MATO GROSSO DO SUL   0
94   1998-04-01  MINAS GERAIS         0
95   1998-04-01  PARANÁ               0
96   1998-04-01  PARAÍBA              0
97   1998-04-01  PARÁ                 0
98   1998-04-01  PERNAMBUCO           0
99   1998-04-01  PIAUÍ                0
100  1998-04-01  RIO DE JANEIRO       0
Rows: 1-100 | Columns: 3
Draw the ACF Plot.
data.acf(
    column = "number",
    ts = "date",
    by = "state",
    method = "pearson",
    p = 24,
)
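To see why the by argument matters for a dataset like this one, where each date has one row per state, the sketch below builds lagged pairs separately within each partition so that a value is never paired with a value from another group. This is an assumption about the partitioning semantics, written in plain Python on made-up toy rows; it is not VerticaPy code, and `lagged_pairs_by_group` is a hypothetical helper.

```python
from collections import defaultdict


def lagged_pairs_by_group(rows, k):
    """Return (value, value k steps earlier) pairs, built per group.

    rows: iterable of (ts, group, value) tuples.
    """
    groups = defaultdict(list)
    for ts, group, value in sorted(rows):  # order by ts first
        groups[group].append(value)
    pairs = []
    for values in groups.values():
        # Pair each value with the one k steps earlier in the same group.
        pairs.extend(zip(values[k:], values[:-k]))
    return pairs


# Made-up toy rows: two groups observed at three time steps.
rows = [
    (1, "A", 10), (1, "B", 100),
    (2, "A", 20), (2, "B", 200),
    (3, "A", 30), (3, "B", 300),
]
print(lagged_pairs_by_group(rows, k=1))
# → [(20, 10), (30, 20), (200, 100), (300, 200)]
```

Without the grouping step, a one-row lag over these rows ordered by time would pair group B's value with group A's value from the same time step, which is not a time lag at all.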
For more examples, please look at the Auto-Correlation Plot page of the Chart Gallery.
See also
vDataFrame.pacf(): Computes the partial autocorrelations.