verticapy.vDataFrame.aad

vDataFrame.aad(columns: str | list[str] | None = None, **agg_kwargs) → TableSample

Aggregates the vDataFrame using aad (Average Absolute Deviation). AAD is the mean of the absolute differences between each data point and the column mean, so it summarizes how far values typically lie from the mean and gives a direct measure of the spread of the data.
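Written out, the definition described above is the following, where the symbols n (number of values), x_i (the i-th value), and x̄ (the column mean) are notation introduced here for illustration:

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\mathrm{AAD} = \frac{1}{n}\sum_{i=1}^{n} \left| x_i - \bar{x} \right|
```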

This method is useful when you want a measure of variability that gives equal weight to all data points, regardless of the direction of their deviation. The result indicates overall data consistency and can be used in various analytical and quality assessment contexts.

Warning

To compute aad, VerticaPy needs either multiple queries or, at a minimum, one query that includes a subquery, because the mean must be known before the deviations from it can be averaged. This is why calculating aad is typically slower than some other types of aggregations.
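The need for a subquery can be illustrated outside Vertica. The sketch below uses Python's built-in sqlite3 module, not VerticaPy, and the table name t and helper aad_via_subquery are hypothetical. It is not the SQL that VerticaPy actually generates; it only shows the general shape of a query in which the mean is computed first and the deviations are averaged afterwards.

```python
import sqlite3


def aad_via_subquery(values):
    """Average absolute deviation computed by one SQL statement with a subquery."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute("CREATE TABLE t (x REAL)")
        conn.executemany("INSERT INTO t (x) VALUES (?)", [(v,) for v in values])
        # Inner query: the column mean.
        # Outer query: the average of the absolute deviations from that mean.
        row = conn.execute(
            "SELECT AVG(ABS(x - (SELECT AVG(x) FROM t))) FROM t"
        ).fetchone()
        return row[0]
    finally:
        conn.close()


print(aad_via_subquery([1, 2, 4, 9, 10, 15, 20, 22]))
```

The values passed here are those of column "x" in the example further down, so the printed number can be compared with the aad reported there.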

Parameters

columns: SQLColumns, optional: List of vDataColumn names. If empty, all vDataColumns are used.
**agg_kwargs: Any optional parameters to pass to the aggregate function.

Returns

TableSample: the aad of each selected vDataColumn.

Examples

For this example, we will use the following dataset:

import verticapy as vp

data = vp.vDataFrame(
    {
        "x": [1, 2, 4, 9, 10, 15, 20, 22],
        "y": [1, 2, 1, 2, 1, 1, 2, 1],
        "z": [10, 12, 2, 1, 9, 8, 1, 3],
    }
)

Now, let’s calculate the average absolute deviation for specific columns.

data.aad(
    columns = ["x", "y", "z"],
)

	aad
"x"	6.46875
"y"	0.46875
"z"	4.0
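The values in this table can be cross-checked without a database. The following is a minimal pure-Python sketch, assuming the standard definition of AAD as the mean absolute deviation from the mean; the helper name aad and the dictionary local_data are illustrative and not part of VerticaPy.

```python
from statistics import fmean


def aad(values):
    """Mean of the absolute deviations of `values` from their mean."""
    mean = fmean(values)
    return fmean([abs(v - mean) for v in values])


# Same columns as the vDataFrame in the example above.
local_data = {
    "x": [1, 2, 4, 9, 10, 15, 20, 22],
    "y": [1, 2, 1, 2, 1, 1, 2, 1],
    "z": [10, 12, 2, 1, 9, 8, 1, 3],
}

# One AAD value per column, mirroring the layout of the TableSample.
local_result = {name: aad(column) for name, column in local_data.items()}

for name, value in local_result.items():
    print(name, value)
```

For column "x", for instance, the mean is 83 / 8 = 10.375, the absolute deviations sum to 51.75, and 51.75 / 8 = 6.46875, which agrees with the table.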

Note

All the calculations are pushed to the database.

Hint

For more precise control, please refer to the aggregate method.