verticapy.vDataFrame.kurtosis

vDataFrame.kurtosis(columns: str | list[str] | None = None, **agg_kwargs) → TableSample

Calculates the kurtosis of the vDataFrame, a measure of the peakedness or tailedness of the data. Kurtosis describes the shape of a distribution: it quantifies whether the data has heavier tails or a sharper peak than a normal distribution.

Aggregating the vDataFrame with kurtosis summarizes the shape of each column’s distribution in a single value.

Warning

VerticaPy may need to execute multiple queries to compute kurtosis. At a minimum, it needs one query that includes a subquery. This is why kurtosis is typically slower than other aggregations.

Parameters

columns: SQLColumns, optional: Name of a vDataColumn, or list of vDataColumn names. If empty, all vDataColumns are used.
**agg_kwargs: Any optional parameters to pass to the aggregate method.

Returns

TableSample: the kurtosis of each selected vDataColumn.

Examples

For this example, we will use the following dataset:

import verticapy as vp

data = vp.vDataFrame(
    {
        "x": [1, 2, 4, 9, 10, 15, 20, 22],
        "y": [1, 2, 1, 2, 1, 1, 2, 1],
        "z": [10, 12, 2, 1, 9, 8, 1, 3],
    }
)

Now, let’s calculate the kurtosis for specific columns.

data.kurtosis(
    columns = ["x", "y", "z"],
)

	kurtosis
"x"	-1.44661035091946
"y"	-2.24000000000001
"z"	-2.08357035495433

Note

All the calculations are pushed to the database.
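
As a client-side sanity check, the values in the example output are consistent with the bias-corrected sample excess kurtosis, which is 0 for a normal distribution. This is inferred from the printed numbers, not from the VerticaPy source. The helper `sample_excess_kurtosis` below is hypothetical and not part of VerticaPy. It is a minimal pure-Python sketch that needs no database connection:

```python
def sample_excess_kurtosis(values):
    """Bias-corrected sample excess kurtosis (hypothetical helper, not VerticaPy API)."""
    n = len(values)
    if n < 4:
        raise ValueError("at least 4 values are required")
    mean = sum(values) / n
    # Second and fourth central moments (population form, divided by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    # Population excess kurtosis.
    g2 = m4 / m2**2 - 3
    # Small-sample bias correction.
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)


# Same data as the vDataFrame in the example.
columns = {
    "x": [1, 2, 4, 9, 10, 15, 20, 22],
    "y": [1, 2, 1, 2, 1, 1, 2, 1],
    "z": [10, 12, 2, 1, 9, 8, 1, 3],
}

for name, values in columns.items():
    print(name, sample_excess_kurtosis(values))
```

For this dataset the printed values agree with the example output up to floating-point rounding. All three are negative, which indicates lighter tails than a normal distribution.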

Hint

For more precise control, please refer to the aggregate method.