verticapy.vDataFrame.count_percent

vDataFrame.count_percent(columns: str | list[str] | None = None, sort_result: bool = True, desc: bool = True, **agg_kwargs) → TableSample

Aggregates the vDataFrame using the count and percent functions. The count function computes the number of non-missing (non-null) values in each selected column, which indicates how complete the data is.

The percent function computes the percentage of non-missing values relative to the total number of rows, expressing the same completeness as a proportion.
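
To make the two definitions concrete, here is a minimal plain-Python sketch of the same arithmetic. It is not VerticaPy's implementation (the real computation runs in the database); the helper name count_and_percent and the sample values containing None are assumptions made purely for illustration.

```python
def count_and_percent(column_values):
    """Return (count, percent) for one column, treating None as missing."""
    total = len(column_values)
    # count: number of non-missing values
    count = sum(1 for value in column_values if value is not None)
    # percent: share of non-missing values among all rows;
    # returning 0.0 for an empty column is a choice of this sketch
    percent = 100.0 * count / total if total else 0.0
    return count, percent


# Hypothetical column with 2 missing values out of 8 rows.
sample = [1, None, 4, 9, None, 15, 20, 22]
count, percent = count_and_percent(sample)
print(count, percent)  # prints: 6 75.0
```

A column without missing values yields a count equal to the number of rows and a percent of 100.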

Parameters

columns: SQLColumns, optional: Name or list of names of the vDataColumns to aggregate. If empty, all vDataColumns are used.
sort_result: bool, optional: If set to True, the result is sorted.
desc: bool, optional: If set to True and sort_result is set to True, the result is sorted in descending order.
**agg_kwargs: Any optional parameter to pass to the aggregate method.
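
The interaction between sort_result and desc can be sketched in plain Python as follows. This assumes the result is ordered by the count value, which the parameter descriptions above do not state explicitly; the function name order_rows and the row tuples are hypothetical.

```python
def order_rows(rows, sort_result=True, desc=True):
    """Order (column_name, count, percent) rows as the two flags describe.

    Assumption of this sketch: rows are ordered by their count value.
    """
    if not sort_result:
        # desc has no effect when sorting is disabled
        return list(rows)
    return sorted(rows, key=lambda row: row[1], reverse=desc)


# Hypothetical result rows: (column name, count, percent)
rows = [("a", 4, 50.0), ("b", 7, 87.5), ("c", 6, 75.0)]

print(order_rows(rows))                     # highest count first
print(order_rows(rows, desc=False))         # lowest count first
print(order_rows(rows, sort_result=False))  # original order kept
```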

Returns

TableSample: result, with one row per vDataColumn and its count and percent values.

Examples

For this example, we will use the following dataset:

import verticapy as vp

data = vp.vDataFrame(
    {
        "x": [1, 2, 4, 9, 10, 15, 20, 22],
        "y": [1, 2, 1, 2, 1, 1, 2, 1],
        "z": [10, 12, 2, 1, 9, 8, 1, 3],
    }
)

Now, let’s calculate the count and percent for specific columns.

data.count_percent(
    columns = ["x", "y", "z"],
)

	...	count	percent
"y"	...	8.0	100.0
"z"	...	8.0	100.0
"x"	...	8.0	100.0

Note

All the calculations are pushed to the database.

Hint

For more precise control, please refer to the aggregate method.
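
As a rough sketch of what that looks like, the same two aggregates can be requested through aggregate directly. This assumes aggregate accepts the function names "count" and "percent" through a func argument alongside columns; check the aggregate reference for the exact signature. The wrapper name count_percent_via_aggregate is hypothetical.

```python
def count_percent_via_aggregate(vdf, columns=None):
    """Request count and percent through the generic aggregate method.

    Assumption: vdf.aggregate accepts func (a list of aggregate names)
    and columns (a list of column names, or None for all columns).
    """
    return vdf.aggregate(func=["count", "percent"], columns=columns)
```

With the vDataFrame from the example above, this would be called as count_percent_via_aggregate(data, columns=["x", "y"]). Calling aggregate yourself lets you add other aggregate functions to the func list in the same call.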