verticapy.vDataFrame.quantile
vDataFrame.quantile(q: int | float | Decimal | list | ndarray, columns: str | list[str] | None = None, approx: bool = True, **agg_kwargs) -> TableSample
Aggregates the vDataFrame using the specified quantile. The quantile function describes how the data is distributed: given a quantile value q, it returns the data point below which a fraction q of the data falls. This is useful for analyzing data distributions, assessing skewness, and computing percentiles such as the median or quartiles.
Warning
The quantile aggregation operates in two modes. Depending on the approx parameter, it uses either the APPROXIMATE_QUANTILE or the QUANTILE method to compute the result. APPROXIMATE_QUANTILE is faster because it estimates the quantile values, while QUANTILE computes exact quantiles. Choose between them according to the balance of speed and precision that your analysis requires.
Parameters
- q: PythonNumber / ArrayLike
Quantile or list of quantiles to compute. Each value must be a number between 0 and 1. For example, [0.25, 0.75] returns Q1 and Q3.
- columns: SQLColumns, optional
List of vDataColumn names. If empty, all numerical vDataColumns are used.
- approx: bool, optional
If set to True, the approximate quantile is returned. Setting this parameter to False can drastically decrease the function’s performance.
- **agg_kwargs
Any optional parameter to pass to the Aggregate function.
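The constraint on q (a single number or a list of numbers, each between 0 and 1) can be checked on the client side before a query is issued. The following is a minimal sketch of such a check. The helper normalize_quantiles is hypothetical and not part of VerticaPy, which may perform its own validation.

```python
from decimal import Decimal
from numbers import Real


def normalize_quantiles(q):
    """Return q as a list of floats, checking that each value lies in [0, 1].

    Hypothetical client-side helper; it is not part of VerticaPy.
    """
    # Accept a single number (int, float, Decimal) or any iterable of numbers.
    values = [q] if isinstance(q, (Real, Decimal)) else list(q)
    result = []
    for value in values:
        number = float(value)
        if not 0.0 <= number <= 1.0:
            raise ValueError(f"quantile {value!r} is outside [0, 1]")
        result.append(number)
    return result


print(normalize_quantiles(0.5))           # a single quantile
print(normalize_quantiles([0.25, 0.75]))  # Q1 and Q3
```

The returned list could then be passed as the q argument of quantile.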
Returns
- TableSample
Result of the aggregation, with one row per column and one column per requested quantile.
Examples
For this example, we will use the following dataset:
import verticapy as vp

data = vp.vDataFrame(
    {
        "x": [1, 2, 4, 9, 10, 15, 20, 22],
        "y": [1, 2, 1, 2, 1, 1, 2, 1],
        "z": [10, 12, 2, 1, 9, 8, 1, 3],
    }
)
Now, let’s calculate some approximate quantiles for specific columns.
data.quantile(
    q = [0.1, 0.2, 0.5, 0.9],
    columns = ["x", "y", "z"],
    approx = True,
)
        ...   approx_50%   approx_90%
"x"     ...          9.5         20.6
"y"     ...          1.0          2.0
"z"     ...          5.5         10.6
Note
All the calculations are pushed to the database.
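The approx_50% and approx_90% values shown in the output can be cross-checked locally on this small dataset. The sketch below is plain Python and does not call VerticaPy. It assumes linear interpolation between order statistics, which reproduces the figures shown for this example. The linear_quantile function is a hypothetical helper, and the database's approximate method may give slightly different values on larger data.

```python
import math


def linear_quantile(values, q):
    """Quantile of `values` using linear interpolation between order statistics."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(values)
    # Fractional 0-based position of the quantile in the sorted data.
    position = q * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Same data as the vDataFrame in the example.
columns = {
    "x": [1, 2, 4, 9, 10, 15, 20, 22],
    "y": [1, 2, 1, 2, 1, 1, 2, 1],
    "z": [10, 12, 2, 1, 9, 8, 1, 3],
}

for name, values in columns.items():
    median = linear_quantile(values, 0.5)
    p90 = linear_quantile(values, 0.9)
    print(name, round(median, 2), round(p90, 2))
```

For this dataset the loop prints 9.5 and 20.6 for x, 1.0 and 2.0 for y, and 5.5 and 10.6 for z.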
Hint
For more precise control, please refer to the aggregate method.
See also
vDataColumn.aggregate(): Aggregations for a specific column.
vDataFrame.aggregate(): Aggregates for particular columns.