verticapy.vDataColumn.sem

vDataColumn.sem() → bool | float | str | timedelta | datetime

Aggregates the vDataColumn using sem (standard error of the mean), a statistical measure that estimates how precisely the sample mean approximates the population mean.

The result quantifies the variability, or uncertainty, of the sample mean: the smaller the sem, the more reliable the sample mean is as an estimate of the true population mean.

It is particularly useful when evaluating the precision of sample statistics or when making inferences about a larger dataset from a sample.

Warning

Computing sem requires VerticaPy to execute multiple queries: at a minimum, one query that includes a subquery. Because of this extra complexity, sem is typically slower than some other types of aggregations.

Returns

PythonScalar: the sem of the vDataColumn

Examples

For this example, let’s generate a dataset and calculate the standard error of the mean of a column:

import verticapy as vp

data = vp.vDataFrame(
    {
        "x": [1, 2, 4, 9, 10, 15, 20, 22],
        "y": [1, 2, 1, 2, 1, 1, 2, 1],
        "z": [10, 12, 2, 1, 9, 8, 1, 3],
    }
)


data["x"].sem()
Out[3]: 2.83433980723151
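
As a sanity check, the value above can be reproduced outside the database with the Python standard library. This is a minimal sketch, assuming sem is the sample standard deviation (n - 1 denominator) divided by the square root of the number of values, which agrees with the output shown. The helper name standard_error_of_mean is hypothetical and is not part of VerticaPy.

```python
import math
import statistics

# Same values as column "x" in the example above.
x = [1, 2, 4, 9, 10, 15, 20, 22]


def standard_error_of_mean(values):
    """Hypothetical helper: sample standard deviation divided by sqrt(n)."""
    # statistics.stdev uses the n - 1 (sample) denominator.
    return statistics.stdev(values) / math.sqrt(len(values))


sem_x = standard_error_of_mean(x)
print(round(sem_x, 6))  # 2.83434
```

Unlike vDataColumn.sem(), this sketch runs locally on an in-memory list, so it is only suitable for verifying small examples.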

Note

All the calculations are pushed to the database.

Hint

For more precise control, please refer to the aggregate method.