verticapy.vDataFrame.sem
- vDataFrame.sem(columns: str | list[str] | None = None, **agg_kwargs) -> TableSample
Leverages the sem (Standard Error of the Mean) aggregation to summarize the vDataFrame. The Standard Error of the Mean is a statistical measure that estimates how precisely the sample mean approximates the population mean.
When we aggregate the vDataFrame using sem, we gain insight into the variability or uncertainty associated with the sample mean. This helps us assess how reliable the sample mean is as an estimate of the true population mean.
The measure is particularly useful when evaluating the precision of sample statistics or making inferences about a larger dataset based on a sample.
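For reference, the usual textbook definition of the standard error of the mean is sketched below. The symbols n (number of non-null values), s (sample standard deviation), x_i and \bar{x} are notation introduced here for illustration; this page does not state the exact expression VerticaPy evaluates, so treat this as an assumption rather than the library's implementation.

```latex
\mathrm{SEM} = \frac{s}{\sqrt{n}},
\qquad
s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}
```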
Warning
To compute sem, VerticaPy needs to execute multiple queries; at a minimum, it requires a query that includes a subquery. This added complexity is why calculating sem is typically slower than some other types of aggregations.
Parameters
- columns: SQLColumns, optional
List of vDataColumn names. If empty, all numerical vDataColumns are used.
- **agg_kwargs
Any optional parameter to pass to the Aggregate function.
Returns
- TableSample
result.
Examples
For this example, we will use the following dataset:
import verticapy as vp

data = vp.vDataFrame(
    {
        "x": [1, 2, 4, 9, 10, 15, 20, 22],
        "y": [1, 2, 1, 2, 1, 1, 2, 1],
        "z": [10, 12, 2, 1, 9, 8, 1, 3],
    }
)
Now, let’s calculate the standard error of the mean for specific columns.
data.sem(
    columns=["x", "y", "z"],
)
       sem
"x"    2.83433980723151
"y"    0.18298126367785
"z"    1.57831284242745
Note
All the calculations are pushed to the database.
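As a local cross-check of the example, the sketch below recomputes the same quantity with the Python standard library only. It assumes sem is the sample standard deviation (n - 1 denominator) divided by the square root of the number of values. The helper name `sem` and the dictionary `local_sem` are hypothetical and not part of VerticaPy.

```python
import math
import statistics


def sem(values):
    """Assumed definition: sample standard deviation (n - 1) over sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Same data as the vDataFrame in the example, held in local memory.
columns = {
    "x": [1, 2, 4, 9, 10, 15, 20, 22],
    "y": [1, 2, 1, 2, 1, 1, 2, 1],
    "z": [10, 12, 2, 1, 9, 8, 1, 3],
}

# Compute the standard error of the mean for each column.
local_sem = {name: sem(values) for name, values in columns.items()}

for name, value in local_sem.items():
    print(name, value)
```

Under that assumption, the locally computed values agree with the example output to within floating-point rounding. Unlike vDataFrame.sem, this check runs entirely in Python and does not touch the database.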
Hint
For more precise control, please refer to the
aggregate
method.