vDataFrame.outliers
In [ ]:
vDataFrame.outliers(columns: list = [],
                    name: str = "distribution_outliers",
                    threshold: float = 3.0,
                    robust: bool = False)
Adds a new vcolumn whose values are 0 or 1, where 1 means that the record is a global outlier.
Parameters
Name | Type | Optional | Description |
---|---|---|---|
columns | list | ✓ | List of vcolumn names. If empty, all numerical vcolumns are used. |
name | str | ✓ | Name of the new vcolumn. |
threshold | float | ✓ | Critical score. A record whose score exceeds this threshold is labeled as an outlier. |
robust | bool | ✓ | If set to True, the Robust Z-Score is used instead of the Z-Score. |
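The page does not show how the scores are computed, so the following is only a minimal pure-Python sketch of the usual definitions, not VerticaPy's actual implementation. It assumes the Z-Score is `|x - mean| / stdev`, the Robust Z-Score is `|x - median| / (1.4826 * MAD)`, and a record is flagged when the score of any selected column exceeds `threshold`. The helpers `z_scores` and `flag_outliers` and the toy `rows` data are hypothetical names made up for illustration.

```python
import statistics


def z_scores(values, robust=False):
    """Return the absolute Z-Score (or Robust Z-Score) of each value."""
    if robust:
        center = statistics.median(values)
        # Median absolute deviation, scaled by 1.4826 so that it is
        # comparable to the standard deviation for normal data.
        mad = statistics.median([abs(v - center) for v in values])
        spread = 1.4826 * mad
    else:
        center = statistics.mean(values)
        spread = statistics.stdev(values)  # sample standard deviation
    if spread == 0:
        # Assumption: with no spread, nothing is treated as an outlier.
        return [0.0 for _ in values]
    return [abs(v - center) / spread for v in values]


def flag_outliers(rows, columns, threshold=3.0, robust=False):
    """Label each row 1 if any selected column exceeds the threshold, else 0."""
    flags = [0] * len(rows)
    for col in columns:
        scores = z_scores([row[col] for row in rows], robust=robust)
        for i, score in enumerate(scores):
            if score > threshold:
                flags[i] = 1
    return flags


# Toy data (invented for this sketch): one extreme fare in the last row.
ages = [22, 25, 28, 30, 31, 33, 35, 38, 40, 41]
fares = [7, 8, 8, 9, 10, 10, 11, 12, 13, 500]
rows = [{"age": a, "fare": f} for a, f in zip(ages, fares)]

plain_2_5 = flag_outliers(rows, ["age", "fare"], threshold=2.5)
plain_3_0 = flag_outliers(rows, ["age", "fare"], threshold=3.0)
robust_3_0 = flag_outliers(rows, ["age", "fare"], threshold=3.0, robust=True)
```

In this toy sample the extreme fare inflates the mean and standard deviation, so its plain Z-Score stays just under 3.0 and it is flagged only at the lower threshold of 2.5. The median-based score is far less affected by the extreme value and flags the same row at the default threshold, which is the usual motivation for a robust option.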
In [44]:
from verticapy.datasets import load_titanic
titanic = load_titanic()
display(titanic)
In [55]:
titanic.scatter(columns = ["age", "fare"])
titanic.outliers(columns = ["age", "fare"],
                 name = "outliers",
                 threshold = 2.5)
In [56]:
titanic.scatter(columns = ["age", "fare"],
                catcol = "outliers")
See Also
vDataFrame.normalize | Normalizes the input vcolumns. |