verticapy.vDataFrame.outliers_plot
- vDataFrame.outliers_plot(columns: str | list[str], threshold: float = 3.0, max_nb_points: int = 500, color: str = 'orange', outliers_color: str = 'black', inliers_color: str = 'white', inliers_border_color: str = 'red', chart: PlottingBase | TableSample | Axes | mFigure | Highchart | Highstock | Figure | None = None, **style_kwargs) PlottingBase | TableSample | Axes | mFigure | Highchart | Highstock | Figure
Draws the global outliers plot of one or two columns based on their ZSCORE.
Parameters
- columns: SQLColumns
List of one or two vDataColumn names.
- threshold: float, optional
ZSCORE threshold used to detect outliers.
- max_nb_points: int, optional
Maximum number of points to display.
- color: ColorType, optional
Inliers Area color.
- outliers_color: ColorType, optional
Outliers color.
- inliers_color: ColorType, optional
Inliers color.
- inliers_border_color: ColorType, optional
Inliers border color.
- chart: PlottingObject, optional
The chart object to plot on.
- **style_kwargs
Any optional parameter to pass to the plotting functions.
Returns
- obj
Plotting Object.
Examples
Note
The example below is a very basic one. For more detailed examples and customization options, please see Machine Learning - Outliers.
Let’s begin by importing VerticaPy.
import verticapy as vp
Let’s also import numpy to create a dataset.
import numpy as np
We can create a variable N to fix the size:
N = 30
Let’s generate a dataset using the following data.
# Normal Distributions
x = np.random.normal(5, 1, round(N / 2))
y = np.random.normal(3, 1, round(N / 2))

# Creating a vDataFrame with a few outliers
data = vp.vDataFrame(
    {
        "x": np.concatenate([x, [15]]),
        "y": np.concatenate([y, [12]]),
    }
)
Below are examples of the two types of outliers_plot charts:
1D
data.outliers_plot(columns = ["x"])
2D
data.outliers_plot(columns = ["x", "y"])
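To reason about which points the 1D plot is likely to highlight, the z-score rule behind the threshold parameter can be sketched in plain Python, without a database connection. This is only an illustration and not VerticaPy's implementation: the helper name zscore_outliers is hypothetical, the data is a fixed stand-in for the random "x" column, and the use of the sample standard deviation is an assumption, since this page does not say which estimator is applied.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return (value, z) pairs whose absolute z-score exceeds the threshold.

    Hypothetical helper for illustration only; it is not part of VerticaPy.
    """
    mean = statistics.mean(values)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    std = statistics.stdev(values)
    if std == 0:
        return []  # all values identical, nothing can be flagged
    flagged = []
    for v in values:
        z = (v - mean) / std
        if abs(z) > threshold:
            flagged.append((v, z))
    return flagged


# Fifteen values centred on 5 plus one extreme value at 15,
# a deterministic stand-in for the "x" column built above.
x = [4.0, 4.5, 5.0, 5.5, 6.0] * 3 + [15.0]

flagged = zscore_outliers(x)  # default threshold of 3.0
print(flagged)  # flags only 15.0, whose z-score is about 3.61
```

Raising the threshold makes the rule stricter: with threshold=4.0 the same data yields no flagged points, which is one reason to tune threshold when a column is small or heavy-tailed.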