verticapy.vDataFrame.outliers_plot

Draws the global outliers plot of one or two columns, using each column's ZSCORE to separate inliers from outliers.
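To make the ZSCORE criterion concrete, here is a minimal sketch in plain Python of how a threshold on the absolute z-score separates outliers from inliers. It is not VerticaPy's implementation: the helper name `zscore_outliers`, the default of 3.0, and the use of the sample standard deviation are assumptions made for illustration.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds `threshold`.

    Hypothetical helper for illustration only. The default threshold and
    the sample standard deviation are assumptions of this sketch.
    """
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (assumption)
    return [v for v in values if abs((v - mean) / std) > threshold]


# Fifteen points clustered around 5, plus one point far away at 15.
sample = [4.8, 5.1, 5.3, 4.6, 5.0, 5.2, 4.9, 5.4,
          4.7, 5.0, 5.1, 4.9, 5.2, 4.8, 5.0, 15.0]

flagged = zscore_outliers(sample, threshold=3.0)
print(flagged)
```

With this sample only the value 15.0 is flagged. Raising the threshold to 4.0 flags nothing, because a single extreme point among 16 values also inflates the standard deviation it is measured against.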

Parameters

columns: SQLColumns: List of one or two vDataColumn names.
threshold: float, optional: ZSCORE threshold used to detect outliers.
max_nb_points: int, optional: Maximum number of points to display.
color: ColorType, optional: Color of the inliers area.
outliers_color: ColorType, optional: Color of the outlier points.
inliers_color: ColorType, optional: Color of the inlier points.
inliers_border_color: ColorType, optional: Color of the inliers border.
chart: PlottingObject, optional: The chart object to plot on.
**style_kwargs: Any optional parameter to pass to the plotting functions.

Returns

obj: Plotting Object.

Examples

Note

The example below is a basic one. For more detailed examples and customization options, see Machine Learning - Outliers.

Let’s begin by importing VerticaPy.

import verticapy as vp

Let’s also import numpy to create a dataset.

import numpy as np

We can create a variable N to fix the size of the dataset (each column draws round(N / 2) points):

N = 30

Let’s generate a dataset from two normal distributions and append one outlying point to it.

# Normal Distributions
x = np.random.normal(5, 1, round(N / 2))

y = np.random.normal(3, 1, round(N / 2))

# Creating a vDataFrame with one outlying row (x = 15, y = 12) appended
data = vp.vDataFrame(
    {
        "x": np.concatenate([x, [15]]),
        "y": np.concatenate([y, [12]]),
    }
)
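The draws above are unseeded, so each run yields different points and a slightly different plot. As a sketch of one way to make the example reproducible, the snippet below builds columns of the same shape with a fixed seed. It swaps NumPy for the standard library's `random` module; the helper name `make_columns` and the seed value are assumptions for illustration, and whether the resulting lists can be passed to `vp.vDataFrame` in place of NumPy arrays has not been checked here.

```python
import random


def make_columns(seed=42, n=30):
    """Build reproducible "x" and "y" columns with one outlying row.

    Hypothetical helper: stdlib `random` is used here instead of NumPy.
    """
    rng = random.Random(seed)  # private generator, fixed seed
    half = round(n / 2)

    # Normal distributions, as in the example above
    x = [rng.gauss(5, 1) for _ in range(half)]
    y = [rng.gauss(3, 1) for _ in range(half)]

    # Append the outlying row (x = 15, y = 12)
    return {"x": x + [15.0], "y": y + [12.0]}


columns = make_columns(seed=42, n=30)
print(len(columns["x"]), len(columns["y"]))
```

Calling `make_columns` twice with the same seed returns identical columns, which makes it easier to compare plots across runs.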

Below are examples of the two types of outliers_plot: one-dimensional (a single column) and two-dimensional (two columns):

data.outliers_plot(columns = ["x"])

data.outliers_plot(columns = ["x", "y"])
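For the two-column case, one plausible reading of the ZSCORE criterion is that a row counts as an outlier when either column's absolute z-score exceeds the threshold. The sketch below illustrates that rule in plain Python. It is an assumption made for illustration, not a statement of how VerticaPy combines the two columns, and the names `zscores` and `flag_rows` are hypothetical.

```python
import statistics


def zscores(values):
    """Z-scores of `values`, using the sample standard deviation (assumption)."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)
    return [(v - mean) / std for v in values]


def flag_rows(x, y, threshold=3.0):
    """Indices of rows where either column's |z-score| exceeds `threshold`.

    Hypothetical combination rule; the library may define the
    two-dimensional inlier region differently.
    """
    zx, zy = zscores(x), zscores(y)
    return [
        i
        for i, (a, b) in enumerate(zip(zx, zy))
        if abs(a) > threshold or abs(b) > threshold
    ]


# Fifteen rows near (5, 3), plus one row far away at (15, 12).
offsets = (-0.2, 0.1, 0.3, -0.4, 0.0, 0.2, -0.1, 0.4,
           -0.3, 0.0, 0.1, -0.1, 0.2, -0.2, 0.0)
x = [5.0 + d for d in offsets] + [15.0]
y = [3.0 + d for d in offsets] + [12.0]

print(flag_rows(x, y, threshold=3.0))
```

Under this rule only the last row (index 15) is flagged at a threshold of 3.0.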