verticapy.vDataFrame.outliers_plot

vDataFrame.outliers_plot(columns: str | list[str], threshold: float = 3.0, max_nb_points: int = 500, color: str = 'orange', outliers_color: str = 'black', inliers_color: str = 'white', inliers_border_color: str = 'red', chart: PlottingBase | TableSample | Axes | mFigure | Highchart | Highstock | Figure | None = None, **style_kwargs) -> PlottingBase | TableSample | Axes | mFigure | Highchart | Highstock | Figure

Draws the global outliers plot of one or two columns based on their ZSCORE.

Parameters

columns: SQLColumns

List of one or two vDataColumn names.

threshold: float, optional

ZSCORE threshold used to detect outliers.

max_nb_points: int, optional

Maximum number of points to display.

color: ColorType, optional

Color of the inliers area.

outliers_color: ColorType, optional

Outliers color.

inliers_color: ColorType, optional

Inliers color.

inliers_border_color: ColorType, optional

Inliers border color.

chart: PlottingObject, optional

The chart object to plot on.

**style_kwargs

Any optional parameter to pass to the plotting functions.

Returns

obj

Plotting Object.

Examples

Note

The example below is deliberately basic. For more detailed examples and customization options, see Machine Learning - Outliers.

Let’s begin by importing VerticaPy.

import verticapy as vp

Let’s also import numpy to create a dataset.

import numpy as np

We can create a variable N to fix the size:

N = 30

Let’s generate a dataset from two normal distributions and append one extreme point to each column.

# Normal Distributions
x = np.random.normal(5, 1, round(N / 2))

y = np.random.normal(3, 1, round(N / 2))

# Creating a vDataFrame with one extra outlying point (x=15, y=12)
data = vp.vDataFrame(
    {
        "x": np.concatenate([x, [15]]),
        "y": np.concatenate([y, [12]]),
    }
)
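
To make the ZSCORE threshold concrete, here is a minimal NumPy-only sketch of how a point can be classified as an outlier in the 1D case. It is an illustration under stated assumptions, not VerticaPy's implementation: the helper name zscore_mask is hypothetical, and using the sample standard deviation (ddof=1) is an assumption, since this page does not say how the ZSCORE is computed in the database.

```python
import numpy as np


def zscore_mask(values, threshold=3.0, ddof=1):
    """Return a boolean mask that is True where |z-score| > threshold.

    Hypothetical helper for illustration only; it is not part of VerticaPy.
    ddof=1 (sample standard deviation) is an assumption.
    """
    values = np.asarray(values, dtype=float)
    # Z-score: distance from the mean, measured in standard deviations
    z = (values - values.mean()) / values.std(ddof=ddof)
    return np.abs(z) > threshold


# 15 identical inliers plus one extreme value, mirroring the shape of "x" above
sample = np.concatenate([np.full(15, 5.0), [15.0]])
mask = zscore_mask(sample)
print(np.flatnonzero(mask))  # indices of the flagged points -> [15]
```

With the default threshold of 3.0, only the appended value is flagged in this sample. Raising the threshold makes the test stricter, so fewer points are reported as outliers.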

Below are examples of the two types of outliers_plot plots:

  • 1D

data.outliers_plot(columns = ["x"])

  • 2D

data.outliers_plot(columns = ["x", "y"])

See also

vDataFrame.scatter() : Scatter Plot.
vDataFrame.boxplot() : Box Plot.