verticapy.vDataFrame.scatter

vDataFrame.scatter(columns: str | list[str], by: str | None = None, size: str | None = None, cmap_col: str | None = None, max_cardinality: int = 6, cat_priority: None | bool | float | str | timedelta | datetime | list | ndarray = None, max_nb_points: int = 20000, dimensions: tuple = None, bbox: tuple | None = None, img: str | None = None, chart: PlottingBase | TableSample | Axes | mFigure | Highchart | Highstock | Figure | None = None, **style_kwargs) -> PlottingBase | TableSample | Axes | mFigure | Highchart | Highstock | Figure

Draws the scatter plot of the input vDataColumns.

Parameters

columns: SQLColumns

List of vDataColumn names.

by: str, optional

Categorical vDataColumn used to label the data.

size: str, optional

Numerical vDataColumn used to represent the bubble size.

cmap_col: str, optional

Numerical column used to represent the color map.

max_cardinality: int, optional

Maximum number of distinct elements for ‘by’ to be treated as categorical. Less frequent elements are grouped into a new category: ‘Others’.

cat_priority: PythonScalar / ArrayLike, optional

List of the categories to consider when labeling the data with the vDataColumn ‘by’. All other categories are filtered out.

max_nb_points: int, optional

Maximum number of points to display.

dimensions: tuple, optional

Tuple of two elements representing the IDs of the PCA components to draw. If empty and the number of input columns is greater than 3, the first and second principal components are drawn.

bbox: tuple, optional

Tuple of 4 elements that delimits the boundaries of the final plot. It must have the following form: (xmin, xmax, ymin, ymax)

img: str, optional

Path to the image to display as background.

chart: PlottingObject, optional

The chart object to plot on.

**style_kwargs

Any optional parameter to pass to the plotting functions.
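The effect of max_cardinality and cat_priority on the ‘by’ labels can be pictured with a small pure-Python sketch. This is not VerticaPy's implementation; the helper names group_rare_categories and keep_priority_categories are hypothetical, and the sketch assumes that the max_cardinality most frequent categories are kept and that ‘Others’ is added on top of them.

```python
from collections import Counter


def group_rare_categories(labels, max_cardinality=6, other_label="Others"):
    """Keep the most frequent categories and relabel the rest.

    Hypothetical helper: assumes the `max_cardinality` most frequent
    labels are kept and every remaining label becomes `other_label`.
    """
    counts = Counter(labels)
    kept = {category for category, _ in counts.most_common(max_cardinality)}
    return [label if label in kept else other_label for label in labels]


def keep_priority_categories(rows, cat_priority):
    """Drop every (category, x, y) row whose category is not listed.

    Hypothetical helper mirroring the 'other categories are filtered'
    behavior described for `cat_priority`.
    """
    wanted = set(cat_priority)
    return [row for row in rows if row[0] in wanted]


labels = ["A"] * 5 + ["B"] * 3 + ["C"] * 2 + ["D"]
grouped = group_rare_categories(labels, max_cardinality=2)

rows = [("A", 1.0, 2.0), ("B", 0.5, 1.5), ("C", 3.0, 4.0)]
filtered = keep_priority_categories(rows, cat_priority=["A", "C"])
```

Here "C" and "D" are the least frequent labels, so they are the ones merged into ‘Others’, while filtered keeps only the "A" and "C" rows.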

Returns

obj

Plotting Object.

Examples

Note

The example below is a basic one. For more detailed examples and customization options, see Scatter Plots.

Let’s begin by importing VerticaPy.

import verticapy as vp

Let’s also import numpy to create a dataset.

import numpy as np

We can create a variable N to fix the size:

N = 30

Let’s generate a dataset using the following data.

data = vp.vDataFrame(
    {
        "category": [np.random.choice(['A','B','C']) for _ in range(N)],
        "x": np.random.normal(5, 1, N),
        "y": np.random.normal(8, 1.5, N),
        "z": np.random.normal(10, 2, N),
    }
)

Below are examples of two types of scatter plots:

  • 2D

  • 3D

# 2D scatter plot, with points labeled by the "category" column
data.scatter(columns = ["x", "y"], by = "category")
# 3D scatter plot
data.scatter(columns = ["x", "y", "z"])
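The size, max_nb_points, and bbox parameters described above can be combined in the same way. The following is a minimal sketch rather than an official example: the wrapper draw_bubble_plot is a hypothetical helper, the ordering check on bbox is an added assumption about what a sensible bounding box looks like, and whether every plotting backend honors each option is not stated here.

```python
def draw_bubble_plot(vdf, bbox=None):
    """Draw x against y, using z for the bubble size.

    `vdf` is assumed to be a vDataFrame with numerical columns
    "x", "y" and "z", such as `data` from the example above.
    """
    kwargs = {
        "columns": ["x", "y"],
        "size": "z",            # numerical column driving the bubble size
        "max_nb_points": 1000,  # cap on the number of displayed points
    }
    if bbox is not None:
        # Documented order: (xmin, xmax, ymin, ymax).
        xmin, xmax, ymin, ymax = bbox
        # Assumption of this sketch: reject boxes with inverted bounds.
        if xmin >= xmax or ymin >= ymax:
            raise ValueError("bbox must satisfy xmin < xmax and ymin < ymax")
        kwargs["bbox"] = (xmin, xmax, ymin, ymax)
    return vdf.scatter(**kwargs)


# With a live connection and the dataset created above:
# draw_bubble_plot(data, bbox=(0, 10, 0, 16))
```

Wrapping the call in a function keeps the keyword arguments in one place, so the same settings can be reused across several vDataFrames.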

See also

vDataFrame.density() : Density Plot.
vDataFrame.outliers_plot() : Outliers Plot.