verticapy.vDataFrame.cov

vDataFrame.cov(columns: Annotated[str | list[str], 'STRING representing one column or a list of columns'] | None = None, focus: str | None = None, show: bool = True, chart: PlottingBase | TableSample | Axes | mFigure | Highchart | Highstock | Figure | None = None, **style_kwargs) -> PlottingBase | TableSample | Axes | mFigure | Highchart | Highstock | Figure

Computes the covariance matrix of the vDataFrame. Each entry is the covariance between a pair of columns, which indicates whether the two variables tend to increase together (positive), move in opposite directions (negative), or show no linear co-movement (near zero). The matrix is a common input to statistical analyses and modeling tasks.
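As a point of reference for what such a matrix contains, the following NumPy sketch computes the pairwise covariances of a small array by hand. It is independent of VerticaPy: the data, the helper name covariance_matrix, and the choice of the sample (n - 1) normalization are assumptions made for illustration, not a description of how vDataFrame.cov is implemented.

```python
import numpy as np

# Hypothetical data: 30 rows, 3 numerical columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))


def covariance_matrix(X):
    """Sample covariance (n - 1 denominator) between every pair of columns."""
    centered = X - X.mean(axis=0)  # subtract each column's mean
    return centered.T @ centered / (X.shape[0] - 1)


C = covariance_matrix(X)

# The diagonal holds each column's variance; the matrix is symmetric.
print(C)
```

The result agrees with np.cov(X, rowvar=False), which uses the same normalization by default.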

Parameters

columns: SQLColumns, optional

List of vDataColumn names. If empty, all numerical vDataColumns are used.

focus: str, optional

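Name of a single vDataColumn on which to focus the computation. When set, only a covariance vector for that vDataColumn is computed instead of the full matrix.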

show: bool, optional

If set to True, the Plotting object is returned.

chart: PlottingObject, optional

The chart object used to plot.

**style_kwargs

Any optional parameter to pass to the plotting functions.

Returns

obj

Plotting Object.

Examples

Import VerticaPy.

import verticapy as vp

Import NumPy to create a random dataset.

import numpy as np

Generate a dataset of four normally distributed columns.

N = 30 # Number of records

data = vp.vDataFrame(
    {
        "score1": np.random.normal(5, 1, N),
        "score2": np.random.normal(8, 1.5, N),
        "score3": np.random.normal(10, 2, N),
        "score4": np.random.normal(14, 3, N),
    }
)

Draw the covariance matrix.

data.cov()

You can also use the focus parameter to compute only a covariance vector.

data.cov(focus = "score1")
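Conceptually, a covariance vector for one column corresponds to a single column of the full covariance matrix. The NumPy sketch below illustrates that idea only. It does not call VerticaPy, and the variable names (full, cov_vector) and the sample normalization are assumptions for illustration; it says nothing about how VerticaPy orders or displays its result.

```python
import numpy as np

# Same shape of data as the example above, built as a plain NumPy array.
rng = np.random.default_rng(0)
N = 30  # Number of records
columns = ["score1", "score2", "score3", "score4"]
data = np.column_stack(
    [
        rng.normal(5, 1, N),
        rng.normal(8, 1.5, N),
        rng.normal(10, 2, N),
        rng.normal(14, 3, N),
    ]
)

# Full 4 x 4 sample covariance matrix (columns are the variables).
full = np.cov(data, rowvar=False)

# Covariance vector for one column: take that column of the matrix.
focus = "score1"
idx = columns.index(focus)
cov_vector = dict(zip(columns, full[:, idx]))

for name, value in cov_vector.items():
    print(f"cov({focus}, {name}) = {value:.4f}")
```

The entry for the focus column itself is its variance, since the covariance of a variable with itself is its variance.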