
verticapy.vDataFrame.cov

vDataFrame.cov(columns: str | list[str] | None = None, focus: str | None = None, show: bool = True, chart: PlottingBase | TableSample | Axes | mFigure | Highchart | Highstock | Figure | None = None, **style_kwargs) -> PlottingBase | TableSample | Axes | mFigure | Highchart | Highstock | Figure

Computes the covariance matrix of the vDataFrame. Each entry of the matrix is the covariance between a pair of columns, which indicates how the two variables vary together: a positive value means they tend to increase together, a negative value means one tends to decrease as the other increases. The matrix is a common starting point for statistical analysis and modeling tasks that depend on the relationships between variables.
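To make the definition concrete, here is a minimal NumPy sketch of what a covariance matrix contains. It does not use VerticaPy and is not a description of how `cov` is implemented; it assumes the sample normalization (division by N - 1), which is NumPy's default and may differ from the normalization VerticaPy applies. The names `scores`, `cov_manual` and `cov_numpy` are illustrative only.

```python
import numpy as np

# Reproducible toy data: 30 rows, 2 numerical columns.
rng = np.random.default_rng(0)
n = 30
scores = np.column_stack(
    [
        rng.normal(5, 1, n),
        rng.normal(8, 1.5, n),
    ]
)

# Covariance from the definition: center each column, then average
# the products of deviations (sample normalization, N - 1).
centered = scores - scores.mean(axis=0)
cov_manual = centered.T @ centered / (n - 1)

# Same quantity with NumPy's built-in; rowvar=False means columns are variables.
cov_numpy = np.cov(scores, rowvar=False)

print(cov_manual)
print(np.allclose(cov_manual, cov_numpy))
```

The diagonal holds each column's variance, and the matrix is symmetric because the covariance of (a, b) equals the covariance of (b, a).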

Parameters

columns: SQLColumns, optional

List of vDataColumn names. If empty, all numerical vDataColumns are used.

focus: str, optional

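Name of a single vDataColumn on which to focus the computation. When set, only the covariances between this column and the others are computed, giving a covariance vector instead of the full matrix.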

show: bool, optional

If set to True, the Plotting object is returned.

chart: PlottingObject, optional

The chart object used to plot.

**style_kwargs

Any optional parameter to pass to the plotting functions.

Returns

obj

Plotting Object.

Examples

Import VerticaPy.

import verticapy as vp

Import numpy to create a random dataset.

import numpy as np

Generate a dataset using the following data.

N = 30 # Number of records

data = vp.vDataFrame(
    {
        "score1": np.random.normal(5, 1, N),
        "score2": np.random.normal(8, 1.5, N),
        "score3": np.random.normal(10, 2, N),
        "score4": np.random.normal(14, 3, N),
    }
)

Draw the covariance matrix.

data.cov()

You can also use the focus parameter to compute only a covariance vector.

data.cov(focus = "score1")

This is less expensive than computing the full matrix, and it lets you concentrate on one specific column.
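The cost difference can be illustrated outside VerticaPy with a small NumPy sketch: the covariance vector for one column is a single column of the full matrix, so it needs one matrix-vector product rather than a full matrix product. This is only an illustration under the sample (N - 1) normalization; the names `names`, `focus_idx` and `focus_vector` are hypothetical and do not reflect VerticaPy internals.

```python
import numpy as np

# Reproducible toy data with the same column names as the example above.
rng = np.random.default_rng(42)
n = 30
names = ["score1", "score2", "score3", "score4"]
data = np.column_stack(
    [
        rng.normal(5, 1, n),
        rng.normal(8, 1.5, n),
        rng.normal(10, 2, n),
        rng.normal(14, 3, n),
    ]
)

# Full covariance matrix: every pair of columns (k x k entries).
full = np.cov(data, rowvar=False)

# Covariance vector for one focus column: only k entries are computed.
focus_idx = names.index("score1")
centered = data - data.mean(axis=0)
focus_vector = centered.T @ centered[:, focus_idx] / (n - 1)

# The vector matches the corresponding column of the full matrix.
print(dict(zip(names, focus_vector)))
print(np.allclose(focus_vector, full[:, focus_idx]))
```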

For more examples, see the Correlation Matrix page of the Chart Gallery. Those examples use the correlation matrix, but the same customization options apply to the covariance matrix.

See also

vDataFrame.corr() : Computes the correlation matrix.