verticapy.vDataFrame.cov#

Computes the covariance matrix of the vDataFrame. Each entry of the matrix is the covariance between a pair of columns, which indicates whether the two variables tend to increase together (positive value) or move in opposite directions (negative value). It is a common starting point for exploring relationships between variables before further statistical analysis or modeling.

Parameters#

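columns: SQLColumns, optional: List of vDataColumn names. If empty, all numerical vDataColumns are used.
focus: str, optional: Name of a single vDataColumn to focus the computation on. When set, only the covariances between this column and the others are computed (a covariance vector rather than the full matrix).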
show: bool, optional: If set to True, the Plotting object is returned.
chart: PlottingObject, optional: The chart object used to plot.
**style_kwargs: Any optional parameter to pass to the plotting functions.

Returns#

obj: Plotting Object.

Examples#

Import VerticaPy.

import verticapy as vp

Import numpy to create a random dataset.

import numpy as np

Generate a random dataset with four normally distributed columns.

N = 30 # Number of records

data = vp.vDataFrame(
    {
        "score1": np.random.normal(5, 1, N),
        "score2": np.random.normal(8, 1.5, N),
        "score3": np.random.normal(10, 2, N),
        "score4": np.random.normal(14, 3, N),
    }
)

Draw the covariance matrix.

data.cov()
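To make the contents of the matrix concrete, the following is a minimal pure-Python sketch of a pairwise covariance computation. It is not VerticaPy's implementation: the helper names `covariance` and `covariance_matrix` and the `toy` dataset are hypothetical, and the sketch assumes the sample convention (division by n - 1), which may differ from the convention VerticaPy applies in the database.

```python
def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (n - 1 denominator, an assumption)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def covariance_matrix(table):
    """Return {col_a: {col_b: cov(col_a, col_b)}} for a dict mapping names to columns."""
    names = list(table)
    return {a: {b: covariance(table[a], table[b]) for b in names} for a in names}


# Small hypothetical dataset, not the random one generated above.
toy = {
    "score1": [1.0, 2.0, 3.0, 4.0],
    "score2": [2.0, 4.0, 6.0, 8.0],  # moves with score1
    "score3": [4.0, 3.0, 2.0, 1.0],  # moves against score1
}

matrix = covariance_matrix(toy)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

The diagonal holds each column's variance, the matrix is symmetric, and the sign of an off-diagonal entry shows whether the two columns move together or in opposite directions.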

You can also use the parameter focus to only compute a covariance vector.

data.cov(focus = "score1")

This is less expensive than computing the full matrix, and it lets you focus your search on one specific column.
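The cost difference can be illustrated with a small pure-Python sketch: for k columns, a focused computation needs only k covariances instead of k × k. The helpers `covariance` and `covariance_vector` and the `toy` dataset are hypothetical and do not reflect how VerticaPy runs the computation. The sketch assumes the n - 1 denominator and includes the focus column itself in the result, which may not match VerticaPy's output.

```python
def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (n - 1 denominator, an assumption)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def covariance_vector(table, focus):
    """Covariance of every column with the `focus` column: one pass over k columns."""
    target = table[focus]
    return {name: covariance(values, target) for name, values in table.items()}


# Small hypothetical dataset for illustration only.
toy = {
    "score1": [1.0, 2.0, 3.0, 4.0],
    "score2": [2.0, 4.0, 6.0, 8.0],
    "score3": [4.0, 3.0, 2.0, 1.0],
}

vector = covariance_vector(toy, "score1")
print({name: round(value, 3) for name, value in vector.items()})
```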

For more examples, see the Correlation Matrix page of the Chart Gallery. Those examples use the correlation matrix, but the customization options are the same for the covariance matrix.