verticapy.vDataFrame.polynomial_comb

vDataFrame.polynomial_comb(columns: str | list[str] | None = None, r: int = 2) -> vDataFrame

Returns a vDataFrame containing the product combinations of the input vDataColumns. This function is well suited to bivariate analysis.

Parameters

columns: SQLColumns, optional

List of vDataColumn names. If empty, all numerical vDataColumns are used.

r: int, optional

Degree of the polynomial. Defaults to 2.

Returns

vDataFrame

A vDataFrame containing the input columns and their product combinations.

Examples

Let’s begin by importing VerticaPy.

import verticapy as vp

Hint

Importing verticapy under an alias reduces the risk of name collisions with other libraries. verticapy uses common function names such as “average” and “median”, and the alias ensures that the verticapy versions are called as intended without interfering with functions from other libraries.

Let us create a vDataFrame with multiple columns:

vdf = vp.vDataFrame(
    {
        "col1": [1, 2, 3],
        "col2": [0, 7, 8],
        "col3": [3, 11, 93],
    }
)

     col1       ...    col2       col3
     Integer           Integer    Integer
     100%              100%       100%
1    1          ...    0          3
2    2          ...    7          11
3    3          ...    8          93

We can create a new vDataFrame that contains the product combinations of the original columns using the vDataFrame.polynomial_comb() method:

new_vdf = vdf.polynomial_comb(r=2)

     col1       ...    col2_col3    col3_col3
     Integer           Integer      Integer
     100%              100%         100%
1    1          ...    0            9
2    2          ...    77           121
3    3          ...    744          8649
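
To make the output above easier to read, here is a minimal pure-Python sketch of how such product columns could be derived. It is not VerticaPy's implementation. It assumes the combinations are taken with replacement (which is consistent with names such as col2_col3 and col3_col3 in the output) and that each new column is named by joining the source names with an underscore. The helper product_combinations is hypothetical and exists only for this illustration.

```python
from itertools import combinations_with_replacement
from math import prod

# Same data as the vDataFrame in the example above.
data = {
    "col1": [1, 2, 3],
    "col2": [0, 7, 8],
    "col3": [3, 11, 93],
}


def product_combinations(data, r=2):
    """Hypothetical helper: keep the original columns and add the
    row-wise product of every r-column combination (with replacement)."""
    result = {name: list(values) for name, values in data.items()}
    for combo in combinations_with_replacement(data, r):
        name = "_".join(combo)
        # Multiply the values of the selected columns row by row.
        result[name] = [prod(row) for row in zip(*(data[c] for c in combo))]
    return result


combined = product_combinations(data, r=2)
print(list(combined))
print(combined["col2_col3"])  # [0, 77, 744]
print(combined["col3_col3"])  # [9, 121, 8649]
```

The two printed columns match the col2_col3 and col3_col3 values shown in the output above.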

Note

This function is useful for data preparation, as certain combinations of variables may be relevant for predicting a specific column. It can be beneficial to combine it with a correlation matrix to determine whether any of the created combinations are correlated with the response column.
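
The idea in the note can be sketched in plain Python, without a database connection. The data below is made up for illustration, the response column is hypothetical (built as x1 * x2 so that an interaction term matters), and pearson is a small helper written for this sketch rather than a VerticaPy function.

```python
from itertools import combinations_with_replacement
from math import prod, sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Made-up illustration data (not taken from the VerticaPy documentation).
features = {"x1": [1, 2, 3, 4, 5], "x2": [2, 1, 4, 3, 6]}
# Hypothetical response, constructed as x1 * x2.
response = [a * b for a, b in zip(features["x1"], features["x2"])]

# Original columns plus their degree-2 product combinations.
candidates = dict(features)
for combo in combinations_with_replacement(features, 2):
    candidates["_".join(combo)] = [
        prod(row) for row in zip(*(features[c] for c in combo))
    ]

# Rank every candidate column by absolute correlation with the response.
scores = {name: pearson(values, response) for name, values in candidates.items()}
for name, score in sorted(scores.items(), key=lambda item: -abs(item[1])):
    print(f"{name}: {score:.3f}")
```

Here x1_x2 ranks first because the response was constructed from it. With a vDataFrame, the same kind of check would be run on the result of polynomial_comb() using the vDataFrame.corr() method referenced below.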

See also

vDataFrame.corr() : Computes the correlation matrix.