verticapy.vDataFrame.polynomial_comb
vDataFrame.polynomial_comb(columns: str | list[str] | None = None, r: int = 2) -> vDataFrame
Returns a vDataFrame containing the different product combinations of the input vDataColumn. This function is ideal for bivariate analysis.
Parameters
- columns: SQLColumns, optional
List of the vDataColumn names. If empty, all numerical vDataColumn are used.
- r: int, optional
Degree of the polynomial, that is, the number of columns multiplied together in each combination.
Returns
- vDataFrame
A vDataFrame containing the product combinations.
Examples
Let’s begin by importing VerticaPy.
import verticapy as vp
Hint
By assigning an alias to verticapy, we mitigate the risk of code collisions with other libraries. This precaution is necessary because verticapy uses commonly known function names like “average” and “median”, which can potentially lead to naming conflicts. The use of an alias ensures that the functions from verticapy are used as intended without interfering with functions from other libraries.
Let us create a vDataFrame with multiple columns:
vdf = vp.vDataFrame(
    {
        "col1": [1, 2, 3],
        "col2": [0, 7, 8],
        "col3": [3, 11, 93],
    }
)
     col1       ...   col2       col3
     Integer          Integer    Integer
1    1          ...   0          3
2    2          ...   7          11
3    3          ...   8          93
We can create a new vDataFrame that has a combination of the original columns using the vDataFrame.polynomial_comb() method:
new_vdf = vdf.polynomial_comb(r=2)
     col1       ...   col2_col3   col3_col3
     Integer          Integer     Integer
1    1          ...   0           9
2    2          ...   77          121
3    3          ...   744         8649
Note
This function is highly useful for data preparation, as certain combinations of variables may be relevant for predicting a specific column. It can be beneficial to combine it with a correlation matrix to determine if any of the created combinations can influence the response column.
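The column names and values in the r = 2 output above are consistent with enumerating combinations with replacement of the input columns and multiplying each combination row by row. The following pure-Python sketch reproduces that pattern on the sample data without a database connection. It assumes this enumeration order and the underscore naming; polynomial_comb_sketch is a hypothetical helper for illustration, not part of VerticaPy and not its implementation.

```python
from itertools import combinations_with_replacement
from math import comb, prod

# Sample data copied from the example above.
data = {
    "col1": [1, 2, 3],
    "col2": [0, 7, 8],
    "col3": [3, 11, 93],
}


def polynomial_comb_sketch(data, columns=None, r=2):
    """Hypothetical stand-in: append row-wise products of every
    r-sized combination (with replacement) of the chosen columns."""
    # Assumption: every column in `data` is numerical.
    columns = list(columns) if columns else list(data)
    result = {name: list(values) for name, values in data.items()}
    for combo in combinations_with_replacement(columns, r):
        name = "_".join(combo)  # e.g. "col2_col3"
        rows = zip(*(data[c] for c in combo))
        result[name] = [prod(row) for row in rows]
    return result


new_data = polynomial_comb_sketch(data, r=2)

print(list(new_data))
print(new_data["col2_col3"])  # [0, 77, 744]
print(new_data["col3_col3"])  # [9, 121, 8649]

# Number of generated columns for n inputs and degree r.
print(comb(len(data) + 2 - 1, 2))  # 6
```

Under these assumptions, three input columns with r = 2 produce six new columns, so the number of generated columns grows quickly as r or the number of input columns increases.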
See also
vDataFrame.corr(): Computes the correlation matrix.