verticapy.vDataFrame.append

vDataFrame.append(input_relation: str | vDataFrame, expr1: str | list[str] | StringSQL | list[StringSQL] | None = None, expr2: str | list[str] | StringSQL | list[StringSQL] | None = None, union_all: bool = True) -> vDataFrame

Merges the vDataFrame with another vDataFrame or an input relation, and returns a new vDataFrame.

Warning

Appending datasets can increase the structural weight of the resulting vDataFrame, that is, the size and complexity of the relation behind it; exercise caution when performing this operation.

Parameters

input_relation: SQLRelation

Relation to merge with.

expr1: SQLExpression, optional

List of pure-SQL expressions from the current vDataFrame to use during merging. For example, CASE WHEN "column" > 3 THEN 2 ELSE NULL END and POWER("column", 2) will work. If empty, all vDataFrame vDataColumns are used. Aliases are recommended to avoid auto-naming.

expr2: SQLExpression, optional

List of pure-SQL expressions from the input relation to use during the merging. For example, CASE WHEN "column" > 3 THEN 2 ELSE NULL END and POWER("column", 2) will work. If empty, all input relation columns are used. Aliases are recommended to avoid auto-naming.

union_all: bool, optional

If set to True, the vDataFrame is merged with the input relation using a ‘UNION ALL’ instead of a ‘UNION’.
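As a rough mental model, the parameters above map onto a SQL set operation between two SELECT statements. The sketch below is not VerticaPy's implementation: `build_append_sql` is a hypothetical helper, and `t1` and `t2` are placeholder relation names chosen for illustration.

```python
def build_append_sql(left, right, expr1=None, expr2=None, union_all=True):
    """Hypothetical helper: sketch the SQL shape that an append corresponds to."""
    # An empty expression list means "use every column" on that side.
    left_cols = ", ".join(expr1) if expr1 else "*"
    right_cols = ", ".join(expr2) if expr2 else "*"
    # union_all=True keeps duplicate rows; False lets UNION drop them.
    operator = "UNION ALL" if union_all else "UNION"
    return (
        f"SELECT {left_cols} FROM {left} "
        f"{operator} "
        f"SELECT {right_cols} FROM {right}"
    )


# expr2 only affects the SELECT list of the right-hand (input) relation.
sql = build_append_sql(
    "t1",
    "t2",
    expr2=['CASE WHEN "score" > 20 THEN 20 ELSE "score" END', '"cat"'],
)
print(sql)
```

Under this model, expr1 shapes the left-hand SELECT list, expr2 shapes the right-hand one, and union_all only switches the set operator.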

Returns

vDataFrame

vDataFrame containing the union of the two relations.

Examples

Let’s begin by importing VerticaPy.

import verticapy as vp

Hint

Assigning an alias to verticapy reduces the risk of naming collisions with other libraries. This precaution matters because verticapy uses common function names such as “average” and “median”; calling them through the alias makes it clear which library's function is being used.

Let us create two vDataFrames that we can merge for this example:

vdf = vp.vDataFrame(
    {
        "score": [12, 11, 13],
        "cat": ['A', 'B', 'A'],
    }
)


vdf_2 = vp.vDataFrame(
    {
        "score": [11, 1, 23],
        "cat": ['A', 'B', 'B'],
    }
)

We can conveniently append the second vDataFrame to the first one:

vdf.append(vdf_2)
    score      cat
    Integer    Varchar(1)
1   12         A
2   11         B
3   13         A
4   11         A
5   1          B
6   23         B

We can also apply SQL expressions during the append using expr1 and expr2. Let us cap the values of the second vDataFrame at 20. Because the expressions apply to the input relation, they are passed through expr2.

vdf.append(
    vdf_2,
    expr2=[
        'CASE WHEN "score" > 20 THEN 20 ELSE "score" END',
        '"cat"',
    ],
)
    score      cat
    Integer    Varchar(1)
1   12         A
2   11         B
3   13         A
4   11         A
5   1          B
6   20         B
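To see why only the last row changes, the same cap can be reproduced in plain SQL by applying the CASE expression to the second relation only. The sketch below uses Python's built-in sqlite3 module as a stand-in for Vertica, which is an assumption made so that the example runs without a database connection; the table names `t1` and `t2` are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (score INTEGER, cat TEXT)")
conn.execute("CREATE TABLE t2 (score INTEGER, cat TEXT)")
# Same rows as vdf and vdf_2 in the example above.
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [(12, "A"), (11, "B"), (13, "A")])
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [(11, "A"), (1, "B"), (23, "B")])

# The CASE expression sits in the second SELECT, so only t2's scores are capped.
capped_rows = conn.execute(
    """
    SELECT "score", "cat" FROM t1
    UNION ALL
    SELECT CASE WHEN "score" > 20 THEN 20 ELSE "score" END, "cat" FROM t2
    """
).fetchall()
conn.close()

print(sorted(capped_rows))
```

The rows coming from `t1` pass through unchanged, while the 23 from `t2` is replaced by 20.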

Note

VerticaPy lets you choose between UNION ALL and plain UNION depending on your use case. The former keeps duplicate rows, while the latter removes them. Refer to the union_all parameter for more information.
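The difference between the two operators only shows up when the relations share identical rows. The sketch below illustrates it with Python's built-in sqlite3 module rather than Vertica (an assumption, made so that the example runs on its own). The data differs from the example above: the row (12, 'A') is deliberately placed in both placeholder tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (score INTEGER, cat TEXT)")
conn.execute("CREATE TABLE t2 (score INTEGER, cat TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [(12, "A"), (11, "B"), (13, "A")])
# (12, 'A') is repeated from t1 on purpose to create one duplicate row.
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [(12, "A"), (1, "B"), (23, "B")])

query = "SELECT score, cat FROM t1 {op} SELECT score, cat FROM t2"
rows_union_all = conn.execute(query.format(op="UNION ALL")).fetchall()
rows_union = conn.execute(query.format(op="UNION")).fetchall()
conn.close()

print(len(rows_union_all))  # 6: the duplicate (12, 'A') is kept
print(len(rows_union))      # 5: the duplicate (12, 'A') is removed
```

In append terms, union_all=True corresponds to the first result and union_all=False to the second.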

See also

vDataFrame.join() : Joins the vDataFrame with another one or an input_relation.