vDataFrame.add_duplicates

In [ ]:
vDataFrame.add_duplicates(weight: (int, str), 
                          use_gcd: bool = True)

Duplicates the rows of the vDataFrame according to the input weight.

Parameters

Name    | Type      | Optional | Description
weight  | int / str | No       | vColumn or integer representing the weight.
use_gcd | bool      | Yes      | If set to True, all weights are divided by their GCD (Greatest Common Divisor) before duplicating, which keeps the weight ratios while avoiding unnecessary duplicates.
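The effect of use_gcd on the weights can be sketched in plain Python. This is only an illustration of the arithmetic, not VerticaPy's implementation; `reduce_weights` is a hypothetical helper name, and the sketch assumes a non-empty list of positive integer weights.

```python
from functools import reduce
from math import gcd


def reduce_weights(weights):
    """Divide every weight by the GCD of all weights.

    Hypothetical helper for illustration only; assumes a non-empty
    list of positive integers.
    """
    common = reduce(gcd, weights)  # GCD of the whole list
    return [w // common for w in weights]


# Weights 2, 4, 6 share the divisor 2, so they reduce to 1, 2, 3.
print(reduce_weights([2, 4, 6]))
```

With weights 2, 4 and 6 the ratios 1:2:3 are preserved, but only 6 rows are needed instead of 12.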

Returns

vDataFrame : the output vDataFrame

Example

In [4]:
from verticapy import *
names = tablesample({"name": ["Badr", "Waqas", "Pratibha"], "weight": [2, 4, 6]}).to_vdf()
display(names)
     name        weight
     Varchar(8)  Int
1    Badr        2
2    Waqas       4
3    Pratibha    6
Rows: 1-3 | Columns: 2
In [5]:
names.add_duplicates("weight")
Out[5]:
     name
     Varchar(8)
1    Badr
2    Waqas
3    Pratibha
4    Waqas
5    Pratibha
6    Pratibha
Rows: 1-6 | Column: name | Type: varchar(8)
In [6]:
# Disable the GCD reduction: each row is duplicated using its full weight
names.add_duplicates("weight", use_gcd=False)
Out[6]:
     name
     Varchar(8)
1    Badr
2    Waqas
3    Pratibha
4    Badr
5    Waqas
6    Pratibha
7    Waqas
8    Pratibha
9    Waqas
10   Pratibha
11   Pratibha
12   Pratibha
Rows: 1-12 | Column: name | Type: varchar(8)
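The row counts in the two outputs above (6 with the GCD reduction, 12 without) can be checked with a small pure-Python sketch. It does not call VerticaPy and makes no claim about how the library performs the duplication or orders the rows; `duplicate_rows` is a hypothetical stand-in that assumes positive integer weights.

```python
from collections import Counter
from functools import reduce
from math import gcd


def duplicate_rows(rows, weights, use_gcd=True):
    """Repeat each row as many times as its weight.

    Hypothetical stand-in for illustration; not VerticaPy code.
    Assumes positive integer weights, one per row.
    """
    if use_gcd:
        common = reduce(gcd, weights)
        weights = [w // common for w in weights]
    return [row for row, w in zip(rows, weights) for _ in range(w)]


names = ["Badr", "Waqas", "Pratibha"]
weights = [2, 4, 6]

with_gcd = duplicate_rows(names, weights)
without_gcd = duplicate_rows(names, weights, use_gcd=False)

# Per-name counts, independent of row order
print(dict(Counter(with_gcd)), len(with_gcd))
print(dict(Counter(without_gcd)), len(without_gcd))
```

Counting occurrences rather than comparing row positions is deliberate, since the order of rows returned by a database query is not something this sketch tries to reproduce.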