vDataFrame[].get_dummies

In [ ]:
vDataFrame[].get_dummies(prefix: str = "", 
                         prefix_sep: str = "_", 
                         drop_first: bool = True, 
                         use_numbers_as_suffix: bool = False)

Encodes the vcolumn using the One Hot Encoding algorithm.
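
As a rough illustration of what One Hot Encoding does, the pure-Python sketch below maps each distinct category to its own 0/1 indicator column. It is not VerticaPy's implementation and runs without a database; `one_hot` is a hypothetical helper introduced only for this illustration.

```python
def one_hot(values):
    """Return {category: [0/1 indicator per row]} for a list of categorical values."""
    categories = sorted(set(values))
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


rows = ["DSL", "Fiber optic", "No", "DSL"]
encoded = one_hot(rows)

# One indicator column per category; each row has exactly one 1 across them.
for category, column in encoded.items():
    print(category, column)

row_sums = [sum(col[i] for col in encoded.values()) for i in range(len(rows))]
```

Because every row sums to 1 across the full set of indicators, any one of them can be derived from the others. That redundancy is why one dummy is commonly dropped.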

Parameters

Name                   Type  Optional  Description
prefix                 str   Yes       Prefix of the dummies.
prefix_sep             str   Yes       Prefix delimiter of the dummies.
drop_first             bool  Yes       Drops the first dummy to avoid the creation of correlated features.
use_numbers_as_suffix  bool  Yes       Uses numbers as suffixes instead of the vcolumn's categories.
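
To make the effect of these parameters on the dummy names concrete, here is a minimal sketch. It is not the library's code. `dummy_column_names` is a hypothetical helper, and three behaviours are assumptions inferred from the example output on this page: an empty prefix falls back to the vcolumn name, spaces in a category become underscores, and the category left without a dummy is the last one in sorted order.

```python
def dummy_column_names(column, categories, prefix="", prefix_sep="_",
                       drop_first=True, use_numbers_as_suffix=False):
    """Hypothetical helper: derive dummy column names from the parameters."""
    # Assumption: an empty prefix falls back to the vcolumn name.
    base = prefix if prefix else column
    kept = sorted(categories)
    # Assumption: one category gets no dummy; the last one in sorted order
    # is omitted here because that reproduces the example output.
    if drop_first:
        kept = kept[:-1]
    names = []
    for i, category in enumerate(kept):
        # Assumption: spaces in a category name are replaced by underscores.
        suffix = str(i) if use_numbers_as_suffix else str(category).replace(" ", "_")
        names.append(f"{base}{prefix_sep}{suffix}")
    return names


cats = ["DSL", "Fiber optic", "No"]
default_names = dummy_column_names("InternetService", cats)
numbered_names = dummy_column_names("InternetService", cats,
                                    use_numbers_as_suffix=True)
print(default_names)
print(numbered_names)
```

The two printed lists correspond to the dummy column headers of the two examples further down.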

Returns

vDataFrame : self.parent

Example

In [79]:
from verticapy import vDataFrame
churn = vDataFrame("public.churn")
churn = churn.select(["InternetService", "MonthlyCharges", "churn"])
display(churn)
     InternetService  MonthlyCharges  Churn
     Varchar(20)      Float           Varchar(3)
1    DSL              65.6            No
2    DSL              59.9            No
3    Fiber optic      73.9            Yes
4    Fiber optic      98.0            Yes
5    Fiber optic      83.9            Yes
6    DSL              69.4            No
7    Fiber optic      109.7           No
8    Fiber optic      84.65           No
9    DSL              48.2            No
10   DSL              90.45           No
11   DSL              45.2            No
12   Fiber optic      116.8           No
13   Fiber optic      68.95           No
14   Fiber optic      101.3           No
15   DSL              45.05           No
16   Fiber optic      95.75           No
17   DSL              61.25           No
18   Fiber optic      72.1            No
19   DSL              62.7            Yes
20   DSL              25.1            Yes
21   No               25.2            No
22   Fiber optic      94.1            Yes
23   Fiber optic      83.75           No
24   No               19.85           No
25   No               20.35           Yes
26   DSL              30.5            Yes
27   Fiber optic      103.7           No
28   No               20.4            No
29   No               19.6            No
30   No               19.7            No
31   Fiber optic      91.2            No
32   No               20.45           No
33   Fiber optic      115.8           No
34   No               20.55           No
35   DSL              39.4            No
36   No               25.1            No
37   Fiber optic      89.8            No
38   Fiber optic      94.75           No
39   No               20.3            No
40   Fiber optic      75.75           No
41   DSL              49.25           Yes
42   DSL              78.2            No
43   No               25.5            No
44   DSL              61.6            No
45   DSL              45.0            No
46   DSL              85.15           No
47   DSL              51.45           No
48   Fiber optic      99.25           No
49   DSL              44.3            No
50   Fiber optic      94.2            No
51   DSL              81.25           No
52   Fiber optic      99.95           No
53   Fiber optic      91.55           No
54   Fiber optic      104.5           Yes
55   Fiber optic      95.0            Yes
56   DSL              50.35           No
57   DSL              64.5            No
58   No               19.4            No
59   Fiber optic      104.8           No
60   Fiber optic      109.4           No
61   DSL              50.3            No
62   DSL              71.4            No
63   Fiber optic      116.0           No
64   No               19.85           Yes
65   Fiber optic      99.75           Yes
66   Fiber optic      93.95           No
67   Fiber optic      90.8            No
68   DSL              84.35           Yes
69   DSL              58.25           No
70   Fiber optic      107.55          No
71   No               19.95           No
72   Fiber optic      111.2           Yes
73   DSL              40.2            Yes
74   Fiber optic      85.8            No
75   DSL              35.4            No
76   Fiber optic      73.85           Yes
77   DSL              88.1            No
78   Fiber optic      101.35          Yes
79   DSL              45.8            No
80   Fiber optic      94.65           No
81   No               20.5            No
82   Fiber optic      89.4            No
83   Fiber optic      86.25           Yes
84   DSL              74.85           No
85   Fiber optic      89.75           Yes
86   Fiber optic      109.95          No
87   Fiber optic      80.2            Yes
88   No               19.85           Yes
89   Fiber optic      90.35           No
90   Fiber optic      86.45           No
91   No               20.3            No
92   Fiber optic      101.25          No
93   Fiber optic      94.7            Yes
94   Fiber optic      70.9            Yes
95   DSL              54.2            Yes
96   Fiber optic      114.9           No
97   No               19.55           No
98   Fiber optic      86.85           No
99   No               25.35           No
100  Fiber optic      109.9           No
Rows: 1-100 of 7032 | Columns: 3
In [78]:
churn["InternetService"].get_dummies()
Out[78]:
     InternetService  MonthlyCharges  Churn       InternetService_DSL  InternetService_Fiber_optic
     Varchar(20)      Float           Varchar(3)  Bool                 Bool
1    DSL              65.6            No          1                    0
2    DSL              59.9            No          1                    0
3    Fiber optic      73.9            Yes         0                    1
4    Fiber optic      98.0            Yes         0                    1
5    Fiber optic      83.9            Yes         0                    1
6    DSL              69.4            No          1                    0
7    Fiber optic      109.7           No          0                    1
8    Fiber optic      84.65           No          0                    1
9    DSL              48.2            No          1                    0
10   DSL              90.45           No          1                    0
11   DSL              45.2            No          1                    0
12   Fiber optic      116.8           No          0                    1
13   Fiber optic      68.95           No          0                    1
14   Fiber optic      101.3           No          0                    1
15   DSL              45.05           No          1                    0
16   Fiber optic      95.75           No          0                    1
17   DSL              61.25           No          1                    0
18   Fiber optic      72.1            No          0                    1
19   DSL              62.7            Yes         1                    0
20   DSL              25.1            Yes         1                    0
21   No               25.2            No          0                    0
22   Fiber optic      94.1            Yes         0                    1
23   Fiber optic      83.75           No          0                    1
24   No               19.85           No          0                    0
25   No               20.35           Yes         0                    0
26   DSL              30.5            Yes         1                    0
27   Fiber optic      103.7           No          0                    1
28   No               20.4            No          0                    0
29   No               19.6            No          0                    0
30   No               19.7            No          0                    0
31   Fiber optic      91.2            No          0                    1
32   No               20.45           No          0                    0
33   Fiber optic      115.8           No          0                    1
34   No               20.55           No          0                    0
35   DSL              39.4            No          1                    0
36   No               25.1            No          0                    0
37   Fiber optic      89.8            No          0                    1
38   Fiber optic      94.75           No          0                    1
39   No               20.3            No          0                    0
40   Fiber optic      75.75           No          0                    1
41   DSL              49.25           Yes         1                    0
42   DSL              78.2            No          1                    0
43   No               25.5            No          0                    0
44   DSL              61.6            No          1                    0
45   DSL              45.0            No          1                    0
46   DSL              85.15           No          1                    0
47   DSL              51.45           No          1                    0
48   Fiber optic      99.25           No          0                    1
49   DSL              44.3            No          1                    0
50   Fiber optic      94.2            No          0                    1
51   DSL              81.25           No          1                    0
52   Fiber optic      99.95           No          0                    1
53   Fiber optic      91.55           No          0                    1
54   Fiber optic      104.5           Yes         0                    1
55   Fiber optic      95.0            Yes         0                    1
56   DSL              50.35           No          1                    0
57   DSL              64.5            No          1                    0
58   No               19.4            No          0                    0
59   Fiber optic      104.8           No          0                    1
60   Fiber optic      109.4           No          0                    1
61   DSL              50.3            No          1                    0
62   DSL              71.4            No          1                    0
63   Fiber optic      116.0           No          0                    1
64   No               19.85           Yes         0                    0
65   Fiber optic      99.75           Yes         0                    1
66   Fiber optic      93.95           No          0                    1
67   Fiber optic      90.8            No          0                    1
68   DSL              84.35           Yes         1                    0
69   DSL              58.25           No          1                    0
70   Fiber optic      107.55          No          0                    1
71   No               19.95           No          0                    0
72   Fiber optic      111.2           Yes         0                    1
73   DSL              40.2            Yes         1                    0
74   Fiber optic      85.8            No          0                    1
75   DSL              35.4            No          1                    0
76   Fiber optic      73.85           Yes         0                    1
77   DSL              88.1            No          1                    0
78   Fiber optic      101.35          Yes         0                    1
79   DSL              45.8            No          1                    0
80   Fiber optic      94.65           No          0                    1
81   No               20.5            No          0                    0
82   Fiber optic      89.4            No          0                    1
83   Fiber optic      86.25           Yes         0                    1
84   DSL              74.85           No          1                    0
85   Fiber optic      89.75           Yes         0                    1
86   Fiber optic      109.95          No          0                    1
87   Fiber optic      80.2            Yes         0                    1
88   No               19.85           Yes         0                    0
89   Fiber optic      90.35           No          0                    1
90   Fiber optic      86.45           No          0                    1
91   No               20.3            No          0                    0
92   Fiber optic      101.25          No          0                    1
93   Fiber optic      94.7            Yes         0                    1
94   Fiber optic      70.9            Yes         0                    1
95   DSL              54.2            Yes         1                    0
96   Fiber optic      114.9           No          0                    1
97   No               19.55           No          0                    0
98   Fiber optic      86.85           No          0                    1
99   No               25.35           No          0                    0
100  Fiber optic      109.9           No          0                    1
Rows: 1-100 of 7032 | Columns: 5
In [80]:
# Use numbers as suffixes instead of the category names
churn["InternetService"].get_dummies(use_numbers_as_suffix = True)
Out[80]:
     InternetService  MonthlyCharges  Churn       InternetService_0  InternetService_1
     Varchar(20)      Float           Varchar(3)  Bool               Bool
1    DSL              65.6            No          1                  0
2    DSL              59.9            No          1                  0
3    Fiber optic      73.9            Yes         0                  1
4    Fiber optic      98.0            Yes         0                  1
5    Fiber optic      83.9            Yes         0                  1
6    DSL              69.4            No          1                  0
7    Fiber optic      109.7           No          0                  1
8    Fiber optic      84.65           No          0                  1
9    DSL              48.2            No          1                  0
10   DSL              90.45           No          1                  0
11   DSL              45.2            No          1                  0
12   Fiber optic      116.8           No          0                  1
13   Fiber optic      68.95           No          0                  1
14   Fiber optic      101.3           No          0                  1
15   DSL              45.05           No          1                  0
16   Fiber optic      95.75           No          0                  1
17   DSL              61.25           No          1                  0
18   Fiber optic      72.1            No          0                  1
19   DSL              62.7            Yes         1                  0
20   DSL              25.1            Yes         1                  0
21   No               25.2            No          0                  0
22   Fiber optic      94.1            Yes         0                  1
23   Fiber optic      83.75           No          0                  1
24   No               19.85           No          0                  0
25   No               20.35           Yes         0                  0
26   DSL              30.5            Yes         1                  0
27   Fiber optic      103.7           No          0                  1
28   No               20.4            No          0                  0
29   No               19.6            No          0                  0
30   No               19.7            No          0                  0
31   Fiber optic      91.2            No          0                  1
32   No               20.45           No          0                  0
33   Fiber optic      115.8           No          0                  1
34   No               20.55           No          0                  0
35   DSL              39.4            No          1                  0
36   No               25.1            No          0                  0
37   Fiber optic      89.8            No          0                  1
38   Fiber optic      94.75           No          0                  1
39   No               20.3            No          0                  0
40   Fiber optic      75.75           No          0                  1
41   DSL              49.25           Yes         1                  0
42   DSL              78.2            No          1                  0
43   No               25.5            No          0                  0
44   DSL              61.6            No          1                  0
45   DSL              45.0            No          1                  0
46   DSL              85.15           No          1                  0
47   DSL              51.45           No          1                  0
48   Fiber optic      99.25           No          0                  1
49   DSL              44.3            No          1                  0
50   Fiber optic      94.2            No          0                  1
51   DSL              81.25           No          1                  0
52   Fiber optic      99.95           No          0                  1
53   Fiber optic      91.55           No          0                  1
54   Fiber optic      104.5           Yes         0                  1
55   Fiber optic      95.0            Yes         0                  1
56   DSL              50.35           No          1                  0
57   DSL              64.5            No          1                  0
58   No               19.4            No          0                  0
59   Fiber optic      104.8           No          0                  1
60   Fiber optic      109.4           No          0                  1
61   DSL              50.3            No          1                  0
62   DSL              71.4            No          1                  0
63   Fiber optic      116.0           No          0                  1
64   No               19.85           Yes         0                  0
65   Fiber optic      99.75           Yes         0                  1
66   Fiber optic      93.95           No          0                  1
67   Fiber optic      90.8            No          0                  1
68   DSL              84.35           Yes         1                  0
69   DSL              58.25           No          1                  0
70   Fiber optic      107.55          No          0                  1
71   No               19.95           No          0                  0
72   Fiber optic      111.2           Yes         0                  1
73   DSL              40.2            Yes         1                  0
74   Fiber optic      85.8            No          0                  1
75   DSL              35.4            No          1                  0
76   Fiber optic      73.85           Yes         0                  1
77   DSL              88.1            No          1                  0
78   Fiber optic      101.35          Yes         0                  1
79   DSL              45.8            No          1                  0
80   Fiber optic      94.65           No          0                  1
81   No               20.5            No          0                  0
82   Fiber optic      89.4            No          0                  1
83   Fiber optic      86.25           Yes         0                  1
84   DSL              74.85           No          1                  0
85   Fiber optic      89.75           Yes         0                  1
86   Fiber optic      109.95          No          0                  1
87   Fiber optic      80.2            Yes         0                  1
88   No               19.85           Yes         0                  0
89   Fiber optic      90.35           No          0                  1
90   Fiber optic      86.45           No          0                  1
91   No               20.3            No          0                  0
92   Fiber optic      101.25          No          0                  1
93   Fiber optic      94.7            Yes         0                  1
94   Fiber optic      70.9            Yes         0                  1
95   DSL              54.2            Yes         1                  0
96   Fiber optic      114.9           No          0                  1
97   No               19.55           No          0                  0
98   Fiber optic      86.85           No          0                  1
99   No               25.35           No          0                  0
100  Fiber optic      109.9           No          0                  1
Rows: 1-100 of 7032 | Columns: 5

See Also

vDataFrame[].decode : Encodes the vcolumn using a user-defined encoding.
vDataFrame[].discretize : Discretizes the vcolumn.
vDataFrame[].label_encode : Encodes the vcolumn using label encoding.
vDataFrame[].mean_encode : Encodes the vcolumn using the mean encoding of a response.