vDataFrame[].label_encode

In [ ]:
vDataFrame[].label_encode()

Encodes the vcolumn using a bijection from the different categories to [0, n - 1] (n being the vcolumn cardinality).

Returns

vDataFrame : self.parent

Example

In [11]:
from verticapy.datasets import load_titanic
titanic = load_titanic()
display(titanic["embarked"])
Abc
embarked
Varchar(20)
1S
2S
3S
4S
5C
6C
7S
8C
9C
10C
11S
12S
13S
14C
15C
16S
17S
18S
19S
20S
21S
22S
23S
24S
25C
26S
27S
28C
29S
30S
31C
32S
33C
34C
35C
36S
37C
38S
39S
40S
41S
42S
43S
44C
45C
46S
47C
48S
49S
50S
51S
52S
53S
54S
55S
56C
57C
58S
59S
60C
61C
62S
63S
64C
65S
66S
67S
68S
69C
70S
71C
72S
73Q
74S
75S
76C
77C
78S
79C
80C
81S
82S
83S
84S
85C
86S
87S
88C
89C
90S
91S
92C
93C
94S
95C
96S
97S
98C
99S
100S
Rows: 1-100 of 1234 | Column: embarked | Type: varchar(20)
In [12]:
titanic["embarked"].label_encode()
123
embarked
Int
12
22
32
42
50
60
72
80
90
100
112
122
132
140
150
162
172
182
192
202
212
222
232
242
250
262
272
280
292
302
310
322
330
340
350
362
370
382
392
402
412
422
432
440
450
462
470
482
492
502
512
522
532
542
552
560
570
582
592
600
610
622
632
640
652
662
672
682
690
702
710
722
731
742
752
760
770
782
790
800
812
822
832
842
850
862
872
880
890
902
912
920
930
942
950
962
972
980
992
1002
Out[12]:
Rows: 1-100 of 1234 | Column: embarked | Type: int

See Also

vDataFrame[].decode Encodes the vcolumn using a User Defined Encoding.
vDataFrame[].discretize Discretizes the vcolumn.
vDataFrame[].get_dummies Encodes the vcolumn using the One Hot Encoding.
vDataFrame[].mean_encode Encodes the vcolumn using the Mean Encoding of a response.