vDataFrame.groupby

In [ ]:
vDataFrame.groupby(columns: list, 
                   expr: list = [])

Aggregates the vDataFrame by grouping the elements.

Parameters

Name Type Optional Description
columns
list
List of the vcolumns used for the grouping. It can also be a list of customized expressions.
expr
list
List of the different aggregations. Pure SQL must be written. Aliases can also be given. 'SUM(column)' or 'AVG(column) AS my_new_alias' are correct whereas 'AVG' is incorrect. Aliases are recommended to keep the track of the different features and not have ambiguous names. The function MODE does not exist in SQL for example but can be obtained using the 'analytic' method first and then by grouping the result.

Returns

vDataFrame : object result of the grouping.

Example

In [10]:
from verticapy.datasets import load_market
market = load_market()
display(market)
Abc
Name
Varchar(32)
Abc
Form
Varchar(32)
123
Price
Float
1Acorn squashFresh1.1193087167
2Acorn squashFresh1.1722478842
3ApplesFresh1.56751539145
4ApplesFresh1.6155336441
5ApplesFrozen0.5104657455
6ApplesFrozen0.537867915537
7ApplesReady to drink0.6311325278
8ApplesReady to drink0.727287713105
9ApricotsDried7.3309645323
10ApricotsDried7.73397208999
11ApricotsFresh3.0400719671
12ApricotsFresh3.087137817
13ApricotsPacked in juice1.4742760156
14ApricotsPacked in juice1.64748926094
15ApricotsPacked in syrup, syrup discarded1.55204253208
16ApricotsPacked in syrup, syrup discarded1.8552908088
17ArtichokeCanned3.3174627313
18ArtichokeCanned3.38561047493
19ArtichokeFresh2.21305047929
20ArtichokeFresh2.3637333814
21ArtichokeFrozen5.7683252728
22ArtichokeFrozen6.22141832776
23AsparagusCanned2.5802971683
24AsparagusCanned2.7200086787
25AsparagusFresh3.0756105724
26AsparagusFresh3.21349409304
27AsparagusFrozen5.85731991397
28AsparagusFrozen6.0657860935
29AvocadoFresh2.225544076
30AvocadoFresh2.23587438158
31BananasFresh0.5494172928
32BananasFresh0.566983414531
33BeetsCanned0.9427311158
34BeetsCanned1.01727546324
35Berries, mixedFrozen3.41021478336
36Berries, mixedFrozen3.63694066
37Black beansCanned0.945681045
38Black beansCanned0.980579717755
39Black beansDried1.4038762924
40Black beansDried1.48978420263
41BlackberriesFresh5.661671336
42BlackberriesFresh5.77470825035
43BlackberriesFrozen3.38850085426
44BlackberriesFrozen3.6920108696
45Blackeye peasCanned0.910440813422
46Blackeye peasCanned0.9310398825
47Blackeye peasDried1.58990085218
48Blackeye peasDried1.613022434
49BlueberriesFresh4.3911083709
50BlueberriesFresh4.73462168973
51BlueberriesFrozen3.471925452
52BlueberriesFrozen3.64325041591
53BroccoliFlorets2.3624557989
54BroccoliFlorets2.56847143403
55BroccoliFrozen1.8227837671
56BroccoliFrozen1.86970036313
57BroccoliHeads1.63609690647
58BroccoliHeads1.9191171292
59Brussels sproutsFresh2.76355337151
60Brussels sproutsFresh2.9620929796
61Brussels sproutsFrozen2.0397358015
62Brussels sproutsFrozen2.13825366618
63Butternut squashFresh1.24473670387
64Butternut squashFresh1.2920110231
65CabbageFresh green cabbage0.579208394258
66CabbageFresh green cabbage0.6238712291
67CabbageFresh red cabbage1.0241250285
68CabbageFresh red cabbage1.05644972791
69CabbageSauerkraut1.153095782
70CabbageSauerkraut1.26463509211
71CantaloupeFresh0.520793672
72CantaloupeFresh0.535873776106
73CarrotsCanned0.9231914223
74CarrotsCanned1.06049032785
75CarrotsCooked whole0.741539992379
76CarrotsCooked whole0.7737514155
77CarrotsFrozen1.3965860252
78CarrotsFrozen1.46051272856
79CarrotsRaw baby1.4373172338
80CarrotsRaw baby1.44745804876
81CarrotsRaw whole0.741539992379
82CarrotsRaw whole0.7737514155
83CauliflowerFlorets3.1279734694
84CauliflowerFlorets3.27064843515
85CauliflowerFrozen1.6814349408
86CauliflowerFrozen1.716444729
87CauliflowerHeads1.22881085623
88CauliflowerHeads1.416972077
89CelerySticks2.1340528999
90CelerySticks2.2049694293
91CeleryTrimmed bunches1.0928457916
92CeleryTrimmed bunches1.11365395418
93CherriesCanned, packed in syrup or water3.51874337733
94CherriesCanned, packed in syrup or water3.7259182628
95CherriesFresh3.2130681107
96CherriesFresh3.59298975549
97Collard greensCanned0.8534190293
98Collard greensCanned0.902593822572
99Collard greensFresh2.6294683753
100Collard greensFresh2.63083790515
Rows: 1-100 | Columns: 3
In [11]:
market.groupby(columns = ["Form", "Name"],
               expr = ["AVG(Price) AS avg_price",
                       "STDDEV(Price) AS std"])
Out[11]:
Abc
Form
Varchar(32)
Abc
Name
Varchar(32)
123
avg_price
Float
123
std
Float
1FreshAcorn squash1.145778300450.0374336443296234
2FreshApples1.5915245177750.033954032069563
3FrozenApples0.52416683051850.0193762602523861
4Ready to drinkApples0.67921012045250.0679919835754149
5DriedApricots7.5324683111450.284969376912029
6FreshApricots3.063604892050.0332805816266319
7Packed in juiceApricots1.560882638270.122480260371241
8Packed in syrup, syrup discardedApricots1.703666670440.214428912851844
9CannedArtichoke3.3515366031150.048187731643313
10FreshArtichoke2.2883919303450.106548901890849
11FrozenArtichoke5.994871800280.320385171670741
12CannedAsparagus2.65015292350.0987909564136379
13FreshAsparagus3.144552332720.0974983724584246
14FrozenAsparagus5.9615530037350.147407849193659
15FreshAvocado2.230709228790.00730462912737973
16FreshBananas0.55820035366550.0124211237951375
17CannedBeets0.980003289520.0527108135739495
18FrozenBerries, mixed3.523577721680.160319404842618
19CannedBlack beans0.96313038137750.0246770881594642
20DriedBlack beans1.4468302475150.0607460658812006
21FreshBlackberries5.7181897931750.0799291686612887
22FrozenBlackberries3.540255861930.214613990004942
23CannedBlackeye peas0.9207403479610.0145657414311878
24DriedBlackeye peas1.601461643090.0163494272967093
25FreshBlueberries4.5628650303150.242900597172589
26FrozenBlueberries3.5575879339550.121145043767303
27FloretsBroccoli2.4654636164650.145675052630875
28FrozenBroccoli1.8462420651150.0331750432030009
29HeadsBroccoli1.7776070178350.200125518705313
30FreshBrussels sprouts2.8628231755550.140388703214564
31FrozenBrussels sprouts2.088994733840.0696626501832463
32FreshButternut squash1.2683738634850.0334279917035078
33Fresh green cabbageCabbage0.6015398116790.0315813933837907
34Fresh red cabbageCabbage1.0402873782050.0228570141526365
35SauerkrautCabbage1.2088654370550.0788702025476506
36FreshCantaloupe0.5283337240530.0106632438743567
37CannedCarrots0.9918408750750.0970849871638956
38Cooked wholeCarrots0.75764570393950.0227769157205302
39FrozenCarrots1.428549376880.0452030054447557
40Raw babyCarrots1.442387641280.00717063902494663
41Raw wholeCarrots0.75764570393950.0227769157205302
42FloretsCauliflower3.1993109522750.10088643578742
43FrozenCauliflower1.69893983490.0247556586441232
44HeadsCauliflower1.3228914666150.133050075162805
45SticksCelery2.16951116460.0501455588369294
46Trimmed bunchesCelery1.103249872890.0147135928643669
47Canned, packed in syrup or waterCherries3.6223308200650.146494766407389
48FreshCherries3.4030289330950.26864517135055
49CannedCollard greens0.8780064259360.0347718297860748
50FreshCollard greens2.6301531402250.000968403844114992
51FrozenCollard greens1.5363409018050.0788916229899441
52CannedCorn0.87881674892250.0354806214422065
53FreshCorn2.9777755490650.406094804455345
54FrozenCorn1.608633701530.0147631602577653
55DriedCranberries4.738072602780.0688280414993992
56Fresh, consumed with peelCucumbers1.275758474120.0285287803154144
57Fresh, peeledCucumbers1.275758474120.0285287803154144
58DriedDates5.1515280522950.509366657325232
59DriedFigs5.936844258350.266616901811729
60Packed in juiceFruit cocktail1.4495193002850.0574655341542805
61Packed in syrup or waterFruit cocktail1.2501511321350.0151222388887894
62FreshGrapefruit0.953799633990.0791925552121489
63FrozenGrapefruit0.674839678618nan
64Juice, ready to drinkGrapefruit0.8494909232nan
65Ready to drinkGrapefruit0.824989780378nan
66FreshGrapes2.164850304250.100441537458926
67FrozenGrapes0.7281973016980.0115558279448051
68RaisinsGrapes3.5382900334950.0528567963134104
69Ready to drinkGrapes0.8795559948690.0480536917097328
70CannedGreat northern beans0.8986240635090.0357680245500685
71DriedGreat northern beans1.5616052949150.0389991356591416
72CannedGreen beans0.82548990485850.00204944698175326
73FreshGreen beans2.1331243562050.00968438768679218
74FrozenGreen beans1.684903486670.0220737713811638
75CannedGreen peas1.000041517670.0184238996381032
76FrozenGreen peas1.652002882310.0090936859340817
77FreshGreen peppers1.45266674290.0598270480860252
78FreshHoneydew melon0.8113237098650.0207429836981466
79CannedKale1.0686461381nan
80FreshKale2.84229605730.049489387502778
81FrozenKale1.894887193660.252665889924294
82FreshKiwi2.1130055434350.0966220905848999
83DriedLentils1.473235323650.124732955328185
84FreshLettuce, Iceberg1.1511474876150.0875284236578059
85Full HeadsLettuce, Romaine1.6600665920850.248946451608926
86HeartsLettuce, Romaine2.6533145840350.140980565011309
87CannedLima beans1.355890943050.0391430589925351
88DriedLima beans1.9750962913150.345691279858497
89FrozenLima beans1.867453774960.0448718757678454
90DriedMangoes9.334180925041.17313487217308
91FreshMangoes1.3505403741650.0382163961426793
92CannedMixed Vegetables1.16931490570250.103832331507163
93FrozenMixed Vegetables1.58140365080750.249177110412949
94SlicedMushrooms3.816004514830.00789112051889268
95WholeMushrooms3.4801193387350.0963560092729786
96CannedMustard greens0.8997654622780.108931383247022
97FreshMustard greens2.6242517824450.0778054568543142
98FrozenMustard greens1.5055585625850.070167375472974
99CannedNavy beans0.9583369893930.0170926458557113
100DriedNavy beans1.492265254360.0191123609892753
Rows: 1-100 | Columns: 4
In [12]:
from verticapy.datasets import load_amazon
amazon = load_amazon()
display(amazon)

# customized SQL selection
amazon.groupby(columns = ["MONTH(date) AS month"],
               expr = ["AVG(number) AS avg_number"],)
📅
date
Date
Abc
state
Varchar(32)
123
number
Int
11998-01-01ACRE0
21998-01-01ALAGOAS0
31998-01-01AMAPÁ0
41998-01-01AMAZONAS0
51998-01-01BAHIA0
61998-01-01CEARÁ0
71998-01-01DISTRITO FEDERAL0
81998-01-01ESPÍRITO SANTO0
91998-01-01GOIÁS0
101998-01-01MARANHÃO0
111998-01-01MATO GROSSO0
121998-01-01MATO GROSSO DO SUL0
131998-01-01MINAS GERAIS0
141998-01-01PARANÁ0
151998-01-01PARAÍBA0
161998-01-01PARÁ0
171998-01-01PERNAMBUCO0
181998-01-01PIAUÍ0
191998-01-01RIO DE JANEIRO0
201998-01-01RIO GRANDE DO NORTE0
211998-01-01RIO GRANDE DO SUL0
221998-01-01RONDÔNIA0
231998-01-01RORAIMA0
241998-01-01SANTA CATARINA0
251998-01-01SERGIPE0
261998-01-01SÃO PAULO0
271998-01-01TOCANTINS0
281998-02-01ACRE0
291998-02-01ALAGOAS0
301998-02-01AMAPÁ0
311998-02-01AMAZONAS0
321998-02-01BAHIA0
331998-02-01CEARÁ0
341998-02-01DISTRITO FEDERAL0
351998-02-01ESPÍRITO SANTO0
361998-02-01GOIÁS0
371998-02-01MARANHÃO0
381998-02-01MATO GROSSO0
391998-02-01MATO GROSSO DO SUL0
401998-02-01MINAS GERAIS0
411998-02-01PARANÁ0
421998-02-01PARAÍBA0
431998-02-01PARÁ0
441998-02-01PERNAMBUCO0
451998-02-01PIAUÍ0
461998-02-01RIO DE JANEIRO0
471998-02-01RIO GRANDE DO NORTE0
481998-02-01RIO GRANDE DO SUL0
491998-02-01RONDÔNIA0
501998-02-01RORAIMA0
511998-02-01SANTA CATARINA0
521998-02-01SERGIPE0
531998-02-01SÃO PAULO0
541998-02-01TOCANTINS0
551998-03-01ACRE0
561998-03-01ALAGOAS0
571998-03-01AMAPÁ0
581998-03-01AMAZONAS0
591998-03-01BAHIA0
601998-03-01CEARÁ0
611998-03-01DISTRITO FEDERAL0
621998-03-01ESPÍRITO SANTO0
631998-03-01GOIÁS0
641998-03-01MARANHÃO0
651998-03-01MATO GROSSO0
661998-03-01MATO GROSSO DO SUL0
671998-03-01MINAS GERAIS0
681998-03-01PARANÁ0
691998-03-01PARAÍBA0
701998-03-01PARÁ0
711998-03-01PERNAMBUCO0
721998-03-01PIAUÍ0
731998-03-01RIO DE JANEIRO0
741998-03-01RIO GRANDE DO NORTE0
751998-03-01RIO GRANDE DO SUL0
761998-03-01RONDÔNIA0
771998-03-01RORAIMA0
781998-03-01SANTA CATARINA0
791998-03-01SERGIPE0
801998-03-01SÃO PAULO0
811998-03-01TOCANTINS0
821998-04-01ACRE0
831998-04-01ALAGOAS0
841998-04-01AMAPÁ0
851998-04-01AMAZONAS0
861998-04-01BAHIA0
871998-04-01CEARÁ0
881998-04-01DISTRITO FEDERAL0
891998-04-01ESPÍRITO SANTO0
901998-04-01GOIÁS0
911998-04-01MARANHÃO0
921998-04-01MATO GROSSO0
931998-04-01MATO GROSSO DO SUL0
941998-04-01MINAS GERAIS0
951998-04-01PARANÁ0
961998-04-01PARAÍBA0
971998-04-01PARÁ0
981998-04-01PERNAMBUCO0
991998-04-01PIAUÍ0
1001998-04-01RIO DE JANEIRO0
Rows: 1-100 | Columns: 3
Out[12]:
123
month
Int
123
avg_number
Float
1197.2033271719039
2259.0685185185185
3365.0333333333333
4455.4759259259259
5585.3388888888889
66210.422222222222
77417.233333333333
881466.26111111111
992132.94259259259
10101243.8462962963
1111622.798148148148
1212327.300194931774
Rows: 1-12 | Columns: 2

See Also

vDataFrame.append Merges the vDataFrame with another relation.
vDataFrame.analytic Adds a new vcolumn to the vDataFrame by using an advanced analytical function on a specific vcolumn.
vDataFrame.join Joins the vDataFrame with another relation.
vDataFrame.sort Sorts the vDataFrame.