VerticaPy

Python API for Vertica Data Science at Scale

Integrating with GeoPandas

As of version 0.4.0, VerticaPy integrates with GeoPandas: you can export a vDataFrame as a GeoPandas GeoDataFrame, which gives you more control over geospatial data.

This example uses the 'world' dataset to demonstrate the advantages of the GeoPandas integration.

In [49]:
from verticapy.datasets import load_world, load_cities

# Load the 'world' dataset as a vDataFrame
world = load_world()
display(world)
 | pop_est (Int) | continent (Varchar(32)) | country (Varchar(82)) | geometry (Geometry(1048576))
1 | 140 | Seven seas (open ocean) | Fr. S. Antarctic Lands |
2 | 2931 | South America | Falkland Is. |
3 | 4050 | Antarctica | Antarctica |
4 | 57713 | North America | Greenland |
5 | 265100 | Asia | N. Cyprus |
6 | 279070 | Oceania | New Caledonia |
7 | 282814 | Oceania | Vanuatu |
8 | 329988 | North America | Bahamas |
9 | 339747 | Europe | Iceland |
10 | 360346 | North America | Belize |
11 | 443593 | Asia | Brunei |
12 | 591919 | South America | Suriname |
13 | 594130 | Europe | Luxembourg |
14 | 603253 | Africa | W. Sahara |
15 | 642550 | Europe | Montenegro |
16 | 647581 | Oceania | Solomon Is. |
17 | 737718 | South America | Guyana |
18 | 758288 | Asia | Bhutan |
19 | 778358 | Africa | Eq. Guinea |
20 | 865267 | Africa | Djibouti |
21 | 920938 | Oceania | Fiji |
22 | 1218208 | North America | Trinidad and Tobago |
23 | 1221549 | Asia | Cyprus |
24 | 1251581 | Europe | Estonia |
25 | 1291358 | Asia | Timor-Leste |
26 | 1467152 | Africa | eSwatini |
27 | 1772255 | Africa | Gabon |
28 | 1792338 | Africa | Guinea-Bissau |
29 | 1895250 | Europe | Kosovo |
30 | 1944643 | Europe | Latvia |
31 | 1958042 | Africa | Lesotho |
32 | 1972126 | Europe | Slovenia |
33 | 2051363 | Africa | Gambia |
34 | 2103721 | Europe | Macedonia |
35 | 2214858 | Africa | Botswana |
36 | 2314307 | Asia | Qatar |
37 | 2484780 | Africa | Namibia |
38 | 2823859 | Europe | Lithuania |
39 | 2875422 | Asia | Kuwait |
40 | 2990561 | North America | Jamaica |
41 | 3045191 | Asia | Armenia |
42 | 3047987 | Europe | Albania |
43 | 3068243 | Asia | Mongolia |
44 | 3351827 | North America | Puerto Rico |
45 | 3360148 | South America | Uruguay |
46 | 3424386 | Asia | Oman |
47 | 3474121 | Europe | Moldova |
48 | 3500000 | Africa | Somaliland |
49 | 3753142 | North America | Panama |
50 | 3758571 | Africa | Mauritania |
51 | 3856181 | Europe | Bosnia and Herz. |
52 | 4292095 | Europe | Croatia |
53 | 4510327 | Oceania | New Zealand |
54 | 4543126 | Asia | Palestine |
55 | 4689021 | Africa | Liberia |
56 | 4926330 | Asia | Georgia |
57 | 4930258 | North America | Costa Rica |
58 | 4954674 | Africa | Congo |
59 | 5011102 | Europe | Ireland |
60 | 5320045 | Europe | Norway |
61 | 5351277 | Asia | Turkmenistan |
62 | 5445829 | Europe | Slovakia |
63 | 5491218 | Europe | Finland |
64 | 5605948 | Europe | Denmark |
65 | 5625118 | Africa | Central African Rep. |
66 | 5789122 | Asia | Kyrgyzstan |
67 | 5918919 | Africa | Eritrea |
68 | 6025951 | North America | Nicaragua |
69 | 6072475 | Asia | United Arab Emirates |
70 | 6163195 | Africa | Sierra Leone |
71 | 6172011 | North America | El Salvador |
72 | 6229794 | Asia | Lebanon |
73 | 6653210 | Africa | Libya |
74 | 6909701 | Oceania | Papua New Guinea |
75 | 6943739 | South America | Paraguay |
76 | 7101510 | Europe | Bulgaria |
77 | 7111024 | Europe | Serbia |
78 | 7126706 | Asia | Laos |
79 | 7531386 | Africa | Somalia |
80 | 7965055 | Africa | Togo |
81 | 8236303 | Europe | Switzerland |
82 | 8299706 | Asia | Israel |
83 | 8468555 | Asia | Tajikistan |
84 | 8754413 | Europe | Austria |
85 | 9038741 | North America | Honduras |
86 | 9549747 | Europe | Belarus |
87 | 9850845 | Europe | Hungary |
88 | 9960487 | Europe | Sweden |
89 | 9961396 | Asia | Azerbaijan |
90 | 10248069 | Asia | Jordan |
91 | 10646714 | North America | Haiti |
92 | 10674723 | Europe | Czechia |
93 | 10734247 | North America | Dominican Rep. |
94 | 10768477 | Europe | Greece |
95 | 10839514 | Europe | Portugal |
96 | 11038805 | Africa | Benin |
97 | 11138234 | South America | Bolivia |
98 | 11147407 | North America | Cuba |
99 | 11403800 | Africa | Tunisia |
100 | 11466756 | Africa | Burundi |
Rows: 1-100 | Columns: 4

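The 'apply' function of the VerticaPy stats module lets you apply any Vertica function to the data. Let's compute the area of each country. We first convert the 'geometry' column to the GEOGRAPHY type with STV_Geography, so that ST_Area returns areas in square meters rather than in the units of the coordinate system.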

In [50]:
import verticapy.stats as st
world["geography"] = st.apply("stv_geography", world["geometry"])
world["geography"].astype("geography")
world["area"] = st.apply("st_area", world["geography"])
display(world)
 | pop_est (Int) | continent (Varchar(32)) | country (Varchar(82)) | geometry (Geometry(1048576)) | geography (Geography) | area (Float)
1 | 140 | Seven seas (open ocean) | Fr. S. Antarctic Lands | | | 11565103231.3373
2 | 2931 | South America | Falkland Is. | | | 16301980840.923
3 | 4050 | Antarctica | Antarctica | | | 12236252243904.5
4 | 57713 | North America | Greenland | | | 2189751414857.88
5 | 265100 | Asia | N. Cyprus | | | 3786381407.61484
6 | 279070 | Oceania | New Caledonia | | | 23282206866.1151
7 | 282814 | Oceania | Vanuatu | | | 7516451865.11271
8 | 329988 | North America | Bahamas | | | 15615897740.7182
9 | 339747 | Europe | Iceland | | | 107031853684.15
10 | 360346 | North America | Belize | | | 22118506248.1433
11 | 443593 | Asia | Brunei | | | 10747411137.2237
12 | 591919 | South America | Suriname | | | 144905905345.04
13 | 594130 | Europe | Luxembourg | | | 2408816376.52072
14 | 603253 | Africa | W. Sahara | | | 96485945530.7032
15 | 642550 | Europe | Montenegro | | | 13420753080.0136
16 | 647581 | Oceania | Solomon Is. | | | 24831146940.9329
17 | 737718 | South America | Guyana | | | 210721444770.75
18 | 758288 | Asia | Bhutan | | | 39441951914.4996
19 | 778358 | Africa | Eq. Guinea | | | 27242029433.1684
20 | 865267 | Africa | Djibouti | | | 21966242368.6336
21 | 920938 | Oceania | Fiji | | | 19353579665.0818
22 | 1218208 | North America | Trinidad and Tobago | | | 7769152795.8564
23 | 1221549 | Asia | Cyprus | | | 6207615256.11238
24 | 1251581 | Europe | Estonia | | | 44441790595.9777
25 | 1291358 | Asia | Timor-Leste | | | 14776389256.3134
26 | 1467152 | Africa | eSwatini | | | 18151531543.0868
27 | 1772255 | Africa | Gabon | | | 270683290583.006
28 | 1792338 | Africa | Guinea-Bissau | | | 36333129901.889
29 | 1895250 | Europe | Kosovo | | | 11211660926.8804
30 | 1944643 | Europe | Latvia | | | 63610085256.8601
31 | 1958042 | Africa | Lesotho | | | 27538833464.2686
32 | 1972126 | Europe | Slovenia | | | 19070599769.5595
33 | 2051363 | Africa | Gambia | | | 14084033590.1295
34 | 2103721 | Europe | Macedonia | | | 25026416110.0245
35 | 2214858 | Africa | Botswana | | | 593439013501.164
36 | 2314307 | Asia | Qatar | | | 11350868333.7436
37 | 2484780 | Africa | Namibia | | | 827008138278.503
38 | 2823859 | Europe | Lithuania | | | 63539126862.9797
39 | 2875422 | Asia | Kuwait | | | 16673251066.3903
40 | 2990561 | North America | Jamaica | | | 12500311840.4168
41 | 3045191 | Asia | Armenia | | | 28624801795.8918
42 | 3047987 | Europe | Albania | | | 29655575543.1869
43 | 3068243 | Asia | Mongolia | | | 1540264832849.55
44 | 3351827 | North America | Puerto Rico | | | 9253937792.62798
45 | 3360148 | South America | Uruguay | | | 176951396575.409
46 | 3424386 | Asia | Oman | | | 310171982405.565
47 | 3474121 | Europe | Moldova | | | 32231982684.4942
48 | 3500000 | Africa | Somaliland | | | 168036707912.611
49 | 3753142 | North America | Panama | | | 75581062587.0651
50 | 3758571 | Africa | Mauritania | | | 1057124493616.92
51 | 3856181 | Europe | Bosnia and Herz. | | | 50502366239.6082
52 | 4292095 | Europe | Croatia | | | 57402973305.2149
53 | 4510327 | Oceania | New Zealand | | | 277235027321.0
54 | 4543126 | Asia | Palestine | | | 5040777640.68459
55 | 4689021 | Africa | Liberia | | | 98630883299.1773
56 | 4926330 | Asia | Georgia | | | 68940478573.1486
57 | 4930258 | North America | Costa Rica | | | 54052091510.5747
58 | 4954674 | Africa | Congo | | | 341198516478.108
59 | 5011102 | Europe | Ireland | | | 58217540791.3531
60 | 5320045 | Europe | Norway | | | 395342753995.474
61 | 5351277 | Asia | Turkmenistan | | | 480403612700.583
62 | 5445829 | Europe | Slovakia | | | 46922595108.0149
63 | 5491218 | Europe | Finland | | | 339068193512.295
64 | 5605948 | Europe | Denmark | | | 42557150895.5091
65 | 5625118 | Africa | Central African Rep. | | | 624537721850.979
66 | 5789122 | Asia | Kyrgyzstan | | | 195599523413.578
67 | 5918919 | Africa | Eritrea | | | 119741273687.045
68 | 6025951 | North America | Nicaragua | | | 130035380109.343
69 | 6072475 | Asia | United Arab Emirates | | | 80064396581.7372
70 | 6163195 | Africa | Sierra Leone | | | 76298870279.8167
71 | 6172011 | North America | El Salvador | | | 20971345487.6572
72 | 6229794 | Asia | Lebanon | | | 10102103572.6829
73 | 6653210 | Africa | Libya | | | 1636524223070.74
74 | 6909701 | Oceania | Papua New Guinea | | | 466521361111.786
75 | 6943739 | South America | Paraguay | | | 402294650565.806
76 | 7101510 | Europe | Bulgaria | | | 110029924011.393
77 | 7111024 | Europe | Serbia | | | 76232737912.4867
78 | 7126706 | Asia | Laos | | | 229798138060.103
79 | 7531386 | Africa | Somalia | | | 486443125671.911
80 | 7965055 | Africa | Togo | | | 61222637778.6997
81 | 8236303 | Europe | Switzerland | | | 46063257324.0069
82 | 8299706 | Asia | Israel | | | 23010251034.7033
83 | 8468555 | Asia | Tajikistan | | | 138010387019.757
84 | 8754413 | Europe | Austria | | | 84824085671.2045
85 | 9038741 | North America | Honduras | | | 114200451680.34
86 | 9549747 | Europe | Belarus | | | 208097501687.829
87 | 9850845 | Europe | Hungary | | | 92223563486.8061
88 | 9960487 | Europe | Sweden | | | 447872323030.927
89 | 9961396 | Asia | Azerbaijan | | | 91012352138.7241
90 | 10248069 | Asia | Jordan | | | 89284998136.0192
91 | 10646714 | North America | Haiti | | | 28628396092.704
92 | 10674723 | Europe | Czechia | | | 80936955801.5538
93 | 10734247 | North America | Dominican Rep. | | | 48306262335.5482
94 | 10768477 | Europe | Greece | | | 131854192993.243
95 | 10839514 | Europe | Portugal | | | 93318040232.9742
96 | 11038805 | Africa | Benin | | | 117478842076.243
97 | 11138234 | South America | Bolivia | | | 1088907530802.77
98 | 11147407 | North America | Cuba | | | 115172058712.562
99 | 11403800 | Africa | Tunisia | | | 156277140541.341
100 | 11466756 | Africa | Burundi | | | 26355519798.6781
Rows: 1-100 | Columns: 6
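
To see why the geometry is converted to a geography before calling ST_Area, it can help to compare a planar area with a spherical one. The sketch below is a standalone illustration in plain Python, not Vertica's implementation: the function names, the spherical Earth radius, and the 1° x 1° test cell are all assumptions made for the example. The planar (shoelace) result is in squared degrees, which has no physical meaning, while the spherical approximation yields square meters.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # assumed mean Earth radius (spherical model)


def planar_area(ring):
    """Shoelace area of a closed ring of (lon, lat) pairs, in squared degrees."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def spherical_area(ring, radius=EARTH_RADIUS_M):
    """Approximate area of a closed lon/lat ring on a sphere, in square meters."""
    total = 0.0
    for (lon1, lat1), (lon2, lat2) in zip(ring, ring[1:]):
        total += math.radians(lon2 - lon1) * (
            2.0 + math.sin(math.radians(lat1)) + math.sin(math.radians(lat2))
        )
    return abs(total) * radius * radius / 2.0


# Hypothetical 1 degree x 1 degree cell touching the equator (closed ring).
cell = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]

area_deg2 = planar_area(cell)
area_km2 = spherical_area(cell) / 1e6
print(f"planar:    {area_deg2} squared degrees")
print(f"spherical: {area_km2:.0f} km^2")
```

The same cell placed at a higher latitude keeps a planar area of one squared degree but covers less ground, which is the distortion the geography type avoids.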

We can now export our vDataFrame as a GeoPandas GeoDataFrame. The 'geometry' parameter names the column to use as the GeoDataFrame's geometry.

In [51]:
df = world.to_geopandas(geometry = "geometry")
display(df)
pop_est continent country geography area geometry
0 140 Seven seas (open ocean) Fr. S. Antarctic Lands b'\\251UY\\305\\300\\000\\000\\000\\000\\000\\... 1.156510e+10 POLYGON ((68.93500 -48.62500, 69.58000 -48.940...
1 2931 South America Falkland Is. b'\\203\\377\\363\\304\\000\\000\\000\\000\\00... 1.630198e+10 POLYGON ((-61.20000 -51.85000, -60.00000 -51.2...
2 4050 Antarctica Antarctica b'\\200\\000\\000\\000\\000\\000\\000\\000\\00... 1.223625e+13 MULTIPOLYGON (((-48.66062 -78.04702, -48.15140...
3 57713 North America Greenland b'\\226\\252\\240\\000\\000\\000\\000\\000\\00... 2.189751e+12 POLYGON ((-46.76379 82.62796, -43.40644 83.225...
4 265100 Asia N. Cyprus b'\\274\\000\\014\\000\\000\\000\\000\\000\\00... 3.786381e+09 POLYGON ((32.73178 35.14003, 32.80247 35.14550...
... ... ... ... ... ... ...
172 207353391 South America Brazil b'\\200\\000\\000\\000\\000\\000\\000\\000\\00... 8.540950e+12 POLYGON ((-53.37366 -33.76838, -53.65054 -33.2...
173 260580739 Asia Indonesia b'\\200\\000\\000\\000\\000\\000\\000\\000\\00... 1.827304e+12 MULTIPOLYGON (((141.00021 -2.60015, 141.01706 ...
174 326625791 North America United States of America b'\\226\\252\\240\\000\\000\\000\\000\\000\\00... 9.494301e+12 MULTIPOLYGON (((-122.84000 49.00000, -120.0000...
175 1281935911 Asia India b'\\274\\000\\000\\000\\000\\000\\000\\000\\00... 3.150427e+12 POLYGON ((97.32711 28.26158, 97.40256 27.88254...
176 1379302771 Asia China b'\\274\\000\\000\\000\\000\\000\\000\\000\\00... 9.408026e+12 MULTIPOLYGON (((109.47521 18.19770, 108.65521 ...

177 rows × 6 columns
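
The 'area' values are in square meters, so they are easier to read once converted to square kilometers, and they combine naturally with 'pop_est' to give a population density. The sketch below works through that arithmetic in plain Python on three rows copied from the outputs above; the helper name `density_per_km2` is an assumption made for the example and is not part of VerticaPy or GeoPandas.

```python
# (country, pop_est, area in square meters), copied from the outputs above
rows = [
    ("Greenland", 57713, 2189751414857.88),
    ("Iceland", 339747, 107031853684.15),
    ("Luxembourg", 594130, 2408816376.52072),
]

M2_PER_KM2 = 1_000_000  # 1 km^2 = 1,000,000 m^2


def density_per_km2(pop_est, area_m2):
    """People per square kilometer, given an area in square meters."""
    return pop_est / (area_m2 / M2_PER_KM2)


for country, pop_est, area_m2 in rows:
    print(
        f"{country}: {area_m2 / M2_PER_KM2:,.0f} km^2, "
        f"{density_per_km2(pop_est, area_m2):.2f} people/km^2"
    )
```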

From there, we can plot any of the geospatial objects, and overlay VerticaPy graphics on the same Matplotlib axes.

In [52]:
ax = df.plot(edgecolor = "black",
             color = "white",
             figsize = (10, 9))

In [53]:
# Loading the cities dataset
cities = load_cities()

import verticapy.stats as st
import matplotlib.pyplot as plt

# Creating a Matplotlib figure
fig, ax = plt.subplots()
fig.set_size_inches(11, 8)

# Extracting longitude and latitude
cities["lon"] = st.apply("st_x", cities["geometry"])
cities["lat"] = st.apply("st_y", cities["geometry"])

# Drawing the data on a Map
ax = cities.scatter(["lon", "lat"], ax = ax)
df.plot(edgecolor = "black",
        color = "white",
        ax = ax)
Out[53]:
<AxesSubplot:xlabel='"lon"', ylabel='"lat"'>
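
In the cell above, ST_X and ST_Y extract the x and y coordinates of each city's point, which are then used as longitude and latitude. The sketch below mimics that step in plain Python on a WKT point string, where the coordinate order is "x y" (longitude first). The function name `point_xy` and the sample coordinates are assumptions made for the example; they are not taken from the 'cities' dataset or from any library.

```python
import re

# Matches two-dimensional WKT points such as "POINT (2.3522 48.8566)".
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def point_xy(wkt):
    """Return (x, y), i.e. (longitude, latitude), from a 2D WKT point."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError(f"not a 2D WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))


# Hypothetical coordinates, not taken from the cities dataset.
lon, lat = point_xy("POINT (2.3522 48.8566)")
print(lon, lat)
```

Keeping longitude on the x axis and latitude on the y axis is what lets the scatter plot line up with the country outlines drawn by GeoPandas.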