Loading...

verticapy.sql.geo.create_index#

verticapy.sql.geo.create_index(vdf: str | vDataFrame, gid: str, g: str, index: str, overwrite: bool = False, max_mem_mb: int = 256, skip_nonindexable_polygons: bool = False) TableSample#

Creates a spatial index on a set of polygons to speed up spatial intersection with a set of points.

Parameters#

vdf: SQLRelation

vDataFrame used to compute the spatial join.

gid: str

Name of an integer column that uniquely identifies the polygon. The gid cannot be NULL.

g: str

Name of a geometry or geography (WGS84) column or expression that contains polygons and multipolygons. Only polygon and multipolygon can be indexed. Other shape types are excluded from the index.

index: str

Name of the index.

overwrite: bool, optional

BOOLEAN value that specifies whether to overwrite the index, if an index exists.

max_mem_mb: int, optional

A positive integer that assigns a limit to the amount of memory in megabytes that create_index can allocate during index construction.

skip_nonindexable_polygons: bool, optional

In rare cases, intricate polygons (for instance, those with too high resolution or anomalous spikes) cannot be indexed. These polygons are considered non-indexable. When set to False, non-indexable polygons cause the index creation to fail. When set to True, index creation can succeed by excluding non-indexable polygons from the index.

Returns#

TableSample

geospatial indexes.

Examples#

For this example, we will use the Cities and World dataset.

import verticapy.datasets as vpd

cities = vpd.load_cities()
world = vpd.load_world()
Abc
city
Varchar(82)
Abc
Long varchar(2411724)
1Abidjan
2Abu Dhabi
3Abuja
4Accra
5Addis Ababa
6Algiers
7Amman
8Amsterdam
9Andorra
10Ankara
11Antananarivo
12Apia
13Ashgabat
14Asmara
15Astana
16Asuncion
17Athens
18Baghdad
19Baku
20Bamako
21Bandar Seri Begawan
22Bangkok
23Bangui
24Banjul
25Basseterre
26Beijing
27Beirut
28Belgrade
29Belmopan
30Berlin
31Bern
32Bishkek
33Bissau
34Bloemfontein
35Bogota
36Brasilia
37Bratislava
38Brazzaville
39Bridgetown
40Brussels
41Bucharest
42Budapest
43Buenos Aires
44Bujumbura
45Cairo
46Canberra
47Cape Town
48Caracas
49Castries
50Chisinau
51Colombo
52Conakry
53Cotonou
54Dakar
55Damascus
56Dar es Salaam
57Dhaka
58Dili
59Djibouti
60Doha
61Dublin
62Dushanbe
63Freetown
64Funafuti
65Gaborone
66Georgetown
67Guatemala
68Hanoi
69Harare
70Hargeysa
71Havana
72Helsinki
73Honiara
74Islamabad
75Jakarta
76Jerusalem
77Johannesburg
78Juba
79Kabul
80Kampala
81Kathmandu
82Khartoum
83Kiev
84Kigali
85Kingston
86Kingstown
87Kinshasa
88Kuala Lumpur
89Kuwait
90København
91La Paz
92Libreville
93Lilongwe
94Lima
95Lisbon
96Ljubljana
97Lome
98London
99Luanda
100Lusaka
Rows: 1-100 | Columns: 2
123
pop_est
Integer
Abc
continent
Varchar(32)
Abc
country
Varchar(82)
Abc
Long varchar(2411724)
1140Seven seas (open ocean)Fr. S. Antarctic Lands
22931South AmericaFalkland Is.
34050AntarcticaAntarctica
457713North AmericaGreenland
5265100AsiaN. Cyprus
6279070OceaniaNew Caledonia
7282814OceaniaVanuatu
8329988North AmericaBahamas
9339747EuropeIceland
10360346North AmericaBelize
11443593AsiaBrunei
12591919South AmericaSuriname
13594130EuropeLuxembourg
14603253AfricaW. Sahara
15642550EuropeMontenegro
16647581OceaniaSolomon Is.
17737718South AmericaGuyana
18758288AsiaBhutan
19778358AfricaEq. Guinea
20865267AfricaDjibouti
21920938OceaniaFiji
221218208North AmericaTrinidad and Tobago
231221549AsiaCyprus
241251581EuropeEstonia
251291358AsiaTimor-Leste
261467152AfricaeSwatini
271772255AfricaGabon
281792338AfricaGuinea-Bissau
291895250EuropeKosovo
301944643EuropeLatvia
311958042AfricaLesotho
321972126EuropeSlovenia
332051363AfricaGambia
342103721EuropeMacedonia
352214858AfricaBotswana
362314307AsiaQatar
372484780AfricaNamibia
382823859EuropeLithuania
392875422AsiaKuwait
402990561North AmericaJamaica
413045191AsiaArmenia
423047987EuropeAlbania
433068243AsiaMongolia
443351827North AmericaPuerto Rico
453360148South AmericaUruguay
463424386AsiaOman
473474121EuropeMoldova
483500000AfricaSomaliland
493753142North AmericaPanama
503758571AfricaMauritania
513856181EuropeBosnia and Herz.
524292095EuropeCroatia
534510327OceaniaNew Zealand
544543126AsiaPalestine
554689021AfricaLiberia
564926330AsiaGeorgia
574930258North AmericaCosta Rica
584954674AfricaCongo
595011102EuropeIreland
605320045EuropeNorway
615351277AsiaTurkmenistan
625445829EuropeSlovakia
635491218EuropeFinland
645605948EuropeDenmark
655625118AfricaCentral African Rep.
665789122AsiaKyrgyzstan
675918919AfricaEritrea
686025951North AmericaNicaragua
696072475AsiaUnited Arab Emirates
706163195AfricaSierra Leone
716172011North AmericaEl Salvador
726229794AsiaLebanon
736653210AfricaLibya
746909701OceaniaPapua New Guinea
756943739South AmericaParaguay
767101510EuropeBulgaria
777111024EuropeSerbia
787126706AsiaLaos
797531386AfricaSomalia
807965055AfricaTogo
818236303EuropeSwitzerland
828299706AsiaIsrael
838468555AsiaTajikistan
848754413EuropeAustria
859038741North AmericaHonduras
869549747EuropeBelarus
879850845EuropeHungary
889960487EuropeSweden
899961396AsiaAzerbaijan
9010248069AsiaJordan
9110646714North AmericaHaiti
9210674723EuropeCzechia
9310734247North AmericaDominican Rep.
9410768477EuropeGreece
9510839514EuropePortugal
9611038805AfricaBenin
9711138234South AmericaBolivia
9811147407North AmericaCuba
9911403800AfricaTunisia
10011466756AfricaBurundi
Rows: 1-100 | Columns: 4

Note

VerticaPy offers a wide range of sample datasets that are ideal for training and testing purposes. You can explore the full list of available datasets in the Datasets, which provides detailed information on each dataset and how to use them effectively. These datasets are invaluable resources for honing your data analysis and machine learning skills within the VerticaPy environment.

Let’s preprocess the datasets by extracting latitude and longitude values and creating an index.

world["id"] = "ROW_NUMBER() OVER(ORDER BY country, pop_est)"
display(world)

cities["id"] = "ROW_NUMBER() OVER (ORDER BY city)"
cities["lat"] = "ST_X(geometry)"
cities["lon"] = "ST_Y(geometry)"
display(cities)
123
pop_est
Int
100%
...
🌎
Geometry(1048576)
100%
123
id
Integer
100%
134124811...1
23047987...2
340969443...3
429310273...4
54050...5
644293293...6
73045191...7
823232413...8
98754413...9
109961396...10
11329988...11
12157826578...12
139549747...13
1411491346...14
15360346...15
1611038805...16
17758288...17
1811138234...18
193856181...19
202214858...20
Abc
city
Varchar(82)
100%
...
🌎
lat
Float(22)
100%
🌎
lon
Float(22)
100%
1Abidjan...-4.041994118507095.32194282609856
2Abu Dhabi...54.36659338259224.4666835723799
3Abuja...7.531382142932339.0852790077542
4Accra...-0.2186615989606945.55198046444593
5Addis Ababa...38.69805857534879.03525622129575
6Algiers...3.0486066709092436.7650106566281
7Amman...35.931354066874131.9519711058275
8Amsterdam...4.9146943174009752.3519145466644
9Andorra...1.5164859605055242.5000014435459
10Ankara...32.862445782356639.9291844440755
11Antananarivo...47.5146780415299-18.9146914920322
12Apia...-171.738641608603-13.8415450424484
13Ashgabat...58.383299111774637.949994933111
14Asmara...38.933323525759315.3333392526819
15Astana...71.42777420948351.1811253042576
16Asuncion...-57.6434510279013-25.2944571170577
17Athens...23.731375225679437.9852720905523
18Baghdad...44.391922914564133.3405943561586
19Baku...49.860271303257840.397217891343
20Bamako...-8.001984963249712.6519605263233

Let’s create the geo-index.

from verticapy.sql.geo import create_index

create_index(world, "id", "geometry", "world_polygons", True)
Abc
type
Varchar(20)
...
123
max_y
Float(22)
Abc
info
Varchar(500)
1GEOMETRY...83.64513

Let’s calculate the intersection between the cities and the various countries by using the GEOMETRY data type.

from verticapy.sql.geo import intersect

intersect(cities, "world_polygons", "id", "geometry")
123
point_id
Integer
100%
123
polygon_gid
Integer
100%
150102
251147
35265
45316
554136
655152
756155
8129100
9131118
1013228
1113324
12135122
13136149
14140
152165
163116
17461
18551
1963
20781

The same can be done using directly the longitude and latitude.

intersect(cities, "world_polygons", "id", x="lat", y="lon")
123
point_id
Integer
100%
123
polygon_gid
Integer
100%
141131
24270
3436
44425
54546
6468
747144
848171
912283
10124107
1112530
1212672
13127115
14128108
152017
1620236
17140
182165
193116
20461

Note

For geospatial functions, Vertica utilizes indexing to expedite computations, especially considering the potentially extensive size of polygons. This is a unique optimization approach employed by Vertica in these scenarios.

See also

describe_index() : Describes the geo index.
intersect() : Spatially intersects a point or points with a set of polygons.
rename_index() : Renames the geo index.