Best practices#

In this tutorial, we will explore some best practices and optimizations to help you get the most out of Vertica and VerticaPy.

1. Restrict objects and operations to essential columns#

As VerticaPy is effectively an abstraction of SQL, any database-level optimizations you make in your Vertica database carry over to VerticaPy. In Vertica, optimization is centered on projections, which are collections of table columns—from one or more tables—stored on disk in a format that optimizes query execution. When you write queries in terms of the original tables, the query uses the projections to return query results. For details about creating and designing projections, see the Projections section in the Vertica documentation.

Projections are created and managed in the Vertica database, but you can leverage the power of projections in VerticaPy with features such as the vDataFrame’s usecols parameter, which specifies the columns from the input relation to include in the vDataFrame. As columnar databases perform better when there are fewer columns in the query, especially when you are working with large datasets, limiting vDataFrames and operations to essential columns can lead to a significant performance improvement. By default, most vDataFrame methods use all numerical columns in the vDataFrame, but you can restrict the operation to specific columns.

In the following examples, we’ll demonstrate how to create a vDataFrame from specific columns in the input relation, and then run methods on that vDataFrame. First, load the titanic dataset into Vertica using the load_titanic() function:

[2]:
from verticapy.datasets import load_titanic

load_titanic()
[2]:
123
pclass
Integer
123
survived
Integer
Abc
Varchar(164)
Abc
sex
Varchar(20)
123
age
Numeric(8)
123
sibsp
Integer
123
parch
Integer
Abc
ticket
Varchar(36)
123
fare
Numeric(12)
Abc
cabin
Varchar(30)
Abc
embarked
Varchar(20)
Abc
boat
Varchar(100)
123
body
Integer
Abc
Varchar(100)
110male47.010PC 17757227.525C62 C64C[null]124
210male24.001PC 17558247.5208B58 B60C[null][null]
310male25.0001390526.0[null]C[null]148
410male42.00011048926.55D22S[null][null]
510male45.00011305026.55B38S[null][null]
610male46.010W.E.P. 573461.175E31S[null][null]
710male64.01419950263.0C23 C25 C27S[null][null]
810male[null]0011379642.4[null]S[null][null]
910male32.500113503211.5C132C[null]45
1010male55.00011378730.5C30S[null][null]
1110male37.001PC 1759629.7C118C[null][null]
1210male64.00069326.0[null]S[null]263
1310male56.00011379226.55[null]S[null][null]
1410male24.0101369560.0C31S[null][null]
1511male0.9212113781151.55C22 C26S11[null]
1611female18.010PC 17757227.525C62 C64C4[null]
1711male80.0002704230.0A23SB[null]
1811female42.000PC 17757227.525[null]C4[null]
1911male25.0101196791.0792B49C7[null]
2011female45.000PC 17608262.375[null]C4[null]
2111female22.00111350555.0E33S6[null]
2211female[null]001777027.7208[null]C5[null]
2311female22.000113781151.55[null]S11[null]
2411female64.002PC 1775683.1583E45C14[null]
2511female54.0103694778.2667D20C4[null]
2611male43.0101776527.7208D40C5[null]
2711female22.0021356849.5B39C5[null]
2811male23.001PC 1775963.3583D10 D12C7[null]
2911female35.01011378952.0[null]S8[null]
3011female[null]101746451.8625D21S8[null]
3111male42.0101175352.5542D19S5[null]
3211female45.0101175352.5542D19S5[null]
3311female16.001PC 1759239.4D28S9[null]
3411female21.0001350277.9583D9S10[null]
3511male36.000PC 1747326.2875E25S7[null]
3611male[null]00F.C. 1299825.7417[null]C7[null]
3711female33.000PC 1761327.7208A11C11[null]
3811female48.013PC 17608262.375B57 B59 B63 B66C4[null]
3911female23.0102122882.2667B45S7[null]
4011female31.00236928164.8667C7S8[null]
4120male57.00024434613.0[null]S[null][null]
4220female29.010SC/AH 2903726.0[null]S[null][null]
4320male29.000W./C. 1426310.5[null]S[null][null]
4420female30.00023724913.0[null]S[null][null]
4520male24.00024872613.5[null]S[null]297
4620male30.00025064613.0[null]S[null]305
4720male52.00025064713.0[null]S[null]19
4820male44.0102670726.0[null]S[null][null]
4920male57.00021953312.35[null]Q[null][null]
5020male32.00023721613.5[null]S[null]209
5120male70.000C.A. 2458010.5[null]S[null][null]
5220male54.0002901114.0[null]S[null][null]
5320male16.000S.O./P.P. 310.5[null]S[null][null]
5420male62.0002402769.6875[null]Q[null][null]
5520male27.000SC/PARIS 216815.0333[null]C[null][null]
5620male36.000C.A. 1724810.5[null]S[null][null]
5720female27.0101166821.0[null]S[null][null]
5820male27.01022841426.0[null]S[null]293
5920male66.000C.A. 2457910.5[null]S[null][null]
6021female48.002C.A. 3311236.75[null]S14[null]
6121female30.010SC/PARIS 214813.8583[null]C12[null]
6221female34.00024388013.0[null]S12[null]
6321female24.00225064914.5[null]S4[null]
6421female48.01222084565.0[null]S9[null]
6521female3.012SC/Paris 212341.5792[null]C14[null]
6621male3.01123008026.0F2SD[null]
6721female2.0112636026.0[null]S11[null]
6821female18.00225065213.0[null]S16[null]
6930male30.000C 70767.25[null]S[null]72
7030female40.01075469.475[null]S[null][null]
7130male35.0003734508.05[null]S[null][null]
7230female18.001269114.4542[null]C[null][null]
7330male[null]0026647.225[null]C[null][null]
7430female32.01136484915.5[null]Q[null][null]
7530male22.0003500457.7958[null]S[null][null]
7630male35.0003645128.05[null]S[null][null]
7730male[null]10268914.4583[null]C[null][null]
7830female[null]10268914.4583[null]C[null][null]
7930male31.0003350977.75[null]Q[null][null]
8030female22.000755210.5167[null]S[null][null]
8130male0.330234708014.4[null]S[null][null]
8230male34.01134708014.4[null]S[null]197
8330female28.01134708014.4[null]S[null][null]
8430male25.0003492037.8958[null]S[null][null]
8530male25.0003492507.8958[null]S[null][null]
8630male[null]003492387.8958[null]S[null][null]
8730male[null]003492257.8958[null]S[null][null]
8830male22.000A/5 211727.25[null]S[null][null]
8930male[null]0026747.225[null]C[null][null]
9030male[null]0026317.225[null]C[null][null]
9130male40.500C.A. 621215.1[null]S[null]187
9230male40.5003672327.75[null]Q[null]68
9330male18.0003500367.7958[null]S[null][null]
9430female[null]003648597.75[null]Q[null][null]
9530male[null]003492547.8958[null]C[null][null]
9630male40.016CA 214446.9[null]S[null][null]
9730male51.000214408.05[null]S[null][null]
9830male26.0103500257.8542[null]S[null][null]
9930female[null]003826497.75[null]Q[null][null]
10030male[null]003492207.8958[null]S[null][null]
Rows: 1-100 | Columns: 14

Supposing we are only interested in the ‘survived’, ‘pclass’, ‘age’, ‘parch’, and ‘sibsp’ columns, we can create a vDataFrame with just those columns by specifying them in the usecols parameter:

[3]:
import verticapy as vp

vdf = vp.vDataFrame("public.titanic",
                    usecols = ["survived", "pclass", "age", "parch", "sibsp"])
display(vdf)
123
pclass
Integer
123
survived
Integer
123
age
Numeric(8)
123
sibsp
Integer
123
parch
Integer
11071.000
21017.000
31027.010
41037.011
510[null]00
61031.010
710[null]00
81037.010
91024.000
101045.010
111042.000
1210[null]00
131041.010
1410[null]00
151029.000
161047.000
171058.002
181050.010
191057.010
201064.010
211051.001
221153.020
231136.001
241114.012
2511[null]01
261136.002
2711[null]00
281135.000
291122.001
301125.010
311135.010
321139.000
331137.010
341130.010
351131.010
361122.010
371143.001
381133.000
391135.000
401148.010
411160.010
422044.010
432029.010
442036.000
452030.000
462044.000
472021.000
482047.000
492022.020
502031.000
5120[null]00
522035.000
532041.000
54214.021
55210.8302
562117.000
572124.011
582150.001
592133.002
602130.030
612131.000
622131.000
632112.000
643042.000
653026.000
663032.000
673022.000
68306.011
69309.011
703019.000
713028.000
723029.000
733020.000
743027.000
753030.000
763027.000
773022.000
783032.000
793023.010
803016.000
813018.022
823033.011
833041.000
843019.000
853021.000
863022.000
873029.000
883030.000
893022.020
9030[null]00
913036.010
9230[null]00
9330[null]00
943055.500
953021.000
9630[null]00
973022.000
9830[null]00
993028.000
1003020.000
Rows: 1-100 | Columns: 5

If we run the avg() method without specifying columns, all numerical vDataFrame columns are included in the operation:

NOTE: To examine the generated SQL for each command, turn on the “sql_on” option.

[4]:
vp.set_option("sql_on", True)
vdf.avg()

Computing the different aggregations.

  SELECT
    /*+LABEL('vDataframe.aggregate')*/ AVG("pclass"),
    AVG("survived"),
    AVG("age"),
    AVG("sibsp"),
    AVG("parch")  
  FROM
(  
  SELECT
    "pclass",
    "survived",
    "age",
    "sibsp",
    "parch"  
  FROM
"public"."titanic")  
VERTICAPY_SUBTABLE  
  LIMIT 1
[4]:
avg
"pclass"2.28444084278768
"survived"0.364667747163695
"age"30.1524573721163
"sibsp"0.504051863857374
"parch"0.378444084278768
Rows: 1-5 | Columns: 2

To restrict the operation to specific columns in the vDataFrame, provide the column names in the columns parameter:

[5]:
vdf.avg(columns = ["age", "survived"])
[5]:
avg
"age"30.1524573721163
"survived"0.364667747163695
Rows: 1-2 | Columns: 2

As we are working with a small dataset, the perfomance impact of excluding unncessary columns is not very significant. However, with large datasets (e.g. greater than a TB), the impact is much greater, and choosing essential columns becomes a key step in improving performance.

Instead of specifying essential columns to include, some methods allow you to list the columns to exclude with the exclude_columns parameter:

[6]:
vdf.numcol(exclude_columns = ["parch", "sibsp"])
[6]:
['"pclass"', '"survived"', '"age"']

NOTE: To list all columns in a vDataFrame, including non-numerical columns, use the get_columns() method.

You can then use this truncated list of columns in another method call; for instance, to compute a correlation matrix:

[7]:
vdf.corr(columns = vdf.numcol(exclude_columns = ["parch", "sibsp"]))

Computing the pearson Corr Matrix.

  SELECT
    /*+LABEL('vDataframe._aggregate_matrix')*/ CORR_MATRIX("pclass", "survived", "age") OVER ()  
  FROM
(  
  SELECT
    "pclass",
    "survived",
    "age",
    "sibsp",
    "parch"  
  FROM
"public"."titanic")  
VERTICAPY_SUBTABLE
../../../_images/notebooks_introduction_best-practices_index_17_2.png

To turn off the SQL code generation option:

[8]:
vp.set_option("sql_on", False)

2. Save the current relation#

The vDataFrame works like a view, a stored query that encapsulates one or more SELECT statements. If the generated relation uses many different functions, the computation time for each method call is greatly increased.

Small transformations don’t drastically slow down computation, but heavy transformations (multiple joins, frequent use of advanced analytical funcions, moving windows, etc.) can result in noticeable slowdown. When performing computationally expensive operations, you can aid performance by saving the vDataFrame structure as a table in the Vertica database. We will demonstrate this process in the following example.

First, create a vDataFrame, then perform some operations on that vDataFrame:

[9]:
vdf = vp.vDataFrame("public.titanic")
vdf["sex"].label_encode()["boat"].fillna(method = "0ifnull")["name"].str_extract(
    ' ([A-Za-z]+)\.').eval("family_size", expr = "parch + sibsp + 1").drop(
    columns = ["cabin", "body", "ticket", "home.dest"])["fare"].fill_outliers().fillna()
print(vdf.current_relation())
795 elements were filled.
(
   SELECT
     "pclass",
     "survived",
     "name",
     "sex",
     "age",
     "sibsp",
     "parch",
     COALESCE("fare", 32.9113074018842) AS "fare",
     "embarked",
     "boat",
     "family_size"
   FROM
 (
   SELECT
     "pclass",
     "survived",
     REGEXP_SUBSTR("name", ' ([A-Za-z]+)\.') AS "name",
     DECODE("sex", 'female', 0, 'male', 1, 2) AS "sex",
     COALESCE("age", 30.1524573721163) AS "age",
     "sibsp",
     "parch",
     (CASE WHEN "fare" < -176.6204982585513 THEN -176.6204982585513 WHEN "fare" > 244.5480856064831 THEN 244.5480856064831 ELSE "fare" END) AS "fare",
     COALESCE("embarked", 'S') AS "embarked",
     DECODE("boat", NULL, 0, 1) AS "boat",
     parch + sibsp + 1 AS "family_size"
   FROM
 (

   SELECT

                    "pclass",
     "survived",
     "name",
     "sex",
     "age",
     "sibsp",
     "parch",
     "fare",
     "embarked",
     "boat"

   FROM
 "public"."titanic")
VERTICAPY_SUBTABLE)
VERTICAPY_SUBTABLE)
VERTICAPY_SUBTABLE

To understand how Vertica executes the different aggregations in the above relation, let’s take a look at the query plan:

NOTE: Query plans can be hard to interpret if you don’t know how to parse them. For more information, see query plan information and structure.

[10]:
print(vdf.explain())
------------------------------
QUERY PLAN DESCRIPTION:

EXPLAIN SELECT /*+LABEL('vDataframe.explain')*/ * FROM (SELECT "pclass", "survived", "name", "sex", "age", "sibsp", "parch", COALESCE("fare", 32.9113074018842) AS "fare", "embarked", "boat", "family_size" FROM (SELECT "pclass", "survived", REGEXP_SUBSTR("name", ' ([A-Za-z]+)\.') AS "name", DECODE("sex", 'female', 0, 'male', 1, 2) AS "sex", COALESCE("age", 30.1524573721163) AS "age", "sibsp", "parch", (CASE WHEN "fare" < -176.6204982585513 THEN -176.6204982585513 WHEN "fare" > 244.5480856064831 THEN 244.5480856064831 ELSE "fare" END) AS "fare", COALESCE("embarked", 'S') AS "embarked", DECODE("boat", NULL, 0, 1) AS "boat", parch + sibsp + 1 AS "family_size" FROM ( SELECT "pclass", "survived", "name", "sex", "age", "sibsp", "parch", "fare", "embarked", "boat" FROM "public"."titanic") VERTICAPY_SUBTABLE) VERTICAPY_SUBTABLE) VERTICAPY_SUBTABLE

Access Path:
+-STORAGE ACCESS for titanic [Cost: 129, Rows: 1K (NO STATISTICS)] (PATH ID: 1)
|  Projection: public.titanic_super
|  Materialize: titanic.pclass, titanic.survived, titanic.name, titanic.sex, titanic.age, titanic.sibsp, titanic.parch, titanic.fare, titanic.embarked, titanic.boat
|  Execute on: v_vertica_eon_node0001, v_vertica_eon_node0002, v_vertica_eon_node0003


-----------------------------------------------
PLAN: BASE QUERY PLAN (GraphViz Format)
-----------------------------------------------
digraph G {
graph [rankdir=BT, label = "BASE QUERY PLAN
        Query: EXPLAIN SELECT /*+LABEL(\'vDataframe.explain\')*/ * FROM (SELECT \"pclass\", \"survived\", \"name\", \"sex\", \"age\", \"sibsp\", \"parch\", COALESCE(\"fare\", 32.9113074018842) AS \"fare\", \"embarked\", \"boat\", \"family_size\" FROM (SELECT \"pclass\", \"survived\", REGEXP_SUBSTR(\"name\", \' ([A-Za-z]+)\.\') AS \"name\", DECODE(\"sex\", \'female\', 0, \'male\', 1, 2) AS \"sex\", COALESCE(\"age\", 30.1524573721163) AS \"age\", \"sibsp\", \"parch\", (CASE WHEN \"fare\" \< -176.6204982585513 THEN -176.6204982585513 WHEN \"fare\" \> 244.5480856064831 THEN 244.5480856064831 ELSE \"fare\" END) AS \"fare\", COALESCE(\"embarked\", \'S\') AS \"embarked\", DECODE(\"boat\", NULL, 0, 1) AS \"boat\", parch + sibsp + 1 AS \"family_size\" FROM ( SELECT \"pclass\", \"survived\", \"name\", \"sex\", \"age\", \"sibsp\", \"parch\", \"fare\", \"embarked\", \"boat\" FROM \"public\".\"titanic\") VERTICAPY_SUBTABLE) VERTICAPY_SUBTABLE) VERTICAPY_SUBTABLE

        All Nodes Vector:

          node[0]=v_vertica_eon_node0001 (initiator) Up
          node[1]=v_vertica_eon_node0002 (executor) Up
          node[2]=v_vertica_eon_node0003 (executor) Up
          node[3]=v_vertica_eon_node0004 (executor) Down
          node[4]=v_vertica_eon_node0005 (executor) Down
          node[5]=v_vertica_eon_node0006 (executor) Down
          node[6]=v_vertica_eon_node0007 (executor) Down
          node[7]=v_vertica_eon_node0008 (executor) Down
          node[8]=v_vertica_eon_node0009 (executor) Down

        Participating subscriptions:

          replica: [<1, 2, 0>]
          segment0001: [<0>]
          segment0002: [<1>]
          segment0003: [<2>]
          segment0004: [<0>]
          segment0005: [<1>]
          segment0006: [<2>]
          segment0007: [<0>]
          segment0008: [<1>]
          segment0009: [<2>]
        ", labelloc=t, labeljust=l ordering=out]
0[label = "Root
        OutBlk=[UncTuple(11)]", color = "green", shape = "house"];
1[label = "NewEENode
        OutBlk=[UncTuple(11)]", color = "green", shape = "box"];
2[label = "Recv
        Recv from: v_vertica_eon_node0001, v_vertica_eon_node0002, v_vertica_eon_node0003
        Net id: 1000

        Unc: Integer(8)
        Unc: Integer(8)
        Unc: Varchar(164)
        Unc: Integer(8)
        Unc: Numeric(16, 13)
        Unc: Integer(8)
        Unc: Integer(8)
        Unc: Numeric(18, 13)
        Unc: Varchar(20)
        Unc: Integer(8)
        Unc: Integer(8)", color = "green", shape = "box"];
3[label = "Send
        Send to: v_vertica_eon_node0001
        Net id: 1000

        Unc: Integer(8)
        Unc: Integer(8)
        Unc: Varchar(164)
        Unc: Integer(8)
        Unc: Numeric(16, 13)
        Unc: Integer(8)
        Unc: Integer(8)
        Unc: Numeric(18, 13)
        Unc: Varchar(20)
        Unc: Integer(8)
        Unc: Integer(8)", color = "green", shape = "box"];
4[label = "StorageUnionStep: titanic_super
        Unc: Integer(8)
        Unc: Integer(8)
        Unc: Varchar(164)
        Unc: Integer(8)
        Unc: Numeric(16, 13)
        Unc: Integer(8)
        Unc: Integer(8)
        Unc: Numeric(18, 13)
        Unc: Varchar(20)
        Unc: Integer(8)
        Unc: Integer(8)", color = "purple", shape = "box"];
5[label = "ExprEval:
          titanic.pclass
          titanic.survived
          regexp_substr(titanic.name, E\' ([A-Za-z]+)\\.\', 1, 1, \'\', 0)
          CASE titanic.sex WHEN NULLSEQUAL \'female\' THEN 0 WHEN NULLSEQUAL \'male\' THEN 1 ELSE 2 END
          coalesce(titanic.age, 30.1524573721163)
          titanic.sibsp
          titanic.parch
          coalesce(CASE WHEN (titanic.fare \< (-176.6204982585513)) THEN (-176.6204982585513) WHEN (titanic.fare \> 244.5480856064831) THEN 244.5480856064831 ELSE titanic.fare END, 32.9113074018842)
          coalesce(titanic.embarked, \'S\')
          CASE titanic.boat WHEN NULLSEQUAL NULL THEN 0 ELSE 1 END
          ((titanic.parch + titanic.sibsp) + 1)
        Unc: Integer(8)
        Unc: Integer(8)
        Unc: Varchar(164)
        Unc: Integer(8)
        Unc: Numeric(16, 13)
        Unc: Integer(8)
        Unc: Integer(8)
        Unc: Numeric(18, 13)
        Unc: Varchar(20)
        Unc: Integer(8)
        Unc: Integer(8)", color = "brown", shape = "box"];
6[label = "ScanStep: titanic_super
        pclass
        survived
        name
        sex
        age
        sibsp
        parch
        fare
        embarked
        boat
        Unc: Integer(8)
        Unc: Integer(8)
        Unc: Varchar(164)
        Unc: Varchar(20)
        Unc: Numeric(6, 3)
        Unc: Integer(8)
        Unc: Integer(8)
        Unc: Numeric(10, 5)
        Unc: Varchar(20)
        Unc: Varchar(100)", color = "brown", shape = "box"];
1->0 [label = "V[0] C=11", color = "black", style="bold", arrowtail="inv"];
2->1 [label = "0", color = "blue"];
3->2 [label = "0", color = "blue"];
4->3 [label = "0", color = "blue"];
5->4 [label = "0", color = "blue"];
6->5 [label = "0", color = "blue"];}

Looking at the plan and its associated relation, it’s clear that the transformations we applied to the vDataFrame result in a complicated relation. Each method call to the vDataFrame must use this relation for computation.

NOTE: To better understand your queries, check out the `QueryProfiler <https://www.vertica.com/python/documentation/1.0.x/html/api/verticapy.performance.vertica.qprof.QueryProfiler.html>`__ function.

To save the relation as a table in the Vertica and replace the current relation in VerticaPy with the new table relation, use the `to_db() <https://www.vertica.com/python/documentation/1.0.x/html/verticapy.vDataFrame.to_db.html>`__ method with the inplace parameter set to True:

[11]:
vp.drop("public.titanic_clean", method = "table") # drops any existing table with the same schema and name
vdf.to_db("public.titanic_clean",
          relation_type = "table",
          inplace = True)
print(vdf.current_relation())
"public"."titanic_clean"

When dealing with very large datasets, it’s best to take caution before saving relations with complicated transformations. Ideally, you will perform a thorough data exploration, and only execute heavy transformations when essential.

3. Use the help function#

For a quick and convenient way to view information about an object or function, use the help() function:

[12]:
help(vp.connect)
Help on function connect in module verticapy.connection.connect:

connect(section: str, dsn: Optional[str] = None) -> None
    Connects to the database.

    Parameters
    ----------
    section: str
        Name of the section in the
        configuration file.
    dsn: str, optional
        Path to the file containing
        the credentials. If empty,
        the Connection File will be
        used.

    Examples
    --------
    Display all available connections:

    .. code-block:: python

        from verticapy.connection import available_connections

        available_connections()

    ``['VML', 'VerticaDSN', 'VerticaDSN_test']``

    Connect using the VerticaDSN connection:

    .. code-block:: python

        from verticapy.connection import connect

        connect("VerticaDSN")

    .. seealso::

        | :py:func:`~verticapy.connection.available_connections` :
            Displays all available connections.
        | :py:func:`~verticapy.connection.get_connection_file` :
            Gets the VerticaPy connection file.
        | :py:func:`~verticapy.connection.new_connection` :
            Creates a new VerticaPy connection.
        | :py:func:`~verticapy.connection.set_connection` :
            Sets the VerticaPy connection.

4. Close your connections#

Each connection to the database increases the concurrency on the system, so try to close connections when you’re done with them. VerticaPy simplifies the connection process by allowing the user to create an auto-connection, but the closing of connections must be done manually with the close_connection() function.

To demonstrate, create a database connection:

[ ]:
vp.connect("VerticaDSN")

When you are done making changes, close the connection with the close_connection() function:

[ ]:
vp.close_connection()

It is especially important to close connections when you are working in an environment with mutliple users.

5. Consider a method’s time complexity#

Some techniques are significantly more computationally expensive than others. For example, a Kendall correlation is very expensive compared to a Pearson correlation because, unlike Pearson, Kendall correlations use a cross join, resulting in a time complexity of O(n*n) (where n is the number of rows). Let’s compare the time needed to compute these two correlations on the ‘titanic’ dataset:

[13]:
import time

vdf = vp.vDataFrame("public.titanic")

start_time = time.time()
x = vdf.corr(method = "pearson", show = False)
print("Pearson, time: {0}".format(time.time() - start_time))

start_time = time.time()
x = vdf.corr(method = "kendall", show = False)
print("Kendall, time: {0}".format(time.time() - start_time))
Pearson, time: 0.17660832405090332
100%|██████████| 6/6 [00:00<00:00, 2943.03it/s]
Kendall, time: 0.857501745223999

The Kendall correlation matrix takes longer to compute than the Pearson matrix. As we are using a relatively small dataset, this difference is not very large, but it increases in magntiude with the size of the dataset. Taking into account the time complexity of operations is a key step in optimzing performance, esepcially with large datasets.

6. Limit plot elements#

Graphics are an essential tool to understand your data, but they can become difficult to parse if they contain too many elements. VerticaPy provides options that restrict plots to specified elements. To demonstrate, let’s first draw a multi-histogram with a categorical column with thousands of categories:

[3]:
vdf.bar(["name", "survived"])

VerticaPy outputs the bar chart, but the number of categories makes the graph basically incomprehensible. Instead, whenever possible, try to create graphics with as few categories as possible for your use case:

[4]:
vdf.hist(["pclass", "survived"])

To view the cardinality of your variables, use the nunique() method:

[17]:
vdf.nunique()
[17]:
approx_unique
"pclass"3.0
"survived"2.0
"name"1233.0
"sex"2.0
"age"96.0
"sibsp"7.0
"parch"8.0
"ticket"888.0
"fare"275.0
"cabin"181.0
"embarked"3.0
"boat"26.0
"body"118.0
"home.dest"355.0
Rows: 1-14 | Columns: 2

7. Filter unnecessary data#

Filtering your data is a crucial step in data preparation. Proper filtering avoids unnecessary computations and greatly improves the performance of each method call. While the performance impact can be minimal for small datasets, filtering large datasets is key to improving performance.

For example, if we are only interested in analyzing Titanic passengers who didn’t have a lifeboat, we can filter on this requirement using the `filter() <https://www.vertica.com/python/documentation/1.0.x/html/verticapy.vDataFrame.filter.html>`__ method:

[18]:
vdf.filter("boat IS NOT NULL")
795 elements were filtered.
[18]:
123
pclass
Integer
123
survived
Integer
Abc
Varchar(164)
Abc
sex
Varchar(20)
123
age
Numeric(8)
123
sibsp
Integer
123
parch
Integer
Abc
ticket
Varchar(36)
123
fare
Numeric(12)
Abc
cabin
Varchar(30)
Abc
embarked
Varchar(20)
Abc
boat
Varchar(100)
123
body
Integer
Abc
Varchar(100)
111male37.0111175152.5542D35S5[null]
211male26.00011136930.0C148C5[null]
311female35.000PC 17760135.6333C99S8[null]
411female60.0001181376.2917D15C8[null]
511male45.000PC 1759429.7A9C7[null]
611female27.011PC 17558247.5208B58 B60C6[null]
711female35.01011380353.1C123SD[null]
811male53.00011378028.5C51CB[null]
911female19.00011205330.0B42S3[null]
1011female58.001PC 17582153.4625C125S3[null]
1111male25.0101176555.4417E50C5[null]
1211male[null]001698830.0D45S3[null]
1311female18.01011377353.1D30S10[null]
1411female30.0001274993.5B73S3[null]
1511female54.010PC 1760359.4[null]C6[null]
1611female18.022PC 17608262.375B57 B59 B63 B66C4[null]
1711female43.0101177855.4417C116C5[null]
1811female39.01111041379.65E67S8[null]
1911female39.01117421110.8833C68C4[null]
2011female55.000PC 17760135.6333C32C8[null]
2121male1.02123013639.0F4S11[null]
2221female20.01023685326.0[null]S12[null]
2321female45.00223778930.0[null]S12[null]
2421female27.010SC/PARIS 214913.8583[null]C12[null]
2521female54.0132910523.0[null]S4[null]
2621male42.00023779813.0[null]S10[null]
2721female[null]0022659312.35E101Q10[null]
2821female34.000C.A. 3426010.5F33S14[null]
2921male29.000SC/PARIS 214713.8583[null]C9[null]
3021male0.83112910618.75[null]S4[null]
3121male2.0112910323.0[null]S14[null]
3231female19.0103500467.8542[null]S16[null]
3331female22.0003349147.725[null]Q13[null]
3431male45.00075988.05[null]S15[null]
3531male1.012C.A. 231520.575[null]S10[null]
3631male[null]00SOTON/O.Q. 31013087.05[null]S15[null]
3731female45.01035002614.1083[null]S11[null]
3831female27.00139209612.475E121S14[null]
3931female[null]1037111024.15[null]Q16[null]
4031female[null]1036723015.5[null]Q16[null]
4131female26.0003474707.8542[null]S13[null]
4231male25.0003457689.5[null]S11[null]
4331female[null]003354327.7333[null]Q13[null]
4411female32.0001181376.2917D15C8[null]
4511female47.0111175152.5542D35S5[null]
4611female44.000PC 1761027.7208B4C6[null]
4711female36.012113760120.0B96 B98S4[null]
4811female36.000PC 17608262.375B61C4[null]
4911female39.011PC 1775683.1583E49C14[null]
5011female38.010PC 1759971.2833C85C4[null]
5111female33.000113781151.55[null]S8[null]
5211female27.012F.C. 1275052.0B71S3[null]
5311female24.03219950263.0C23 C25 C27S10[null]
5411male49.0101745389.1042C92C5[null]
5511female16.00111136157.9792B18C4[null]
5611female30.000PC 17761106.425[null]C2[null]
5711female35.0101323657.75C28C11[null]
5811female[null]00PC 1758579.2[null]CD[null]
5911male[null]001994735.5C52SD[null]
6021female28.010P/PP 338124.0[null]C10[null]
6121male34.00024869813.0D56S13[null]
6221male24.0002803410.5[null]S9[null]
6321male8.0022822032.5[null]S10[null]
6421female24.02124384727.0[null]S12[null]
6521male2.01123008026.0F2SD[null]
6621male22.000W./C. 1426010.5[null]S13[null]
6721female50.000W./C. 1425810.5[null]S13[null]
6821female50.000F.C.C. 1353110.5[null]S9[null]
6921female4.0112910323.0[null]S14[null]
7031female17.04231012817.925[null]SD[null]
7131female18.000A/4 314168.05[null]SC[null]
7231female[null]1038652516.1[null]S16[null]
7331male[null]00160156.4958[null]S13[null]
7431male[null]00160156.4958[null]SC[null]
7531male26.0003470707.775[null]S15[null]
7631male[null]00160156.4958[null]SC[null]
7731female22.000C 70777.25[null]S13[null]
7831male20.010STON/O 2. 31012857.925[null]S15[null]
7931female24.0003826537.75[null]Q15[null]
8031female[null]003656815.5[null]Q16[null]
8131male20.011265315.7417[null]CC[null]
8231male12.010265111.2417[null]CC[null]
8331male25.0103470837.775[null]S15[null]
8431female[null]11266822.3583F E69CD[null]
8531female23.000CA. 23147.55[null]SC[null]
8631female[null]1037656416.1[null]S10[null]
8710male36.0001305075.2417C6CA[null]
8811female29.00024160211.3375B5S2[null]
8911male48.0001995226.55E12S3[null]
9011male28.00011056426.55C52SD[null]
9111female33.01011380653.1E8S5[null]
9211female30.00011015286.5B77S8[null]
9311female64.01111290126.55B26S7[null]
9411male27.00011380430.5[null]S3[null]
9511female17.0101747457.0B20S3[null]
9611male53.0113363881.8583A34S13[null]
9711female[null]10PC 17611133.65[null]S5[null]
9811female51.0101350277.9583D11S10[null]
9911female33.0101992890.0C78Q14[null]
10011female19.0021175226.2833D47S5[null]
Rows: 1-100 of 439 | Columns: 14

To drop unnecessary columns from your vDataFrame, use the `drop() <https://www.vertica.com/python/documentation/1.0.x/html/verticapy.vDataFrame.drop.html>`__ method:

[19]:
vdf.drop(["name", "body"])
[19]:
123
pclass
Integer
123
survived
Integer
Abc
sex
Varchar(20)
123
age
Numeric(8)
123
sibsp
Integer
123
parch
Integer
Abc
ticket
Varchar(36)
123
fare
Numeric(12)
Abc
cabin
Varchar(30)
Abc
embarked
Varchar(20)
Abc
boat
Varchar(100)
Abc
Varchar(100)
111male0.9212113781151.55C22 C26S11
211female18.010PC 17757227.525C62 C64C4
311male80.0002704230.0A23SB
411female42.000PC 17757227.525[null]C4
511male25.0101196791.0792B49C7
611female45.000PC 17608262.375[null]C4
711female22.00111350555.0E33S6
811female[null]001777027.7208[null]C5
911female22.000113781151.55[null]S11
1011female64.002PC 1775683.1583E45C14
1111female54.0103694778.2667D20C4
1211male43.0101776527.7208D40C5
1311female22.0021356849.5B39C5
1411male23.001PC 1775963.3583D10 D12C7
1511female35.01011378952.0[null]S8
1611female[null]101746451.8625D21S8
1711male42.0101175352.5542D19S5
1811female45.0101175352.5542D19S5
1911female16.001PC 1759239.4D28S9
2011female21.0001350277.9583D9S10
2111male36.000PC 1747326.2875E25S7
2211male[null]00F.C. 1299825.7417[null]C7
2311female33.000PC 1761327.7208A11C11
2411female48.013PC 17608262.375B57 B59 B63 B66C4
2511female23.0102122882.2667B45S7
2611female31.00236928164.8667C7S8
2721female48.002C.A. 3311236.75[null]S14
2821female30.010SC/PARIS 214813.8583[null]C12
2921female34.00024388013.0[null]S12
3021female24.00225064914.5[null]S4
3121female48.01222084565.0[null]S9
3221female3.012SC/Paris 212341.5792[null]C14
3321male3.01123008026.0F2SD
3421female2.0112636026.0[null]S11
3521female18.00225065213.0[null]S16
3630male27.010265914.4542[null]CC
3731male25.0003481227.65F G63SA
3831male24.000S.O./P.P. 7527.55[null]SD
3931female22.01034707213.9[null]S16
4031female[null]00A. 2. 391868.05[null]SC
4131male21.0003500347.7958[null]SA
4231male4.01134774211.1333[null]S15
4331female27.00234774211.1333[null]S15
4431male21.0003500537.7958[null]S13
4531male39.00134925613.4167[null]C15
4631female26.01131515322.025[null]S2
4731male25.00026547.2292F E57C10
4831female[null]00368667.7375[null]Q16
4931female[null]02266822.3583[null]CD
5031female[null]003427128.05[null]SC
5131female21.0003431207.65[null]SC
5211female63.0101350277.9583D7S10
5311female58.00011378326.55C103S8
5411female41.00016966134.5E40C3
5511female53.000PC 1760627.4458[null]C6
5611female58.001PC 17755512.3292B51 B53 B55C3
5711male11.012113760120.0B96 B98S4
5811female76.0101987778.85C46S6
5911male48.010PC 1757276.7292D33C3
6011male27.000PC 1757276.7292D49C3
6111female24.0001176783.1583C54C7
6211female52.0111274993.5B69S3
6311female44.00111136157.9792B18C4
6411female15.00124160211.3375B5S2
6511female39.000PC 17758108.9C105C8
6611female22.00111350961.9792B36C5
6711male52.00011378630.5C104S6
6811male45.01116966134.5E34C3
6911female40.01116966134.5E34C3
7011female35.000PC 17755512.3292[null]C3
7111male21.001PC 1759761.3792[null]CA
7220male34.0103102721.0[null]S12
7321female19.0002840413.0[null]S12
7421female25.01123778930.0[null]S12
7521male8.011C.A. 3311236.75[null]S14
7621female28.00023766813.0[null]S13
7721female29.010292626.0[null]S16
7821female7.002F.C.C. 1352926.25[null]S14
7921female29.0102670726.0[null]S10
8021female24.01024436726.0[null]S12
8121male1.002S.C./PARIS 207937.0042[null]C10
8221female28.00023043413.0[null]S9
8321female30.00023481812.35[null]Q13
8421female29.01022841426.0[null]S10
8521female0.9212C.A. 3465127.75[null]S10
8621female24.010SC/PARIS 216727.7208[null]C12
8731female16.0003481257.65[null]S16
8831female18.0013920919.35[null]S13
8931female5.04234707731.3875[null]S15
9031female30.00036451612.475[null]S13
9131female[null]003356777.75[null]Q13
9231male[null]0034707.8875[null]SC
9331female[null]00143137.75[null]QD
9431female[null]00143127.75[null]QD
9531female4.00231515322.025[null]S2
9631female[null]003309327.7875[null]Q13
9731male6.00139209612.475E121S14
9831female[null]00143117.75[null]QD
9931male9.001C 173683.1708[null]S13
10031male32.0003470797.775[null]SA
Rows: 1-100 of 439 | Columns: 12

The dropped columns are excluded from the relation’s generated SELECT query:

[20]:
print(vdf.current_relation())
(
   SELECT
     *
   FROM
 (

   SELECT

                    "pclass",
     "survived",
     "sex",
     "age",
     "sibsp",
     "parch",
     "ticket",
     "fare",
     "cabin",
     "embarked",
     "boat",
     "home.dest"

   FROM
 "public"."titanic")
VERTICAPY_SUBTABLE
   WHERE (boat IS NOT NULL))
VERTICAPY_SUBTABLE

8. Maximize your resources#

Large datasets often contain hundreds of columns. These datasets require VerticaPy to compute many concurrent, resource-intensive aggregations. To limit the impact of these aggregations, you can control the number of queries that VerticaPy sends to the system, which allows for some useful optimizations.

In the following example, we’ll explore a couple of these optimizations. First, generate a dataset:

[21]:
from verticapy.datasets import gen_dataset

vp.drop("public.test_dataset", method= "table") # drop an existing table with the same schema and name
features_ranges = {}
for i in range(20):
    features_ranges[f"x{i}"] = {"type": float, "range": [0, 1]}
vp.drop("test_dataset", method = "table")
vdf = gen_dataset(features_ranges, nrows = 100000).to_db("test_dataset",
                                                        relation_type = "table",
                                                        inplace = True)
display(vdf)
123
x0
Float(22)
123
x1
Float(22)
123
x2
Float(22)
123
x3
Float(22)
123
x4
Float(22)
123
x5
Float(22)
123
x6
Float(22)
123
x7
Float(22)
123
x8
Float(22)
123
x9
Float(22)
123
x10
Float(22)
123
x11
Float(22)
123
x12
Float(22)
123
x13
Float(22)
123
x14
Float(22)
123
x15
Float(22)
123
x16
Float(22)
123
x17
Float(22)
123
x18
Float(22)
123
x19
Float(22)
17.07397703081369e-050.07877404964528980.1153182238340380.6752608621027320.4542569839395580.6124411204364150.7105870088562370.6561745149083440.9296241477131840.7948536295443770.2874974934384230.6935442034155130.2084374753758310.7560207988135520.7720011663623150.8929917686618860.2486708727665250.4915250709746030.661702823592350.0471792332828045
28.221878670156e-050.6648881160654130.0839720456860960.5686296273488550.5729661611840130.7837509154342110.5602528662420810.3172041142825040.06373731908388440.1341307237744330.4585693576373160.7890609093010430.6940810552332550.6847826754674320.7282413388602440.1480848342180250.6750277746468780.226946226088330.6775560651440170.687195596750826
30.0001620841212570670.6667805188335480.4711311224382370.4890336103271690.4136137117166070.4503820145037030.7215517149306830.9123797698412090.4750800495967270.7869811349082740.9407507912255820.506424774182960.6530000218190250.2575367379467930.3514188267290590.3563831760548060.4420618321746590.5446537001989780.9337889354210350.261729783844203
40.0001726956106722350.5384483421221380.01146661932580170.153145421994850.681785748573020.3222061065025630.8243096303194760.6326143012847750.4087818753905590.719362456817180.213507274165750.6620199070312080.1463084509596230.5376079888083040.4162171040661630.2267782613635060.5502717767376450.4400388919748370.9067093154881150.630714946193621
50.0002159969881176950.4400471951812510.4031629106029870.6497709564864640.7594153403770180.63694315101020.6621496244333680.9444839747156950.4111184068024160.5063815249595790.9687998844310640.936602385714650.156494198599830.5679548652842640.07997613959014420.2856890887487680.8107975427992640.1251721174921840.635263481410220.690739138750359
60.0002353165764361620.7317020690534260.04546610289253290.1829088474623860.7785325648728760.2654175625648350.557341507403180.4828400539699940.193791522411630.7940330954734240.3473271315451710.9891776228323580.5930367941036820.1339902626350520.2411749383900310.7473492920398710.5091022562701260.4276786160189660.3925929842516780.0654848476406187
70.0002792491577565670.284371868940070.6412818422541020.3815257935784760.9250383579637860.3352852652315050.7748838800471280.5202900019939990.8901241209823640.8671102754306050.5981767345219850.9163475411478430.1920460953842850.08584471186622980.2172701905947180.0976299787871540.7566923985723410.005570597713813190.8506084028631450.454450654564425
80.0003046994097530840.6333777890540660.4929393259808420.2036267879884690.9425555160269140.6578577489126470.6054965362418440.6262922531459480.9316967972554270.984506717650220.491902447538450.6289412323385480.890456074615940.4368127202615140.5619298936799170.2973664165474470.1528352838940920.2674631138797850.862993245711550.923588463105261
90.0003283589612692590.8672089392784980.2696657190099360.3734769260045140.2030311671551320.1230035694316030.4637407488189640.243284490657970.7718185847625140.569569985615090.8767262580804530.8763857136946170.8818942264188080.4203205592930320.6577974869869650.4127495770808310.4471135244239120.2111058258451520.6670033435802910.43986318516545
100.0003659001085907220.5968441551085560.6745763560757040.919476191513240.3798278076574210.8997017820365730.520251058274880.8099163812585180.4738839156925680.2137054402846840.23602720675990.1395479938946660.08020767220295970.347405797103420.05418308381922540.7453130234498530.5480757588520650.317194219445810.9193351885769520.320289980620146
110.0003944626078009610.2093074729200450.3271990926004950.9243874503299590.01602093107067050.7537602765951310.8228401897940780.496803313959390.1715452461503450.2519651288166640.7931029107421640.7103936278726910.2326787721831350.3477552388794720.7590841888450090.428392478730530.6739359376952050.5271267327480020.1065559962298720.263772482052445
120.0004680571146309380.4609816754236820.4034601694438610.129623898537830.8231869512237610.5718669735360890.3091028896160420.1769983614794910.7669890248216690.7058473101351410.5283251418732110.830625299829990.1840901537798350.2474289899691940.3065344800706950.589368179906160.3655563904903830.8348927900660780.5617412740830330.0581626649945974
130.0004901168867945670.9283136839512740.5826172397937630.6891459771431980.03574852785095570.3166500765364620.5382844228297470.5455831792205570.5007519549690190.1523444871418180.2919969747308640.6486164487432690.2376650064252320.4262518314644690.662771928356960.9764687456190590.847581668058410.3232371583580970.6109740056563170.179504566593096
140.0005591004155576230.6463184333406390.7949182335287330.1303818484302610.2293518250808120.426639793673530.6714905875269320.7871370308566840.1462171715684240.9966175472363830.294231793843210.1164074207190420.4788599370513110.6740834750235080.7758612975012510.296238379320130.03220021980814640.9795077452436090.542473802110180.165981483878568
150.0008138185366988180.4011596383061260.1227919422090050.08114911126904190.3714587416034190.9895937822293490.1712497535627340.4221674094442280.6366362341213970.393388782395050.9310372215695680.5893300620373340.7390326652675870.7179708802141250.2918292572721840.5255002907942980.8244493827223780.1178009966388340.276864989427850.539769044611603
160.0008810376748442650.9265634515322740.494036366231740.6067772612441330.7702391620259730.9223348409868780.2937886384315790.6590728349983690.3961479899007830.5578680317848920.5792188085615630.5567562112119050.7671202332712710.4603900746442380.5580314476974310.6982431723736230.06559630460105840.2412761605810370.09796914062462750.739939774386585
170.0009041477460414170.007137423846870660.8872799840755760.4034919915720820.5303414571098980.6108615179546180.334974791854620.6115005137398840.2765842820517720.430974552175030.3387594218365850.9856810667552050.6582706002518530.5006260732188820.02878062985837460.2447634011041370.4931469557341190.3588231555186210.5586651603225620.199766838224605
180.0009115838911384340.7739906522911040.9202831038273870.1082305992022160.3822680406738070.6430238939356060.8815408113878220.8139240364544090.7132289460860190.2185454808641230.6857873701956120.8357336551416670.678335382835940.8384693963453170.1359017093200240.131116725970060.7587146498262880.4634919106028970.613356182817370.174273746088147
190.0009417273104190830.002066895598545670.7469669163692740.7830180586315690.7921911417506640.9339568349532780.7054155804216860.4666540913749490.1962424591183660.3643954317085440.9961396639700980.8170204525813460.2563313774298880.3230855450965460.7374418179970230.3242958120536060.2040352877229450.781076212646440.3700234089046720.299085040576756
200.001030249753966930.4286277242936190.781493981368840.3333100106101480.4713240975979720.1788055519573390.1656253312248740.09992362326011060.4979064681101590.4785474434029310.7928941014688460.4092828277498480.3480934204999360.7005315120331940.5096468303818260.9437773094978180.969232858857140.4782935958355670.2558953787665810.0177682114299387
210.001086011528968810.9414621843025090.2721633787732570.1858710171654820.5095204226672650.4627051162533460.7707952130585910.8558782467152920.09986701840534810.1655370595399290.5070035615935920.311515928013250.9658564240671690.04877966130152340.1973274326883260.2046436388045550.9713295469991860.1689893882721660.9391094122547660.722158912802115
220.001086952397599820.7771546409931030.6201553707942370.8092436729930340.71934319450520.2373891102615740.3083369357045740.8283767935354260.2479330475907770.2199117969721560.8190041983034460.3799112946726380.5292486811522390.8999069700948890.4738819529302420.8002739457879220.2435865455772730.9497312735766170.4457050384953620.841658351710066
230.001153497956693170.1200157548300920.9167471213731910.6052907311823220.2006410397589210.8349487297236920.0862778781447560.1294359508901830.5680091332178560.3831862497609110.7371675160247830.8096594510134310.7094166667666290.5884575841482730.9553204160183670.7155343145132060.02721802517771720.7485231203027070.7410116894170640.817204979015514
240.001248366897925730.4040066732559350.7291888762265440.1349052926525470.6384471470955760.03265989106148480.8758259399328380.5181269776076080.2840117011219260.9444753797724840.2700626994483170.4555648167151960.06991510256193580.7794028334319590.618732782080770.8681165683083240.5221230597235260.2653734169434760.9610124118626120.252335847122595
250.001610405975952740.7975319745019080.7279533001128580.6203686336521060.8271857115905730.9663988770917060.144679339602590.2858438789844510.4828842356801030.5657719213049860.5867445152252910.1390162485186010.648238775087520.182607058901340.5440854183398190.1593206033576280.8457864699885250.4015542445704340.589504296425730.828654535580426
260.001803881954401730.1553392542991790.7927257940173150.7567881948780270.4454290373250840.1634536557830870.5292770890519020.8551685109268870.2654604071285580.07139533874578770.1070842158515010.2121334706898780.8628284127917140.7290055940393360.9586292582098390.9771734061650930.3231388896238060.1507345701102170.6638918910175560.884100564057007
270.001867284299805760.7463741512037810.3163660389836880.9973309375345710.7486607844475660.1147949281148610.6231782827526330.5740271189715710.4084526684600860.8488815156742930.4317485471256080.5429892141837630.5885088851209730.4409576421603560.01727158133871850.630884215235710.2706926751416180.8443240835331380.2798793581314380.325400947825983
280.001871123677119610.0742157960776240.2095016597304490.876498397905380.5601549069397150.1325931877363470.8102295587304980.9643948951270430.4147517506498840.4048937861807640.183485270012170.03457706840708850.07407003175467250.368325272109360.6715244471561160.0461713890545070.5539258709177370.7386482462752610.1613849601708350.615539412479848
290.001950400182977320.80877356743440.9016277592163530.4778178920969370.2127133691683410.3073376698885110.08836742653511460.6243260477203880.991469438886270.02357978606596590.3620478105731310.7088777604512870.9643862005323170.3758612389210610.8695142290089280.270654795225710.8168892334215340.295244265813380.4096764251589780.599162093829364
300.002094842493534090.9877637287136170.840272844536230.2122102235443890.456317471805960.1342946456279610.005333700217306610.552604393567890.7899275845848020.09726217272691430.09623084799386560.4719115195330230.3448683971073480.9472553469240670.3307236109394580.4165635891258720.6806030501611530.5526431340258570.1280440513510260.571059670532122
310.002144343685358760.5870252784807240.3582697152160110.8558626717422160.1429126095026730.6565680021885780.1312651967164130.3050693103577940.6552899847738440.5586987172719090.6086582900024950.5579654041212050.5463692420162260.04976254212670030.1551410553511230.5753876506350930.6492786032613370.4062519865110520.1465648820158090.0264485760126263
320.00215808209031820.05820283340290190.9087303208652880.1970449027139690.1814424097537990.1172336395829920.7073062965646390.03320668544620280.7339299016166480.3135474254377190.5674574540462340.2521095387637620.4764627614058550.7476794526446610.3942151213996110.75739075220190.5225564101710920.5776537496130910.2154566207900640.139164109481499
330.002171057974919680.07179940189234910.8460763557814060.6136712841689590.2852755293715750.8879132168367510.2042209983337670.02935216343030330.7676972155459230.3098810440860690.4578389017842710.02184843365103010.5621871768962590.946808510692790.9955433984287080.7910824613645670.06646316847763960.2415659297257660.4070181050337850.727160071954131
340.002340219449251890.369677385548130.638962932396680.4570299764163790.4745355132035910.7179518179036680.3899280996993180.8436242921743540.8908866078127180.448732718592510.1595422765240070.5378496337216350.5264013526029880.3204450437333430.9121873360127210.1524774008430540.5850337618030610.9182516040746120.1701701709534970.31046407786198
350.002517029643058780.4449636694043870.4336169003508990.4355254559777680.1195589886046950.452428118325770.6056388122960930.5730708716437220.9555727380793540.5652082755696030.09544367203488950.2620731438510120.4076624924782660.3826974795665590.08076002914458510.5483375024050470.03845088044181470.7804163994733240.7527668399270620.698577198199928
360.002529185032472010.4368200621102010.2293610472697760.6903724726289510.6755829220637680.03903506486676630.266577586764470.3514765286818150.7686572223901750.6154173982795330.2313990213442590.6340817387681450.5195277535822240.4993936591781680.1763335186988120.2804578989744190.09207327174954120.4247661645058540.9125889611896130.0552232270129025
370.002597708720713850.5964806119445710.8340012910775840.6707212966866790.9806873821653430.1786078419536350.2438393014017490.7197434476111080.7464921886567030.686494118999690.6824283234309410.4962953156791630.1160687133669850.105974054429680.7724182545207440.9415025224443530.9034449039027090.5626856179442260.6383370945695790.818067149724811
380.00279400218278170.2649162150919440.9585744242649530.07642312394455080.3750318598467860.0330554100219160.4730609150137750.5578901143744590.314439625013620.4067005089018490.8577672352548690.02989693311974410.274193644290790.9186171912588180.7172835841774940.5935409083031120.3218080345541240.3960973760113120.2565508824773130.683786167064682
390.003086724318563940.964104769984260.2663607308641080.6981604851316660.8488270596135410.06067140214145180.03561153379268940.8653280755970630.981017686659470.3338254066184160.5185362873598930.6611985040362920.534481774084270.3333664815872910.6292025262955580.8774306206032630.5182664257008580.7396945995278660.4464434226974840.372738287318498
400.00313096120953560.7861748915165660.8770052599720660.07513394439592960.8318898170255120.914503177162260.7915487820282580.3389911141712220.003894808702170850.8301664639730010.148216625908390.7757795501966030.5765532338991760.6991808861494060.9397738447878510.6680839345790450.8960937396623190.4318031093571340.1245449485722930.574425296392292
410.003202525433152910.3174970173276960.004920337582007050.2517167956102640.2382801580242810.4117841720581050.4455235439818350.8631600060034540.108301109634340.2534896046854560.6372779519297180.06642547412775460.7356037844438110.1355147543363270.170751432888210.3825053304899480.1593004323076460.7785171186551450.2297982925083490.412966939387843
420.003448395058512690.413655492709950.2625660765916110.4556435626000170.8955062599852680.002086988184601070.7932846162002530.8529269623104480.932549329940230.04261976666748520.73859735741280.06277163862250750.7785981688648460.4239277620799840.5374356908723710.6283658845350150.6674970320891590.1494281147606670.04228210588917140.239791556727141
430.003553469432517890.4788145080674440.8703609656076880.5070973956026140.504853219259530.1696375804021950.9016017222311350.9885379802435640.2513478887267410.6105138270650060.3916420594323430.91734397970140.09305033669807020.6581494666170330.7725424694363030.1443675083573910.2985285003669560.03630435536615550.8721988461911680.220158252399415
440.003597657196223740.9288728905376050.993160058278590.7959610181860630.4950930823106320.38912802352570.04680059826932850.1131138226483020.1545803535263990.725204783491790.910362730734050.7907025339081880.3886809921823440.4265848724171520.7442553527653220.6835465780459340.7656036405824120.6956568358000370.2839178270660340.419625357026234
450.003609212348237630.5318422915879640.7247284774202850.6091362871229650.4461788504850120.4539194251410660.9234140608459710.1137903789058330.723430318292230.09669357980601490.07345772720873360.07682447554543610.2760321390815080.2816807271447030.6174800193402920.03395511466078460.9442914600949730.2512667358387260.237116137985140.634889496024698
460.003650795435532930.8176153914537280.5209338266868140.2263146792538460.5050935233011840.5582824703305960.309546868782490.1234599589370190.5688572225626560.4371372850146140.6921133226715030.8907890038099140.401330098975450.5220173639245330.4896311245393010.09963446157053110.6725588855333630.2025369964540.01334944088011980.536575390025973
470.003744959365576510.5050606941804290.5092722377739850.1781008883845060.7822536409366880.8266235277988020.7857024071272460.13627204275690.3611392052844170.7243974986486140.03702210262417790.4761823860462750.409568716539070.84566654288210.799639965407550.75503996736370.2091254382394250.7622273685410620.3375619107391690.912123567424715
480.003778848564252260.9678140541072930.9694408993236720.8816498315427450.6900311552453790.8658353497739880.1226138994097710.9026686819270250.07482217228971420.9247402455657720.8803058776538820.2991064870730040.4671945984009650.3903353633359070.388668217929080.6040874510072170.1955865283962340.9528612985741350.07048114039935170.832441444741562
490.003813638351857660.09413760295137760.5746947145089510.5504496057983490.03867527423426510.2344827214255930.3822341572958980.6070239804685120.8845215451437980.8727396256290380.9775359097402540.9137877887114880.3593785683624450.9492936749011280.08198456862010060.9863604370038960.705099420156330.7601602594368160.1090008232276890.112887593684718
500.003970798105001450.2120734371710570.482117498759180.8836846330668780.3976403935812410.4872277944814410.009200384374707940.5029920565430070.3733184740412980.4067489036824550.07703420682810250.9243261474184690.4760439782403410.3089954408351330.3545355154201390.04247418325394390.0572777437046170.9880129371304060.5110290388111030.413184882607311
510.004037956940010190.6192527564708140.05783590255305170.1434908139053730.2364465191494670.9883203268982470.3852048979606480.5022035241127010.7374514765106140.08109654625877740.3119495071005080.1251186626031990.198850654298440.8502927036024630.5901104430668060.9800884120631960.228867464466020.4496725029312070.3217220967635510.237496974412352
520.00418024952523410.3775759858544920.1728449519723650.07092141662724320.5451189300511030.3863556163851170.2216945139225570.03835969255305830.4964671293273570.4450880098156630.5577457770705220.6938553589861840.08054886362515390.5467236188706010.1762080143671480.1877498917747290.2208000938408080.5155055860523130.7818089877255260.0498490098398179
530.004257518565282230.08416409837082030.03876576689071950.778949546394870.01182653848081830.7130904169753190.4580256652552630.740479137981310.8632106161676350.5214727022685110.8101105764508250.7531101312488320.01281059905886650.9707384044304490.8056989016477020.3537945763673630.7992843454703690.8784461943432690.8843300896696750.939427450764924
540.004260930931195620.1535476667340840.3815341920126230.3217585585080090.4279153011739250.6136343139223750.6541816908866170.4175385024864230.4546308927237990.9658519800286740.3367111049592490.8538232492282990.2681402801536020.472033125581220.8639935706742110.511306356638670.5935863482300190.8842927271034570.3704391336068510.706301348749548
550.004285089904442430.1052987833973020.6447063295636330.6591744227334860.68945556320250.727034532232210.2551431474275890.05822691414505240.001466357847675680.02121533988974990.2259228008333590.3177632188890130.2754774671047930.4800498546101150.09778671991080050.6860377418342980.7588678230531510.01992730586789550.8486144456546750.805880650877953
560.004314795834943650.04249198222532870.05029107234440740.5963766551576550.571943704737350.8927945585455750.1847181301563980.7874837156850840.9908031697850670.5904957614839080.4647883581928910.09218320832587780.6340979500673710.8301709496881810.02537186443805690.5357874019537120.204510939074680.5128243835642930.07163906656205650.93218103912659
570.004334465367719530.5243807141669090.9702655633445830.3593138149008150.4134423269424590.06284308433532710.07320887013338510.6940099732019010.2018279295880350.7476987075060610.03705122251994910.3752217593137170.3515359833836560.2622137230355290.9669457250274720.08814118290320040.09835250629112120.8738087394740430.8948390115983780.327558801043779
580.004466539248824120.8435365429613740.1053647110238670.1181607511825860.9406036431901160.07496634568087760.4793849622365090.2430991851724680.4969062951859090.2946733750868590.6867752270773050.9902224333491180.7400342265609650.6706428497564050.4065235897433010.8433848612476140.6727279436308890.1658350641373540.7467544123064730.965748598799109
590.00460221804678440.9566664821468290.08052973286248740.2297713663429020.2516176335047930.3887864120770250.4647466894239190.9042253473307940.2892629313282670.7441575226839630.2768418791238220.748206236166880.6030384211335330.2339640965219590.07400055113248530.1406865660101180.1411284694913770.9385265717282890.4585829682182520.587838252540678
600.004611603450030090.4738137356471270.5812280816026030.003701620036736130.4591149126645180.7118670919444410.9965958371758460.6211845241487030.354009845992550.03049484151415530.6162856253795330.0002791306469589470.12457592645660.4068909110501410.3104927793610840.8044010135345160.7745326191652570.1142027722671630.7036353293806310.452991388272494
610.004617329686880110.4653845424763860.3932974792551250.8482184426393360.5359704666770990.3438994886819270.2585604081396010.1543499315157530.08166760997846720.7107677177991720.3814137643203140.386389262275770.2824773816391830.9469399091321970.2942807604558770.9211654064711180.5031425007618960.4324066212866460.3259630226530130.356749014696106
620.004768455401062970.2581061017699540.5474726711399850.01212815032340590.9440412856638430.2217970881611110.4770634444430470.5252131849993020.9825390912592410.5907246749848130.7315594181418420.08037684671580790.005127043928951030.5647509007249030.1386841833591460.3467629803344610.9363255093339830.945011158473790.5659636738710110.942184565588832
630.0049478542059660.4190567454788830.5135059168096630.3255714443512260.07823360338807110.9291127158794550.2297730543650690.2738747945986690.3973759254440670.6036888565868140.1887171950656920.7573745148256420.6720739621669050.6934709905181080.5110726298298690.7776803646702320.9516130015254020.8814058571588250.8257314891088750.516175739932805
640.005383339012041690.3222980168648060.694103368790820.6844825865700840.4775101589038970.1195163798984140.304688035277650.3999204705469310.2663706350140270.3257699478417630.08392708469182250.6844694083556530.4009785277303310.4015939256642010.7789541720412670.1336278836242850.08076266059651970.3804366176482290.7349213773850350.674469094490632
650.005451381439343090.3067297590896490.7985118690412490.0910410881042480.1459957270417360.9531733193434770.3081440082751210.8693393866997210.2950382244307550.657827433198690.4955363059416410.08724596514366570.9180301076266910.5088555715046820.8374911502469330.9498527783434840.5883395983837540.12699149036780.5578779601491990.950301753124222
660.005465235561132430.9538956417236480.6249903894495220.9141649694647640.9246979644522070.7247425243258480.8806993872858580.4076535892672840.1907426936086270.1511879472527650.2350005463231350.2532560301478950.5306984693743290.3243033329490570.2000286881811920.4565868985373530.8489666273817420.07863575592637060.3863558794837450.205620076041669
670.005468504503369330.2834986879024650.4417957689147440.322218870976940.2873371422756460.2093647138681260.8810160292778160.3843267399352040.948357681510970.8595537403598430.4589664479717610.6708835207391530.2387658033985640.3618011558428410.1801795556675640.02725105243735020.3697552308440210.9478662761393930.7476217057555910.136986246798187
680.005481878062710170.4758412914816290.5276737122330810.04501033178530630.8842718196101490.7788341636769470.6237337694037710.3633044632151720.8745214915834370.4296813535038380.2534727796446530.5807035889010880.6555497727822510.1830505714751780.8026759584899990.1887290847953410.3524477663449940.7404646533541380.7401042117271570.0518087858799845
690.005499323597177860.2413373512681570.2992869412992150.5195123588200660.837257654406130.2727870298549530.5568201504647730.01223883521743120.5158471334725620.7023818413726990.6703928117640320.8254386004991830.1060377385001630.5643459840212020.7119192010723050.6420841012150050.1352553765755150.3973159834276880.01129247830249370.618799487128854
700.005537980236113070.6077269655652340.9858781790826470.002356817945837970.04893954889848830.9894795694854110.2943600625731050.7789908929262310.793302083155140.07198728783987460.6477206449490040.7825897508300840.286200684495270.107293807202950.09406306291930380.2565616776701060.5985216437838970.1982203603256490.9494723337702450.823433440411463
710.005600504344329240.8414744739420710.7283136907499280.9244166121352460.4071945864707230.9900370838586240.1990750364493580.4879471259191630.4678318277001380.3150167705025520.6578353771474210.01991134486161170.2089198029134420.7545928938779980.08295039599761370.6288899146020410.6397023873869330.5587892297189680.04848696640692650.456474096514285
720.006013850215822460.2503592488355930.06700821570120750.6255058271344750.2947537640575320.4203913863748310.6141961452085520.1603465520311150.1047031707130370.784403979778290.654774977127090.3202331201173370.351021583192050.409439769806340.562854823656380.4402416662778710.6824166250880810.9217233164235950.2914798723068090.103472447255626
730.006044936832040550.08374862675555050.6016025771386920.3982139569707210.09314735815860330.7576795725617560.5600571099203080.9411567545030270.4296735539101060.5463258784729990.1421564030461010.1905283520463850.3113634528126570.7686751044820990.3113631061278280.8056186304893340.557487575104460.4414084127638490.8715493734925990.0695229934062809
740.00604530726559460.3007229370996360.7901510975789280.206368834245950.7415325536858290.7818861852865670.1432611674536020.4462499131914230.8475850783288480.1467603261116890.5939403704833240.2756384175736460.8117242951411750.9603878329508010.2935318241361530.9336955221369860.2982250784989450.3625553098972890.2652045516297220.072734983637929
750.006189862731844190.8056139822583650.04227828350849450.765194190898910.7477560653351250.1535569324623790.9034753662999720.7895245528779920.1240387104917320.5043458286672830.03742268774658440.5784134850837290.78557081008330.6187234078533950.494291883427650.6601202688179910.9566984972916540.4463745381217450.8277214693371210.965238370699808
760.006229224149137740.7547141998074950.2408748872112480.8465436112601310.4160601210314780.5017652877140790.4933459120802580.9322343706153330.9985494604334240.5378869690466670.9860694089438770.9100137304048990.1965878456830980.597114454023540.9231521000619980.9050410392228510.1558276701252910.2353264081757520.6370327426120640.500663405284286
770.006561990361660720.41906222072430.923396613914520.2403874136507510.7770291927736250.1828976850956680.8392524251248690.1493914399761710.9189178540837020.1019204244948920.7926020119339230.8229293546173720.353801614604890.4278259631246330.8834981406107540.8086628112941980.998684200923890.3933794903568920.7903471083845940.678845811402425
780.006643486674875020.004593580728396770.5465728575363760.4085107885766770.6523006292991340.01097828522324560.1489908066578210.6067021493799980.6924028212670240.5951407074462620.6245870813727380.4190663676708940.7323200940154490.3591926982626320.3597625498659910.05792184430174530.7565886753145610.3105649841018020.115321787539870.183660146314651
790.006788122467696670.3627009452320640.3595028745476160.5155221328604970.03480108221992850.955952863441780.3042440128047020.872558145318180.7025963282212620.007321126526221630.6394133754074570.3660686512012030.2557427755091340.06848904094658790.8240618673153220.4478658144362270.4918645219877360.3918729953002180.251386062009260.632435228675604
800.006856533233076330.5413665072992440.8545941170305010.07848321553319690.1681987040210520.1204679247457530.916197077371180.5917812960688020.5288742270786320.4254733521956950.7997698029503230.3251796341501180.7513479811605070.4307651896961030.3108421333599840.9558873805217440.1976608920376750.702562335645780.1002422128804030.237061544088647
810.006858297856524590.07373466691933570.411380320787430.06988296331837770.6085483022034170.719468860421330.804645629366860.9452064044307920.1583399144001310.9389527284074570.2511207647621630.206517242360860.9193621380254630.419600330758840.1142745837569240.917382439132780.820504561997950.9741340107284490.624768038745970.238603635458276
820.007075178902596240.8708707783371210.2893571949098260.2595143283251670.6042674619238820.06845689145848160.5062927494291220.8415206873323770.4523497214540840.8126835436560210.2451386938337240.6054191121365880.8081262425985190.3522289439570160.354805332375690.002310215495526790.03586888895370070.08780255448073150.3362055986654010.73932251194492
830.007173753809183840.03420542250387370.807587275980040.1833177062217150.1365406119730320.2567449209745970.1412750442977990.92842239793390.4934532484039660.7355695588048550.4161060380283740.6045728740282360.02482883771881460.5826009013690050.8465324505232270.140659397002310.542918820166960.7171211044769730.04381753131747250.820026016095653
840.007290842011570930.7854899235535410.4452200562227520.553302443586290.5483371494337920.4813419417478140.06493959575891490.3116129375994210.4636834957636890.6289229062385860.5561178009957080.3474870899226520.4083902635611590.1713988410774620.7260016100481150.6772491447627540.3574708129744980.2510906860698010.5868530017323790.88893955363892
850.007497667334973810.6965317148715260.4224630964454260.1802869790699330.3531884569674730.8486363715492190.6796482813078910.3117123593110590.2657664888538420.6044662790372970.7874533946160230.9338366785086690.4499023356474940.7107741711661220.3502677499782290.4130020756274460.06264549493789670.1436652166303250.1501946502830830.804719494190067
860.007511694682762030.2857410239521410.3386634676717220.4941535380203280.2025138377211990.6546969879418610.463962862035260.8194982917048040.9386563496664170.4779137377627190.4113987202290450.734767837915570.8355945451185110.1756786205805840.1990210446529090.6315952611621470.4084877409040930.8049650886096060.287718153093010.328727405751124
870.007528548827394840.9135740059427920.3365050335414710.1796389559749510.9688672693446280.1588927940465510.4800281934440140.5128949666395780.857569736894220.6984730295371260.8237001074012370.03909468417987230.2850546604022380.4126297475304450.7330851291771980.1983459983021020.6466000555083160.7258471404202280.6463834100868550.0498906851280481
880.007572191301733260.01094036339782180.06339181074872610.4794439489487560.5457094048615550.378831125330180.1048788309562950.9721517807338390.9617145925294610.3879868879448620.9949666464235630.7150122974999250.8812532820738850.7777515761554240.03547958773560820.2121954881586130.9266894457396120.6602838889230040.3396789876278490.471689041471109
890.007669705431908370.5272847600281240.199341189116240.9215487593319270.07939597871154550.6832876044791190.08401980577036740.1182317605707790.6116018523462120.4608744876459240.1664863429032270.4255095503758640.5503201314713810.414873009081930.5203590646851810.4890402504242960.6081201920751480.699598323320970.1536636063829060.619109965860844
900.00769680971279740.3481483045034110.4873288059607150.104201813694090.9262032515835020.8361753404606130.4368455328512940.965383274247870.7947264187969270.6342928565572950.7965830760076640.6672776285558940.5500962848309430.5655248002149160.7090385975316170.3549244673922660.1565753717441110.264340971829370.1649550911970440.704713765298948
910.007796240039169790.6242674239911140.8797939047217370.4262805553153160.03802908421494070.758905764669180.5978091103024780.3612774999346580.8140185340307650.1701965916436170.09125735005363820.1437928574159740.3623681378085170.04036758048459890.983339033788070.7706816501449790.6071240634191780.6822193926200270.06863331375643610.484713949495926
920.007799213752150540.1110189130995420.3099025364499540.6104642297141250.5618851224426180.1778270555660130.9039548221044240.6152232873719190.4121606005355720.2715556838084010.6060073222033680.6133711594156920.4095900908578190.1104192670900370.3182326215319340.179832681547850.08557388396002350.07863785931840540.5740286363288760.926056883763522
930.007944361539557580.08661173493601380.231991423992440.6349418945610520.7872276499401780.6302252330351620.9021531972102820.1996626013424250.5239800515118990.2719247576314960.797108907252550.5331403722520920.2139437829609960.3765942270401870.537741126492620.5087360132019970.4888184498995540.1896850126795470.4377293467987330.292704855790362
940.008068932686001060.8281326822470870.1775618267711250.6264622227754440.475377583177760.9240953738335520.4972231856081630.5429030805826190.5579851293005050.9077071766369040.6192916587460790.4419774601701650.5226190574467180.1123002143576740.5312305686529730.7080213387962430.3574945169966670.9011095191817730.7727415466215460.0914568158332258
950.008130611153319480.1373171121813360.7409094900358470.117984020849690.4374855563510210.8058635101187970.3871812082361430.1826867964118720.7242049623746420.6669244817458090.5798423069063570.4607310795690860.7899491330608730.973523156950250.3574600829742850.7951441006734970.08330991724506020.6303770593367520.4163959207944570.294484910788015
960.008142651524394750.117872715927660.1502070038113740.9277522366028280.1527697097044440.9847227169666440.0338233779184520.4824687615036960.3400827415753160.6506523983553050.4174056833144280.5629316261038180.4794464765582230.4250178281217810.2089579263702030.8206240651197730.7630779619794340.377142477547750.6906849704682830.825754955410957
970.008146135369315740.1319755779113620.2345672256778930.7759604200255130.03717216546647250.7154275211505590.833601339720190.3523243465460840.9831076376140120.8186273658648130.7080070846714080.2400824497453870.5459348766598850.01600568951107560.8031378942541780.4536500896792860.3777196053415540.268442398402840.3385502852033820.809020956512541
980.008275314932689070.2768670872319490.4713209383189680.9815425248816610.6483978184405710.7382907182909550.9458923763595520.3518391461111610.3974149751011280.09670283342711630.4667769721709190.4190063211135570.8228650349192320.1295136185362940.6980733098462220.8958244554232810.7903038049116730.7187578151933850.642975697992370.677666405215859
990.008299917681142690.9388954644091430.3523537600412960.128516985801980.2709596017375590.4662165052723140.107633716659620.8638997350353750.4460319704376160.3180852197110650.07166401646099980.7125001999083910.3459397095721220.2616338157095020.4294797473121430.3757710298523310.009930672822520140.01337036234326660.4682062640786170.0531278869602829
1000.00832824432291090.3746900698170070.9194654675666240.5583141054958110.2894401110243050.4551594844087960.529757519951090.2790299325715750.2766563608311120.5048094671219590.7269471234176310.4425822261255230.8582131187431510.3140726478304710.5989370292518290.5810301357414570.2627358650788660.09383108397014440.2125637636054310.545174148632213
Rows: 1-100 | Columns: 20

To monitor how VerticaPy is computing the aggregations, use the set_option() function to turn on SQL code generation and turn off cache:

[22]:
vp.set_option("sql_on", True)
vp.set_option("cache", False)

VerticaPy allows you to send multiple queries, either iteratively or concurrently, to the database when computing aggregations.

First, let’s send a single query to compute the average for all columns in the vDataFrame:

[23]:
display(vdf.avg(ncols_block = 20))

Computing the different aggregations.

  SELECT
    /*+LABEL('vDataframe.aggregate')*/ AVG("x0"),
    AVG("x1"),
    AVG("x2"),
    AVG("x3"),
    AVG("x4"),
    AVG("x5"),
    AVG("x6"),
    AVG("x7"),
    AVG("x8"),
    AVG("x9"),
    AVG("x10"),
    AVG("x11"),
    AVG("x12"),
    AVG("x13"),
    AVG("x14"),
    AVG("x15"),
    AVG("x16"),
    AVG("x17"),
    AVG("x18"),
    AVG("x19")  
  FROM
"public"."test_dataset"  
  LIMIT 1
avg
"x0"0.499510914725466
"x1"0.500098380795368
"x2"0.498650639873019
"x3"0.498249878197482
"x4"0.498723634853531
"x5"0.49859746068794
"x6"0.499440802120219
"x7"0.500571392295512
"x8"0.500615366488232
"x9"0.500191712573576
"x10"0.500156461340457
"x11"0.500385406252143
"x12"0.499548987546798
"x13"0.498913883469834
"x14"0.500118768013034
"x15"0.499257680216099
"x16"0.499268517902286
"x17"0.499183649747421
"x18"0.498663500893123
"x19"0.498528332974443
Rows: 1-20 | Columns: 2

We see that there was one SELECT query for all columns in the vDataFrame. You can reduce the impact on the system by using the ncols_block parameter to split the computation into multiple iterative queries, where the value of the parameter is the number of columns included in each query.

For example, setting ncols_block to 5 will split the computation, which consists of 20 total columns, into 4 separate queries, each of which computes the average for 5 columns:

[24]:
display(vdf.avg(ncols_block = 5))
  0%|          | 0/4 [00:00<?, ?it/s]

Computing the different aggregations.

  SELECT
    /*+LABEL('vDataframe.aggregate')*/ AVG("x0"),
    AVG("x1"),
    AVG("x2"),
    AVG("x3"),
    AVG("x4")  
  FROM
"public"."test_dataset"  
  LIMIT 1
 25%|██▌       | 1/4 [00:00<00:00,  7.57it/s]

Computing the different aggregations.

  SELECT
    /*+LABEL('vDataframe.aggregate')*/ AVG("x5"),
    AVG("x6"),
    AVG("x7"),
    AVG("x8"),
    AVG("x9")  
  FROM
"public"."test_dataset"  
  LIMIT 1
 50%|█████     | 2/4 [00:00<00:00,  7.35it/s]

Computing the different aggregations.

  SELECT
    /*+LABEL('vDataframe.aggregate')*/ AVG("x10"),
    AVG("x11"),
    AVG("x12"),
    AVG("x13"),
    AVG("x14")  
  FROM
"public"."test_dataset"  
  LIMIT 1
 75%|███████▌  | 3/4 [00:00<00:00,  7.44it/s]

Computing the different aggregations.

  SELECT
    /*+LABEL('vDataframe.aggregate')*/ AVG("x15"),
    AVG("x16"),
    AVG("x17"),
    AVG("x18"),
    AVG("x19")  
  FROM
"public"."test_dataset"  
  LIMIT 1
100%|██████████| 4/4 [00:00<00:00,  7.50it/s]
avg
"x0"0.499510914725466
"x1"0.500098380795368
"x2"0.498650639873019
"x3"0.498249878197482
"x4"0.498723634853531
"x5"0.49859746068794
"x6"0.499440802120219
"x7"0.500571392295512
"x8"0.500615366488232
"x9"0.500191712573576
"x10"0.500156461340457
"x11"0.500385406252143
"x12"0.499548987546798
"x13"0.498913883469834
"x14"0.500118768013034
"x15"0.499257680216099
"x16"0.499268517902286
"x17"0.499183649747421
"x18"0.498663500893123
"x19"0.498528332974443
Rows: 1-20 | Columns: 2

In addition to spliting up the computation into separate queries, you can send multiple queries to the database concurrently. You specify the number of concurrent queries with the processes parameter, which defines the number of workers involved in the computation. Each child process creates a DB connection and then sends its query. In the following example, we use 4 ‘processes’:

[25]:
display(vdf.avg(ncols_block = 5, processes = 4))
avg
"x0"0.499510914725466
"x1"0.500098380795368
"x2"0.498650639873019
"x3"0.498249878197482
"x4"0.498723634853531
"x5"0.49859746068794
"x6"0.499440802120219
"x7"0.500571392295512
"x8"0.500615366488232
"x9"0.500191712573576
"x10"0.500156461340457
"x11"0.500385406252143
"x12"0.499548987546798
"x13"0.498913883469834
"x14"0.500118768013034
"x15"0.499257680216099
"x16"0.499268517902286
"x17"0.499183649747421
"x18"0.498663500893123
"x19"0.498528332974443
Rows: 1-20 | Columns: 2

Whether it’s best to send multiple queries iteratively, in parallel, or in one single query depends on your specific use case, and often on the structure of your dataset.