SQL Magic

In [ ]:
%sql -c sql_command -f input_file -ncols nb_of_cols -nrows nb_of_rows -o output_file

Executes SQL queries in the Jupyter cell.

Parameters

Name Type Optional Description
-c / --command
str
SQL Command to execute.
-f / --file
str
Input file to execute.
-ncols
int
Maximum number of columns to display.
-nrows
int
Maximum number of rows to display.
-o / --output
str
Output file. Use this option to export the result of the query to the CSV or JSON format.

Returns

vDataFrame : Result of the query.

Example

These examples demonstrate the following:

  • Setting up the environment.
  • SQL Magic.
  • Getting the vDataFrame of a query.
  • Using variables inside a query.
  • Limiting the number of rows and columns.
  • Exporting a query to JSON or CSV
  • Executing SQL files

Setting up the environment

If you don't already have one, create a new connection:

In [11]:
import verticapy as vp

# Save a new connection
vp.new_connection({"host": "10.211.55.14", 
                   "port": "5433", 
                   "database": "testdb", 
                   "password": "XxX", 
                   "user": "dbadmin"},
                   name = "VerticaDSN")

If you already have a connection in a connection file, you can use it by running the following command:

In [12]:
# Connection using the VerticaDSN connection
vp.connect("VerticaDSN")

Load the extension:

In [13]:
%load_ext verticapy.sql

Load a sample dataset. These sample datasets are loaded into the public schema by default. You can specify a target schema with the 'name' and 'schema' parameters:

In [14]:
from verticapy.datasets import load_titanic, load_iris
titanic = load_titanic()
iris = load_iris()

SQL Magic

Use '%%sql' to run a query on the dataset:

In [15]:
%%sql
SELECT 
    survived,
    AVG(fare) AS avg_fare,
    AVG(age) AS avg_age
FROM titanic
GROUP BY 1;
Execution: 0.006s
Out[15]:
123
survived
Integer
123
avg_fare
Float
123
avg_age
Float
1023.425595019157130.6420462046205
2152.300259333333329.3936572890026
Rows: 1-2 | Columns: 3

You can also run queries with '%sql' and the '-c' option:

In [17]:
%sql -c 'SELECT DISTINCT Species FROM iris;'
Execution: 0.006s
Out[17]:
Abc
Species
Varchar(30)
1Iris-versicolor
2Iris-virginica
3Iris-setosa
Rows: 1-3 | Column: Species | Type: varchar(30)

You can use a single cell for multiple queries:

In [18]:
%%sql
DROP TABLE IF EXISTS test;
CREATE TABLE test AS SELECT 'Badr Ouali' AS name;
SELECT * FROM test;
DROP
CREATE
Execution: 0.05s
Out[18]:
Abc
name
Varchar(10)
1Badr Ouali
Rows: 1-1 | Column: name | Type: varchar(10)

To add comments to query:

In [19]:
%%sql
-- Comment Test
/* My Vertica Version */
SELECT version(); -- Looking at the current Vertica version.
Execution: 0.005s
Out[19]:
Abc
Varchar(128)
1
Rows: 1-1 | Column: version | Type: varchar(128)

Getting the vDataFrame of a query

Results are stored in a vDataFrame, which is assigned to a temporary variable called '_'. As shown in the following procedure, you can assign this to a variable to save your results.

In [20]:
%%sql
SELECT
    age,
    fare,
    pclass
FROM titanic
WHERE age IS NOT NULL AND fare IS NOT NULL;
Execution: 0.007s
Out[20]:
123
age
Numeric(6,3)
123
fare
Numeric(10,5)
123
pclass
Integer
12.0151.551
230.0151.551
325.0151.551
439.00.01
571.049.50421
647.0227.5251
724.0247.52081
836.075.24171
925.026.01
1045.035.51
1142.026.551
1241.030.51
1348.050.49581
1445.026.551
1533.05.01
1628.047.11
1717.047.11
1849.026.01
1936.078.851
2046.061.1751
2127.0136.77921
2247.025.58751
2337.083.15831
2470.071.01
2539.071.28331
2631.052.01
2750.0106.4251
2839.029.71
2936.031.67921
3030.027.751
3119.0263.01
3264.0263.01
3337.053.11
3447.038.51
3524.079.21
3671.034.65421
3738.0153.46251
3846.079.21
3945.083.4751
4040.00.01
4155.093.51
4242.042.51
4355.050.01
4442.052.01
4550.028.71251
4646.026.01
4750.026.01
4832.5211.51
4958.029.71
5041.051.86251
5129.030.01
5230.045.51
5330.026.01
5419.053.11
5546.075.24171
5654.051.86251
5728.082.17081
5865.026.551
5944.090.01
6055.030.51
6147.042.41
6237.029.71
6358.0113.2751
6464.026.01
6565.061.97921
6628.527.72081
6745.528.51
6823.093.51
6929.066.61
7018.0108.91
7147.052.01
7238.00.01
7322.0135.63331
7431.050.49581
7536.040.1251
7655.059.41
7733.026.551
7861.0262.3751
7950.055.91
8056.026.551
8156.030.69581
8224.060.01
8357.0146.52081
8462.026.551
8567.0221.77921
8663.0221.77921
8761.032.32081
8852.079.651
8949.0110.88331
9040.027.72081
9161.033.51
9247.034.02081
9364.075.251
9460.026.551
9554.077.28751
9621.077.28751
9757.0164.86671
9850.0211.51
9927.0211.51
10051.061.37921
Rows: 1-100 | Columns: 3

Assign the results to a new variable:

In [21]:
titanic_clean = _
display(titanic_clean)
123
age
Numeric(6,3)
123
fare
Numeric(10,5)
123
pclass
Integer
12.0151.551
230.0151.551
325.0151.551
439.00.01
571.049.50421
647.0227.5251
724.0247.52081
836.075.24171
925.026.01
1045.035.51
1142.026.551
1241.030.51
1348.050.49581
1445.026.551
1533.05.01
1628.047.11
1717.047.11
1849.026.01
1936.078.851
2046.061.1751
2127.0136.77921
2247.025.58751
2337.083.15831
2470.071.01
2539.071.28331
2631.052.01
2750.0106.4251
2839.029.71
2936.031.67921
3030.027.751
3119.0263.01
3264.0263.01
3337.053.11
3447.038.51
3524.079.21
3671.034.65421
3738.0153.46251
3846.079.21
3945.083.4751
4040.00.01
4155.093.51
4242.042.51
4355.050.01
4442.052.01
4550.028.71251
4646.026.01
4750.026.01
4832.5211.51
4958.029.71
5041.051.86251
5129.030.01
5230.045.51
5330.026.01
5419.053.11
5546.075.24171
5654.051.86251
5728.082.17081
5865.026.551
5944.090.01
6055.030.51
6147.042.41
6237.029.71
6358.0113.2751
6464.026.01
6565.061.97921
6628.527.72081
6745.528.51
6823.093.51
6929.066.61
7018.0108.91
7147.052.01
7238.00.01
7322.0135.63331
7431.050.49581
7536.040.1251
7655.059.41
7733.026.551
7861.0262.3751
7950.055.91
8056.026.551
8156.030.69581
8224.060.01
8357.0146.52081
8462.026.551
8567.0221.77921
8663.0221.77921
8761.032.32081
8852.079.651
8949.0110.88331
9040.027.72081
9161.033.51
9247.034.02081
9364.075.251
9460.026.551
9554.077.28751
9621.077.28751
9757.0164.86671
9850.0211.51
9927.0211.51
10051.061.37921
Rows: 1-100 | Columns: 3

Temporary results are stored in a vDataFrame, so you can call vDataFrame methods:

In [22]:
titanic_clean.corr()
Out[22]:
"age"
"fare"
"pclass"
"age"1.00.178575164117464-0.403900427391808
"fare"0.1785751641174641.0-0.568258663041552
"pclass"-0.403900427391808-0.5682586630415521.0
Rows: 1-3 | Columns: 4

Using variables inside a query

You can use variables in a SQL query with the ':' operator. This variable can be a vDataFrame, a tablesample, a pandas.DataFrame, or any standard Python type.

In [23]:
import verticapy.stats as st

class_fare = titanic_clean.groupby("pclass", 
                                   [st.avg(titanic_clean["fare"])._as("avg_fare")])
class_fare
Out[23]:
123
pclass
Integer
123
avg_fare
Float
1193.1410288321168
2221.9666833333333
3312.8090323529412
Rows: 1-3 | Columns: 2

Use the 'class_fare' variable in a SQL query:

In [17]:
%%sql
SELECT 
    x.*, 
    y.avg_fare
FROM titanic AS x LEFT JOIN (SELECT * FROM :class_fare) AS y 
ON x.pclass = y.pclass;
Execution: 0.011s
Out[17]:
123
pclass
Integer
123
survived
Integer
Abc
Varchar(164)
Abc
sex
Varchar(20)
123
age
Numeric(6,3)
123
sibsp
Integer
123
parch
Integer
Abc
ticket
Varchar(36)
123
fare
Numeric(10,5)
Abc
cabin
Varchar(30)
Abc
embarked
Varchar(20)
Abc
boat
Varchar(100)
123
body
Integer
Abc
home.dest
Varchar(100)
123
avg_fare
Float
110female2.012113781151.55C22 C26S[null][null]Montreal, PQ / Chesterville, ON93.1410288321168
210male30.012113781151.55C22 C26S[null]135Montreal, PQ / Chesterville, ON93.1410288321168
310female25.012113781151.55C22 C26S[null][null]Montreal, PQ / Chesterville, ON93.1410288321168
410male39.0001120500.0A36S[null][null]Belfast, NI93.1410288321168
510male71.000PC 1760949.5042[null]C[null]22Montevideo, Uruguay93.1410288321168
610male47.010PC 17757227.525C62 C64C[null]124New York, NY93.1410288321168
710male[null]00PC 1731825.925[null]S[null][null]New York, NY93.1410288321168
810male24.001PC 17558247.5208B58 B60C[null][null]Montreal, PQ93.1410288321168
910male36.0001305075.2417C6CA[null]Winnipeg, MN93.1410288321168
1010male25.0001390526.0[null]C[null]148San Francisco, CA93.1410288321168
1110male45.00011378435.5TS[null][null]Trenton, NJ93.1410288321168
1210male42.00011048926.55D22S[null][null]London / Winnipeg, MB93.1410288321168
1310male41.00011305430.5A21S[null][null]Pomeroy, WA93.1410288321168
1410male48.000PC 1759150.4958B10C[null]208Omaha, NE93.1410288321168
1510male[null]0011237939.6[null]C[null][null]Philadelphia, PA93.1410288321168
1610male45.00011305026.55B38S[null][null]Washington, DC93.1410288321168
1710male[null]0011379831.0[null]S[null][null][null]93.1410288321168
1810male33.0006955.0B51 B53 B55S[null][null]New York, NY93.1410288321168
1910male28.00011305947.1[null]S[null][null]Montevideo, Uruguay93.1410288321168
2010male17.00011305947.1[null]S[null][null]Montevideo, Uruguay93.1410288321168
2110male49.0001992426.0[null]S[null][null]Ascot, Berkshire / Rochester, NY93.1410288321168
2210male36.0101987778.85C46S[null]172Little Onn Hall, Staffs93.1410288321168
2310male46.010W.E.P. 573461.175E31S[null][null]Amenia, ND93.1410288321168
2410male[null]001120510.0[null]S[null][null]Liverpool, England / Belfast93.1410288321168
2510male27.01013508136.7792C89C[null][null]Los Angeles, CA93.1410288321168
2610male[null]0011046552.0A14S[null][null]Stoughton, MA93.1410288321168
2710male47.000572725.5875E58S[null][null]Victoria, BC93.1410288321168
2810male37.011PC 1775683.1583E52C[null][null]Lakewood, NJ93.1410288321168
2910male[null]0011379126.55[null]S[null][null]Roachdale, IN93.1410288321168
3010male70.011WE/P 573571.0B22S[null]269Milwaukee, WI93.1410288321168
3110male39.010PC 1759971.2833C85C[null][null]New York, NY93.1410288321168
3210male31.010F.C. 1275052.0B71S[null][null]Montreal, PQ93.1410288321168
3310male50.010PC 17761106.425C86C[null]62Deephaven, MN / Cedar Rapids, IA93.1410288321168
3410male39.000PC 1758029.7A18C[null]133Philadelphia, PA93.1410288321168
3510female36.000PC 1753131.6792A29C[null][null]New York, NY93.1410288321168
3610male[null]00PC 17483221.7792C95S[null][null][null]93.1410288321168
3710male30.00011305127.75C111C[null][null]New York, NY93.1410288321168
3810male19.03219950263.0C23 C25 C27S[null][null]Winnipeg, MB93.1410288321168
3910male64.01419950263.0C23 C25 C27S[null][null]Winnipeg, MB93.1410288321168
4010male[null]00113778