Half-Join Subquery Semantics
The Vertica optimizer uses several keywords in directed queries to recreate half-join subqueries with certain search operators, such as ANY
or NOT IN
:
SEMI JOIN
Recreates a query that contains a subquery preceded by an IN
, EXIST
, or ANY
operator and executes a semi-join.
Input query
SELECT product_description FROM product_dimension WHERE product_dimension.product_key IN (SELECT qty_in_stock from inventory_fact);
Query plan
QUERY PLAN DESCRIPTION: ------------------------------ explain SELECT product_description FROM product_dimension WHERE product_dimension.product_key IN (SELECT qty_in_stock from inventory_fact); Access Path: +-JOIN HASH [Semi] [Cost: 1K, Rows: 30K] (PATH ID: 1) Outer (FILTER) Inner (RESEGMENT) | Join Cond: (product_dimension.product_key = VAL(2)) | Materialize at Output: product_dimension.product_description | Execute on: All Nodes | +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2) | | Projection: public.product_dimension | | Materialize: product_dimension.product_key | | Execute on: All Nodes | | Runtime Filter: (SIP1(HashJoin): product_dimension.product_key) | +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3) | | Execute on: All Nodes | | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4) | | | Projection: public.inventory_fact_b0 | | | Materialize: inventory_fact.qty_in_stock | | | Execute on: All Nodes
Optimizer-generated annotated query
SELECT /*+ syntactic_join */ product_dimension.product_description AS product_description FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/ SEMI JOIN /*+Distrib(F,R),JType(H)*/ (SELECT inventory_fact.qty_in_stock AS qty_in_stock FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1 ON (product_dimension.product_key = subQ_1.qty_in_stock))
NULLAWARE ANTI JOIN
Recreates a query that contains a subquery preceded by a NOT IN
or !=ALL
operator, and executes a null-aware anti-join.
Input query
SELECT product_description FROM product_dimension WHERE product_dimension.product_key NOT IN (SELECT qty_in_stock from inventory_fact);
Query plan
QUERY PLAN DESCRIPTION: ------------------------------ EXPLAIN SELECT product_description FROM product_dimension WHERE product_dimension.product_key not IN (SELECT qty_in_sto ck from inventory_fact); Access Path: +-JOIN HASH [Anti][NotInAnti] [Cost: 7K, Rows: 30K] (PATH ID: 1) Inner (BROADCAST) | Join Cond: (product_dimension.product_key = VAL(2)) | Materialize at Output: product_dimension.product_description | Execute on: Query Initiator | +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2) | | Projection: public.product_dimension_DBD_2_rep_VMartDesign | | Materialize: product_dimension.product_key | | Execute on: Query Initiator | +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3) | | Execute on: All Nodes | | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4) | | | Projection: public.inventory_fact_DBD_9_seg_VMartDesign_b0 | | | Materialize: inventory_fact.qty_in_stock | | | Execute on: All Nodes
Optimizer-generated annotated query
SELECT /*+ syntactic_join */ product_dimension.product_description AS product_description FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/ NULLAWARE ANTI JOIN /*+Distrib(L,B),JType(H)*/ (SELECT inventory_fact.qty_in_stock AS qty_in_stock FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1 ON (product_dimension.product_key = subQ_1.qty_in_stock))
SEMIALL JOIN
Recreates a query that contains a subquery preceded by an ALL
operator, and executes a semi-all join.
Input query
SELECT product_key, product_description FROM product_dimension WHERE product_dimension.product_key > ALL (SELECT product_key from inventory_fact);
Query plan
QUERY PLAN DESCRIPTION: ------------------------------ explain SELECT product_key, product_description FROM product_dimension WHERE product_dimension.product_key > ALL (SELECT product_key from inventory_fact); Access Path: +-JOIN HASH [Semi][All] [Cost: 7M, Rows: 30K] (PATH ID: 1) Outer (FILTER) Inner (BROADCAST) | Join Filter: (product_dimension.product_key > VAL(2)) | Materialize at Output: product_dimension.product_description | Execute on: All Nodes | +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2) | | Projection: public.product_dimension | | Materialize: product_dimension.product_key | | Execute on: All Nodes | +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3) | | Execute on: All Nodes | | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4) | | | Projection: public.inventory_fact_b0 | | | Materialize: inventory_fact.product_key | | | Execute on: All Nodes
Optimizer-generated annotated query
SELECT /*+ syntactic_join */ product_dimension.product_key AS product_key, product_dimension.product_description AS product_description FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/ SEMIALL JOIN /*+Distrib(F,B),JType(H)*/ (SELECT inventory_fact.product_key AS product_key FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1 ON (product_dimension.product_key > subQ_1.product_key))
ANTI JOIN
Recreates a query that contains a subquery preceded by a NOT EXISTS
operator, and executes an anti-join.
Input query
SELECT product_key, product_description FROM product_dimension WHERE NOT EXISTS (SELECT inventory_fact.product_key FROM inventory_fact WHERE inventory_fact.product_key=product_dimension.product_key);
Query plan
QUERY PLAN DESCRIPTION: ------------------------------ explain SELECT product_key, product_description FROM product_dimension WHERE NOT EXISTS (SELECT inventory_fact.product_ key FROM inventory_fact WHERE inventory_fact.product_key=product_dimension.product_key); Access Path: +-JOIN HASH [Anti] [Cost: 703, Rows: 30K] (PATH ID: 1) Outer (FILTER) | Join Cond: (VAL(1) = product_dimension.product_key) | Materialize at Output: product_dimension.product_description | Execute on: All Nodes | +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2) | | Projection: public.product_dimension_DBD_2_rep_VMartDesign | | Materialize: product_dimension.product_key | | Execute on: All Nodes | +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3) | | Execute on: All Nodes | | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4) | | | Projection: public.inventory_fact_DBD_9_seg_VMartDesign_b0 | | | Materialize: inventory_fact.product_key | | | Execute on: All Nodes
Optimizer-generated annotated query
SELECT /*+ syntactic_join */ product_dimension.product_key AS product_key, product_dimension.product_description AS product_description FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/ ANTI JOIN /*+Distrib(F,L),JType(H)*/ (SELECT inventory_fact.product_key AS "inventory_fact.product_key" FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1 ON (subQ_1."inventory_fact.product_key" = product_dimension.product_key))