Half-Join Subquery Semantics

The Vertica optimizer uses several keywords in directed queries to recreate half-join subqueries with certain search operators, such as ANY or NOT IN:

SEMI JOIN

Recreates a query that contains a subquery preceded by an IN, EXIST, or ANY operator and executes a semi-join.

Input query

SELECT product_description FROM product_dimension 
  WHERE product_dimension.product_key IN (SELECT qty_in_stock from inventory_fact);
			

Query plan

QUERY PLAN DESCRIPTION:
------------------------------

explain SELECT product_description FROM product_dimension WHERE product_dimension.product_key IN (SELECT qty_in_stock from inventory_fact);

 Access Path:
 +-JOIN HASH [Semi] [Cost: 1K, Rows: 30K] (PATH ID: 1) Outer (FILTER) Inner (RESEGMENT)
 |  Join Cond: (product_dimension.product_key = VAL(2))
 |  Materialize at Output: product_dimension.product_description
 |  Execute on: All Nodes
 | +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2)
 | |      Projection: public.product_dimension
 | |      Materialize: product_dimension.product_key
 | |      Execute on: All Nodes
 | |      Runtime Filter: (SIP1(HashJoin): product_dimension.product_key)
 | +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3)
 | |      Execute on: All Nodes
 | | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4)
 | | |      Projection: public.inventory_fact_b0
 | | |      Materialize: inventory_fact.qty_in_stock
 | | |      Execute on: All Nodes

Optimizer-generated annotated query

SELECT /*+ syntactic_join */ product_dimension.product_description AS product_description 
FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/ 
SEMI JOIN /*+Distrib(F,R),JType(H)*/ (SELECT inventory_fact.qty_in_stock AS qty_in_stock 
FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1 
ON (product_dimension.product_key = subQ_1.qty_in_stock))

NULLAWARE ANTI JOIN

Recreates a query that contains a subquery preceded by a NOT IN or !=ALL operator, and executes a null-aware anti-join.

Input query

SELECT product_description FROM product_dimension 
WHERE product_dimension.product_key NOT IN (SELECT qty_in_stock from inventory_fact);

Query plan

 QUERY PLAN DESCRIPTION:
 ------------------------------

 EXPLAIN SELECT product_description FROM product_dimension WHERE product_dimension.product_key not IN (SELECT qty_in_sto
ck from inventory_fact);

 Access Path:
 +-JOIN HASH [Anti][NotInAnti] [Cost: 7K, Rows: 30K] (PATH ID: 1) Inner (BROADCAST)
 |  Join Cond: (product_dimension.product_key = VAL(2))
 |  Materialize at Output: product_dimension.product_description
 |  Execute on: Query Initiator
 | +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2)
 | |      Projection: public.product_dimension_DBD_2_rep_VMartDesign
 | |      Materialize: product_dimension.product_key
 | |      Execute on: Query Initiator
 | +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3)
 | |      Execute on: All Nodes
 | | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4)
 | | |      Projection: public.inventory_fact_DBD_9_seg_VMartDesign_b0
 | | |      Materialize: inventory_fact.qty_in_stock
 | | |      Execute on: All Nodes

Optimizer-generated annotated query

SELECT /*+ syntactic_join */ product_dimension.product_description AS product_description
FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/ 
NULLAWARE ANTI JOIN /*+Distrib(L,B),JType(H)*/ (SELECT inventory_fact.qty_in_stock AS qty_in_stock
FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1  
ON (product_dimension.product_key = subQ_1.qty_in_stock))

SEMIALL JOIN

Recreates a query that contains a subquery preceded by an ALL operator, and executes a semi-all join.

Input query

SELECT product_key, product_description FROM product_dimension 
  WHERE product_dimension.product_key > ALL (SELECT product_key from inventory_fact);

Query plan

QUERY PLAN DESCRIPTION:
------------------------------

explain SELECT product_key, product_description FROM product_dimension WHERE product_dimension.product_key > ALL (SELECT product_key from inventory_fact);

Access Path:
+-JOIN HASH [Semi][All] [Cost: 7M, Rows: 30K] (PATH ID: 1) Outer (FILTER) Inner (BROADCAST)
|  Join Filter: (product_dimension.product_key > VAL(2))
|  Materialize at Output: product_dimension.product_description
|  Execute on: All Nodes
| +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2)
| |      Projection: public.product_dimension
| |      Materialize: product_dimension.product_key
| |      Execute on: All Nodes
| +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3)
| |      Execute on: All Nodes
| | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4)
| | |      Projection: public.inventory_fact_b0
| | |      Materialize: inventory_fact.product_key
| | |      Execute on: All Nodes
		

Optimizer-generated annotated query

SELECT /*+ syntactic_join */ product_dimension.product_key AS product_key, product_dimension.product_description AS product_description 
FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/ 
SEMIALL JOIN /*+Distrib(F,B),JType(H)*/ (SELECT inventory_fact.product_key AS product_key FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1  
ON (product_dimension.product_key > subQ_1.product_key))
				

ANTI JOIN

Recreates a query that contains a subquery preceded by a NOT EXISTS operator, and executes an anti-join.

Input query

SELECT product_key, product_description FROM product_dimension 
WHERE NOT EXISTS (SELECT inventory_fact.product_key FROM inventory_fact 
WHERE inventory_fact.product_key=product_dimension.product_key);

Query plan

 QUERY PLAN DESCRIPTION:
 ------------------------------

 explain SELECT product_key, product_description FROM product_dimension WHERE NOT EXISTS (SELECT inventory_fact.product_
key FROM inventory_fact WHERE inventory_fact.product_key=product_dimension.product_key);

 Access Path:
 +-JOIN HASH [Anti] [Cost: 703, Rows: 30K] (PATH ID: 1) Outer (FILTER)
 |  Join Cond: (VAL(1) = product_dimension.product_key)
 |  Materialize at Output: product_dimension.product_description
 |  Execute on: All Nodes
 | +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2)
 | |      Projection: public.product_dimension_DBD_2_rep_VMartDesign
 | |      Materialize: product_dimension.product_key
 | |      Execute on: All Nodes
 | +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3)
 | |      Execute on: All Nodes
 | | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4)
 | | |      Projection: public.inventory_fact_DBD_9_seg_VMartDesign_b0
 | | |      Materialize: inventory_fact.product_key
 | | |      Execute on: All Nodes

Optimizer-generated annotated query

SELECT /*+ syntactic_join */ product_dimension.product_key AS product_key, product_dimension.product_description AS product_description 
FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/ 
ANTI JOIN /*+Distrib(F,L),JType(H)*/ (SELECT inventory_fact.product_key AS "inventory_fact.product_key" 
FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1 
ON (subQ_1."inventory_fact.product_key" = product_dimension.product_key))