IN and NOT IN

While you cannot equate a single value to a set of values, you can check to see if a single value is found within that set of values. Use the IN clause for multiple-record, single-column subqueries. After the subquery returns results introduced by IN or NOT IN, the outer query uses them to return the final result.

[NOT] IN subqueries take the following form:

{ expression [ NOT ] IN ( subquery )| expression [ NOT ] IN ( expression ) }

There is no limit to the number of parameters passed to the IN clause of the SELECT statement; for example:

=> SELECT * FROM tablename WHERE column IN (a, b, c, d, e, ...);

Vertica also supports queries where two or more outer expressions refer to different inner expressions:

=> SELECT * FROM A WHERE (A.x,A.x) IN (SELECT B.x, B.y FROM B); 

Examples

The following query uses the VMart schema to illustrate the use of outer expressions referring to different inner expressions:

=> SELECT product_description, product_price FROM product_dimension
   WHERE (product_dimension.product_key, product_dimension.product_key) IN 
      (SELECT store.store_orders_fact.order_number, 
         store.store_orders_fact.quantity_ordered 
       FROM store.store_orders_fact);  
 product_description         | product_price
-----------------------------+---------------
 Brand #72 box of candy      |           326
 Brand #71 vanilla ice cream |           270
(2 rows)

To find all products supplied by stores in MA, first create the inner query and run it to ensure that it works as desired. The following query returns all stores located in MA:

=> SELECT store_key FROM store.store_dimension WHERE store_state = 'MA';
 store_key 
-----------
        13
        31
(2 rows)

Then create the outer or main query that specifies all distinct products that were sold in stores located in MA. This statement combines the inner and outer queries using the IN predicate:

=> SELECT DISTINCT s.product_key, p.product_description
   FROM store.store_sales_fact s, public.product_dimension p
   WHERE s.product_key = p.product_key
       AND s.product_version = p.product_version
       AND s.store_key IN
	     (SELECT store_key
	      FROM store.store_dimension
	      WHERE store_state = 'MA')
   ORDER BY s.product_key;
 product_key |          product_description          
-------------+---------------------------------------
           1 | Brand #1 white bread
           1 | Brand #4 vegetable soup
           3 | Brand #9 wheelchair
           5 | Brand #15 cheddar cheese
           5 | Brand #19 bleach
           7 | Brand #22 canned green beans
           7 | Brand #23 canned tomatoes
           8 | Brand #24 champagne
           8 | Brand #25 chicken nuggets
          11 | Brand #32 sausage
         ...   ...
(281 rows)

When using NOT IN, the subquery returns a list of zero or more values in the outer query where the comparison column does not match any of the values returned from the subquery. Using the previous example, NOT IN returns all the products that are not supplied from MA.

Notes

Vertica supports multicolumn NOT IN subqueries in which the columns are not marked NOT NULL. If one of the columns is found to contain a NULL value during query execution, Vertica returns a run-time error.

Similarly, IN subqueries nested within another expression are not supported if any of the column values are NULL. For example, if in the following statement column x from either table contained a NULL value, Vertica returns a run-time error:

=> SELECT * FROM t1 WHERE (x IN (SELECT x FROM t2)) IS FALSE;
   ERROR: NULL value found in a column used by a subquery