INTERSECT Clause

Calculates the intersection of the results of two or more SELECT queries. INTERSECT returns distinct values by both the query on the left and right sides of the INTERSECT operand.

Syntax

SELECT 
... INTERSECT select
... [ INTERSECT select ]...
... [ ORDER BY { column-name | ordinal-number }  [ ASC | DESC ] [,…] ]
... [ LIMIT { integer | ALL } ]
... [ OFFSET integer ]

Notes

  • Use the INTERSECT clause to return all elements that are common to the results of all the SELECT queries. The INTERSECT query operates on the results of two or more SELECT queries. INTERSECT returns only the rows that are returned by all the specified queries.
  • You cannot use the ALL keyword with an INTERSECT query.
  • The results of each SELECT query must be union compatible; they must return the same number of columns, and the corresponding columns must have compatible data types. For example, you cannot use the INTERSECT clause on a column of type INTEGER and a column of type VARCHAR. If the SELECT queries do not meet these criteria, Vertica returns an error.

    The Data Type Coercion Chart lists the data types that can be cast to other data types. If one data type can be cast to the other, those two data types are compatible.

  • Order the results of an INTERSECT operation by using an ORDER BY clause. In the ORDER BY list, specify the column names from the leftmost SELECT statement or specify integers that indicate the position of the columns by which to sort.
  • You can use INTERSECT in FROM, WHERE, and HAVING clauses.
  • The rightmost ORDER BY, LIMIT, or OFFSET clauses in an INTERSECT query do not need to be enclosed in parentheses because the rightmost query specifies that Vertica perform the operation on the results of the INTERSECT operation. Any ORDER BY, LIMIT, or OFFSET clauses contained in SELECT queries that appear earlier in the INTERSECT query must be enclosed in parentheses.
  • The order by column names is from the first select.
  • Vertica supports INTERSECT noncorrelated subquery predicates. For example:

    => SELECT * FROM T1
       WHERE T1.x IN
          (SELECT MAX(c1) FROM T2 
           INTERSECT 
              SELECT MAX(cc1) FROM T3 
           INTERSECT 
              SELECT MAX(d1) FROM T4);
    

Examples

Consider the following three tables:

Company_A

id       emp_lname     dept          sales
------+------------+----------------+-------
1234  | Stephen    | auto parts     | 1000
5678  | Alice      | auto parts     | 2500
9012  | Katherine  | floral         |  500
3214  | Smithson   | sporting goods | 1500

Company_B

id       emp_lname     dept        sales
------+------------+-------------+-------
4321  | Marvin     | home goods  |   250
9012  | Katherine  | home goods  |   500
8765  | Bob        | electronics | 20000
3214  | Smithson   | home goods  |  1500

Company_C

  id  | emp_lname |      dept      | sales
------+-----------+----------------+-------
 3214 | Smithson  | sporting goods |  1500
 5432 | Madison   | sporting goods |   400
 7865 | Cleveland | outdoor        |  1500
 1234 | Stephen   | floral         |  1000

The following query returns the IDs and last names of employees that exist in both Company_A and Company_B:

=> SELECT id, emp_lname FROM Company_A
   INTERSECT
   SELECT id, emp_lname FROM Company_B; 
 id   | emp_lname 
------+-----------
 3214 | Smithson
 9012 | Katherine
(2 rows)

The following query returns the same two employees in descending order of sales:

=> SELECT id, emp_lname, sales FROM Company_A
   INTERSECT
   SELECT id, emp_lname, sales FROM Company_B
   ORDER BY sales DESC;
  id  | emp_lname | sales 
------+-----------+-------
 3214 | Smithson  |  1500
 9012 | Katherine |   500
(2 rows)

You can also use the integer that represents the position of the sales column (3) to return the same result:

=> SELECT id, emp_lname, sales FROM Company_A
   INTERSECT
   SELECT id, emp_lname, sales FROM Company_B
   ORDER BY 3 DESC;
  id  | emp_lname | sales 
------+-----------+-------
 3214 | Smithson  |  1500
 9012 | Katherine |   500
(2 rows)

The following query returns the employee who works for both companies whose sales in Company_B are greater than 1000:

=> SELECT id, emp_lname, sales FROM Company_A
   INTERSECT
   (SELECT id, emp_lname, sales FROM company_B WHERE sales > 1000)
   ORDER BY sales DESC;
  id  | emp_lname | sales 
------+-----------+-------
 3214 | Smithson  |  1500
(1 row)

In the following query returns the ID and last name of the employee who works for all three companies:

=> SELECT id, emp_lname FROM Company_A
    INTERSECT
   SELECT id, emp_lname FROM Company_B
   INTERSECT
   SELECT id, emp_lname FROM Company_C;
  id  | emp_lname 
------+-----------
 3214 | Smithson
(1 row)

The following query shows the results of a mismatched data types; these two queries are not union compatible:

=> SELECT id, emp_lname FROM Company_A
    INTERSECT
   SELECT emp_lname, id FROM Company_B;
ERROR 3429:  For 'INTERSECT', types int and varchar are inconsistent
DETAIL:  Columns: id and emp_lname

Using the VMart example database, the following query returns information about all Connecticut-based customers who bought items online and whose purchase amounts were between $400 and $500:

=> SELECT customer_key, customer_name from public.customer_dimension
       WHERE customer_key IN (SELECT customer_key 
         FROM online_sales.online_sales_fact
         WHERE sales_dollar_amount > 400 
         INTERSECT 
         SELECT customer_key FROM online_sales.online_sales_fact 
         WHERE sales_dollar_amount < 500) 
      AND customer_state = 'CT';
 customer_key |     customer_name      
--------------+------------------------
           39 | Sarah S. Winkler
           44 | Meghan H. Overstreet
           70 | Jack X. Cleveland
          103 | Alexandra I. Vu
          110 | Matt . Farmer
          173 | Mary R. Reyes
          188 | Steve G. Williams
          233 | Theodore V. McNulty
          250 | Marcus E. Williams
          294 | Samantha V. Young
          313 | Meghan P. Pavlov
          375 | Sally N. Vu
          384 | Emily R. Smith
          387 | Emily L. Garcia
...

The previous query returns the same data as:

=> SELECT customer_key,customer_name FROM public.customer_dimension
       WHERE customer_key IN (SELECT customer_key 
      FROM online_sales.online_sales_fact 
         WHERE sales_dollar_amount > 400 
         AND sales_dollar_amount < 500) 
   AND customer_state = 'CT';