UNION Clause
Combines the results of multiple SELECT
statements. You can include UNION
in FROM
, WHERE
, and HAVING
clauses.
Syntax
select‑stmt UNION { ALL | DISTINCT } select-stmt [ UNION { ALL | DISTINCT } select-stmt ]… [ ORDER BYexpression { ASC | DESC }[,…] ] [ LIMIT { count | ALL } ] [ OFFSETstart ]
Parameters
select‑stmt |
A The following options also apply:
|
DISTINCT | ALL
|
Specifies whether to return unique rows:
|
Requirements
- All rows of the
UNION
result set must be in the result set of at least one of itsSELECT
statements. -
Each
SELECT
statement must specify the same number of columns. - Data types of corresponding
SELECT
statement columns must be compatible, otherwise Vertica returns an error.
ORDER BY, LIMIT, and OFFSET Clauses in UNION
A UNION
statement can specify its own ORDER BY
, LIMIT
, and OFFSET
clauses. For example, given the tables described below in Examples, the following query orders the UNION
result set by emp_name
and limits output to the first two rows:
=> SELECT id, emp_name FROM company_a UNION ALL SELECT id, emp_name FROM company_b ORDER BY emp_name LIMIT 2; id | emp_name ------+---------- 5678 | Alice 8765 | Bob (2 rows)
Each SELECT
statement in a UNION
clause can specify its own ORDER BY
, LIMIT
, and OFFSET
clauses. In this case, the SELECT
statement must be enclosed by parentheses. Vertica processes the SELECT
statement ORDER BY
, LIMIT
, and OFFSET
clauses before it processes the UNION
clauses.
For example, each SELECT
statement in the following UNION
specifies its own ORDER BY
and LIMIT
clauses. Vertica processes the individual queries and then concatenates the two result sets:
=> (SELECT id, emp_name FROM company_a ORDER BY emp_name LIMIT 2) UNION ALL (SELECT id, emp_name FROM company_b ORDER BY emp_name LIMIT 2); id | emp_name ------+----------- 5678 | Alice 9012 | Katherine 8765 | Bob 9012 | Katherine (4 rows)
The following requirements and restrictions determine how Vertica processes a UNION
clause that contains ORDER BY
, LIMIT
, and OFFSET
clauses:
- A
UNION
'sORDER BY
clause must specify columns from the first (leftmost)SELECT
statement. - Always use an
ORDER BY
clause withLIMIT
andOFFSET
. Otherwise, the query returns an undefined subset of the result set. ORDER BY
must precedeLIMIT
andOFFSET
.- When a
SELECT
orUNION
statement specifies bothLIMIT
andOFFSET
, Vertica first processes theOFFSET
statement, and then applies theLIMIT
statement to the remaining rows.
UNION in Non-Correlated Subqueries
Vertica supports UNION
in noncorrelated subquery predicates. For example:
=> SELECT DISTINCT customer_key, customer_name FROM public.customer_dimension WHERE customer_key IN (SELECT customer_key FROM store.store_sales_fact WHERE sales_dollar_amount > 500 UNION ALL SELECT customer_key FROM online_sales.online_sales_fact WHERE sales_dollar_amount > 500) AND customer_state = 'CT'; customer_key | customer_name --------------+------------------------ 7021 | Luigi T. Dobisz 1971 | Betty V. Dobisz 46284 | Ben C. Gauthier 33885 | Tanya Y. Taylor 5449 | Sarah O. Robinson 29059 | Sally Z. Fortin 11200 | Foodhope 15582 | John J. McNulty 24638 | Alexandra F. Jones …
Examples
The examples that follow use these two tables:
company_a
ID emp_name dept sales ------+------------+-------------+------- 1234 | Stephen | auto parts | 1000 5678 | Alice | auto parts | 2500 9012 | Katherine | floral | 500
company_b
ID emp_name dept sales ------+------------+-------------+------- 4321 | Marvin | home goods | 250 9012 | Katherine | home goods | 500 8765 | Bob | electronics | 20000
Find all employee IDs and names from company_a and company_b
The UNION
statement specifies DISTINCT
to combine unique IDs and last names of employees; Katherine works for both companies, so she appears only once in the result set. DISTINCT
is the default and can be omitted:
=> SELECT id, emp_name FROM company_a UNION DISTINCT SELECT id, emp_name FROM company_b ORDER BY id; id | emp_name ------+----------- 1234 | Stephen 4321 | Marvin 5678 | Alice 8765 | Bob 9012 | Katherine (5 rows)
The next UNION
statement specifies the option ALL
. Katherine works for both companies, so the query returns two records for her:
=> SELECT id, emp_name FROM company_a UNION ALL SELECT id, emp_name FROM company_b ORDER BY id; id | emp_name ------+----------- 1234 | Stephen 5678 | Alice 9012 | Katherine 4321 | Marvin 9012 | Katherine 8765 | Bob (6 rows)
Find the top two top performing salespeople in each company
Each SELECT
statement specifies its own ORDER BY
and LIMIT
clauses, so the UNION
statement concatenates the result sets as returned by each query:
=> (SELECT id, emp_name, sales FROM company_a ORDER BY sales DESC LIMIT 2) UNION ALL (SELECT id, emp_name, sales FROM company_b ORDER BY sales DESC LIMIT 2); id | emp_name | sales ------+-----------+------- 8765 | Bob | 20000 5678 | Alice | 2500 1234 | Stephen | 1000 9012 | Katherine | 500 (4 rows)
Find all employee orders by sales
The UNION
statement specifies its own ORDER BY
clause, which Vertica applies to the entire result:
=> SELECT id, emp_name, sales FROM company_a UNION SELECT id, emp_name, sales FROM company_b ORDER BY sales; id | emp_name | sales ------+-----------+------- 4321 | Marvin | 250 9012 | Katherine | 500 1234 | Stephen | 1000 5678 | Alice | 2500 8765 | Bob | 20000 (5 rows)
Calculate the sum of sales for each company grouped by department
Each SELECT
statement has its own GROUP BY
clause. UNION
combines the aggregate results from each query:
=> (SELECT 'Company A' as company, dept, SUM(sales) FROM company_a GROUP BY dept) UNION (SELECT 'Company B' as company, dept, SUM(sales) FROM company_b GROUP BY dept) ORDER BY 1; company | dept | sum -----------+-------------+------- Company A | auto parts | 3500 Company A | floral | 500 Company B | electronics | 20000 Company B | home goods | 750 (4 rows)