Subquery Examples
This topic illustrates some of the subqueries you can write. The examples use the VMart example database.
Single-Row Subqueries
Single-row subqueries are used with single-row comparison operators (=, >=, <=, <>, and <=>) and return exactly one row.
For example, the following query retrieves the name and hire date of the oldest employee in the Vmart database:
=> SELECT employee_key, employee_first_name, employee_last_name, hire_date FROM employee_dimension WHERE hire_date = (SELECT MIN(hire_date) FROM employee_dimension); employee_key | employee_first_name | employee_last_name | hire_date --------------+---------------------+--------------------+------------ 2292 | Mary | Bauer | 1956-01-11 (1 row)
Multiple-Row Subqueries
Multiple-row subqueries return multiple records.
For example, the following IN clause subquery returns the names of the employees making the highest salary in each of the six regions:
=> SELECT employee_first_name, employee_last_name, annual_salary, employee_region FROM employee_dimension WHERE annual_salary IN (SELECT MAX(annual_salary) FROM employee_dimension GROUP BY employee_region) ORDER BY annual_salary DESC; employee_first_name | employee_last_name | annual_salary | employee_region ---------------------+--------------------+---------------+------------------- Alexandra | Sanchez | 992363 | West Mark | Vogel | 983634 | South Tiffany | Vu | 977716 | SouthWest Barbara | Lewis | 957949 | MidWest Sally | Gauthier | 927335 | East Wendy | Nielson | 777037 | NorthWest (6 rows)
Multicolumn Subqueries
Multicolumn subqueries return one or more columns. Sometimes a subquery's result set is evaluated in the containing query in column-to-column and row-to-row comparisons.
Note: Multicolumn subqueries can use the <>, !=, and = operators but not the <, >, <=, >= operators.
You can substitute some multicolumn subqueries with a join, with the reverse being true as well. For example, the following two queries ask for the sales transactions of all products sold online to customers located in Massachusetts and return the same result set. The only difference is the first query is written as a join and the second is written as a subquery.
Join query: |
Subquery: |
=> SELECT * FROM online_sales.online_sales_fact INNER JOIN public.customer_dimension USING (customer_key) WHERE customer_state = 'MA'; |
=> SELECT * FROM online_sales.online_sales_fact WHERE customer_key IN (SELECT customer_key FROM public.customer_dimension WHERE customer_state = 'MA'); |
The following query returns all employees in each region whose salary is above the average:
=> SELECT e.employee_first_name, e.employee_last_name, e.annual_salary, e.employee_region, s.average FROM employee_dimension e, (SELECT employee_region, AVG(annual_salary) AS average FROM employee_dimension GROUP BY employee_region) AS s WHERE e.employee_region = s.employee_region AND e.annual_salary > s.average ORDER BY annual_salary DESC;
employee_first_name | employee_last_name | annual_salary | employee_region | average ---------------------+--------------------+---------------+-----------------+------------------ Doug | Overstreet | 995533 | East | 61192.786013986 Matt | Gauthier | 988807 | South | 57337.8638902996 Lauren | Nguyen | 968625 | West | 56848.4274914089 Jack | Campbell | 963914 | West | 56848.4274914089 William | Martin | 943477 | NorthWest | 58928.2276119403 Luigi | Campbell | 939255 | MidWest | 59614.9170454545 Sarah | Brown | 901619 | South | 57337.8638902996 Craig | Goldberg | 895836 | East | 61192.786013986 Sam | Vu | 889841 | MidWest | 59614.9170454545 Luigi | Sanchez | 885078 | MidWest | 59614.9170454545 Michael | Weaver | 882685 | South | 57337.8638902996 Doug | Pavlov | 881443 | SouthWest | 57187.2510548523 Ruth | McNulty | 874897 | East | 61192.786013986 Luigi | Dobisz | 868213 | West | 56848.4274914089 Laura | Lang | 865829 | East | 61192.786013986 ...
You can also use the EXCEPT, INTERSECT, and UNION [ALL] keywords in FROM, WHERE, and HAVING clauses.
The following subquery returns information about all Connecticut-based customers who bought items through either stores or online sales channel and whose purchases amounted to more than 500 dollars:
=> SELECT DISTINCT customer_key, customer_name FROM public.customer_dimension WHERE customer_key IN (SELECT customer_key FROM store.store_sales_fact WHERE sales_dollar_amount > 500 UNION ALL SELECT customer_key FROM online_sales.online_sales_fact WHERE sales_dollar_amount > 500) AND customer_state = 'CT'; customer_key | customer_name --------------+------------------ 200 | Carla Y. Kramer 733 | Mary Z. Vogel 931 | Lauren X. Roy 1533 | James C. Vu 2948 | Infocare 4909 | Matt Z. Winkler 5311 | John Z. Goldberg 5520 | Laura M. Martin 5623 | Daniel R. Kramer 6759 | Daniel Q. Nguyen ...
HAVING Clause Subqueries
A HAVING clause is used in conjunction with the GROUP BY clause to filter the select-list records that a GROUP BY returns. HAVING clause subqueries must use Boolean comparison operators: =, >, <, <>, <=, >= and take the following form:
SELECT <column, ...> FROM <table> GROUP BY <expression> HAVING <expression> (SELECT <column, ...> FROM <table> HAVING <expression>);
For example, the following statement uses the VMart database and returns the number of customers who purchased lowfat products. Note that the GROUP BY clause is required because the query uses an aggregate (COUNT).
=> SELECT s.product_key, COUNT(s.customer_key) FROM store.store_sales_fact s GROUP BY s.product_key HAVING s.product_key IN (SELECT product_key FROM product_dimension WHERE diet_type = 'Low Fat');
The subquery first returns the product keys for all low-fat products, and the outer query then counts the total number of customers who purchased those products.
product_key | count -------------+------- 15 | 2 41 | 1 66 | 1 106 | 1 118 | 1 169 | 1 181 | 2 184 | 2 186 | 2 211 | 1 229 | 1 267 | 1 289 | 1 334 | 2 336 | 1 (15 rows)