Data Aggregation
You can use functions such as SUM and COUNT to aggregate the results of GROUP BY
queries at one or more levels.
Aggregating Data at a Single Level
The simplest GROUP BY
queries aggregate data at a single level. For example, a table might contain the following information about family expenses:
- Category
- Amount spent on that category during the year
- Year
Table data might look like this:
=> SELECT * FROM expenses ORDER BY Category; Year | Category | Amount ------+------------+-------- 2005 | Books | 39.98 2007 | Books | 29.99 2008 | Books | 29.99 2006 | Electrical | 109.99 2005 | Electrical | 109.99 2007 | Electrical | 229.98
You can use aggregate functions to get the total expenses per category or per year:
=> SELECT SUM(Amount), Category FROM expenses GROUP BY Category; SUM | Category ---------+------------ 99.96 | Books 449.96 | Electrical
=> SELECT SUM(Amount), Year FROM expenses GROUP BY Year; SUM | Year --------+------ 149.97 | 2005 109.99 | 2006 29.99 | 2008 259.97 | 2007
Aggregating Data at Multiple Levels
Over time, tables that are updated frequently can contain large amounts of data. Using the simple table shown earlier, suppose you want a multilevel query, like the number of expenses per category per year.
The following query uses the ROLLUP
aggregation with the SUM function to calculate the total expenses by category and the overall expenses total. The NULL fields indicate subtotal values in the aggregation.
- When only the
Year
column isNULL
, the subtotal is for all theCategory
values. - When both the
Year
andCategory
columns areNULL
, the subtotal is for allAmount
values for both columns.
Using the ORDER BY
clause orders the results by expense category, the year the expenses took place, and the GROUP BY
level that the GROUPING_ID
function creates:
=> SELECT Category, Year, SUM(Amount) FROM expenses GROUP BY ROLLUP(Category, Year) ORDER BY Category, Year, GROUPING_ID(); Category | Year | SUM ------------+------+-------- Books | 2005 | 39.98 Books | 2007 | 29.99 Books | 2008 | 29.99 Books | | 99.96 Electrical | 2005 | 109.99 Electrical | 2006 | 109.99 Electrical | 2007 | 229.98 Electrical | | 449.96 | | 549.92
Similarly, the following query calculates the total sales by year and the overall sales total and then uses the ORDER BY
clause to sort the results:
=> SELECT Category, Year, SUM(Amount) FROM expenses GROUP BY ROLLUP(Year, Category) ORDER BY 2, 1, GROUPING_ID(); Category | Year | SUM ------------+------+-------- Books | 2005 | 39.98 Electrical | 2005 | 109.99 | 2005 | 149.97 Electrical | 2006 | 109.99 | 2006 | 109.99 Books | 2007 | 29.99 Electrical | 2007 | 229.98 | 2007 | 259.97 Books | 2008 | 29.99 | 2008 | 29.99 | | 549.92 (11 rows)
You can use the CUBE
aggregate to perform all possible groupings of the category and year expenses. The following query returns all possible groupings, ordered by grouping:
=> SELECT Category, Year, SUM(Amount) FROM expenses GROUP BY CUBE(Category, Year) ORDER BY 1, 2, GROUPING_ID(); Category | Year | SUM ------------+------+-------- Books | 2005 | 39.98 Books | 2007 | 29.99 Books | 2008 | 29.99 Books | | 99.96 Electrical | 2005 | 109.99 Electrical | 2006 | 109.99 Electrical | 2007 | 229.98 Electrical | | 449.96 | 2005 | 149.97 | 2006 | 109.99 | 2007 | 259.97 | 2008 | 29.99 | | 549.92
The results include subtotals for each category and each year and a total ($549.92) for all transactions, regardless of year or category.
ROLLUP
, CUBE
, and GROUPING SETS
generate NULL
values in grouping columns to identify subtotals. If table data includes NULL
values, differentiating these from NULL
values in subtotals can sometimes be challenging.
In the preceding output, the NULL
values in the Year
column indicate that the row was grouped on the Category
column, rather than on both columns. In this case, ROLLUP
added the NULL
value to indicate the subtotal row.
To distinguish subtotal rows from NULL
values that are part of the input data, use the GROUPING
function.