COUNT [Analytic]
Counts occurrences within a group within a window. If you specify * or some non-null constant, COUNT()
counts all rows.
Behavior Type
Syntax
COUNT ( expression ) OVER ( [ window-partition-clause ] [ window-order-clause ] [ window-frame-clause ] )
Parameters
expression |
Returns the number of rows in each group for which the expression is not null. Can be any expression resulting in BIGINT. |
OVER()
|
Example
Using the schema defined in Window Framing in Analyzing Data, the following COUNT
function omits window order and window frame clauses; otherwise Vertica would treat it as a window aggregate. Think of the window of reporting aggregates as UNBOUNDED PRECEDING
and UNBOUNDED FOLLOWING
.
=> SELECT deptno, sal, empno, COUNT(sal) OVER (PARTITION BY deptno) AS count FROM emp;
deptno | sal | empno | count --------+-----+-------+------- 10 | 101 | 1 | 2 10 | 104 | 4 | 2 20 | 110 | 10 | 6 20 | 110 | 9 | 6 20 | 109 | 7 | 6 20 | 109 | 6 | 6 20 | 109 | 8 | 6 20 | 109 | 11 | 6 30 | 105 | 5 | 3 30 | 103 | 3 | 3 30 | 102 | 2 | 3
Using ORDER BY sal
creates a moving window query with default window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
.
=> SELECT deptno, sal, empno, COUNT(sal) OVER (PARTITION BY deptno ORDER BY sal) AS count FROM emp;
deptno | sal | empno | count --------+-----+-------+------- 10 | 101 | 1 | 1 10 | 104 | 4 | 2 20 | 100 | 11 | 1 20 | 109 | 7 | 4 20 | 109 | 6 | 4 20 | 109 | 8 | 4 20 | 110 | 10 | 6 20 | 110 | 9 | 6 30 | 102 | 2 | 1 30 | 103 | 3 | 2 30 | 105 | 5 | 3
Using the VMart schema, the following query finds the number of employees who make less than or equivalent to the hourly rate of the current employee. The query returns a running/cumulative average (sometimes called a moving average) using the default window of RANGE UNBOUNDED PRECEDING AND CURRENT ROW
:
=> SELECT employee_last_name AS "last_name", hourly_rate, COUNT(*) OVER (ORDER BY hourly_rate) AS moving_count from employee_dimension;
last_name | hourly_rate | moving_count ------------+-------------+-------------- Gauthier | 6 | 4 Taylor | 6 | 4 Jefferson | 6 | 4 Nielson | 6 | 4 McNulty | 6.01 | 11 Robinson | 6.01 | 11 Dobisz | 6.01 | 11 Williams | 6.01 | 11 Kramer | 6.01 | 11 Miller | 6.01 | 11 Wilson | 6.01 | 11 Vogel | 6.02 | 14 Moore | 6.02 | 14 Vogel | 6.02 | 14 Carcetti | 6.03 | 19 ...
To return a moving average that is not also a running (cumulative) average, the window should specify ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
:
=> SELECT employee_last_name AS "last_name", hourly_rate, COUNT(*)
OVER (ORDER BY hourly_rate ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
)
AS moving_count from employee_dimension;