Get the Row Count from an Outer Table Join

Posted October 22, 2019 by James Knicely, Vertica Field Chief Technologist

Construction site crane building a blue SQL 3D text.

Aggregate functions summarize data over groups of rows from a query result set. When using an aggregate function like COUNT with an “*” or “1” parameter value, you may get a different result when the query implements a LEFT join verses an INNER join.

Example:

dbadmin=> SELECT * FROM t1 ORDER BY 1;
 c
---
 1
 2
 4
(3 rows)

dbadmin=> SELECT * FROM t2 ORDER BY 1;
 c
---
 1
 2
(2 rows)

dbadmin=> SELECT COUNT(*)
dbadmin->   INNER FROM t1
dbadmin->   JOIN t2 ON t1.c = t2.c;
 COUNT
-------
     2
(1 row)

dbadmin=> SELECT COUNT(*)
dbadmin->   FROM t1
dbadmin->   LEFT JOIN t2 ON t1.c = t2.c;
 COUNT
-------
     3
(1 row)

If you want a row count of just the rows from the left outer table that join to the inner table, you have to specify a column name from the left outer table in the COUNT function.

dbadmin=> SELECT COUNT(t2.c) t2_count
dbadmin->   FROM t1
dbadmin->   LEFT JOIN t2 ON t1.c = t2.c;
 t2_count
----------
        2
(1 row)

Have fun!

Helpful links:
https://www.vertica.com/docs/9.3.x/HTML/Content/Authoring/SQLReferenceManual/Functions/Aggregate/AggregateFunctions.htm
https://www.vertica.com/docs/9.3.x/HTML/Content/Authoring/SQLReferenceManual/Functions/Aggregate/COUNTAggregate.htm
https://www.vertica.com/docs/9.3.x/HTML/Content/Authoring/AnalyzingData/Queries/Joins/OuterJoins.htm