
So we’ll need to join the TABLES, PROJECTION_STORAGE, and DELETE_VECTORS system tables together and handle the row counts a little differently by projection type. (Note that we need the TABLES system table because you can have a table with no projections.)
Example:
dbadmin=> SELECT table_schema,
dbadmin-> SUM(row_count) AS row_count
dbadmin-> FROM (SELECT table_schema,
dbadmin(> NVL(CASE
dbadmin(> WHEN NOT is_segmented THEN (row_count / node_cnt)::INT
dbadmin(> ELSE row_count
dbadmin(> END, 0) row_count
dbadmin(> FROM (SELECT t.table_schema,
dbadmin(> t.table_name,
dbadmin(> ps.projection_name,
dbadmin(> (SELECT MIN(is_segmented)
dbadmin(> FROM projections p
dbadmin(> WHERE p.projection_name = ps.projection_name
dbadmin(> AND p.projection_schema = ps.anchor_table_schema) is_segmented,
dbadmin(> node_cnt,
dbadmin(> SUM(ps.row_count - NVL(dv.deleted_row_count, 0)) row_count
dbadmin(> FROM tables t
dbadmin(> LEFT JOIN projection_storage ps
dbadmin(> ON t.table_id = ps.anchor_table_id
dbadmin(> LEFT JOIN delete_vectors dv
dbadmin(> ON dv.node_name = ps.node_name
dbadmin(> AND dv.schema_name = ps.anchor_table_schema
dbadmin(> AND dv.projection_name = ps.projection_name
dbadmin(> CROSS JOIN (SELECT COUNT(1) node_cnt FROM nodes WHERE node_type = 'PERMANENT') n
dbadmin(> GROUP BY 1, 2, 3, 4, 5
dbadmin(> LIMIT 1 OVER(PARTITION BY t.table_schema, t.table_name ORDER BY 1)) foo) foo2
dbadmin-> GROUP BY table_schema
dbadmin-> ORDER BY table_schema;
table_schema | row_count
---------------+-----------
ctg_analytics | 0
dds | 39759
idea_dwh | 12288
jim1 | 16
public | 71007066
stlouis | 77396606
test | 8
(7 rows)
But I’m not comfortable with this solution. What happens when Vertica inevitably introduces improved projection types? We may have to modify the query to handle those.
I prefer to run actual row counts from the tables. To do that, we can generate a SQL statement that literally runs a SELECT COUNT(1) from each table in the database, summing them by schema:
dbadmin=> \! vsql -Atc "SELECT 'SELECT table_schema, SUM(table_row_count) total_row_count FROM ('UNION ALL SELECT 'SELECT ''' || table_schema || ''' AS table_schema, (SELECT COUNT(1) FROM ' || table_schema || '.' || table_name || ') AS table_row_count' || DECODE(LEAD(table_name) OVER (ORDER BY 1), NULL, ') foo GROUP BY table_schema ORDER BY table_schema;', ' UNION ALL') FROM tables WHERE table_definition = '';" | vsql
table_schema | total_row_count
---------------+-----------------
ctg_analytics | 0
dds | 39759
idea_dwh | 12288
jim1 | 16
public | 71007066
stlouis | 77396606
test | 8
(7 rows)
This works because Vertica is pretty darn quick at performing table row counts!
Helpful Links:
https://www.vertica.com/docs/latest/HTML/Content/Authoring/SQLReferenceManual/SystemTables/CATALOG/TABLES.htm https://www.vertica.com/docs/latest/HTML/Content/Authoring/SQLReferenceManual/SystemTables/MONITOR/PROJECTION_STORAGE.htm https://www.vertica.com/docs/latest/HTML/Content/Authoring/SQLReferenceManual/SystemTables/MONITOR/DELETE_VECTORS.htm https://www.vertica.com/docs/latest/HTML/Content/Authoring/SQLReferenceManual/SystemTables/CATALOG/NODES.htm
Have fun!