Analyze Statistics at the Schema Level (Part 2): Quick Tip

Posted June 19, 2018 by Phil Molea, Sr. Information Developer, Vertica

Rear view of two partially unrecognizable men sitting in front of a computer. One of them is pointing at a screen where are several lines of computer code.
Jim Knicely authored this tip. The ANALYZE_STATISTICS function only accepts a table/projection/column name as input. In yesterday’s Vertica Quick Tip we learned how to get Vertica to generate and execute ANALYZE_STATISTICS SQL statements, one for each table in a given schema. It was an okay solution, but not very convenient. A better option would be to create our own function (external procedure) that does all of the work for us. Example: Here is the code for the procedure (a simple BASH script): dbadmin=> \! cat /home/dbadmin/sfdc/procedures/analyze_schema_statistics.sh #!/bin/bash # Simple external procedure to run analyze_statistics function at the schema level # This procedure accepts one parameter (a schema name) source ~/.bashrc /opt/vertica/bin/vsql -Atc “SELECT ‘SELECT analyze_statistics(”’ || table_schema || ‘.’ || table_name || ”’);’ FROM v_catalog.tables WHERE table_schema = ‘$1’;” | /opt/vertica/bin/vsql -e > /home/dbadmin/analyze_schema_statistics.log 2>&1 exit 0 First, I have to create the procedure in Vertica: dbadmin=> CREATE PROCEDURE analyze_schema_statistics (schema_name VARCHAR) AS 'analyze_schema_statistics.sh' LANGUAGE 'external' USER 'dbadmin'; CREATE PROCEDURE Next I will find a schema to run it on: dbadmin=> \dt test_stats.* List of tables Schema | Name | Kind | Owner | Comment ------------+------+-------+---------+--------- test_stats | t1 | table | dbadmin | test_stats | t2 | table | dbadmin | test_stats | t3 | table | dbadmin | (3 rows) Note that there are three tables in the TEST_STATS schema and all their projections have no statistics: dbadmin=> SELECT projection_name, anchor_table_name, has_statistics FROM projections WHERE projection_schema = 'test_stats'; projection_name | anchor_table_name | has_statistics -----------------+-------------------+---------------- t1_super | t1 | f t2_super | t2 | f t3_super | t3 | f (3 rows) Next I will run our new procedure: dbadmin=> SELECT analyze_schema_statistics('test_stats'); analyze_schema_statistics --------------------------- 0 (1 row) Now all of the projections have statistics! dbadmin=> SELECT projection_name, anchor_table_name, has_statistics FROM projections WHERE projection_schema = ‘test_stats’; projection_name | anchor_table_name | has_statistics —————–+——————-+—————- t1_super | t1 | t t2_super | t2 | t t3_super | t3 | t (3 rows) Oh, the procedure also wrote out to a log file. Here are the contents of that log file: dbadmin=> \! cat /home/dbadmin/analyze_schema_statistics.log SELECT analyze_statistics('test_stats.t1'); analyze_statistics -------------------- 0 (1 row) SELECT analyze_statistics(‘test_stats.t2’); analyze_statistics ——————– 0 (1 row) SELECT analyze_statistics(‘test_stats.t3’); analyze_statistics ——————– 0 (1 row) Have Fun!