Vertica Quick Tip: Default Size of the NUMBER Data Type

Posted February 5, 2018 by Soniya Shah, Information Developer

This blog post was authored by Jim Knicely.

When creating a table where you do not define a precision for a NUMBER column data type, Vertica will use a default precision of 38 digits. Often this is larger than necessary. By specifying NUMBER(37) you will potentially get better query performance and save on storage. Why? It’s all about the data type size.

Example:

Below are two tables each having a column named MY_BIG_INT with varying NUMBER data types. dbadmin=> \d big_int_38_table List of Fields by Tables Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key --------+------------------+------------+---------------+------+---------+----------+-------------+------------- public | big_int_38_table | my_big_int | numeric(38,0) | 24 | | f | f | (1 row) dbadmin=> \d big_int_37_table List of Fields by Tables Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key --------+------------------+------------+---------------+------+---------+----------+-------------+------------- public | big_int_37_table | my_big_int | numeric(37,0) | 16 | | f | f | (1 row) Note that as the precision of the data type decreases, so does the data type size:

NUMBER(38) = 24-byte data type
NUMBER(37) = 16-byte data type

Quick performance comparison: dbadmin=> SELECT COUNT(*) FROM big_int_38_table WHERE my_big_int * 1.1 > 50; COUNT ----------- 550009846 (1 row) Time: First fetch (1 row): 9012.937 ms. All rows formatted: 9012.972 ms dbadmin=> SELECT COUNT(*) FROM big_int_37_table WHERE my_big_int * 1.1 > 50; COUNT ----------- 550009846 (1 row) Time: First fetch (1 row): 5269.690 ms. All rows formatted: 5269.732 ms Quick disk usage comparison: dbadmin=> SELECT (SELECT used_bytes dbadmin(> FROM projection_storage dbadmin(> WHERE anchor_table_name = 'big_int_38_table') / dbadmin-> (SELECT used_bytes dbadmin(> FROM projection_storage dbadmin(> WHERE anchor_table_name = 'big_int_37_table') "38_is_x_times_larger_than_37"; 38_is_x_times_larger_than_37 ------------------------------ 1.462834075093070594 (1 row) Time: First fetch (1 row): 79.050 ms. All rows formatted: 79.110 ms Have Fun!