AUDIT

Returns the raw data size (in bytes) of a database, schema, or table as it is counted in an audit of the database size. Unless you specify zero error tolerance and 100 percent confidence level, AUDIT returns only approximate results that can vary over multiple iterations.

AUDIT estimates the size for data in Vertica tables using the same data sampling method as Vertica uses, to determine if a database complies with the licensed database size allowance. Vertica does not use these results to determine whether the size of the database complies with the Vertica license's data allowance. For details, see Auditing Database Size in the Administrator's Guide.

For data stored in external tables based on ORC or Parquet format, AUDIT uses the total size of the data files. This value is never estimated—it is read from the file system storing the ORC or Parquet files (either the Vertica node's local file system, S3, or HDFS).

Syntax

AUDIT('[[[database.]schema.]scope ]'[, 'granularity'] [, error‑tolerance[, confidence‑level]] )

Parameters

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

scope

Specifies the extent of the audit:

  • Empty string ('') audits the entire database.
  • The name of the schema or table to audit.

The schema or table to audit. To audit the database, set this parameter to an empty string.

granularity

The level at which the audit reports its results, one of the following strings:

  • database
  • schema
  • table

The level of granularity must be equal to or less than the granularity of scope. If you omit this parameter, granularity is set to the same level as scope. Thus, if online_sales is a schema, the following statements are identical:

AUDIT('online_sales', 'schema');
AUDIT('online_sales');

If AUDIT sets granularity to a level lower than the target object, it returns with a message that refers you to system table USER_AUDITS. For details, see Querying V_CATALOG.USER_AUDITS, below.

error‑tolerance

Specifies the percentage margin of error allowed in the audit estimate. Enter the tolerance value as a decimal number, between 0 and 100. The default value is 5, for a 5% margin of error.

This argument has no effect on audits of external tables based on ORC or Parquet files. Audits of these tables always returns the actual size of the underlying data files.

Setting this value to 0 results in a full database audit, which is very resource intensive, as AUDIT analyzes the entire database. A full database audit significantly impacts performance, so Vertica does not recommend it for a production database.

Due to the iterative sampling that the auditing process uses, setting the error tolerance to a small fraction of a percent (for example, 0.00001) can cause AUDIT to run for a longer period than a full database audit. The lower you specify this value, the more resources the audit uses, as it performs more data sampling.

confidence‑level

Specifies the statistical confidence level percentage of the estimate. Enter the confidence value as a decimal number, between 0 and 100. The default value is 99, indicating a confidence level of 99%.

This argument has no effect on audits of external tables based on ORC or Parquet files. Audits of these tables always returns the actual size of the underlying data files.

The higher the confidence value, the more resources the function uses, as it performs more data sampling. Setting this value to 100 results in a full audit of the database, which is very resource intensive, as the function analyzes all of the database. A full database audit significantly impacts performance, so Vertica does not recommend it for a production database.

Privileges

Superuser, or the following privileges:

  • SELECT privilege on the target tables
  • USAGE privilege on the target schemas

If you audit a schema or the database, Vertica only returns the size of all objects that you have privileges to access within the audited object, as described above.

Querying V_CATALOG.USER_AUDITS

If AUDIT sets granularity to a level lower than the target object, it returns with a message that refers you to system table USER_AUDITS. To obtain audit data on objects of the specified granularity, query this table. For example, the following query seeks to audit all tables in the store schema:

=> SELECT AUDIT('store', 'table');
                           AUDIT
-----------------------------------------------------------
 See table sizes in v_catalog.user_audits for schema store
(1 row)

The next query queries USER_AUDITS and obtains the latest audits on those tables:

=> SELECT object_name, AVG(size_bytes)::int size_bytes, MAX(audit_start_timestamp::date) audit_start 
      FROM user_audits WHERE object_schema='store' 
      GROUP BY rollup(object_name) HAVING GROUPING_ID(object_name) < 1 ORDER BY GROUPING_ID();
    object_name    | size_bytes | audit_start
-------------------+------------+-------------
 store_dimension   |      22067 | 2017-10-26
 store_orders_fact |   27201312 | 2017-10-26
 store_sales_fact  |  301260170 | 2017-10-26
(3 rows)

Examples

See Auditing Database Size.