Viewing License Compliance for Hadoop File Formats
You can use the EXTERNAL_TABLE_DETAILS system table to gather information about your use of Hadoop file formats. This information can help you assess your compliance with your Vertica license.
Vertica computes the values in this table at query time, so to avoid performance problems, restrict your queries to filter by table_schema, table_name, or source_format. These three columns are the only columns you can use in a predicate, but you may use all of the usual predicate operators.
=> SELECT * FROM EXTERNAL_TABLE_DETAILS WHERE source_format = 'PARQUET' OR source_format = 'ORC'; -[ RECORD 1 ]---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- schema_oid | 45035996273704978 table_schema | public table_oid | 45035996273760390 table_name | ORC_demo source_format | ORC total_file_count | 5 total_file_size_bytes | 789 source_statement | COPY FROM 'ORC_demo/*' ORC file_access_error | -[ RECORD 2 ]---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- schema_oid | 45035196277204374 table_schema | public table_oid | 45035996274460352 table_name | Parquet_demo source_format | PARQUET total_file_count | 3 total_file_size_bytes | 498 source_statement | COPY FROM 'Parquet_demo/*' PARQUET file_access_error |
When computing the size of an external table, Vertica counts all data found in the location specified by the COPY FROM clause. If you have a directory that contains ORC and delimited files, for example, and you define your external table with "COPY FROM *" instead of "COPY FROM *.orc", this table includes the size of the delimited files. (You would probably also encounter errors when querying that external table.) When you query this table Vertica does not validate your table definition; it just uses the path to find files to report.