Viewing Hive Schema and Table Metadata
When using Hive, you access metadata about schemas and tables by executing statements written in HiveQL (Hive's version of SQL) such as SHOW TABLES
. When using the HCatalog Connector, you can get metadata about the tables in the Hive database through several Vertica system tables.
There are four system tables that contain metadata about the tables accessible through the HCatalog Connector:
- HCATALOG_SCHEMATA lists all of the schemas that have been defined using the HCatalog Connector.
- HCATALOG_TABLE_LIST contains an overview of all of the tables available from all schemas defined using the HCatalog Connector. This table only shows the tables that the user querying the table can access. The information in this table is retrieved using a single call to HiveServer2 for each schema defined using the HCatalog Connector, which means there is a little overhead when querying this table.
- HCATALOG_TABLES contains more in-depth information than HCATALOG_TABLE_LIST.
- HCATALOG_COLUMNS lists metadata about all of the columns in all of the tables available through the HCatalog Connector. As for HCATALOG_TABLES, querying this table results in one call to HiveServer2 per table, and therefore can take a while to complete.
The following example demonstrates querying the system tables containing metadata for the tables available through the HCatalog Connector.
=> CREATE HCATALOG SCHEMA hcat WITH hostname='hcathost' -> HCATALOG_SCHEMA='default' HCATALOG_DB='default' HCATALOG_USER='hcatuser'; CREATE SCHEMA => SELECT * FROM HCATALOG_SCHEMATA; -[ RECORD 1 ]--------+----------------------------- schema_id | 45035996273864536 schema_name | hcat schema_owner_id | 45035996273704962 schema_owner | dbadmin create_time | 2013-11-05 10:19:54.70965-05 hostname | hcathost port | 9083 webservice_hostname | hcathost webservice_port | 50111 hcatalog_schema_name | default hcatalog_user_name | hcatuser metastore_db_name | hivemetastoredb => SELECT * FROM HCATALOG_TABLE_LIST; -[ RECORD 1 ]------+------------------ table_schema_id | 45035996273864536 table_schema | hcat hcatalog_schema | default table_name | hcatalogtypes hcatalog_user_name | hcatuser -[ RECORD 2 ]------+------------------ table_schema_id | 45035996273864536 table_schema | hcat hcatalog_schema | default table_name | tweets hcatalog_user_name | hcatuser -[ RECORD 3 ]------+------------------ table_schema_id | 45035996273864536 table_schema | hcat hcatalog_schema | default table_name | messages hcatalog_user_name | hcatuser -[ RECORD 4 ]------+------------------ table_schema_id | 45035996273864536 table_schema | hcat hcatalog_schema | default table_name | msgjson hcatalog_user_name | hcatuser => -- Get detailed description of a specific table => SELECT * FROM HCATALOG_TABLES WHERE table_name = 'msgjson'; -[ RECORD 1 ]---------+----------------------------------------------------------- table_schema_id | 45035996273864536 table_schema | hcat hcatalog_schema | default table_name | msgjson hcatalog_user_name | hcatuser min_file_size_bytes | total_number_files | 10 location | hdfs://hive.example.com:8020/user/exampleuser/msgjson last_update_time | output_format | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat last_access_time | max_file_size_bytes | is_partitioned | f partition_expression | table_owner | input_format | org.apache.hadoop.mapred.TextInputFormat total_file_size_bytes | 453534 hcatalog_group | permission | => -- Get list of columns in a specific table => SELECT * FROM HCATALOG_COLUMNS WHERE table_name = 'hcatalogtypes' -> ORDER BY ordinal_position; -[ RECORD 1 ]------------+----------------- table_schema | hcat hcatalog_schema | default table_name | hcatalogtypes is_partition_column | f column_name | intcol hcatalog_data_type | int data_type | int data_type_id | 6 data_type_length | 8 character_maximum_length | numeric_precision | numeric_scale | datetime_precision | interval_precision | ordinal_position | 1 -[ RECORD 2 ]------------+----------------- table_schema | hcat hcatalog_schema | default table_name | hcatalogtypes is_partition_column | f column_name | floatcol hcatalog_data_type | float data_type | float data_type_id | 7 data_type_length | 8 character_maximum_length | numeric_precision | numeric_scale | datetime_precision | interval_precision | ordinal_position | 2 -[ RECORD 3 ]------------+----------------- table_schema | hcat hcatalog_schema | default table_name | hcatalogtypes is_partition_column | f column_name | doublecol hcatalog_data_type | double data_type | float data_type_id | 7 data_type_length | 8 character_maximum_length | numeric_precision | numeric_scale | datetime_precision | interval_precision | ordinal_position | 3 -[ RECORD 4 ]------------+----------------- table_schema | hcat hcatalog_schema | default table_name | hcatalogtypes is_partition_column | f column_name | charcol hcatalog_data_type | string data_type | varchar(65000) data_type_id | 9 data_type_length | 65000 character_maximum_length | 65000 numeric_precision | numeric_scale | datetime_precision | interval_precision | ordinal_position | 4 -[ RECORD 5 ]------------+----------------- table_schema | hcat hcatalog_schema | default table_name | hcatalogtypes is_partition_column | f column_name | varcharcol hcatalog_data_type | string data_type | varchar(65000) data_type_id | 9 data_type_length | 65000 character_maximum_length | 65000 numeric_precision | numeric_scale | datetime_precision | interval_precision | ordinal_position | 5 -[ RECORD 6 ]------------+----------------- table_schema | hcat hcatalog_schema | default table_name | hcatalogtypes is_partition_column | f column_name | boolcol hcatalog_data_type | boolean data_type | boolean data_type_id | 5 data_type_length | 1 character_maximum_length | numeric_precision | numeric_scale | datetime_precision | interval_precision | ordinal_position | 6 -[ RECORD 7 ]------------+----------------- table_schema | hcat hcatalog_schema | default table_name | hcatalogtypes is_partition_column | f column_name | timestampcol hcatalog_data_type | string data_type | varchar(65000) data_type_id | 9 data_type_length | 65000 character_maximum_length | 65000 numeric_precision | numeric_scale | datetime_precision | interval_precision | ordinal_position | 7 -[ RECORD 8 ]------------+----------------- table_schema | hcat hcatalog_schema | default table_name | hcatalogtypes is_partition_column | f column_name | varbincol hcatalog_data_type | binary data_type | varbinary(65000) data_type_id | 17 data_type_length | 65000 character_maximum_length | 65000 numeric_precision | numeric_scale | datetime_precision | interval_precision | ordinal_position | 8 -[ RECORD 9 ]------------+----------------- table_schema | hcat hcatalog_schema | default table_name | hcatalogtypes is_partition_column | f column_name | bincol hcatalog_data_type | binary data_type | varbinary(65000) data_type_id | 17 data_type_length | 65000 character_maximum_length | 65000 numeric_precision | numeric_scale | datetime_precision | interval_precision | ordinal_position | 9