CREATE HCATALOG SCHEMA
Define a schema for data stored in a Hive data warehouse using the HCatalog Connector. For more information, see Using the HCatalog Connector in Integrating with Apache Hadoop.
Most of the optional parameters are read out of Hadoop configuration files if available. See the Notes section for further information.
Syntax
CREATE HCATALOG SCHEMA [IF NOT EXISTS] schemaName [AUTHORIZATION user-id] [WITH [HOSTNAME='metastore-host'] [PORT='metastore-port'] [HIVESERVER2_HOSTNAME='hivserver-host'] [WEBSERVICE_HOSTNAME='webHCat-host'] [WEBSERVICE_PORT='webHCat-port'] [WEBHDFS_ADDRESS='webhdfs-address'] [HCATALOG_SCHEMA='hive-schema-name'] [HCATALOG_USER='hcat-username'] [HCATALOG_CONNECTION_TIMEOUT='timeout'] [HCATALOG_SLOW_TRANSFER_LIMIT='xfer-limit'] [HCATALOG_SLOW_TRANSFER_TIME='xfer-time'] ]
Parameters
Parameter | Description | Default Value |
---|---|---|
[IF NOT EXISTS] |
If given, the statement exits without an error when the schema named in schemaName already exists. |
N/A |
schemaName |
The name of the schema to create in the Vertica catalog. The tables in the Hive database will be available through this schema. |
none |
AUTHORIZATION 'user-id' |
The name of a Vertica account to own the schema being created. This parameter is ignored if Kerberos authentication is being used; in that case the current vsql user is used. | current username |
HOSTNAME=
'metastore-host' |
The hostname or IP address of the database server that stores the Hive data warehouse's metastore information. If this value is not specified, hive-site.xml must be available. | none |
PORT=
'metastore-port' |
The port number on which the metastore database is running. | See Notes |
HIVESERVER2_HOSTNAME=
'hiveserver2-host' |
The hostname or IP address of the HiveServer2 service. This parameter is optional if in hive-site.xml you set one of the following properties:
This parameter is ignored if you are using WebHCat. |
none |
WEBSERVICE_HOSTNAME=
'webHCat-host' |
The hostname or IP address of the WebHCat service, if using WebHCat instead of HiveServer2. If this value is not specified, webhcat-site.xml must be available. | none |
WEBSERVICE_PORT=
'webHCat-port' |
The port number on which the WebHCat service is running, if using WebHCat instead of HiveServer2. If this value is not specified, webhcat-site.xml must be available. | See Notes |
WEBHDFS_ADDRESS=
'webhdfs-address' |
The host and port ("host:port") for the WebHDFS service. This parameter is used only for reading ORC and Parquet files. If this value is not set, hdfs-site.xml must be available to read these file types through the HCatalog Connector. | none |
HCATALOG_SCHEMA=
'hive-schema-name' |
The name of the Hive schema or database that the Vertica schema is being mapped to. | schemaName |
HCATALOG_USER=
'hcat-username' |
The username of the HCatalog user to use when making calls to the HiveServer2 or WebHCat server. | current username |
HCATALOG_CONNECTION_TIMEOUT=
'timeout' |
The number of seconds the HCatalog Connector waits for a successful connection to the HiveServer or WebHCat server. A value of 0 means wait indefinitely. | See Notes |
HCATALOG_SLOW_TRANSFER_LIMIT=
'xfer-limit' |
The lowest data transfer rate (in bytes per second) from the HiveServer2 or WebHCat server that the HCatalog Connector accepts. See xfer-time for details. | See Notes |
HCATALOG_SLOW_TRANSFER_TIME=
'xfer-time' |
The number of seconds the HCatalog Connector waits before enforcing the data transfer rate lower limit. After this time has passed, the HCatalog Connector tests whether the data transfer rate is at least as fast as the value set in xfer-limit. If it is not, then the HCatalog Connector breaks the connection and terminates the query. | See Notes |
Notes
The default values for timeout, xfer-limit, and xfer-time are set by the configuration parameters HCatConnectionTimeout, HCatSlowTransferLimit, and HCatSlowTransferTime. See Hadoop Parameters in the Administrator's Guide for more information.
If you are using Kerberos authentication, the current user name is always used and the AUTHORIZATION parameter is ignored.
When using HiveServer2 (the default), use HIVESERVER2_HOSTNAME to specify the server host. When using WebHCat, use WEBSERVICE_HOSTNAME to specify the server host.
If you copied the Hadoop configuration files as described in Configuring Vertica for HCatalog, you can omit most parameters. By default this statement uses the values specified in those configuration files, though you can still override them with this statement. If the configuration files are complete, the following is a valid statement:
=> CREATE HCATALOG SCHEMA hcat;
If a value is not specified in the configuration files and a default is shown in the parameter list, then that default value is used.
Privileges
The user must be a superuser or be granted all permissions on the database to use CREATE HCATALOG SCHEMA.
Example
The following example shows how to use CREATE HCATALOG SCHEMA to define a new schema for tables stored in a Hive database and then query the system tables that contain information about those tables:
=> CREATE HCATALOG SCHEMA hcat WITH HOSTNAME='hcathost' HCATALOG_SCHEMA='default' HIVESERVER2_HOSTNAME='hs.example.com' HCATALOG_USER='admin'; WARNING 0: HOSTNAME will be ignored. hive-site.xml must be contain property [hive.metastore.uris], or both HOSTNAME and PORT must be specified CREATE SCHEMA => \x Expanded display is on. => SELECT * FROM v_catalog.hcatalog_schemata; -[ RECORD 1 ]----------------+------------------------------------------- schema_id | 45035996273748224 schema_name | hcat schema_owner_id | 45035996273704962 schema_owner | admin create_time | 2017-05-22 12:04:59.692155-04 hostname | hcathost port | -1 hiveserver2_hostname | hs.example.com webservice_hostname | webservice_port | 50111 webhdfs_address | hs.example.com:50070 hcatalog_schema_name | default hcatalog_user_name | admin hcatalog_connection_timeout | -1 hcatalog_slow_transfer_limit | -1 hcatalog_slow_transfer_time | -1 => SELECT * FROM v_catalog.hcatalog_table_list; -[ RECORD 1 ]------+------------------ table_schema_id | 45035996273748224 table_schema | hcat hcatalog_schema | default table_name | nation hcatalog_user_name | admin -[ RECORD 2 ]------+------------------ table_schema_id | 45035996273748224 table_schema | hcat hcatalog_schema | default table_name | raw hcatalog_user_name | admin -[ RECORD 3 ]------+------------------ table_schema_id | 45035996273748224 table_schema | hcat hcatalog_schema | default table_name | raw_rcfile hcatalog_user_name | admin -[ RECORD 4 ]------+------------------ table_schema_id | 45035996273748224 table_schema | hcat hcatalog_schema | default table_name | raw_sequence hcatalog_user_name | admin