Scheduler Tool Options
The vkconfig script's scheduler tool lets you configure schedulers that continuously load data from Kafka into Vertica. Use the scheduler tool to create, update, or delete a scheduler, defined by config-schema. If you do not specify a scheduler, commands apply to the default stream_config scheduler.
Syntax
vkconfig scheduler {--create | --read | --update | --drop} other_options...
Option | Description |
---|---|
--create |
Creates a new scheduler. Cannot be used with |
--read |
Outputs the current setting of the scheduler in JSON format. Cannot be used with |
--update |
Updates an existing scheduler. Cannot be used with |
--drop |
Drops the scheduler's schema. Dropping its schema deletes the scheduler. After you drop the scheduler's schema, you cannot recover it. |
--add-operator user_name |
Grants a Vertica user account or role access to use and alter the scheduler. Requires the --update shared utility option. |
--auto-sync {TRUE|FALSE} |
When TRUE, Vertica automatically synchronizes scheduler source information at the interval specified in Default Value: TRUE For more information on synchronization, refer to Automatically Copying Data From Kafka. |
--config-refresh HH:MM:SS |
The interval of time that the scheduler runs before synchronizing its settings and updating its cached metadata (such as changes made by using the Default Value: 00:05:00 |
--consumer-group-id id_name |
The name of the Kafka consumer group to which Vertica reports its progress consuming messages. By default, Vertica reports its progress to a group named vertica_database-name. See Monitoring Vertica Message Consumption with Consumer Groups for more information. Set this value to an empty string ('') to disable progress reports to a Kafka consumer group. |
--dump |
When you use this option along with the |
--eof-timeout-ms number of milliseconds |
If a COPY command does not receive any messages within the eof-timeout-ms interval, Vertica responds by ending that COPY statement. Default Value: 1000 (1 second) See Manually Copying Data From Kafka for more information. |
--fix-config |
Repairs the configuration and re-creates any missing tables. Valid only with the --update shared configuration option. |
--frame-duration HH:MM:SS |
The length of time that each frame lasts for this scheduler. Vertica must have enough time to complete COPY tasks within this duration. You can approximate the average available time per COPY using the following equation: TimePerCopy = (FrameDuration * Parallelism) / Microbatches. This is only a rough estimate, because many factors affect how long each COPY statement is able to run. Vertica requires at least 100 milliseconds per COPY to function. You can increase the available time per COPY by increasing your frame duration. Default Value: 00:00:10 |
--message_max_bytes max_message_size |
Specifies the maximum size, in bytes, of a Kafka protocol batch message. Default Value: 25165824 You may need to manually update this value if you created a scheduler using Vertica 9.1.0 or earlier. The meaning of Kafka's message.max.bytes setting changed between version 0.10 and 0.11. See Changes to the message.max.bytes Setting in Kafka Version 0.11 and Later for more information. |
--new-source-policy {FAIR|START|END} |
Determines how Vertica allocates resources to the newly added source. Valid Values:
Default Value: FAIR |
--operator username |
Allows the dbadmin to grant privileges to a previously created Vertica user or role. This option gives the specified user all privileges on the scheduler instance and EXECUTE privileges on the libkafka library and all its UDxs. Granting operator privileges gives the user the right to read data off any source in any cluster that can be reached from the Vertica node. The dbadmin must grant the user separate permission for them to have write privileges on the target tables. Requires the To revoke privileges, use the |
--remove-operator user_name |
Removes access to the scheduler from a Vertica user account. Requires the --update shared utility option. |
--resource-pool pool_name |
The resource pool to be used by all queries executed by this scheduler. You must create this pool in advance if you are not using the default pool. Default Value: stream_default_pool |
--upgrade |
Upgrades the existing scheduler and configuration schema to the current Vertica version. The upgraded version of the scheduler is not backwards compatible with earlier versions. To upgrade a scheduler to an alternate schema, use the --upgrade-to-schema option. See Updating Schedulers After Vertica Upgrades for more information. |
--upgrade-to-schema schema_name |
Copies the scheduler's schema to a new schema specified by schema_name and then upgrades it to be compatible with the current version of Vertica. Vertica does not alter the old schema. Requires the |
--validation-type {ERROR|WARN|SKIP} |
Specifies the level of validation performed on the scheduler. Invalid SQL syntax and other errors can cause invalid microbatches. Vertica supports the following validation types:
For more information on validation, refer to Automatically Copying Data From Kafka. Renamed from |
See Common vkconfig Script Options for options that are available in all of the vkconfig tools.
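The TimePerCopy estimate given for --frame-duration can be worked through with a few lines of shell arithmetic. This is a minimal sketch: the helper names hms_to_ms and time_per_copy_ms are hypothetical, the workload figures (parallelism of 2, 8 microbatches) are made up for illustration, and the result is only the rough estimate the equation describes.

```shell
# Convert an HH:MM:SS interval to milliseconds.
# Assumes two-digit fields, as in the option defaults (for example 00:00:10).
hms_to_ms() {
    old_ifs=$IFS; IFS=:
    set -- $1                      # split the argument on ':'
    IFS=$old_ifs
    # Strip one leading zero so values such as "08" are not read as octal.
    h=${1#0}; m=${2#0}; s=${3#0}
    echo $(( (${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0}) * 1000 ))
}

# TimePerCopy = (FrameDuration * Parallelism) / Microbatches, in milliseconds.
# Arguments: frame duration (HH:MM:SS), parallelism, number of microbatches.
time_per_copy_ms() {
    frame_ms=$(hms_to_ms "$1")
    echo $(( frame_ms * $2 / $3 ))
}

# Hypothetical workload: default 10-second frame, parallelism 2, 8 microbatches.
estimate=$(time_per_copy_ms 00:00:10 2 8)
echo "Approximate time per COPY: ${estimate} ms"

# Vertica requires at least 100 milliseconds per COPY.
if [ "$estimate" -lt 100 ]; then
    echo "Below the 100 ms minimum; consider a longer --frame-duration."
fi
```

With these figures the estimate is 2500 ms per COPY, well above the 100 ms minimum. Raising the microbatch count or shortening the frame lowers the estimate proportionally.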
Examples
These examples show how you can use the scheduler utility options.
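A scheduler has to be created before it can be updated, so a creation command may be a useful starting point. The following is a sketch only: the schema name web_scheduler and the pool name web_pool are hypothetical, the pool is assumed to exist already, and connection settings (see Common vkconfig Script Options) are omitted. The script prints the command for review instead of running it.

```shell
# Hypothetical names; substitute your own schema and resource pool.
schema="web_scheduler"
pool="web_pool"          # must be created in advance (see --resource-pool)
vkconfig="/opt/vertica/packages/kafka/bin/vkconfig"

# Assemble the command from options described in the table above.
cmd="$vkconfig scheduler --create --config-schema $schema --frame-duration 00:00:30 --resource-pool $pool"

# Print the command so it can be checked before it is run by hand.
echo "$cmd"
```

Printing first makes it easy to confirm the option values before the scheduler's schema is actually created.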
Give a user, Jim, privileges on the stream_config scheduler. Specify that you are making edits to the stream_config scheduler with the --config-schema option:
$ /opt/vertica/packages/kafka/bin/vkconfig scheduler --update --config-schema stream_config --add-operator Jim
Edit the default stream_config scheduler so that every microbatch waits for data for one second before ending:
$ /opt/vertica/packages/kafka/bin/vkconfig scheduler --update --eof-timeout-ms 1000
Upgrade the scheduler named iot_scheduler_8.1 to a new scheduler named iot_scheduler_9.0 that is compatible with the current version of Vertica:
$ /opt/vertica/packages/kafka/bin/vkconfig scheduler --upgrade --config-schema iot_scheduler_8.1 \
    --upgrade-to-schema iot_scheduler_9.0
Drop the schema scheduler219a:
$ /opt/vertica/packages/kafka/bin/vkconfig scheduler --drop --config-schema scheduler219a --username dbadmin
Read the current settings of the options that you can set using the scheduler tool, for the scheduler defined in weblog.conf:
$ vkconfig scheduler --read --conf weblog.conf
{"version":"v9.2.0", "frame_duration":"00:00:10", "resource_pool":"weblog_pool", "config_refresh":"00:05:00", "new_source_policy":"FAIR", "pushback_policy":"LINEAR", "pushback_max_count":5, "auto_sync":true, "consumer_group_id":null}
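Because --read emits JSON, a script can pull out a single setting. The sketch below uses sed on the sample output shown above; the helper name get_setting is hypothetical, it assumes the JSON arrives on one line, and it handles only string-valued keys. A real JSON parser would be more robust.

```shell
# Sample output copied from the --read example above.
read_output='{"version":"v9.2.0", "frame_duration":"00:00:10", "resource_pool":"weblog_pool", "config_refresh":"00:05:00", "new_source_policy":"FAIR", "pushback_policy":"LINEAR", "pushback_max_count":5, "auto_sync":true, "consumer_group_id":null}'

# Print the value of a string-valued key from single-line JSON.
# Arguments: key name, JSON text.
get_setting() {
    printf '%s\n' "$2" | sed -n "s/.*\"$1\":\"\([^\"]*\)\".*/\1/p"
}

get_setting frame_duration "$read_output"
get_setting resource_pool "$read_output"
```

Against a live scheduler, the JSON text would come from the vkconfig scheduler --read command rather than from a hard-coded string.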