Scheduler Tool Options
The vkconfig script's scheduler tool lets you configure schedulers that continuously load data from Kafka into Vertica. Use the scheduler tool to create, update, or delete a scheduler, which is identified by its config-schema. If you do not specify a scheduler, commands apply to the default stream_config scheduler.
Syntax
vkconfig scheduler {--create | --read | --update | --drop} other_options...
Option | Description |
---|---|
--create |
Creates a new scheduler. Cannot be used with --read, --update, or --drop. |
--read |
Outputs the current settings of the scheduler in JSON format. Cannot be used with --create, --update, or --drop. |
--update |
Updates an existing scheduler. Cannot be used with --create, --read, or --drop. |
--drop |
Drops the scheduler's schema. Dropping its schema deletes the scheduler. After you drop the scheduler's schema, you cannot recover it. |
--add-operator user_name |
Grants a Vertica user account or role access to use and alter the scheduler. Requires the --update shared utility option. |
--auto-sync {TRUE|FALSE} |
When TRUE, Vertica automatically synchronizes scheduler source information at the interval specified in --config-refresh. For details about what the scheduler synchronizes at each interval, see the "Validating Schedulers" and "Synchronizing Schedulers" sections in Automatically Copying Data From Kafka. Default Value: TRUE |
--config-refresh HH:MM:SS |
The interval of time that the scheduler runs before synchronizing its settings and updating its cached metadata (such as changes made by using the --update option). Default Value: 00:05:00 |
--consumer-group-id id_name |
The name of the Kafka consumer group to which Vertica reports its progress consuming messages. Set this value to an empty string ('') to disable progress reports to a Kafka consumer group. See Monitoring Vertica Message Consumption with Consumer Groups for more information. Default Value: vertica_database-name |
--dump |
When you use this option along with the |
--eof-timeout-ms number of milliseconds |
If a COPY command does not receive any messages within the eof-timeout-ms interval, Vertica responds by ending that COPY statement. See Manually Copying Data From Kafka for more information. Default Value: 1 second |
--fix-config |
Repairs the configuration and re-creates any missing tables. Valid only with the --update shared configuration option. |
--frame-duration HH:MM:SS |
The length of time that each individual frame lasts with this scheduler. The scheduler must have enough time to run each microbatch (each of which executes a COPY statement). You can approximate the average available time per microbatch using the following equation: TimePerMicrobatch = (FrameDuration * Parallelism) / Microbatches. This is only a rough estimate, because many factors affect how long each microbatch is able to run. The vkconfig utility warns you if the time allocated per microbatch is below 2 seconds. You should usually allocate more than 2 seconds per microbatch so that the scheduler can load all of the data in the data stream. Default Value: 00:05:00. In versions of Vertica earlier than 10.0, the default frame duration was 10 seconds. In version 10.0, the default was increased to 5 minutes, in part to compensate for the removal of WOS. If you created your scheduler with the default frame duration in a version earlier than 10.0, the frame duration is not updated to the new default value. In this case, consider adjusting the frame duration manually. See Choosing a Frame Duration for more information. |
--message_max_bytes max_message_size |
Specifies the maximum size, in bytes, of a Kafka protocol batch message. Default Value: 25165824. You may need to manually update this value if you created a scheduler using Vertica 9.1.0 or earlier. The meaning of Kafka's max.message.bytes setting changed between version 0.10 and 0.11. See Changes to the message.max.bytes Setting in Kafka Version 0.11 and Later for more information. |
--new-source-policy {FAIR|START|END} |
Determines how Vertica allocates resources to the newly added source. Valid Values:
Default Value: FAIR |
--operator username |
Allows the dbadmin to grant privileges to a previously created Vertica user or role. This option gives the specified user all privileges on the scheduler instance and EXECUTE privileges on the libkafka library and all its UDxs. Granting operator privileges gives the user the right to read data from any source in any cluster that can be reached from the Vertica node. The dbadmin must separately grant the user write privileges on the target tables. Requires the --update shared utility option. To revoke privileges, use the --remove-operator option. |
--remove-operator user_name |
Removes access to the scheduler from a Vertica user account. Requires the --update shared utility option. |
--resource-pool pool_name |
The resource pool to be used by all queries executed by this scheduler. You must create this pool in advance. Default Value: GENERAL pool. When it uses the GENERAL pool, the scheduler can use only one-fourth of that pool's PLANNEDCONCURRENCY. |
--upgrade |
Upgrades the existing scheduler and configuration schema to the current Vertica version. The upgraded version of the scheduler is not backwards compatible with earlier versions. To upgrade a scheduler to an alternate schema, use the --upgrade-to-schema option. See Updating Schedulers After Vertica Upgrades for more information. |
--upgrade-to-schema schema name |
Copies the scheduler's schema to a new schema specified by schema name and then upgrades it to be compatible with the current version of Vertica. Vertica does not alter the old schema. Requires the --upgrade option. |
--validation-type {ERROR|WARN|SKIP} |
Specifies the level of validation performed on the scheduler. Invalid SQL syntax and other errors can cause invalid microbatches. Vertica supports the following validation types:
Default Value: ERROR For more information on validation, refer to Automatically Copying Data From Kafka. Renamed from |
See Common vkconfig Script Options for options that are available in all of the vkconfig tools.
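The per-microbatch estimate given for --frame-duration can be worked through with a few lines of shell before you choose a value. The following is a minimal sketch, not part of vkconfig: the function names hms_to_seconds and time_per_microbatch are hypothetical helpers, and the parallelism and microbatch counts are made-up inputs that you would replace with your own scheduler's values.

```shell
# Convert an HH:MM:SS interval (the format --frame-duration takes) to seconds.
# awk treats "05" as decimal 5, so leading zeros are not read as octal.
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Estimate the seconds available per microbatch:
#   TimePerMicrobatch = (FrameDuration * Parallelism) / Microbatches
# Arguments: frame duration (HH:MM:SS), parallelism, number of microbatches.
time_per_microbatch() {
    frame_seconds=$(hms_to_seconds "$1")
    awk -v f="$frame_seconds" -v p="$2" -v m="$3" \
        'BEGIN { printf "%.2f\n", (f * p) / m }'
}

# Hypothetical workload: the default 5-minute frame, parallelism of 2,
# and 40 microbatches.
estimate=$(time_per_microbatch 00:05:00 2 40)
echo "Estimated seconds per microbatch: $estimate"

# Flag estimates below the 2-second threshold at which vkconfig warns.
if awk -v t="$estimate" 'BEGIN { exit !(t < 2) }'; then
    echo "Warning: less than 2 seconds per microbatch"
fi
```

With these assumed inputs the estimate is 15 seconds per microbatch, comfortably above the 2-second threshold. As the option description notes, this is only a rough figure, so treat it as a starting point rather than a guarantee.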
Examples
These examples show how you can use the scheduler tool options.
Give a user, Jim, privileges on the stream_config scheduler. Use the --config-schema option to specify that you are editing the stream_config scheduler:
$ /opt/vertica/packages/kafka/bin/vkconfig scheduler --update --config-schema stream_config --add-operator Jim
Edit the default stream_config scheduler so that every microbatch waits for data for one second before ending:
$ /opt/vertica/packages/kafka/bin/vkconfig scheduler --update --eof-timeout-ms 1000
Upgrade the scheduler named iot_scheduler_8.1 to a new scheduler named iot_scheduler_9.0 that is compatible with the current version of Vertica:
$ /opt/vertica/packages/kafka/bin/vkconfig scheduler --upgrade --config-schema iot_scheduler_8.1 \
    --upgrade-to-schema iot_scheduler_9.0
Drop the schema scheduler219a:
$ /opt/vertica/packages/kafka/bin/vkconfig scheduler --drop --config-schema scheduler219a --username dbadmin
Read the current settings of the options you can set using the scheduler tool for the scheduler defined in weblog.conf:
$ vkconfig scheduler --read --conf weblog.conf
{"version":"v9.2.0", "frame_duration":"00:00:10", "resource_pool":"weblog_pool", "config_refresh":"00:05:00", "new_source_policy":"FAIR", "pushback_policy":"LINEAR", "pushback_max_count":5, "auto_sync":true, "consumer_group_id":null}
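Because --read prints JSON, a script can pick individual settings out of its output. The sketch below assumes the single-line output format shown in the --read example above and uses a copy of that sample rather than a live call. The helper json_string_field is a hypothetical name, and its sed pattern is deliberately naive: it handles only quoted string values with no escaped quotes. A dedicated JSON parser is the safer choice for anything beyond a quick check.

```shell
# Sample output copied from the --read example; in practice it would come from:
#   vkconfig scheduler --read --conf weblog.conf
read_output='{"version":"v9.2.0", "frame_duration":"00:00:10", "resource_pool":"weblog_pool", "config_refresh":"00:05:00", "new_source_policy":"FAIR", "pushback_policy":"LINEAR", "pushback_max_count":5, "auto_sync":true, "consumer_group_id":null}'

# Print the value of a quoted string field from single-line JSON.
# Arguments: field name, JSON text.
# Prints nothing for fields whose values are not quoted strings
# (numbers, booleans, null).
json_string_field() {
    printf '%s\n' "$2" | sed -n 's/.*"'"$1"'":"\([^"]*\)".*/\1/p'
}

frame_duration=$(json_string_field frame_duration "$read_output")
resource_pool=$(json_string_field resource_pool "$read_output")
echo "frame_duration=$frame_duration resource_pool=$resource_pool"
```

For the sample output, this yields a frame duration of 00:00:10 and a resource pool of weblog_pool.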