Scheduler Utility Options

The scheduler is a tool that continuously loads data from Kafka into Vertica. Use the scheduler utility to create, update, or delete a scheduler, which is identified by its --config-schema value. If you do not specify a scheduler, utility commands apply to the default stream_config scheduler.

Option Description
--operator username

Allows the dbadmin to grant privileges to a previously created Vertica user.

This option gives the specified user all privileges on the scheduler instance and EXECUTE privileges on the libkafka library and all its UDxs.

Granting operator privileges gives the user the right to read data from any source in any cluster that can be reached from the Vertica node.

The dbadmin must separately grant the user write privileges on the target tables.

Requires the --create shared utility option.

To revoke privileges, use the --remove option with the --operator option.

--drop schema_name Drops the specified schema. After you drop the configuration information, you cannot recover it.
--add-operator user_name Adds a user account that operates the specified configuration. Requires the --update shared utility option.
--remove-operator user_name Deletes a user account that operates the specified configuration. Requires the --update shared utility option.
--upgrade Upgrades the existing scheduler and configuration schema to the current version. The upgraded version of the scheduler is not backwards compatible with earlier versions. To upgrade an alternate schema, use the --upgrade-to-schema option.
--upgrade-to-schema schema_name Specifies a configuration schema to use with the upgraded scheduler. Requires the --upgrade scheduler utility option. If you do not include this option with an upgrade, Vertica upgrades the current schema.
--frame-duration HH:MM:SS

The length of time that each frame lasts for this scheduler. Vertica must have enough time to complete COPY tasks within this duration. You can calculate the average available time per COPY using the following equation:

TimePerCopy=(FrameDuration*Parallelism)/Microbatches

Vertica requires at least 100 milliseconds per COPY to function. You can increase the available time per COPY by increasing your frame duration.
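
As a rough illustration of the equation above, the following shell sketch converts a frame duration to milliseconds and computes the average time per COPY. The function name time_per_copy_ms and the sample workload (2 parallel COPY statements, 50 microbatches) are assumptions made for this example. They are not part of vkconfig.

```shell
# Hypothetical helper: average time available per COPY, in milliseconds.
# Usage: time_per_copy_ms HH:MM:SS parallelism microbatches
time_per_copy_ms() (
  frame=$1; parallelism=$2; microbatches=$3
  # Split HH:MM:SS on colons. The function body runs in a subshell,
  # so the IFS change does not leak into the caller.
  IFS=:
  set -- $frame
  # Strip one leading zero so that values such as 08 are not read as octal.
  h=${1#0}; m=${2#0}; s=${3#0}
  frame_ms=$(( (${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0}) * 1000 ))
  # TimePerCopy = (FrameDuration * Parallelism) / Microbatches
  per_copy=$(( $frame_ms * $parallelism / $microbatches ))
  # Warn on stderr when the result is under the 100 ms minimum.
  if [ "$per_copy" -lt 100 ]; then
    echo "warning: ${per_copy} ms per COPY is below the 100 ms minimum" >&2
  fi
  echo "$per_copy"
)

# Assumed workload: default 10-second frame, 2 parallel COPYs, 50 microbatches.
time_per_copy_ms 00:00:10 2 50   # prints 400
```

If the result falls below 100, a longer --frame-duration raises it proportionally.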

Default Value:

00:00:10

--config-refresh HH:MM:SS

The interval of time that the scheduler runs before synchronizing its settings and updating its cached metadata (such as changes made by using the --update option). Renamed from --config-refresh-interval.

Default Value:

00:05:00

--resource-pool pool_name

The resource pool to be used by all queries executed by this scheduler. You must create this pool in advance if you are not using the default pool.

Default Value:

stream_default_pool

--new-source-policy FAIR|START|END

Determines how Vertica allocates resources to a newly added source.

Valid Values:

  • FAIR: Takes the average length of time from the previous batches and schedules the new source accordingly.
  • START: All new sources start at the beginning of the frame. The batch receives the minimal amount of time to run.
  • END: All new sources start at the end of the frame. The batch receives the maximum amount of time to run.

Default Value:

FAIR

--eof-timeout-ms number of milliseconds

If a COPY command does not receive any messages within the eof-timeout-ms interval, Vertica responds by ending that COPY statement.

Default Value:

1000 (1 second)

See Using COPY with Data Streaming in this guide for more information.

--message_max_bytes max_message_size

The maximum message size, in bytes.

Default Value:

1048576

--fix-config Repairs the configuration and re-creates any missing tables. Valid only with the --update shared utility option.
--validation-type ERROR|WARN|SKIP

Specifies the level of validation performed on the scheduler. Invalid SQL syntax and other errors can cause invalid micro-batches. Vertica supports the following validation types:

  • ERROR - Cancel configuration or creation if validation fails. If you do not specify a validation type, this value is the default.
  • WARN - Proceed with task if validation fails, but display a warning.
  • SKIP - Perform no validation.

For more information on validation, refer to Data Streaming Job Scheduler.

Renamed from --skip-validation.

--auto-sync TRUE|FALSE

When TRUE, Vertica automatically synchronizes scheduler source information at the interval specified in --config-refresh.

Default Value:

TRUE

For more information on synchronization, refer to Data Streaming Job Scheduler.

Examples

These examples show how you can use the scheduler utility options.

Give a user, Jim, privileges on the StreamConfig scheduler. Specify that you are making edits to the StreamConfig scheduler with the --config-schema option:

/opt/vertica/packages/kafka/bin/vkconfig scheduler --update --config-schema StreamConfig --add-operator Jim

Edit the default stream_config scheduler so that every COPY statement waits for data for one second before ending:

/opt/vertica/packages/kafka/bin/vkconfig scheduler --update --eof-timeout-ms 1000

Upgrade the scheduler and streaming schema to the current version:

/opt/vertica/packages/kafka/bin/vkconfig scheduler --upgrade --config-schema old_config --upgrade-to-schema new_schema

Drop the schema scheduler219a, running the command as the user release:

/opt/vertica/packages/kafka/bin/vkconfig scheduler --drop --config-schema scheduler219a --username release
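
Lengthen the frame duration and the configuration refresh interval of the default stream_config scheduler. This is a minimal sketch that assumes --frame-duration and --config-refresh can be combined with --update in one call, as --eof-timeout-ms is in the earlier example. The tune_scheduler function and the chosen values are hypothetical. The function only prints the command so that you can review it first:

```shell
# Path taken from the examples above.
VKCONFIG=/opt/vertica/packages/kafka/bin/vkconfig

# Dry run: print the command for review. Remove the leading "echo" to execute it.
# Arguments: frame duration (HH:MM:SS), config refresh interval (HH:MM:SS).
tune_scheduler() {
  echo "$VKCONFIG" scheduler --update \
    --frame-duration "$1" \
    --config-refresh "$2"
}

# Assumed values: 30-second frames, settings refreshed every 10 minutes.
tune_scheduler 00:00:30 00:10:00
```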