Microbatch Utility Options
A microbatch represents an individual segment of a data load from a Kafka stream. It combines the cluster, source, target, and load spec definitions that you create using the other vkconfig utility options. The scheduler uses all of the information in the microbatch to execute COPY statements that use the KafkaSource function to transfer data from Kafka to Vertica. The result of each microbatch load is stored in the stream_microbatch_history table.
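After the scheduler has run, you can review the outcome of each microbatch load with an ordinary query. The following is a minimal sketch that assumes the scheduler was created with the default configuration schema name, stream_config; substitute your scheduler's schema name if it differs:

# Review recent microbatch results. stream_config is the assumed
# default scheduler configuration schema; adjust as needed.
vsql -c "SELECT * FROM stream_config.stream_microbatch_history;"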
Option | Description |
---|---|
--microbatch name | A unique, case-insensitive name for the microbatch. |
--new-microbatch updated_name | The updated name for the microbatch. Requires the --update shared utility option. |
--load-spec loadspec_name | The load spec to use while processing this microbatch. |
--target-schema schema_name | The existing Vertica target schema associated with this microbatch. |
--rejection-schema schema_name | The existing Vertica schema that contains a table for storing rejected messages. |
--target-columns column_expression | A column expression for the target table, where column_expression can be a comma-separated list of columns or a complete expression. See COPY Parameters in the core documentation for a description of column expressions. |
--rejection-table table_name | The existing Vertica table that stores rejected messages. |
--enabled TRUE\|FALSE | When TRUE, allows the microbatch to execute. |
--target-table table_name | The name of a Vertica table corresponding to the target. This table must belong to the target schema. |
--add-source source_name | The name of a source to assign to this microbatch. You can use this parameter once per command. You can also use it with --update to add sources to a microbatch. Requires --add-source-cluster. |
--add-source-cluster cluster_name | The name of a cluster to assign to this microbatch. You can use this parameter once per command. You can also use it with --update to add sources to a microbatch. You can only add sources from the same cluster to a single microbatch. Requires --add-source. |
--remove-source source_name | The name of a source to remove from this microbatch. You can use this parameter once per command. You can also use it with --update to remove multiple sources from a microbatch. Requires --remove-source-cluster. |
--remove-source-cluster cluster_name | The name of a cluster to remove from this microbatch. You can use this parameter once per command. Requires --remove-source. |
--offset partition_1_offset[,partition_2_offset,...] | The offset of the message in the source where the microbatch starts its load. If you use this parameter, you must supply an offset value for each partition in the source, or for each partition you list in the --partition option. You can use this option to skip some messages in the source or to reload previously read messages. Important: You cannot set an offset for a microbatch while the scheduler is running; if you attempt to do so, the vkconfig utility returns an error. Use the shutdown utility to shut the scheduler down before setting an offset for a microbatch. See the example following this table. |
--partition partition_1[,partition_2,...] | One or more partitions to which the offsets given in the --offset option apply. If you supply this option, the offset values apply only to the partitions you specify. Requires the --offset option. |
--source source_name | The name of the source to which the offsets in the --offset option apply. Required when the microbatch defines more than one source or when the --cluster parameter is given. Requires the --offset option. |
--cluster cluster_name | The name of the cluster to which the --offset option applies. Required only if the microbatch defines more than one cluster or if the --source parameter is supplied. Requires the --offset option. |
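As noted in the --offset description, the scheduler must be shut down before you reposition a microbatch. The following sketch reuses the illustrative names from the example in the next section and assumes the source SourceFeed has two partitions, 0 and 1; the offset values are illustrative as well:

# Shut the scheduler down first; offsets cannot be set while it runs.
/opt/vertica/packages/kafka/bin/vkconfig shutdown

# Restart mbatch1 at offset 500 in partition 0 and offset 600 in partition 1.
/opt/vertica/packages/kafka/bin/vkconfig microbatch --update --microbatch mbatch1 \
    --offset 500,600 \
    --partition 0,1 \
    --source SourceFeed \
    --cluster StreamCluster1

The --source and --cluster options are shown for completeness; they are only required when the microbatch defines more than one source or cluster.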
Examples
This example creates a microbatch named mbatch1 that identifies the target schema, target table, load spec, and source for the load:
/opt/vertica/packages/kafka/bin/vkconfig microbatch --create --microbatch mbatch1 \
    --target-schema public \
    --target-table BatchTarget \
    --load-spec Filterspec \
    --add-source SourceFeed \
    --add-source-cluster StreamCluster1
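You can later use --update with --add-source and --add-source-cluster to attach another source to the same microbatch. A sketch, assuming a second source named SourceFeed2 (an illustrative name) has already been defined in StreamCluster1:

# Attach an additional source from the same cluster to mbatch1.
/opt/vertica/packages/kafka/bin/vkconfig microbatch --update --microbatch mbatch1 \
    --add-source SourceFeed2 \
    --add-source-cluster StreamCluster1

Because all of a microbatch's sources must come from the same cluster, the cluster name matches the one used when the microbatch was created.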