How Vertica and Data Streaming Work Together
Vertica provides a high-performance loading mechanism for streaming data from a third-party message bus into your Vertica database.
Data streaming provides high volumes of data with low latency. Vertica can process this data by running many COPY statements, each of which loads small amounts of data into your Vertica database. However, this process can become complex. Instead of writing complicated ETL processes and dispatching COPY statements manually, you can use the data streaming integration feature to automatically load data as it streams through your message bus.
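To illustrate why dispatching COPY statements manually can become complex, the following minimal Python sketch groups incoming messages into small batches and pairs each batch with a COPY statement. It is not part of the Vertica integration. The table name `public.events`, the comma-delimited message format, and the helper names `micro_batches` and `build_copy_jobs` are all assumptions made for this example.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def micro_batches(messages: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most batch_size messages."""
    iterator = iter(messages)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def build_copy_jobs(
    messages: Iterable[str], table: str, batch_size: int
) -> List[Tuple[str, str]]:
    """Pair one COPY statement with the newline-joined payload of each batch.

    The statement text is illustrative; a real loader would also need
    error handling, retries, and tracking of what was already loaded.
    """
    statement = f"COPY {table} FROM STDIN DELIMITER ','"
    return [
        (statement, "\n".join(batch) + "\n")
        for batch in micro_batches(messages, batch_size)
    ]


# Hypothetical comma-delimited messages read from a message bus.
jobs = build_copy_jobs(["1,a", "2,b", "3,c", "4,d", "5,e"], "public.events", 2)
for statement, payload in jobs:
    print(statement, repr(payload))
```

Even this small sketch leaves out scheduling, failure recovery, and duplicate avoidance, which is the work the data streaming integration is meant to take over.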
The integration between Vertica and data streaming consists of the following components:
- A UDL library that loads data from a message bus into Vertica
- A job scheduler that uses the UDL library to continuously consume data from your message bus with exactly-once semantics
- Push-based notifiers that send monitoring messages from Vertica to Kafka
- A KafkaExport function that sends Vertica data to Kafka and returns any rows that it was unable to send
Loading data automatically in this way helps keep latency low and minimizes the impact on other database processes.
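The exactly-once semantics mentioned for the job scheduler can be illustrated with a general technique: store the consumed offset in the same transaction as the loaded rows, so that a replayed batch is recognized and skipped. The sketch below uses Python's standard `sqlite3` module as a stand-in database. It is an assumption-laden illustration of the idea, not a description of how the Vertica scheduler is implemented, and the table names `events` and `progress` and the function `load_batch` are hypothetical.

```python
import sqlite3
from typing import List


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a data table and an offset table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (payload TEXT)")
    conn.execute(
        "CREATE TABLE progress ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), next_offset INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO progress VALUES (1, 0)")
    conn.commit()
    return conn


def load_batch(conn: sqlite3.Connection, start_offset: int, messages: List[str]) -> int:
    """Load messages whose first element has offset start_offset.

    Rows and the new offset are written in one transaction, so either both
    are committed or neither is. Returns the number of rows newly loaded.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        (next_offset,) = conn.execute(
            "SELECT next_offset FROM progress WHERE id = 1"
        ).fetchone()
        if start_offset > next_offset:
            raise ValueError("batch starts after the next expected offset")
        skip = next_offset - start_offset  # messages that were already loaded
        fresh = messages[skip:]
        conn.executemany("INSERT INTO events VALUES (?)", [(m,) for m in fresh])
        conn.execute(
            "UPDATE progress SET next_offset = ? WHERE id = 1",
            (next_offset + len(fresh),),
        )
    return len(fresh)


conn = make_db()
first = load_batch(conn, 0, ["a", "b", "c"])    # loads three new rows
replay = load_batch(conn, 0, ["a", "b", "c"])   # same batch again: nothing new
overlap = load_batch(conn, 2, ["c", "d"])       # only "d" is new
rows = [r[0] for r in conn.execute("SELECT payload FROM events ORDER BY rowid")]
print(first, replay, overlap, rows)
```

Because the offset and the rows share one transaction, a failure partway through a batch leaves neither behind, and a retry of the same batch cannot produce duplicates.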