A parser takes a stream of bytes and passes a corresponding sequence of tuples to the Vertica load process. You can use User-Defined Parser functions to parse:
- Data in formats not understood by the Vertica built-in parser.
- Data that requires more specific control than the built-in parser supplies.
For example, you could load a CSV file using a specific CSV library. See the Vertica SDK for two CSV examples.
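As an illustration of the bytes-in, tuples-out contract described above, here is a minimal sketch in plain Python. It does not use the Vertica SDK: the function name `parse_csv_stream` is hypothetical, and the sketch assumes newline-terminated UTF-8 records with no line breaks inside quoted fields. It delegates field splitting to Python's standard `csv` module, much as a real parser might delegate to a specific CSV library.

```python
import csv
from typing import Iterable, Iterator, Tuple


def parse_csv_stream(chunks: Iterable[bytes],
                     encoding: str = "utf-8") -> Iterator[Tuple[str, ...]]:
    """Yield one tuple per CSV record found in a stream of byte chunks.

    Hypothetical helper for illustration only; not part of the Vertica SDK.
    Assumes records end with a newline and contain no embedded newlines.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Only complete lines are safe to parse; keep any trailing partial line.
        complete, sep, rest = buffer.rpartition(b"\n")
        if not sep:
            continue  # no newline seen yet, wait for more bytes
        buffer = rest
        for row in csv.reader(complete.decode(encoding).split("\n")):
            if row:  # skip blank lines
                yield tuple(row)
    if buffer:
        # The stream ended without a final newline; parse what is left.
        for row in csv.reader(buffer.decode(encoding).split("\n")):
            if row:
                yield tuple(row)


if __name__ == "__main__":
    # Records may be split across chunk boundaries at arbitrary byte offsets.
    for record in parse_csv_stream([b"1,al", b"ice\n2,bob\n3,", b"carol"]):
        print(record)
```

The buffer carries any partial record over to the next chunk, so the caller can hand over bytes in whatever sizes the source produces.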
COPY supports a single User-Defined Parser that you can use with a UDSource and zero or more instances of
Sometimes you can improve the performance of your parser by adding a chunker. A chunker divides up the input and uses multiple threads to parse it. See Cooperative Parse. Chunkers are available only in the C++ API.
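The idea behind a chunker can be sketched outside Vertica as follows. This is not the C++ chunker API: the names `split_on_record_boundaries`, `parse_chunk`, and `parse_in_parallel` are hypothetical, and the sketch assumes newline-terminated UTF-8 CSV records with no line breaks inside quoted fields. It cuts the input only at record boundaries, so each chunk can be parsed independently, and then parses the chunks on a thread pool. It shows the structure only and makes no claim about speed.

```python
import csv
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def split_on_record_boundaries(data: bytes, target_size: int) -> List[bytes]:
    """Cut data into chunks of at least target_size bytes, each ending at a
    newline (the last chunk may be shorter or lack a final newline)."""
    if target_size < 1:
        raise ValueError("target_size must be positive")
    chunks = []
    start = 0
    while start < len(data):
        # Look for the first newline at or after the target cut point.
        newline = data.find(b"\n", start + target_size - 1)
        end = len(data) if newline == -1 else newline + 1
        chunks.append(data[start:end])
        start = end
    return chunks


def parse_chunk(chunk: bytes) -> List[Tuple[str, ...]]:
    """Parse one self-contained chunk of CSV records into tuples."""
    lines = chunk.decode("utf-8").split("\n")
    return [tuple(row) for row in csv.reader(lines) if row]


def parse_in_parallel(data: bytes, target_size: int = 1 << 20,
                      workers: int = 4) -> List[Tuple[str, ...]]:
    """Chunk the input, parse the chunks on a thread pool, and return the
    rows in their original input order."""
    chunks = split_on_record_boundaries(data, target_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in chunk order, so row order is preserved.
        results = pool.map(parse_chunk, chunks)
        return [row for rows in results for row in rows]


if __name__ == "__main__":
    print(parse_in_parallel(b"1,a\n2,b\n3,c\n4,d\n", target_size=5, workers=2))
```

The key design point is that the chunker, not the parser, is responsible for finding safe cut points. Once every chunk begins and ends on a record boundary, the parsing threads need no coordination with each other.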
Under special circumstances you can further improve performance by using apportioned load, an approach where multiple Vertica nodes parse the input. See Apportioned Load.
If you implement a UDParser, you must also implement a corresponding