UDTF Class Overview

You create your UDTF by subclassing two classes defined by the Vertica SDK: TransformFunction and TransformFunctionFactory.

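The TransformFunctionFactory performs two roles: it provides Vertica with metadata about your UDTF (the number and data types of its parameters and return values), and it instantiates your TransformFunction subclass.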

TransformFunction

The TransformFunction class is where you perform the data processing, transforming input rows into output rows. Your subclass must define the processPartition() method. It may also define methods to set up and tear down the function.

Performing the Transformation

The processPartition() method carries out all of the processing that you want your UDTF to perform. When a user calls your function in a SQL statement, Vertica bundles together the data from the function parameters and passes it to processPartition().

The input and output of the processPartition() method are supplied by objects of the PartitionReader and PartitionWriter classes. They define methods that you use to read the input data and write the output data for your UDTF.

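Your processPartition() method should follow this basic pattern: read the input values from the current row of the PartitionReader, process them, write the resulting column values to the PartitionWriter, call the writer's next() to emit each output row, and call the reader's next() to advance to the following input row until no rows remain.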
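
As a rough illustration of such a read-process-write loop, the sketch below uses two hypothetical stand-in classes, MockReader and MockWriter, instead of the SDK's own reader and writer. Their getInt(), setInt(), and next() methods are simplified assumptions made for this example and are not the SDK signatures; the point is only the shape of the loop, in which one input row may produce several output rows.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the SDK's reader and writer objects.
// They only imitate row-cursor behavior; they are NOT the Vertica classes.
using Row = std::vector<long>;

class MockReader {
public:
    explicit MockReader(std::vector<Row> rows) : rows_(std::move(rows)) {
        assert(!rows_.empty());  // this sketch assumes at least one input row
    }
    long getInt(std::size_t col) const { return rows_[pos_].at(col); }
    // Advance to the next row; returns false once the rows are exhausted.
    bool next() { return ++pos_ < rows_.size(); }

private:
    std::vector<Row> rows_;
    std::size_t pos_ = 0;
};

class MockWriter {
public:
    explicit MockWriter(std::size_t numCols) : current_(numCols, 0) {}
    void setInt(std::size_t col, long value) { current_.at(col) = value; }
    // Commit the current output row and start a fresh one.
    void next() {
        rows_.push_back(current_);
        current_.assign(current_.size(), 0);
    }
    const std::vector<Row>& rows() const { return rows_; }

private:
    Row current_;
    std::vector<Row> rows_;
};

// Emits two output rows per input row: the value and its double.
void processPartition(MockReader& input, MockWriter& output) {
    do {
        const long value = input.getInt(0);  // read the input column
        output.setInt(0, value);             // write the first output row
        output.next();
        output.setInt(0, value * 2);         // write a second output row
        output.next();
    } while (input.next());                  // advance to the next input row
}
```

With input rows 1 and 5, this sketch writes four output rows: 1, 2, 5, and 10.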

Note: In some cases, you may want to determine the number and types of parameters using PartitionReader's getNumCols() and getTypeMetaData() functions, instead of just hard-coding the data types of the columns in the input row. This is useful if you want your TransformFunction to be able to process input tables with different schemas. You can then use different TransformFunctionFactory classes to define multiple function signatures that call the same TransformFunction class. See Handling Different Numbers and Types of Arguments for more information.
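
To make the idea of schema-independent processing concrete, here is a minimal sketch that discovers the column count at run time instead of hard-coding it. MockReader is again a hypothetical stand-in; only the names getNumCols() and next() echo the SDK, and getInt() and sumEachRow() are invented for this example. Type inspection with getTypeMetaData() is left out to keep the sketch small.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Row = std::vector<long>;

// Hypothetical reader stand-in, not the Vertica class.
class MockReader {
public:
    explicit MockReader(std::vector<Row> rows) : rows_(std::move(rows)) {
        assert(!rows_.empty());  // this sketch assumes at least one input row
    }
    // Number of columns in each input row.
    std::size_t getNumCols() const { return rows_.front().size(); }
    long getInt(std::size_t col) const { return rows_[pos_].at(col); }
    // Advance to the next row; returns false once the rows are exhausted.
    bool next() { return ++pos_ < rows_.size(); }

private:
    std::vector<Row> rows_;
    std::size_t pos_ = 0;
};

// Sums every column of each row, whatever the column count happens to be.
std::vector<long> sumEachRow(MockReader& input) {
    const std::size_t numCols = input.getNumCols();  // discovered at run time
    std::vector<long> sums;
    do {
        long total = 0;
        for (std::size_t col = 0; col < numCols; ++col) {
            total += input.getInt(col);
        }
        sums.push_back(total);
    } while (input.next());
    return sums;
}
```

The same function body handles a two-column input and a three-column input without change, which is what allows several function signatures to share one processing class.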

Setting Up and Tearing Down

The TransformFunction class defines two additional methods that you can optionally implement to allocate and free resources: setup() and destroy(). You should use these methods to allocate and deallocate resources that you do not allocate through the UDx API (see Allocating Resources for UDxs for details).
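
The pairing of setup() and destroy() can be sketched as follows. ScratchBufferTransform is a hypothetical class written for this example; the real methods take SDK arguments that are omitted here, and sumOfSquares() merely stands in for the work that processPartition() would do. The buffer comes from malloc(), representing a resource obtained outside the UDx API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a TransformFunction subclass that owns a
// resource it allocated itself.
class ScratchBufferTransform {
public:
    // Acquire the resource before any processing happens.
    void setup(std::size_t capacity) {
        buffer_ = static_cast<long*>(std::malloc(capacity * sizeof(long)));
        capacity_ = (buffer_ != nullptr) ? capacity : 0;
    }

    // Stand-in for the processing step: uses the buffer acquired in setup().
    long sumOfSquares(const long* values, std::size_t count) {
        assert(buffer_ != nullptr && count <= capacity_);
        long total = 0;
        for (std::size_t i = 0; i < count; ++i) {
            buffer_[i] = values[i] * values[i];
            total += buffer_[i];
        }
        return total;
    }

    // Release whatever setup() acquired; safe to call more than once.
    void destroy() {
        std::free(buffer_);
        buffer_ = nullptr;
        capacity_ = 0;
    }

    bool hasBuffer() const { return buffer_ != nullptr; }

private:
    long* buffer_ = nullptr;
    std::size_t capacity_ = 0;
};
```

Keeping acquisition and release in this matched pair, rather than in the processing method, means the resource is obtained once and freed once no matter how much data is processed in between.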

TransformFunctionFactory

The TransformFunctionFactory class provides Vertica with metadata about your UDTF: the number of parameters and their data types, as well as the data types of its return values. It also instantiates your subclass of TransformFunction.

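You must implement the following methods in your TransformFunctionFactory: getPrototype(), which defines the data types of the function's parameters and output columns; createTransformFunction(), which instantiates your TransformFunction subclass; and getReturnType(), which supplies details about the output columns, such as their names and the widths of variable-length types.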

Note: The getReturnType() function is required for UDTFs. It is optional for UDxs that return single values, such as User-Defined Scalar Functions.
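
The factory's two roles, describing the function and instantiating it, might look like the following sketch. The method names echo the SDK's getPrototype(), getReturnType(), and createTransformFunction(), but the signatures are simplified assumptions: the real methods take additional SDK arguments and use SDK type containers. ColumnTypes, SizedColumn, TokenizeFunction, and TokenizeFactory are all hypothetical names defined only for this example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for SDK type containers.
struct ColumnTypes {
    std::vector<std::string> names;
    void addVarchar() { names.push_back("VARCHAR"); }
};

struct SizedColumn {
    std::string name;
    std::string type;
    int length;
};

// Stand-in for a TransformFunction subclass.
class TokenizeFunction {
public:
    std::string name() const { return "TokenizeFunction"; }
};

class TokenizeFactory {
public:
    // Metadata: one VARCHAR argument in, one VARCHAR column out.
    void getPrototype(ColumnTypes& argTypes, ColumnTypes& returnTypes) const {
        argTypes.addVarchar();
        returnTypes.addVarchar();
    }

    // Metadata: name the output column and size it from the input column.
    void getReturnType(const std::vector<SizedColumn>& inputTypes,
                       std::vector<SizedColumn>& outputTypes) const {
        assert(!inputTypes.empty());  // this sketch expects one input column
        outputTypes.push_back(SizedColumn{"token", "VARCHAR", inputTypes[0].length});
    }

    // Instantiation: hand back a new processing object.
    std::unique_ptr<TokenizeFunction> createTransformFunction() const {
        return std::unique_ptr<TokenizeFunction>(new TokenizeFunction());
    }
};
```

In this sketch the output column simply inherits the width of the input column, since a token can never be longer than the string it came from.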

See Also

Creating Multi-Phase UDTFs