UDAF Class Overview
You create your UDAF by subclassing two classes defined by the Vertica SDK: AggregateFunction and AggregateFunctionFactory.
AggregateFunction
The AggregateFunction class performs the aggregation. It computes values on each database node where relevant data is stored and then combines the results from the nodes. You must implement the following methods:
initAggregate() - Initializes the class, defines variables, and sets the starting value for the variables. This function must be idempotent.
aggregate() - The main aggregation operation, executed on each node.
combine() - If multiple invocations of aggregate() are needed, Vertica calls combine() to combine all the sub-aggregations into a final aggregation. Although this method might not be called, you must define it.
terminate() - Terminates the function and returns the result as a column.
Important: The aggregate() function might not operate on the complete input set all at once. For this reason, initAggregate() must be idempotent: calling it more than once must always produce the same starting state.
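The way the four required methods fit together can be illustrated with a standalone sketch of an average aggregate. This is not SDK code: AvgState, AverageSketch, and averageInTwoBatches are hypothetical names, and the method signatures are simplified stand-ins that do not match the real AggregateFunction signatures. It assumes the input arrives in two batches, each aggregated into its own intermediate state before the states are combined.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the intermediate state of one sub-aggregation.
// Not an SDK type; the real SDK supplies its own reader and writer objects.
struct AvgState {
    double sum;
    long long count;
};

// Hypothetical class that mirrors the four required methods with
// simplified signatures (an assumption made for illustration only).
class AverageSketch {
public:
    // Sets the starting values. Running it again on the same state yields
    // the same result, which is what idempotence requires.
    void initAggregate(AvgState &s) const {
        s.sum = 0.0;
        s.count = 0;
    }

    // Folds one batch of input values into the given state.
    void aggregate(const std::vector<double> &batch, AvgState &s) const {
        for (double v : batch) {
            s.sum += v;
            ++s.count;
        }
    }

    // Merges the states of other sub-aggregations into s.
    void combine(AvgState &s, const std::vector<AvgState> &others) const {
        for (const AvgState &o : others) {
            s.sum += o.sum;
            s.count += o.count;
        }
    }

    // Produces the final value. Returning 0.0 for empty input is a
    // simplification chosen for this sketch.
    double terminate(const AvgState &s) const {
        return s.count == 0 ? 0.0 : s.sum / static_cast<double>(s.count);
    }
};

// Simulates an input set that is processed as two separate batches.
double averageInTwoBatches(const std::vector<double> &first,
                           const std::vector<double> &second) {
    AverageSketch fn;
    AvgState a;
    AvgState b;

    fn.initAggregate(a);
    fn.initAggregate(a);  // a repeated call must leave the same starting state
    assert(a.sum == 0.0 && a.count == 0);
    fn.initAggregate(b);

    fn.aggregate(first, a);   // first sub-aggregation
    fn.aggregate(second, b);  // second sub-aggregation

    fn.combine(a, std::vector<AvgState>{b});  // merge partial results
    return fn.terminate(a);
}
```

Because each batch keeps a running sum and count rather than a running average, combining the partial states gives the same answer as aggregating all of the input in one pass.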
The AggregateFunction class also provides optional methods that you can implement to allocate and free resources: setup() and destroy(). You should use these methods to allocate and deallocate resources that you do not allocate through the UDAF API (see Allocating Resources for UDxs for details).
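As a rough illustration of pairing the two methods, the following standalone sketch acquires a scratch buffer in setup() and releases it in destroy(). ScratchBufferSketch and its members are hypothetical, the buffer is only an example of a resource obtained outside the UDAF API, and the signatures are simplified assumptions rather than the SDK's own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical class; only the setup()/destroy() pairing is the point here.
class ScratchBufferSketch {
public:
    // Acquires the resource. Any earlier allocation is released first so
    // that calling setup() twice does not leak memory.
    void setup(std::size_t bytes) {
        assert(bytes > 0);
        destroy();
        buffer_ = static_cast<char *>(std::malloc(bytes));
        size_ = (buffer_ != nullptr) ? bytes : 0;
    }

    // Releases the resource. Safe to call when nothing is allocated,
    // because std::free accepts a null pointer.
    void destroy() {
        std::free(buffer_);
        buffer_ = nullptr;
        size_ = 0;
    }

    bool ready() const { return buffer_ != nullptr; }
    std::size_t size() const { return size_; }

private:
    char *buffer_ = nullptr;
    std::size_t size_ = 0;
};
```

Keeping every allocation made in setup() matched by a release in destroy() means the object holds no outside resources once destroy() has run.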
AggregateFunctionFactory
The AggregateFunctionFactory class specifies metadata information such as the argument and return types of your aggregate function. It also instantiates your AggregateFunction subclass. Your subclass must implement the following methods:
getPrototype() - Defines the number of parameters and data types accepted by the function. There is a single parameter for aggregate functions.
getIntermediateTypes() - Defines the intermediate variable(s) used by the function. These variables are used when combining the results of aggregate() calls.
getParameterType() - Defines the names and types of parameters that this function uses (optional).
getReturnType() - Defines the type of the output column.
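To make the division of metadata concrete, here is a standalone sketch of what a factory for an average aggregate might declare. ColumnSketch and AverageFactorySketch are hypothetical, the type names are plain strings chosen for illustration, and the return-by-value signatures are assumptions that differ from the real SDK methods.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one column: a name and a type name.
struct ColumnSketch {
    std::string name;
    std::string type;
};

// Hypothetical factory that mirrors the three required metadata methods.
class AverageFactorySketch {
public:
    // One argument, as the text states for aggregate functions.
    std::vector<std::string> getPrototype() const {
        std::vector<std::string> argTypes;
        argTypes.push_back("FLOAT");
        return argTypes;
    }

    // The state carried between aggregate() calls and merged later:
    // a running sum and a row count.
    std::vector<ColumnSketch> getIntermediateTypes() const {
        std::vector<ColumnSketch> cols;
        cols.push_back(ColumnSketch{"sum", "FLOAT"});
        cols.push_back(ColumnSketch{"count", "INTEGER"});
        return cols;
    }

    // The single output column.
    ColumnSketch getReturnType() const {
        return ColumnSketch{"average", "FLOAT"};
    }
};

// Small self-check that shows how the pieces relate.
void checkFactorySketch() {
    AverageFactorySketch factory;
    assert(factory.getPrototype().size() == 1);
    assert(factory.getIntermediateTypes().size() == 2);
    assert(factory.getReturnType().type == "FLOAT");
}
```

Note that the intermediate types need not match the return type: an average returns one value but carries two intermediate values so that partial results can be merged correctly.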
Vertica uses this data when you run the CREATE AGGREGATE FUNCTION SQL statement to add the function to the database catalog.