Parser Classes

This section describes information that is specific to the Java API. See User-Defined Parser for general information about implementing the UDParser and ParserFactory classes.

UDParser API

The API provides the following methods for extension by subclasses:

public void setup(ServerInterface srvInterface, SizedColumnTypes returnType)
	throws UdfException;

public abstract StreamState process(ServerInterface srvInterface,
	DataBuffer input, InputState input_state)
	throws UdfException, DestroyInvocation;

protected void cancel(ServerInterface srvInterface);

public void destroy(ServerInterface srvInterface, SizedColumnTypes returnType)
	throws UdfException;

public RejectedRecord getRejectedRecord() throws UdfException;

A UDParser uses a StreamWriter to write its output. StreamWriter provides methods for all the basic types, such as setBooleanValue(), setStringValue(), and so on. In the Java API this class also provides the setValue() method, which automatically sets the data type.

The methods described so far write single column values. StreamWriter also provides a method to write a complete row from a map. The setRowFromMap() method takes a map of column names and values and writes all the values into their corresponding columns. This method does not define new columns but instead writes values only to existing columns. The JsonParser example uses this method to write arbitrary JSON input. (See Parser Example: JSON Parser.)

The setRowFromMap() method does not automatically advance the output to the next row; you must call next() yourself. Because the row stays open until then, you can write a row from a map and then override selected column values before calling next().
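The write-then-override pattern can be sketched with a toy model. ToyRowWriter below is a hypothetical, simplified stand-in written for illustration; it is not the Vertica StreamWriter class. Its method names mirror the ones mentioned above, but the signatures (for example, addressing columns by name in setValue()) are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for a row writer. NOT the Vertica StreamWriter; signatures are assumed.
class ToyRowWriter {
    private final List<String> columns;                            // existing output columns
    private Map<String, Object> current = new LinkedHashMap<>();   // row being built
    private final List<Map<String, Object>> rows = new ArrayList<>();

    ToyRowWriter(List<String> columns) {
        this.columns = columns;
    }

    // Copies values for existing columns only; keys with no matching column are skipped.
    void setRowFromMap(Map<String, Object> values) {
        for (String column : columns) {
            if (values.containsKey(column)) {
                current.put(column, values.get(column));
            }
        }
    }

    // Overrides one column of the row being built.
    void setValue(String column, Object value) {
        if (!columns.contains(column)) {
            throw new IllegalArgumentException("unknown column: " + column);
        }
        current.put(column, value);
    }

    // Commits the current row and starts a new, empty one.
    void next() {
        rows.add(current);
        current = new LinkedHashMap<>();
    }

    List<Map<String, Object>> rows() {
        return rows;
    }
}
```

In this model, calling setRowFromMap(), then setValue() on one column, then next() produces a single row in which the overridden value wins, and no row is emitted until next() is called.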

setRowFromMap() also populates any VMap ('__raw__') column of Flex Tables (see Using Flex Tables) with the entire provided map. In most cases, setRowFromMap() is the appropriate way to populate a Flex Table. However, you can also write a VMap value to a specified column using setVMap(), as you would with the other setValue() methods.

The setRowFromMap() method automatically coerces the input values into the types defined for those columns using an associated TypeCoercion. In most cases, using the default implementation (StandardTypeCoercion) is appropriate.

TypeCoercion uses policies to govern its behavior. For example, the FAIL_INVALID_INPUT_VALUE policy means invalid input is treated as an error instead of using a null value. Errors are caught and handled as rejections (see "Rejecting Rows" in User-Defined Parser). Policies also govern whether input that is too long is truncated. Use the setPolicy() method on the parser's TypeCoercion to set policies. See the API documentation for supported values.
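The effect of a fail-versus-null policy can be illustrated with a small standalone sketch. ToyCoercion and its setFailOnInvalidInput() flag are hypothetical names invented for this example; they only model the idea behind FAIL_INVALID_INPUT_VALUE and are not the Vertica TypeCoercion API or its setPolicy() signature.

```java
// Toy model of a coercion policy. NOT the Vertica TypeCoercion class.
class ToyCoercion {
    // Hypothetical flag standing in for a policy such as FAIL_INVALID_INPUT_VALUE.
    private boolean failOnInvalidInput = false;

    void setFailOnInvalidInput(boolean fail) {
        this.failOnInvalidInput = fail;
    }

    // Parses raw text as a Long. Invalid input yields null when the flag is off
    // and an exception when it is on. Null input always yields null.
    Long asLong(String raw) {
        if (raw == null) {
            return null;
        }
        try {
            return Long.valueOf(raw.trim());
        } catch (NumberFormatException e) {
            if (failOnInvalidInput) {
                throw new IllegalArgumentException("invalid integer: " + raw, e);
            }
            return null;
        }
    }
}
```

In a real parser, the exception path would correspond to the row being rejected rather than loaded with a null in that column.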

You might need to customize type coercion beyond setting these policies. To do so, subclass one of the provided implementations of TypeCoercion and override the asType() methods. Such customization could be necessary if your parser reads objects that come from a third-party library. A parser handling geo-coordinates, for example, might override asLong() to translate inputs like "40.4397N" into numbers. See the Vertica API documentation for a list of implementations.
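As a rough illustration of the geo-coordinate case, the conversion logic that such an override might delegate to could look like the following. GeoCoordinates and toMicroDegrees() are hypothetical helpers, and representing the result as signed microdegrees (degrees times 1,000,000, negative for south and west) is an assumption made for this sketch; the text above does not specify the target representation.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper; the logic an overridden asLong() might call.
final class GeoCoordinates {
    private GeoCoordinates() {}

    // Converts text such as "40.4397N" to signed microdegrees (an assumed encoding).
    // N and E are positive; S and W are negative.
    static long toMicroDegrees(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("coordinate is null");
        }
        String text = raw.trim();
        if (text.length() < 2) {
            throw new IllegalArgumentException("coordinate too short: " + raw);
        }
        char hemisphere = Character.toUpperCase(text.charAt(text.length() - 1));
        int sign;
        switch (hemisphere) {
            case 'N':
            case 'E':
                sign = 1;
                break;
            case 'S':
            case 'W':
                sign = -1;
                break;
            default:
                throw new IllegalArgumentException("missing hemisphere suffix: " + raw);
        }
        // BigDecimal avoids binary floating-point rounding when scaling the degrees.
        // A malformed number throws NumberFormatException, a kind of IllegalArgumentException.
        BigDecimal degrees = new BigDecimal(text.substring(0, text.length() - 1).trim());
        long micro = degrees.movePointRight(6).setScale(0, RoundingMode.HALF_UP).longValue();
        return sign * micro;
    }
}
```

Keeping the conversion in a plain static method makes it testable without a running server; a TypeCoercion subclass would then only need to call it and let the usual rejection handling deal with any exception.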

ContinuousUDParser API

The ContinuousUDParser class extends UDParser and adds the following methods for extension by subclasses:

public void initialize(ServerInterface srvInterface, SizedColumnTypes returnType);

public abstract void run() throws UdfException;

public void deinitialize(ServerInterface srvInterface, SizedColumnTypes returnType);

See the API documentation for additional utility methods.

ParserFactory API

The API provides the following methods for extension by subclasses:

public void plan(ServerInterface srvInterface, PerColumnParamReader perColumnParamReader, PlanContext planCtxt)
	throws UdfException;

public abstract UDParser prepare(ServerInterface srvInterface, PerColumnParamReader perColumnParamReader,
	PlanContext planCtxt, SizedColumnTypes returnType)
	throws UdfException;

public void getParameterType(ServerInterface srvInterface, SizedColumnTypes parameterTypes);

public void getParserReturnType(ServerInterface srvInterface, PerColumnParamReader perColumnParamReader,
	PlanContext planCtxt, SizedColumnTypes argTypes, SizedColumnTypes returnType)
	throws UdfException;