Exiting When the Calling Query Has Been Canceled

Since User-Defined Transform Functions (UDTFs) often perform lengthy and CPU-intensive processing, it makes sense for them to terminate if the query that called them has been canceled. Exiting when the query has been canceled helps prevent wasting CPU cycles and memory on continued processing.

The TransformFunction class has a method named isCanceled that returns true if the calling query has been canceled. Your processPartition method can call isCanceled periodically to determine whether the query has been canceled, and exit if it has.

How often your processPartition method calls isCanceled depends on how much processing it performs on each row of data. Each call to isCanceled adds overhead to your function, so you shouldn't call it too often. For transforms that do not perform lengthy processing, you could check for cancelation every 100 or 1000 rows. If your processPartition method performs extensive processing for each row, you may want to check isCanceled every 10 or so rows.

The following code fragment shows how you could have the StringTokenizer UDTF example check whether its query has been canceled:

    public class CancelableTokenizeString extends TransformFunction
    {
        @Override
        public void processPartition(ServerInterface srvInterface, 
                                      PartitionReader inputReader, 
                                      PartitionWriter outputWriter)
                    throws UdfException, DestroyInvocation
        {
            // Loop over all rows in this partition.
            
            int rowcount = 0; // maintain count of rows processed
            do {
                rowcount++; // Processing new row
                
                // Check for cancelation every 100 rows
                if (rowcount % 100 == 0) {
                    // Check to see if Vertica marked this class as canceled
                    if (this.isCanceled()) {
                        srvInterface.log("Got canceled! Exiting...");
                        return;
                    }
                }
                // Rest of the function here
                .         .        .

This example checks for cancelation once every 100 rows as it works through the partition of data. If the query has been canceled, the example logs a message and then returns, which exits the function.
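To get a feel for how the check interval affects both the number of cancelation checks and how quickly the loop stops, you can try the same pattern outside of Vertica. The following is a minimal stand-alone sketch, not part of the Vertica SDK: the class name CancelCheckSketch, the processRows method, and the checkInterval parameter are hypothetical, and a java.util.function.BooleanSupplier stands in for the isCanceled method.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical stand-alone sketch; it does not use the Vertica SDK.
public class CancelCheckSketch {

    /**
     * Walks through totalRows rows, polling isCanceled once every
     * checkInterval rows (isCanceled is an assumed stand-in for
     * TransformFunction.isCanceled). Returns the number of rows that
     * were fully processed before the loop ended.
     */
    public static int processRows(int totalRows, int checkInterval,
                                  BooleanSupplier isCanceled) {
        int processed = 0;
        for (int row = 1; row <= totalRows; row++) {
            // Poll only on every checkInterval-th row to limit overhead.
            if (row % checkInterval == 0 && isCanceled.getAsBoolean()) {
                break; // Stop early; the remaining rows are skipped.
            }
            // Per-row work would go here.
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        // Count how many times the cancelation flag is polled when the
        // query is never canceled.
        AtomicInteger polls = new AtomicInteger();
        int done = processRows(1000, 100, () -> {
            polls.incrementAndGet();
            return false;
        });
        System.out.println(done + " rows, " + polls.get() + " cancelation checks");

        // With a query that is already canceled, a smaller interval
        // stops the loop sooner.
        System.out.println(processRows(1000, 100, () -> true) + " rows before exit");
        System.out.println(processRows(1000, 10, () -> true) + " rows before exit");
    }
}
```

In this sketch, a smaller checkInterval means less work is wasted after a cancelation, at the price of more calls to the cancelation check while the query is running normally.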

Note: You need to strike a balance between adding overhead to your functions by calling isCanceled and having your functions waste CPU time by running after their query has been canceled (a rare event). For functions such as StringTokenizer, which have a low overall processing cost, it usually does not make sense to test for cancelation. The cost of adding overhead to all function calls outweighs the amount of resources wasted by having the function run to completion or having its JVM process killed by Vertica on the rare occasions that its query is canceled.