Numeric Data Type Overflow with SUM, SUM_FLOAT, and AVG
When you use the functions SUM, SUM_FLOAT, and AVG with a NUMERIC data type, be aware that overflow can occur and how Vertica responds to that overflow.
This discussion applies to both the aggregate and analytic functions.
For queries, when using the functions SUM, SUM_FLOAT, and AVG with a NUMERIC data type, Vertica allows for silent overflow if you exceed your specified precision.
Vertica also allows numeric overflow when you use the SUM or SUM_FLOAT functions with LAPs.
Default Overflow Handling
With NUMERIC data types, Vertica internally works with multiples of 18 digits. If your specified precision is less than 18 (for example, x(12,0)
), Vertica allows for an overflow up to and including the first multiple of 18. In some situations, if you sum a column (SUM(x)
), you can exceed the number of digits Vertica internally reserves for the result. In this case, Vertica allows a silent overflow.
Turning Off Silent Numeric Overflow
You can turn off silent numeric overflow and instruct Vertica to implicitly include extra digit places. Specifying extra spaces allows Vertica to consistently return your expected results, even when you exceed the precision specified in your DDL.
You turn off silent numeric overflow by setting the parameter AllowNumericOverflow to 0 (false).
When you set the parameter to 0, Vertica considers the value of a corresponding parameter, NumericSumExtraPrecisionDigits.
The NumericSumExtraPrecisionDigits parameter defaults to 6, meaning that Vertica internally add six places beyond your DDL-specified precision. Adding extra precision digits can allow Vertica to consistently return results that overflow your DDL-specified precision. However, there can be a performance impact for crossing into the second multiple of 18 internally.
- Suppose your DDL specifies 11 (for example,
x(11,0)
) and you accept the default of NumericSumExtraPrecisionDigits (6). In this case, Vertica internally stays within the first multiple of 18 digits and no additional performance impact occurs. - Given the same example, if you set NumericSumExtraPrecisionDigits to 10, Vertica internally crosses a threshold into the second multiple of 18. Performance-wise, if (hypothetically) the first example is performance “a,” then the second is “2a,” substantially increasing the performance impact. Beyond the second multiple of 18, the performance impact continues to be "2a."
An example:
This sample representation shows how Vertica responds internally when you set AllowNumericOverflow
to 0 (false).
Vertica recommends that you turn off silent numeric overflow and set the parameter NumericSumExtraPrecisionDigits if you expect to exceed the precision specified in your DDL. Crossing into the second multiple of 18 can affect performance. Therefore, consider carefully before setting NumericSumExtraPrecisionDigits to a number higher than what you need for returning the SUM of your numeric columns.
Be aware that, if you turn off AllowNumericOverflow, and you exceed the number of extra precision digits set by NumericSumExtraPrecisionDigits, Vertica returns an error.
Impact on Live Aggregate Projections (LAPs)
For LAPs, Vertica also allows silent numeric overflow if your LAP uses the SUM or SUM_FLOAT functions. To turn off silent numeric overflow for LAPs:
- Set the parameter AllowNumericOverflow to 0.
- Set the parameter NumericSumExtraPrecisionDigits to the number of implicit digits you want. Alternatively, use the default setting of 6.
- Drop and re-create your LAPs.
If you turn off silent numeric overflow, be aware of the following scenarios where an overflow causes a roll back or error message. In these examples, AllowNumericOverflow is set to 0 (false), and each LAP uses the SUM or SUM_FLOAT function.
When numeric overflow is off:
-
A load can roll back upon overflow.
Vertica aggregates data before loading in into a LAP. If you are inserting, copying, or merging data, and an overflow occurs during load as Vertica is aggregating the data, Vertica rolls back the load.
-
An overflow can occur after load as Vertica sums existing data.
Vertica computes the sum of existing data separately from the computation that it does during data load. If your LAP selects a column using SUM or SUM_FLOAT and an overflow occurs, Vertica produces an error message. This response is similar to the way Vertica produces an error for a query using the SUM or SUM_FLOAT function.
-
An overflow can occur during merge-out.
Vertica logs a message during merge-out if an overflow occurs as Vertica computes a final sum during the tuple mover operation. If an error occurs, Vertica marks the LAP as out-of-date. Vertica no longer runs tuple mover operations with the out-of-date LAP.