ERROR_RATE

Returns a table that reports the rate of incorrect classifications in an input table, expressed as FLOAT values. ERROR_RATE returns a table with the following dimensions:

  • Rows: Number of classes plus one row that contains the total error rate across classes
  • Columns: 3 (class, error rate, and comment)

Syntax

ERROR_RATE ( targets, predictions
              [ USING PARAMETERS num_classes=num‑classes ] )
           OVER()

Arguments

targets

An input column that contains the true values of the response variable.

predictions

An input column that contains the predicted class labels.

Arguments targets and predictions must be set to input columns of the same data type, one of the following: INTEGER, BOOLEAN, or CHAR/VARCHAR. Depending on their data type, these columns identify classes as follows:

  • INTEGER: Zero-based consecutive integers between 0 and (num-classes-1) inclusive, where num-classes is the number of classes. For example, given the input column values {0, 1, 2, 3, 4}, Vertica assumes five classes.

    If input column values are not consecutive, Vertica interpolates the missing values. Thus, given the input values {0, 1, 3, 5, 6}, Vertica assumes seven classes.

  • BOOLEAN: Yes or No
  • CHAR/VARCHAR: Class names. If the input columns are of type CHAR/VARCHAR, you must also set parameter num_classes to the number of classes.

    Vertica computes the number of classes as the union of values in both input columns. For example, given the following sets of values in the targets and predictions input columns, Vertica counts four classes:

    {'milk', 'soy milk', 'cream'} 
    {'soy milk', 'almond milk'}
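
The class-counting rules above can be sketched in Python. This is only an illustration of the documented behavior, not Vertica's implementation; the function names count_integer_classes and count_string_classes are hypothetical.

```python
def count_integer_classes(values):
    """Integer classes run from 0 to the largest label, so gaps still count."""
    labels = list(values)
    if any(label < 0 for label in labels):
        raise ValueError("integer class labels must be zero-based")
    return max(labels) + 1


def count_string_classes(targets, predictions):
    """String classes are the union of distinct names in both columns."""
    return len(set(targets) | set(predictions))


# Values taken from the examples in this section.
print(count_integer_classes([0, 1, 2, 3, 4]))   # 5
print(count_integer_classes([0, 1, 3, 5, 6]))   # 7 (2 and 4 are filled in)
print(count_string_classes(["milk", "soy milk", "cream"],
                           ["soy milk", "almond milk"]))  # 4
```

For CHAR/VARCHAR input, the value computed this way is the one you would pass as num_classes.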

Parameter Settings

num_classes

An integer greater than 1 that specifies the number of classes to pass to the function.

You must set this parameter if the specified input columns are of type CHAR/VARCHAR. Otherwise, the function processes this parameter according to the column data types:

  • INTEGER: Set to 2 by default. You must set this parameter explicitly if the number of classes is any other value.
  • BOOLEAN: Set to 2 by default. Cannot be set to any other value.
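
The rules for num_classes can be restated as a small Python helper. This is a sketch of the rules as documented here, not Vertica code: the function name resolve_num_classes is hypothetical, and raising ValueError is this sketch's choice rather than documented Vertica behavior.

```python
def resolve_num_classes(column_type, num_classes=None):
    """Return the class count implied by the documented rules.

    column_type is one of "INTEGER", "BOOLEAN", "CHAR", "VARCHAR".
    num_classes is None when the parameter is not set.
    """
    if num_classes is not None:
        if not isinstance(num_classes, int) or num_classes <= 1:
            raise ValueError("num_classes must be an integer greater than 1")
    if column_type in ("CHAR", "VARCHAR"):
        # The parameter is mandatory for character columns.
        if num_classes is None:
            raise ValueError("num_classes is required for CHAR/VARCHAR columns")
        return num_classes
    if column_type == "BOOLEAN":
        # Only the default of 2 is allowed.
        if num_classes not in (None, 2):
            raise ValueError("num_classes can only be 2 for BOOLEAN columns")
        return 2
    if column_type == "INTEGER":
        # Defaults to 2; any other class count must be given explicitly.
        return 2 if num_classes is None else num_classes
    raise ValueError(f"unsupported column type: {column_type}")


print(resolve_num_classes("INTEGER"))      # 2
print(resolve_num_classes("INTEGER", 5))   # 5
print(resolve_num_classes("VARCHAR", 4))   # 4
```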

Privileges

Non-superusers: model owner, or USAGE privileges on the model

Examples

This example shows how to execute the ERROR_RATE function on an input table named mtcars. The true response values appear in the column obs, and the predicted values appear in the column pred. Because this is a binary classification problem, all response and prediction values are either 0 or 1.

In the table returned by the function, the first column displays the class ID. The second column displays the error rate for that class; the final row, which has no class ID, gives the total error rate across all classes. The third column indicates how many rows the function used and how many it ignored.

=> SELECT ERROR_RATE(obs::int, pred::int USING PARAMETERS num_classes=2) OVER()
   FROM (SELECT am AS obs,
                PREDICT_LOGISTIC_REG (mpg, cyl, disp, drat, wt, qsec, vs, gear, carb
                    USING PARAMETERS model_name='myLogisticRegModel', type='response') AS pred
         FROM mtcars) AS prediction_output;
 class |     error_rate     |                   comment
-------+--------------------+---------------------------------------------
     0 |                  0 |
     1 | 0.0769230797886848 |
       |            0.03125 | Of 32 rows, 32 were used and 0 were ignored
(3 rows)
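
The arithmetic behind this output can be checked with a short Python sketch. It assumes that a class's error rate is the share of rows with that target class whose prediction differs, and that the total is the share of misclassified rows overall; both assumptions are consistent with the values shown, but this is not Vertica's implementation. The function name error_rates and the data are made up for illustration and are not the actual mtcars rows.

```python
from collections import Counter


def error_rates(targets, predictions, num_classes):
    """Return ([(class_id, rate), ...], total_rate) for integer class labels."""
    rows_per_class = Counter(targets)
    errors_per_class = Counter(
        t for t, p in zip(targets, predictions) if t != p
    )
    per_class = []
    for class_id in range(num_classes):
        rows = rows_per_class[class_id]
        # A class with no rows has no defined rate; this sketch reports None.
        rate = errors_per_class[class_id] / rows if rows else None
        per_class.append((class_id, rate))
    total = sum(errors_per_class.values()) / len(targets)
    return per_class, total


# Illustrative data: 19 rows of class 0, all predicted correctly, and
# 13 rows of class 1, one of which is predicted as class 0.
obs = [0] * 19 + [1] * 13
pred = [0] * 19 + [1] * 12 + [0]

per_class, total = error_rates(obs, pred, num_classes=2)
print(per_class)  # class 0 rate is 0.0, class 1 rate is 1/13 (about 0.0769)
print(total)      # 0.03125, that is, 1/32
```

Under these assumptions, one misclassified row out of 13 in class 1 gives roughly 0.0769, and one misclassified row out of 32 overall gives 0.03125, matching the table above up to floating-point precision.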