LIFT_TABLE

Returns a table that shows the predictive quality of a logistic regression model at a series of decision boundaries. This table is also known as a lift chart.

You cannot pass any inputs to the OVER() clause.

Important: Before using a machine learning function, be aware that all the ongoing transactions might be committed.

Syntax

LIFT_TABLE ( target, probabilities
              [ USING PARAMETERS [num_bins=nBins] ])
             OVER()

Arguments

target

The column in the input table containing the response variable. Must be an integer.

probabilities

The column in the input table that contains the predicted probability that the observation is of class 1. Must be a float.

Parameters

num_bins=nBins

(Optional) Groups rows together, based on the probability column, for faster processing. Use this parameter to set the number of decision boundaries to consider. The function divides the interval from 0 to 1 into nBins equally spaced bins and evaluates the table at each bin boundary. Must be an integer.

Default Value: 100
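
The spacing of the decision boundaries can be sketched in a few lines of Python. This is only an illustration: the helper name decision_boundaries is hypothetical, and the assumption that num_bins=n yields n+1 boundaries (both endpoints included, listed from 1 down to 0) is inferred from the example output below, where num_bins=2 returns rows for 1, 0.5, and 0.

```python
def decision_boundaries(num_bins=100):
    """Hypothetical helper: equally spaced cut-offs from 1 down to 0.

    Assumes both endpoints are included, so num_bins bins give
    num_bins + 1 boundaries (inferred from the documented example).
    """
    if num_bins < 1:
        raise ValueError("num_bins must be a positive integer")
    return [i / num_bins for i in range(num_bins, -1, -1)]


print(decision_boundaries(2))
print(len(decision_boundaries()))
```

Under that assumption, a larger num_bins gives a finer-grained table with more rows to evaluate.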

Examples

This example demonstrates how you can execute the LIFT_TABLE function on an input table named mtcars.

=> SELECT LIFT_TABLE(obs, prob USING PARAMETERS num_bins=2) OVER()
     FROM (SELECT am AS obs,
                  PREDICT_LOGISTIC_REG(mpg, cyl, disp, hp, drat, wt, qsec, vs, gear, carb
                                       USING PARAMETERS model_name='logisticRegModel',
                                                        type='probability') AS prob
           FROM mtcars) AS prediction_output;
 decision_boundary | positive_prediction_ratio |       lift       |                   comment
-------------------+---------------------------+------------------+---------------------------------------------
                 1 |                         0 |              NaN |
               0.5 |                   0.40625 | 2.46153846153846 |
                 0 |                         1 |                1 | Of 32 rows, 32 were used and 0 were ignored
(3 rows)

The first column, decision_boundary, indicates the cut-off point for classifying a response as 0 or 1. For each row, if prob is greater than or equal to decision_boundary, the response is classified as 1. If prob is less than decision_boundary, the response is classified as 0.

The second column, positive_prediction_ratio, shows the fraction of rows that the function classified as class 1 using the corresponding decision_boundary value. The value is a ratio between 0 and 1, not a percentage. In the example, 0.40625 corresponds to 13 of the 32 rows.

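For the third column, lift, the function divides the fraction of class 1 rows that were correctly classified as class 1 by the positive_prediction_ratio. In the example, a decision_boundary of 0.5 gives 1 / 0.40625 = 2.46153846153846. When no rows are classified as class 1, as at a decision_boundary of 1, the ratio is 0/0 and lift is NaN.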
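
To make the three columns concrete, here is a minimal Python sketch that builds the same kind of table from a list of observations and probabilities. It is not Vertica's implementation. It rests on three assumptions chosen to reproduce the figures in the example output: a row is classified as 1 when prob is greater than or equal to the boundary, positive_prediction_ratio is the fraction of all rows classified as 1, and lift is the true positive rate divided by positive_prediction_ratio. The function name and the toy data are hypothetical. The data only mirrors the shape of the example (13 positives among 32 rows, cleanly separated) and is not the actual mtcars predictions.

```python
import math


def lift_table(obs, prob, num_bins=100):
    """Illustrative lift table; a sketch, not Vertica's code."""
    n = len(obs)
    actual_pos = sum(obs)  # rows whose true class is 1
    rows = []
    for i in range(num_bins, -1, -1):
        boundary = i / num_bins
        # Assumed rule: classify as 1 when prob >= boundary.
        predicted = [1 if p >= boundary else 0 for p in prob]
        predicted_pos = sum(predicted)
        true_pos = sum(1 for o, y in zip(obs, predicted) if o == 1 and y == 1)
        ppr = predicted_pos / n  # positive_prediction_ratio
        if predicted_pos == 0 or actual_pos == 0:
            lift = math.nan  # 0/0: nothing to compare against
        else:
            lift = (true_pos / actual_pos) / ppr
        rows.append((boundary, ppr, lift))
    return rows


# Toy data, made up for this sketch.
obs = [1] * 13 + [0] * 19
prob = [0.9] * 13 + [0.1] * 19
for boundary, ppr, lift in lift_table(obs, prob, num_bins=2):
    print(boundary, ppr, lift)
```

With this toy input the sketch yields a positive_prediction_ratio of 0.40625 and a lift of about 2.46 at the 0.5 boundary, and a lift of 1 at the 0 boundary, where every row is classified as 1 and the model does no better than the base rate.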