LIFT_TABLE
Returns a table that evaluates the predictive quality of a logistic regression model at a series of decision boundaries. This table is also known as a lift chart.
You cannot pass any inputs to the OVER() clause.
Important: Before using a machine learning function, be aware that all the ongoing transactions might be committed.
Syntax
LIFT_TABLE ( target, probabilities [ USING PARAMETERS [num_bins=nBins] ]) OVER()
Arguments
target | The column in the input table containing the response variable. Must be an integer.
probabilities | The column in the input table containing the predicted probability that the observation is of class 1. Must be a float.
Parameters
num_bins=nBins | (Optional) Groups rows together, based on the probability column, for faster processing. Use this parameter to set the number of decision boundaries to consider. The function divides the number line from 0 to 1 into nBins equal-width bins and evaluates the table at each of the resulting equally spaced boundaries. For example, num_bins=2 yields boundaries at 0, 0.5, and 1. Must be an integer. Default Value: 100
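As an illustration of how num_bins maps to decision boundaries, the Python sketch below lists equally spaced boundaries from 1 down to 0. It assumes that both endpoints are included, which is what the num_bins=2 example output on this page (boundaries 1, 0.5, and 0) suggests. The helper name decision_boundaries is hypothetical and not part of the product.

```python
from typing import List


def decision_boundaries(num_bins: int) -> List[float]:
    """Return num_bins + 1 equally spaced boundaries from 1 down to 0.

    Assumption: both endpoints (0 and 1) are included.
    """
    if num_bins < 1:
        raise ValueError("num_bins must be a positive integer")
    return [i / num_bins for i in range(num_bins, -1, -1)]


print(decision_boundaries(2))  # [1.0, 0.5, 0.0]
print(decision_boundaries(4))  # [1.0, 0.75, 0.5, 0.25, 0.0]
```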
Examples
This example demonstrates how you can execute the LIFT_TABLE function on an input table named mtcars.
=> SELECT LIFT_TABLE(obs, prob USING PARAMETERS num_bins=2) OVER()
   FROM (SELECT am AS obs,
                PREDICT_LOGISTIC_REG(mpg, cyl, disp, hp, drat, wt, qsec, vs, gear, carb
                                     USING PARAMETERS model_name='logisticRegModel',
                                                      type='probability') AS prob
         FROM mtcars) AS prediction_output;
 decision_boundary | positive_prediction_ratio |       lift       |                   comment
-------------------+---------------------------+------------------+---------------------------------------------
                 1 |                         0 |              NaN |
               0.5 |                   0.40625 | 2.46153846153846 |
                 0 |                         1 |                1 | Of 32 rows, 32 were used and 0 were ignored
(3 rows)
The first column, decision_boundary, indicates the cut-off point for whether to classify a response as 0 or 1. For instance, for each row, if prob is greater than decision_boundary, the response is classified as 1. If prob is less than decision_boundary, the response is classified as 0.
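The classification rule can be sketched in Python as follows. The function name classify and the sample probabilities are illustrative only. The text does not say how a probability exactly equal to the boundary is handled; this sketch assigns it to class 0, which is an assumption.

```python
def classify(prob: float, decision_boundary: float) -> int:
    """Return 1 if prob is strictly greater than the boundary, else 0.

    Assumption: ties (prob == decision_boundary) fall into class 0.
    """
    return 1 if prob > decision_boundary else 0


probs = [0.05, 0.40, 0.60, 0.95]  # made-up probabilities
print([classify(p, 0.5) for p in probs])  # [0, 0, 1, 1]
print([classify(p, 0.0) for p in probs])  # [1, 1, 1, 1]
print([classify(p, 1.0) for p in probs])  # [0, 0, 0, 0]
```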
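The second column, positive_prediction_ratio, shows the fraction of rows that the function classified as class 1 using the corresponding decision_boundary value. In the example, a boundary of 0.5 classifies 0.40625 of the 32 rows (13 rows) as class 1.
For the third column, lift, the function divides the fraction of actual class 1 rows that were classified as class 1 by the positive_prediction_ratio. A lift greater than 1 means that the model identifies class 1 rows better than random selection. The lift is NaN when no rows are classified as class 1, as at a decision_boundary of 1 in the example.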
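To make the two ratios concrete, here is a minimal Python sketch of one common lift definition: the share of actual class 1 rows captured at a boundary, divided by the share of all rows classified as class 1. Whether LIFT_TABLE computes its columns exactly this way is an assumption, although this reading reproduces the example output above. The data is synthetic (32 rows, 13 of them in class 1, with perfectly separating probabilities) and was chosen only to mirror the example; lift_row is a hypothetical helper.

```python
import math
from typing import List, Tuple


def lift_row(targets: List[int], probs: List[float],
             boundary: float) -> Tuple[float, float]:
    """Return (positive_prediction_ratio, lift) for one decision boundary."""
    predicted = [1 if p > boundary else 0 for p in probs]
    n_rows = len(targets)
    n_actual_pos = sum(targets)
    n_pred_pos = sum(predicted)
    n_true_pos = sum(1 for t, y in zip(targets, predicted) if t == 1 and y == 1)

    ratio = n_pred_pos / n_rows
    if n_pred_pos == 0 or n_actual_pos == 0:
        return ratio, math.nan  # lift is undefined in these cases
    capture_rate = n_true_pos / n_actual_pos
    return ratio, capture_rate / ratio


# Synthetic data: 13 class-1 rows scored 0.9, 19 class-0 rows scored 0.1.
targets = [1] * 13 + [0] * 19
probs = [0.9] * 13 + [0.1] * 19

for boundary in (1.0, 0.5, 0.0):
    ratio, lift = lift_row(targets, probs, boundary)
    print(boundary, ratio, lift)
```

With this synthetic input, the ratios are 0.0, 0.40625, and 1.0, and the lift is NaN at boundary 1, about 2.4615 (32/13) at boundary 0.5, and 1.0 at boundary 0.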