PREDICT_NAIVE_BAYES_CLASSES

Applies a Naive Bayes model on an input table or view and returns the probabilities of classes.

Important: Before using a machine learning function, be aware that any open transactions might be committed.

Syntax

PREDICT_NAIVE_BAYES_CLASSES ( predictor_columns
                                USING PARAMETERS model_name = 'model_name'
                                                 [, key_columns = 'key_columns']
                                                 [, exclude_columns = 'col1, col2, ..., coln']
                                                 [, classes = 'class1, class2, ..., classn']
                                                 [, match_by_pos = 'method'] )
         OVER()
            AS (key_columns, Predicted, Probability, class1, class2, ..., classn)

Arguments

predictor_columns

A comma-separated list of the columns in input_relation that represent the independent variables for the model.

You can use the wildcard character (*) in place of column names, in which case all the columns in input_relation are selected.
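As a minimal sketch of the wildcard form, the query below passes every column of the input relation and then removes the row identifier with exclude_columns. It reuses the naive_house84_model model and the house84_test table from the Examples section. It assumes that every column other than id is a predictor the model was trained on; any other non-predictor column would also need to be listed in exclude_columns.

```sql
-- Sketch: select all columns with the wildcard, then exclude the key column.
-- Assumption: apart from id, house84_test holds only predictor columns.
SELECT PREDICT_NAIVE_BAYES_CLASSES (*
                                    USING PARAMETERS model_name = 'naive_house84_model',
                                                     key_columns = 'id',
                                                     exclude_columns = 'id',
                                                     classes = 'democrat, republican')
       OVER() FROM house84_test;
```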

Parameters

model_name = 'model_name'

The name of the model. Model names are case insensitive.

key_columns = 'key_columns'

(Optional) A comma-separated list of column names from input_relation that identify each row of the output.

exclude_columns = 'col1, col2, ..., coln'

(Optional) The columns from predictor_columns that you want to exclude. This parameter is useful when you use the wildcard (*) in predictor_columns.

classes = 'class1, class2, ..., classn'

(Optional) A comma-separated list of class labels in the model. For each listed class, the function returns the probability that the row belongs to that class, as predicted by the classifier. The values are case sensitive.

match_by_pos = 'method'

(Optional) Specifies how input columns are matched to features in the model. Valid values:

  • false (default): Input columns are matched to features in the model based on their names.

  • true: Input columns are matched to features in the model based on their position in the list of indicated input columns.
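Position-based matching can help when the input columns are named differently from the model's features. The sketch below renames the columns in a subquery, so name matching would not find them, and sets match_by_pos to 'true'. The aliases v1 and v2 and the subquery alias renamed are hypothetical, and the query assumes that the model's features are, in order, vote1 and vote2.

```sql
-- Sketch: match input columns to model features by position, not by name.
-- Assumption: the model's features are vote1 and vote2, in that order.
-- v1, v2, and renamed are illustrative aliases, not names from the model.
SELECT PREDICT_NAIVE_BAYES_CLASSES (v1, v2
                                    USING PARAMETERS model_name = 'naive_house84_model',
                                                     classes = 'democrat, republican',
                                                     match_by_pos = 'true')
       OVER()
  FROM (SELECT vote1 AS v1, vote2 AS v2 FROM house84_test) AS renamed;
```

With match_by_pos left at its default of false, the same query would try to match v1 and v2 to the model's features by name.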

Return

Return data type: One VARCHAR column and multiple FLOAT columns

The VARCHAR column is named Predicted and contains the class label with the highest probability. The first FLOAT column is named Probability and contains the probability of the class in the Predicted column. Each remaining FLOAT column is named for a class listed in the classes parameter and contains the probability of that class. If you specify key_columns, those columns come first in the output.
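The returned columns can be consumed like those of any other row set. The sketch below wraps the call in a subquery and keeps only the rows whose top class has a probability of at least 0.7. The threshold and the subquery alias p are arbitrary choices for illustration, and the query assumes that the function is allowed in a FROM-clause subquery in your version.

```sql
-- Sketch: filter predictions by the probability of the predicted class.
-- Assumption: 0.7 is an arbitrary cutoff; p is an illustrative alias.
SELECT id, Predicted, Probability
  FROM (SELECT PREDICT_NAIVE_BAYES_CLASSES (id, vote1, vote2
                                            USING PARAMETERS model_name = 'naive_house84_model',
                                                             key_columns = 'id',
                                                             exclude_columns = 'id',
                                                             classes = 'democrat, republican')
               OVER() FROM house84_test) AS p
 WHERE Probability >= 0.7;
```

Among the rows shown in the Examples output, only ids 21, 28, and 102 meet this threshold.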

Examples

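This example applies the model naive_house84_model to the house84_test table. For each id, it returns the predicted class, the probability of that class, and the probabilities of the democrat and republican classes.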

=> SELECT PREDICT_NAIVE_BAYES_CLASSES (id, vote1, vote2
                                       USING PARAMETERS model_name = 'naive_house84_model',
                                                        key_columns = 'id',
                                                        exclude_columns = 'id',
                                                        classes = 'democrat, republican', 
                                                        match_by_pos = 'false') 
        OVER() FROM house84_test;
 id  | Predicted  |    Probability    |     democrat      |    republican
-----+------------+-------------------+-------------------+-------------------
  21 | democrat   | 0.775473383353576 | 0.775473383353576 | 0.224526616646424
  28 | democrat   | 0.775473383353576 | 0.775473383353576 | 0.224526616646424
  83 | republican | 0.592510497724379 | 0.407489502275621 | 0.592510497724379
 102 | democrat   | 0.779889432167111 | 0.779889432167111 | 0.220110567832889
 107 | republican | 0.598662714551597 | 0.401337285448403 | 0.598662714551597
 125 | republican | 0.598662714551597 | 0.401337285448403 | 0.598662714551597
 132 | republican | 0.592510497724379 | 0.407489502275621 | 0.592510497724379
 136 | republican | 0.592510497724379 | 0.407489502275621 | 0.592510497724379
 155 | republican | 0.598662714551597 | 0.401337285448403 | 0.598662714551597
 174 | republican | 0.592510497724379 | 0.407489502275621 | 0.592510497724379
.
.
.

See Also