Classifying Data Using Random Forest

This random forest example uses a data set named iris. The data set contains four variables that measure various parts of the iris flower; the model uses these measurements to predict the flower's species.

Before you begin the example, make sure that you have followed the steps in Downloading the Machine Learning Example Data.

  1. Create the random forest model, named rf_iris, using the iris data.

    => SELECT RF_CLASSIFIER ('rf_iris', 'iris', 'Species', 'Sepal_Length, Sepal_Width, Petal_Length, Petal_Width' 
    USING PARAMETERS ntree=100, sampling_size=0.5);
    
            RF_CLASSIFIER
    ----------------------------
    The random forest is trained
    
    (1 row)

  2. View the summary output of rf_iris.

    => SELECT SUMMARIZE_MODEL('rf_iris');
    Number of trees: 100
    Number of skipped samples: 0, Number of processed samples: 150
    Call string:
    SELECT rf_classifier('rf_iris', 'iris', '"species"', 'Sepal_Length, Sepal_Width, Petal_Length, Petal_Width' 
    USING PARAMETERS exclude_columns='', ntree=100, mtry='2', sampling_size=0.5, max_depth=5, max_breadth=32, min_leaf_size=1, min_info_gain=0, nbins=32);
    Predictor names and types:
    sepal_length: float, sepal_width: float, petal_length: float, petal_width: float
    (1 row)

  3. Apply the classifier to the test data:

    => SELECT PREDICT_RF_CLASSIFIER (Sepal_Length, Sepal_Width, Petal_Length, Petal_Width
                                      USING PARAMETERS model_name='rf_iris') FROM iris1;
    
    PREDICT_RF_CLASSIFIER
    -----------------------
    setosa
    setosa
    setosa
    .
    .
    .
    versicolor
    versicolor
    versicolor
    .
    .
    .
    virginica
    virginica
    virginica
    .
    .
    .
    (90 rows)

  4. Use PREDICT_RF_CLASSIFIER_CLASSES to view the probability of the classes:

    => SELECT PREDICT_RF_CLASSIFIER_CLASSES(Sepal_Length, Sepal_Width, Petal_Length, Petal_Width
                                   USING PARAMETERS model_name='rf_iris') OVER () FROM iris1;
    predicted  |    probability
    -----------+-------------------
    setosa     |                 1
    setosa     |                 1
    setosa     |                 1
    setosa     |                 1
    setosa     |                 1
    setosa     |                 1
    setosa     |                 1
    setosa     |                 1
    setosa     |                 1
    setosa     |                 1
    setosa     |              0.99
    .
    .
    .
    (90 rows)
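
The predictions above can also be compared against the known labels to see how well the model performs. The following query is a minimal sketch, not part of the original example: it assumes that the iris1 test table also contains a Species column holding the true class of each row, and the aliases actual, predicted, row_count, and scored are illustrative names only. It counts how many rows fall into each combination of actual and predicted species, so rows where the two values differ are misclassifications.

    -- Sketch only: assumes iris1 has a Species column with the true labels.
    => SELECT Species AS actual, predicted, COUNT(*) AS row_count
       FROM (SELECT Species,
                    PREDICT_RF_CLASSIFIER (Sepal_Length, Sepal_Width, Petal_Length, Petal_Width
                                           USING PARAMETERS model_name='rf_iris') AS predicted
             FROM iris1) AS scored
       GROUP BY Species, predicted
       ORDER BY Species, predicted;

Because random forest training samples the data randomly, the exact counts can vary from one trained model to the next.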

See Also