
verticapy.machine_learning.memmodel.ensemble.IsolationForest#

class verticapy.machine_learning.memmodel.ensemble.IsolationForest(trees: list[BinaryTreeAnomaly])#

InMemoryModel implementation of the isolation forest algorithm.

Parameters#

trees: list[BinaryTreeAnomaly]

List of BinaryTreeAnomaly models used for anomaly detection.

Attributes#

Attributes are identical to the input parameters, followed by an underscore (‘_’).

Examples#

Initialization

An Isolation Forest model is an ensemble of multiple binary tree anomaly models. In this example, we will create three BinaryTreeAnomaly models:

from verticapy.machine_learning.memmodel.tree import BinaryTreeAnomaly

# Each list is indexed by node id; None marks entries that do not apply to a node.
model1 = BinaryTreeAnomaly(
    children_left = [1, 3, None, None, None],    # left child of each node (None for leaves)
    children_right = [2, 4, None, None, None],   # right child of each node (None for leaves)
    feature = [0, 1, None, None, None],          # index of the feature split at each node
    threshold = ["female", 30, None, None, None],   # split threshold of each node
    value = [None, None, [2, 10], [3, 4], [7, 8]],  # leaf values (None for internal nodes)
    psy = 100,                                   # sampling size used to compute the anomaly score
)


model2 = BinaryTreeAnomaly(
    children_left = [1, 3, None, None, None],
    children_right = [2, 4, None, None, None],
    feature = [0, 1, None, None, None],
    threshold = ["female", 30, None, None, None],
    value = [None, None, [1, 11], [2, 5], [5, 10]],
    psy = 100,
)


model3 = BinaryTreeAnomaly(
    children_left = [1, 3, None, None, None],
    children_right = [2, 4, None, None, None],
    feature = [0, 1, None, None, None],
    threshold = ["female", 30, None, None, None],
    value = [None, None, [3, 9], [1, 6], [8, 7]],
    psy = 100,
)

Now we will use the above models to create an IsolationForest model.

from verticapy.machine_learning.memmodel.ensemble import IsolationForest

model_isf = IsolationForest(trees = [model1, model2, model3])
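
As noted in the Attributes section, the input parameters are stored on the model with a trailing underscore. A minimal sketch of inspecting them with the get_attributes() method (output omitted here; its exact structure depends on the model):

model_isf.get_attributes()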

Create a dataset.

data = [["male", 100], ["female", 20], ["female", 50]]

Making In-Memory Predictions

Use the predict() method to make predictions.

model_isf.predict(data)
Out[8]: array([0.6213801 , 0.70052979, 0.43580485])
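
For reference, the score follows the standard isolation forest formula: each tree contributes a normalized path length, the contributions are averaged across trees, and 2 raised to the negative of that average is returned. Below is a minimal sketch that reproduces the first prediction; it assumes each leaf value is stored as [path_length, leaf_size] and that psy is the sub-sampling size used for normalization.

import math

def c(n):
    # Average path length of an unsuccessful search in a binary tree built on n points.
    if n <= 1:
        return 0.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

# Leaves reached by ["male", 100] in model1, model2, model3: [2, 10], [1, 11], [3, 9].
psy = 100
paths = [2 + c(10), 1 + c(11), 3 + c(9)]
avg = sum(p / c(psy) for p in paths) / len(paths)
score = 2 ** (-avg)   # ~ 0.6214, matching the first value returned by predict()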

Deploy SQL Code

Let’s use the following column names:

cnames = ["sex", "fare"]

Use the predict_sql() method to get the SQL code needed to deploy the model using its attributes.

model_isf.predict_sql(cnames)
Out[10]: "POWER(2, - (((CASE WHEN sex = 'female' THEN (CASE WHEN fare < 30 THEN 0.5800175392069298 ELSE 1.2309212867903394 END) ELSE 0.6872811212546747 END) + (CASE WHEN sex = 'female' THEN (CASE WHEN fare < 30 THEN 0.5172970982941328 ELSE 1.0459324046363703 END) ELSE 0.5907488387580265 END) + (CASE WHEN sex = 'female' THEN (CASE WHEN fare < 30 THEN 0.44313045601876805 ELSE 1.3178838132835962 END) ELSE 0.7813262006226015 END)) / 3))"

Hint

This object can be pickled and used in any in-memory environment, just like scikit-learn models.
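
A minimal sketch of pickling the model and reusing it elsewhere:

import pickle

# Serialize the in-memory model to disk.
with open("model_isf.pkl", "wb") as f:
    pickle.dump(model_isf, f)

# Reload it and predict again; the scores match the ones above.
with open("model_isf.pkl", "rb") as f:
    restored = pickle.load(f)

restored.predict(data)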

Drawing Trees

Use the plot_tree() method to draw the input tree.

model_isf.plot_tree(tree_id = 0)
[Image: tree diagram of the first tree (tree_id = 0) of the IsolationForest model]

Important

plot_tree() requires the Graphviz module.
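
Trees are indexed by their position in the trees list, so other members of the ensemble can be drawn by changing tree_id. For example, to draw the second tree (model2):

model_isf.plot_tree(tree_id = 1)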

Note

The above example is a very basic one. For more detailed examples and customization options, please see :ref:`chart_gallery.tree`.

__init__(trees: list[BinaryTreeAnomaly]) → None#

Methods

__init__(trees)

get_attributes()

Returns the model attributes.

plot_tree([pic_path, tree_id])

Draws the input tree.

predict(X)

Predicts using the IsolationForest model.

predict_sql(X)

Returns the SQL code needed to deploy the model.

set_attributes(**kwargs)

Sets the model attributes.

Attributes

object_type

Must be overridden in child class