verticapy.machine_learning.vertica.svm.LinearSVC#

class verticapy.machine_learning.vertica.svm.LinearSVC(name: str = None, overwrite_model: bool = False, tol: float = 0.0001, C: float = 1.0, intercept_scaling: float = 1.0, intercept_mode: Literal['regularized', 'unregularized'] = 'regularized', class_weight: Literal['auto', 'none'] | list = [1, 1], max_iter: int = 100)#

Creates a LinearSVC object using the Vertica Support Vector Machine (SVM) algorithm on the data. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier.

Parameters#

name: str, optional

Name of the model. The model is stored in the database.

overwrite_model: bool, optional

If set to True, training a model with the same name as an existing model overwrites the existing model.

tol: float, optional

Tolerance for stopping criteria. This is used to control accuracy.

C: float, optional

Weight for misclassification cost. The algorithm minimizes the regularization cost and the misclassification cost.

intercept_scaling: float, optional

A float value that serves as the value of a dummy feature whose coefficient Vertica uses to calculate the model intercept. Because the dummy feature is not in the training data, its value is set to a constant, 1 by default.

intercept_mode: str, optional

Specify how to treat the intercept.

  • regularized:

    Fits the intercept and applies a regularization.

  • unregularized:

    Fits the intercept but does not include it in regularization.

class_weight: str | list, optional

Specifies how to determine weights for the two classes. It can be a list of 2 elements or one of the following methods:

  • auto:

    Weights each class according to the number of samples.

  • none:

    No weights are used.

max_iter: int, optional

The maximum number of iterations that the algorithm performs.

Attributes#

Many attributes are created during the fitting phase.

coef_: numpy.array

The regression coefficients. The order of coefficients is the same as the order of columns used during the fitting phase.

intercept_: float

The expected value of the dependent variable when all independent variables are zero, serving as the baseline or constant term in the model.

features_importance_: numpy.array

The importance of features is computed through the model coefficients, which are normalized based on their range. An activation function then calculates the final score. You must call the features_importance() method once to compute the values; they are then stored and reused for subsequent calls.

classes_: numpy.array

The class labels.

Note

All attributes can be accessed using the get_attributes() method.

Note

Several other attributes can be accessed by using the get_vertica_attributes() method.

Examples#

The following examples provide a basic understanding of usage. For more detailed examples, please refer to the Machine Learning or the Examples section on the website.

Load data for machine learning#

We import verticapy:

import verticapy as vp

Hint

By assigning an alias to verticapy, we mitigate the risk of code collisions with other libraries. This precaution is necessary because verticapy uses commonly known function names like “average” and “median”, which can potentially lead to naming conflicts. The use of an alias ensures that the functions from verticapy are used as intended without interfering with functions from other libraries.

For this example, we will use the winequality dataset.

import verticapy.datasets as vpd

data = vpd.load_winequality()
(Interactive table output: the first 100 rows of the winequality vDataFrame. Columns: fixed_acidity (Numeric), volatile_acidity (Numeric), citric_acid (Numeric), residual_sugar (Numeric), chlorides (Float), free_sulfur_dioxide (Numeric), total_sulfur_dioxide (Numeric), density (Float), pH (Numeric), sulphates (Numeric), alcohol (Float), quality (Integer), good (Integer), color (Varchar).)
Rows: 1-100 | Columns: 14

Note

VerticaPy offers a wide range of sample datasets that are ideal for training and testing purposes. You can explore the full list of available datasets in the Datasets section, which provides detailed information on each dataset and how to use them effectively. These datasets are invaluable resources for honing your data analysis and machine learning skills within the VerticaPy environment.

You can easily divide your dataset into training and testing subsets using the vDataFrame.train_test_split() method. This is a crucial step when preparing your data for machine learning, as it allows you to evaluate the performance of your models accurately.

data = vpd.load_winequality()
train, test = data.train_test_split(test_size = 0.2)

Warning

In this case, VerticaPy utilizes seeded randomization to guarantee the reproducibility of your data split. However, please be aware that this approach may lead to reduced performance. For a more efficient data split, you can use the vDataFrame.to_db() method to save your results into tables or temporary tables. This will help enhance the overall performance of the process.
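
For example, a minimal sketch of materializing the split into tables with vDataFrame.to_db() (the table names here are illustrative):

# save each split as a table so later queries avoid re-running the
# seeded random split (names are illustrative, not from the docs)
train.to_db("train_data", relation_type = "table", inplace = True)
test.to_db("test_data", relation_type = "table", inplace = True)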

Model Initialization#

First we import the LinearSVC model:

from verticapy.machine_learning.vertica import LinearSVC

Then we can create the model:

model = LinearSVC(
    tol = 1e-4,
    C = 1.0,
    intercept_scaling = 1.0,
    intercept_mode = "regularized",
    class_weight = [1, 1],
    max_iter = 100,
)

Hint

In verticapy 1.0.x and higher, you do not need to specify the model name, as the name is automatically assigned. If you need to reuse the model, you can fetch the model name from the model's attributes.

Important

The model name is crucial for the model management system and versioning. It’s highly recommended to provide a name if you plan to reuse the model later.
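
For instance, a named model could be created as follows (the name "wine_svc" is illustrative):

# an explicit name makes the model easy to retrieve and version later;
# overwrite_model replaces any existing model with the same name
model = LinearSVC("wine_svc", overwrite_model = True)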

Model Training#

We can now fit the model:

model.fit(
    train,
    [
        "fixed_acidity",
        "volatile_acidity",
        "citric_acid",
        "residual_sugar",
        "chlorides",
        "density"
    ],
    "good",
    test,
)

Important

To train a model, you can directly use the vDataFrame or the name of the relation stored in the database. The test set is optional and is only used to compute the test metrics. In verticapy, we don’t work using X matrices and y vectors. Instead, we work directly with lists of predictors and the response name.
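
As a sketch, the same training call can use a relation name instead of a vDataFrame (assuming the training data was saved as a table named "train_data"):

model.fit(
    "train_data",  # name of an existing relation in the database (assumed)
    [
        "fixed_acidity",
        "volatile_acidity",
        "citric_acid",
        "residual_sugar",
        "chlorides",
        "density",
    ],
    "good",
)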

Features Importance#

We can conveniently get the features importance:

result = model.features_importance()

Note

For LinearModel, feature importance is computed using the coefficients. These coefficients are then normalized using the feature distribution. An activation function is applied to get the final score.
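
As a rough illustration of the idea (not VerticaPy's exact formula), an importance score can be derived by weighting each coefficient by its feature's spread and normalizing; feature_std is an assumed array of per-predictor standard deviations:

import numpy as np

def importance_sketch(coef, feature_std):
    # weight each coefficient by its feature's spread, then normalize to 100
    raw = np.abs(np.asarray(coef, dtype = float)) * np.asarray(feature_std, dtype = float)
    return 100 * raw / raw.sum()

# e.g. importance_sketch(model.coef_, feature_std)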

Metrics#

We can get the entire report using:

model.report()
                 value
auc              0.6634095201510934
prc_auc          0.24623754546890747
accuracy         0.8202764976958525
log_loss         0.231043032730828
precision        0.0
recall           0.0
f1_score         0.0
mcc              0.0
informedness     0.0
markedness       -0.17972350230414746
csi              0.0
Rows: 1-11 | Columns: 2

Important

Most metrics are computed using a single SQL query, but some of them might require multiple SQL queries. Selecting only the necessary metrics in the report can help optimize performance. E.g. model.report(metrics = ["auc", "accuracy"]).

For classification models, we can easily modify the cutoff to observe the effect on different metrics:

model.report(cutoff = 0.2)
                 value
auc              0.6634095201510934
prc_auc          0.24623754546890747
accuracy         0.19047619047619047
log_loss         0.231043032730828
precision        0.18167701863354038
recall           1.0
f1_score         0.3074901445466492
mcc              0.048800962614738055
informedness     0.013108614232209659
markedness       0.18167701863354035
csi              0.18167701863354038
Rows: 1-11 | Columns: 2

You can also use the LinearModel.score function to compute any classification metric. The default metric is the accuracy:

model.score()
Out[3]: 0.8202764976958525
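
Any other classification metric can be requested through the metric parameter; for example, a sketch computing the F1-score at a custom cutoff:

model.score(metric = "f1", cutoff = 0.2)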

Prediction#

Prediction is straightforward:

model.predict(
    test,
    [
        "fixed_acidity",
        "volatile_acidity",
        "citric_acid",
        "residual_sugar",
        "chlorides",
        "density"
    ],
    "prediction",
)
(Interactive table output: the test vDataFrame with the same 14 winequality columns plus the new prediction column (Integer), which is 0 for every displayed row.)
Rows: 1-100 | Columns: 15

Note

Predictions can be made automatically using the test set, in which case you don’t need to specify the predictors. Alternatively, you can pass only the vDataFrame to the predict() function, but in this case, it’s essential that the column names of the vDataFrame match the predictors and response name in the model.
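
For instance, since the model already knows its predictors and the test relation, a sketch of predicting without re-listing the feature columns (the output column name is illustrative):

model.predict(test, name = "prediction_2")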

Probabilities#

It is also easy to get the model’s probabilities:

model.predict_proba(
    test,
    [
        "fixed_acidity",
        "volatile_acidity",
        "citric_acid",
        "residual_sugar",
        "chlorides",
        "density"
    ],
    "prediction",
)
(Interactive table output: the test vDataFrame with the same 14 winequality columns plus prediction (Integer), prediction_0 (Float), and prediction_1 (Float), the per-class probabilities.)
Rows: 1-100 | Columns: 17

Note

Probabilities are added to the vDataFrame, and VerticaPy uses the corresponding probability function in SQL behind the scenes. You can use the pos_label parameter to add only the probability of the selected category.
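
For example, a sketch adding only the probability of the positive class (here, the label 1; the output column name is illustrative):

model.predict_proba(test, name = "prob_good", pos_label = 1)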

Confusion Matrix#

You can obtain the confusion matrix of your choice by specifying the desired cutoff.

model.confusion_matrix(cutoff = 0.5)
Out[4]: 
array([[1068,    0],
       [ 234,    0]])

Note

In classification, the cutoff is a threshold value used to determine class assignment based on predicted probabilities or scores from a classification model. In binary classification, if the predicted probability for a specific class is greater than or equal to the cutoff, the instance is assigned to the positive class; otherwise, it is assigned to the negative class. Adjusting the cutoff allows for trade-offs between true positives and false positives, enabling the model to be optimized for specific objectives or to consider the relative costs of different classification errors. The choice of cutoff is critical for tailoring the model’s performance to meet specific needs.
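
To help choose a cutoff suited to your objective, the cutoff curve visualizes how the metrics evolve as the threshold varies (a sketch):

model.cutoff_curve()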

Main Plots (Classification Curves)#

Classification models allow for the creation of various plots that are very helpful in understanding the model, such as the ROC Curve, PRC Curve, Cutoff Curve, Gain Curve, and more.

Most of the classification curves can be found in the Machine Learning - Classification Curve.

For example, let’s draw the model’s ROC curve.

model.roc_curve()

Important

Most of the curves have a parameter called nbins, which is essential for estimating metrics. The larger the nbins, the more precise the estimation, but it can significantly impact performance. Exercise caution when increasing this parameter excessively.

Hint

In binary classification, various curves can be easily plotted. However, in multi-class classification, it’s important to select the pos_label, representing the class to be treated as positive when drawing the curve.

Other Plots#

If the model allows, you can also generate relevant plots. For example, classification plots can be found in the Machine Learning - Classification Plots.

model.plot()

Important

The plotting feature is typically suitable for models with fewer than three predictors.

A contour plot is another useful visualization that can be produced for models with two predictors.

model.contour()

Important

Machine learning models with two predictors can usually benefit from their own contour plot. This visual representation aids in exploring predictions and gaining a deeper understanding of how these models perform in different scenarios. Please refer to Contour Plot for more examples.

Parameter Modification#

In order to see the parameters:

model.get_params()
Out[5]: 
{'tol': 0.0001,
 'C': 1.0,
 'intercept_scaling': 1.0,
 'intercept_mode': 'regularized',
 'class_weight': [1, 1],
 'max_iter': 100}

And to manually change some of the parameters:

model.set_params({'tol': 0.001})

Model Register#

In order to register the model for tracking and versioning:

model.register("model_v1")

Please refer to Model Tracking and Versioning for more details on model tracking and versioning.

Model Exporting#

To Memmodel

model.to_memmodel()

Note

MemModel objects serve as in-memory representations of machine learning models. They can be used for both in-database and in-memory prediction tasks. These objects can be pickled in the same way that you would pickle a scikit-learn model.

The following methods for exporting the model use MemModel, and it is recommended to use MemModel directly.
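
As a sketch, in-memory prediction through the exported MemModel (the feature order must match the training order):

mmodel = model.to_memmodel()
X = [[4.2, 0.17, 0.36, 1.8, 0.029, 0.9899]]
mmodel.predict(X)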

To SQL

You can get the SQL code by:

model.to_sql()
Out[7]: '((1 / (1 + EXP(- (1.47377673486566 + 0.00527762759337545 * "fixed_acidity" + -0.543583271863033 * "volatile_acidity" + 0.192258472232652 * "citric_acid" + -0.0201778588607175 * "residual_sugar" + -4.56718403612728 * "chlorides" + -1.63874906986082 * "density")))) > 0.5)::int'
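
This expression can be evaluated directly in SQL or attached to a vDataFrame; for instance, a sketch using vDataFrame.eval (the column name is illustrative, and this assumes the vDataFrame columns match the model's predictors):

test.eval("svm_prediction_sql", model.to_sql())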

To Python

To obtain the prediction function in Python syntax, use the following code:

X = [[4.2, 0.17, 0.36, 1.8, 0.029, 0.9899]]

model.to_python()(X)
Out[9]: array([0])

Hint

The to_python() method is used to retrieve predictions, probabilities, or cluster distances. For specific details on how to use this method for different model types, refer to the relevant documentation for each model.
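
For example, in-memory probabilities can be retrieved as well; a sketch using the return_proba option with the same X as above:

model.to_python(return_proba = True)(X)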

__init__(name: str = None, overwrite_model: bool = False, tol: float = 0.0001, C: float = 1.0, intercept_scaling: float = 1.0, intercept_mode: Literal['regularized', 'unregularized'] = 'regularized', class_weight: Literal['auto', 'none'] | list = [1, 1], max_iter: int = 100) None#

Methods

__init__([name, overwrite_model, tol, C, ...])

classification_report([metrics, cutoff, nbins])

Computes a classification report using multiple model evaluation metrics (auc, accuracy, f1...).

confusion_matrix([cutoff])

Computes the model confusion matrix.

contour([nbins, chart])

Draws the model's contour plot.

cutoff_curve([nbins, show, chart])

Draws the model Cutoff curve.

deploySQL([X, cutoff])

Returns the SQL code needed to deploy the model.

does_model_exists(name[, raise_error, ...])

Checks whether the model is stored in the Vertica database.

drop()

Drops the model from the Vertica database.

export_models(name, path[, kind])

Exports machine learning models.

features_importance([show, chart])

Computes the model's features importance.

fit(input_relation, X, y[, test_relation, ...])

Trains the model.

get_attributes([attr_name])

Returns the model attributes.

get_match_index(x, col_list[, str_check])

Returns the matching index.

get_params()

Returns the parameters of the model.

get_plotting_lib([class_name, chart, ...])

Returns the first available library (Plotly, Matplotlib, or Highcharts) to draw a specific graphic.

get_vertica_attributes([attr_name])

Returns the model Vertica attributes.

import_models(path[, schema, kind])

Imports machine learning models.

lift_chart([nbins, show, chart])

Draws the model Lift Chart.

plot([max_nb_points, chart])

Draws the model.

prc_curve([nbins, show, chart])

Draws the model PRC curve.

predict(vdf[, X, name, cutoff, inplace])

Makes predictions on the input relation.

predict_proba(vdf[, X, name, pos_label, inplace])

Returns the model's probabilities using the input relation.

register(registered_name[, raise_error])

Registers the model and adds it to in-DB Model versioning environment with a status of 'under_review'.

report([metrics, cutoff, nbins])

Computes a classification report using multiple model evaluation metrics (auc, accuracy, f1...).

roc_curve([nbins, show, chart])

Draws the model ROC curve.

score([metric, cutoff, nbins])

Computes the model score.

set_params([parameters])

Sets the parameters of the model.

summarize()

Summarizes the model.

to_binary(path)

Exports the model to the Vertica Binary format.

to_memmodel()

Converts the model to an InMemory object that can be used for different types of predictions.

to_pmml(path)

Exports the model to PMML.

to_python([return_proba, ...])

Returns the Python function needed for in-memory scoring without using built-in Vertica functions.

to_sql([X, return_proba, ...])

Returns the SQL code needed to deploy the model without using built-in Vertica functions.

to_tf(path)

Exports the model to the Frozen Graph format (TensorFlow).

Attributes