verticapy.machine_learning.vertica.tree.DecisionTreeClassifier.score

DecisionTreeClassifier.score(metric: Literal['aic', 'bic', 'accuracy', 'acc', 'balanced_accuracy', 'ba', 'auc', 'roc_auc', 'prc_auc', 'best_cutoff', 'best_threshold', 'false_discovery_rate', 'fdr', 'false_omission_rate', 'for', 'false_negative_rate', 'fnr', 'false_positive_rate', 'fpr', 'recall', 'tpr', 'precision', 'ppv', 'specificity', 'tnr', 'negative_predictive_value', 'npv', 'negative_likelihood_ratio', 'lr-', 'positive_likelihood_ratio', 'lr+', 'diagnostic_odds_ratio', 'dor', 'log_loss', 'logloss', 'f1', 'f1_score', 'mcc', 'bm', 'informedness', 'mk', 'markedness', 'ts', 'csi', 'critical_success_index', 'fowlkes_mallows_index', 'fm', 'prevalence_threshold', 'pm', 'confusion_matrix', 'classification_report'] = 'accuracy', average: Literal[None, 'binary', 'micro', 'macro', 'scores', 'weighted'] | None = None, pos_label: Annotated[bool | float | str | timedelta | datetime, 'Python Scalar'] | None = None, cutoff: Annotated[int | float | Decimal, 'Python Numbers'] = 0.5, nbins: int = 10000) float | list[float]

Computes the model score.

Parameters

metric: str, optional

The metric used to compute the score.

  • accuracy:

    Accuracy.

    \[Accuracy = \frac{TP + TN}{TP + TN + FP + FN}\]
  • aic:

    Akaike’s Information Criterion

    \[AIC = 2k - 2\ln(\hat{L})\]
  • auc:

    Area Under the Curve (ROC).

    \[AUC = \int_{0}^{1} TPR(FPR) \, dFPR\]
  • ba:

    Balanced Accuracy.

    \[BA = \frac{TPR + TNR}{2}\]
  • best_cutoff:

    The cutoff that optimizes the ROC curve prediction.

  • bic:

    Bayesian Information Criterion

    \[BIC = -2\ln(\hat{L}) + k \ln(n)\]
  • bm:

    Informedness

    \[BM = TPR + TNR - 1\]
  • csi:

    Critical Success Index

    \[index = \frac{TP}{TP + FN + FP}\]
  • f1:

    F1 Score

    \[F_1 Score = 2 \times \frac{Precision \times Recall}{Precision + Recall}\]
  • fdr:

    False Discovery Rate

    \[FDR = 1 - PPV\]
  • fm:

    Fowlkes-Mallows index

    \[FM = \sqrt{PPV * TPR}\]
  • fnr:

    False Negative Rate

    \[FNR = \frac{FN}{FN + TP}\]
  • for:

    False Omission Rate

    \[FOR = 1 - NPV\]
  • fpr:

    False Positive Rate

    \[FPR = \frac{FP}{FP + TN}\]
  • logloss:

    Log Loss.

    \[Loss = -\frac{1}{N} \sum_{i=1}^{N} \left( y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right)\]
  • lr+:

    Positive Likelihood Ratio.

    \[LR+ = \frac{TPR}{FPR}\]
  • lr-:

    Negative Likelihood Ratio.

    \[LR- = \frac{FNR}{TNR}\]
  • dor:

    Diagnostic Odds Ratio.

    \[DOR = \frac{TP \times TN}{FP \times FN}\]
  • mcc:

    Matthews Correlation Coefficient.

    \[MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}\]
  • mk:

    Markedness

    \[MK = PPV + NPV - 1\]
  • npv:

    Negative Predictive Value

    \[NPV = \frac{TN}{TN + FN}\]
  • prc_auc:

    Area Under the Curve (PRC)

    \[AUC = \int_{0}^{1} Precision(Recall) \, dRecall\]
  • precision:

    Precision

    \[Precision = \frac{TP}{TP + FP}\]
  • pt:

    Prevalence Threshold.

    \[threshold = \frac{\sqrt{FPR}}{\sqrt{TPR} + \sqrt{FPR}}\]
  • recall:

    Recall.

    \[Recall = \frac{TP}{TP + FN}\]
  • specificity:

    Specificity.

    \[Specificity = \frac{TN}{TN + FP}\]
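Most of the metrics above derive from the four entries of the binary confusion matrix. The following plain-Python sketch (illustrative only; VerticaPy computes these metrics in-database, not with this code) mirrors a few of the formulas:

```python
# Illustrative only: a few of the metrics above, computed from the
# entries of a binary confusion matrix (TP, TN, FP, FN).
from math import sqrt

def binary_metrics(tp, tn, fp, fn):
    tpr = tp / (tp + fn)          # recall / sensitivity
    tnr = tn / (tn + fp)          # specificity
    ppv = tp / (tp + fp)          # precision
    npv = tn / (tn + fn)          # negative predictive value
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "ba": (tpr + tnr) / 2,                       # balanced accuracy
        "f1": 2 * ppv * tpr / (ppv + tpr),
        "bm": tpr + tnr - 1,                         # informedness
        "mk": ppv + npv - 1,                         # markedness
        "mcc": (tp * tn - fp * fn)
               / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Hypothetical counts, just to exercise the formulas.
print(binary_metrics(tp=40, tn=45, fp=5, fn=10))
```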
average: str, optional

The method used to compute the final score for multiclass-classification.

  • binary:

    considers one of the classes as positive and uses the binary confusion matrix to compute the score.

  • micro:

    computes the score globally, counting positive and negative values across all classes.

  • macro:

    average of the score of each class.

  • scores:

    scores for all the classes.

  • weighted:

    weighted average of the score of each class.

If empty, the result will depend on the input metric. Whenever it is possible, the exact score is computed. Otherwise, the behaviour is similar to the ‘scores’ option.
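The averaging strategies above can be sketched in plain Python for a one-vs-rest precision computation. The class names and counts below are hypothetical; VerticaPy performs this aggregation in-database:

```python
# Illustrative only: 'micro', 'macro', and 'weighted' averaging of
# per-class precision. Per-class (tp, fp, support) counts from a
# hypothetical 3-class problem.
per_class = {
    "a": (40, 10, 50),
    "b": (20, 5, 30),
    "c": (15, 5, 20),
}

def precision(tp, fp):
    return tp / (tp + fp)

# macro: unweighted mean of per-class scores.
macro = sum(precision(tp, fp) for tp, fp, _ in per_class.values()) / len(per_class)

# weighted: mean of per-class scores, weighted by class support.
total = sum(s for _, _, s in per_class.values())
weighted = sum(precision(tp, fp) * s for tp, fp, s in per_class.values()) / total

# micro: pool the counts globally, then compute the score once.
tp_sum = sum(tp for tp, _, _ in per_class.values())
fp_sum = sum(fp for _, fp, _ in per_class.values())
micro = precision(tp_sum, fp_sum)

print(macro, weighted, micro)
```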

pos_label: PythonScalar, optional

Label to consider as positive. All the other classes will be merged and considered as negative for multiclass classification.

cutoff: PythonNumber, optional

Cutoff for which the tested category is accepted as a prediction.
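In a binary setting, the cutoff acts as a probability threshold: the tested category is predicted only when its probability reaches the cutoff. A minimal sketch (illustrative only, with hypothetical probabilities):

```python
# Illustrative only: accept the tested class (label 1) when its
# predicted probability reaches the cutoff.
def apply_cutoff(probs, cutoff=0.5):
    return [1 if p >= cutoff else 0 for p in probs]

print(apply_cutoff([0.2, 0.5, 0.8], cutoff=0.5))  # -> [0, 1, 1]
```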

nbins: int, optional

[Only when method is set to auc|prc_auc|best_cutoff] An integer value that determines the number of decision boundaries. Decision boundaries are set at equally spaced intervals between 0 and 1, inclusive. Greater values for nbins give more precise estimations of the AUC, but can potentially decrease performance. The maximum value is 999,999. If negative, the maximum value is used.
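The role of nbins can be sketched in plain Python: equally spaced cutoffs on [0, 1] turn predicted probabilities into (FPR, TPR) points, and the area under that curve approximates the AUC. The data below is hypothetical, and this is not VerticaPy's in-database implementation:

```python
# Illustrative only: approximate ROC AUC using nbins equally spaced
# decision boundaries between 0 and 1, inclusive. Larger nbins gives
# finer thresholds and a more precise estimate, at extra cost.
def approx_auc(y_true, y_prob, nbins=10):
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = []
    for i in range(nbins + 1):
        t = i / nbins  # equally spaced cutoff
        tp = sum(1 for y, p in zip(y_true, y_prob) if p >= t and y == 1)
        fp = sum(1 for y, p in zip(y_true, y_prob) if p >= t and y == 0)
        points.append((fp / neg, tp / pos))  # (FPR, TPR)
    points.sort()
    # Trapezoidal rule over the (FPR, TPR) points.
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

print(approx_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8], nbins=100))
```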

Returns

float | list[float]

The computed score, or a list of scores (one per class) when per-class results are requested.

Examples

For this example, we will use the Iris dataset.

import verticapy.datasets as vpd

data = vpd.load_iris()

train, test = data.train_test_split(test_size = 0.2)
(Dataset preview omitted: an interactive table of 5 columns — SepalLengthCm, SepalWidthCm, PetalLengthCm, PetalWidthCm, all Numeric(7), and Species, Varchar(30) — showing rows 1-100 of the Iris data.)

Let’s import the model:

from verticapy.machine_learning.vertica import NearestCentroid

Then we can create the model:

model = NearestCentroid(p = 2)

We can now fit the model:

model.fit(
    train,
    [
        "SepalLengthCm",
        "SepalWidthCm",
        "PetalLengthCm",
        "PetalWidthCm",
    ],
    "Species",
    test,
)

We can get the score:

model.score()
0.8181818181818182

To get the score of a particular class:

model.score(pos_label="Iris-setosa")
0.7272727272727273

Important

For this example, a specific model is utilized, and it may not correspond exactly to the model you are working with. To see a comprehensive example specific to your class of interest, please refer to that particular class.