verticapy.machine_learning.vertica.tree.DummyTreeClassifier.score
- DummyTreeClassifier.score(metric: Literal['aic', 'bic', 'accuracy', 'acc', 'balanced_accuracy', 'ba', 'auc', 'roc_auc', 'prc_auc', 'best_cutoff', 'best_threshold', 'false_discovery_rate', 'fdr', 'false_omission_rate', 'for', 'false_negative_rate', 'fnr', 'false_positive_rate', 'fpr', 'recall', 'tpr', 'precision', 'ppv', 'specificity', 'tnr', 'negative_predictive_value', 'npv', 'negative_likelihood_ratio', 'lr-', 'positive_likelihood_ratio', 'lr+', 'diagnostic_odds_ratio', 'dor', 'log_loss', 'logloss', 'f1', 'f1_score', 'mcc', 'bm', 'informedness', 'mk', 'markedness', 'ts', 'csi', 'critical_success_index', 'fowlkes_mallows_index', 'fm', 'prevalence_threshold', 'pm', 'confusion_matrix', 'classification_report'] = 'accuracy', average: Literal[None, 'binary', 'micro', 'macro', 'scores', 'weighted'] | None = None, pos_label: bool | float | str | timedelta | datetime | None = None, cutoff: int | float | Decimal = 0.5, nbins: int = 10000) float | list[float] #
Computes the model score.
Parameters
- metric: str, optional
The metric used to compute the score.
- accuracy:
Accuracy.
\[Accuracy = \frac{TP + TN}{TP + TN + FP + FN}\]
- aic:
Akaike’s Information Criterion
\[AIC = 2k - 2\ln(\hat{L})\]
- auc:
Area Under the Curve (ROC).
\[AUC = \int_{0}^{1} TPR(FPR) \, dFPR\]
- ba:
Balanced Accuracy.
\[BA = \frac{TPR + TNR}{2}\]
- best_cutoff:
Cutoff that optimises the ROC curve prediction.
- bic:
Bayesian Information Criterion
\[BIC = -2\ln(\hat{L}) + k \ln(n)\]
- bm:
Informedness
\[BM = TPR + TNR - 1\]
- csi:
Critical Success Index
\[CSI = \frac{TP}{TP + FN + FP}\]
- f1:
F1 Score
\[F_1 Score = 2 \times \frac{Precision \times Recall}{Precision + Recall}\]
- fdr:
False Discovery Rate
\[FDR = 1 - PPV\]
- fm:
Fowlkes-Mallows index
\[FM = \sqrt{PPV * TPR}\]
- fnr:
False Negative Rate
\[FNR = \frac{FN}{FN + TP}\]
- for:
False Omission Rate
\[FOR = 1 - NPV\]
- fpr:
False Positive Rate
\[FPR = \frac{FP}{FP + TN}\]
- logloss:
Log Loss.
\[Loss = -\frac{1}{N} \sum_{i=1}^{N} \left( y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right)\]
- lr+:
Positive Likelihood Ratio.
\[LR+ = \frac{TPR}{FPR}\]
- lr-:
Negative Likelihood Ratio.
\[LR- = \frac{FNR}{TNR}\]
- dor:
Diagnostic Odds Ratio.
\[DOR = \frac{TP \times TN}{FP \times FN}\]
- mcc:
Matthews Correlation Coefficient
\[MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}\]
- mk:
Markedness
\[MK = PPV + NPV - 1\]
- npv:
Negative Predictive Value
\[NPV = \frac{TN}{TN + FN}\]
- prc_auc:
Area Under the Curve (PRC)
\[AUC = \int_{0}^{1} Precision(Recall) \, dRecall\]
- precision:
Precision
\[Precision = \frac{TP}{TP + FP}\]
- pt:
Prevalence Threshold.
\[threshold = \frac{\sqrt{FPR}}{\sqrt{TPR} + \sqrt{FPR}}\]
- recall:
Recall.
\[Recall = \frac{TP}{TP + FN}\]
- specificity:
Specificity.
\[Specificity = \frac{TN}{TN + FP}\]
- average: str, optional
The method used to compute the final score for multiclass-classification.
- binary:
considers one of the classes as positive and uses the binary confusion matrix to compute the score.
- micro:
computes the positive and negative values globally, across all classes.
- macro:
average of the score of each class.
- scores:
scores for all the classes.
- weighted:
weighted average of the score of each class.
If empty, the result will depend on the input metric. Whenever it is possible, the exact score is computed. Otherwise, the behaviour is similar to the ‘scores’ option.
- pos_label: PythonScalar, optional
Label to consider as positive. All the other classes will be merged and considered as negative for multiclass classification.
- cutoff: PythonNumber, optional
Cutoff for which the tested category is accepted as a prediction.
- nbins: int, optional
[Only when metric is set to auc|prc_auc|best_cutoff] An integer value that determines the number of decision boundaries. Decision boundaries are set at equally spaced intervals between 0 and 1, inclusive. Greater values for nbins give more precise estimations of the AUC, but can potentially decrease performance. The maximum value is 999,999. If negative, the maximum value is used.
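The confusion-matrix formulas listed under metric can be checked by hand. The following is a minimal pure-Python sketch, not VerticaPy's implementation (which runs in the database). The helper names `confusion_counts` and `binary_metrics` are hypothetical, and treating a probability equal to the cutoff as a positive prediction is an assumption made for illustration.

```python
import math

def confusion_counts(y_true, y_prob, pos_label, cutoff=0.5):
    """Count TP, TN, FP, FN for one positive class at a given cutoff."""
    tp = tn = fp = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted_pos = prob >= cutoff  # assumption: ties count as positive
        actual_pos = label == pos_label
        if predicted_pos and actual_pos:
            tp += 1
        elif predicted_pos:
            fp += 1
        elif actual_pos:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn

def binary_metrics(tp, tn, fp, fn):
    """Apply the formulas from the metric list (no zero-division guards)."""
    tpr = tp / (tp + fn)  # recall
    tnr = tn / (tn + fp)  # specificity
    ppv = tp / (tp + fp)  # precision
    npv = tn / (tn + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "recall": tpr,
        "specificity": tnr,
        "precision": ppv,
        "npv": npv,
        "ba": (tpr + tnr) / 2,
        "bm": tpr + tnr - 1,
        "mk": ppv + npv - 1,
        "f1": 2 * ppv * tpr / (ppv + tpr),
        "csi": tp / (tp + fn + fp),
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Made-up labels and predicted probabilities of class "a".
labels = ["a", "a", "a", "b", "b", "b", "b", "a"]
probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

counts = confusion_counts(labels, probs, pos_label="a", cutoff=0.5)
print(counts)                              # (3, 3, 1, 1)
print(binary_metrics(*counts)["accuracy"])  # 0.75
```

Raising cutoff makes the positive class harder to predict, which moves rows from TP and FP into FN and TN and changes every metric derived from them.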
Returns
- float
score.
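Whether the result is a single float or a list of floats depends on the average option. The sketch below illustrates the four averaging modes for precision in plain Python. It is not VerticaPy's code: the function name `averaged_precision` is hypothetical, the per-class counts are made up, and weighting each class by its number of true rows is an assumption, since the description above only says "weighted average".

```python
def averaged_precision(counts, average):
    """Combine per-class precision.

    counts maps each class to its one-vs-rest (tp, fp, fn) triple.
    """
    classes = sorted(counts)
    per_class = [counts[c][0] / (counts[c][0] + counts[c][1]) for c in classes]
    if average == "scores":
        # One score per class, returned as a list.
        return per_class
    if average == "macro":
        # Unweighted mean of the per-class scores.
        return sum(per_class) / len(per_class)
    if average == "weighted":
        # Assumption: weight = number of true rows of the class (tp + fn).
        support = [counts[c][0] + counts[c][2] for c in classes]
        return sum(s * w for s, w in zip(per_class, support)) / sum(support)
    if average == "micro":
        # Pool the counts over all classes, then apply the formula once.
        tp = sum(counts[c][0] for c in classes)
        fp = sum(counts[c][1] for c in classes)
        return tp / (tp + fp)
    raise ValueError(f"unsupported average: {average!r}")

# Hypothetical one-vs-rest counts for a three-class problem.
counts = {
    "setosa": (8, 2, 0),
    "versicolor": (6, 2, 4),
    "virginica": (6, 0, 0),
}

for mode in ("scores", "macro", "weighted", "micro"):
    print(mode, averaged_precision(counts, mode))
```

Macro averaging treats every class equally, while the weighted and micro variants let larger classes dominate, so the three can differ noticeably on imbalanced data.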
Examples
For this example, we will use the Iris dataset.
import verticapy.datasets as vpd
data = vpd.load_iris()
train, test = data.train_test_split(test_size = 0.2)
[Output: preview of the dataset. Columns: SepalLengthCm, SepalWidthCm, PetalLengthCm, PetalWidthCm (Numeric(7)); Species (Varchar(30)). Rows: 1-100 | Columns: 5]
Let’s import the model:
from verticapy.machine_learning.vertica import NearestCentroid
Then we can create the model:
model = NearestCentroid(p = 2)
We can now fit the model:
model.fit(
    train,
    [
        "SepalLengthCm",
        "SepalWidthCm",
        "PetalLengthCm",
        "PetalWidthCm",
    ],
    "Species",
    test,
)
We can get the score:
model.score()
Out[106]: 0.5714285714285714
To get the score of a particular class:
model.score(pos_label = "Iris-setosa")
Out[107]: 0.5510204081632653
Important
For this example, a specific model is utilized, and it may not correspond exactly to the model you are working with. To see a comprehensive example specific to your class of interest, please refer to that particular class.
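To see why larger nbins values give a more precise AUC, the following pure-Python sketch evaluates the ROC curve at equally spaced decision boundaries and integrates it with the trapezoidal rule. It only illustrates the idea and is not how VerticaPy computes the metric. The function name `roc_auc_binned`, the placement of boundaries at i / nbins, and the sample data are assumptions.

```python
def roc_auc_binned(y_true, y_prob, nbins=100):
    """Approximate the ROC AUC using equally spaced thresholds in [0, 1].

    y_true holds booleans (True = positive class), y_prob the predicted
    probabilities of the positive class.
    """
    pos = sum(1 for y in y_true if y)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]  # ROC curves start at (FPR, TPR) = (0, 0)
    # Walk the thresholds from 1 down to 0 so that FPR never decreases.
    for i in range(nbins, -1, -1):
        t = i / nbins
        tp = sum(1 for y, p in zip(y_true, y_prob) if p >= t and y)
        fp = sum(1 for y, p in zip(y_true, y_prob) if p >= t and not y)
        points.append((fp / neg, tp / pos))
    # Trapezoidal rule over consecutive (FPR, TPR) points.
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2
    return auc

# Made-up labels and predicted probabilities.
y_true = [True, True, True, False, False, False, False, True]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

print(roc_auc_binned(y_true, y_prob, nbins=2))    # 0.75 (coarse estimate)
print(roc_auc_binned(y_true, y_prob, nbins=100))  # 0.9375
```

With only two bins, several probabilities fall between the same pair of boundaries and the curve is cut short. With 100 bins each boundary separates at most one row in this sample, so the estimate matches the exact pairwise AUC of 15/16.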