
verticapy.machine_learning.vertica.neighbors.KNeighborsClassifier.classification_report¶
- KNeighborsClassifier.classification_report(metrics: None | str | list[Literal['aic', 'bic', 'accuracy', 'acc', 'balanced_accuracy', 'ba', 'auc', 'roc_auc', 'prc_auc', 'best_cutoff', 'best_threshold', 'false_discovery_rate', 'fdr', 'false_omission_rate', 'for', 'false_negative_rate', 'fnr', 'false_positive_rate', 'fpr', 'recall', 'tpr', 'precision', 'ppv', 'specificity', 'tnr', 'negative_predictive_value', 'npv', 'negative_likelihood_ratio', 'lr-', 'positive_likelihood_ratio', 'lr+', 'diagnostic_odds_ratio', 'dor', 'log_loss', 'logloss', 'f1', 'f1_score', 'mcc', 'bm', 'informedness', 'mk', 'markedness', 'ts', 'csi', 'critical_success_index', 'fowlkes_mallows_index', 'fm', 'prevalence_threshold', 'pm', 'confusion_matrix', 'classification_report']] = None, cutoff: Annotated[int | float | Decimal, 'Python Numbers'] | None = None, labels: None | str | list[str] = None, nbins: int = 10000) → float | TableSample¶
Computes a classification report using multiple model evaluation metrics (auc, accuracy, f1, …). For multiclass classification, it considers each category as positive and switches to the next one during the computation.
Parameters¶
- metrics: list, optional
List of the metrics used to compute the final report.
- accuracy:
Accuracy.
\[Accuracy = \frac{TP + TN}{TP + TN + FP + FN}\]
- aic:
Akaike’s Information Criterion
\[AIC = 2k - 2\ln(\hat{L})\]
- auc:
Area Under the Curve (ROC).
\[AUC = \int_{0}^{1} TPR(FPR) \, dFPR\]
- ba:
Balanced Accuracy.
\[BA = \frac{TPR + TNR}{2}\]
- best_cutoff:
Cutoff that optimizes the ROC curve prediction.
- bic:
Bayesian Information Criterion
\[BIC = -2\ln(\hat{L}) + k \ln(n)\]
- bm:
Informedness
\[BM = TPR + TNR - 1\]
- csi:
Critical Success Index
\[index = \frac{TP}{TP + FN + FP}\]
- f1:
F1 Score
\[F_1 Score = 2 \times \frac{Precision \times Recall}{Precision + Recall}\]
- fdr:
False Discovery Rate
\[FDR = 1 - PPV\]
- fm:
Fowlkes-Mallows index
\[FM = \sqrt{PPV * TPR}\]
- fnr:
False Negative Rate
\[FNR = \frac{FN}{FN + TP}\]
- for:
False Omission Rate
\[FOR = 1 - NPV\]
- fpr:
False Positive Rate
\[FPR = \frac{FP}{FP + TN}\]
- logloss:
Log Loss.
\[Loss = -\frac{1}{N} \sum_{i=1}^{N} \left( y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right)\]
- lr+:
Positive Likelihood Ratio.
\[LR+ = \frac{TPR}{FPR}\]
- lr-:
Negative Likelihood Ratio.
\[LR- = \frac{FNR}{TNR}\]
- dor:
Diagnostic Odds Ratio.
\[DOR = \frac{TP \times TN}{FP \times FN}\]
- mcc:
Matthews Correlation Coefficient.
\[MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}\]
- mk:
Markedness
\[MK = PPV + NPV - 1\]
- npv:
Negative Predictive Value
\[NPV = \frac{TN}{TN + FN}\]
- prc_auc:
Area Under the Curve (PRC)
\[AUC = \int_{0}^{1} Precision(Recall) \, dRecall\]
- precision:
Precision
\[Precision = \frac{TP}{TP + FP}\]
- pt:
Prevalence Threshold.
\[threshold = \frac{\sqrt{FPR}}{\sqrt{TPR} + \sqrt{FPR}}\]
- recall:
Recall.
\[Recall = \frac{TP}{TP + FN}\]
- specificity:
Specificity.
\[Specificity = \frac{TN}{TN + FP}\]
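As an illustration of how the formulas above relate to one another, here is a minimal, hypothetical Python sketch that derives several of the listed metrics from raw confusion-matrix counts. It is not VerticaPy's implementation; the function name and the example counts are made up for this sketch.

```python
import math

def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Derive a few of the report's metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)      # PPV
    recall = tp / (tp + fn)         # TPR
    specificity = tn / (tn + fp)    # TNR
    npv = tn / (tn + fn)            # Negative Predictive Value
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "bm": recall + specificity - 1,   # informedness
        "mk": precision + npv - 1,        # markedness
        "csi": tp / (tp + fn + fp),       # critical success index
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Hypothetical counts for one positive class of a one-vs-rest evaluation.
m = binary_metrics(tp=40, tn=45, fp=5, fn=10)
```

Note that F1 can equivalently be written as 2·TP / (2·TP + FP + FN), which is what the harmonic-mean form above reduces to.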
- cutoff: PythonNumber, optional
Cutoff above which the tested category is accepted as the prediction. For multiclass classification, each tested category becomes the positive class and all the others are merged into the negative class. The cutoff represents the classes' threshold. If it is empty, the regular cutoff, 1 / (number of classes), is used.
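The one-vs-rest cutoff logic described above can be sketched as follows. This is a hypothetical illustration, not VerticaPy's code; the function name and the probability values are assumptions made for the example.

```python
def predict_with_cutoff(probas, cutoff=None):
    """probas maps each class label to its predicted probability."""
    if cutoff is None:
        # Regular cutoff: 1 / number of classes.
        cutoff = 1 / len(probas)
    # Among the classes clearing the cutoff, keep the most probable one.
    accepted = {c: p for c, p in probas.items() if p >= cutoff}
    return max(accepted, key=accepted.get) if accepted else None

# With three classes the default cutoff is 1/3, so the 0.5 class wins.
pred = predict_with_cutoff(
    {"Iris-setosa": 0.2, "Iris-versicolor": 0.5, "Iris-virginica": 0.3}
)
```

With a stricter explicit cutoff (e.g. 0.6 here), no class may clear it, in which case this sketch returns None.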
- labels: str | list, optional
List of the different labels to be used during the computation.
- nbins: int, optional
[Used to compute ROC AUC, PRC AUC and the best cutoff] An integer value that determines the number of decision boundaries. Decision boundaries are set at equally spaced intervals between 0 and 1, inclusive. Greater values for nbins give more precise estimations of the metrics, but can potentially decrease performance. The maximum value is 999,999. If negative, the maximum value is used.
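To make the role of nbins concrete, here is a minimal sketch of a best-cutoff search over equally spaced decision boundaries between 0 and 1, using Youden's J statistic (TPR - FPR) as the criterion. This is an assumption-laden illustration of the idea, not VerticaPy's implementation; the function name and sample data are invented for the example.

```python
def best_cutoff(scores, labels, nbins=100):
    """Scan nbins + 1 equally spaced thresholds in [0, 1] and return
    the one maximizing Youden's J (TPR - FPR). Labels are 0/1."""
    boundaries = [i / nbins for i in range(nbins + 1)]  # 0.0 .. 1.0 inclusive
    best_t, best_j = 0.0, float("-inf")
    for t in boundaries:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        tpr = tp / (tp + fn) if tp + fn else 0.0
        fpr = fp / (fp + tn) if fp + tn else 0.0
        if tpr - fpr > best_j:
            best_j, best_t = tpr - fpr, t
    return best_t

# Toy scores: a cutoff between 0.2 and 0.8 separates the classes perfectly.
cut = best_cutoff([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1], nbins=100)
```

A larger nbins scans more boundaries, so the estimate of the best cutoff (and of ROC/PRC AUC, which integrate over the same boundaries) gets finer at the cost of more work.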
Returns¶
- TableSample
report.
Examples¶
For this example, we will use the Iris dataset.
import verticapy.datasets as vpd

data = vpd.load_iris()

train, test = data.train_test_split(test_size = 0.2)
[Output: a 100-row sample of the Iris dataset with columns SepalLengthCm, SepalWidthCm, PetalLengthCm, PetalWidthCm, and Species. Rows: 1-100 | Columns: 5]

Let's import the model:
from verticapy.machine_learning.vertica import NearestCentroid
Then we can create the model:
model = NearestCentroid(p = 2)
We can now fit the model:
model.fit(
    train,
    [
        "SepalLengthCm",
        "SepalWidthCm",
        "PetalLengthCm",
        "PetalWidthCm",
    ],
    "Species",
    test,
)
We can get all the classification metrics using the classification_report method:

model.classification_report()
| metric | Iris-setosa | Iris-versicolor | Iris-virginica | avg_macro | avg_weighted | avg_micro |
|---|---|---|---|---|---|---|
| auc | 1.0000 | 0.9968 | 0.9594 | 0.9854 | 0.9827 | [null] |
| prc_auc | 1.0000 | 0.9883 | 0.9711 | 0.9864 | 0.9858 | [null] |
| accuracy | 0.7955 | 0.7045 | 0.9091 | 0.8030 | 0.8233 | 0.8030 |
| log_loss | 0.2217 | 0.1853 | 0.2175 | 0.2082 | 0.2126 | [null] |
| precision | 1.0000 | 0.4091 | 1.0000 | 0.8030 | 0.8791 | 0.7045 |
| recall | 0.4706 | 1.0000 | 0.7778 | 0.7495 | 0.7045 | 0.7045 |
| f1_score | 0.6400 | 0.5806 | 0.8750 | 0.6985 | 0.7240 | 0.7045 |
| mcc | 0.5941 | 0.5071 | 0.8210 | 0.6407 | 0.6691 | 0.5568 |
| informedness | 0.4706 | 0.6286 | 0.7778 | 0.6256 | 0.6286 | 0.5568 |
| markedness | 0.7500 | 0.4091 | 0.8667 | 0.6753 | 0.7280 | 0.5568 |
| csi | 0.4706 | 0.4091 | 0.7778 | 0.5525 | 0.5837 | 0.5439 |

(values rounded to four decimals) Rows: 1-11 | Columns: 7

Important
For this example, a specific model is utilized, and it may not correspond exactly to the model you are working with. To see a comprehensive example specific to your class of interest, please refer to that particular class.