
verticapy.machine_learning.vertica.tree.DummyTreeClassifier.score#

DummyTreeClassifier.score(metric: Literal['aic', 'bic', 'accuracy', 'acc', 'balanced_accuracy', 'ba', 'auc', 'roc_auc', 'prc_auc', 'best_cutoff', 'best_threshold', 'false_discovery_rate', 'fdr', 'false_omission_rate', 'for', 'false_negative_rate', 'fnr', 'false_positive_rate', 'fpr', 'recall', 'tpr', 'precision', 'ppv', 'specificity', 'tnr', 'negative_predictive_value', 'npv', 'negative_likelihood_ratio', 'lr-', 'positive_likelihood_ratio', 'lr+', 'diagnostic_odds_ratio', 'dor', 'log_loss', 'logloss', 'f1', 'f1_score', 'mcc', 'bm', 'informedness', 'mk', 'markedness', 'ts', 'csi', 'critical_success_index', 'fowlkes_mallows_index', 'fm', 'prevalence_threshold', 'pm', 'confusion_matrix', 'classification_report'] = 'accuracy', average: Literal[None, 'binary', 'micro', 'macro', 'scores', 'weighted'] | None = None, pos_label: bool | float | str | timedelta | datetime | None = None, cutoff: int | float | Decimal = 0.5, nbins: int = 10000) float | list[float]#

Computes the model score.

Parameters#

metric: str, optional

The metric used to compute the score.

  • accuracy:

    Accuracy.

    \[Accuracy = \frac{TP + TN}{TP + TN + FP + FN}\]
  • aic:

    Akaike’s Information Criterion

    \[AIC = 2k - 2\ln(\hat{L})\]
  • auc:

    Area Under the Curve (ROC).

    \[AUC = \int_{0}^{1} TPR(FPR) \, dFPR\]
  • ba:

    Balanced Accuracy.

    \[BA = \frac{TPR + TNR}{2}\]
  • best_cutoff:

    The cutoff that optimizes the ROC curve prediction.

  • bic:

    Bayesian Information Criterion

    \[BIC = -2\ln(\hat{L}) + k \ln(n)\]
  • bm:

    Informedness

    \[BM = TPR + TNR - 1\]
  • csi:

    Critical Success Index

    \[index = \frac{TP}{TP + FN + FP}\]
  • f1:

    F1 Score

    \[F_1 Score = 2 \times \frac{Precision \times Recall}{Precision + Recall}\]
  • fdr:

    False Discovery Rate

    \[FDR = 1 - PPV\]
  • fm:

    Fowlkes-Mallows index

    \[FM = \sqrt{PPV \times TPR}\]
  • fnr:

    False Negative Rate

    \[FNR = \frac{FN}{FN + TP}\]
  • for:

    False Omission Rate

    \[FOR = 1 - NPV\]
  • fpr:

    False Positive Rate

    \[FPR = \frac{FP}{FP + TN}\]
  • logloss:

    Log Loss.

    \[Loss = -\frac{1}{N} \sum_{i=1}^{N} \left( y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right)\]
  • lr+:

    Positive Likelihood Ratio.

    \[LR+ = \frac{TPR}{FPR}\]
  • lr-:

    Negative Likelihood Ratio.

    \[LR- = \frac{FNR}{TNR}\]
  • dor:

    Diagnostic Odds Ratio.

    \[DOR = \frac{TP \times TN}{FP \times FN}\]
  • mcc:

    Matthews Correlation Coefficient.

    \[MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}\]
  • mk:

    Markedness

    \[MK = PPV + NPV - 1\]
  • npv:

    Negative Predictive Value

    \[NPV = \frac{TN}{TN + FN}\]
  • prc_auc:

    Area Under the Curve (PRC)

    \[AUC = \int_{0}^{1} Precision(Recall) \, dRecall\]
  • precision:

    Precision

    \[Precision = \frac{TP}{TP + FP}\]
  • pt:

    Prevalence Threshold.

    \[threshold = \frac{\sqrt{FPR}}{\sqrt{TPR} + \sqrt{FPR}}\]
  • recall:

    Recall.

    \[Recall = \frac{TP}{TP + FN}\]
  • specificity:

    Specificity.

    \[Specificity = \frac{TN}{TN + FP}\]
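
As a quick sketch (assuming a fitted VerticaPy classifier named clf; this is not output recorded on this page), any of the identifiers above can be passed directly:

clf.score(metric="mcc")      # Matthews Correlation Coefficient
clf.score(metric="logloss")  # Log Loss
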
average: str, optional

The method used to compute the final score for multiclass classification.

  • binary:

    considers one of the classes as positive and uses the binary confusion matrix to compute the score.

  • micro:

    computes the score globally by counting the positive and negative values across all classes.

  • macro:

    average of the score of each class.

  • scores:

    scores for all the classes.

  • weighted:

    weighted average of the score of each class.

If empty, the result will depend on the input metric. Whenever it is possible, the exact score is computed. Otherwise, the behaviour is similar to the ‘scores’ option.

pos_label: PythonScalar, optional

Label to consider as positive. All the other classes will be merged and considered as negative for multiclass classification.

cutoff: PythonNumber, optional

Cutoff for which the tested category is accepted as a prediction.

nbins: int, optional

[Only used when metric is set to auc, prc_auc, or best_cutoff] An integer value that determines the number of decision boundaries. Decision boundaries are set at equally spaced intervals between 0 and 1, inclusive. Greater values for nbins give more precise estimations of the AUC, but can potentially decrease performance. The maximum value is 999,999. If negative, the maximum value is used.
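
For instance, cutoff and nbins can be tuned as in the sketch below (assuming a fitted VerticaPy classifier named clf; the positive label is a placeholder and these outputs are not recorded on this page):

clf.score(metric="auc", pos_label="positive_class", nbins=1000)  # coarser decision boundaries than the default 10000
clf.score(metric="accuracy", cutoff=0.7)                         # stricter acceptance cutoff than the default 0.5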

Returns#

float | list[float]

The computed score, or a list of per-class scores (e.g. when average is set to 'scores').

Examples#

For this example, we will use the Iris dataset.

import verticapy.datasets as vpd

data = vpd.load_iris()

train, test = data.train_test_split(test_size = 0.2)
[Output: interactive vDataFrame preview of the Iris dataset, with columns SepalLengthCm, SepalWidthCm, PetalLengthCm, PetalWidthCm (Numeric(7)) and Species (Varchar(30)). Rows: 1-100 | Columns: 5]

Let’s import the model:

from verticapy.machine_learning.vertica import NearestCentroid

Then we can create the model:

model = NearestCentroid(p = 2)

We can now fit the model:

model.fit(
    train,
    [
        "SepalLengthCm",
        "SepalWidthCm",
        "PetalLengthCm",
        "PetalWidthCm",
    ],
    "Species",
    test,
)

We can get the score:

model.score()
Out[106]: 0.5714285714285714

To get the score of a particular class:

model.score(pos_label="Iris-setosa")
Out[107]: 0.5510204081632653
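
For multiclass problems, the average parameter controls how the per-class scores are combined. As a hedged sketch (these calls follow the signature above, but their outputs are not recorded here):

model.score(metric="f1", average="macro")     # unweighted mean of the per-class F1 scores
model.score(metric="f1", average="weighted")  # weighted average of the per-class F1 scores
model.score(metric="f1", average="scores")    # list with one F1 score per class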

Important

For this example, a specific model is utilized, and it may not correspond exactly to the model you are working with. To see a comprehensive example specific to your class of interest, please refer to that particular class.