verticapy.machine_learning.metrics.confusion_matrix#
- verticapy.machine_learning.metrics.confusion_matrix(y_true: str, y_score: str, input_relation: str | vDataFrame, labels: list | ndarray | None = None, pos_label: bool | float | str | timedelta | datetime | None = None) -> ndarray
Computes the confusion matrix.
Parameters#
- y_true: str
Response column.
- y_score: str
Prediction.
- input_relation: SQLRelation
Relation used for scoring. This relation can be a view, table, or a customized relation (if an alias is used at the end of the relation). For example: (SELECT … FROM …) x
- average: str, optional
The method used to compute the final score for multiclass-classification.
- binary:
considers one of the classes as positive and uses the binary confusion matrix to compute the score.
- micro:
positive and negative values globally.
- macro:
average of the score of each class.
- score:
scores for all the classes.
- weighted :
weighted average of the score of each class.
- None:
accuracy.
- labels: ArrayLike, optional
List of the response column categories.
- pos_label: PythonScalar, optional
Label used to identify the positive class. If pos_label is None, the global accuracy is computed.
Returns#
- Array
confusion matrix.
Examples#
We should first import verticapy.
import verticapy as vp
Binary Classification#
Let’s create a small dataset that has:
- true value
- predicted value
data = vp.vDataFrame(
    {
        "y_true": [1, 1, 0, 0, 1],
        "y_pred": [1, 1, 1, 0, 1],
    },
)
Next, we import the metric:
from verticapy.machine_learning.metrics import confusion_matrix
Now we can conveniently compute the confusion matrix:
confusion_matrix(
    y_true = "y_true",
    y_score = "y_pred",
    input_relation = data,
)
Out[4]: array([[1, 1], [0, 3]])
It is also possible to directly compute the score from the vDataFrame:
data.score(
    y_true = "y_true",
    y_score = "y_pred",
    metric = "confusion_matrix",
)
Out[5]: array([[1, 1], [0, 3]])
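To make the layout of the returned array explicit, here is a minimal pure-Python sketch that rebuilds the same 2x2 matrix from the example lists. It does not call VerticaPy. The helper name `confusion_counts` is hypothetical, and the convention it uses (rows follow the true labels, columns follow the predicted labels) is inferred from the output shown above rather than taken from the library's source.

```python
def confusion_counts(y_true, y_pred, labels):
    """Count (true, predicted) pairs.

    Hypothetical helper for illustration only, not part of VerticaPy.
    Rows are indexed by the true label, columns by the predicted label.
    """
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Same values as the binary example dataset.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 1, 1, 0, 1]

binary_matrix = confusion_counts(y_true, y_pred, labels=[0, 1])

# With labels ordered [0, 1] and 1 as the positive class, the cells
# unpack as true negatives, false positives, false negatives, true positives.
(tn, fp), (fn, tp) = binary_matrix
print(binary_matrix)
```

Under this reading, the single off-diagonal count in the first row is the one observation whose true value is 0 but whose prediction is 1.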
Note
VerticaPy uses simple SQL queries to compute various metrics. You can use the set_option() function with the sql_on parameter to enable SQL generation and examine the generated queries.
Multi-class Classification#
Let’s create a small dataset that has:
- true value with more than two classes
- predicted value
data = vp.vDataFrame(
    {
        "y_true": [1, 2, 0, 0, 1],
        "y_pred": [1, 2, 0, 1, 1],
    },
)
Next, we import the metric:
from verticapy.machine_learning.metrics import confusion_matrix
Now we can conveniently compute the confusion matrix:
confusion_matrix(
    y_true = "y_true",
    y_score = "y_pred",
    labels = [0,1,2],
    input_relation = data,
)
Out[8]: array([[1, 1, 0], [0, 2, 0], [0, 0, 1]])
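As a rough illustration of how a multi-class confusion matrix can be read, the sketch below derives per-class precision and recall from the 3x3 array shown above. It is plain Python and does not call VerticaPy. The helper `per_class_precision_recall` is a hypothetical name, and the sketch assumes rows correspond to true labels and columns to predicted labels, in the order given by `labels`.

```python
def per_class_precision_recall(matrix):
    """Return a (precision, recall) pair for each class.

    Hypothetical helper for illustration only, not part of VerticaPy.
    Assumes matrix[i][j] counts rows with true class i and predicted class j.
    """
    n = len(matrix)
    results = []
    for k in range(n):
        tp = matrix[k][k]
        actual = sum(matrix[k])                           # row sum: true class k
        predicted = sum(matrix[i][k] for i in range(n))   # column sum: predicted class k
        recall = tp / actual if actual else 0.0
        precision = tp / predicted if predicted else 0.0
        results.append((precision, recall))
    return results


# Matrix copied from the multi-class example output, labels = [0, 1, 2].
multiclass_matrix = [[1, 1, 0], [0, 2, 0], [0, 0, 1]]
scores = per_class_precision_recall(multiclass_matrix)

for label, (precision, recall) in zip([0, 1, 2], scores):
    print(label, round(precision, 3), round(recall, 3))
```

Here class 0 has a recall of 0.5 because one of its two observations was predicted as class 1, which in turn lowers the precision of class 1 to 2/3.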
See also
vDataFrame.score(): Computes the input ML metric.