verticapy.machine_learning.metrics.regression_report
- verticapy.machine_learning.metrics.regression_report(y_true: str, y_score: str, input_relation: str | vDataFrame, metrics: None | str | list[str] = None, k: int = 1, genSQL: bool = False) -> float | TableSample
Computes a regression report using multiple metrics (r2, mse, max error, …) to evaluate the model.
Parameters
- y_true: str
Response column.
- y_score: str
Prediction column.
- input_relation: SQLRelation
Relation to use for scoring. This relation can be a view, table, or a customized relation (if an alias is used at the end of the relation). For example: (SELECT … FROM …) x
- metrics: list, optional
List of the metrics used to compute the final report.
- aic:
Akaike’s Information Criterion
\[AIC = 2k - 2\ln(\hat{L})\]
- bic:
Bayesian Information Criterion
\[BIC = -2\ln(\hat{L}) + k \ln(n)\]
- max:
Max Error.
\[ME = \max_{i=1}^{n} \left| y_i - \hat{y}_i \right|\]
- mae:
Mean Absolute Error.
\[MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|\]
- median:
Median Absolute Error.
\[MedAE = \text{median}_{i=1}^{n} \left| y_i - \hat{y}_i \right|\]
- mse:
Mean Squared Error.
\[MSE = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2\]
- msle:
Mean Squared Log Error.
\[MSLE = \frac{1}{n} \sum_{i=1}^{n} (\log(1 + y_i) - \log(1 + \hat{y}_i))^2\]
- r2:
R squared coefficient.
\[R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}\]
- r2a:
Adjusted R squared coefficient.
\[\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}\]
- qe:
Quantile error. The quantile must be included in the metric name. For example, qe50.1% returns the quantile error using q = 0.501.
- rmse:
Root-mean-squared error
\[RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}\]
- var:
Explained Variance
\[\text{Explained Variance} = 1 - \frac{Var(y - \hat{y})}{Var(y)}\]
- k: int, optional
Number of predictors. Used to compute the adjusted R2 (r2a).
- genSQL: bool, optional
If set to True, returns the SQL that is used to generate the metrics.
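The error-based formulas listed under metrics can be sketched in plain Python. This is only an illustration of the math, not VerticaPy's implementation: the function name regression_metrics is hypothetical, the computation runs in memory rather than in SQL, and aic, bic, msle, and qe are left out because the likelihood, the log base, and the quantile method are not spelled out here. The sample values are the ones used in the Examples section.

```python
import math
from statistics import median


def regression_metrics(y_true, y_pred, k=1):
    """Hypothetical helper: evaluates the listed formulas in memory.

    k is the number of predictors, used only for the adjusted R2.
    """
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(r) for r in residuals]

    # Sums of squares shared by mse, rmse, r2 and var.
    ss_res = sum(r * r for r in residuals)
    y_bar = sum(y_true) / n
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)

    mse = ss_res / n
    r2 = 1 - ss_res / ss_tot

    # Var(y - y_hat) / Var(y); the 1/n factors cancel in the ratio.
    res_bar = sum(residuals) / n
    ss_res_centered = sum((r - res_bar) ** 2 for r in residuals)

    return {
        "max": max(abs_errors),
        "mae": sum(abs_errors) / n,
        "median": median(abs_errors),
        "mse": mse,
        "rmse": math.sqrt(mse),
        "r2": r2,
        "r2a": 1 - (1 - r2) * (n - 1) / (n - k - 1),
        "var": 1 - ss_res_centered / ss_tot,
    }


if __name__ == "__main__":
    report = regression_metrics(
        [1, 1.5, 3, 2, 5],
        [1.1, 1.55, 2.9, 2.01, 4.5],
        k=1,
    )
    for name, value in report.items():
        print(f"{name}: {value}")
```

Note that r2a requires n > k + 1; otherwise the denominator is zero or negative.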
Returns
- TableSample
The regression report.
Examples
We should first import verticapy.
import verticapy as vp
Let’s create a small dataset that has:
- true values
- predicted values
data = vp.vDataFrame(
    {
        "y_true": [1, 1.5, 3, 2, 5],
        "y_pred": [1.1, 1.55, 2.9, 2.01, 4.5],
    }
)
Next, we import the metric:
from verticapy.machine_learning.metrics import regression_report
Now we can conveniently compute the report:
regression_report(
    y_true = "y_true",
    y_score = "y_pred",
    input_relation = data,
)
Out[4]:
None                       value
explained_variance         0.976612
max_error                  0.5
median_absolute_error      0.1
mean_absolute_error        0.152
mean_squared_error         0.05452
root_mean_squared_error    0.23349518196314
r2                         0.97274
r2_adj                     0.963653333333333
aic                        -0.545938360769027
bic                        -11.3270625359008
Rows: 1-10 | Columns: 2
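As a sanity check, the r2 and r2_adj rows can be reproduced by hand from the five input rows, using the R squared and adjusted R squared formulas given under Parameters with n = 5 and the default k = 1:
\[\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = 0.01 + 0.0025 + 0.01 + 0.0001 + 0.25 = 0.2726\]
\[\bar{y} = 2.5, \qquad \sum_{i=1}^{n} (y_i - \bar{y})^2 = 2.25 + 1 + 0.25 + 0.25 + 6.25 = 10\]
\[R^2 = 1 - \frac{0.2726}{10} = 0.97274\]
\[\text{Adjusted } R^2 = 1 - \frac{(1 - 0.97274)(5 - 1)}{5 - 1 - 1} = 1 - \frac{0.10904}{3} \approx 0.963653\]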
Note
VerticaPy uses simple SQL queries to compute various metrics. You can use the set_option() function with the sql_on parameter to enable SQL generation and examine the generated queries.
See also
vDataFrame.score(): Computes the input ML metric.