

class verticapy.machine_learning.vertica.automl.AutoML(name: str | None = None, overwrite_model: bool = False, estimator: list | str = 'fast', estimator_type: Literal['auto', 'regressor', 'binary', 'multi'] = 'auto', metric: str = 'auto', cv: int = 3, pos_label: bool | float | str | timedelta | datetime | None = None, cutoff: float = -1, nbins: int = 100, lmax: int = 5, optimized_grid: int = 2, stepwise: bool = True, stepwise_criterion: Literal['aic', 'bic'] = 'aic', stepwise_direction: Literal['forward', 'backward'] = 'backward', stepwise_max_steps: int = 100, stepwise_x_order: Literal['pearson', 'spearman', 'random', 'none'] = 'pearson', preprocess_data: bool = True, preprocess_dict: dict = {'identify_ts': False}, print_info: bool = True)

Tests multiple models to find those that maximize the input score.


name: str, optional

Name of the model.

overwrite_model: bool, optional

If set to True, training a model with the same name as an existing model overwrites the existing model.

estimator: list / ‘native’ / ‘all’ / ‘fast’ / object

List of Vertica estimators with a fit method. Alternatively, you can specify ‘native’ for all native Vertica models, ‘all’ for all VerticaPy models, and ‘fast’ for quick modeling.

estimator_type: str, optional

Estimator type.

auto : Automatically detects the estimator type.
regressor : The estimator is used to perform a regression.
binary : The estimator is used to perform a binary classification.
multi : The estimator is used to perform a multiclass classification.

metric: str, optional

Metric used for the model evaluation.

auto : logloss for classification and RMSE for regression.

For Classification:

accuracy : Accuracy
auc : Area Under the Curve (ROC)
ba : Balanced Accuracy = (tpr + tnr) / 2
bm : Informedness = tpr + tnr - 1
csi : Critical Success Index = tp / (tp + fn + fp)
f1 : F1 Score
fdr : False Discovery Rate = 1 - ppv
fm : Fowlkes–Mallows index = sqrt(ppv * tpr)
fnr : False Negative Rate = fn / (fn + tp)
for : False Omission Rate = 1 - npv
fpr : False Positive Rate = fp / (fp + tn)
logloss : Log Loss
lr+ : Positive Likelihood Ratio = tpr / fpr
lr- : Negative Likelihood Ratio = fnr / tnr
dor : Diagnostic Odds Ratio
mcc : Matthews Correlation Coefficient
mk : Markedness = ppv + npv - 1
npv : Negative Predictive Value = tn / (tn + fn)
prc_auc : Area Under the Curve (PRC)
precision : Precision = tp / (tp + fp)
pt : Prevalence Threshold = sqrt(fpr) / (sqrt(tpr) + sqrt(fpr))
recall : Recall = tp / (tp + fn)
specificity : Specificity = tn / (tn + fp)
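
The formulas above can be checked by hand with a small pure-Python sketch. It is only an illustration of the arithmetic, not VerticaPy's in-database implementation; the function name `confusion_metrics` and the counts are hypothetical.

```python
from math import sqrt


def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute some of the listed metrics from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # true positive rate
    tnr = tn / (tn + fp)  # true negative rate
    ppv = tp / (tp + fp)  # positive predictive value
    npv = tn / (tn + fn)  # negative predictive value
    fpr = fp / (fp + tn)  # false positive rate
    fnr = fn / (fn + tp)  # false negative rate
    return {
        "ba": (tpr + tnr) / 2,
        "csi": tp / (tp + fn + fp),
        "fdr": 1 - ppv,
        "fm": sqrt(ppv * tpr),
        "fnr": fnr,
        "for": 1 - npv,
        "fpr": fpr,
        "lr+": tpr / fpr,
        "lr-": fnr / tnr,
        "mk": ppv + npv - 1,
        "npv": npv,
        "pt": sqrt(fpr) / (sqrt(tpr) + sqrt(fpr)),
    }


# Hypothetical counts chosen only for illustration.
metrics = confusion_metrics(tp=40, fn=10, fp=5, tn=45)
```

The sketch assumes every denominator is non-zero; a real implementation would have to handle empty classes.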

For Regression:

max : Max error
mae : Mean absolute error
median : Median absolute error
mse : Mean squared error
msle : Mean squared log error
r2 : R-squared coefficient
r2a : Adjusted R-squared
rmse : Root-mean-squared error
var : Explained variance
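
As a rough illustration of what several of these regression metrics measure, the following sketch computes them in plain Python using their textbook definitions. It is not how VerticaPy evaluates them in the database; `regression_metrics` and the sample values are hypothetical, and msle, r2a and var are left out for brevity.

```python
from math import sqrt
from statistics import median


def regression_metrics(y_true, y_pred):
    """Compute a few regression metrics from true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "max": max(abs_errors),
        "mae": sum(abs_errors) / n,
        "median": median(abs_errors),
        "mse": mse,
        "rmse": sqrt(mse),
        "r2": 1 - ss_res / ss_tot,
    }


# Hypothetical values chosen only for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]
scores = regression_metrics(y_true, y_pred)
```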

cv: int, optional

Number of folds.
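
To make the role of the fold count concrete, here is a minimal sketch of a contiguous k-fold split over row indices. It only illustrates the general cross-validation idea; how VerticaPy actually partitions the data is not described here, and `k_fold_indices` is a hypothetical helper.

```python
def k_fold_indices(n_rows: int, cv: int = 3):
    """Yield (train_idx, test_idx) pairs for a simple contiguous k-fold split."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_rows // cv + (1 if i < n_rows % cv else 0) for i in range(cv)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(10, cv=3))
```

Each row appears in exactly one test fold, so every candidate model is scored `cv` times on data it was not trained on.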

pos_label: PythonScalar, optional

The main class to be considered as positive (classification only).

cutoff: float, optional

The model cutoff (classification only).

nbins: int, optional

Number of bins used to compute the different parameter categories.

lmax: int, optional

Maximum length of each parameter list.

optimized_grid: int, optional

If set to zero, the randomness is based on the input parameters. If set to one, the randomness is limited to some parameters while others are picked based on a default grid. If set to two, no randomness is used and a default grid is returned.

stepwise: bool, optional

If True, the stepwise algorithm is used to determine the final model list of parameters.

stepwise_criterion: str, optional

Criterion used when performing the final estimator stepwise.

aic : Akaike’s information criterion
bic : Bayesian information criterion
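
The two criteria can be compared with a short sketch based on their textbook definitions, AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L). VerticaPy may compute them from a different but related expression; the candidate models and their log-likelihoods below are hypothetical.

```python
from math import log


def aic(log_likelihood: float, k: int) -> float:
    """Akaike's information criterion for a model with k parameters."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian information criterion for k parameters and n observations."""
    return k * log(n) - 2 * log_likelihood


# Hypothetical fits: (log-likelihood, number of parameters).
candidates = {"small": (-120.0, 3), "large": (-115.0, 6)}
n_obs = 200

# Lower is better for both criteria.
best_by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_by_bic = min(candidates, key=lambda m: bic(*candidates[m], n_obs))
```

With these numbers AIC prefers the larger model while BIC prefers the smaller one, because BIC's penalty per parameter grows with the number of observations.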

stepwise_direction: str, optional

Direction to start the stepwise search, either ‘backward’ or ‘forward’.

stepwise_max_steps: int, optional

The maximum number of steps to be considered when performing the final estimator stepwise.

stepwise_x_order: str, optional

Method for preprocessing X before using the stepwise algorithm.

pearson : X is ordered based on Pearson’s correlation coefficient.
spearman : X is ordered based on Spearman’s rank correlation coefficient.
random : Shuffles the vector X before applying the stepwise algorithm.
none : Does not change the order of the data.

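
A minimal sketch of the Pearson-based ordering might look like the following. Sorting by decreasing absolute correlation with the response is an assumption made for illustration; the text above only states that X is ordered based on the coefficient. The helper names and feature values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def order_by_pearson(features: dict, y: list) -> list:
    """Sort feature names by decreasing absolute correlation with y (assumed order)."""
    return sorted(features, key=lambda name: abs(pearson(features[name], y)), reverse=True)


# Hypothetical predictors and response.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "noisy": [2.0, 1.0, 4.0, 3.0, 5.0],
    "linear": [2.0, 4.0, 6.0, 8.0, 10.0],
    "weak": [1.0, 3.0, 2.0, 3.0, 1.0],
}
ordered = order_by_pearson(features, y)
```
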
preprocess_data: bool, optional

If True, the data will be preprocessed.

preprocess_dict: dict, optional

Dictionary to pass to the AutoDataPrep class in order to preprocess the data before the models are trained.

print_info: bool

If True, prints the model information at each step.
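
Putting the parameters together, a call might be sketched as below. The keyword arguments come from the signature documented above and `fit` is called with the `input_relation`, `X` and `y` arguments listed in the method summary; the helper `run_automl`, the model name and any relation or column names are hypothetical, and running it requires a live Vertica connection.

```python
# Keyword arguments mirror the defaults in the documented signature.
AUTOML_KWARGS = {
    "estimator": "fast",
    "estimator_type": "auto",
    "metric": "auto",
    "cv": 3,
    "stepwise": True,
    "stepwise_criterion": "aic",
    "stepwise_direction": "backward",
    "preprocess_data": True,
    "print_info": False,  # silence the per-step output
}


def run_automl(input_relation, X, y, model_name="my_schema.automl_demo"):
    """Build and train an AutoML search (needs a live Vertica connection)."""
    # Imported lazily so this module loads even where verticapy is not installed.
    from verticapy.machine_learning.vertica.automl import AutoML

    model = AutoML(name=model_name, **AUTOML_KWARGS)
    model.fit(input_relation, X=X, y=y)
    return model
```

After training, the attributes described below, such as `best_model_`, hold the outcome of the search.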


preprocess_: object

Model used to preprocess the data.

best_model_: object

Most efficient model found during the search.


Grid containing the different models information.

__init__(name: str | None = None, overwrite_model: bool = False, estimator: list | str = 'fast', estimator_type: Literal['auto', 'regressor', 'binary', 'multi'] = 'auto', metric: str = 'auto', cv: int = 3, pos_label: bool | float | str | timedelta | datetime | None = None, cutoff: float = -1, nbins: int = 100, lmax: int = 5, optimized_grid: int = 2, stepwise: bool = True, stepwise_criterion: Literal['aic', 'bic'] = 'aic', stepwise_direction: Literal['forward', 'backward'] = 'backward', stepwise_max_steps: int = 100, stepwise_x_order: Literal['pearson', 'spearman', 'random', 'none'] = 'pearson', preprocess_data: bool = True, preprocess_dict: dict = {'identify_ts': False}, print_info: bool = True) -> None

Must be overridden in the child class


__init__([name, overwrite_model, estimator, ...])

Must be overridden in the child class

contour([nbins, chart])

Draws the model's contour plot.


Returns the SQL code needed to deploy the model.

does_model_exists(name[, raise_error, ...])

Checks whether the model is stored in the Vertica database.


Drops the model from the Vertica database.

export_models(name, path[, kind])

Exports machine learning models.


Computes the model's features importance.

fit(input_relation[, X, y, return_report])

Trains the model.


Returns the model attributes.

get_match_index(x, col_list[, str_check])

Returns the matching index.


Returns the parameters of the model.

get_plotting_lib([class_name, chart, ...])

Returns the first available library (Plotly, Matplotlib, or Highcharts) to draw a specific graphic.


Returns the model attribute.

import_models(path[, schema, kind])

Imports machine learning models.

plot([mltype, chart])

Draws the AutoML plot.

register(registered_name[, raise_error])

Registers the model and adds it to the in-DB model versioning environment with a status of 'under_review'.


Sets the parameters of the model.


Summarizes the model.


Exports the model to the Vertica Binary format.


Converts the model to an InMemory object that can be used for different types of predictions.


Exports the model to PMML.

to_python([return_proba, ...])

Returns the Python function needed for in-memory scoring without using built-in Vertica functions.

to_sql([X, return_proba, ...])

Returns the SQL code needed to deploy the model without using built-in Vertica functions.


Exports the model to the Frozen Graph format (TensorFlow).
