VerticaPy

Python API for Vertica Data Science at Scale

Machine Learning

Tools

API Reference

verticapy.learn.cluster

Class Definition
BisectingKMeans Creates a BisectingKMeans object by using the Vertica Highly Distributed and Scalable BisectingKMeans on the data.
DBSCAN Creates a DBSCAN object by using the DBSCAN algorithm as defined by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu.
KMeans Creates a KMeans object by using the Vertica Highly Distributed and Scalable KMeans on the data.

verticapy.learn.decomposition

Class Definition
PCA Creates a PCA (Principal Component Analysis) object by using the Vertica Highly Distributed and Scalable PCA on the data.
SVD Creates an SVD (Singular Value Decomposition) object by using the Vertica Highly Distributed and Scalable SVD on the data.

verticapy.learn.ensemble

Class Definition
RandomForestClassifier Creates a RandomForestClassifier object by using the Vertica Highly Distributed and Scalable Random Forest on the data.
RandomForestRegressor Creates a RandomForestRegressor object by using the Vertica Highly Distributed and Scalable Random Forest on the data.
XGBoostClassifier Creates an XGBoostClassifier object by using the Vertica Highly Distributed and Scalable XGBoost on the data.
XGBoostRegressor Creates an XGBoostRegressor object by using the Vertica Highly Distributed and Scalable XGBoost on the data.

verticapy.learn.linear_model

Class Definition
ElasticNet Creates an ElasticNet object by using the Vertica Highly Distributed and Scalable Linear Regression on the data.
Lasso Creates a Lasso object by using the Vertica Highly Distributed and Scalable Linear Regression on the data.
LinearRegression Creates a LinearRegression object by using the Vertica Highly Distributed and Scalable Linear Regression on the data.
LogisticRegression Creates a LogisticRegression object by using the Vertica Highly Distributed and Scalable Logistic Regression on the data.
Ridge Creates a Ridge object by using the Vertica Highly Distributed and Scalable Linear Regression on the data.

verticapy.learn.metrics

Function Definition
accuracy_score Computes the Accuracy Score.
anova_table Computes the Anova Table.
auc Computes the ROC AUC (Area Under Curve).
classification_report Computes a classification report using multiple metrics (AUC, accuracy, PRC AUC, F1, etc.).
confusion_matrix Computes the Confusion Matrix.
critical_success_index Computes the Critical Success Index.
explained_variance Computes the Explained Variance.
f1_score Computes the F1 Score.
informedness Computes the Informedness.
log_loss Computes the Log Loss.
markedness Computes the Markedness.
matthews_corrcoef Computes the Matthews Correlation Coefficient.
max_error Computes the Max Error.
mean_absolute_error Computes the Mean Absolute Error.
mean_squared_error Computes the Mean Squared Error.
mean_squared_log_error Computes the Mean Squared Log Error.
median_absolute_error Computes the Median Absolute Error.
multilabel_confusion_matrix Computes the Multi Label Confusion Matrix.
negative_predictive_score Computes the Negative Predictive Score.
prc_auc Computes the PRC AUC (Area Under Curve).
precision_score Computes the Precision Score.
recall_score Computes the Recall Score.
r2_score Computes the R2 Score.
regression_report Computes a regression report using multiple metrics (R2, MSE, max error, etc.).
specificity_score Computes the Specificity Score.
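
Several of the classification metrics listed above are simple functions of the four confusion-matrix counts. The following is a minimal sketch in plain Python using the standard textbook definitions; it does not call VerticaPy, the helper name binary_metrics is hypothetical, and it assumes that negative_predictive_score corresponds to the usual negative predictive value.

```python
from math import sqrt

def binary_metrics(tp, fp, fn, tn):
    """Textbook binary classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration only (not part of VerticaPy).
    Assumes every denominator is non-zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # also called sensitivity
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)           # negative predictive value
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "negative_predictive_value": npv,
        "f1": 2 * precision * recall / (precision + recall),
        "critical_success_index": tp / (tp + fp + fn),
        "informedness": recall + specificity - 1,
        "markedness": precision + npv - 1,
        "matthews_corrcoef": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Example counts chosen for illustration.
scores = binary_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

For a binary problem with a positive determinant (tp * tn > fp * fn), the Matthews correlation coefficient equals the geometric mean of informedness and markedness, which is one way to sanity-check the three values against each other.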

verticapy.learn.model_selection

Function Definition
autoML Tests multiple models to find those that maximize the input score.
best_k Finds the k-means k (number of clusters) based on a score.
cross_validate Computes the k-fold cross-validation of an estimator.
elbow Draws an elbow curve.
gen_params_grid Generates the estimator grid.
grid_search_cv Computes the k-fold grid search of an estimator.
learning_curve Draws the learning curve.
lift_chart Draws a lift chart.
parameter_grid Generates the list of grid combinations with different input parameters.
plot_acf_pacf Draws ACF and PACF Charts.
prc_curve Draws a precision-recall curve.
randomized_search_cv Computes the k-fold randomized search of an estimator.
roc_curve Draws a receiver operating characteristic (ROC) curve.
validation_curve Draws the validation curve.
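
The idea behind generating grid combinations, as used by a grid search, can be sketched with the standard library. This is an illustration of the general technique only: the function name all_combinations and the parameter names in the example are hypothetical, and the sketch makes no claim about the signature or output format of parameter_grid.

```python
from itertools import product

def all_combinations(param_grid):
    """Expand {name: [candidate values]} into a list of parameter dicts.

    Hypothetical helper for illustration only (not part of VerticaPy).
    """
    keys = list(param_grid)
    value_lists = [param_grid[key] for key in keys]
    # product() yields one tuple per point of the Cartesian product.
    return [dict(zip(keys, values)) for values in product(*value_lists)]

# Example parameter names chosen for illustration.
grid = all_combinations({"max_depth": [3, 5], "n_estimators": [10, 50, 100]})
for params in grid:
    print(params)
```

A grid search would then fit and score one model per dictionary, typically with k-fold cross-validation, and keep the best-scoring combination.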

verticapy.learn.naive_bayes

Class Definition
BernoulliNB Equivalent to NaiveBayes with the parameter nbtype = 'bernoulli'.
CategoricalNB Equivalent to NaiveBayes with the parameter nbtype = 'categorical'.
GaussianNB Equivalent to NaiveBayes with the parameter nbtype = 'gaussian'.
MultinomialNB Equivalent to NaiveBayes with the parameter nbtype = 'multinomial'.
NaiveBayes Creates a NaiveBayes object by using the Vertica Highly Distributed and Scalable Naive Bayes on the data.

verticapy.learn.neighbors

Class Definition
KernelDensity Creates a KernelDensity object.
KNeighborsClassifier Creates a KNeighborsClassifier object by using the k-nearest neighbors algorithm.
KNeighborsRegressor Creates a KNeighborsRegressor object by using the k-nearest neighbors algorithm.
LocalOutlierFactor Creates a LocalOutlierFactor object by using the Local Outlier Factor algorithm as defined by Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng and Jörg Sander.
NearestCentroid Creates a NearestCentroid object by using the nearest centroid algorithm.

verticapy.learn.pipeline

Class Definition
Pipeline Creates a Pipeline object, sequentially applying a list of transformations and a final estimator. The intermediate steps must implement a transform method.
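
The contract described above (intermediate steps expose a transform method, and the last step is an estimator) can be illustrated with a toy implementation. This is a conceptual sketch in plain Python, not VerticaPy's Pipeline class: ToyPipeline, Center, and SlopeThroughOrigin are hypothetical names, and the fit/transform/predict method names are an assumption borrowed from common estimator conventions.

```python
class Center:
    """Toy transformer: subtracts the mean learned during fit."""
    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X):
        return [x - self.mean_ for x in X]


class SlopeThroughOrigin:
    """Toy final estimator: least-squares line with no intercept."""
    def fit(self, X, y):
        self.slope_ = sum(a * b for a, b in zip(X, y)) / sum(a * a for a in X)
        return self

    def predict(self, X):
        return [self.slope_ * x for x in X]


class ToyPipeline:
    """Hypothetical pipeline: chains transformers, then a final estimator."""
    def __init__(self, steps):
        self.steps = steps  # list of (name, object) pairs

    def fit(self, X, y):
        # Every step except the last must implement transform().
        for _, step in self.steps[:-1]:
            X = step.fit(X, y).transform(X)
        self.steps[-1][1].fit(X, y)
        return self

    def predict(self, X):
        for _, step in self.steps[:-1]:
            X = step.transform(X)
        return self.steps[-1][1].predict(X)


pipe = ToyPipeline([("center", Center()), ("model", SlopeThroughOrigin())])
pipe.fit([1, 2, 3], [-2, 0, 2])
prediction = pipe.predict([4])
print(prediction)
```

The point of the design is that new data passes through the same fitted transformations, in the same order, before reaching the final estimator.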

verticapy.learn.preprocessing

Class / Function Definition
Balance Creates a view with an equal distribution of the input data based on the response_column.
CountVectorizer Creates a Text Index that counts the occurrences of each word in the data.
MinMaxScaler Equivalent to Normalizer with the parameter method = 'minmax'.
Normalizer Creates a Vertica Normalizer object.
OneHotEncoder Creates a Vertica OneHotEncoder object.
RobustScaler Equivalent to Normalizer with the parameter method = 'robust_zscore'.
StandardScaler Equivalent to Normalizer with the parameter method = 'zscore'.
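
The three normalization methods named above can be sketched on an in-memory list to show what each one computes. The formulas are the commonly used definitions and are stated here as assumptions rather than as Vertica's exact implementation: the z-score uses the sample standard deviation, and the robust z-score uses the median and the median absolute deviation (MAD) scaled by 1.4826. The function names are hypothetical.

```python
from statistics import mean, median, stdev

def minmax(xs):
    """Rescale to [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def zscore(xs):
    """Standardize: (x - mean) / standard deviation (sample, assumed)."""
    mu, sd = mean(xs), stdev(xs)
    return [(x - mu) / sd for x in xs]

def robust_zscore(xs):
    """Outlier-resistant variant: (x - median) / (1.4826 * MAD)."""
    med = median(xs)
    mad = median([abs(x - med) for x in xs])
    return [(x - med) / (1.4826 * mad) for x in xs]

# Example data with one extreme value, chosen for illustration.
values = [1, 2, 3, 4, 100]
scaled = {
    "minmax": minmax(values),
    "zscore": zscore(values),
    "robust_zscore": robust_zscore(values),
}
for method, result in scaled.items():
    print(method, [round(v, 3) for v in result])
```

With an extreme value in the data, min-max scaling squeezes the remaining points close to 0, while the robust z-score keeps them evenly spread around the median, which is the usual reason to prefer it for data with outliers.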

verticapy.learn.svm

Class Definition
LinearSVC Creates a LinearSVC object by using the Vertica Highly Distributed and Scalable SVM on the data.
LinearSVR Creates a LinearSVR object by using the Vertica Highly Distributed and Scalable SVM on the data.

verticapy.learn.tree

Class Definition
DecisionTreeClassifier Single Decision Tree Classifier.
DecisionTreeRegressor Single Decision Tree Regressor.
DummyTreeClassifier This classifier memorizes the training data.
DummyTreeRegressor This regressor memorizes the training data.

verticapy.learn.tsa

Class Definition
SARIMAX Creates a SARIMAX object by using the Vertica Highly Distributed and Scalable Linear Regression on the data.
VAR Creates a VAR object by using the Vertica Highly Distributed and Scalable Linear Regression on the data.

verticapy.learn.utilities

Function Definition
check_model Checks if the model already exists.
load_model Loads a Vertica model and returns the associated object.