Python API for Vertica Data Science at Scale

Machine Learning


API Reference


Class Definition
BisectingKMeans Creates a BisectingKMeans object using the Vertica BisectingKMeans algorithm.
DBSCAN Creates a DBSCAN object using the DBSCAN algorithm as defined by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu.
KMeans Creates a KMeans object using the Vertica k-means algorithm.


Class Definition
PCA Creates a PCA (Principal Component Analysis) object using the Vertica PCA algorithm.
SVD Creates an SVD (Singular Value Decomposition) object using the Vertica SVD algorithm.


Class Definition
IsolationForest Creates an IsolationForest object using the Vertica IFOREST algorithm.
RandomForestClassifier Creates a RandomForestClassifier object using the Vertica random forest algorithm.
RandomForestRegressor Creates a RandomForestRegressor object using the Vertica random forest algorithm.
XGBoostClassifier Creates an XGBoostClassifier object using the Vertica XGB_CLASSIFIER algorithm.
XGBoostRegressor Creates an XGBoostRegressor object using the Vertica XGB_REGRESSOR algorithm.


Class Definition
ElasticNet Creates an ElasticNet object using the Vertica linear regression algorithm.
Lasso Creates a Lasso object using the Vertica linear regression algorithm.
LinearRegression Creates a LinearRegression object using the Vertica linear regression algorithm.
LogisticRegression Creates a LogisticRegression object using the Vertica logistic regression algorithm.
Ridge Creates a Ridge object using the Vertica linear regression algorithm.


Class Definition
memModel Creates platform-independent machine learning models that you can export as SQL or Python code for deployment in other environments.


Function Definition
accuracy_score Computes the Accuracy Score.
anova_table Computes the ANOVA table.
auc Computes the ROC AUC (Area Under Curve).
classification_report / report Computes a classification report using multiple metrics (AUC, accuracy, PRC AUC, F1...).
confusion_matrix Computes the Confusion Matrix.
critical_success_index Computes the Critical Success Index.
explained_variance Computes the Explained Variance.
f1_score Computes the F1 Score.
informedness Computes the Informedness.
log_loss Computes the Log Loss.
markedness Computes the Markedness.
matthews_corrcoef Computes the Matthews Correlation Coefficient.
max_error Computes the Max Error.
mean_absolute_error Computes the Mean Absolute Error.
mean_squared_error Computes the Mean Squared Error.
mean_squared_log_error Computes the Mean Squared Log Error.
median_absolute_error Computes the Median Absolute Error.
multilabel_confusion_matrix Computes the Multi Label Confusion Matrix.
negative_predictive_score Computes the Negative Predictive Score.
prc_auc Computes the PRC AUC (Area Under Curve).
precision_score Computes the Precision Score.
recall_score Computes the Recall Score.
r2_score Computes the R2 Score.
regression_report Computes a regression report using multiple metrics (r2, mse, max error...).
specificity_score Computes the Specificity Score.
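
Most of the binary classification metrics listed above can be derived from the four cells of a confusion matrix. The following is a minimal pure-Python sketch of those textbook formulas, not the VerticaPy implementation: the helper names `binary_confusion` and `binary_metrics` are hypothetical, and it assumes that the negative predictive score corresponds to the usual negative predictive value, TN / (TN + FN).

```python
from math import sqrt


def binary_confusion(y_true, y_pred, positive=1):
    """Count (tp, fp, fn, tn) for a binary problem (hypothetical helper)."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def binary_metrics(tp, fp, fn, tn):
    """Textbook metric formulas computed from confusion-matrix counts."""

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    # Assumption: "negative predictive score" means TN / (TN + FN).
    npv = ratio(tn, tn + fn)
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "negative_predictive_value": npv,
        "f1": ratio(2 * precision * recall, precision + recall),
        "critical_success_index": ratio(tp, tp + fp + fn),
        "informedness": recall + specificity - 1,
        "markedness": precision + npv - 1,
        "matthews_corrcoef": ratio(tp * tn - fp * fn, mcc_den),
    }


y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
counts = binary_confusion(y_true, y_pred)
metrics = binary_metrics(*counts)
```

Working from the counts rather than the raw labels keeps every metric consistent with the same confusion matrix, which is handy when comparing this kind of hand computation against the values a library reports.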


Function Definition
autoML Tests multiple models to find the ones that maximize the input score.
bayesian_search_cv Computes the k-fold Bayesian search of an estimator, using a random forest model to estimate a probable optimal set of parameters.
best_k Finds the k-means k based on a score.
cross_validate Computes the k-fold cross-validation of an estimator.
elbow Draws an elbow curve.
enet_search_cv Computes the k-fold grid search using multiple ElasticNet models.
gen_params_grid Generates the estimator grid.
grid_search_cv Computes the k-fold grid search of an estimator.
learning_curve Draws the learning curve.
lift_chart Draws a lift chart.
parameter_grid Generates the list of the different input parameters grid combinations.
plot_acf_pacf Draws ACF and PACF Charts.
prc_curve Draws a precision-recall curve.
randomized_features_search_cv Computes the k-fold grid search of an estimator using different features combinations.
randomized_search_cv Computes the k-fold randomized search of an estimator.
roc_curve Draws a receiver operating characteristic (ROC) curve.
stepwise Uses the stepwise algorithm to find the most suitable number of features when fitting the estimator.
validation_curve Draws the validation curve.
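
Two ideas recur in the functions above: expanding a parameter grid into every combination, and splitting rows into k folds for cross-validation. The sketch below illustrates both in plain Python. It is only a conceptual illustration under stated assumptions: `parameter_combinations` and `k_fold_indices` are hypothetical names, the folds are contiguous and unshuffled, and none of this reflects how VerticaPy performs these steps in-database.

```python
from itertools import product


def parameter_combinations(param_grid):
    """Expand {name: [values]} into a list of {name: value} dicts."""
    names = sorted(param_grid)
    value_lists = [param_grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


def k_fold_indices(n_rows, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Assumption: rows are not shuffled; the first n_rows % k folds
    receive one extra row so that every row is tested exactly once.
    """
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_rows))
        yield train, test
        start = stop


grid = {"max_depth": [3, 5], "n_estimators": [10, 20, 30]}
combos = parameter_combinations(grid)
folds = list(k_fold_indices(10, 3))
```

A k-fold grid search then amounts to scoring each entry of `combos` on each of the `folds` and averaging the scores per combination; a randomized search samples from `combos` instead of visiting all of them.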


Class Definition
BernoulliNB i.e. NaiveBayes with param nbtype = 'bernoulli'.
CategoricalNB i.e. NaiveBayes with param nbtype = 'categorical'.
GaussianNB i.e. NaiveBayes with param nbtype = 'gaussian'.
MultinomialNB i.e. NaiveBayes with param nbtype = 'multinomial'.
NaiveBayes Creates a NaiveBayes object using the Vertica Naive Bayes algorithm.


Class Definition
KernelDensity Creates a KernelDensity object.
KNeighborsClassifier Creates a KNeighborsClassifier object using the k-nearest neighbors algorithm.
KNeighborsRegressor Creates a KNeighborsRegressor object using the k-nearest neighbors algorithm.
LocalOutlierFactor Creates a LocalOutlierFactor object using the Local Outlier Factor algorithm as defined by Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng, and Jörg Sander.
NearestCentroid Creates a NearestCentroid object using the k-nearest centroid algorithm.


Class Definition
Pipeline Creates a Pipeline object, sequentially applying a list of transformations and a final estimator. The intermediate steps must implement a transform method.
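
The contract described above (intermediate steps transform the data, the last step is an estimator) can be sketched in a few lines of plain Python. Everything here is hypothetical and for illustration only: `MiniPipeline`, `MaxScaler`, and `MidpointClassifier` are toy classes operating on lists of numbers, not VerticaPy objects, and the method signatures are assumptions rather than the library's API.

```python
class MaxScaler:
    """Toy transformer: divides each value by the largest absolute value seen."""

    def fit(self, X):
        self.scale_ = max(abs(x) for x in X)
        return self

    def transform(self, X):
        return [x / self.scale_ for x in X]


class MidpointClassifier:
    """Toy estimator: thresholds at the midpoint between the two class means.

    Assumption: labels are 0 and 1, and class 1 has the larger mean.
    """

    def fit(self, X, y):
        zeros = [x for x, label in zip(X, y) if label == 0]
        ones = [x for x, label in zip(X, y) if label == 1]
        self.threshold_ = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        return self

    def predict(self, X):
        return [1 if x > self.threshold_ else 0 for x in X]


class MiniPipeline:
    """Applies each intermediate step's transform, then the final estimator."""

    def __init__(self, steps):
        if not steps:
            raise ValueError("a pipeline needs at least one step")
        for name, step in steps[:-1]:
            # Intermediate steps must be able to transform the data.
            if not hasattr(step, "transform"):
                raise TypeError(f"step {name!r} does not implement transform")
        self.steps = steps

    def fit(self, X, y):
        for _, step in self.steps[:-1]:
            X = step.fit(X).transform(X)
        self.steps[-1][1].fit(X, y)
        return self

    def predict(self, X):
        for _, step in self.steps[:-1]:
            X = step.transform(X)
        return self.steps[-1][1].predict(X)


pipe = MiniPipeline([("scale", MaxScaler()), ("clf", MidpointClassifier())])
pipe.fit([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1])
predictions = pipe.predict([2, 9])
```

Checking for `transform` at construction time, rather than at fit time, surfaces a misconfigured step before any data is processed.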


Class / Function Definition
Balance Creates a view with an equal distribution of the input data based on the response_column.
CountVectorizer Creates a text index that counts the occurrences of each word in the data.
MinMaxScaler i.e. Normalizer with param method = 'minmax'.
Normalizer Creates a Vertica Normalizer object.
OneHotEncoder Creates a Vertica OneHotEncoder object.
RobustScaler i.e. Normalizer with param method = 'robust_zscore'.
StandardScaler i.e. Normalizer with param method = 'zscore'.
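
The three Normalizer methods named above differ only in the location and scale statistics they use. The plain-Python sketch below shows the usual definitions as an illustration, not as the Vertica implementation. It assumes that 'zscore' uses the sample standard deviation and that 'robust_zscore' divides by 1.4826 times the median absolute deviation; the function names are hypothetical.

```python
from statistics import mean, median, stdev


def minmax(values):
    """Rescale to [0, 1] using the observed minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def zscore(values):
    """Center on the mean and divide by the sample standard deviation.

    Assumption: sample (n - 1) standard deviation, as in statistics.stdev.
    """
    mu, sigma = mean(values), stdev(values)
    return [(v - mu) / sigma for v in values]


def robust_zscore(values):
    """Center on the median and divide by 1.4826 * MAD.

    Assumption: MAD is the median absolute deviation from the median;
    1.4826 makes it comparable to a standard deviation for normal data.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return [(v - med) / (1.4826 * mad) for v in values]


# None of these helpers guards against constant columns (zero scale).
values = [1.0, 2.0, 3.0, 4.0, 100.0]
scaled_minmax = minmax(values)
scaled_z = zscore(values)
scaled_robust = robust_zscore(values)
```

With the outlier 100.0 in the sample, the min-max and z-score outputs are dominated by that single value, while the robust z-score keeps the first four points on a comparable scale, which is the usual reason for choosing it.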


Class Definition
LinearSVC Creates a LinearSVC object using the Vertica SVM algorithm.
LinearSVR Creates a LinearSVR object using the Vertica SVM algorithm.


Class Definition
DecisionTreeClassifier Single Decision Tree Classifier.
DecisionTreeRegressor Single Decision Tree Regressor.
DummyTreeClassifier A classifier that overfits the training data.
DummyTreeRegressor A regressor that overfits the training data.


Class Definition
SARIMAX Creates a SARIMAX object using the Vertica linear regression algorithm.
VAR Creates a VAR object using the Vertica linear regression algorithm.


Function Definition
check_model Checks if the model already exists.
load_model Loads a Vertica model and returns the associated object.