Machine Learning Functions
The machine learning functions contain algorithms for machine learning and data preparation. Additionally, these functions provide evaluation metrics for models. You can use these evaluation metrics to determine the accuracy of your models.
Vertica machine learning functions do not support temp tables.
Important: Before you use a machine learning function, be aware that any open transactions might be committed.
Data Preparation Functions
You can use the following functions to pre-process your data:
- APPLY_NORMALIZE - Use this function to apply normalization parameters saved in a model to specific columns.
- BALANCE - Use this function to balance your data.
- DETECT_OUTLIERS - Use this function to detect the outliers in your data so that you can exclude them.
- IMPUTE - Imputes missing values in a data set.
- NORMALIZE - Use this function to normalize your data before running one of the machine learning algorithms on it.
- NORMALIZE_FIT - Use this function to compute normalization parameters for specific columns in an input table. The normalization parameters are saved in a model.
- REVERSE_NORMALIZE - Use this function to reverse the normalization transformation.
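The fit, apply, and reverse steps described for NORMALIZE_FIT, APPLY_NORMALIZE, and REVERSE_NORMALIZE can be illustrated outside the database. The following is a minimal Python sketch of min-max normalization only. It is not Vertica's implementation, and the function names, the dictionary-based rows, and the salary column are all hypothetical.

```python
# Illustrative only: plain-Python min-max normalization, not Vertica's implementation.
# All names below (normalize_fit, apply_normalize, reverse_normalize) are hypothetical.

def normalize_fit(rows, columns):
    """Compute and return per-column (min, max) parameters for later reuse."""
    return {c: (min(r[c] for r in rows), max(r[c] for r in rows)) for c in columns}

def apply_normalize(rows, params):
    """Rescale each fitted column to the range [0, 1] using saved parameters."""
    out = []
    for r in rows:
        scaled = dict(r)
        for c, (lo, hi) in params.items():
            span = hi - lo
            # A constant column has zero span; map it to 0.0 to avoid dividing by zero.
            scaled[c] = (r[c] - lo) / span if span else 0.0
        out.append(scaled)
    return out

def reverse_normalize(rows, params):
    """Map normalized values back to the original scale."""
    out = []
    for r in rows:
        restored = dict(r)
        for c, (lo, hi) in params.items():
            restored[c] = r[c] * (hi - lo) + lo
        out.append(restored)
    return out

rows = [
    {"id": 1, "salary": 30000.0},
    {"id": 2, "salary": 50000.0},
    {"id": 3, "salary": 70000.0},
]
params = normalize_fit(rows, ["salary"])      # parameters are computed once
normalized = apply_normalize(rows, params)    # and reused on any compatible data
restored = reverse_normalize(normalized, params)
```

Keeping the parameters separate from the transformation is what allows the same scaling to be applied later to new rows, which is the role the saved model plays in the functions above.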
Evaluation Functions
You can use the following functions to evaluate your models:
- APPLY_KMEANS - Assigns each row of an input table to a cluster center from an already-existing k-means model.
- CONFUSION_MATRIX - Returns a confusion matrix based on both predicted and observed values.
- GET_MODEL_ATTRIBUTE - Extracts specific model attributes.
- ERROR_RATE - Returns a table that calculates the rate of incorrect classifications.
- LIFT_TABLE - Returns a table that compares the predictive quality of a binary classifier model.
- MSE - Returns a table that displays the mean squared error.
- ROC - Returns a table that displays the points on a receiver operating characteristic curve.
- RSQUARED - Returns a table with the R-squared value of the predictions in a linear regression model.
- SUMMARIZE_MODEL - Returns the summary information of a model.
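To make the metrics behind CONFUSION_MATRIX, ERROR_RATE, MSE, and RSQUARED concrete, the following Python sketch computes each one from lists of observed and predicted values using the standard textbook definitions. It is an illustration under those assumptions, not Vertica's implementation, and the helper names and sample data are hypothetical.

```python
# Illustrative only: textbook definitions of the metrics, not Vertica's implementation.
from collections import Counter

def confusion_matrix(observed, predicted):
    """Count how often each (observed, predicted) class pair occurs."""
    return Counter(zip(observed, predicted))

def error_rate(observed, predicted):
    """Fraction of rows whose predicted class differs from the observed class."""
    wrong = sum(1 for o, p in zip(observed, predicted) if o != p)
    return wrong / len(observed)

def mse(observed, predicted):
    """Mean of the squared differences between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def r_squared(observed, predicted):
    """1 minus the ratio of residual sum of squares to total sum of squares."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

# Hypothetical classifier output: two of the five rows are misclassified.
obs_class = [1, 0, 1, 1, 0]
pred_class = [1, 0, 0, 1, 1]
matrix = confusion_matrix(obs_class, pred_class)
rate = error_rate(obs_class, pred_class)

# Hypothetical regression output: only the last prediction is off.
obs_value = [1.0, 2.0, 3.0, 4.0]
pred_value = [1.0, 2.0, 3.0, 6.0]
mean_sq_err = mse(obs_value, pred_value)
r2 = r_squared(obs_value, pred_value)
```

The classification metrics take class labels, while MSE and R-squared take numeric values, so the two groups apply to different kinds of models.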
Prediction Functions
You can use the following functions to apply a model to a table:
- PREDICT_LINEAR_REG - Applies a linear regression model on an input table.
- PREDICT_LOGISTIC_REG - Applies a logistic regression model on an input table.
- PREDICT_NAIVE_BAYES - Applies a Naïve Bayes model on an input table.
- PREDICT_NAIVE_BAYES_CLASSES - Applies a Naïve Bayes model on an input table and returns the probabilities of classes.
- PREDICT_RF_CLASSIFIER_CLASSES - Applies a random forest model on an input table and returns the probabilities of classes.
- PREDICT_RF_CLASSIFIER - Applies a random forest model on an input table.
- PREDICT_SVM_CLASSIFIER - Applies an SVM classification model on an input table.
- PREDICT_SVM_REGRESSOR - Applies an SVM regressor model on an input table.
Supervised Learning Functions
You can use the following supervised learning functions to run predictive analytics on a data set:
- LINEAR_REG - Use this function to model the linear relationship between independent variables and some dependent variable.
- LOGISTIC_REG - Use this function to model the relationship between independent variables and some binary dependent variable.
- NAIVE_BAYES - Use this function to classify your data when features can be assumed independent.
- RF_CLASSIFIER - Use this function to create an ensemble model of decision trees.
- SVM_CLASSIFIER - Use this function to assign data to one category or the other.
- SVM_REGRESSOR - Use this function to predict continuous ordered variables.
Unsupervised Learning Functions
You can use the following unsupervised learning functions to run analytics on a data set:
- KMEANS - Use this function to cluster data points into k different groups.
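Once a k-means model exists, assigning a row to a cluster (the step described for APPLY_KMEANS) amounts to finding the nearest cluster center. The following Python sketch shows that assignment step only, assuming Euclidean distance. It is not Vertica's implementation, and the function name, centers, and points are hypothetical.

```python
# Illustrative only: nearest-center assignment, not Vertica's implementation.
import math

def assign_clusters(points, centers):
    """Return, for each point, the index of the closest center (Euclidean distance)."""
    return [
        min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        for p in points
    ]

# Hypothetical centers, as if taken from an already-trained model with k = 2.
centers = [(0.0, 0.0), (10.0, 10.0)]
points = [(1.0, 2.0), (9.0, 8.0), (4.0, 4.0)]
assignments = assign_clusters(points, centers)
```

Training a model is the harder part, since it has to choose the centers. Applying one needs only a distance comparison per row.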