LINEAR_REG
Executes linear regression on an input relation, and returns a linear regression model.
You can export the resulting linear regression model in VERTICA_MODELS or PMML format to apply it on data outside Vertica. You can also train a linear regression model elsewhere, then import it to Vertica in PMML format to predict on data in Vertica.
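As a rough sketch of the export/import round trip described above, the statements below use Vertica's EXPORT_MODELS and IMPORT_MODELS functions. The directory /home/dbadmin/models and the model name myLinearRegModel are placeholders assumed for illustration; the exact layout of an exported PMML model on disk may differ in your environment.

```sql
-- Export a trained model in PMML format.
-- The output directory is a placeholder, not a required location.
SELECT EXPORT_MODELS('/home/dbadmin/models', 'myLinearRegModel'
                     USING PARAMETERS category='PMML');

-- Import a PMML model that was trained outside Vertica.
-- The source path is a placeholder for wherever the model was saved.
SELECT IMPORT_MODELS('/home/dbadmin/models/myLinearRegModel'
                     USING PARAMETERS category='PMML');
```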
Syntax
LINEAR_REG ( 'model‑name', 'input‑relation', 'response‑column', 'predictor‑columns'
        [ USING PARAMETERS
              [exclude_columns = 'excluded‑columns']
              [, optimizer = 'optimizer‑method']
              [, regularization = 'regularization‑method']
              [, epsilon = epsilon‑value]
              [, max_iterations = iterations]
              [, lambda = lambda‑value]
              [, alpha = alpha‑value] ] )
Arguments
model‑name |
Identifies the model to create, where model‑name conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema. |
input‑relation |
The table or view that contains the training data for building the model. If the input relation is defined in Hive, use SYNC_WITH_HCATALOG_SCHEMA to sync the hcatalog schema, and then run the machine learning function. |
response‑column |
Name of the input column that represents the dependent variable or outcome. All values in this column must be numeric; otherwise the model is invalid. |
predictor‑columns |
Comma-separated list of columns in the input relation that represent independent variables for the model, or asterisk (*) to select all columns. If you select all columns, the argument list for parameter exclude_columns must include the name of the response column, and any columns that are invalid as predictor columns. All predictor columns must be of type numeric or BOOLEAN; otherwise the model is invalid. All BOOLEAN predictor values are converted to FLOAT values before training: 0 for false, 1 for true. No type checking occurs during prediction, so you can use a BOOLEAN predictor column in training, and during prediction provide a FLOAT column of the same name. In this case, all FLOAT values must be either 0 or 1. |
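When the asterisk selects every column, the response column and any non-predictor columns have to be kept out of the predictor set through exclude_columns. The call below is a minimal sketch of that pattern; the model name is hypothetical, and the id column is an assumption about the layout of the faithful table.

```sql
-- Use all columns of faithful as predictors, except the response column
-- and a key column that is not a valid predictor.
-- The id column is assumed here; replace it with your own non-predictor columns.
SELECT LINEAR_REG('myLinearRegModelAll', 'faithful', 'eruptions', '*'
                  USING PARAMETERS exclude_columns='id, eruptions');
```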
Parameters
exclude_columns | Comma-separated list of columns from predictor‑columns to exclude from processing. |
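optimizer |
The optimizer method used to train the model, one of the following: Newton, BFGS, CGD. If you select CGD, regularization‑method must be set to L1 or ENet; otherwise the function returns an error. Default: CGD if regularization‑method is set to L1 or ENet, otherwise BFGS |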
regularization |
Specifies the method of regularization, one of the following: None (default), L1, L2, ENet |
epsilon |
Tolerance that determines whether the algorithm has reached the specified accuracy result. Default: 1e-6 |
max_iterations |
Maximum number of iterations the algorithm performs before achieving the specified accuracy result. Default: 100 |
lambda |
Integer ≥ 0, specifies the value of the regularization parameter. Default: 1 |
alpha |
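Specifies the value of the ENET regularization parameter (the elastic net mixture), which defines how much L1 versus L2 regularization to provide. Value range: [0,1] Default: 0.5 |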
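To show how the optional parameters combine in a single call, the sketch below trains an elastic net model with every convergence and regularization setting written out at its documented default. The model name myENetModel is hypothetical, and 'ENet' is assumed to be the accepted spelling of the ENET method.

```sql
-- Elastic net regularization with explicit convergence settings.
-- lambda, alpha, epsilon, and max_iterations are set to their documented defaults.
SELECT LINEAR_REG('myENetModel', 'faithful', 'eruptions', 'waiting'
                  USING PARAMETERS regularization='ENet',
                                   lambda=1,
                                   alpha=0.5,
                                   epsilon=1e-6,
                                   max_iterations=100);
```

Spelling out the defaults changes nothing about the result, but it makes the settings visible in the model's callStr attribute.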
Model Attributes
data |
The data for the function, including: |
regularization |
Type of regularization to use when training the model. |
lambda |
Regularization parameter. Higher values enforce stronger regularization. This value must be nonnegative. |
alpha |
Elastic net mixture parameter. |
iterations |
Number of iterations that actually occurred before convergence or before reaching max_iterations. |
skippedRows |
Number of rows of the input relation that were skipped because they contained an invalid value. |
processedRows |
Total number of input relation rows minus skippedRows. |
callStr |
Value of all input arguments specified when the function was called. |
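The attributes listed above can be inspected after training. As a hedged sketch, the queries below use Vertica's GET_MODEL_SUMMARY and GET_MODEL_ATTRIBUTE functions against a model assumed to be named myLinearRegModel; the exact columns returned depend on the attribute.

```sql
-- Readable summary of the trained model.
SELECT GET_MODEL_SUMMARY(USING PARAMETERS model_name='myLinearRegModel');

-- List the attributes that the model exposes.
SELECT GET_MODEL_ATTRIBUTE(USING PARAMETERS model_name='myLinearRegModel');

-- Fetch a single attribute by name, here the data attribute.
SELECT GET_MODEL_ATTRIBUTE(USING PARAMETERS model_name='myLinearRegModel',
                                            attr_name='data');
```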
Examples
=> SELECT LINEAR_REG('myLinearRegModel', 'faithful', 'eruptions', 'waiting'
                      USING PARAMETERS optimizer='BFGS');
         LINEAR_REG
----------------------------
 Finished in 10 iterations
(1 row)
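Once the model is trained, it can be applied with PREDICT_LINEAR_REG. The query below is a minimal sketch that scores the training table itself so the predictions can be compared with the observed values; in practice you would more likely score a separate table with the same predictor column.

```sql
-- Apply the trained model to the predictor column and show the
-- prediction next to the observed response.
SELECT waiting,
       eruptions,
       PREDICT_LINEAR_REG(waiting USING PARAMETERS model_name='myLinearRegModel')
           AS predicted_eruptions
FROM faithful
LIMIT 5;
```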