IMPUTE
Imputes missing values in a data set with either the mean or the mode, based on observed values for a variable. This function supports both numeric and categorical data types.
Important: Before using a machine learning function, be aware that all the ongoing transactions might be committed.
Syntax
IMPUTE( 'output_view', 'input_relation', 'input_columns', 'method' [ USING PARAMETERS [exclude_columns='col1, col2, ... coln',] [partition_columns='col1, col2, ... coln'] ])
Arguments
output_view |
The name of the View where the missing-value imputed rows and rows without missing values are stored. The value must be VARCHAR. |
input_relation |
The table or view that contains the data for missing-value imputation. The value must be VARCHAR. |
input_columns |
A comma-separated list of the columns in |
method |
The missing-value imputation method to use. Valid Values:
Numeric and categorical input values are supported. |
Parameters
exclude_columns=col1, col2, ... coln |
(Optional) The columns from Default Value: Empty |
partition_columns=col1, col2, ... coln |
(Optional) A comma-separated list of column names from Default Value: Empty |
Privileges
To use IMPUTE, you must either be a superuser or have CREATE privileges for the schema of the output view and SELECT privileges for the input table or view. There are no privileges needed on the function itself.
See GRANT (Schema) and GRANT (Table).
Examples
These examples show how you can use the IMPUTE function on the small_input_impute
table.
Execute the IMPUTE function, specifying the mean method:
=> SELECT impute('output_view','small_input_impute', 'pid, x1,x2,x3,x4','mean' USING PARAMETERS exclude_columns='pid'); impute -------------------------- Finished in 1 iteration (1 row)
Execute the IMPUTE function, specifying the mode method:
=> SELECT impute('output_view3','small_input_impute', 'pid, x5,x6','mode' USING PARAMETERS exclude_columns='pid'); impute -------------------------- Finished in 1 iteration (1 row)