Best Practices for Creating Access Policies on Vertica
Overview
The Vertica Analytic Database access policies act on columns and rows to provide extra security on data in your tables. You can create access policies that limit which users can access certain data by applying the access policy to a table. The access policy identifies any row, columns, or roles you want to restrict.
For example, you can create an access policy to prevent a specific role from viewing employee salaries in an employee table. Users who have that role will be prevented from seeing employee salaries when they run queries or reports against the table.
How to Create an Access Policy
Create access policies as follows:
- Create a table:
=> CREATE TABLE customers_table(custID INT, password VARCHAR, SSN VARCHAR);
- Create a row or column access policy on an existing table. This example shows how to create a column access policy:
=> CREATE ACCESS POLICY on customers_table FOR COLUMN SSN CASE WHEN enabled_role('manager') THEN SSN ELSE substr(SSN, 8, 4) END ENABLE;
This policy allows a manager role to view the entire SSN
column and limits other roles to viewing the last four digits of the SSN
column.
- Run the following SQL query:
=> SELECT * FROM customers_table;
In the background, Vertica uses the access policy and rewrites and runs this query as:
=> SELECT * FROM (SELECT custID, password, CASE WHEN enabled_role('manager') THEN SSN ELSE substr(SSN, 8, 4) END AS SSN FROM customers_table) customers_table;
Performance Implications
When Vertica runs this rewritten query against the table, the process will be more expensive with regards to performance than the simple SELECT
statement in Step 3.
SQL queries that are run against tables with access policies are rewritten before execution to implement behavior defined by the access policy.
To determine how expensive a SQL query may be, test the rewritten query to determine its effect on system performance using the details in the following section.
The next section explains how to test this rewritten SQL query to collect processing information to ensure that you achieve maximum performance.
How to Test Performance
Be sure to test the rewritten query against the table before implementing the access policy. Use the examples below as a guide to determining the contents of the rewritten query.
For the purposes of this example:
User1
has been granted the manager role and is a DBADMIN userUser2
is not a managerjohn
is not a manager
Use the \timing
meta-command to determine how long your query will run outside of the access policy. For more information see \timing in the Vertica documentation.
Testing Column Access Policy Example
This examples shows queries and output with only the column access policy enabled.
Two users query the customers table
User1=> SELECT * FROM customers; name | ssn | type -----+-------------+-------------- john | 123-44-6789 | active adam | 123-45-6789 | active alex | 123-54-6789 | inactive (3 rows)
User2=> SELECT * FROM customers where type = 'active'; name | ssn | type -----+-------------+-------- adam | 123-45-6789 | active john | 123-44-6789 | active (2 rows)
User1, as a DBADMIN user, creates the following column access policy
User1=> CREATE ACCESS POLICY ON customers FOR column ssn CASE WHEN enabled_role('manager') THEN ssn ELSE substr(ssn, 8, 4) END ENABLE;
This access policy allows a user who has been granted the manager role to view the entire contents of the ssn column in the customers table, and limits other users to viewing only the last four digits.
Run a query as User2 (non-manager)
With this access policy enabled on the table, run the following query:
User2=> SELECT * FROM customers WHERE type = 'active';
Query transformation
With the column access policy enabled by the DBADMIN, Vertica transforms the previous query to:
User2=> SELECT * FROM (SELECT customers.name, (substr(customers.ssn, 8, 4))::VARCHAR(80) AS ssn, customers.type FROM public.customers) customers WHERE type = 'active';
In the transformed query:
SELECT * FROM
and customersWHERE type = 'active'
represent the text of the query before transformation.- Vertica substitutes
(select customers.name, (substr(customers.ssn, 8, 4))::varchar(80) AS ssn, customers.type from public.customers)
inside the query as a subquery and represents the parts of the access policy that determine what data is returned.
If you know the access policy you want to create and the SQL query users may run, you can use this information to determine the syntax of your transformed query. Then test that query outside of the access policy to determine performance results.
The output for User2
(non-manager) is:
name | ssn | type -----+------+-------- adam | 6789 | active john | 6789 | active (2 rows)
Run a query as User1 (manager role)
With this access policy enabled by the DBADMIN, run the following query:
User1=> select * from customers where type = 'active';
Query transformation
With the column access policy enabled, Vertica transforms the previous query to:
User1=> SELECT * FROM(SELECT customers.name, customers.ssn, customers.type, customers.epoch FROM public.customers) customers WHERE type= 'active';
As in the previous example where User2
ran a query:
SELECT * FROM
and customersWHERE type = 'active'
represent the text of the query before transformation.- Vertica substitutes
(SELECT customers.name, customers.ssn, customers.type, customers.epoch FROM public.customers)
inside the query as a subquery and represents the parts of the access policy that determine what data is returned.
Use this information to determine the syntax of your transformed query and then test that query outside of the access policy to determine performance results.
The output for User1
(manager) is:
name | ssn | type -----+-------------+------- adam | 123-45-6789 | active john | 123-44-6789 | active (2 rows)
Testing Row Access Policy example
This example shows queries and output with only the row access policy enabled.
User1, as the DBADMIN user, creates the following row access policy
User1=> CREATE access policy on customers FOR rows WHERE name = current_user() enable;
This access policy allows only users that match the user name from the table to view the contents of that row.
Run a query as user john
john=> SELECT * FROM customers WHERE type = 'active';
Query transformation
With the row access policy enabled by the DBADMIN, Vertica transforms the previous query to:
SELECT * FROM (SELECT customers.name, customers.ssn, customers.type FROM public.customers WHERE (customers.name = 'john'::VARCHAR(128))) customers WHERE type = 'active';
In the transformed query:
SELECT * FROM
and customersWHERE type = 'active'
represent the text of the query before transformation.- Vertica substitutes
SELECT customers.name, customers.ssn, customers.type FROM public.customers WHERE (customers.name = 'john'::varchar(128)))
inside the query as a subquery and represents the parts of the access policy that determine what data is returned.
Again, before implementing the access policy use this information to determine the syntax of your transformed query and then test that query outside of the access policy to determine performance results.
The output for user john is:
name | ssn | type -----+-------------+-------- john | 123-44-6789 | active (1 row)
Testing Row and Column Access Policies example
This example shows the output when both the row and column access policies from the previous examples are enabled on a table at the same time.
Run a query as user john
john=> SELECT * FROM customers WHERE type = 'active';
Query transformation
With the row and column access policies enabled by the DBADMIN, Vertica transforms the previous query to:
SELECT * FROM (SELECT customers.name, (substr(customers.ssn, 8, 4)) ::varchar(80) AS ssn, customers.type FROM public.customers WHERE (customers.name = 'john'::VARCHAR(128))) customers WHERE type = 'active';
In the above query:
SELECT * FROM
and customersWHERE type = 'active'
represent the text of the query before transformation.- Vertica substitutes
(SELECT customers.name, (substr(customers.ssn, 8, 4))::VARCHAR(80) AS ssn, customers.type FROM public.customers WHERE (customers.name = 'john'::varchar(128)))
inside the query as a subquery and represents the parts of both the row and column access policies that determine what data is returned.
Again, before implementing the access policy use this information to determine the syntax of your transformed query and then test that query outside of the access policy to determine performance results.
The output for user john is:
name | ssn | type -----+------+-------- john | 6789 | active (1 row)
Following these best practices tips when implementing access policies ensures your system continues to run smoothly and efficiently. In addition, you get the reassurance that your sensitive data is well-protected from unauthorized users.