Cross joins are the simplest joins to write, but they are usually not the fastest to run, because they return every possible combination of the two tables’ records. A cross join has no join condition and returns what is known as a Cartesian product, where the number of rows in the result set equals the number of rows in the first table multiplied by the number of rows in the second table.
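As a quick check of the row-count rule, the size of a cross join can be estimated before running it by multiplying the two tables' row counts. The sketch below assumes the promotion_dimension and store.store_sales_fact tables used in this section; the column alias cross_join_rows is illustrative only.

```sql
-- Estimate the size of the Cartesian product without materializing it:
-- rows(first table) * rows(second table).
-- cross_join_rows is a hypothetical alias chosen for this sketch.
SELECT (SELECT COUNT(*) FROM promotion_dimension)
     * (SELECT COUNT(*) FROM store.store_sales_fact) AS cross_join_rows;
```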
The following query returns all possible combinations from the promotion table and the store sales table:
=> SELECT * FROM promotion_dimension CROSS JOIN store.store_sales_fact;
This example returns over 600 million records. As it shows, cross join results can be extremely large and difficult to manage. Cross joins can still be useful, however, such as when you want to return a single-row result set.
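One way a cross join can yield a single-row result set is to combine two single-row subqueries, because 1 row multiplied by 1 row gives 1 row. The following is a minimal sketch of that idea, not necessarily the only use case intended here. It assumes the same two tables, and the aliases promotion_count and sales_count are illustrative.

```sql
-- Each subquery returns exactly one row, so the cross join returns 1 x 1 = 1 row.
-- promotion_count and sales_count are hypothetical aliases for this sketch.
SELECT p.promotion_count, f.sales_count
FROM (SELECT COUNT(*) AS promotion_count FROM promotion_dimension) p
CROSS JOIN (SELECT COUNT(*) AS sales_count FROM store.store_sales_fact) f;
```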
Filter out unwanted records in a cross join with WHERE clause join predicates:
=> SELECT * FROM promotion_dimension p CROSS JOIN store.store_sales_fact f
   WHERE p.promotion_key LIKE f.promotion_key;
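If the predicate is meant to test equality of the two keys (an assumption, since the query above uses LIKE), the same filter could be written as an inner join with an ON clause, which keeps the join condition next to the join itself:

```sql
-- Assumes the intent is key equality; LIKE and = are not interchangeable in general.
SELECT *
FROM promotion_dimension p
INNER JOIN store.store_sales_fact f
  ON p.promotion_key = f.promotion_key;
```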
Implicit versus Explicit Joins
Vertica recommends that you do not write implicit cross joins (comma-separated tables in the FROM clause). Such queries can suggest that a join predicate was accidentally omitted.
The following query implicitly cross joins the tables promotion_dimension and store.store_sales_fact:
=> SELECT * FROM promotion_dimension, store.store_sales_fact;
It is better practice to express this cross join explicitly, as follows:
=> SELECT * FROM promotion_dimension CROSS JOIN store.store_sales_fact;
The following example creates two small tables and their superprojections and then runs a cross join on the tables:
=> CREATE TABLE employee(employee_id INT, employee_fname VARCHAR(50));
=> CREATE TABLE department(dept_id INT, dept_name VARCHAR(50));
=> INSERT INTO employee VALUES (1, 'Andrew');
=> INSERT INTO employee VALUES (2, 'Priya');
=> INSERT INTO employee VALUES (3, 'Michelle');
=> INSERT INTO department VALUES (1, 'Engineering');
=> INSERT INTO department VALUES (2, 'QA');
=> SELECT * FROM employee CROSS JOIN department;
In the result set, the cross join takes a record from the first table and creates a new row for every row in the second table. It then does the same for the next record in the first table, and so on, so that each employee is paired with each department.
 employee_id | employee_fname | dept_id |  dept_name
-------------+----------------+---------+-------------
           1 | Andrew         |       1 | Engineering
           2 | Priya          |       1 | Engineering
           3 | Michelle       |       1 | Engineering
           1 | Andrew         |       2 | QA
           2 | Priya          |       2 | QA
           3 | Michelle       |       2 | QA
(6 rows)
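Without an ORDER BY clause, SQL does not guarantee the order in which rows are returned, so the row order shown above should not be relied on. If a stable order matters, a sort can be added. A sketch using the employee and department tables created in this example:

```sql
-- Sort the six combinations by employee first, then by department.
SELECT *
FROM employee CROSS JOIN department
ORDER BY employee_id, dept_id;
```

With this ordering, both departments are listed for employee 1, then for employee 2, then for employee 3.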