GROUPBY HASH Query Plan
Here's an example of how GROUPBY HASH
operations look in EXPLAIN
output.
=> EXPLAIN SELECT COUNT(DISTINCT annual_income) FROM customer_dimension WHERE customer_region='NorthWest';
The output shows that the optimizer chose the less efficient GROUPBY HASH
path, which means the projection was not presorted on the annual_income
column. If such a projection is available, the optimizer would choose the GROUPBY PIPELINED
algorithm.
Access Path: +-GROUPBY NOTHING [Cost: 256, Rows: 1 (NO STATISTICS)] (PATH ID: 1) | Aggregates: count(DISTINCT customer_dimension.annual_income) | +---> GROUPBY HASH (LOCAL RESEGMENT GROUPS) [Cost: 253, Rows: 10K (NO STATISTICS)] (PATH ID: 2) | | Group By: customer_dimension.annual_income | | +---> STORAGE ACCESS for customer_dimension [Cost: 227, Rows: 50K (NO STATISTICS)] (PATH ID: 3) | | | Projection: public.customer_dimension_super | | | Materialize: customer_dimension.annual_income | | | Filter: (customer_dimension.customer_region = 'NorthWest' ...