PERCENTILE_DISC [Analytic]

An inverse distribution function where, for each row, PERCENTILE_DISC returns the value that would fall into the specified percentile among a set of values in each partition within a window. PERCENTILE_DISC() assumes a discrete distribution data model. NULL values are ignored.

PERCENTILE_DISC examines the cumulative distribution values in each group until it finds one that is greater than or equal to the specified percentile. Vertica computes the percentile where, for each row, PERCENTILE_DISC outputs the first value of the WITHIN GROUP(ORDER BY) column whose CUME_DIST (cumulative distribution) value is >= the argument FLOAT value—for example, 0.4:

PERCENTILE_DIST(0.4) WITHIN GROUP (ORDER BY salary) OVER(PARTITION BY deptno)...

Given the following query:

SELECT CUME_DIST() OVER(ORDER BY salary) FROM table-name;

The smallest CUME_DIST value that is greater than 0.4 is also the PERCENTILE_DISC.

Behavior Type

Immutable

Syntax

PERCENTILE_DISC ( percentile ) WITHIN GROUP ( 
... ORDER BY expression [ ASC | DESC ] ) OVER ( 
... [ window-partition-clause ] ) 

Parameters

percentile

Percentile value, a FLOAT constant that ranges from 0 to 1 (inclusive).

WITHIN GROUP(ORDER BY expression)

Specifies how to sort data within each group. ORDER BY takes only one column/expression that must be INTEGER, FLOAT, INTERVAL, or NUMERIC data type. NULL values are discarded.

The WITHIN GROUP(ORDER BY) clause does not guarantee the order of the SQL result. To order the final result , use the SQL ORDER BY clause set.

ASC | DESC

Specifies the ordering sequence as ascending (default) or descending.

OVER()

See Analytic Functions

Example

This query computes the 20th percentile annual income by group for first 300 customers in Wisconsin and the District of Columbia.

=> SELECT customer_state, customer_key, annual_income,
      PERCENTILE_DISC(.2) WITHIN GROUP(ORDER BY annual_income) 
      OVER (PARTITION BY customer_state) AS PERCENTILE_DISC
   FROM customer_dimension
   WHERE customer_state IN ('DC','WI')
   AND customer_key < 300
   ORDER BY customer_state, customer_key;
 customer_state | customer_key | annual_income | PERCENTILE_DISC
----------------+--------------+---------------+-----------------
 DC             |          104 |        658383 |          417092
 DC             |          168 |        417092 |          417092
 DC             |          245 |        670205 |          417092
 WI             |          106 |        227279 |          227279
 WI             |          127 |        703889 |          227279
 WI             |          209 |        458607 |          227279
(6 rows)