Reacting to Stale Statistics

During predicate selectivity estimation, the query optimizer can identify when histograms are not available or are out of date. If the value in the predicate is outside the histogram's maximum range, the statistics are stale. If no histograms are available, then no statistics are available to the plan.

When the optimizer detects stale or no statistics, such as when it encounters a column predicate for which it has no histogram, the optimizer performs the following actions:

For example, the following query plan fragment shows no statistics (histograms unavailable):

| | +-- Outer -> STORAGE ACCESS for fact [Cost: 604, Rows: 10K (NO STATISTICS)]

The following query plan fragment shows that the predicate falls outside the histogram range:

| | +-- Outer -> STORAGE ACCESS for fact [Cost: 35, Rows: 1 (PREDICATE VALUE OUT-OF-RANGE)] 

You can get information about which table column has no statistics by querying a system table; for example, view the timestamp for when statistics were last run by querying system table PROJECTION_COLUMNS.

Example

  1. Run full statistics on table trades:
    => SELECT ANALYZE_STATISTICS('trades'); ANALYZE_STATISTICS
    --------------------
                      0
    (1 row)
    
  2. Query columns column_name, statistics_type, and statistics_updated_timestamp from projection PROJECTION_COLUMNS.
    => SELECT projection_column_name, statistics_type,    statistics_updated_timestamp 
       FROM PROJECTION_COLUMNS where table_name = 'trades';
     projection_column_name | statistics_type | STATISTICS_UPDATED_TIMESTAMP
    ------------------------+-------------------------------------------------
     stock                  | FULL            | 2011-03-31 13:39:16.968177-04
     bid                    | FULL            | 2011-03-31 13:39:16.96885-04
     ask                    | FULL            | 2011-03-31 13:39:16.968883-04
    (3 rows)
    

You can also query the PROJECTIONS column HAS_STATISTICS, which returns true only when all non-epoch columns for a table have full statistics. Otherwise, the column returns false.

See Also

Analyzing Workloads