Data Collector Utility

The Data Collector collects and retains history of important system activities, and records essential performance and resource utilization counters.

Data Collector extends system table functionality with minimal overhead, performing the following tasks:

  • Provides a framework for recording events.
  • Propagates information to system tables.

You can query Data Collector information to obtain the past state of system tables and extract aggregate information. It can also help you:

  • See what actions users have taken.
  • Locate performance bottlenecks.
  • Identify potential improvements to Vertica configuration.

Data Collector does not collect data for nodes that are down, so no historical data is available for that node.

Data Collector works with Workload Analyzer, a tool that intelligently monitors the performance of SQL queries and workloads, and recommends tuning actions based on observations of the actual workload history.

Configuring and Accessing Data Collector Information

Data Collector retains the data it gathers according to configurable retention policies. Data Collector is on by default; you can disable it by setting set configuration parameter EnableDataCollector to 0, at the database and node levels, with ALTER DATABASE and ALTER NODE, respectively.

You can access metadata on collected data of all components through system table DATA_COLLECTOR. This table includes information about current collection policies on that component, and how much data is retained in memory and on disk.

Collected data is logged on disk in the DataCollector directory under the Vertica /catalog path. You can query logged data from component-specific Data Collector tables. You can also manage logged data with Vertica meta-functions; see Managing Data Collection Logs for details.