Vertica Analytics Platform Version 9.2.x Documentation

Event Severity Types

Event names are sensitive to case and spaces. Vertica logs the following events:

Event Name / Event Type / Description / Action

Low Disk Space

0

The database is running out of disk space, a disk is failing, or there is an I/O hardware failure.

Add more disk space, or replace the failing disk or hardware, as soon as possible.

Check dmesg to see what caused the problem.

Also, use the DISK_RESOURCE_REJECTIONS system table to determine the types of disk space requests that are being rejected and the hosts on which they are being rejected. See Managing Disk Space for more information about low disk space.
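As a rough illustration of the DISK_RESOURCE_REJECTIONS check, the following Python sketch runs a query through any DB-API 2.0 cursor and totals rejections by host and request type. The helper name summarize_disk_rejections is hypothetical, and the column names (node_name, resource_type, rejected_reason, rejected_count) are assumptions; confirm them against the system table reference for your Vertica version before relying on this.

```python
from collections import defaultdict

# Assumed column names; verify them against the DISK_RESOURCE_REJECTIONS
# reference for your Vertica version.
REJECTIONS_QUERY = (
    "SELECT node_name, resource_type, rejected_reason, rejected_count "
    "FROM disk_resource_rejections"
)


def summarize_disk_rejections(cursor):
    """Return {node_name: {resource_type: total_rejected_count}}.

    `cursor` is any DB-API 2.0 cursor connected to the database.
    This is a hypothetical helper, not part of Vertica.
    """
    cursor.execute(REJECTIONS_QUERY)
    summary = defaultdict(lambda: defaultdict(int))
    for node_name, resource_type, _reason, rejected_count in cursor.fetchall():
        summary[node_name][resource_type] += rejected_count
    # Convert to plain dicts so the result prints and compares cleanly.
    return {node: dict(types) for node, types in summary.items()}
```

The result groups rejection counts first by host and then by request type, which matches the two questions the system table is meant to answer.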

Read Only File System

1

The database does not have write access to the file system for the data or catalog paths. This can sometimes occur if Linux remounts a drive due to a kernel issue.

Modify the privileges on the file system to give the database write access.

Loss Of K Safety

2

The database is no longer K-Safe because there are insufficient nodes functioning within the cluster. Loss of K-safety causes the database to shut down.

For example, in a four-node cluster with K-safety=1, the fault tolerance is at a critical level if one node fails. If two nodes fail, the system loses K-safety.

If a system shuts down due to loss of K-safety, you need to recover the system. See Failure Recovery in the Administrator's Guide.

Current Fault Tolerance at Critical Level

3

One or more nodes in the cluster have failed. If the database loses one more node, it is no longer K-Safe and it shuts down. (For example, a four-node cluster is no longer K-safe if two nodes fail.)

Restore any nodes that have failed or been shut down.
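The relationship between failed nodes and the two K-safety events above can be sketched in Python. This is a deliberately simplified model that only compares the number of failed nodes with the K-safety value, as in the four-node example; in a real cluster the outcome can also depend on which nodes fail. The function name and the returned labels are hypothetical.

```python
def fault_tolerance_state(k_safety, failed_nodes):
    """Classify cluster state by comparing failed nodes with K-safety.

    Simplifying assumption: only the number of failed nodes matters.
    """
    if k_safety < 0 or failed_nodes < 0:
        raise ValueError("k_safety and failed_nodes must be non-negative")
    if failed_nodes == 0:
        return "ok"
    if failed_nodes > k_safety:
        # Corresponds to the Loss Of K Safety event (type 2).
        return "loss of k-safety"
    if failed_nodes == k_safety:
        # One more failure would exceed K: Current Fault Tolerance
        # at Critical Level (type 3).
        return "critical"
    # Some nodes are down, but more than one further failure is tolerated.
    return "reduced"


print(fault_tolerance_state(1, 1))  # critical
print(fault_tolerance_state(1, 2))  # loss of k-safety
```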

Too Many ROS Containers

4

Heavy load activity on one or more projections can sometimes generate more ROS containers than the Tuple Mover can handle. Vertica allows up to 1024 ROS containers per projection before it rolls back additional load jobs and returns a ROS pushback error message.

Typically, the Tuple Mover catches up with pending mergeout requests and the Optimizer can resume executing queries on the affected tables (see Mergeout).

If this problem does not resolve quickly, or if it occurs frequently, it is probably related to insufficient RAM. You can estimate the optimal amount of RAM as follows:

RAM (MB) / (6 * #table-cols) > 10

where #table-cols is the number of columns in the largest database table and RAM is measured in megabytes. For example, given a 100-column table, you need at least 6GB (6144MB) of RAM:

6144MB / (6 * 100) = 10.24
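The estimate above can be checked with a few lines of Python. This sketch assumes RAM is expressed in megabytes, which is what the worked example uses; the function names are illustrative only.

```python
def ram_ratio(ram_mb, table_cols):
    """Return RAM (MB) / (6 * #table-cols)."""
    if table_cols <= 0:
        raise ValueError("table_cols must be positive")
    return ram_mb / (6 * table_cols)


def ram_is_sufficient(ram_mb, table_cols, threshold=10):
    """True when the ratio exceeds the threshold of 10 from the formula."""
    return ram_ratio(ram_mb, table_cols) > threshold


print(ram_ratio(6144, 100))          # 6144 / 600
print(ram_is_sufficient(6144, 100))  # True
print(ram_is_sufficient(4096, 100))  # False
```

Rearranged, the same inequality says that a table with N columns needs more than 60 * N megabytes of RAM.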

WOS Over Flow

5

The WOS cannot hold all the data that you are attempting to load. This means that the copy fails and the transaction rolls back.

Consider loading the data to disk (ROS) instead of memory (WOS) or splitting the fact table load file into multiple pieces and then performing multiple loads in sequence.

You might also increase the frequency of moveout operations by configuring the Tuple Mover to perform moveout more aggressively. See Managing the Tuple Mover in the Administrator's Guide.

Node State Change

6

The node state has changed.

Check the status of the node.

Recovery Failure

7

The database was not restored to a functional state after a hardware or software related failure.

The reason for recovery failure can vary. See the event description for more information about your specific situation.

Recovery Error

8

The database encountered an error while attempting to recover. If the number of recovery errors exceeds Max Tries, the Recovery Failure event is triggered. See Recovery Failure within this table.

The reason for a recovery error can vary. See the event description for more information about your specific situation.

Recovery Lock Error

9

A recovering node could not obtain an S lock on the table.

If you have a continuous stream of COPY commands in progress, recovery might not be able to obtain this lock even after multiple retries.

Either momentarily stop the loads or pick a time when the cluster is not busy to restart the node and let recovery proceed.

Recovery Projection Retrieval Error

10

Vertica was unable to retrieve information about a projection.

The reason for a recovery projection retrieval error can vary. See the event description for more information about your specific situation.

Refresh Error

11

The database encountered an error while attempting to refresh.

The reason for a refresh error can vary. See the event description for more information about your specific situation.

Refresh Lock Error

12

The database encountered a locking error during refresh.

The reason for a refresh lock error can vary. See the event description for more information about your specific situation.

Tuple Mover Error

13

The database encountered an error while attempting to move the contents of the Write Optimized Store (WOS) into the Read Optimized Store (ROS).

The reason for a Tuple Mover error can vary. See the event description for more information about your specific situation.

Timer Service Task Error

14

An error occurred in an internal scheduled task.

Internal use only

Stale Checkpoint

15

Data in the WOS has not been completely moved out in a timely manner. An UNSAFE shutdown could require reloading a significant amount of data.

Be sure that Moveout operations are executing successfully. Check the vertica.log files for errors related to Moveout.

CRC Mismatch

16

The Cyclic Redundancy Check returned an error or errors while fetching data.

Check the vertica.log file or the SNMP trap utility for details about the errors. For more information see Evaluating CRC Errors.
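Because event names are sensitive to case and spaces, a lookup keyed on the exact strings from the table is one way to map names to event types in monitoring scripts. The following Python sketch is illustrative only: the dictionary and function names are hypothetical, and the values are copied from the table above.

```python
# Event names exactly as listed in the table, mapped to their event types.
EVENT_TYPES = {
    "Low Disk Space": 0,
    "Read Only File System": 1,
    "Loss Of K Safety": 2,
    "Current Fault Tolerance at Critical Level": 3,
    "Too Many ROS Containers": 4,
    "WOS Over Flow": 5,
    "Node State Change": 6,
    "Recovery Failure": 7,
    "Recovery Error": 8,
    "Recovery Lock Error": 9,
    "Recovery Projection Retrieval Error": 10,
    "Refresh Error": 11,
    "Refresh Lock Error": 12,
    "Tuple Mover Error": 13,
    "Timer Service Task Error": 14,
    "Stale Checkpoint": 15,
    "CRC Mismatch": 16,
}


def event_type(name):
    """Return the event type for an exact event name, or None if unknown."""
    return EVENT_TYPES.get(name)


print(event_type("WOS Over Flow"))  # 5
print(event_type("WOS Overflow"))   # None: spacing differs from the table
```

The lookup deliberately does no normalization, so a name with different casing or spacing is reported as unknown rather than silently matched.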