Event Severity Types

Event names are sensitive to case and spaces. Vertica logs the following events:

Each entry lists the Event Name, Event Type, Description, and Action, in that order.

Low Disk Space

0

The database is running out of disk space, a disk is failing, or there is an I/O hardware failure.

Add more disk space, or replace the failing disk or hardware, as soon as possible.

Check dmesg to see what caused the problem.

Also, use the DISK_RESOURCE_REJECTIONS system table to determine the types of disk space requests that are being rejected and the hosts on which they are being rejected. See Managing Disk Space for more information about low disk space.

Read Only File System

1

The database does not have write access to the file system for the data or catalog paths. This can sometimes occur if Linux remounts a drive due to a kernel issue.

Modify the privileges on the file system to give the database write access.

Loss Of K Safety

2

The database is no longer K-safe because there are insufficient nodes functioning within the cluster. Loss of K-safety causes the database to shut down.

For example, in a four-node cluster with K-safety=1, the fault tolerance is at a critical level if one node fails. If two nodes fail, the system loses K-safety.

If a system shuts down due to loss of K-safety, you need to recover the system. See Failure Recovery in the Administrator's Guide.
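The four-node example can be sketched as a small classifier. This is a simplified model that only counts failed nodes, as the example in the text does; it ignores which specific nodes failed, and the function name `k_safety_event` is hypothetical rather than part of Vertica.

```python
from typing import Optional


def k_safety_event(failed_nodes: int, k_safety: int) -> Optional[str]:
    """Return the event name suggested by a simple failed-node count.

    Assumption (simplified model): the cluster tolerates up to
    `k_safety` failed nodes, is at a critical level when exactly that
    many have failed, and loses K-safety beyond that.
    """
    if failed_nodes < 0 or k_safety < 0:
        raise ValueError("failed_nodes and k_safety must be non-negative")
    if failed_nodes > k_safety:
        return "Loss Of K Safety"
    if failed_nodes > 0 and failed_nodes == k_safety:
        return "Current Fault Tolerance at Critical Level"
    return None  # neither event applies under this model


# Four-node cluster with K-safety=1, as in the example in the text.
for failed in (0, 1, 2):
    print(failed, k_safety_event(failed, k_safety=1))
```

With K-safety=1, one failed node maps to the critical-level event and two failed nodes map to loss of K-safety, matching the description in the table.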

Current Fault Tolerance at Critical Level

3

One or more nodes in the cluster have failed. If the database loses one more node, it is no longer K-safe and shuts down. (For example, a four-node cluster is no longer K-safe if two nodes fail.)

Restore any nodes that have failed or been shut down.

Too Many ROS Containers

4

Due to heavy data load conditions, there are too many ROS containers. This occurs when the Tuple Mover falls behind in performing mergeout operations. The resulting excess number of ROS containers can exhaust all available system resources. To prevent this, Vertica automatically rolls back all transactions that would load data until the Tuple Mover has time to catch up.

You might need to adjust the Tuple Mover's configuration parameters to compensate for the load pattern or rate. See Managing the Tuple Mover in the Administrator's Guide for details.

You can query the TUPLE_MOVER_OPERATIONS table to monitor mergeout activity. However, the Tuple Mover does not immediately start a mergeout when a projection reaches the limit of ROS containers, so you may not see a mergeout in progress when you receive this error.

If waiting for a mergeout does not resolve the error, the problem is probably related to insufficient RAM. As a rule of thumb, system RAM in MB divided by (6 * the number of columns in the largest table) should be greater than 10. For example, a 100-column table needs at least 6 GB of RAM (6144 MB / (6 * 100) = 10.24) to handle continuous loads.
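The RAM rule of thumb is simple arithmetic, so it can be checked with a few lines of Python. The sketch below is only an illustration of the formula as stated; the function names and the `threshold` parameter are hypothetical and are not Vertica settings.

```python
def ram_ratio(ram_mb: float, columns: int) -> float:
    """Return RAM in MB divided by (6 * number of columns)."""
    if columns <= 0:
        raise ValueError("columns must be a positive integer")
    return ram_mb / (6 * columns)


def has_enough_ram(ram_mb: float, columns: int, threshold: float = 10.0) -> bool:
    """Apply the rule of thumb: the ratio should be greater than 10."""
    return ram_ratio(ram_mb, columns) > threshold


def ram_floor_mb(columns: int, threshold: float = 10.0) -> float:
    """RAM in MB at which the ratio equals the threshold.

    The rule asks for strictly more than this amount.
    """
    if columns <= 0:
        raise ValueError("columns must be a positive integer")
    return threshold * 6 * columns


# The 100-column example from the text, with 6 GB (6144 MB) of RAM.
print(ram_ratio(6144, 100))
print(has_enough_ram(6144, 100))
print(ram_floor_mb(100))
```

For the 100-column table, the floor works out to 6000 MB, which is why 6 GB (6144 MB) satisfies the rule with a small margin.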

WOS Over Flow

5

The WOS cannot hold all the data that you are attempting to load. This means that the copy fails and the transaction rolls back.

Consider loading the data to disk (ROS) instead of memory (WOS) or splitting the fact table load file into multiple pieces and then performing multiple loads in sequence.

You might also consider making the Tuple Mover's moveout operation more aggressive. See Managing the Tuple Mover in the Administrator's Guide.

Node State Change

6

The node state has changed.

Check the status of the node.

Recovery Failure

7

The database was not restored to a functional state after a hardware- or software-related failure.

The reason for recovery failure can vary. See the event description for more information about your specific situation.

Recovery Error

8

The database encountered an error while attempting to recover. If the number of recovery errors exceeds Max Tries, the Recovery Failure event is triggered. See Recovery Failure within this table.

The reason for a recovery error can vary. See the event description for more information about your specific situation.
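The relationship between Recovery Error and Recovery Failure can be pictured as a counter compared against Max Tries. The sketch below is a toy model under that assumption only; the `RecoveryMonitor` class and its `max_tries` argument are hypothetical, and the actual Max Tries value and the way Vertica tracks it are not given here.

```python
class RecoveryMonitor:
    """Toy model: count recovery errors and escalate past a limit."""

    def __init__(self, max_tries: int) -> None:
        if max_tries < 0:
            raise ValueError("max_tries must be non-negative")
        self.max_tries = max_tries  # hypothetical stand-in for Max Tries
        self.errors = 0

    def record_error(self) -> str:
        """Register one recovery error and return the resulting event name."""
        self.errors += 1
        if self.errors > self.max_tries:
            return "Recovery Failure"
        return "Recovery Error"


monitor = RecoveryMonitor(max_tries=2)
for _ in range(3):
    print(monitor.record_error())
```

In this model the first two errors stay at Recovery Error, and the third, which exceeds the limit of 2, escalates to Recovery Failure.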

Recovery Lock Error

9

A recovering node could not obtain an S lock on the table.

If you have a continuous stream of COPY commands in progress, recovery might not be able to obtain this lock even after multiple retries.

Either momentarily stop the loads or pick a time when the cluster is not busy to restart the node and let recovery proceed.

Recovery Projection Retrieval Error

10

Vertica was unable to retrieve information about a projection.

The reason for a recovery projection retrieval error can vary. See the event description for more information about your specific situation.

Refresh Error

11

The database encountered an error while attempting to refresh.

The reason for a refresh error can vary. See the event description for more information about your specific situation.

Refresh Lock Error

12

The database encountered a locking error during refresh.

The reason for a refresh lock error can vary. See the event description for more information about your specific situation.

Tuple Mover Error

13

The database encountered an error while attempting to move the contents of the Write Optimized Store (WOS) into the Read Optimized Store (ROS).

The reason for a Tuple Mover error can vary. See the event description for more information about your specific situation.

Timer Service Task Error

14

An error occurred in an internal scheduled task.

Internal use only

Stale Checkpoint

15

Data in the WOS has not been completely moved out in a timely manner. An UNSAFE shutdown could require reloading a significant amount of data.

Verify that moveout operations are executing successfully. Check the vertica.log files for errors related to moveout.

CRC Mismatch

16

The Cyclic Redundancy Check returned an error or errors while fetching data.

Check the vertica.log file or the SNMP trap utility for the errors. For more information, see Handling CRC Errors.
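Because event names are sensitive to case and spaces, any script that maps names to event types should match them exactly. The following is a minimal sketch of such a lookup built from the names and type numbers in the table; the `EVENT_TYPES` dictionary and the `event_type` function are illustrative helpers, not part of any Vertica API.

```python
# Event names and type numbers copied from the table above.
EVENT_TYPES = {
    "Low Disk Space": 0,
    "Read Only File System": 1,
    "Loss Of K Safety": 2,
    "Current Fault Tolerance at Critical Level": 3,
    "Too Many ROS Containers": 4,
    "WOS Over Flow": 5,
    "Node State Change": 6,
    "Recovery Failure": 7,
    "Recovery Error": 8,
    "Recovery Lock Error": 9,
    "Recovery Projection Retrieval Error": 10,
    "Refresh Error": 11,
    "Refresh Lock Error": 12,
    "Tuple Mover Error": 13,
    "Timer Service Task Error": 14,
    "Stale Checkpoint": 15,
    "CRC Mismatch": 16,
}


def event_type(name: str) -> int:
    """Look up an event type by exact name.

    No case folding or whitespace normalization is applied, so
    "WOS Overflow" and "low disk space" are rejected.
    """
    try:
        return EVENT_TYPES[name]
    except KeyError:
        raise KeyError(
            f"unknown event name {name!r}; names are case- and space-sensitive"
        ) from None


print(event_type("WOS Over Flow"))
```

Keeping the lookup strict makes a misspelled name fail immediately instead of silently matching the wrong event.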