Troubleshooting ROS Pushback
This document focuses on troubleshooting ROS pushback issues. It walks through the eight cases you are most likely to encounter and explains how to resolve each one.
Vertica has a limit of 1024 ROS containers per projection per node. Use the following query to determine how many containers you have per projection per node:
=> SELECT node_name, schema_name, projection_name,
sum(delete_vector_count/cnt)::int + count(*) container_count
FROM storage_containers JOIN
(SELECT projection_id, count(*) cnt FROM projection_columns GROUP BY 1) AS proj_cols
on storage_containers.projection_id = proj_cols.projection_id
WHERE storage_type = 'ROS'
GROUP BY 1,2,3
ORDER BY 4 DESC;
Case 1: Long Running Mergeout Jobs
If you have long-running mergeout jobs on wide tables (those with greater than 250 columns) or on tables with large VARCHAR columns, you may need to increase the MEMORYSIZE of the Tuple Mover resource pool to 2GB per MAXCONCURRENCY and set PLANNEDCONCURRENCY equal to MAXCONCURRENCY. To determine whether individual mergeout jobs are long running (60 minutes or more), run the following query:
=> SELECT a.node_name, a.schema_name, b.projection_name, count(*)
FROM dc_tuple_mover_events a, dc_tuple_mover_events b
WHERE a.transaction_id = b.transaction_id
AND a.event = 'Start'
AND b.event = 'Complete'
AND b.time - a.time > interval '60 minutes'
GROUP BY 1, 2, 3
ORDER BY 4 DESC;
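For example, if MAXCONCURRENCY on the tm pool is 4, a statement along the following lines applies the 2GB-per-MAXCONCURRENCY guidance (the 8G value is 4 x 2GB; substitute your pool's actual MAXCONCURRENCY):
=> ALTER RESOURCE POOL tm MEMORYSIZE '8G' PLANNEDCONCURRENCY 4 MAXCONCURRENCY 4;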
Case 2: Long Running Replay Deletes
If you find that you have long running replay delete operations during mergeout, you should optimize projections for replay deletes. To determine if you have long running replay deletes, run the following query:
=> SELECT a.node_name, a.schema_name, b.projection_name, count(*)
FROM dc_tuple_mover_events a, dc_tuple_mover_events b
WHERE a.transaction_id = b.transaction_id
AND a.event = 'Change plan type to Replay Delete'
AND b.event = 'Complete'
AND b.time - a.time > interval '60 minutes'
GROUP BY 1, 2, 3
ORDER BY 4 DESC;
If your projections are not optimized for deletes, you can counteract this issue by setting the ReplayDeleteAlgorithmSwitchThreshold configuration parameter to 0. For more information, see Understanding the Vertica Replay Delete Algorithms.
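For example, to set the parameter at the database level:
=> SELECT SET_CONFIG_PARAMETER('ReplayDeleteAlgorithmSwitchThreshold', 0);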
Case 3: Hitting ROS Pushback
You may also find that the number of ROS containers per projection per node is close to the ROS pushback limit, but projections are not being chosen for mergeout by the Tuple Mover. This can happen if the number of partitions per table is close to the ROS pushback limit of 1024: because mergeout never combines ROS containers from different partitions, the Tuple Mover cannot reduce the container count any further. You can determine the number of partitions by running the following query:
=> SELECT node_name, table_schema,
projection_name, count(distinct partition_key)
FROM PARTITIONS
GROUP BY 1,2,3
ORDER BY 4 DESC;
To counteract this, MOVE partitions that fall outside your retention policy into a different table, or DROP them.
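For example, assuming a table named sales that is partitioned by month and an archive table sales_archive with the same column definitions (both names are placeholders), a sketch of moving the 2016 partitions out:
=> SELECT MOVE_PARTITIONS_TO_TABLE('sales', '2016-01', '2016-12', 'sales_archive');
If the data is no longer needed at all, DROP_PARTITIONS accepts the same table and partition-key range arguments.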
We recommend hierarchical partitioning, a feature available in Vertica 9.0 and later that is designed for exactly this situation.
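A sketch, again assuming a sales table with a sale_date column: CALENDAR_HIERARCHY_DAY keeps recent data in day-level partitions while grouping older data by month and then by year, which caps the partition count:
=> ALTER TABLE sales
   PARTITION BY sale_date::DATE
   GROUP BY CALENDAR_HIERARCHY_DAY(sale_date::DATE, 2, 2) REORGANIZE;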
Case 4: Pending Partition Reorganize
When the partition expression on a table is changed, the existing data in the table must be reorganized to match the new partition expression. ROS containers that are pending reorganization are not considered for mergeout. Run the following query to determine if this is the issue you are encountering:
=> SELECT table_schema, projection_name
FROM partition_status
WHERE partition_reorganize_percent <> 100;
To resolve this issue, run the following:
=> ALTER TABLE <table_name> REORGANIZE;
Case 5: Frequent Loads into Inactive Partitions
You may have an ETL workload that frequently loads data into inactive partitions. By default, Vertica has one active partition (the most recently created partition) and uses the strata algorithm to merge ROS containers in this partition. Vertica merges ROS containers from inactive partitions into a single ROS container per partition. If you frequently load data into inactive partitions, the mergeout operation might perform a lot of I/O. You can avoid this by setting the ActivePartitionCount configuration parameter to match the number of partitions your loads regularly touch. Some customers have seen significant cluster performance improvements after correctly setting this parameter.
For example, if you have partitioned a table by day and receive data for three days, set the ActivePartitionCount to 3. You may run the following Linux grep command to find the number of mergeouts on inactive partitions:
grep "Projection chosen =" vertica.log | grep "stratumNo = -1" | wc -l
Case 6: WOS Spillover
The Tuple Mover moveout operation may be unable to keep up, which results in WOS spillover. Run the following query to count WOS spillover events by node:
=> SELECT node_name, count(*) FROM dc_execution_engine_events
WHERE event_type = 'WOS_SPILL'
GROUP BY node_name;
WOS spillover can happen for a number of reasons. To resolve this issue, do the following (a combined example follows this list):
- Enable the reflexiveMoveout configuration parameter.
- Use the /*+DIRECT*/ hint for loading files larger than 100MB per node.
- Avoid parking uncommitted data in WOS because uncommitted data is not moved out.
- Use the /*+DIRECT*/ hint to load large temporary tables because temporary tables are not moved out.
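For example, using the parameter name given above and placeholder table and file names, you might enable reflexive moveout and load a large file directly to ROS:
=> SELECT SET_CONFIG_PARAMETER('reflexiveMoveout', 1);
=> COPY sales FROM '/data/sales_feed.csv' DELIMITER ',' DIRECT;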
Case 7: Too Many Trickle Loads
If you have too many small (trickle) loads that use the /*+DIRECT*/ hint, you create ROS containers at a high rate and the mergeout operation may fall behind. Do not use the /*+DIRECT*/ hint for small trickle loads; without the hint, Vertica writes them to the WOS, and moveout batches them into far fewer ROS containers.
If removing the hint is not enough, you can increase the PLANNEDCONCURRENCY and MAXCONCURRENCY of the Tuple Mover resource pool from 3 to 4. Never increase these values beyond 6, because doing so can degrade cluster performance.
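A sketch of that change, assuming the built-in tm pool:
=> ALTER RESOURCE POOL tm PLANNEDCONCURRENCY 4 MAXCONCURRENCY 4;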
Case 8: Reduce MergeoutInterval and MoveoutInterval
These intervals determine how long the Tuple Mover sleeps when no projection qualifies for mergeout or moveout, respectively. Reducing an interval helps only if you are creating ROS containers at a high rate while the mergeout operation is sleeping.
You can lower the MergeoutInterval configuration parameter from its default value of 600 seconds, and lower MoveoutInterval similarly.
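For example (both values below are illustrative; check your current settings before lowering them):
=> SELECT SET_CONFIG_PARAMETER('MergeoutInterval', 300);
=> SELECT SET_CONFIG_PARAMETER('MoveoutInterval', 120);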
For more information, see Tuple Mover Best Practices in the Vertica Knowledge Base.