How Do I Manage Delete Vectors?

Posted June 14, 2018 by Soniya Shah, Information Developer

If you want to remove delete vectors manually or troubleshoot why they are not being removed automatically, follow this checklist.
Step Task Results
1 Check if you have too many delete vectors (more than 100) in a projection. =>SELECT node_name, schema_name, projection_name, COUNT(*) num_dv, SUM(deleted_row_count) del_cnt, SUM(used_bytes) ubytes, MIN(start_epoch) min_epoch, MAX(start_epoch) max_epoch FROM delete_vectors GROUP BY 1,2,3 ORDER BY 4 DESC; If there are too many delete vectors as shown by the num_dv (usually more than 100), go to Step 2.
2 Check if the AHM and LGE are advancing. =>SELECT current_epoch,ahm_epoch,last_good_epoch FROM system; If the AHM and LGE are advancing, go to Step 3. If they are not advancing, review Epoch Management checklist.
3 Check if the epochs in the output of Step 1 are older than the AHM in the output of Step 2. If the result of the Step 1 query shows that the deleted rows have a newer epoch than the AHM, go to Step 4.
4 Advance the AHM with: =>SELECT make_ahm_now(); Do one of the following:
  • Wait for the mergeout to remove the delete vectors when the correct parameters are met.
  • Go to Step 5 to ensure they will be removed immediately.
If you’re waiting for the mergeout to remove the delete vectors as part of its standard process, the checklist is now complete. If you want to manually initiate a removal of the mergeout, go to Step 5.
5 Check if the percentage of delete vectors marked meets the PurgeMergeoutPercent. This configuration parameter dictates when the vectors are automatically purged by the mergeout operation.
  • To check the PurgeMergeoutPercent: =>SELECT get_config_parameter('PurgeMergeoutPercent');
  • To check the percentage of delete vectors marked: =>SELECT schema_name, projection_name, ((deleted_row_count/total_row_count)*100)::int per_deleted, total_row_count, deleted_row_count FROM storage_containers WHERE deleted_row_count > 0 ORDER BY 3 DESC;
If the delete vector percentage is more than the PurgeMergeoutPercent and the ROS size is not too big, you have two options. Either:
  • Wait until the mergeout removes the delete vectors.
  • Force a purge by going to Step 5a.
If the delete vector percentage is not more than the PurgeMergeoutPercent but there are too many delete vectors, go to Step 5b. If the delete vector percentage is greater than 50% of the rows in the table, do one of the following:
  • Force a purge (Step 5a).
  • Create a new table with the data that is not deleted ( Step 5c).
a. Force a purge by doing one of the following:
  • To remove all delete rows with an epoch older than the AHM, use PURGE_TABLE or PURGE_PROJECTION.
  • If your table is partitioned and a major concentration of deleted rows are in one particular partition, use PURGE_PARTITION.
The checklist is complete!
b. Merge the delete vectors to reduce the number of files on disk without rewriting the data. =>SELECT DO_TM_TASK('dvmergeout'); The checklist is complete!
c. Create a new table with the data that is not deleted. To do this you must first create a new table (B) like the original table (A), including projections using the CREATE TABLE command. Then execute the following commands: =>INSERT into B SELECT * from A; DROP TABLE A; ALTER TABLE B RENAME to A; The checklist is complete!

Learn More

Learn more about Best Practices for Deleting Data and check out Deletes in Vertica: The FAQS in Vertica Knowledge Base.