At the VLDB 2012 conference a few weeks ago, we had a chance to listen to Jiri Schindler giving a tutorial about NoSQL. His interesting and informative presentation covered the fundamental architecture and I/O usage patterns of RDBMS systems and various NoSQL data management systems, such as HBase, Cassandra, and MongoDB.
During the presentation, Schindler listed basic I/O access patterns for columnar databases using the slide below. It is hard to summarize the operation of the various columnar database systems on a single slide, and Schindler did a great job given the constraints of the presentation. However, while his characterization might hold for other columnar databases, the Vertica Analytic Database has a different I/O pattern for UPDATEs which we wanted to explain in more detail.
First, Vertica does not require synchronous I/O of a recovery log. Unlike most other RDBMS systems, Vertica implements durability and fault tolerance via distributed replication.
Second, since Vertica never modifies storage in place, it avoids the other I/O intensive operations referenced in the slide.
When a user issues an UPDATE statement, Vertica performs the equivalent of a delete followed by an insert. The existing row is deleted by inserting a Delete Vector (a small record saying that the row was deleted), and a new copy of the row with the appropriately updated columns is inserted. Both the Delete Vector and the new version of the row are stored in a memory buffer known as the WOS (write optimized store). After sufficient data has accumulated in the WOS from INSERTs, UPDATEs, DELETEs, and COPYs (bulk loads), they are moved in bulk to disk storage known as the ROS.
It is important to note that existing files in the ROS are not modified while data is moved from WOS to the ROS – rather a new set of sorted and encoded column files is created. To avoid a large number of files accumulating over time, the Tuple Mover regularly merges column files together using an algorithm that limits the number of times any tuple is rewritten and also uses large contiguous disk operations, which is quite efficient well on most modern file and disk systems.
This arrangement has several advantages:
- From the user’s point of view, the update statement completes quickly and future queries get the expected answer (by filtering out the original values at runtime using the Delete Vectors).
- The cost of sorting, encoding, and writing column files to disk is amortized over a large number of rows by utilizing the in memory WOS.
- I/O is always in proportion to the number of rows inserted or modified – it is never the case that an update of a small number of rows causes I/O on a significant amount of previously stored data.
More details about how data is stored and Vertica’s overall architecture and design decisions, please consider reading our VLDB 2012 paper.