The Rise of Column Databases and MPP Databases for Data Warehousing

Over the last decade, data warehouses have been pushed to answer more ad hoc questions from more people analyzing vastly larger volumes of data, often in real time. The problem is that traditional data warehouse databases like Oracle were written 30 years ago and struggle to support analysis of large, fast-growing data volumes by many users. Recently, scores of new data warehouse databases have entered the market. Two major DBMS innovations are transforming data warehousing:

Column Databases

Unlike row databases such as Oracle or MySQL, column databases answer analytic queries much faster because they read only the columns referenced by the query (row databases read whole rows, so a table scan touches every column of every row). Columnar data is also more compressible than row data because the values within a column are much more homogeneous. Column database compression can reduce database size by as much as a factor of 20, which further speeds up performance and lowers storage costs.

MPP Databases

Also known as shared-nothing databases, massively parallel processing (MPP) databases work faster by spreading (partitioning) the data and the workload across a cluster of inexpensive servers. MPP databases can also grow larger at lower cost than traditional databases: just add servers to the MPP cluster to increase performance or storage.

MPP Column Database

While many of the new data warehouse database products support either MPP (e.g., Netezza) or column storage (e.g., Sybase IQ), Vertica is the only data warehouse database to include both of these innovations, along with others such as automatic tuning, recovery and high availability, trickle loading, and just-in-time decompression. As a result, Vertica runs faster on less hardware than any other data warehouse database.
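To make the column-store and MPP ideas above more concrete, here is a minimal Python sketch. It is a toy illustration only and not how Vertica or any other product is implemented: the sample table, the function names, and the choice of run-length encoding as the compression scheme are all assumptions made for the example. It shows three things: a query that reads a single column, a low-variety column collapsing under compression, and rows hash-partitioned across hypothetical nodes that each compute a partial result.

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical sample table: (order_id, region, amount).
ROWS = [
    (1, "east", 10.0),
    (2, "east", 25.0),
    (3, "east", 5.0),
    (4, "west", 40.0),
    (5, "west", 15.0),
]
COLUMN_NAMES = ("order_id", "region", "amount")


def to_columns(rows, names):
    """Pivot row tuples into one list per column (a toy column store)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def hash_partition(rows, key_index, node_count):
    """Assign each row to a node by hashing one column (shared-nothing style)."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[hash(row[key_index]) % node_count].append(row)
    return nodes


def parallel_sum(nodes, value_index):
    """Each node sums its own partition; a coordinator adds the partial sums."""
    partials = [sum(row[value_index] for row in part) for part in nodes.values()]
    return sum(partials)


columns = to_columns(ROWS, COLUMN_NAMES)

# Column scan: SUM(amount) touches only the "amount" list, not whole rows.
total = sum(columns["amount"])

# Compression: five region values shrink to two (value, count) pairs.
encoded_region = run_length_encode(columns["region"])

# MPP: split rows across two hypothetical nodes, then combine partial sums.
nodes = hash_partition(ROWS, key_index=0, node_count=2)
distributed_total = parallel_sum(nodes, value_index=2)
```

In this sketch, adding capacity means raising `node_count`. A real MPP database must also move existing data when nodes are added and cope with node failures, which the example does not attempt.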