Vertica Analytic Database
Vertica Database for Hadoop and MapReduce
Enterprise Data Warehousing with Vertica:  Speed, Scalability, Savings and Simplicity...24 x 7

Using Vertica as a Structured Data Repository for Apache Hadoop

The Hadoop MapReduce implementation can be a powerful tool for running complex procedural algorithms on structured data stored in a distributed collection of relational databases.

If the data being processed resides in Vertica databases, Hadoop developers can crunch more data, on less hardware, faster than with MySQL, Postgres or any other DBMS.

Vertica has implemented a version of the Cloudera DBInputFormat interface. It makes it easy for Hadoop developers to push Map operations down to Vertica databases in parallel: developers specify parameterized queries, and each mapper receives the pre-aggregated data that its query returns. Hadoop Reduce operations can also use the interface to stream data into Vertica for ongoing reporting and analysis by end users.

In summary, developers can split Map operations into pieces that run inside Vertica and pieces that run inside Hadoop, making the best use of the processing power and expressive elegance of each tool.
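As a rough illustration of the "one parameterized query, one parameter value per mapper" idea, here is a minimal plain-Java sketch. It does not use the Vertica connector or any Hadoop classes; the class name MapperQuerySketch, the trades table, its columns, and the planMapperInputs helper are all hypothetical, chosen only to show how a single aggregating query might be paired with a different parameter for each mapper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MapperQuerySketch {
    // Hypothetical table and columns, used only for illustration.
    // The GROUP BY means the database returns pre-aggregated rows.
    static final String QUERY =
        "SELECT symbol, COUNT(*) AS trades, SUM(volume) AS total_volume "
      + "FROM trades WHERE trade_date = ? GROUP BY symbol";

    /** One unit of work: the shared query plus the value bound to its '?'. */
    static final class MapperInput {
        final String sql;
        final String parameter;

        MapperInput(String sql, String parameter) {
            this.sql = sql;
            this.parameter = parameter;
        }

        @Override
        public String toString() {
            return "param=" + parameter + " sql=" + sql;
        }
    }

    /** Pairs the same parameterized query with each parameter value, one per mapper. */
    static List<MapperInput> planMapperInputs(String sql, List<String> parameters) {
        List<MapperInput> inputs = new ArrayList<>();
        for (String p : parameters) {
            inputs.add(new MapperInput(sql, p));
        }
        return inputs;
    }

    public static void main(String[] args) {
        // Assumed example: one mapper per trading day.
        List<String> days = Arrays.asList("2009-01-05", "2009-01-06", "2009-01-07");
        for (MapperInput input : planMapperInputs(QUERY, days)) {
            System.out.println(input);
        }
    }
}
```

In a real job the input format would execute each query against the database and hand the resulting rows to the mapper; the sketch stops at the planning step so that it stays self-contained.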

Why does Vertica make such a good database for Hadoop?

  • Column orientation – runs queries 50x-200x faster than a row DBMS like MySQL
  • Aggressive compression – stores 10x-30x more data per server than row databases
  • Hardware-independent licensing – pay per TB of data, not per server
  • Deployment flexibility – grow the cluster as needed by deploying new VMware Vertica instances on available hardware in your data center

Vertica - Hadoop Applications

  • Parse and load log files
  • Sessionize clickstream data
  • Analyze financial trades and quotes (TAQ), network events and other time-series data
  • Track social media content for keyword or keyword pair occurrences over time
 
Downloads

Vertica Connector for Hadoop

Vertica Connector for Pig (Preview)
 
Data Sheets

The Vertica Hadoop Connector
 
White Papers

Managing Big Data with Hadoop and Vertica

New Paradigms for High Performance Analytical Computing
 
Expert Opinion

Vertica-Hadoop interface discussed at DBMS2.com...

Baking MapReduce into Database Engines - Worth the Reduction Sauce?...

5 Common Questions About Hadoop...