Many data-driven companies have adopted HDFS to collect, store, and manage large volumes of varying forms of data. However, maintaining multiple copies of everything, inadequate concurrent data access, and an overall lack of analytical performance have resulted in limited business value.
Vertica provides a Unified Analytics Warehouse so organizations can finally bring the data lake and data warehouse together. You can analyze Parquet and ORC data, including complex data types, in place, and optimize key data sets for blazing-fast analytics, all using HDFS for safe distributed storage. With Vertica, your BI and data science teams can analyze the same data at the same time to squeeze maximum value out of your Hadoop investment.
Gain full ANSI SQL functionality, not a subset of commands. Run 100% of TPC-DS benchmark queries with no modification.
Run Vertica as a SQL-on-Hadoop query engine against data stored in any major Hadoop distribution, including Cloudera and HPE MapR. And with Vertica in Eon Mode for HDFS communal storage, Vertica ROS data is stored in HDFS, so you can apply the full functionality of Vertica’s advanced analytics and machine learning to that data.
Query data across Parquet, ORC, JSON, and many other data formats, and analyze complex data types in Parquet format on HDFS and S3.
Use External Tables to analyze ORC and Parquet data on the same HDFS nodes, and execute JOINs between ROS data and External Tables for a more comprehensive view of all your data.
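As a sketch of how this might look in practice (the table names, columns, and HDFS paths here are hypothetical), an external table can be defined over Parquet files in place and then joined against a native Vertica (ROS) table. The ROW column illustrates a complex data type read directly from Parquet:

```sql
-- Hypothetical schema and paths, for illustration only.
-- External table over Parquet files on HDFS, read in place;
-- the ROW column is an example of a complex data type.
CREATE EXTERNAL TABLE web_events (
    user_id   INT,
    event_ts  TIMESTAMP,
    geo       ROW(city VARCHAR, country VARCHAR)
) AS COPY FROM 'hdfs:///data/web_events/*.parquet' PARQUET;

-- JOIN the external (HDFS) data with a native ROS table.
SELECT c.segment, e.geo.country, COUNT(*) AS events
FROM customers AS c                -- native Vertica (ROS) table
JOIN web_events AS e ON e.user_id = c.user_id
GROUP BY c.segment, e.geo.country;
```

Because the external table is just a schema over files in HDFS, the Parquet data never has to be copied into Vertica to participate in the JOIN.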
If you need to keep some or all of your big data analytics on-premises, or on your Hadoop installation using commodity hardware, Vertica is the unified analytics warehouse you need. Because Vertica runs independently of your infrastructure, you can create a variety of hybrid deployments, including a mix of cloud, on-prem, and Hadoop resources.
Want to get technical? Read the “Hadoop Integration Guide”
Vertica offers the fastest way to perform SQL queries on your Hadoop data. Vertica SQL on Apache Hadoop® supports data discovery on your Hadoop data lake as well as highly optimized analytics for the most demanding SLAs. You can use HDFS as a shared object storage layer and import data from HDFS to Vertica on-premises, as needed, via Vertica in Eon Mode for HDFS communal storage. You can even combine that data with AWS S3 data for an extensive hybrid environment that is as flexible as your big data storage and compute deployment needs to be.
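One way such a hybrid query might be sketched (bucket names, paths, and schemas below are hypothetical) is with external tables over both storage tiers, combined in a single statement:

```sql
-- Hypothetical paths and schemas; illustrates querying HDFS and S3 together.
CREATE EXTERNAL TABLE clicks_hdfs (click_ts TIMESTAMP, url VARCHAR)
    AS COPY FROM 'hdfs:///archive/clicks/*.parquet' PARQUET;

CREATE EXTERNAL TABLE clicks_s3 (click_ts TIMESTAMP, url VARCHAR)
    AS COPY FROM 's3://example-bucket/clicks/*.parquet' PARQUET;

-- One query spanning both storage tiers.
SELECT url, COUNT(*) AS hits
FROM (SELECT * FROM clicks_hdfs
      UNION ALL
      SELECT * FROM clicks_s3) AS all_clicks
GROUP BY url
ORDER BY hits DESC;
```

The same SQL works regardless of where the Parquet files live, which is what makes the hybrid deployment transparent to BI tools and analysts.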
Read about Vertica SQL on Apache Hadoop
Although they offer an inexpensive way to store data, Hadoop-based solutions are no match for Vertica’s unified analytics warehouse, purpose-built for big data analytics. Learn how fast and complete the Vertica SQL on Hadoop engine is as we put the TPC-DS benchmark to the test against Impala, Hive on Tez, and Apache Spark. Read the results in this report.
Read the benchmark study
Are you looking for more alternatives to cloud storage and compute for your big data analytics? Vertica in Eon Mode now supports HDFS communal storage as an additional on-premises deployment option, with a durable master copy of ROS files on HDFS. This extends Vertica’s deployment flexibility as the only analytical data warehouse that separates compute from storage for both on-premises data centers and multiple public clouds.
Read more about Vertica in Eon Mode
Are you experiencing slow queries or a high incidence of query failures? The culprit is likely your open-source query engine. Impala, Hive, and Presto are fine for ad-hoc data exploration, but they were designed for small teams of data scientists, not for enterprise organizations that require optimal performance for hundreds of concurrent users. With the Hadoop Transition Service, your organization can easily migrate your open-source query engine tools over to Vertica, deriving even greater value from your HDFS data lake.
This streamlined service combines technical experts, proven migration methodology, and an end-to-end transition based on industry best practices.
Read about Vertica Transition Services for Hive, Impala, and Presto
Since Hadoop’s initial release 14 years ago, untold volumes of data have been stored in HDFS (Hadoop Distributed File System). Spread across a virtual landscape of data-driven organizations, those data lakes are wide and deep. Companies have made tremendous investments in Hadoop over the years, and data continues to pour into their data lakes.
You might ask, have those been wise investments? We think so. Despite what the naysayers are claiming about Hadoop itself these days, it’s still true that vast quantities of data from useful sources can reveal lucrative patterns that make massive data collection worthwhile. Unfortunately, we believe most of those revelations are still out there to be made. The problem is that many analytics teams are using open-source query engines that were designed to work with their Hadoop distros. Those query engines are simply not providing the insights that are possible from HDFS data lakes.