Vertica

Vertica's Blog

Thoughts About HP Vertica for SQL on Hadoop

Et voilà

HP recently announced HP Vertica for SQL on Hadoop. We’ve leveraged our years of experience in big data analytics and opened up our platform to allow users to tap into the full power of Hadoop. It’s a rich, fast, and enterprise-ready implementation of SQL on Hadoop that we’re very proud to introduce.

We know that you have a choice when it comes to SQL-on-Hadoop engines. There are several on the market for a reason: they are a very powerful way to perform analytics on big data stored in Hadoop using the familiar SQL language. Users can apply any reporting or analytical tool to the data rather than writing their own Java and MapReduce code.

However, not all SQL-on-Hadoop engines are created equal. We think HP Vertica for SQL on Hadoop has some very big differences. These include:

  • Platform Agnostic – When you adopt a SQL-on-Hadoop query engine, it may be tied to one distribution of Hadoop. Not so with HP Vertica for SQL on Hadoop. Our implementation works with the Hortonworks, Cloudera, and MapR distributions.
  • SQL Completeness – The richer the SQL engine, the wider the range of analytics that you can perform without extensive coding and data movement. You get a very rich set of analytical functions with HP Vertica for SQL on Hadoop. HP Vertica offers enterprise-ready, advanced analytics that support JOINs, complex data types, and other capabilities only available from our SQL on Hadoop implementation.
  • Manageability – Tools for managing queries and managing the resources of your cluster are fairly scarce and immature in the Hadoop world. However, with some of the tools we include, you can divide resources among different queries and different types of queries. If unplanned and resource-intensive queries have to be cancelled or temporarily interrupted, they can be.
  • Data Source Transparency – It’s important that you can query common standard data storage formats such as Parquet, Avro, and ORC. When you can use native formats, you avoid having to move the data.
  • Path to Optimization – When you need to boost performance, HP Vertica for SQL on Hadoop offers optimizations like compression, columnar storage, and projections.

Keep in mind, too, that this offering comes from HP Software. Users should be able to take advantage of all the power of our Haven platform for big data. Encompassing proven technologies from HP Software, including Autonomy, Vertica, and ArcSight, Haven enables forward-thinking organizations to make use of virtually all information sources from both inside and outside their four walls to make better, faster decisions.

Download the report here.

HP Vertica Storage Location for HDFS

Do you find yourself running low on disk space on your HP Vertica database? You could delete older data, but that sacrifices your ability to perform historical queries. You could add new nodes to your cluster or add storage to your existing nodes. However, these options require additional expense.

The HP Vertica Storage Locations for HDFS feature introduced in HP Vertica Version 7.1 offers you a new solution: storing data on an Apache Hadoop cluster. You can use this feature to store data in a Hadoop Distributed File System (HDFS) while still being able to query it through HP Vertica.

Watch this video for an overview of the HP Vertica Storage Locations for HDFS feature and an example of Read More »

HP Vertica Best Practices: Native Connection Load Balancing

You may be aware that each client connection to a host in your HP Vertica cluster requires a small overhead in memory and processor time. For a single connection, this impact is minimal, almost unnoticeable. Now imagine you have many clients all connecting to the same host at the same time. In this situation, the compounded overhead can potentially affect database performance.

To limit the performance impact of many client connections, you might manually assign certain client connections to certain hosts. But this can become tedious and difficult as more and more client connections are added. Luckily, HP Vertica offers a feature that can do all this for you. It’s called native connection load balancing.
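To make the idea concrete, here is a minimal Python sketch of spreading client connections evenly across hosts instead of assigning them by hand. It only illustrates the general concept using a simple round-robin rotation, which is an assumption made for this example and not a description of how HP Vertica implements the feature. The host names and the `assign_host` helper are hypothetical.

```python
from collections import Counter
from itertools import cycle

# Hypothetical host names, used only for illustration.
hosts = ["node01", "node02", "node03"]

# Assumed policy for this sketch: rotate through the hosts in order.
next_host = cycle(hosts)


def assign_host():
    """Return the host that the next incoming client connection should use."""
    return next(next_host)


# Simulate nine clients connecting one after another.
assignments = [assign_host() for _ in range(9)]

# Count how many connections each host ended up with.
per_host = Counter(assignments)
print(per_host)
```

With a rotation like this, no single host absorbs the overhead of every connection, which is the problem the manual assignment approach tries to solve.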

Native Read More »

Workshop on Distributed Computing in R

R is used by millions of data scientists. In the near future, these data scientists will have to rely on distributed computing to meet the computational demands of Big Data. Wouldn’t it be helpful if R provided simple ways to harness the power of multiple servers?

HP is hosting an R workshop January 26-27, 2015 where R users will brainstorm on this topic. The workshop is being organized by Indrajit Roy, Principal Researcher at HP Labs, and Michael Lawrence, R-core member at Genentech. A number of well-known R contributors, including members affiliated with universities, national labs, and industry, are going to present their views at the workshop.

Here Read More »

What Is a Range Join and Why Is It So Fast?

Last week, I was at the 2015 Conference on Innovative Data Systems Research (CIDR), held at the beautiful Asilomar Conference Grounds. The picture above shows one of the many gorgeous views you won’t see when you watch other people do PowerPoint presentations. One HP Vertica user at the conference said he saw a “range join” in a query plan, and wondered what it is and why it is so fast.

First, you need to understand what kind of queries turn into range joins. Generally, these are queries with inequality (greater than, less than, or between) predicates. For example, a map of the IPv4 address space might give details about addresses Read More »
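As a rough illustration of the kind of query described above, the following Python sketch joins a table of IPv4 addresses to a table of address ranges with a BETWEEN predicate. It runs on an in-memory SQLite database purely to show the shape of the query. SQLite is a stand-in here and says nothing about how HP Vertica plans or executes the join. The table names, column names, and sample rows are all hypothetical.

```python
import sqlite3


def ip_to_int(dotted):
    """Convert a dotted-quad IPv4 string to its 32-bit integer value."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ip_ranges (range_start INTEGER, range_end INTEGER, owner TEXT);
    CREATE TABLE clicks (ip INTEGER, url TEXT);
""")

# Hypothetical map of part of the IPv4 address space.
conn.executemany("INSERT INTO ip_ranges VALUES (?, ?, ?)", [
    (ip_to_int("10.0.0.0"), ip_to_int("10.255.255.255"), "private-a"),
    (ip_to_int("192.168.0.0"), ip_to_int("192.168.255.255"), "private-c"),
])

# Hypothetical events keyed by client address.
conn.executemany("INSERT INTO clicks VALUES (?, ?)", [
    (ip_to_int("10.1.2.3"), "/home"),
    (ip_to_int("192.168.1.10"), "/cart"),
    (ip_to_int("8.8.8.8"), "/search"),
])

# The join condition is an inequality (BETWEEN) rather than an equality.
rows = conn.execute("""
    SELECT c.url, r.owner
    FROM clicks AS c
    JOIN ip_ranges AS r
      ON c.ip BETWEEN r.range_start AND r.range_end
    ORDER BY c.url
""").fetchall()
print(rows)
```

The address 8.8.8.8 falls in neither range, so the inner join drops that row.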

The HP Vertica Community is Moving!

The HP Vertica online community will soon have a new home. In the next few months, we’ll be joining the Big Data and Analytics Community, part of the HP Developer Community, located at https://community.dev.hp.com/.

Why are we doing this?

We’re joining the new community so that you’ll have a centralized place to go for all your big data questions and answers. Using the Big Data and Analytics Community, you will be able to:

  • Connect with customers across all our Big Data offerings, including HP Vertica Enterprise and Community Editions, HP Vertica OnDemand, HP IDOL, and HP IDOL OnDemand.
  • Learn more about HP Haven, the HP Big Data Platform that allows you to harness 100% of your data, including Read More »

HP Vertica Gives Back this Holiday Season

This holiday season, four teams of HP Vertica employees and families made a trip to East End House in Cambridge, MA to help with the annual Thanksgiving Basket Giveaway. If this organization sounds familiar, you might have read our blog about our summer interns visiting the same location to work with students to build bridges made of toothpicks and gumdrops.

This time around, Vertica volunteers assisted with a program that provided food to individuals and families for Thanksgiving. On Monday, the team helped stuff hundreds of bags with donated goods like whole frozen turkeys, boxed stuffing, canned fruits and vegetables, potatoes, and even fresh kale. They bagged over 22 thousand Read More »
