
Taking a Moonshot at Big Data Analytics for Everyone

HP Vertica is very excited about Monday’s announcement of the HP Moonshot system.

Why? Because we believe that the combination of the HP Vertica Analytics Platform running on the HP Moonshot Servers offers a truly game-changing value proposition for a variety of customers, and new segments of the market.

Moonshot is, simply put, a groundbreaking system which offers customers the ability to rapidly deploy, scale and manage with dramatically lower space and energy constraints. While traditional IT services that support business functions will continue to be served by general purpose server infrastructure, a new computing platform is required for specialized workloads that can deliver innovative solutions to market at unprecedented speed and scale.


We’ve already successfully tested the HP Vertica Analytics Platform on HP Moonshot Servers and achieved performance comparable to traditional Big Data Analytics hardware across certain performance ranges. For a large segment of the market, that is more than sufficient to handle Big Data Analytics loads, while offering potentially significant cost, space and energy savings.

Running Vertica on Moonshot offers yet another proof point of the unmatched value provided by HP’s combination of Information Optimization solutions, and a great example of the opportunity created by innovation that makes us so excited to be a part of the greater OneHP.

To learn more about HP Project Moonshot, visit

Join HP Vertica’s User-Driven Community!

We at HP Vertica are very excited to announce our new user-driven community that will help us better serve our users, partners, and anyone generally interested in learning more about the HP Vertica Analytics Platform. In partnership with GetSatisfaction, community members can now:

  • Start engaging with other customers to establish new and valuable relationships
  • Add context to your past issues and get the best possible answers to your questions
  • Wield your influence as subject matter expert in the community

Now, HP Vertica newbies as well as seasoned database/data analytics veterans and everyone in between can post questions, share ideas, report issues, and give praise. To make this information readily available for your convenient access, all questions are cataloged and searchable.

The new-and-improved community interface will enable you to easily access more information and better communicate with HP Vertica and other users of the HP Vertica Analytics Platform.

We welcome you to join our community by visiting or by accessing the community tab on the side of the homepage.

Big Data Analytics without Big Data Complexity

New analytics deployments can be complex, taking up to 18 months to implement and optimize. The complexity of maintaining and integrating these environments often results in missed deadlines, incomplete projects, increased costs, and lost opportunities. In fact, only 32 percent* of application deployments are rated as “successful” by organizations.

To remove this Big Data complexity, we are pleased to announce the general availability of the HP AppSystem for Vertica. Following through on the initial announcement at HP Discover as part of the HP AppSystems portfolio, the HP AppSystem for Vertica ensures system performance and reduces implementation time from months to a matter of hours.

But what is an AppSystem and is it right for you?

Built on the HP Converged Infrastructure, the new HP AppSystem for Vertica is a fully pre-integrated technology stack that includes a specifically optimized hardware configuration, factory pre-loaded OS, and the HP Vertica Analytics Platform environment.

HP AppSystem for Vertica is ideal for organizations interested in accelerating time-to-business value with high-performance, massively scalable analytics at each layer of IT infrastructure: server, storage, network, and management. As a result, you can scale seamlessly, while adding capacity as your analytics needs for Big Data evolve.

We encourage you to learn more about the HP AppSystem for Vertica and get started removing complexity to capitalize on your big data analytics initiatives.

* = CHAOS Summary 2009, Jim Johnson, Standish Group, April 2009

The Disruptive Power of Big Data

Aside from the sheer quantity of digital data created every day (about 2.5 exabytes [1]), there’s more to Big Data than volume. Big Data offers enterprise leaders the opportunity to dramatically change the way their organizations operate to gain competitive advantage and find new revenue opportunities. But realizing the value Big Data promises requires a new approach. Traditional data warehouses and business intelligence tools weren’t built for the scale of Big Data, and can’t provide insight quickly enough to be useful or even keep up.

But this isn’t just a case of data growth outstripping technology growth. Big Data embodies fundamental differences that necessitate new approaches and new technologies. Big Data takes many forms; we’ll discuss three in particular here:

  • Transactional data
  • Sentiment and perceptual data based on conversations taking place in social media
  • Data from networked sensors—the so-called “Internet of Things”

Transactional Data

As businesses have expanded, including onto the Internet, the volume of business transactions has grown. The Economist reported in 2010 that Wal-Mart processes more than 1 million customer transactions every hour and maintains databases exceeding 2.5 petabytes (a petabyte is one million gigabytes) [2]. Imagine how those numbers have grown since then.

What’s even more critical is that companies can now capture not just sales transactions, but the detailed histories and clickstreams that lead to the sale. From web-based clickstream analysis to call data records, pre- and post-transaction histories are more robust than ever—and our ability to collect, analyze and act on that data must adjust accordingly.

The social media explosion

Today’s online customer has progressed well beyond accessing information. Today’s consumers are not only interacting and collaborating with each other, but they’re talking about and interacting with your brand. Facebook has more than 1 billion active subscribers [3], and it’s estimated they share almost 700,000 individual pieces of content every minute. On Twitter, more than a billion tweets go out every two to three days [4]. (You can watch them mapped geographically in real-time at

Product reviews, user communities, forums and blogs allow consumers to generate content that contains critical insight for the business. The proliferation of user-generated content in these social channels has led to new techniques and tools for “sentiment analysis,” the ability to measure emotion to determine how your company and brand are perceived.

The Internet of Things

The amount of information generated by devices rather than people is also growing explosively.
Mobile devices, and the apps people use on them, regularly broadcast individuals’ location, performance and other factors to the network. Retailers and distributors are using radio frequency identification (RFID), bar and QR codes to track inventory and enhance their supply chain and inventory performance. The healthcare industry seeks to improve care and reduce costs through remote patient monitoring. The automotive industry is embedding sensors in vehicles. And utilities are beginning to rely on smart meters to track usage. McKinsey Global Institute reports that more than 30 million networked sensors are in use in the transportation, automotive, industrial, utilities and retail sectors, and the number is growing by 30 percent every year [5].

We recently presented a webinar on the Internet of Things and the Power of Sensor Data, which delves into this exciting area in much more detail.

Disrupting conventional analytics – developing a ‘conversational relationship with data’

Using Big Data to make operations more efficient, improve competitiveness and increase revenue is not about generating traditional statistics or producing standard reports.

Just as important as systems to collect and store data are systems to analyze and extract insight from that data. Without insight, you can’t gain new knowledge into your markets, your products and your operations.

When you have this insight at your disposal, you can act faster and with greater probability of success.

Extracting business value from Big Data requires a new approach. We believe that Big Data analytics is an iterative process. We describe it as developing a conversational relationship with your data. Analytics becomes a continuous improvement loop, which uses the results of analyses to frame better, more meaningful analyses, which, in turn, produce more definitive results. When results are available in minutes, analysts can ask, “What if?”

When properly applied, Big Data analytics enables business leaders to:

  • Understand market reaction and brand perception
  • Identify key buying factors
  • Segment populations to customize actions
  • Enable experimentation
  • Accurately predict outcomes
  • Reinvent and enhance inventory and supply chain systems and processes
  • Disrupt their industries, gain an edge over competitors and enable new business models

Big Data already proved its game-changing power during the 2012 U.S. presidential election. Obama campaign chairman Jim Messina said: “We were going to demand data on everything, we were going to measure everything…We were going to put an analytics team inside of us to study us the entire time to make sure we were being smart about things.”
And, in fact, Big Data analytics helped the Obama campaign ratchet up the three key levers in any election: voter registration, persuasion and turnout. Rolling Stone magazine singled out Messina and the campaign’s CTO, Harper Reed, as two among a handful of unsung heroes in Obama’s victory.

You can hear more about how HP Vertica contributed to the high-tech strategy behind Obama’s reelection in a recent webinar featuring Chris Wegrzyn, director of data architecture for the Democratic National Committee.

The traditional data warehouse won’t get it done

The concept of the data warehouse evolved in the 1980s. Then, data warehouses were simply databases into which data from multiple sources was consolidated for the purpose of query and reporting. But today, these systems fall short when confronted with the volume, velocity and variety of Big Data. Why? They fail to enable the conversational approach to data required by Big Data analytics.

Traditional databases and data warehouses don’t easily scale to the hundreds of terabytes or even petabytes needed for many Big Data applications. Data is often not compressed, so huge amounts of storage and I/O bandwidth are needed to load, store and retrieve data. Data is still stored in tables by row, so access to a single data element through many rows—a common operation in business analytics—requires retrieving practically all of the data in a dataset to extract the specific element(s) needed. That strains I/O bandwidth and extends processing time. We have seen cases where the velocity of incoming data exceeds the capacity of the system to load it into the database, and queries produce answers in hours rather than the seconds or minutes needed for iterative business analytics. As a result, systems cost too much to maintain, and they fail to deliver the insight business leaders seek.
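To make the row-versus-column point concrete, here is a minimal Python sketch. It is an illustration only, not how any particular database is implemented: the toy table, its column names, and both helper functions are hypothetical. It counts how many field values each layout must touch to total a single column.

```python
# Hypothetical toy table: each row is (order_id, region, amount).
rows = [
    (1, "east", 20.0),
    (2, "west", 35.5),
    (3, "east", 12.5),
    (4, "north", 40.0),
]


def sum_amount_row_store(rows):
    """Row layout: whole rows are fetched even though one field is needed."""
    fields_read = 0
    total = 0.0
    for row in rows:
        fields_read += len(row)  # every field in the row is read
        total += row[2]          # only the amount is actually used
    return total, fields_read


# Column layout: the same data, stored as one list per column.
columns = {
    "order_id": [r[0] for r in rows],
    "region": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def sum_amount_column_store(columns):
    """Column layout: only the one column the query needs is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


row_total, row_fields = sum_amount_row_store(rows)
col_total, col_fields = sum_amount_column_store(columns)
print(row_total, row_fields)
print(col_total, col_fields)
```

Both layouts return the same total, but the row layout touches every field of every row while the column layout touches only the values in the one column. On a real table with dozens of columns and billions of rows, that difference is what strains I/O bandwidth.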

Take sentiment analysis, for example. The goal is to extract meaningful information from unstructured data so results can be stored in databases and analyzed. But the formats of resulting data are less predictable, more varied and subject to change during iterative analytics. This requires frequent changes to relational database structure and to processes that load data into them. For IT, it means the iterative approach to extracting business insight from Big Data requires new approaches, new tools and new skills.

Challenges for business leaders

Big Data is not just a technical challenge. Gaining and applying business insight compels business leaders to adopt new and disruptive ways of thinking and working.
Successful leaders we have known in data-driven organizations become more familiar with the sources of data available to them. Rather than asking IT what information is available in the database, they view information as a key competitive asset and explore how insights might be extracted from it to offer immediate and sustainable competitive advantage.

A solution for Big Data analytics

HP Vertica Analytics Platform is a new kind of database designed from the ground up for business analytics at the scale of Big Data. Compared to traditional databases and data warehouses, it drives down the cost of capturing, storing and analyzing data. And it produces answers 50 to 1,000 times faster to enable the iterative, conversational analytics approach needed.

  • HP Vertica Analytics Platform compresses data to reduce storage costs and speed access by up to 90 percent.
  • It stores data by columns rather than rows and caches data in memory to make analytic queries 50 to 1,000 times faster.
  • It uses massively parallel processing (MPP) to spread huge data volumes over any hardware, including low-cost commodity servers.
  • It uses data replication, failover and recovery to achieve automatic high availability.
  • It includes a pre-packaged, in-database analytics library and a development framework to handle complex analytics.
  • It supports the R statistical programming language so analysts can create user-defined analytics inside the database.
  • It dynamically integrates with Hadoop to analyze large sets of structured, semi-structured and unstructured data.
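As a rough illustration of why column storage and compression go together, the sketch below applies run-length encoding to a sorted column. This is an assumption made for illustration: the text above does not say which encodings the platform uses, and the function names and sample column here are hypothetical.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# Hypothetical sorted column: values of one type stored together repeat often.
region_column = ["east"] * 5 + ["north"] * 2 + ["west"] * 3

runs = rle_encode(region_column)
print(runs)
print(len(region_column), "values stored as", len(runs), "runs")
```

Because a column holds values of a single type, and sorted columns hold long runs of identical values, ten stored values shrink to three pairs here. Less data on disk also means less data to read back for each query.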

HP Vertica Analytics Platform means better, faster business insight at less cost.

Test drive the HP Vertica Analytics Platform at

[1] “Big Data: The Management Revolution,” Andrew McAfee and Erik Brynjolfsson, Harvard Business Review, October 2012.

[2] “Data, data everywhere,” The Economist, Feb 25, 2010.

[3] Facebook key facts.


[5] “Big data: The next frontier for innovation, competition, and productivity,” The McKinsey Global Institute, June 2011.

Startup Rink

For years, I’ve enjoyed working at Vertica, part of a culture where developers aren’t encumbered by bureaucracy, there is a true meritocracy, and we focus on efficiently delivering meaningful features to customers. I’ve been impressed through the years by the commitment, hard work, and truly impressive accomplishments of my colleagues. It takes an incredible team to build a product, like the original Vertica Analytics Database (now known as the HP Vertica Analytics Platform), from scratch, and tackle complex distributed systems and scalability challenges — it is also a lot of fun, especially with this group.

After HP acquired Vertica over a year and a half ago, I was glad to see the startup culture continue to thrive. The acquisition did bring about some change, which has overall been very positive. The engineering group has benefited from a wealth of resources at HP, including new toys, mostly in the form of hardware, and newfound relationships with the talented folks at HP Labs and in other business units.

It is my great fortune to work with truly talented developers, who have greatly impacted my personal and career growth. The challenges we’ve faced have worked to strengthen their influence. During a recent holiday project, I leaned on lessons learned from my colleagues. Interestingly, the project had nothing to do with my profession.

What does building a backyard, or, in my case front yard, skating rink have to do with a startup experience?

For starters, you hear lots of reasons why you shouldn’t do it. Building a rink is an impractical project, especially in my geographical location. It is relatively expensive compared to skating at a public rink — the cost is roughly what many pay for a few months of cable, but for something that you don’t mind your kids doing for hours each day. It is a lot of work. I call it exercise, something I need more of this time of year. At best, temperatures will remain cold enough to sustain five or six weeks of skating. As I got started, I heard all about how the ground didn’t freeze at all last winter.

To complete a project like this one must filter criticism appropriately. The folks at my local box store were very helpful in improving my rink design while others contributed only negative comments. I’m certain a good many of my neighbors think I am crazy. I was a little concerned when two fire engines came down my street while I was flooding the rink. It turns out that they were carrying Santa Claus on display for kids; his sleigh must have been getting tuned for his big day.

Perhaps most importantly, you have to be able to rebound when things don’t go as planned. I broke my back (at least it felt that way) framing the rink. What I didn’t count on was a lot of rain, followed by a fair amount of snow. These conditions added additional weight to the rink and made the ground extremely soggy (it was mush to a depth of more than one foot in some areas). Consequently, the deep end of the rink burst at one corner; the ground isn’t perfectly level.

I’m certain that I looked crazed as I hurried to mend the damage before the rink fell apart completely. Once things stabilized, I could see that the ground wasn’t holding. The stakes were leaning and the rink was in great jeopardy. I felt defeated. I thought about giving up. I’d invested a lot of time and energy and wasted some money on this foolish project. Comments from the naysayers filled my head. But, as I said earlier, I’ve had the good fortune of working on challenging projects with colleagues who know how to make things work in the face of adversity. I didn’t need to consult them. I knew how they’d react. I’ve seen the same scenario play out dozens of times at work. After I cleared my head and got a pep talk from my wife, I doubled down on my efforts and made a serious attempt to salvage the rink. There was no guarantee of success; things looked bleak.

Thankfully, hard work paid off. It usually does, but there are times when, despite good intentions and best efforts, things don’t work out as intended. When that happens you’re left with valuable lessons learned. And, in that case, next year’s rink will be a success.

A few days after the rink was repaired, Mother Nature did her part. The rink has been in operation for a couple of days now. Already, the work has been worthwhile. My family has had some very memorable times out there, like the time my three-year-old daughter amazed us with her on-ice impression of Prof. Hinkle chasing Frosty down a hill as she laughed hysterically, or watching my five-year-old son give my wife a celebratory hug after imagining winning the Stanley Cup for the 1,000th time with another amazing goal.

With any luck, we’ve got a few more weeks to enjoy the cold weather. Now I’ve got to head out to resurface the ice with the homeboni I built (see image) so there’s a fresh sheet for the kids to skate on tomorrow.


Top 4 Considerations When Evaluating a Data Analytics Platform

From fraud detection to clickstream analytics to simply building better products or delivering a better customer experience, Big Data use cases abound, with analytics at the core.

With a solid business or use case in place, the next step that organizations typically take is to investigate and evaluate the analytics technology they will use for their analysis, often starting with a data analytics platform. But what are the requirements on which to base your evaluation?

The Winter Corporation, the large-scale data experts, just finalized an in-depth white paper (The HP Vertica Analytics Platform: Large Scale Use and Advanced Analytics) that reports findings from its evaluation, independent research, customer and employee interviews, and documentation review.

Intended for a more technical audience, this white paper focuses on key evaluation criteria that your organization can use as a guide as you conduct your own evaluation.



Winter Corporation identified these key feature areas as critical for any data analytics platform:

1. Architecture
• Column store architecture
• Shared nothing parallelism
• Cluster size and elasticity
• Smart K-Safety based availability
• Hybrid storage model
• Multiple database isolation modes
• Both bulk load and trickle feed

2. Performance
• Extensive data compression and data encoding
• Read-optimized storage
• Highly parallel operation
• Storage of multiple projections
• Automatic physical database design

3. General Useful and Noteworthy Features for Large-Scale Use
• Export-import
• Backup/restore
• Workload analyzer
• Workload management
• Role-based security

4. Extensions for Advanced Analytics
• SQL extensions
• Built-in functions
• User-defined extensions
• Flexibility in accessing and analyzing all data (structured, semistructured, or unstructured)

Finally, once you have evaluated and confirmed that the data analytics platform meets your feature and technology requirements, you want to hear from other organizations that have deployed large-scale analytics initiatives in real-world environments.

The white paper concludes with a write-up on how Zynga, a social game services company with more than 240 million users of its online games, stores the actions of every player in every game — about 6 TB per day of data — in near-real time in the HP Vertica Analytics Platform. No matter where in the world a game event occurs, the data can be retrieved via a report or query from the central HP Vertica database no more than five minutes later.
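For a sense of scale, the back-of-the-envelope calculation below turns the quoted 6 TB per day into an average sustained load rate. It assumes decimal terabytes (10^12 bytes) and a perfectly even arrival rate, neither of which the white paper summary states; real game traffic is likely to be bursty, so peak rates would be higher.

```python
# Assumption: 1 TB = 10**12 bytes (decimal), and data arrives evenly all day.
TB = 10**12
MB = 10**6
GB = 10**9

daily_bytes = 6 * TB
seconds_per_day = 24 * 60 * 60

# Average rate the database must load continuously to keep up.
avg_mb_per_second = daily_bytes / seconds_per_day / MB

# Data arriving during one five-minute freshness window.
five_minute_gb = daily_bytes * (5 * 60) / seconds_per_day / GB

print(f"average load rate: {avg_mb_per_second:.1f} MB/s")
print(f"data per five-minute window: {five_minute_gb:.1f} GB")
```

Under these assumptions the system ingests roughly 69 MB every second, and about 21 GB of new data arrives within each five-minute window in which it must become queryable.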

Optimizing Value – Creating a Conversational Relationship with Your Big Data

I spent most of the past week on the road, attending Gartner Symposium in Orlando and then later in the week at Strata Hadoop World in NYC. (For more, see my colleague Jeff Healey’s excellent recap of Hadoop World here.)

In the course of presenting the session ‘Big Data, Turning the Information Overload into an Information Advantage’ with my colleague Jerome Levadoux of our sister company Autonomy, and in just walking the events in general, I spoke to many people, and unsurprisingly found the interest level in Big Data continuing to skyrocket.

Some of the most notable comments came from those who had already begun to tackle the Big Data challenge, since so many are trying to uncover the fourth ‘V’ of Big Data: Value.

What I continue to hear is that the Value of effectively leveraging Big Data (or as we across HP like to call it ‘Information Optimization’) lies in fundamentally changing the relationship between the organization and the data. In particular, moving from static queries which take minutes, hours or sometimes days to run, to providing nearly-instantaneous answers that lead to more interactive ‘conversations’ with the data, completely changes how business executives perceive their data, and allows them to gain significantly more meaning and value.

Suddenly, it is no longer “specify the reports, set up the queries, run the reports, deliver to the business users” daily or weekly (rinse and repeat), but “I have a question, I need an answer”, which delivered in near-real-time via a platform such as Vertica then leads quickly to follow-on questions, what-if scenarios, and a virtuous cycle that puts the data – and the Analysts/Data Scientists who provide access to it – in a much more strategic and business-critical role.

My colleague Jim Campbell discussed this during his visit to Cloudera’s booth at Hadoop World.


If you want to take a live look at how Vertica can add game-changing Velocity to your organization’s conversations with your Big Data, sign up for an Evaluation today.
