Vertica

Author Archive

Boom Times for Boston’s Biggest Data

It’s boom times at HP Vertica – as the Boston area’s first and biggest Big Data technology provider, we continue to grow and expand our employee base and ecosystem.

We had a full house last week for our spring Open House, which gave us a chance to host friends of the company, including current (of course) and future employees, strategic partners and even members of our alumni network.

As with any growth business in a highly dynamic market, people occasionally decide to move on, but it was gratifying to have a number of former colleagues join us – some of whom have become customers, and a few of whom have even rejoined the company recently!

Over the next few weeks, HP Vertica will be taking things on the road to HP Discover Las Vegas and will be involved in some major strategic announcements regarding HP’s Big Data strategy, so watch this space for more.

We’re also going to be on the road for our Discover Performance series and a number of industry events – check our website for details.

And last but not at all least, we’ve got our HP Vertica Big Data Conference in early August – click here or on the image below to learn more and to register. We’re expecting another full house including our base of current and future customers, strategic partners and of course many of our colleagues from worldwide HP – sign up now, since space is limited and Early Bird Pricing expires June 28!

Register here

Join us on Monday for an Open House

We’ve been sprucing up our gorgeous office space with help from our friends at Engage Marketing Design and are looking forward to showing it off on Monday.

Please join us by clicking this link and RSVP’ing. All friends of HP Vertica are welcome though space is limited and filling fast!

It’s coming together nicely – see below for a few previews – we’ll be taking more photos on Monday and want you in them. Join us and learn more about Boston’s Biggest Data!

Taking a Moonshot at Big Data Analytics for Everyone

HP Vertica is very excited about Monday’s announcement of the HP Moonshot system.

Why? Because we believe that the combination of the HP Vertica Analytics Platform running on the HP Moonshot Servers offers a truly game-changing value proposition for a variety of customers, and new segments of the market.

Moonshot is, simply put, a groundbreaking system which offers customers the ability to rapidly deploy, scale and manage with dramatically lower space and energy constraints. While traditional IT services that support business functions will continue to be served by general purpose server infrastructure, a new computing platform is required for specialized workloads that can deliver innovative solutions to market at unprecedented speed and scale.

 

We’ve already successfully tested the HP Vertica Analytics Platform on HP Moonshot Servers, and achieved very comparable performance to traditional Big Data Analytics hardware across certain performance ranges, which for a large segment of the market is more than sufficient to handle their Big Data Analytics loads – while offering very significant potential cost, space and energy savings.

Running Vertica on Moonshot offers yet another proof point of the unmatched value provided by HP’s combination of Information Optimization solutions, and a great example of the opportunity created by innovation that makes us so excited to be a part of the greater OneHP.

To learn more about HP Project Moonshot, visit http://www.hp.com/go/moonshot

The Disruptive Power of Big Data

Aside from the sheer quantity of digital data created every day (about 2.5 exabytes[1]), there’s more to Big Data than volume. Big Data offers enterprise leaders the opportunity to dramatically change the way their organizations operate to gain competitive advantage and find new revenue opportunities. But realizing the value Big Data promises requires a new approach. Traditional data warehouses and business intelligence tools weren’t built for the scale of Big Data, and can’t provide insight quickly enough to be useful or even keep up.

But this isn’t just a case of data growth outstripping technology growth. Big Data embodies fundamental differences that necessitate new approaches and new technologies. Big Data takes many forms, three in particular we’ll discuss here:

  • Transactional data
  • Sentiment and perceptual data based on conversations taking place in social media
  • Data from networked sensors—the so-called “Internet of Things”

Transactional Data

As businesses have expanded (and expanded onto the Internet), the volume of business transactions has grown. The Economist reported in 2010 that Wal-Mart processes more than 1 million customer transactions every hour and maintains databases exceeding 2.5 petabytes (2.5 million gigabytes)[2]. Imagine how those numbers have grown since then.

What’s even more critical is that companies can now capture not just sales transactions, but the detailed histories and clickstreams that lead to the sale. From web-based clickstream analysis to call data records, pre- and post-transaction histories are more robust than ever—and our ability to collect, analyze and act on that data must adjust accordingly.

The social media explosion

Today’s online customer has progressed well beyond accessing information. Today’s consumers are not only interacting and collaborating with each other, but they’re also talking about and interacting with your brand. Facebook has more than 1 billion active subscribers[3], and it’s estimated they share almost 700,000 individual pieces of content every minute. On Twitter, more than a billion tweets go out every two to three days[4]. (You can watch them mapped geographically in real time at tweetping.net.)

Product reviews, user communities, forums and blogs allow consumers to generate content that contains critical insight for the business. The proliferation of user-generated content in these social channels has led to new techniques and tools for “sentiment analysis”: the ability to measure emotion to determine how your company and brand are perceived.

The Internet of Things

The amount of information generated by devices rather than people is also growing explosively.
Mobile devices, and the apps people use on them, regularly broadcast individuals’ location, performance and other factors to the network. Retailers and distributors are using radio frequency identification (RFID), bar codes and QR codes to track inventory and enhance their supply chain and inventory performance. The healthcare industry seeks to improve care and reduce costs through remote patient monitoring. The automotive industry is embedding sensors in vehicles. And utilities are beginning to rely on smart meters to track usage. McKinsey Global Institute reports that more than 30 million networked sensors are in use in the transportation, automotive, industrial, utilities and retail sectors, and the number is growing by 30 percent every year[5].

We recently presented a webinar on the Internet of Things and the Power of Sensor Data, which delves into this exciting area in much more detail.

Disrupting conventional analytics – developing a ‘conversational relationship with data’

Using Big Data to make operations more efficient, improve competitiveness and increase revenue is not about generating traditional statistics or producing standard reports.

Just as important as systems to collect and store data are systems to analyze and extract insight from that data. Without insight, you can’t gain new knowledge into your markets, your products and your operations.

When you have this insight at your disposal, you can act faster and with greater probability of success.

Extracting business value from Big Data requires a new approach. We believe that Big Data analytics is an iterative process. We describe it as developing a conversational relationship with your data. Analytics becomes a continuous improvement loop, which uses the results of analyses to frame better, more meaningful analyses, which, in turn, produce more definitive results. When results are available in minutes, analysts can ask, “What if?”

When properly applied, Big Data analytics enables business leaders to:

  • Understand market reaction and brand perception
  • Identify key buying factors
  • Segment populations to customize actions
  • Enable experimentation
  • Accurately predict outcomes
  • Reinvent and enhance inventory and supply chain systems and processes
  • Disrupt their industries, gain an edge over competitors and enable new business models

Big Data already proved its game-changing power during the 2012 U.S. presidential election. Obama campaign chairman Jim Messina said: “We were going to demand data on everything, we were going to measure everything…We were going to put an analytics team inside of us to study us the entire time to make sure we were being smart about things.”
And, in fact, Big Data analytics helped the Obama campaign ratchet up the three key levers in any election: voter registration, persuasion and turnout. Rolling Stone magazine singled out Messina and the campaign’s CTO, Harper Reed, as two among a handful of unsung heroes in Obama’s victory.

You can hear more about how HP Vertica contributed to the high-tech strategy behind Obama’s reelection in a recent webinar featuring Chris Wegrzyn, director of data architecture for the Democratic National Committee.

The traditional data warehouse won’t get it done

The concept of the data warehouse evolved in the 1980s. Then, data warehouses were simply databases into which data from multiple sources was consolidated for the purpose of query and reporting. But today, these systems fall short when confronted with the volume, velocity and variety of Big Data. Why? They fail to enable the conversational approach to data required by Big Data analytics.

Traditional databases and data warehouses don’t easily scale to the hundreds of terabytes or even petabytes needed for many Big Data applications. Data is often not compressed, so huge amounts of storage and I/O bandwidth are needed to load, store and retrieve data. Data is still stored in tables by row, so access to a single data element through many rows—a common operation in business analytics—requires retrieving practically all of the data in a dataset to extract the specific element(s) needed. That strains I/O bandwidth and extends processing time. We have seen cases where the velocity of incoming data exceeds the capacity of the system to load it into the database, and queries produce answers in hours rather than the seconds or minutes needed for iterative business analytics. As a result, systems cost too much to maintain, and they fail to deliver the insight business leaders seek.
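To make the row-versus-column point concrete, here is a minimal Python sketch. The records and field names are hypothetical, and the "fields touched" counter is only a stand-in for I/O; it does not describe how any particular database is implemented.

```python
# Hypothetical sales records; the field names are illustrative only.
rows = [
    {"order_id": 1, "region": "east", "amount": 20.0},
    {"order_id": 2, "region": "west", "amount": 35.5},
    {"order_id": 3, "region": "east", "amount": 12.25},
]


def total_from_rows(records):
    """Row layout: each full record is visited just to read one field."""
    fields_touched = 0
    total = 0.0
    for record in records:
        fields_touched += len(record)  # the whole row is retrieved
        total += record["amount"]
    return total, fields_touched


def to_columns(records):
    """Pivot row records into one list per column."""
    return {name: [r[name] for r in records] for name in records[0]}


def total_from_columns(columns):
    """Column layout: only the 'amount' column is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


columns = to_columns(rows)
row_total, row_fields = total_from_rows(rows)
col_total, col_fields = total_from_columns(columns)
```

Both layouts give the same total, but the row layout touches every field of every record while the column layout touches only the values it needs. With wide tables and billions of rows, that gap is what strains I/O bandwidth.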

Take sentiment analysis, for example. The goal is to extract meaningful information from unstructured data so results can be stored in databases and analyzed. But the formats of resulting data are less predictable, more varied and subject to change during iterative analytics. This requires frequent changes to relational database structure and to processes that load data into them. For IT, it means the iterative approach to extracting business insight from Big Data requires new approaches, new tools and new skills.
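As a rough illustration of turning unstructured text into records that can be stored and queried, the sketch below scores posts against two tiny word lists. The word lists, field names and sample posts are all hypothetical, and real sentiment analysis is far more sophisticated than counting words.

```python
import re

# Hypothetical, deliberately tiny lexicons for illustration only.
POSITIVE = {"love", "great", "fast"}
NEGATIVE = {"slow", "broken", "hate"}


def score(text):
    """Turn one free-text post into a structured record."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return {
        "text": text,
        "positive": positive,
        "negative": negative,
        "score": positive - negative,
    }


posts = [
    "I love the new phone, great camera",
    "Support was slow and the app is broken",
]
records = [score(post) for post in posts]
```

Note that adding a new signal (say, a count of product names mentioned) would add a new field to every record, which is exactly the kind of schema change that iterative analytics tends to produce.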

Challenges for business leaders

Big Data is not just a technical challenge. Gaining and applying business insight compels business leaders to adopt new and disruptive ways of thinking and working.
Successful leaders we have known in data-driven organizations become more familiar with the sources of data available to them. Rather than asking IT what information is available in the database, they view information as a key competitive asset and explore how insights might be extracted from it to offer immediate and sustainable competitive advantage.

A solution for Big Data analytics

HP Vertica Analytics Platform is a new kind of database designed from the ground up for business analytics at the scale of Big Data. Compared to traditional databases and data warehouses, it drives down the cost of capturing, storing and analyzing data. And it produces answers 50 to 1,000 times faster to enable the iterative, conversational analytics approach needed.

  • HP Vertica Analytics Platform compresses data to reduce storage costs and speed access by up to 90 percent.
  • It stores data by columns rather than rows and caches data in memory to make analytic queries 50 to 1,000 times faster.
  • It uses massively parallel processing (MPP) to spread huge data volumes over any hardware, including low-cost commodity servers.
  • It uses data replication, failover and recovery to achieve automatic high availability.
  • It includes a pre-packaged, in-database analytics library and a development framework to handle complex analytics.
  • It supports the R statistical programming language so analysts can create user-defined analytics inside the database.
  • It dynamically integrates with Hadoop to analyze large sets of structured, semi-structured and unstructured data.
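One reason column storage and compression go together is that the values within a single column tend to repeat, especially when the column is sorted. The sketch below uses run-length encoding, a generic compression technique chosen here purely for illustration; the column contents are hypothetical and this is not a description of the product's internals.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original list."""
    decoded = []
    for value, count in pairs:
        decoded.extend([value] * count)
    return decoded


# A hypothetical sorted column: nine stored values, three distinct runs.
region_column = ["east"] * 4 + ["west"] * 3 + ["north"] * 2
encoded = rle_encode(region_column)
```

Here nine values shrink to three pairs, and decoding restores the column exactly. The same data stored row by row would interleave regions with other fields and break up the runs.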

HP Vertica Analytics Platform means better, faster business insight at less cost.


Test drive the HP Vertica Analytics Platform at www.vertica.com/evaluate.


[1] “Big Data: The Management Revolution,” Andrew McAfee and Erik Brynjolfsson, Harvard Business Review, October 2012.

[2] “Data, data everywhere,” The Economist, Feb 25, 2010.

[3] Facebook key facts.

[4] http://www.mediabistro.com/alltwitter/tweetping_b35247

[5] “Big data: The next frontier for innovation, competition, and productivity,” The McKinsey Global Institute, June 2011.

BDOC – Big Data on Campus

I had a great time speaking at the MIT Sloan Sports Analytics Conference yesterday, and perhaps the most gratifying part of doing a panel in front of a packed house was how many students were in the audience. Having been a bit of a ‘stats geek’ during my college years, I can assure you that such an event, even with a sports theme, would never have drawn such an audience back then.

It was even more gratifying to read this weekend’s Wall Street Journal, which ran an article titled Data Crunchers Now The Cool Kids on Campus. Clearly this is a terrific time to be studying – and teaching – statistics and Big Data. To quote the article:

The explosive growth in data available to businesses and researchers has brought a surge in demand for people able to interpret and apply the vast new swaths of information, from the analysis of high-resolution medical images to improving the results of Internet search engines.

Schools have rushed to keep pace, offering college-level courses to high-school students, while colleges are teaching intro stats in packed lecture halls and expanding statistics departments when the budget allows.

 

Of course, Big Data training is not just for college students, and at HP Vertica we are working on programs to train both professionals and students in conjunction with our colleagues in the HP ExpertOne program. We invite those interested in learning more to contact us – including educational institutions that are interested in adding Big Data training to their curriculum.

The growth of Big Data, the demand for Data Scientists, and the power of Community

There was an interesting article in CIO last week, IT Departments Battle for Data Analytics Talent, which argues (along with a related McKinsey report) that by 2018 the US will be facing a massive shortage of analytics talent:

By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.

On a more personal note, I attended a holiday party this weekend where a parent was relating to me how their pre-college-age son was being advised to pursue ‘Data Scientist’ as a course of study because it is ‘hot’ (while asking me what exactly a ‘Data Scientist’ does).

But is the solution really just to throw more people at the problem? More importantly, is harnessing and leveraging Big Data really a labor problem, a technology problem, or a community problem?

At HP Vertica, we believe that the Big Data challenge will be met – and while we agree that Data Scientist will indeed be one of the hottest (if not sexiest) jobs of the 21st century, we are also confident that the power of community will allow companies to leverage technology to compensate for the demand for labor. Consequently, we have been making significant investments in the MyVertica community and have big plans in store for 2013.

Our friends at the Community Roundtable have put together a terrific set of materials around what it takes to build an active, engaged community – which aligns very well with our efforts to ‘socialize’ our organization.

Watch for much more in the year ahead, and if you’re not already a member of MyVertica, sign up today!

Optimizing Value – Creating a Conversational Relationship with Your Big Data

I spent most of the past week on the road, attending Gartner Symposium in Orlando and then, later in the week, Strata + Hadoop World in NYC. (For more, see my colleague Jeff Healey’s excellent recap of Hadoop World here.)

In the course of delivering the session ‘Big Data, Turning the information Overload into an Information Advantage’ with my colleague Jerome Levadoux of our sister company Autonomy, and in just walking the events in general, I spoke to many people and, unsurprisingly, found the interest level in Big Data continuing to skyrocket.

Some of the most notable comments came from those who had already begun to tackle the Big Data challenge, since so many are trying to uncover the fourth ‘V’ of Big Data: Value.

What I continue to hear is that the Value of effectively leveraging Big Data (or as we across HP like to call it ‘Information Optimization’) lies in fundamentally changing the relationship between the organization and the data. In particular, moving from static queries which take minutes, hours or sometimes days to run, to providing nearly-instantaneous answers that lead to more interactive ‘conversations’ with the data, completely changes how business executives perceive their data, and allows them to gain significantly more meaning and value.

Suddenly, it is no longer “specify the reports, set up the queries, run the reports, deliver to the business users” daily or weekly (rinse and repeat), but “I have a question, I need an answer.” Delivered in near real time via a platform such as Vertica, that answer leads quickly to follow-on questions, what-if scenarios, and a virtuous cycle that puts the data – and the Analysts/Data Scientists who provide access to it – in a much more strategic and business-critical role.

My colleague Jim Campbell discussed this during his visit to Cloudera’s booth at Hadoop World.

Live from Strata + Hadoop World 2012: Jim Campbell, Vertica from Cloudera.

If you want to take a live look at how Vertica can add game-changing Velocity to your organization’s conversations with your Big Data, sign up for an Evaluation today.