Vertica

Author Archive

MySQL Ate My Homework: Five Reasons You Should Always Use a Subpar Data Platform


subpar

adjective

falling short of a standard <the service at the restaurant was subpar, to say the least>

Synonyms: bush, bush-league, crummy (also crumby), deficient, dissatisfactory, ill, inferior, lame, lousy, off, paltry, poor, punk, sour, suboptimal, subpar, substandard, unacceptable, unsatisfactory, wack [slang], wanting, wretched, wrong

Related Words: abysmal, atrocious, awful, brutal, damnable, deplorable, detestable, disastrous, dreadful, execrable, gnarly [slang], horrendous, horrible, pathetic, stinky, sucky [slang], terrible, unspeakable; defective, faulty, flawed; egregious, flagrant, gross; bum, cheesy, coarse, common, crappy [slang], cut-rate, junky, lesser, low-grade, low-rent, mediocre, miserable, reprehensible, rotten, rubbishy, second-rate, shoddy, sleazy, trashy; abominable, odious, vile; useless, valueless, worthless; inadequate, insufficient, lacking, meager (or meagre), mean, miserly, scanty, shabby, short, skimp, skimpy, spare, stingy; miscreant, scurrilous, villainous; counterfeit, fake, phony (also phoney), sham

In the big data technology industry, we spend most of our time writing blogs and whitepapers about our technology.  I’m sure you’ve heard this before: “Our technology is great…it’s the best…most functional…top-notch” and so forth.  But we never really discuss when someone might want to use less effective technology – systems that may be more raw, or less suited to the task, or that have no vendor behind them.  Sure, these systems can break easily or might not do everything you want, but some of these technologies have tens of thousands of users around the world. So, they must be valid choices, right?

So, when should less effective technology be used?  Based on many years in the IT trenches, here is my countdown of the top five reasons you should use a subpar big data platform.

Caution: sarcasm ahead with a mostly serious ending which actually makes a point

Reason Number 5: Not invented here, dude

Who wants to be boring and pick existing technology that is solid… and works?  By rolling your own, you get serious technical chops.  What’s that knocking sound? That’s O’Reilly Media at your door…they want you to write a book!  Seriously, reinvention is under-rated.  Sure, relational databases have been around for forty-plus years, but reinventing transaction semantics or indexing would be seriously cool!  Give it a funny name and pick a cute animal for the logo and…voila!  Tech cred!

Furthermore, using off-the-shelf technology tends to create a situation some IT shops dread: transparency. What? The executives understand the technology we’re using well enough to monitor progress with it?  Time to throw it out and build something arcane from scratch to control what the execs see!

Reason Number 4: It’s free



When I was seven, one of my dad’s friends came by to visit around the holidays.  He gave me a kitten.  My dad got seriously steamed, and my mom looked like somebody had just sneezed in her soup.  But the kitten was free, right?  Three illnesses, a few injuries, and one or two thousand dollars later, and coupled with a year or so cleaning a litterbox, I realized that the kitten was not – in fact – free.

But we’re talking about software here.  Isn’t that different?  Free means you don’t need to deal with a sales guy and some engineer who’ll help you set things up in an hour.  You just go through a few websites and download four RPMs, the Java SDK, a Java JRE, five or six utilities, upgrade your OS, downgrade your OS, grab some runtime libraries for Linux, the Eclipse IDE, a downgraded version of the Eclipse IDE that’s required by the plug-in you’re about to download, and an Eclipse plug-in which kinda does most of what you need and…voila!  You can run the “hello world” example.  So free must be good, right?  Now, fire up “Getting Better” by the Beatles on your iPod and get to work!

Reason Number 3: You’ve got all the time in the world


Yeah, the business folks are in a panic about losing market share, and the CIO is a little bent out of shape about the fact that the IT budget has been going up at 15% every year, but what’s the big rush?  After all, the prospectus for that O’Reilly book needs to be seriously heavy stuff to have a chance of getting anywhere.  So dig into the technology!  Science projects can be fun when you’re doing science.  Hey, do those hardware guys really think that putting data on the disk tracks closer to the spindle will improve read times by 0.01%??  That sounds fun to test!  We can write a hack in HDFS for that!  Of course, the only way we can tell is on a cluster that has at least a thousand nodes.  The good news is that with modern cloud technologies, it’ll take only six months and ten people to test it!  The business can wait a little longer.

Reason Number 2: It’s cool


Does anything really need to be said here?  Cool + Not Invented Here = Happy Technologists = Productivity, right?

And (drum roll please)…

Reason Number 1: You like risk


Do you fly by those ancient thirty-year-olds on your kitesurfing rig wondering why they still use something as yesterday as a windsurfer?  Is base jumping from old-school spots like the KL Tower yesterday’s news for you?  Well, risk on!  In the stock market, risk=volatility=upside, right?  And the worst that can happen is the dollar value of your investment hits zero.  Why should it be any different with technology?  If you’re not base jumping from that erupting volcano, you’re not alive.  So bring together the adrenaline rush and the upside potential of adopting something which looks like it isn’t ready so that, in the event it ever gets to be what you need, you’re ahead of the curve!


Summing up, Seriously

While this piece – so far – has been very sarcastic, there’s a nugget of truth hidden within.  Businesses globally choose subpar technology every day believing that it will solve their problems.  And while they rarely select such technologies based on my sarcastic “top five” list above, they often select these technologies with the mistaken belief that they’re cheaper/better/faster, etc.

Today businesses don’t have to select subpar technologies for big data analytics.  Two years ago, Vertica opted to release the Vertica Community Edition.  This release of Vertica offers the full functionality of the product on up to one terabyte of raw data and a three node cluster.  Furthermore, it now includes the full functionality of Vertica’s sentiment scoring engine (Pulse), Vertica’s geospatial add-in (Place), and Vertica’s auto-schematizer for things like JSON data (FlexZone). I tried to talk the Vertica team out of offering so much for free!  But the team wants to share this with the world so organizations no longer have to settle for a subpar data platform.  It’s hard to argue with that!

So, if you want to try Vertica CE today, click here.

In my twenty-plus years of working with databases, I’ve installed and worked with just about every commercially available database under the sun, including Vertica.  And out of all of them, Vertica has been the easiest to stand up, the most powerful, and the highest quality.  Try it.  Seriously.

Don’t go for the subpar stuff, because you don’t have to.

That Giant Sucking Sound is Your Big Data Science Project


Vertica recently hosted its second annual Big Data Conference in Boston, Massachusetts. It was very well attended, with over eight hundred folks and about two hundred companies represented. We at Vertica love these events for a few reasons – first, because our customers tend to be our best spokespeople for what is a sound product, and also because it’s a chance for us to learn from them.

In one of the sessions, the presenter asked the audience how many of them had Hadoop installed today. Almost all the hands went up. This wasn’t too surprising given that the session was on Hadoop and Vertica integration. Then the presenter asked how many of those folks had actually paid for Hadoop. Most of the hands went down. Then the presenter asked how many of those folks felt that they were getting business value out of their investment. Only two or three hands stayed up. This was eye-opening for us at HP, and it was surprising to the audience as well. Everyone seemed to think they were doing something wrong with Hadoop that was causing them to miss out on the value.

Over the next few days, I made a point to track down folks in the audience I knew and get their thoughts on what the issues were. Since most of them were Vertica customers, I knew many of them already. I thought it would be helpful to identify the signs indicative of a big data science project – a project where a team has installed something like Hadoop and is experimenting with it in the hope of achieving some new analytic insights, but isn’t on a clear path to deriving value out of it. Some clear themes emerged, and they align with what my colleagues in the industry and I have been observing over the last few years. So, without further ado, here are the top five signs that you may have a big data science project in your enterprise:

    1. The project isn’t tied to business value, but has lots of urgency. Somebody on the leadership team went to a big data presentation and has hit the panic button. As a result, the team rushes ahead and does…something. And maybe splinters into different teams doing different things. We all know how well this will turn out.
    2. The technologies were chosen primarily because they beef up resumes. There’s so much hype around big data and the shortage of people with relevant skills that salaries are inflated. And in the face of a project with high urgency, nobody wants to stand still. So download some software! That open source stuff is great, right? While it’s generally true that multiple technologies can solve the same big data problems, some will fit with the business more readily than others. Maybe they’re easier to deploy. Maybe they don’t require extensive skill retooling for the staff. Maybe the TCO is better. Those are all good things to keep in mind during technology selection. But selecting technology for “resume polishing”? Not so much.
    3. The project is burdened with too much process. Most organizations already have well-defined governance processes in place for technology projects. And, so the reasoning goes, big data is basically just a bunch more of the same data and same old reporting & analysis. So when it’s time to undertake a highly experimental big data analytics project which requires agility and adaptability, rigid process usually results in a risk-averse mindset where failure at any level is seen as a bad thing. For projects like these, failure during the experimentation isn’t just expected, it’s a critical part of innovation.
    4. The “can’t-do” attitude. It’s been a well understood fact of life for decades that IT departments often feel under siege – the business always asks for too much, never knows what it wants, and wants it yesterday. As a result, the prevailing attitude in many IT teams today is to start by saying “no”, and then line up a set of justifications for why radical change is bad.
    5. The decision-making impedance mismatch. Sometimes, organizations need to move fast to develop their insights. Maybe it’s driven by the competition, or maybe it’s driven by a change in leadership. And…then they move slooooowly, and miss the opportunity. Other times, the change represents a big one with impact across the company, and requires extensive buy-in and consensus. And…then it moves at a breakneck pace and causes the organization to develop antibodies and reject the project.

     

So if your organization has one or more big data projects underway, ask whether it suffers from any of these issues. If so, you may have a big data science project on your hands.

Is Big Data Giving You Grief? Part 5: Acceptance

“We can do this”

Over the last month or so, this series has discussed how organizations often deal with a missed big data opportunity in ways that closely resemble the grieving process, and how that process maps to the commonly understood five stages of grief: denial, anger, bargaining, depression, and acceptance. This is the last entry in the series; it focuses on how an organization can move forward effectively with a big data project.

While big data is big, complicated, fast, and so forth, it is also very vague to most businesses. I was at an event recently where a poll question was asked of a room full of technology professionals – “How important is big data to your business?” A surprisingly high number of respondents felt that big data wasn’t relevant to them. Afterwards, I spoke with one of the attendees over lunch. I asked him what the primary challenges were to his business. It turns out that their business costs rely primarily on commodity costs – if the price of an input such as oil goes up or the supply is disrupted, the entire business is affected. I asked him whether he thought social media was relevant to his business, and he didn’t believe so. I then talked about how hedge funds have found that Tweets can be a very effective way of predicting commodity prices and availability disruptions. Until that moment, he was unaware that this was possible. This was what I call a “light bulb” moment. Suddenly, the appeal of big data became clear.

This experience highlighted for me a fundamental issue I see daily in the big data space – that it’s just too big (and vague) for many organizations to grasp its tangible value, which is an important prerequisite to moving forward. So even while they go through all the stages of grief and struggle with the fact that their competitors may be outperforming them due to big data, companies also struggle with how to turn that into a plan of action.

Once they’ve worked their way through the realization that something’s wrong, organizations are often ready to take action. Here are some of the most helpful approaches I’ve seen businesses take over the years to begin an effective big data program – to accept the reality of the situation, and move forward.

Execute tactically, think strategically
For the organization first tackling big data, this is probably the most important thing to keep in mind. Big data projects rarely start with a crystal clear vision of what the strategic outcome should be. Uncertainty and hype around the opportunity, unfamiliarity with how to handle big data, lack of a data science competence, and so forth all create challenges that make it tough to articulate an up-front strategic vision.

But don’t interpret that as a pass to ignore the potential impact of a big data project. Thus the advice. Execute the project tactically – be prepared to move fast with the aim to demonstrate value quickly. And when the project is complete, a debrief with the business leadership is essential. In this debrief, answer two questions: How did applying big data matter to the business? And given what we’ve learned, how can our next project impact the business in a bigger way?

The answers are inputs to the next project, and over time can serve as a powerful guide to articulating a big data strategy for the business.

Don’t boil the ocean
Very often, when a group of people from an organization attend a big data event, they all come back very enthused about big data projects. Vendors love to talk about big-picture, blue sky notions of transforming businesses or industries with big data. It’s exciting stuff, but doesn’t lend itself to immediate action – especially for a business new to big data.

So don’t start there.

A much better approach is to identify measurable goals that can be tied to actions that can be completed in the right timeframe. What’s “the right timeframe”? Good question! In part, it depends on how open the business is to a big data initiative – if the leadership team is bearish on the idea and needs powerful convincing, it’ll be important to demonstrate value quickly. Also, immediacy is a powerful guide to enthusiasm – so don’t tell the IT team to disappear for a year and come back with a big data architecture. There’s no immediacy, and as a result there likely won’t be much focus. So don’t boil the ocean and try to do everything at once, in a big hurry. Start with focus, and retain it as you progress.

One foot in front of the other (and sometimes…baby steps!)
When an organization wakes up and realizes that it’s at risk of being left behind or otherwise outperformed by others due to big data, the first response can be panic. The CEO or CMO may set a goal for the team – catch up. This can kick everyone into overdrive quickly, which is great. But it can also set everyone running in different directions with a vague charter to do something to change the business…now!

The tendency is to start chasing the Big Goal – maybe something dramatic like “reinvent the business”. For the organization new to big data, this is a recipe for trouble. Developing any new core competence takes time, and nobody starts as an expert. Learning to incorporate big data into your business is the same thing. It’s probably not realistic to expect a team accustomed to managing enterprise applications (which might all be running on a twenty-year-old technology stack) to learn massively parallel technologies, large scale data management and data science in a week. Or a month. Or a year.

So put one foot in front of the other. Don’t expect to master big data overnight, and instead take measured steps. Pick a project with a strong return on investment to get stakeholders on board and get the technology team’s feet wet in new technology. Then make the next project somewhat more ambitious. As the team learns more about delivering these projects, it’ll be much more natural to assess larger questions such as revising technology architecture.

It’s not too late
Marketing is marketing and reality is reality. Just because one of your competitors released a success story about their big data program last week doesn’t mean that there’s no benefit for your company. And when an article shows up online or in the printed media that declares that the big data war is over, and you lost if you’re not one of a handful of companies – take it with a huge grain of salt. There’s nothing wrong with a big data project that makes your business more profitable, or drives more top line revenue. And while it’s fun to contemplate reinventing your company, there are plenty of practical (and do-able) opportunities for improving revenue, customer experience, efficiency, etc. So don’t think for a moment that it’s too late.

Furthermore, by waiting a bit, organizations can take advantage of the learnings of others – things to do, things to avoid, and so forth. And the tools will usually improve. And successful use cases will become easier to spot. All these factors will reduce the risk to your big data project, and increase the likelihood of success. So it’s not too late.

To Accept or Not
Sadly, not all organizations make it to this stage. I’ve seen companies get stuck in finger pointing exercises, or trapped in endless cycles of ill-defined big data “science projects” that never seem to produce anything tangible and never end, or even put on blinders and avoid big data completely. But for companies who get to a place where they’re ready to accept the challenge, there are opportunities to meaningfully impact the business. And there are frequently increasing returns on well-crafted big data projects – which is to say that for every additional dollar spent over time, the value to the business actually increases. I’ve seen this cycle unfold time and time again, and in every single case of which I’m aware, the organization has reached the stage I’m referring to as “acceptance”, and is moving forward in a well-planned fashion with an effective big data program.

In fact, as I write this I’m listening to the HP Vertica Customer Advisory Board talk about their experiences to date with Vertica. And every one of them has approached their big data program in the ways described above. And every one of them has discovered increasing returns to their big data investment over time.

So put big data grief aside, accept that big data can help your business, and get started!

Is Big Data Giving You Grief? Part 4: Depression

“The problem is too big.  How can we possibly address it?”

Continuing the five part series which explores how organizations coping with big data often go through a process that closely resembles grief, this segment addresses the point at which the organization finally grasps the reality of big data and realizes the magnitude of the opportunity and challenge…and gets depressed about the reality of it.

Having seen this more than once, I’ve observed a few ways this shows in an organization.  Here are the most common reactions.

It’s too big

This reaction makes sense.  After all, as much as we in the industry say that “big data” is more than big and describe it with a laundry list of varying attributes, we all agree that it’s big.  It represents addressing data at a scale never before attempted by most organizations.  It represents analytic abilities perhaps never done before – and a capability pivot towards being an analytics-driven company.  And it may represent opportunities that are so big they appear to be nebulous: “If I capture ten thousand times as much data about my product, how does that translate into value?  Does that mean I’ll sell ten thousand times as many widgets?  How do I quantify the payoff?”

It may be challenging just to get a handle on the costs of a big data program for reasons mentioned in earlier parts of this series, much less the potential payoff.  This can make for a very challenging return-on-investment calculation.

We’re not ready

I believe I may have heard this particular form of worry more than anything else.  The infrastructure isn’t ready, the people aren’t ready to build big data applications, the business isn’t ready to consume the new data, and so on.  And, in fact, the company may not be prepared to size the big data effort because the team may not have the know-how for the ROI calculation (see above).  Also, the executive leadership may be unprepared to make a strategic wager on the program because of the uncertainty around the risks and benefits.

This can seem like a true show-stopper.  It’s not easy to change an organization.  Skills and technologies may not appear to be aligned with big data needs.  The various lines of business may not realize the ways they can improve or revolutionize their business.  The leadership team may be unaccustomed to making big bets on unproven technologies, or may believe that big data is a fad and will pass.

We’re too late

I hear this a lot too.  Everywhere a business turns today there’s a story about how someone has transformed their business, created new markets, broken old barriers, etc.  It’s easy to believe that all the opportunity is gone – that there’s no more benefit to tackling big data because it’s already been done.  It’s also easy to believe that it would be impossible to “catch up” with others because of all the time and effort required.

While this can be an intimidating belief, it can also be hard to characterize accurately.  After all, do you think your competitors will announce that the big data project they recently publicized in the media is a year late and $10M USD over budget?  Instead, they’ll play it up as if it’s a runaway success.  Vendors help this along too – who wouldn’t want to tout that their product helped a company?

As the saying goes, “The darkest hour is just before the dawn.”  These sage words were written long before computers, but they apply to this situation.  This is actually a positive place to be, because once a team has moved through denial, anger, and bargaining, and into depression, it’s ready to come to terms with the situation and make an action plan to move forward.  I’ll discuss that next week in the final part of this series: acceptance.

Next week the series concludes with…acceptance.  “We can do this.”

Is Big Data Giving You Grief? Part 3: Bargaining


“Can’t we work with our current technologies (and vendors)? But they cost too much!”

Continuing the five-part series about the stages of big-data grief that organizations experience, this segment focuses on the first time organizations explore the reality of the challenges and opportunities presented by big data and start to work their way forward…with bargaining.

Coping with a missed opportunity often brings some introspection. And with that comes the need to explore what-ifs that may provide a way forward. Here are some of the more common what-ifs that organizations explore during this phase.

What if I go to my current vendor? They’re sure to have some great technology. That’ll fix the problem.
This is a perfectly fine path of inquiry to explore. The only issue with this, as mentioned in my previous post (Part 2: Anger) is that vendors have a tendency to re-label their technology to suit a desirable market. So their technology offerings may not actually be suited to big data needs. And spending time and effort exploring these technologies to verify this can distract and prevent you from moving forward.

Also, vendors may have a business that relies on high-margin technology or services that were priced for a time before the big data explosion. So, the economics of their technology may suit them, but not the organization in need – your company. For example, if I need to store a petabyte of data in a data warehouse, I might require a several hundred node data warehouse cluster. If my current vendor charges a price of a hundred thousand US dollars per node, this isn’t economically feasible since I can now find alternatives that are purpose-built for large scale database processing and are priced at 1/5th or 1/10th that (or less!).
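To make the arithmetic in that example concrete, here is a rough back-of-the-envelope sketch. The per-node price and the 1/5th and 1/10th ratios come from the example above; the node count of 300 is purely an assumed stand-in for "several hundred", and the function name is made up for illustration.

```python
# Back-of-the-envelope cluster cost comparison.
# ASSUMPTION: 300 nodes is an illustrative stand-in for "several hundred".
NODES = 300
INCUMBENT_PRICE_PER_NODE = 100_000  # USD per node, from the example above


def cluster_cost(nodes: int, price_per_node: int) -> int:
    """Total cost of a cluster at a flat per-node price."""
    return nodes * price_per_node


incumbent_cost = cluster_cost(NODES, INCUMBENT_PRICE_PER_NODE)

# Alternatives priced at 1/5th and 1/10th of the incumbent's per-node price.
alternative_costs = {
    divisor: cluster_cost(NODES, INCUMBENT_PRICE_PER_NODE // divisor)
    for divisor in (5, 10)
}

print(f"Incumbent vendor: ${incumbent_cost:,}")
for divisor, cost in alternative_costs.items():
    print(f"Alternative at 1/{divisor} the price: ${cost:,}")
```

Under these assumed numbers the incumbent's cluster runs to $30 million, against $6 million or $3 million for the alternatives. The exact figures matter less than the gap between them.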

What if I hire some smart people? They’ll bring skills and insight. They’ll fix the problem.
Like the question above, this is a perfectly reasonable question to ask. But hiring bright people with the perfect skills can be very difficult today – the talent pool for big data is slim, and the hiring for these folks is highly competitive. Furthermore, hiring from outside doesn’t bring in the context of the business. In almost every business, there are nuances to the products, culture, market, and so forth that have a meaningful impact on the business. Hired guns, no matter how skilled, often lack this context.

Also, just bringing in new people doesn’t necessarily mean that your organization’s technology will suit them. Most analytic professionals develop their way of operating – their “game plan” – early in their career, and often prefer a particular set of technologies. It’s likely your new hires will want to introduce technologies they’re familiar with to your organization. And that can introduce additional complexity. A classic example of this is hiring a data science team who have spent the last decade analyzing data with the SAS system. If the organization doesn’t use SAS to begin with, the new team will likely press to introduce it. And that may conflict with how the organization approaches analytics.

What if I download this cool open source software? I hear that stuff is magic, so that’ll fix the problem.
Unlike the first two what-ifs, this one should be approached with great caution! As mentioned in my previous post, open source software has something of a unique tendency to be associated with vague, broad, exaggerated, and often contradictory claims of functionality. This brings to mind a classic bit of satire by the Saturday Night Live crew, first aired in 1976: “New Shimmer is both a floor wax and a dessert topping!” The easy mistake to make here is for the technology team to rush forward, install the new stuff and start to experiment with it to the exclusion of all else. Six months (and several million dollars of staff time) later, the sunk cost in the open source option is so huge that it becomes a fait accompli. Careers would be damaged if the team admitted that it just wasted six months proving that the technology does not do what it claims, so it becomes the default choice.

What if I do what everybody else is doing? Crowds have wisdom, so that’ll fix the problem.
The risk with this thinking is similar to that posed by open source. This often goes hand-in-hand with hiring big data smarts – companies often bring in people from the outside and pay them to do what they’ve done elsewhere. It can definitely accelerate a big data program. But it can also guarantee that the efforts are more of a me-too duplication of something the rest of the industry has already done rather than true innovation. And while this may be suited for some businesses, the big money in big data is in being the first to derive new insights.

These are all perfectly acceptable questions that come up as organizations begin to acknowledge, for the first time, the reality of big data. But this isn’t the end of the discussion by any means. It’s important to avoid getting so enamored with exploring one or two of the above options that you don’t follow through on the “grief” process. The natural next step is to be intimidated by the challenge, which will serve as an important reality check. I’ll cover this in the next segment: depression. So stay tuned!

Next up: Depression “The problem is too big. How can we possibly tackle it?”

Is Big Data Giving You Grief? Part 2: Anger


“We missed our numbers last quarter because we’re not leveraging Big Data! How did we miss this?!”

Continuing this five part series focused on how organizations frequently go through the five stages of grief when confronting big data challenges, this post will focus on the second stage: anger.

It’s important to note that while an organization may begin confronting big data with something very like denial, anger usually isn’t far behind. As mentioned previously, very often the denial is rooted in the fact that the company doesn’t see the benefit in big data, or the benefits appear too expensive. And sometimes the denial can be rooted in a company’s own organizational inertia.

Moving past denial often entails learning – that big data is worth pursuing. Ideally, this learning comes from self-discovery and research – looking at the various opportunities it represents, casting a broad net as to technologies for addressing it, etc. Unfortunately, sometimes the learning can be much less pleasant as the competition learns big data first…and suddenly is performing much better. This can show up in a variety of ways – your competitors suddenly have products that seem much more aligned with what people want to buy; their customer service improves dramatically while their overhead actually goes down; and so on.

For better or worse, this learning often results in something that looks an awful lot like organizational “anger”. As I look back at my own career to my days before HP, I can recall more than a few all-hands meetings hosted by somber executives highlighting deteriorating financials, as well as meetings featuring a fist pounding leader or two talking about the need to change, dammit! It’s a natural part of the process wherein eyes are suddenly opened to the fact that change needs to occur. This anger often is focused at the parties involved in the situation. So, who’re the targets, and why?

The Leadership Team

At any company worth its salt, the buck stops with the leadership team. A shortcoming of the company is a shortcoming of the leadership. So self-reflection would be a natural focus of anger. How did a team of experienced business leaders miss this? Companies task leaders with both the strategic and operational guidance of the business – so if they missed a big opportunity in big data, or shot it down because it looked too costly or risky, this is often seen as a problem.

Not to let anybody off the hook, but company leadership is also tasked with a responsibility to the investors. And this varies with the type of company, stage in the market, etc. In an organization tasked with steady growth, taking chances on something which appears risky – like a big data project where the benefits are less understood than the costs – is often discouraged. Also, leaders often develop their own “playbook” – their way of viewing and running a business that works. And not that many retool their skills and thinking over time. So their playbook might’ve worked great when brand value was determined by commercial airtime, and social media was word of mouth from a tradeshow. But the types and volume of information available are changing rapidly in the big data world, so that playbook may be obsolete.

Also, innovation is as much art as science. This is something near and dear to me, both in my educational background and in my career interests. If innovation were a competence that could just be taught or bought, we wouldn’t see a constant flow of companies appearing (and disappearing) across markets. We also wouldn’t see new ideas (the web! social networking!) appear overnight to upend entire segments of the economy. For most firms, recognizing the possibilities inherent in big data and acting on those possibilities represents innovation, so it’s not surprising to see that some leadership teams struggle.

The Staff

There are times when the upset over a missed big data opportunity is aimed at the staff. It’s not unusual to see a situation where the CEO of a firm asked IT to research big data opportunities, only to have the team come back and state that they weren’t worthwhile. And six months later, after discovering that the competition is eating their lunch, the CEO is a bit upset at the IT team.

While this is sometimes due to teams being “in the bunker” (see my previous post here), in my experience it occurs far more often due to the IT comfort zone. Early in my career, I worked in IT for a human resources department. The leader of the department asked a group of us to research new opportunities for the delivery of information to the HR team across a large geographic area (yeah, I’m dating myself a bit here…this was in the very early days of the web). We were all very excited about it, so we ran back to our desks and proceeded to install a bunch of software to see what it could do. In retrospect I have to laugh at myself about this – it never occurred to me to have a conversation with the stakeholders first! My first thought was to install the technology and experiment with it, then build something.

This is probably the most common issue I see in IT today. The technologies are different but the practice is the same. Ask a room full of techies to research big data with no business context and…they’ll go set up a bunch of technology and see what it can do! Will the solution meet the needs of the business? Hmm. Given the historical failure rate of large IT projects, probably not.

The Vendors

It’s a given that the vendors might get the initial blame for missing a big data opportunity. After all, they’re supposed to sell us stuff that solves our problems, aren’t they? As it turns out, that’s not exactly right. What they’re really selling us is stuff that solves problems for which their technology was built. Why? Well, that’s a longer discussion that Clayton Christensen has addressed far better than I ever could in “The Innovator’s Dilemma”. Suffice it to say that the world of computing technology continues to change rapidly today, and products built twenty years ago to handle data often are hobbled by their legacy – both in the technology and the organization that sells it.

But if a company is writing a large check every year to a vendor – it’s not at all unusual to see firms spend $1 million or more per year with technology vendors – they often expect a measure of thought leadership from that vendor. So if a company is blindsided by bad results because they’re behind on big data, it’s natural to expect that the vendor should have offered some guidance, even if it was just to steer the IT folks away from an unproductive big data science project (for more on that, see my blog post coming soon titled “That Giant Sucking Sound is Your Big Data Lab Experiment”).

Moving past anger

Organizational anger can be a real time-waster. Sometimes, assigning blame can gain enough momentum that it distracts from the original issue. Here are some thoughts on moving past this.

You can’t change the past, only the future. Learning from mistakes is a positive thing, but there’s a difference between looking at the causes and looking for folks to blame. And it’s critical to identify the real reasons the opportunity was missed instead of playing the “blame game”, which sucks up precious time and may in fact prevent the identification of the real issue. I’ve seen more than one organization with what I call a “Teflon team” – a team which is never held responsible for any of the impacts their work has on the business, regardless of their track record. Once or twice, I’ve seen these teams do very poor work, but the responsibility has been placed elsewhere. So the team never improves and the poor work continues. So watch out for the Teflon team!

Big data is bigger than you think. It’s big in every sense of the word because it represents not just the things we usually talk about – volume of data, variety of data, and velocity of data – but it also represents the ability to bring computing to bear on problems where this was previously impossible. This is not an incremental or evolutionary opportunity, but a revolutionary one. Can a business improve its bottom line by ten percent with big data? Very likely. Can it drive more revenue? Almost certainly. But it can also develop entirely new products and capabilities, and even create new markets.

So it’s not surprising that businesses may have a hard time recognizing this and coping with it. Business leaders accustomed to thinking of incremental boosts to revenue, productivity, margins, etc. may not be ready to see the possibilities. And the IT team is likely to be even less prepared. So while it may take some convincing to get the VP of Marketing to accept that Twitter is a powerful tool for evaluating their brand, asking IT to evaluate it in a vacuum is a recipe for confusion.

So understanding the true scope of big data and what it means for an organization is critical to moving forward.

A vendor is a vendor. Most organizations have one or more data warehouses today, along with a variety of tools for the manipulation, transformation, delivery, analysis, and consumption of data. So they will almost always have some existing vendor relationships around technologies which manage data. And most of those vendors will want to leverage the excitement around big data, so they will have some message along those lines. But it’s important to separate the technology from the message, and to distinguish between aging technology which has simply been rebranded and technology which can actually do the job.

Also, particularly in big data, there are “vendorless” or “vendor-lite” technologies which have become quite popular. By this I mean technologies such as Apache Hadoop, MongoDB, Cassandra, etc. These are often driven less by a vendor with a product goal and more by a community of developers who cut their teeth on the concept of open-source software, which comes with very different business economics. Generally without a single marketing department to control the message, these technologies can be associated with all manner of claims regarding capabilities – some of which are accurate, and some of which aren’t. This is a tough issue to confront because the messages can be conflicting, diffused, etc. The best advice I’ve got here is – if an open source technology sounds too good to be true, it very likely is.

Fortunately, this phase is a transitional one. Having come to terms with anger over the missed big data opportunity or risk, businesses then start to move forward…only to find their way blocked. This is when the bargaining starts. So stay tuned!

Next up: Bargaining “Can’t we work with our current technologies (and vendors)? …but they cost too much!”

Is Big Data Giving You Grief? Part 1: Denial

My father passed away recently, and so I’ve found myself in the midst of a cycle of grief. And, in thinking about good blog topics, I realized that many of the organizations I’ve worked with over the years have gone through something very much like grief as they’ve come to confront big data challenges…and the stages they go through even map pretty cleanly to the five stages of grief! So this series was born.

So it’ll focus on the five stages of grief: denial, anger, bargaining, depression, and acceptance. I’ll explore the ways in which organizations experience each of these phases when confronting the challenges of big data, and also present strategies for coping with these challenges and coming to terms with big data grief.

Part One: Denial

“We don’t have a big data problem. Our Oracle DBA says so.”

Big data is a stealth tsunami – it’s snuck up on many businesses and markets worldwide. As a result, they often believe initially that they don’t need to change. In other words, they are in denial. In this post, I’ll discuss various forms of denial, and recommend strategies for moving forward.

Here are the three types of organizational “denial” that we’ve seen most frequently:

They don’t know what they’re missing

Typically, these organizations are aware that there’s now much more data available to them, but don’t see how it represents an opportunity for their business. Organizations may have listened to vendors, who often focus their message on use cases they want to sell into – which may not be the problem a business needs to solve. But it’s also common for an organization to settle into its comfort zone; the business is doing just fine and the competition doesn’t seem to be gaining any serious ground. So, the reasoning goes, why change?

The truth is that, as much as those of us who work with it every day feel that there’s always a huge opportunity in big data, for many organizations it’s just not that important to them yet. They might know that every day, tens of thousands of people tweet about their brand, but they haven’t yet recognized the influence these tweets can have on their business. And they may not have any inkling that those tweets can be signals of intent – intent to purchase, intent to churn, etc.

They don’t think it’s worth doing

Organizations in denial may also question whether dealing with big data is worth doing. An organization might already be paying a technology vendor $1 million or more per year for technology…and this to handle just a few terabytes of data. When the team looks at a request to suddenly deal with multiple petabytes of data, it automatically assumes that the costs would be prohibitive and shuts down that line of thinking. This attitude often goes hand-in-hand with the first item…after all, if it’s outrageously expensive to even consider a big data initiative, it seems there’s no point in researching it further since it can’t possibly provide a strong return on investment.

Somebody is in the bunker

While the prior two items pertained largely to management decisions based on return on investment for a big data project, this one is different. Early in my career I learned to program in the SAS analysis platform. As I pursued this for several different firms, I observed that organizations would tend to build a team of SAS gurus who held the keys to this somewhat exotic kingdom. Key business data existed only in SAS datasets which were difficult to access from other systems. Also, programming in SAS required a specialized skillset that only a few possessed. Key business logic such as predictive models, data transformations, business metric calculations, etc. was all locked away in a large library of SAS programs. I’ve spoken with more than one organization that has told me they’ve got a hundred thousand (or more!) SAS datasets, and several times that many SAS programs floating around their business…many of which contain key business logic and information. As a result, the SAS team often held a good position in the organizational food chain, and its members were well paid.

One day, folks began to discover that they could download other tools that did very similar things, didn’t care where the data resided, cost a fraction of SAS, and required less exotic programming skills.

Can you see where this is going?

I also spent some years as an Oracle DBA and database architect, and witnessed very similar situations. It’s not uncommon – especially given how disruptive big data technologies can be – to see teams go “into the bunker” and be very reluctant to change. Why would they volunteer to give up their position, influence and perks? And so we now are at the intersection of information technology and a classic change management challenge.

Moving forward past denial

For an organization, working through the denial stage can seem daunting, but it’s very do-able. Here are some recommendations to get started:

Be prepared to throw out old assumptions. The world is rapidly becoming a much more instrumented place, so there are possibilities today that literally didn’t exist ten years ago. The same will be true in another ten years (or less). This represents both opportunity and competitive threat. Not only might your current competitors leverage data in new ways, but entirely new classes of products may appear quickly that will change everything. For example, consider the sudden emergence in recent years of smartphones, tablets, Facebook, and Uber. In their respective domains, they’ve caused entire industries to churn. So it’s important to cast a broad net in terms of looking for big data projects to deliver value for your business.

Big data means not having to say “no.” I’ve worked with numerous organizations who have had to maintain a high cost infrastructure for so long that they’re used to saying “no” when they’re approached for a new project. And they add an exclamation point (“no!”) when they’re approached with a big data project. Newer technologies and delivery models offer the chance to put much more in the hands of users. So, while saying no may sometimes be inevitable, it no longer needs to be an automatic response. When it comes to an organization’s IT culture, be ready to challenge the common wisdom about team organization, project evaluation and service delivery. The old models – the IT service desk, the dedicated analyst/BI team, organizing a technology team into technology centric silos such as the DBA team, etc. – may no longer be a fit.

Big data is in the eye of the beholder. Just because vendors love to talk about Twitter (and I’m guilty of that too), doesn’t mean that Twitter is relevant to your business. Maybe you manufacture a hundred pieces of very complex equipment every year and sell them to a handful of very large companies. In this case, it’s probably best not to worry overmuch about tweets. You might have a very different big data problem. For instance, you may need to evaluate data from your last generation of devices which had ten sensors that generate ten rows of data per second each. And, you know that the next generation will have ten thousand sensors generating a hundred rows per second each – so very soon it’ll be necessary to cope with around ten thousand times as much data (or more – the new sensors may provide a lot more information than the older ones). And if the device goes awry, your customer might lose a $100 million manufacturing run. So don’t dismiss the possibilities in big data just because your vendor doesn’t talk about your business. Push them to help you solve your problems, and the vendors worth partnering with will work with you to do this.
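
To make the arithmetic in the sensor example concrete, here is a minimal back-of-the-envelope sketch in Python. The sensor counts and row rates are the ones from the example above; the function name and the rows-per-day figure are illustrative additions of mine. Note that it counts rows only, so it assumes a row is the same size in both generations, which may well understate the growth if the new sensors report more information per row.

```python
# Back-of-the-envelope comparison of per-device data rates.
# Sensor counts and row rates come from the example in the text;
# everything else here is an illustrative assumption.

SECONDS_PER_DAY = 24 * 60 * 60


def rows_per_second(sensors: int, rows_per_sensor_per_second: int) -> int:
    """Total rows one device generates per second across all its sensors."""
    return sensors * rows_per_sensor_per_second


# Last generation: ten sensors, ten rows per second each.
current_gen = rows_per_second(sensors=10, rows_per_sensor_per_second=10)

# Next generation: ten thousand sensors, a hundred rows per second each.
next_gen = rows_per_second(sensors=10_000, rows_per_sensor_per_second=100)

# How many times more rows the next generation produces.
growth_factor = next_gen // current_gen

print(f"Current generation: {current_gen:,} rows/sec per device")
print(f"Next generation:    {next_gen:,} rows/sec per device")
print(f"Growth factor:      {growth_factor:,}x")
print(f"Next generation:    {next_gen * SECONDS_PER_DAY:,} rows/day per device")
```

The point of running the numbers is that the jump comes from two multipliers at once (a thousand times the sensors, ten times the rate per sensor), which is why the total lands at ten thousand times the row volume rather than a thousand.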

Data expertise is a good thing. Just because you might not need ten Oracle DBAs in the new world doesn’t mean that you should lay eight of them off. The folks who have been working intimately with the data in the bunker often have very deep knowledge of the data. They frequently can retool and, in fact, find themselves having a lot more fun delivering insights and helping the business. It may be important to re-think the role of the “data gurus” in the new world. In fact, I’d contend that this is where you may find some of your best data scientists.

While organizational denial is a tough place to be when it comes to big data, it happens often. And many are able to move past it. Sometimes voluntarily, and sometimes not – as I’ll describe in the next installment.  So stay tuned!

Next up:

Anger: “We missed our numbers last quarter because we have a big data problem! What the heck are we going to do about it?”
