Vertica

Archive for the ‘use cases’ Category

Is Big Data Giving You Grief? Part 3: Bargaining

“Can’t we work with our current technologies (and vendors)? But they cost too much!”
Continuing the five-part series about the stages of big-data grief that organizations experience, this segment focuses on the first time organizations explore the reality of the challenges and opportunities presented by big data and start to work their way forward…with bargaining.
Coping with a missed opportunity often brings some introspection. And with that comes the need to explore what-ifs that may provide a way forward. Here are some of the more common what-ifs that organizations explore during this phase.

What if I go to my current vendor? They’re sure to have some great technology. That’ll fix the problem.
This is a perfectly fine path of inquiry to explore. The only issue with this, as mentioned in my previous post (Part 2: Anger) is that vendors have a tendency to re-label their technology to suit a desirable market. So their technology offerings may not actually be suited to big data needs. And spending time and effort exploring these technologies to verify this can distract and prevent you from moving forward.

Also, vendors may have a business that relies on high-margin technology or services that were priced for a time before the big data explosion. So, the economics of their technology may suit them, but not the organization in need – your company. For example, if I need to store a petabyte of data in a data warehouse, I might require a data warehouse cluster of several hundred nodes. If my current vendor charges a hundred thousand US dollars per node, this isn’t economically feasible, since I can now find alternatives that are purpose-built for large-scale database processing and are priced at 1/5th or 1/10th of that (or less!).
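To make that arithmetic concrete, here is a rough sketch. The node count of 300 is my assumption standing in for "several hundred"; the per-node price and the 1/5 and 1/10 ratios come from the example above.

```python
from fractions import Fraction

# Assumed for illustration: "several hundred" nodes is taken to be 300.
NODES = 300
# From the example above: USD 100,000 per node from the current vendor.
PRICE_PER_NODE_USD = 100_000


def cluster_cost(nodes, price_per_node, ratio=Fraction(1)):
    """Total cost in USD for a cluster, scaled by a price ratio."""
    return int(nodes * price_per_node * ratio)


current = cluster_cost(NODES, PRICE_PER_NODE_USD)
one_fifth = cluster_cost(NODES, PRICE_PER_NODE_USD, Fraction(1, 5))
one_tenth = cluster_cost(NODES, PRICE_PER_NODE_USD, Fraction(1, 10))

print(f"Current vendor:    ${current:,}")
print(f"At 1/5 the price:  ${one_fifth:,}")
print(f"At 1/10 the price: ${one_tenth:,}")
```

Under these assumptions the gap is between a thirty-million-dollar cluster and one costing three to six million, which is the kind of difference that changes whether the project gets funded at all.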

What if I hire some smart people? They’ll bring skills and insight. They’ll fix the problem.
Like the question above, this is a perfectly reasonable question to ask. But hiring bright people with the perfect skills can be very difficult today – the talent pool for big data is slim, and the hiring for these folks is highly competitive. Furthermore, hiring from outside doesn’t bring in the context of the business. In almost every business, there are nuances to the products, culture, market, and so forth that have a meaningful impact on the business. Hired guns, no matter how skilled, often lack this context.

Also, just bringing in new people doesn’t necessarily mean that your organization’s technology will suit them. Most analytic professionals develop their way of operating – their “game plan” – early in their career, and often prefer a particular set of technologies. It’s likely your new hires will want to introduce technologies they’re familiar with to your organization. And that can introduce additional complexity. A classic example of this is hiring a data science team who have spent the last decade analyzing data with the SAS system. If the organization doesn’t use SAS to begin with, the new team will likely press to introduce it. And that may conflict with how the organization approaches analytics.

What if I download this cool open source software? I hear that stuff is magic, so that’ll fix the problem.
Unlike the first two what-ifs, this one should be approached with great caution! As mentioned in my previous post, open source software has something of a unique tendency to be associated with vague, broad, exaggerated, and often contradictory claims of functionality. This brings to mind a classic bit of satire by the Saturday Night Live crew, first aired in 1976: “New Shimmer is both a floor wax and a dessert topping!” The easy mistake to make here is for the technology team to rush forward, install the new stuff and start to experiment with it to the exclusion of all else. Six months (and several million dollars of staff time) later, the sunk cost in the open source option is so huge that it becomes a fait accompli. Careers would be damaged if the team admitted that it just wasted six months proving that the technology does not do what it claims, so it becomes the default choice.

What if I do what everybody else is doing? Crowds have wisdom, so that’ll fix the problem.
The risk with this thinking is similar to that posed by open source. This often goes hand-in-hand with hiring big data smarts – companies often bring in people from the outside and pay them to do what they’ve done elsewhere. It can definitely accelerate a big data program. But it can also guarantee that the efforts are more of a me-too duplication of something the rest of the industry has already done rather than true innovation. And while this may be suited for some businesses, the big money in big data is in being the first to derive new insights.

These are all perfectly acceptable questions that come up as organizations begin to acknowledge, for the first time, the reality of big data. But this isn’t the end of the discussion by any means. It’s important to avoid getting so enamored with exploring one or two of the above options that you don’t follow through on the “grief” process. But the natural next step is to be intimidated by the challenge, which will serve as an important reality check. I’ll cover this in the next segment: depression. So stay tuned!

Next up: Depression “The problem is too big. How can we possibly tackle it?”

Is Big Data Giving You Grief? Part 2: Anger

“We missed our numbers last quarter because we’re not leveraging Big Data! How did we miss this?!”

Continuing this five-part series focused on how organizations frequently go through the five stages of grief when confronting big data challenges, this post will focus on the second stage: anger.

It’s important to note that while an organization may begin confronting big data with something very like denial, anger usually isn’t far behind. As mentioned previously, very often the denial is rooted in the fact that the company doesn’t see the benefit in big data, or the benefits appear too expensive. And sometimes the denial can be rooted in a company’s own organizational inertia.

Moving past denial often entails learning – that big data is worth pursuing. Ideally, this learning comes from self-discovery and research – looking at the various opportunities it represents, casting a broad net as to technologies for addressing it, etc. Unfortunately, sometimes the learning can be much less pleasant as the competition learns big data first…and suddenly is performing much better. This can show up in a variety of ways – your competitors suddenly have products that seem much more aligned with what people want to buy; their customer service improves dramatically while their overhead actually goes down; and so on.

For better or worse, this learning often results in something that looks an awful lot like organizational “anger”. As I look back at my own career to my days before HP, I can recall more than a few all-hands meetings hosted by somber executives highlighting deteriorating financials, as well as meetings featuring a fist-pounding leader or two talking about the need to change, dammit! It’s a natural part of the process wherein eyes are suddenly opened to the fact that change needs to occur. This anger often is focused on the parties involved in the situation. So, who’re the targets, and why?

The Leadership Team

At any company worth its salt, the buck stops with the leadership team. A shortcoming of the company is a shortcoming of the leadership. So self-reflection would be a natural focus of anger. How did a team of experienced business leaders miss this? Companies task leaders with both the strategic and operational guidance of the business – so if they missed a big opportunity in big data, or shot it down because it looked too costly or risky, this is often seen as a problem.

Not to let anybody off the hook, but company leadership is also tasked with a responsibility to the investors. And this varies with the type of company, stage in the market, etc. In an organization tasked with steady growth, taking chances on something which appears risky – like a big data project where the benefits are less understood than the costs – is often discouraged. Also, leaders often develop their own “playbook” – their way of viewing and running a business that works. And not that many retool their skills and thinking over time. So their playbook might’ve worked great when brand value was determined by commercial airtime, and social media was word of mouth from a tradeshow. But the types and volume of information available are changing rapidly in the big data world, so that playbook may be obsolete.

Also, innovation is as much art as science. This is something near & dear to me both in my educational background as well as career interests. If innovation was a competence that could just be taught or bought, we wouldn’t see a constant flow of companies appearing (and disappearing) across markets. We also wouldn’t see new ideas (the web! social networking!) appear overnight to upend entire segments of the economy. For most firms, recognizing the possibilities inherent in big data and acting on those possibilities represents innovation, so it’s not surprising to see that some leadership teams struggle.

The Staff

There are times when the upset over a missed big data opportunity is aimed at the staff. It’s not unusual to see a situation where the CEO of a firm asked IT to research big data opportunities, only to have the team come back and state that they weren’t worthwhile. And six months later, after discovering that the competition is eating their lunch, the CEO is a bit upset at the IT team.

While this is sometimes due to teams being “in the bunker” (see my previous post here), in my experience it occurs far more often due to the IT comfort zone. Early in my career, I worked in IT for a human resources department. The leader of the department asked a group of us to research new opportunities for the delivery of information to the HR team across a large geographic area (yeah, I’m dating myself a bit here…this was in the very early days of the web). We were all very excited about it, so we ran back to our desks and proceeded to install a bunch of software to see what it could do. In retrospect I have to laugh at myself about this – it never occurred to me to have a conversation with the stakeholders first! My first thought was to install the technology and experiment with it, then build something.

This is probably the most common issue I see in IT today. The technologies are different but the practice is the same. Ask a room full of techies to research big data with no business context and…they’ll go set up a bunch of technology and see what it can do! Will the solution meet the needs of the business? Hmm. Given the historical failure rate of large IT projects, probably not.

The Vendors

It’s a given that the vendors might get the initial blame for missing a big data opportunity. After all, they’re supposed to sell us stuff that solves our problems, aren’t they? As it turns out, that’s not exactly right. What they’re really selling us is stuff that solves problems for which their technology was built. Why? Well, that’s a longer discussion that Clayton Christensen has addressed far better than I ever could in “The Innovator’s Dilemma”. Suffice it to say that the world of computing technology continues to change rapidly today, and products built twenty years ago to handle data often are hobbled by their legacy – both in the technology and the organization that sells it.

But if a company is writing a large check every year to a vendor – it’s not at all unusual to see firms spend $1 million or more per year with technology vendors – they often expect a measure of thought leadership from that vendor. So if a company is blindsided by bad results because they’re behind on big data, it’s natural to expect that the vendor should have offered some guidance, even if it was just to steer the IT folks away from an unproductive big data science project (for more on that, see my blog post coming soon titled “That Giant Sucking Sound is Your Big Data Lab Experiment”).

Moving past anger

Organizational anger can be a real time-waster. Sometimes, assigning blame can gain enough momentum that it distracts from the original issue. Here are some thoughts on moving past this.

You can’t change the past, only the future. Learning from mistakes is a positive thing, but there’s a difference between looking at the causes and looking for folks to blame. And it’s critical to identify the real reasons the opportunity was missed instead of playing the “blame game”, as it would suck up precious time and in fact may prevent the identification of the real issue. I’ve seen more than one organization with what I call a “Teflon team” – a team which is never held responsible for any of the impacts their work has on the business, regardless of their track record. Once or twice, I’ve seen these teams do very poor work, but the responsibility has been placed elsewhere. So the team never improves and the poor work continues. So watch out for the Teflon team!

Big data is bigger than you think. It’s big in every sense of the word because it represents not just the things we usually talk about – volume of data, variety of data, and velocity of data – but it also represents the ability to bring computing to bear on problems where this was previously impossible. This is not an incremental or evolutionary opportunity, but a revolutionary one. Can a business improve its bottom line by ten percent with big data? Very likely. Can it drive more revenue? Almost certainly. But it can also develop entirely new products and capabilities, and even create new markets.

So it’s not surprising that businesses may have a hard time recognizing this and coping with it. Business leaders accustomed to thinking of incremental boosts to revenue, productivity, margins, etc. may not be ready to see the possibilities. And the IT team is likely to be even less prepared. So while it may take some convincing to get the VP of Marketing to accept that Twitter is a powerful tool for evaluating their brand, asking IT to evaluate it in a vacuum is a recipe for confusion.

So understanding the true scope of big data and what it means for an organization is critical to moving forward.

A vendor is a vendor. Most organizations have one or more data warehouses today, along with a variety of tools for the manipulation, transformation, delivery, analysis, and consumption of data. So they will almost always have some existing vendor relationships around technologies which manage data. And most of them will want to leverage the excitement around big data, so will have some message along those lines. But it’s important to separate the technology from the message. And to distinguish between aging technology which has simply been rebranded and technology which can actually do the job.

Also, particularly in big data, there are “vendorless” or “vendor-lite” technologies which have become quite popular. By this I mean technologies such as Apache Hadoop, MongoDB, Cassandra, etc. These are often driven less by a vendor with a product goal and more by a community of developers who cut their teeth on the concept of open-source software, which comes with very different business economics. Generally without a single marketing department to control the message, these technologies can be associated with all manner of claims regarding capabilities – some of which are accurate, and some of which aren’t. This is a tough issue to confront because the messages can be conflicting, diffused, etc. The best advice I’ve got here is – if an open source technology sounds too good to be true, it very likely is.

Fortunately, this phase is a transitional one. Having come to terms with anger over the missed big data opportunity or risk, businesses then start to move forward…only to find their way blocked. This is when the bargaining starts. So stay tuned!

Next up: Bargaining “Can’t we work with our current technologies (and vendors)? …but they cost too much!”

Can Big Data Analytics Save Our World?

If you ask Conservation International this question, they may just say yes. After all, Conservation International has teamed up with HP Earth Insights to provide organizations around the world – from environmentalists to policy makers – with a real-time look at what is happening within our planet’s most valuable natural resource: the rain forest.

But how does their work relate to you as a start-up organization or a Fortune 500 company?
First, they have surprisingly similar analytical needs to many other start-ups and corporations, collecting data regularly from 16 sites around the globe, performing more than 4 million climate measurements as of this February, and managing more than 3 TB of biodiversity information. As the name implies, this information is incredibly, well… diverse, including everything from photos to hand-recorded measurements to weather station and camera trap imagery. While your company may not be recording/analyzing the metadata of candid photos of elephants and/or chimpanzees, chances are that many of you out there are working with more than one type of data.

Collecting and Analyzing Multiple Data Types
All of these different data types have to be funneled into a database, analyzed, and then acted on. Running queries based on millions of climate readings begins to look a lot like doing the same on the kind of diverse customer base that many other companies deal with every day. Many agricultural companies collect sensor data from across their farmlands to forecast how the climate will affect their crops in the upcoming year. These days, utility companies are launching Advanced Metering Infrastructures (AMI) to deal with the staggering amounts of sensor data collected from the energy usage of millions of homes. HP Vertica coincidentally works as an effective Meter Data Management (MDM) system (read more here).

Visualizing the Data and Reaching More People
Working with HP, Conservation International has built from the ground up their own analytics system and dashboard for visualizing their data from all 16 rainforests around the globe. CI DBAs discover trends based on over 140 million simulations, and analyze the metadata from over 1.7 million photos. Not only is their custom interface intuitive, it also enables them to generate PDFs instantly and share to social media directly from the dashboard. For CI, this means more people now see more of their impact in more places to proactively address environmental threats. For you, it might mean anything from less time spent prepping your data to present to management to simply fewer emails to send.

The Power of Prediction for the Greater Good
Like many companies, CI uses standard methodology in processing their data, and uses R for their analysis, as is very common in scientific studies. Using R, CI can proactively assess where the future trouble spots will be, and what parts of their monitored ecosystems are most threatened. Many other HP Vertica customers use R in surprisingly similar ways, such as seeing what neighborhoods a future power outage might affect most, or how serious the next year’s dry season will be to a farmer’s crops.

See Conservation International at the HP Vertica Big Data Conference
These are just a few examples of how an incredibly unique organization uses HP Vertica to analyze unique data, yet does it in ways that many other groups might find surprisingly familiar. Sometimes after a closer look, we can see that many organizations have a lot more in common with their data needs than they may think, and HP Vertica is the right tool for the job.

Be sure to attend our upcoming Big Data Conference in Boston, MA, where Conservation International is leading the hackathon!

The Real-Time Unicorn

The “De-mythification” Series

Part 1: The Real-Time Unicorn

This is part one of a series I call the “de-mythification” series, wherein I’ll aim to clear up some of the more widespread myths in the big data marketplace.

In the first of this multi-part series, I’ll address one of the most common myths my colleagues and I have to confront in the Big Data marketplace today: the notion of “real-time” data visibility. Whether it’s real-time analytics or real-time data, the same misconception always seems to come up. So I figured I’d address this, define what “real-time” really means, and provide readers some advice on how to approach this topic in a productive way.

First of all, let’s establish the theoretical definition of “real-time” data visibility. In the purest interpretation, it means that as some data is generated – say, a row of log data in an Apache web server – the data would immediately be queryable. What does that imply? Well, we’d have to parse the row into something readable by a query engine – so some program would have to ingest the row, parse the row, characterize it in terms of metadata, and understand enough about the data in that row to determine a decent machine-level plan for querying it. Now since all our systems are limited by that pesky “speed of light” thing, we can’t move data any faster than that – considerably slower in fact. So even if we only need to move the data through the internal wires of the same computer where the data is generated, it would take measurable time to get the row ready for query. And let’s not forget the time required for the CPU to actually perform the operations on the data. It may be nanoseconds, milliseconds, or longer, but in any event it’s a non-zero amount of time.
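As a rough illustration of that point, the sketch below parses a single made-up log row and times only the parsing step. The log row, the regular expression, and the function name `ingest` are my assumptions for illustration; a real pipeline would add transport, metadata handling, and query planning on top of this.

```python
import re
import time

# Hypothetical log row in Apache "common" log format, made up for illustration.
LOG_ROW = '192.0.2.10 - - [12/Mar/2014:10:15:32 +0000] "GET /index.html HTTP/1.1" 200 5120'

# Common log format: host, identity, user, timestamp, request, status, size.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def ingest(row):
    """Parse one log row and return (fields, elapsed nanoseconds)."""
    start = time.perf_counter_ns()
    match = LOG_PATTERN.match(row)
    if match is None:
        raise ValueError("row is not in the expected log format")
    fields = match.groupdict()
    elapsed_ns = time.perf_counter_ns() - start
    return fields, elapsed_ns


fields, elapsed_ns = ingest(LOG_ROW)
print(fields["request"], fields["status"])
print(f"parse step alone took {elapsed_ns} ns")
```

The exact figure will vary from machine to machine and run to run. What matters is that even this smallest step takes measurable time, before the row has gone anywhere near a query engine.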

So “real-time” never, ever means real-time, despite marketing myths to the contrary.

There are two exceptions to this – slowing down time inside the machine, or technology which queries a stream of data as it flows by (typically called complex event processing, or CEP). With regard to the first option: let’s say we wanted to make data queryable as soon as the row is generated. We could make the flow from the logger to the query engine part of one synchronous process. So the weblog row wouldn’t actually be written until it was also processed and ready for query. Those of you who administer web and application infrastructures are probably getting gray hair just reading this, as you can imagine the performance impact to a web application. So, in the real world, this is a non-starter. The other option – CEP – is exotic and typically very expensive, and while it will tell you what’s happening at the current moment, it’s not designed to build analytics models. It’s largely used to put those models to work in a real-time application such as currency arbitrage.

So, given all this, what’s a good working definition of “real-time” in the world of big data analytics?

Most organizations define it this way: “As fast as it can be done providing a correct answer and not torpedoing the rest of the infrastructure or the technology budget”.

Once everyone gets comfortable with that definition, then we can discuss the real goal: reducing the time to useful visibility of the data to an optimal minimum. This might mean a few seconds, it might mean a few minutes, or it might mean hours or longer. In fact, for years now I’ve found that once we get the IT department comfortable with the practical definition of real-time, it invariably turns out that the CEO/CMO/CFO/etc. really meant exactly that when they said they needed real-time visibility to the data. So, in other words, when the CEO said “real-time”, she meant “within fifteen minutes” or something along those lines.

This then becomes a realistic goal we can work towards in terms of engineering product, field deployment, customer production work, etc. Ironically, chasing the real-time unicorn can actually impede efforts to develop high speed data flows by forcing the team to chase unrealistic targets for which, at the end of the day, there is no quantifiable business value.

So when organizations say they need “real-time” visibility to the data, I recommend not walking away from that conversation until fully understanding just what that phrase means, and using that as the guiding principle in technology selection and design.

I hope readers found this helpful! In the remaining segments of this series, I’ll address other areas of confusion in the Big Data marketplace. So stay tuned!

Next up: The Unstructured Leprechaun

 

Can Vertica Climb a Tree?

The answer is YES if it is the right kind of tree. Here “tree” refers to a common data structure that consists of parent-child hierarchical relationships, such as an org chart. Traditionally this kind of hierarchical data structure can be modeled and stored in tables, but it is usually not simple to navigate and use in a relational database (RDBMS). Some other RDBMSs (e.g. Oracle) have a built-in CONNECT BY clause that can be used to find the level of a given node and navigate the tree. However, if you take a close look at its syntax, you will realize that it is quite complicated and not at all easy to understand or use.

For a complex hierarchical tree with 10+ levels and a large number of nodes, any meaningful business question that requires joins to the fact tables, with aggregates and filters on multiple levels, will result in SQL statements that look extremely unwieldy and can perform poorly. The reason is that this kind of procedural logic may internally scan the same tree multiple times, wasting precious machine resources. This approach also flies in the face of some basic SQL principles: simple, intuitive and declarative. Another major issue is integration with third-party BI reporting tools, which often do not recognize vendor-specific variants such as CONNECT BY.

Other implementations include ANSI SQL’s recursive SQL syntax using WITH and UNION ALL, special graph based algorithms and enumerated path technique. These solutions tend to follow an algorithmic approach and as such, they can be long on theory but short on practical applications.
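For reference, the recursive WITH / UNION ALL form can be sketched as follows. This uses Python's built-in sqlite3 module purely as a stand-in engine (SQLite spells it WITH RECURSIVE); it is not Vertica, and the five-node tree is made up for illustration. The root is stored as its own parent, which is one possible business rule for identifying it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hier_tab (Parent_ID INTEGER, Node_ID INTEGER)")
# Hypothetical tree: 1 is the root (stored as its own parent), 2 and 3 are
# its children, 4 and 5 are children of 2.
conn.executemany(
    "INSERT INTO hier_tab VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 4), (2, 5)],
)

RECURSIVE_LEVELS = """
WITH RECURSIVE tree (Node_ID, LEVEL) AS (
    -- Anchor member: the root, identified here by Parent_ID = Node_ID.
    SELECT Node_ID, 1 FROM hier_tab WHERE Parent_ID = Node_ID
    UNION ALL
    -- Recursive member: children of the nodes found so far. The WHERE
    -- clause keeps the self-parented root from joining to itself forever.
    SELECT h.Node_ID, t.LEVEL + 1
    FROM hier_tab AS h
    JOIN tree AS t ON h.Parent_ID = t.Node_ID
    WHERE h.Node_ID <> h.Parent_ID
)
SELECT Node_ID, LEVEL FROM tree ORDER BY LEVEL, Node_ID
"""

levels = conn.execute(RECURSIVE_LEVELS).fetchall()
print(levels)  # [(1, 1), (2, 2), (3, 2), (4, 3), (5, 3)]
```

Even in this toy case the query has to spell out how the tree is walked, including the guard against the root looping back on itself.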
Since SQL derives its tremendous power and popularity from its declarative nature, specifying clearly WHAT you want to get out of an RDBMS but not HOW you can get it, a fair question to ask is: Is there a simple and intuitive approach to modeling and navigating this kind of hierarchical (recursive) data structure in an RDBMS? Thankfully the answer is yes.

In the following example, I will discuss a design that focuses on “flattening” out this kind of hierarchical parent-child relationship in a special way. The output is a wide, sparsely populated table with extra columns that hold the node IDs at the various levels of a tree; the number of these extra columns depends on the depth of the tree. For simplicity, I will use one table with one hierarchy as an example. The same design principles can be applied to tables with multiple hierarchies embedded in them. The following is a detailed outline of how this can be done in a program/script:

  1. Capture the (parent, child) pairs in a table (table_source).
  2. Identify the root node by following specific business rules and store this info in a new temp_table_1.
    Example: parent_id=id.
  3. Next find the 1st level of nodes and store them in a temp_table_2. Join condition:
    temp_table_1.id=table_source.parent_id.
  4. Continue to go down the tree and at the end of each step (N), store data in temp_table_N.
    Join condition: temp_table_M.parent_id=temp_table_N.id, where M=N+1.
  5. Stop at a MAX level (Mlevel) when there is no child for any node at this level (leaf nodes).
  6. Create a flattened table: table_flat by adding in total (Mlevel+1) columns named as LEVEL,
    LEVEL_1_ID,….LEVEL_Mlevel_ID.
  7. A SQL insert statement can be generated to join all these temp tables together to load
    into the final flat table: table_flat.

  8. When there are multiple hierarchies in one table, the above procedures can be repeated for each
    hierarchy to arrive at a flattened table in the end.
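The steps above can be sketched in a few lines of Python. This is a minimal in-memory sketch, not the script used in practice: the function name `flatten` and the sample tree are made up, in-memory lists stand in for the temp tables, and the root is assumed to be the pair where the parent equals the node (the example rule from step 2). It also assumes every node has exactly one parent.

```python
from collections import defaultdict


def flatten(pairs):
    """Flatten (parent_id, node_id) pairs into one row per node.

    Each row is a dict with Node_ID, LEVEL and LEVEL_1_ID..LEVEL_<max>_ID,
    where levels deeper than the node itself are None.
    """
    children = defaultdict(list)
    root = None
    for parent_id, node_id in pairs:
        if parent_id == node_id:        # step 2: identify the root
            root = node_id
        else:
            children[parent_id].append(node_id)
    if root is None:
        raise ValueError("no root found (expected a pair with parent == node)")

    # paths[node] lists the IDs from the root down to the node itself.
    paths = {root: [root]}
    current_level = [root]              # plays the role of temp_table_N
    while current_level:                # step 5: stop when a level is empty
        next_level = []
        for node in current_level:      # steps 3-4: join on parent_id
            for child in children[node]:
                paths[child] = paths[node] + [child]
                next_level.append(child)
        current_level = next_level

    # Steps 6-7: one wide row per node, padded with None below its level.
    max_level = max(len(path) for path in paths.values())
    rows = []
    for node, path in paths.items():
        row = {"Node_ID": node, "LEVEL": len(path)}
        for i in range(max_level):
            row[f"LEVEL_{i + 1}_ID"] = path[i] if i < len(path) else None
        rows.append(row)
    return rows


# Hypothetical tree, not the one pictured in this post.
flat = flatten([(1, 1), (1, 2), (1, 3), (2, 4), (2, 5)])
for row in flat:
    print(row)
```

In a database the same loop would be a series of joins, one per level, with the final insert stitching the temp tables together.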

 

This design is general and is not specific to any particular RDBMS architecture, whether row, column or hybrid. However, the physical implementation of this design naturally favors columnar databases such as Vertica. Why? The flattened table is usually wide, with many extra columns that tend to be sparsely populated, and these can be stored very efficiently in compressed format in Vertica. Another advantage is that when only a small set of these columns is included in the select clause of a SQL statement, Vertica’s columnar nature means the other columns (no matter how many there are) will not introduce any performance overhead. This is as close to a “free lunch” as you can get in an RDBMS.

Let’s consider the following simple hierarchical tree structure:

Vertica Tree diagram

There are four levels and the root node has an ID of 1. Each node is assumed to have one and only one parent (except for the root node) and each parent node may have zero to many child nodes. The above structure can be loaded into a table (hier_tab) having two columns: Parent_ID and Node_ID, which represent all the (parent, child) pairs in the above hierarchical tree:

Chart 1

It is possible to develop a script to “flatten” out this table by starting from the root node, going down the tree recursively one level at a time and stopping when there is no data left (i.e. reaching the max level or depth of the tree). The final output is a new table (hier_tab_flat):

Chart 2

What’s so special about this “flattened” table? First, this table has the same key (Node_ID) as the original table. Second, this table has several extra columns named LEVEL_N_ID; the number of these columns is equal to the max number of levels (4 in this case), plus one extra LEVEL column. Third, for each node in this table, there is a row that includes the IDs of all of its parents up to the root (LEVEL=1) and itself. This represents a path starting from a node and going all the way up to the root level.

The power of this new “flattened” table is that it has encoded all the hierarchical tree info in the original table. Questions such as finding the level of a node or all the nodes that are below a given node can be translated into relatively simple SQL statements by applying predicates to the proper columns.

Example 1: Find all the nodes that are at LEVEL=3.

Select Node_ID From hier_tab_flat Where LEVEL=3;

Example 2: Find all the nodes that are below node=88063633.

This requires two logical steps (which can be handled in a front-end application to generate the proper SQL).

Step 2.1. Find the LEVEL of node= 88063633 (which is 3).

Select LEVEL From hier_tab_flat Where Node_ID=88063633;

Step 2.2. Apply predicates to the column LEVEL_3_ID:

Select Node_ID From hier_tab_flat Where LEVEL_3_ID = 88063633;

Complex business conditions such as finding all the nodes belonging to node=214231509 but excluding the nodes that are headed by node=88063633 can now be translated into the following SQL:

Select Node_ID
From hier_tab_flat
Where LEVEL_2_ID = 214231509
And LEVEL_3_ID <> 88063633;
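To see these queries run end to end, here is a small sketch using Python's built-in sqlite3 module as a stand-in for Vertica. Only node IDs 1, 214231509 and 88063633 come from the text; the other nodes (2, 500, 600, 700) and their positions are made up so that each query has something to return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hier_tab_flat ("
    "Node_ID INTEGER, LEVEL INTEGER, "
    "LEVEL_1_ID INTEGER, LEVEL_2_ID INTEGER, "
    "LEVEL_3_ID INTEGER, LEVEL_4_ID INTEGER)"
)
# Columns: Node_ID, LEVEL, LEVEL_1_ID .. LEVEL_4_ID (None where unused).
rows = [
    (1,         1, 1, None,      None,     None),
    (2,         2, 1, 2,         None,     None),   # hypothetical
    (214231509, 2, 1, 214231509, None,     None),
    (88063633,  3, 1, 214231509, 88063633, None),
    (500,       3, 1, 214231509, 500,      None),   # hypothetical
    (600,       4, 1, 214231509, 88063633, 600),    # hypothetical
    (700,       4, 1, 214231509, 500,      700),    # hypothetical
]
conn.executemany("INSERT INTO hier_tab_flat VALUES (?, ?, ?, ?, ?, ?)", rows)

# Step 2.1: look up the level of the node.
(level,) = conn.execute(
    "SELECT LEVEL FROM hier_tab_flat WHERE Node_ID = ?", (88063633,)
).fetchone()

# Step 2.2: build the column name from the level, then filter on it.
# A column name cannot be a bound parameter, so it is formatted in;
# level is an integer read from the table, not user input.
subtree = [
    r[0]
    for r in conn.execute(
        f"SELECT Node_ID FROM hier_tab_flat "
        f"WHERE LEVEL_{level}_ID = ? ORDER BY Node_ID",
        (88063633,),
    )
]

# The exclusion query from the text.
remaining = [
    r[0]
    for r in conn.execute(
        "SELECT Node_ID FROM hier_tab_flat "
        "WHERE LEVEL_2_ID = 214231509 AND LEVEL_3_ID <> 88063633 "
        "ORDER BY Node_ID"
    )
]
print(level, subtree, remaining)  # 3 [600, 88063633] [500, 700]
```

Two details show up in this sketch. The Step 2.2 predicate also returns node 88063633 itself, because each node's own ID sits in the column for its level; add a Node_ID <> 88063633 predicate if only descendants are wanted. And because a comparison with NULL is never true, the exclusion query skips rows whose LEVEL_3_ID is empty, so node 214231509 itself is not in the result.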

By invoking the script that flattens one hierarchy repeatedly, you can also flatten a table with multiple hierarchies using the same design. With this flattened table in your Vertica tool box, you can climb up and down any hierarchical tree using nothing but SQL.

Po Hong is a senior pre-sales engineer in HP Vertica’s Corporate Systems Engineering (CSE) group with a broad range of experience in various relational databases such as Vertica, Neoview, Teradata and Oracle.

Our Users Validate the Value of Vertica

We recently allowed TechValidate, a trusted authority for creating customer evidence content, to survey the HP Vertica customer base. The firm reached out to nearly 200 customers across a variety of industries and came back with some extremely powerful results.

From the financial benefits to the performance advantages, the benefits of the HP Vertica Analytics platform were repeatedly and clearly detailed by our customers.

A sampling of some of the comments and results can be found below, but to see the full results set click here.

HP Vertica Software Rocks
HP Vertica Software - the best in the market
Query performance increased by 100-500% or more

HP Vertica customers have achieved a wide range of benefits
Majority of Vertica users saved $100-500K or more

How MZI HealthCare identifies big data patient productivity gems using HP Vertica

As part of our continuing podcast series, Dana Gardner, president and principal analyst for Interarbor Solutions, recently conducted an interview with Greg Gootee, product manager at MZI HealthCare.   MZI HealthCare develops and provides sophisticated software solutions that are flexible, reliable, cost effective, and help reduce the complexities of the healthcare industry.

In a post on ZDNet, Dana shares some of the highlights from his podcast with Greg Gootee:

Doctors make informed decisions from their experience and the data that they have. So it’s critical that they can actually see all the information that’s available to them.

The other critical thing was speed, being able to deliver high-end analytics at the point of care, instead of two or three months later, and Vertica really produced. In fact, we did a proof of concept with them. It was almost unbelievable some of the queries that ran and the speed at which that data came back to us.

The ability to expand and scale the Vertica system along with the scalability that we get with the Amazon allows us to deliver that information. No matter what type of queries we’re getting, we can expand that automatically. We can grow that need, and it really makes a large difference in how we could be competitive in the marketplace.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.
