Vertica was born in the age of the enterprise data warehouse (EDW) appliance, in 2005. At the time, the EDW appliance was a great idea – a relatively simple bundle of hardware and software. You plugged it in, and it worked. This single box came with a lot of services, too.
But it also came with some issues.
One was the difficulty of integrating with the innovations that were rapidly arriving from software vendors and open source projects, and with the tools built for the new compliance mandates of the day. Plus, if you wanted to upgrade your appliance, your only choice was to accept whatever the vendor provided in the next release, including the often steep price and seemingly arbitrary feature set. Once you’d made the investment, you couldn’t say, “Sorry, your CPU is too expensive,” or “I don’t need all these features.” You were stuck.
When Vertica was created, we determined to solve much of that problem by making it an independent, software-only offering. We believed our customers should be able to take advantage of whatever underlying infrastructure works for them. We described Vertica as running on “commodity hardware,” and when we were acquired by HP in 2011, we called it “industry standard hardware,” with a nod to our new parent company’s PC business.
The point was, you should be able to run Vertica on the hardware you preferred. You like Lenovo? Dell? HP? It’s up to you. And when the public cloud came along, we decided to stay true to those roots – to remain infrastructure independent, and ensure that Vertica could run on AWS, Google, and Azure.
Soon, we realized that cloud architecture was fundamentally changing the game. Here’s a little history.
The evolving architecture behind analytics
First-generation data centers were built from servers with their own CPUs and directly attached disks. So the earliest versions of Vertica were deployed to servers, with a database design that assumed computing power tightly coupled to storage.
But the public cloud offered a more flexible model, with compute and communal storage sold separately. This had enormous advantages for you, as a cloud customer. If you needed to increase compute capacity but your data storage requirements hadn’t changed, you didn’t need to buy excess storage. Perhaps more commonly, you needed greater storage capacity but no additional computing power – after all, most people weren’t running complex queries requiring millisecond response times back then.
This meant we had an interesting opportunity in designing the next major version of Vertica: What if we separated compute from storage, and charged differently for the new version?
But we said no to that idea. We wanted Vertica to remain a single code base, while letting customers choose which mode to use when deploying the software. For instance, if they already had multiple HP DL380s in their data center, they could choose the deployment mode that assumes tightly coupled storage, compute, and servers. That’s what we called Enterprise Mode.
But that same piece of code could be deployed in Eon Mode, which separated compute and storage, and it could run in the cloud or in a hybrid environment with both cloud and resources on-premises. We launched Eon Mode in May of 2018, on AWS. Our customers could still choose Enterprise Mode if they preferred, but in either mode, Vertica offered a single code base, with two options for how the software interacts with the underlying infrastructure.
By the way, this two-mode decision didn’t make life easy for our engineers. All our advanced analytical functions, all our machine learning functions, database design, all of it had to be tested to ensure that it worked equally well in both modes. We wanted to be true to our original goal of providing software that is infrastructure independent, with one code base our customers can rely on. I’m happy to say our engineers survived.
What does Eon Mode improve?
Let’s say you’re in the retail industry, with variable, seasonal workloads. You need to run reports each minute to know what products are performing well, and which are not. You’re focused on Black Friday, followed by Cyber Monday, and so forth. During that long and busy weekend, you need lots of compute.
But after the holidays, that activity drops significantly, and you just don’t need that power.
Eon Mode is perfect for that scenario. You can add CPUs to the cluster, turn them on or off, and increase their capacity if you want – without investing in infrastructure that has to be available 365 days a year. The ability to add nodes to and remove nodes from a Vertica cluster in Eon Mode also reduces overhead for the IT team.
Now, if you run 24×7 with continuous sub-millisecond response-time requirements, what Eon Mode offers over Enterprise Mode doesn’t matter. But if you have variable workloads, it does.
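To put rough numbers on the seasonal scenario above, here is a small sketch comparing node-hours for a fixed-size cluster (provisioned for the peak all year) against an elastic, Eon-style cluster that scales up only for the busy weeks. All figures – node counts, week counts – are hypothetical, for illustration only; they are not Vertica sizing guidance.

```python
# Hypothetical capacity planning for a seasonal retail workload.
# All numbers are illustrative, not Vertica pricing or sizing guidance.

PEAK_NODES = 24      # nodes needed for the Black Friday / Cyber Monday rush
BASELINE_NODES = 6   # nodes needed the rest of the year
PEAK_WEEKS = 4
TOTAL_WEEKS = 52
HOURS_PER_WEEK = 7 * 24

def fixed_cluster_node_hours():
    # Tightly coupled sizing: provision for the peak, all year long.
    return PEAK_NODES * TOTAL_WEEKS * HOURS_PER_WEEK

def elastic_cluster_node_hours():
    # Eon-style sizing: add nodes for the peak weeks, remove them afterward.
    peak = PEAK_NODES * PEAK_WEEKS * HOURS_PER_WEEK
    off_peak = BASELINE_NODES * (TOTAL_WEEKS - PEAK_WEEKS) * HOURS_PER_WEEK
    return peak + off_peak

fixed = fixed_cluster_node_hours()
elastic = elastic_cluster_node_hours()
print(f"fixed: {fixed} node-hours, elastic: {elastic} node-hours")
print(f"reduction: {1 - elastic / fixed:.0%}")
```

In this toy scenario, the elastic cluster consumes well under half the node-hours of the fixed one; the exact ratio depends entirely on how spiky your workload is.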
Solving a concurrency and performance issue by making load balancing a non-issue
Before Vertica was available in Eon Mode, some of our biggest customers – like Twitter, Uber, and AT&T, who require load balancing and data distribution across the nodes in a cluster – needed ever-larger cluster sizes. To support surging demand for data access from thousands of business analysts, we enabled customers to run up to three duplicate clusters with no impact on their license entitlement, because concurrency without a performance penalty is very important to us. So, for example, if they had 3,000 analysts, they could continue to deliver consistent individual performance for each analyst and satisfy – even exceed – their business requirements.
With Eon Mode, that problem of load balancing across nodes goes away. All the data is in communal storage, so the system doesn’t have to continually balance copies of the data across all the nodes in the cluster.
Eon Mode use cases from our customers
The pressure to make this change came from the constantly increasing demands of real analytics workloads. The Trade Desk, in the Ad Tech space, is currently running 256 nodes in a single Vertica cluster on AWS. They’re addressing 10 petabytes of data in communal storage. And 256 nodes is just the beginning for them. They’re planning to hit over 300 next year.
Here’s another example. One of the largest global gaming companies must comply with the “right to be forgotten” stipulation of the General Data Protection Regulation (GDPR) in Europe. Yet as part of their business model, they need to keep a large volume of data on every user – it’s what allows them to deliver customized experiences in several of their most popular games. When a given user requests that their data be deleted, the company has a limited window of time to comply. Yes, they need to maintain many, many petabytes of data. But to handle those “right to be forgotten” requests, they really don’t need a huge amount of compute.
Eon Mode is ideal for this scenario. You can ramp up compute in response to a peak in user demand, then spin it down when demand diminishes. For this gaming company, that means significant savings in both money and operational complexity.
The ROI of Eon Mode
Another big driver was cloud economics. In the cloud, you pay for whatever you’ve provisioned – CPU capacity, storage – whether you use it or not. Eon Mode helps with that. It allows much more operational efficiency, including:
- Add nodes easily
- Remove nodes
- Resize the cluster
- Add more data
- Allocate certain nodes via subclustering
- Achieve workload isolation for the marketing team vs. the finance team vs. the data scientists, whose compute requirements can heavily impact everyone else’s work
Consider workload isolation and what it solves. Often, companies have to set up separate databases for their different divisions – a marketing database here, a finance database there, another one just for the data scientists. Much of what they’re storing is duplicate data across separate clusters, which means setting up data pipelines and loading the same data into multiple clusters. That’s not only a pain; it’s genuinely hard to keep it all consistent.
But with communal storage, you’re able to customize your compute to the specific needs of your use cases. You have one source, one cluster, and you’re able to allocate compute nodes – i.e., workload isolation using subclustering – to each of those divisions. They don’t compete with each other, yet they all have access to the same data. This is powerful in terms of ROI, and it makes you more efficient in managing your analytics platform.
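The idea above can be sketched abstractly: several pools of compute, one per team, all reading a single shared copy of the data. The toy Python model below illustrates the concept only – the class and method names (`CommunalStore`, `Subcluster`, and so on) are invented for this sketch and are not Vertica’s API.

```python
# Toy model of workload isolation via subclusters over communal storage.
# All names here are invented for illustration; this is not the Vertica API.

class CommunalStore:
    """A single shared copy of the data, visible to every subcluster."""
    def __init__(self, tables):
        self.tables = tables  # table name -> rows

class Subcluster:
    """An isolated pool of compute serving one team."""
    def __init__(self, name, node_count, store):
        self.name = name
        self.node_count = node_count
        self.store = store
        self.queries_run = 0  # query load lands only on this subcluster

    def query(self, table):
        self.queries_run += 1
        return self.store.tables[table]

store = CommunalStore({"sales": [("sku-1", 100), ("sku-2", 250)]})
marketing = Subcluster("marketing", node_count=3, store=store)
finance = Subcluster("finance", node_count=2, store=store)

marketing.query("sales")          # marketing's work...
assert marketing.queries_run == 1
assert finance.queries_run == 0   # ...puts no load on finance's compute
# Yet both subclusters see the same single copy of the data:
assert marketing.query("sales") == finance.query("sales")
```

The design point the sketch captures: because the data lives in one communal store rather than being copied to each cluster, isolating a team's compute no longer requires duplicating (and re-synchronizing) the data itself.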