Unless you have been hiding out in Jurassic Park, you’ve probably heard that most of the IBM PureData System models, aka Netezza, are going the way of the dinosaur this summer. With the announcement of the end of support for one of the first data warehouse appliances, it’s time to look at where to go next. Data warehouse appliances were once the standard in the industry for performance and ease of use, but times have changed. Software-only analytics databases have surpassed their hardware-bundled cousins in a variety of ways such as price per performance and avoiding vendor lock-in. No one wants to get stuck with the technology equivalent of a dinosaur. So, what should you get to replace Netezza? Should you buy another appliance, or buy a software-only analytics database like Vertica?
Well, you’re on the Vertica blog. My opinion is pretty easy to guess, but I’ve got some really good reasons for that opinion.
Appliances vs. Software-Only Databases
Let’s look at some of the realities of both solutions. Data warehouse appliances started with some good ideas. Like: What if we made a database specifically tuned for analytics, and made it easy to implement? We could do that by installing all the software, doing the database design and tuning up front, and putting it all on some really nice proprietary hardware that the software could take advantage of. Just bring in this box, plug it in, load your data, and voila – fast analytics.
Sounds great. Brilliant idea. But, there’s always a certain amount of disconnect between idea and implementation.
The first problem is that ideal database design varies according to how you use it. One size does not fit all. Speaking of size, you have to start out with a great big expensive box, even if you don’t need one that big, because they only come in certain size increments, and you can’t use one too small. If you need to move up to the next size because business is doing well, then you have to make another big investment and jump up another big size increment. Each of these jumps requires downtime while you forklift data from the old box to the new box. And the folks making these magic analytics boxes really are proud of those things. They require a big chunk of money up front, and a bigger chunk of money each time you go up in size. And, since that box was only made by that company, they pretty much have you locked in, both to their box, and to the things they make that integrate with that box.
A lot of organizations thought those disadvantages were worth the pain for the advantages of high performance and ease of use. Appliances popped up all over the place, battling it out for market share. After all, you can’t beat the performance of software built specifically for its hardware. Right?
Not true anymore. Over time, standard, commodity hardware has evolved, offering greater and greater capabilities at lower and lower cost. Modern commodity hardware has a lot of the same CPU, bus, I/O, memory, and drive speed that specialized hardware of the past had going for it. And, without depending on special hardware to boost its speed, software solutions like Vertica have found ways to take optimum advantage of ANY hardware. Strategies like compressing data, and doing data manipulation operations on that compressed, encrypted data not only give you a huge performance boost, but also provide an extra level of security no matter what hardware or cloud you choose. These smart software efficiencies mean modern software-only analytical databases like Vertica beat the appliance dinosaurs every time when you put them head to head on inexpensive commodity hardware with an equivalent number of CPUs, cores, and memory.
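To make the idea of operating on compressed data a little more concrete, here is a minimal Python sketch using run-length encoding. It is an illustration only, not Vertica's actual implementation: the function names and the sample column are made up for this example. The point it shows is that an aggregate can be computed once per run of repeated values instead of once per row.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def sum_encoded(runs):
    """Sum the column without expanding it: one multiply per run."""
    return sum(value * length for value, length in runs)


def count_where_encoded(runs, predicate):
    """Count matching rows, testing the predicate once per run."""
    return sum(length for value, length in runs if predicate(value))


# Hypothetical sorted column of order quantities (12 rows, 3 distinct values).
column = [1] * 5 + [2] * 3 + [7] * 4
runs = rle_encode(column)

total = sum_encoded(runs)
big_orders = count_where_encoded(runs, lambda v: v > 1)
```

Here twelve rows collapse into three runs, so the sum and the filtered count each touch three entries rather than twelve. On a sorted column with millions of rows and few distinct values, the same trick saves both storage and work.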
In Comes the Cloud
After years of appliances and on-premises data centers roaming the world, along came the Cloud, like a big comet, to disrupt pretty much everything. You can’t fly in the clouds if you’re still dragging around a chunk of iron from the Stone Age. You can only get all the advantages that pushed folks to the Cloud – variable workload, instant compute, pay as you use, no hardware procurement or maintenance – if you have an analytics solution that doesn’t depend on proprietary hardware.
If you have a software-only solution, you have a huge degree of flexibility. You can deploy on-premises with inexpensive commodity hardware, and scale up as your business grows in nice, small affordable increments, without having to shut down the whole system and move everything over at once. You can deploy on the cloud and adjust compute capacity up and down as needed for variable workloads. Or, you can do any combination you want, since all you have to move is bits and bytes, not a box.
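A quick back-of-the-envelope sketch shows why increment size matters. Every number below is hypothetical, chosen only to illustrate the arithmetic, and is not a real appliance or node size. The function rounds demand up to the next purchasable step and reports how much capacity sits idle.

```python
import math


def provisioned(demand_tb, increment_tb):
    """Smallest multiple of increment_tb that covers demand_tb."""
    return math.ceil(demand_tb / increment_tb) * increment_tb


def idle_capacity(demand_tb, increment_tb):
    """Capacity paid for but not needed at the current demand."""
    return provisioned(demand_tb, increment_tb) - demand_tb


# Hypothetical figures: 130 TB of demand, an appliance sold in 96 TB steps,
# and commodity nodes added 8 TB at a time.
demand = 130
appliance_idle = idle_capacity(demand, 96)  # must buy 192 TB
node_idle = idle_capacity(demand, 8)        # must buy 136 TB
```

Under these assumed sizes, the coarse step leaves 62 TB unused while the fine step leaves 6 TB. The smaller the increment, the closer what you pay for tracks what you actually use.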
Letting your data warehouse vendor dictate which cloud you use, or whether you stay on-prem or move to the cloud, or do some combination – that is definitely something that needs to go extinct. Your business requirements are unique to your business, and business requirements should determine where you put your analytics workloads. Don’t let an analytics vendor make these decisions on your behalf.
Easy Does It
But, let’s get back to the advantages of appliances. Ease of use really is a big deal. Designing the architecture of a high-performance analytics database is non-trivial. It takes expert knowledge of data query demands and the advantages of various data architectures, and time, which is its own kind of cost. The ease-of-use advantage alone might make buying a new appliance worth the downsides.
Well, you know I’ve just learned some cool things about Vertica that surprised me, and one of those cool things was the built-in Database Designer. You don’t have to use it, of course. If you want to go ahead and design the database yourself, you can. But the whole ease-of-use appliance advantage of not having to mess with manual database design or tuning is out the window now. Database Designer looks at a few sample queries to see what kind of workload you normally have and designs the database architecture to optimize query speed for that workload. Its built-in AI monitors queries as you use the database, and automatically tweaks and tunes it to make queries even faster as you go along. How’s that for ease of use? You just do your job. The software takes care of design and tuning, and you get better and better performance the longer you use it. Highly customized, automated tuning is a software usability advantage that goes way beyond any pre-built, one-size-fits-all static appliance capability.
Cost per Performance
Despite their reputation for outstanding performance, there are actually some aspects of performance where appliances have never really been all that great, and where software has always had a clear advantage. Analytical database performance isn’t just about speed of response on a single query in isolation.
Cost per performance is an aspect that folks don’t often think about until it bites them in the budget. One thing appliance vendors have never been known for is cost efficiency. You can have a massive monster of a computer that crunches through data like a T. rex, but if it can only do that with a maximum of ten users, and it costs half your yearly revenue, that’s not a good solution for your company’s fast-response analytics. Using modern software like Vertica to crunch through the same data at the same or even better speed with hundreds of users on inexpensive hardware, or on someone else’s hardware on the cloud, makes a lot more fiscal sense.
Modern software solutions like Vertica give you all the advantages you used to expect from the appliance dinosaurs (performance, concurrency, scalability, ease of use) without the disadvantages: vendor lock-in, platform restrictions, static one-size-fits-all tuning, high up-front capital investment, expensive incremental forklift upgrades with downtime, and high overall TCO.
If you were an appliance user before, it might be time to consider letting that fossil go, and move into the modern age of analytical database software with Vertica.
More information for IBM PureData Systems/Netezza or other appliance users.