The winds of change have been blowing strong. This week, Vertica version 9.3 made its big splash. Vertica has a tendency to pack a lot of features even into minor releases, and this is not a minor release. Rather than dive deep, I’m going to skim the surface of the ocean of new features.
Eon Mode Improvements
Eon Mode is Vertica’s deployment profile for separate scaling of compute and storage. It’s all kinds of handy on the cloud for volatile workloads.
- Pure Storage support – As Pure just announced at their recent conference, Vertica in Eon Mode also works like gangbusters on-premises on Pure Storage FlashBlade. Check out the Under the Hood on-demand webinar, A Deep Dive of Vertica and Pure Storage for Fast Object Storage and Analytics, for more details.
- Improvements to sub-cluster management – Spinning up sub-clusters for isolating workloads just got easier. There are no more “fault groups.” Sub-clusters are now first-class citizens.
- Secondary node types – To ensure Vertica runs with at least a minimum of capabilities, it has always checked to make sure at least half the nodes are up and running. Now, in Eon Mode, you may want to spin up a bunch of sub-clusters for particular jobs, then spin them right back down when you’re done. Far more than half the nodes might be ephemeral. Secondary nodes aren’t counted when we check to make sure half your nodes are up and running. They can go up and down all the time with no impact on the database.
- Increased control and tuning of the depot – The depot is a cache of data right next to a sub-cluster’s compute. It’s Vertica’s way of ensuring that, even on public clouds where the network might get busy and slow, you still get awesome query performance. Before, an internal intelligent algorithm chose what data to put in the depot based on the queries that sub-cluster handles. Now, if you know what data you’ll need ahead of time, you can indicate which data you want in the depot right in the management console.
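To make the secondary node idea concrete, here is a minimal Python sketch of a quorum check that ignores secondary nodes. It is an illustration only, not Vertica's implementation: the `Node` class and `has_quorum` function are hypothetical names, and the threshold simply mirrors the "at least half" wording above (the product's actual rule may be stricter).

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    is_primary: bool  # secondary nodes are skipped by the quorum check
    is_up: bool


def has_quorum(nodes):
    """Return True when at least half of the primary nodes are up.

    Hypothetical helper: the threshold follows the wording in the text,
    not a documented Vertica rule.
    """
    primaries = [n for n in nodes if n.is_primary]
    up = sum(1 for n in primaries if n.is_up)
    return len(primaries) > 0 and 2 * up >= len(primaries)


# Three primary nodes (two up) plus four secondary nodes that are all down.
cluster = [
    Node("p1", True, True),
    Node("p2", True, True),
    Node("p3", True, False),
    Node("s1", False, False),
    Node("s2", False, False),
    Node("s3", False, False),
    Node("s4", False, False),
]

quorum_ok = has_quorum(cluster)
```

Only two of the seven nodes are up here, but because the four secondary nodes are not counted, the check still passes on the strength of the primary nodes alone.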
Live Aggregate Projection Improvements
Live Aggregate Projections (LAPs) are essentially running aggregations pre-calculated for you, for aggregate queries you often run. So, when you query, you get an extremely fast, almost instantaneous, response for something that otherwise might have taken significant time. LAPs let you do individual, targeted or personalized analytics efficiently. They’re particularly useful for things like comparing the aggregated smart meter usage for one customer to their neighbors, the aggregated sales for one region to another, or the call drops in one cell tower range to another.
LAPs have been append-only since their introduction. To update or delete rows, you had to drop the LAPs first, then re-add them after the source anchor tables were changed. In support of GDPR and similar regulations, and to help customers stay under their data storage allowances, tables with LAPs needed a more convenient method. Now, in version 9.3, tables with LAPs support UPDATE and DELETE operations without dropping the LAPs. This makes achieving incredible performance on your database even easier and more intuitive.
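For a feel of what keeping a pre-calculated aggregate in step with updates and deletes involves, here is a toy Python sketch. It says nothing about how Vertica does this internally; the `RunningTotals` class and its methods are hypothetical and only illustrate the general idea of adjusting a stored total instead of recomputing it.

```python
from collections import defaultdict


class RunningTotals:
    """Toy per-key running sum that stays correct across deletes and updates."""

    def __init__(self):
        self.rows = {}                   # row_id -> (key, value)
        self.sums = defaultdict(float)   # key -> pre-calculated total

    def insert(self, row_id, key, value):
        self.rows[row_id] = (key, value)
        self.sums[key] += value

    def delete(self, row_id):
        # Subtract the deleted row's contribution from the stored total.
        key, value = self.rows.pop(row_id)
        self.sums[key] -= value

    def update(self, row_id, new_value):
        # An update is treated as a delete followed by an insert.
        key, _ = self.rows[row_id]
        self.delete(row_id)
        self.insert(row_id, key, new_value)

    def total(self, key):
        # Reading the aggregate is a lookup, not a scan over the rows.
        return self.sums[key]


totals = RunningTotals()
totals.insert(1, "north", 10)
totals.insert(2, "north", 5)
totals.insert(3, "south", 7)
north_before = totals.total("north")

totals.delete(2)
north_after_delete = totals.total("north")

totals.update(1, 20)
north_after_update = totals.total("north")
```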
Python Support Improvements
We’ve added some cool things that will expand your ability to use Python in Vertica. For one thing, this version is fully upgraded to Python 3. In addition, we added multi-phase transform function support to our Python SDK. This expands the number of Python analytics and machine learning functions you can use in Vertica. Python functions that first process distributed data on each node and then aggregate across the whole dataset are now supported. We already supported user-defined scalar functions and user-defined transform functions. Now, the Python multi-phase transform capability can be used to implement functions like a correlation matrix or a multi-phase average, for example. Check out the Under the Hood on-demand webinar, The Extensibility of Vertica – User-Defined Extensions in Action, for more details.
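The multi-phase average mentioned above follows a simple pattern, sketched here in plain Python. This deliberately does not use the Vertica Python SDK; the function names `partial_sum_count` and `combine` and the `node_slices` data are hypothetical stand-ins for per-node work and the final aggregation.

```python
def partial_sum_count(values):
    """Phase 1: runs independently on each node's local slice of the data."""
    return (sum(values), len(values))


def combine(partials):
    """Phase 2: merges the per-node (sum, count) pairs into one average."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


# Pretend each inner list lives on a different node.
node_slices = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]

partials = [partial_sum_count(s) for s in node_slices]
average = combine(partials)
```

Passing (sum, count) pairs between phases matters because the slices have different sizes: averaging the three per-node averages here would give about 4.17 rather than the true mean of 3.5.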
Complex Data Type Support Improvements
Vertica is committed to adding full complex data type support in formats like Parquet over the next few months. For this release, we’ve added Struct support. Before, we read structs, but only as expanded columns. Now, you can read them as a single column, preserving their original structure.
We’ve got a blog post series coming to discuss complex data types in much greater detail. You can expect the first post next week!
As an added bonus on this one, you can also export data from Vertica to Parquet.
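To picture the difference between the two ways of reading a struct, here is a small Python sketch that uses nested dictionaries as a stand-in for a Parquet struct. The `expand_struct` helper, the sample row, and the dotted column names are all hypothetical; Vertica's actual column naming may differ.

```python
def expand_struct(row, sep="."):
    """Flatten nested dicts into one entry per leaf field (the 'expanded columns' view)."""
    flat = {}
    for key, value in row.items():
        if isinstance(value, dict):
            # Recurse into the struct and prefix each field with its parent name.
            for sub_key, sub_value in expand_struct(value, sep).items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


# 'customer' is a struct that itself contains an 'address' struct.
row = {
    "id": 7,
    "customer": {"name": "Ada", "address": {"city": "Boston", "zip": "02110"}},
}

expanded = expand_struct(row)      # one column per leaf field
single_column = row["customer"]    # the struct kept whole, structure preserved
```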
A Bunch of Infrastructure Updates and More …
We made lots of upgrades to the underlying Vertica infrastructure. Be sure to check them out before you upgrade your cluster. For more info on the rest of the version 9.3 updates and changes, be sure to see the Vertica 9.3.x New Features and Changes.
And don’t miss the upcoming Under the Hood webinar, Data Preparation for Machine Learning at Scale, on Oct 29, 2019.
It’s smooth sailing from here!