In Vertica 8.1.1, we introduce new functionality including:
• Supported platform updates
• Machine learning updates
• Management Console enhancements
• Apache Hadoop, Apache Kafka, and Apache Spark integration updates
• Database management improvements
• Workload management
• Table data management updates
• SQL functions and statements updates
• Backup and restore additions
Supported Platform Updates
In each Vertica release, we try to expand our list of supported platforms. In this release, we have added support for Apache Spark version 2.1, OEL 6.8, and Linux Volume Manager on all supported operating systems.
For all client drivers, we’ve added support for OEL 6.8 and SUSE 12 SP2. For a more comprehensive list of changes, see Vertica 8.1.x Supported Platforms.
Machine Learning Updates
In this release, we introduce two new algorithms: SVM regression and random forest for classification. Use SVM for regression to predict continuous ordered variables, in cases like pattern recognition. For a complete example, see Building an SVM for Regression Model in the Vertica documentation.
Use random forest to create decision trees for applications that include financial analysis and predicting genetic outcomes. For a complete example, see Classifying Data Using Random Forest in the Vertica documentation.
In addition to these algorithms, we’ve expanded the functionality of the predictive analytics package. You can now detect outliers by group, using the PARTITION BY clause, and we’ve added support for extracting model attributes. We also have enhanced the BALANCE function to include support for imbalanced data processing using a hybrid method of sampling that combines under-sampling and over-sampling.
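To make the idea of hybrid sampling concrete, here is a minimal Python sketch of the general technique: classes larger than a target size are under-sampled and smaller classes are over-sampled. It is not the implementation behind BALANCE; the function name `hybrid_balance`, the `label_of` callback, and the choice of the mean class size as the target are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def hybrid_balance(rows, label_of, seed=0):
    """Return a resampled copy of rows in which every class has the same size.

    Hypothetical helper: illustrates hybrid sampling in general,
    not how Vertica's BALANCE function works internally.
    """
    if not rows:
        return []
    rng = random.Random(seed)

    # Group the rows by class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)

    # Assumption of this sketch: aim for the mean class size, rounded down.
    target = len(rows) // len(by_class)

    balanced = []
    for members in by_class.values():
        if len(members) >= target:
            # Under-sample: draw `target` rows without replacement.
            balanced.extend(rng.sample(members, target))
        else:
            # Over-sample: keep every row, then draw extras with replacement.
            balanced.extend(members)
            balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# 90 rows of class 0 and 10 rows of class 1.
rows = [("a", 0)] * 90 + [("b", 1)] * 10
balanced = hybrid_balance(rows, label_of=lambda r: r[1])
class_counts = Counter(label for _, label in balanced)
```

In this example the majority class shrinks and the minority class grows until both reach the same size, so neither pure under-sampling (which discards data) nor pure over-sampling (which duplicates heavily) is applied on its own.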
For more information, see Machine Learning for Predictive Analytics in the Vertica documentation and stay tuned for our What’s New in 8.1.1: Machine Learning post!
Management Console Enhancements
You can now use MC to run SQL queries on your Vertica database using the new MC Query Runner. You can enter query text, import a SQL script, or rerun previous queries. For more information, see Running Queries in Management Console and stay tuned for our What’s New in 8.1.1: Management Console post!
Apache Hadoop, Apache Kafka, and Apache Spark Integration Updates
We’ve added multiple improvements to our Vertica and Hadoop integration! In this release, you can export Parquet data to share with Hadoop-based applications, and we’ve added support for Cloudera Manager integration. In addition, the HCatalog Connector now uses HiveServer2 to read Hive metadata, rather than using the WebHCat web service.
If you’re integrating Vertica and Apache Kafka, you can now use the --offset parameter to set the offset at which a microbatch begins. This means you can skip older data in the queue! In addition, you can now set a specific offset in streaming, rather than waiting for KafkaSource to reach the end of a stream.
We’re also excited to announce that the Vertica Connector for Apache Spark now supports Spark 2.1. You can also tell the connector to use Parquet format for your intermediate data files.
Database Management Improvements
We’ve added an easy way for you to map new IP addresses of hosts and databases in a cluster. For example, this is useful if you are using Vertica in the cloud where you may not have control over when IP addresses change.
Workload Management
A session socket can occasionally be blocked indefinitely while awaiting client input or output for a given query. You can now set a grace period to handle session socket blocking, at the session, user, node, and database levels. If a socket is blocked for a continuous period that exceeds the grace period setting, the server shuts down the socket and throws a fatal error. The session is then terminated. For details, see Handling Session Socket Blocking in the Vertica documentation.
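The key detail in the grace-period rule is the word "continuous": any progress on the socket resets the clock. The Python sketch below models only that decision logic. The class `BlockedSocketMonitor` and its `observe` method are hypothetical names invented for illustration and do not correspond to any Vertica API or configuration parameter.

```python
class BlockedSocketMonitor:
    """Tracks how long a session socket has been continuously blocked.

    Hypothetical model of the rule described above, not Vertica code.
    """

    def __init__(self, grace_period_s):
        self.grace_period_s = grace_period_s
        self.blocked_since = None  # time at which the current blocked spell began

    def observe(self, now_s, is_blocked):
        """Record one observation; return True if the session should be terminated."""
        if not is_blocked:
            # The socket made progress, so the continuous blocked period ends.
            self.blocked_since = None
            return False
        if self.blocked_since is None:
            self.blocked_since = now_s
        # Terminate only when the continuous blocked time exceeds the grace period.
        return now_s - self.blocked_since > self.grace_period_s


monitor = BlockedSocketMonitor(grace_period_s=60)
decisions = [
    monitor.observe(0, True),     # blocked spell starts
    monitor.observe(50, True),    # 50 s blocked, within the grace period
    monitor.observe(55, False),   # progress resets the clock
    monitor.observe(100, True),   # a new blocked spell starts
    monitor.observe(161, True),   # 61 s of continuous blocking
]
```

Only the last observation reports that the session should be terminated, because the earlier blocked spell was interrupted before it reached the grace period.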
Table Data Management Updates
Vertica now requires that all projections of a given table be in the same schema as that table. On upgrade to this version, Vertica checks whether all projections are in the same schema as their respective anchor tables. If not, Vertica moves the projections to the appropriate schema, resolving name conflicts as needed. When you move a table to a different schema, Vertica automatically moves all projections that are anchored to the source table to the destination schema.
Support will continue for specifying schemas in statements and functions, to ensure backward compatibility with existing scripts.
SQL Functions and Statements Updates
MERGE functionality has been extended to support specifying views and subqueries as merge source data. The MERGE statement’s USING clause can now:
• Specify a view in the same way as a table. Vertica expands the view name to the query that it encapsulates, and uses the result set as the merge source data.
• Specify a subquery. Vertica executes the query and uses the result set as the merge source data.
For details, see MERGE Source Options in the Vertica documentation.
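The following Python sketch illustrates the idea of merging from a view rather than a table. It does not run against Vertica and does not use the MERGE statement: SQLite has no MERGE, so the sketch swaps in SQLite's INSERT ... ON CONFLICT upsert (SQLite 3.24 or later) to get the same update-or-insert effect. The table and view names (`target`, `staging`, `valid_staging`) are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target (id INTEGER PRIMARY KEY, qty INTEGER);
    CREATE TABLE staging (id INTEGER, qty INTEGER, valid INTEGER);
    INSERT INTO target VALUES (1, 10), (2, 20);
    INSERT INTO staging VALUES (2, 25, 1), (3, 30, 1), (4, 40, 0);
    -- The view encapsulates a query; its result set is the merge source.
    CREATE VIEW valid_staging AS
        SELECT id, qty FROM staging WHERE valid = 1;
""")

# Matched ids are updated and unmatched ids are inserted.
# "WHERE 1" is required by SQLite to parse an upsert that reads from a SELECT.
conn.execute("""
    INSERT INTO target (id, qty)
    SELECT id, qty FROM valid_staging WHERE 1
    ON CONFLICT(id) DO UPDATE SET qty = excluded.qty
""")

merged_rows = conn.execute("SELECT id, qty FROM target ORDER BY id").fetchall()
conn.close()
```

Row 2 is updated, row 3 is inserted, and row 4 never reaches the target because the view filters it out. Replacing the view name with the view's own SELECT in parentheses gives the subquery form of the same merge source.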
You can qualify one or more tables in a query with the hint EARLY_MATERIALIZATION. This hint can be useful in cases where late materialization of join inputs precludes other optimizations—for example, pushing aggregation down the joins or using live aggregate projections.
Backup and Restore Additions
In this release, Vertica supports backups on Amazon S3 cloud storage. You can create the backups from your cluster or Amazon virtual servers. We’ve also expanded support to show all backups in a specific location, regardless of which configuration file put them there. You can see backups based on a specific configuration file, and view a JSON-delimited list of all backups stored on the hosts in your cluster.
For a full list of new features, see the New Features Guide. And stay tuned for our What’s New series where we’ll give you an in-depth look into our newest features!