Machine Learning

One on One with Davin Potts: 3. Exciting News for Upcoming Python Release 3.8

At the recent Data Day Texas event, I sat down with Davin Potts and had a long conversation about a wide variety of subjects. I divided the conversation into multiple chunks by subject, and have been posting them one chunk at a time. In the first post, we discussed the wide variety of programming languages […]

Data Day Texas: Keep Your Architecture Open and Avoid Mindset Lock-in

Data Day Texas is an event in Austin that was started about nine years ago by an old acquaintance of mine, Lynn Bender, who founded Global DataGeeks. The one big theme that struck me as running through the whole conference was the highly cooperative landscape that has developed between proprietary and open source software, and […]

What You Never Knew About Vertica Could Surprise You

I just started working on the Vertica team. As the “new guy,” my first few weeks of work have been largely about cramming as much Vertica information into my brain as possible in the shortest time possible. I’ve been aware of the Vertica Analytics Platform for a while. I used to work for a competitor. […]

What’s New in Vertica 9.0.1: Machine Learning

This blog post was authored by Soniya Shah. Vertica 9.0.1 introduces new functionality that continues to match our goals for fast-paced development and enhancement of machine learning in Vertica. In this release, we introduce support for random forest for regression, a new statistical summary function, increased support for cross validation, and enhancements for data evaluation. […]

Estimate the Price of Diamonds Using Vertica Machine Learning

This blog post was authored by Vincent Xu. In this blog post, I’ll take you through the exercise I did to estimate the price of a diamond based on its characteristics, using the linear regression algorithm in Vertica. Besides Vertica 9.0, I used Tableau for charting and DbVisualizer as the SQL editor. From this exercise, […]

Machine Learning Mondays: Vertica 9.0 Cheat Sheet

This blog post was authored by Vincent Xu. Vertica 9.0 is out and here is the updated Vertica machine learning cheat sheet. Vertica 9.0 introduces a slew of new machine learning features including one-hot encoding, Lasso regression, cross validation, model import/export, and many more. See the cheat sheet for examples of how to use the […]

What’s New in Vertica 9.0: Machine Learning Enhancements

This blog post was authored by Soniya Shah. Vertica 9.0 introduces new functionality that continues to match our goals for fast-paced development of the existing machine learning functions. In this release, we introduce two new summary functions, support for cross validation, support for one hot encoding, and the ability to import and export your models […]

Compute Engine or Analytical Data Mart for Distributed Machine Learning? Vertica Explains How to Choose

This blog post was authored by Sarah Lemaire. On Tuesday, August 22, the Boston Vertica User Group hosted a late-summer Meetup to talk to attendees about compute engines and data mart applications, and the advantages and disadvantages of both solutions. In the cozy rustic-industrial atmosphere of Commonwealth Market and Restaurant, decorated with recycled wood pallets, […]

What’s New in Vertica 8.1.1: Machine Learning

This blog post was authored by Soniya Shah. Vertica 8.1.1 continues the fast-paced development of machine learning. In this release, we introduce the highly requested random forest algorithm. We extended SVM support to include SVM for regression, in addition to the existing SVM for classification algorithm. L2 regularization was added to both the linear […]

Machine Learning Mondays: How Vertica Implements Efficient and Scalable Machine Learning

This blog post was authored by Vincent Xu. As of Vertica 8.1, Vertica has introduced a set of popular machine learning algorithms, including Linear Regression, Logistic Regression, k-means, Naïve Bayes, and SVM. Based on our recent benchmarks, they run faster than MLlib on Apache Spark. The following chart shows the performance difference between Vertica 8.1.0 […]

Machine Learning Mondays: Data Preparation for Machine Learning in Vertica

This blog post was authored by Vincent Xu. This post is part of our Machine Learning Mondays series. Stay tuned for more! Introduction Machine learning (ML) is an iterative process. From understanding and preparing data to building, testing, and deploying models, every step of the way requires careful examination and manipulation of the data. […]

What’s New in Vertica 8.1: Machine Learning

This blog post was authored by Soniya Shah. Overall, you will notice that Machine Learning for Predictive Analytics, introduced in Vertica 7.2.2, is easier to use in Vertica 8.1, with the addition of several important functions. There are improvements to model management with access control, the ability to save and re-apply normalization parameters, missing value […]

Vertica Machine Learning Series: Logistic Regression

This blog post is based on a white paper authored by Maurizio Felici. What is Logistic Regression? Logistic regression is a popular machine learning algorithm used for binary classification. Logistic regression labels a sample with one of two possible classes, given a set of predictors in the sample. Optionally, the output can be the probability […]

Vertica Machine Learning Series: k-means

The content of this blog is based on a white paper that was authored by Maurizio Felici. What is k-means Clustering? K-means clustering is an unsupervised learning algorithm that clusters data into groups based on their similarity. Using k-means, you can find k clusters of data, represented by centroids. As the user, you select the […]

Machine Learning Series: Linear Regression

The content of this blog is based on a white paper that was authored by Maurizio Felici. This blog post is just one in a series of blog posts about the machine learning algorithms in Vertica. Stay tuned for more! What is Linear Regression? Let’s start with the basics. Linear regression is one of the […]

Watch Machine Learning for Predictive Analytics in Action

Watch this video to learn more about the Vertica Machine Learning for Predictive Analytics features new in 7.2.

Learn More From Your Data with Machine Learning Algorithms

New in Vertica 7.2.2 is the Machine Learning for Predictive Analytics package. This analytics package allows you to use built-in machine learning algorithms on data in your Vertica database. Machine learning algorithms are extremely valuable in data analytics because, as their name suggests, they can learn from your data and provide information about deductive and […]