VerticaPy reaches a milestone at 100 stars

The Vertica team is happy to share a milestone in our “VerticaPy journey”: we just reached 100 stars on our GitHub repo, and the count is growing every day. (“Repo” is short for “repository,” for those of you unfamiliar with GitHub.) Repos accumulate stars as an indication of user interest – think of them as bookmarks in a user’s profile. The more stars, the more evidence of a repo’s popularity and value to the community.

VerticaPy started back in 2018 as an open-source project to support the Vertica community’s Python users. “The idea behind VerticaPy is quite simple: Combine the scalability of Vertica with the flexibility of Python,” explains Vertica chief data scientist Badr Ouali. “For a while, we took one star at a time. But in these past few months, we have seen an increase in interest in our amazing Python API for Vertica Data Science at Scale.”

The development work that has gone into the VerticaPy repository is all based on a simple principle: make data science easier and more performant.

But as most GitHub enthusiasts know, it is not easy to gain wide adoption at first. Most successful GitHub projects are plugins or add-ons for technologies that are already widely used. Building something new and creating a new community can take years. Still, Ouali feels that since adoption has increased in the past months, it will probably continue to increase into the next year, “especially as the VerticaPy team works to make the software easier to install and to use,” he says.

“I am very excited to be part of this amazing effort to democratize data science,” says data science developer Umar Farooq Ghumman, who has contributed significantly to the VerticaPy project. “This is an amazing project which is simplifying many complex data science tasks. We will continue to explore opportunities to keep making things simpler and user-friendly.”

Ghumman believes that VerticaPy has potential even beyond Vertica, because it incorporates some of the other Python libraries. “As new users come into Python, VerticaPy will remove friction for them as they start their journey into data science. These users could be new or entry-level employees in our customer organizations, and they could even be students and researchers working with data.”

VerticaPy offers many types of algorithms: classification algorithms like Random Forest or XGBoost, regression algorithms like Linear Regression or SVM, clustering algorithms like KMeans or Bisecting KMeans, anomaly detection algorithms such as Isolation Forest and Global ZScore, and time series modeling with ARIMA.

“It is a complete statistical package with everything for ML,” says Badr Ouali. “That includes data preparation (time series / geospatial joins, pattern matching, missing values imputation, and much more) and even data exploration (integration with Matplotlib and Highcharts).”

Find the VerticaPy repo here.

About the Author

Mike Perrow
Senior Product Marketing Writer/Editor

Mike Perrow has 25 years of experience as a writer and editor in the software industry, having worked at Powersoft, Sybase, Rational Software, IBM, Hewlett Packard Enterprise, and most recently Micro Focus. He has an extensive background in developing and implementing web and print media, and was the founding editor of The Rational Edge ezine, which was published from 2000 to 2008. He is the co-author, with Kurt Bittner and Walker Royce, of The Economics of Iterative Software Development, Addison-Wesley, 2008.
