Python API for Vertica Data Science at Scale


VerticaPy is a Python library that exposes scikit-like functionality for conducting data science projects on data stored in Vertica, taking advantage of Vertica’s speed and built-in analytics and machine learning capabilities. It supports the entire data science life cycle, uses a ‘pipeline’ mechanism (called the Virtual Dataframe) to sequence data transformation operations, and offers several options for graphical rendering.
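
To make the ‘pipeline’ idea concrete, here is a minimal sketch of how a lazy, dataframe-like object can record transformations and only turn them into SQL when asked. It is a toy illustration, not VerticaPy's implementation or API: the LazyTable class, its select, where, and to_sql methods, and the public.titanic table with its columns are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class LazyTable:
    """Toy stand-in for a virtual dataframe (hypothetical, not VerticaPy's API).

    It never holds data. It only records the requested steps and
    renders them as a single SQL statement on demand.
    """

    relation: str
    columns: List[Tuple[str, str]] = field(default_factory=list)  # (expression, alias)
    filters: List[str] = field(default_factory=list)

    def select(self, expression: str, alias: str) -> "LazyTable":
        # Record a derived column; nothing is executed here.
        return LazyTable(self.relation, self.columns + [(expression, alias)], list(self.filters))

    def where(self, condition: str) -> "LazyTable":
        # Record a row filter; nothing is executed here.
        return LazyTable(self.relation, list(self.columns), self.filters + [condition])

    def to_sql(self) -> str:
        # Fold every recorded step into one query the database could run.
        cols = ", ".join(f"{expr} AS {alias}" for expr, alias in self.columns) or "*"
        sql = f"SELECT {cols} FROM {self.relation}"
        if self.filters:
            sql += " WHERE " + " AND ".join(f"({cond})" for cond in self.filters)
        return sql


# Hypothetical table and column names, used only for illustration.
pipeline = (
    LazyTable("public.titanic")
    .select("COALESCE(age, 30)", "age_filled")
    .select("AVG(fare) OVER (PARTITION BY pclass)", "avg_fare_by_class")
    .where("survived IS NOT NULL")
)
query = pipeline.to_sql()
print(query)
```

Because each step only appends to the recorded plan, the source table is never modified and no rows are pulled into Python; the database would do the work when the generated query is finally executed.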



VerticaPy blends the scalability of Vertica with the flexibility of Python, bringing a unique and indispensable set of data science tools.

Explore your data.

Prepare your data with advanced feature engineering using analytical functions and moving windows.

Create a model with the highly scalable Vertica ML. Effortlessly build and evaluate models that optimize for efficiency and performance using many of the in-database, scalable ML algorithms.

This all takes place where it should: in your database. By aggregating your data with Vertica, you can build, analyze, and model anything without modifying your data.