VerticaPy Unify 2022 Sessions

Posted June 21, 2022 by Badr Ouali, Head of Data Science

Vertica Unify 2022 is a great time to learn about Vertica, its new features, and best practices. To complement the many great presentations at Vertica Unify 2022 both in Boston and Paris, I’m very excited to present two sessions: one on VerticaPy best practices, and another general session on VerticaPy and its features. VerticaPy is a Python layer at the top of Vertica that leverages hundreds of abstractions to simplify and automate the process of analyzing your data.

Best Practices Session

For a moment, just imagine trying to write SQL to compute thousands of aggregations at once – I don’t know anyone who has time for that. To address this, VerticaPy can generate very deep queries to compute many aggregations at the same time. This would normally consume a fair amount of Vertica resources, but the presentation will discuss how we can offset that by, for example, instructing VerticaPy to send many small queries instead. We’ll go over this and more at the Best Practices session. Other highlights include:

  • Advantages of normalization when working with iterative models; many data scientists normalize their data almost out of habit, but without understanding exactly why.
  • Comparing the performance of seeded and seedless randomization.
  • Picking the right variables.

If you want to learn the best ways to maximize performance with VerticaPy, the Best Practices session is for you.

General Session

In the general session, you’ll see the power of VerticaPy SQL Magic, which lets you write SQL queries using Python variables and even the pandas.DataFrame. We’ll also go over time series features like gap filling and interpolation, geospatial features like spatial joins, and more.

If you’re not familiar with the many features of VerticaPy, its tight integration with Python, and support for machine learning libraries, the general session offers the perfect introduction.