
Chart Gallery User Guide
Introduction
The Chart Gallery shows how to create a wide range of charts with the supported libraries: Matplotlib, Highcharts, and Plotly. Each offers its own strengths in visualization and interactivity. The goal is not only to produce attractive charts but also to understand what happens under the hood as they are generated.
This guide explains how charts are generated and how Vertica performs the calculations and aggregations behind them. With that understanding, you can build charts that look good and represent your data accurately.
We also cover parameter tuning. Each chart has parameters that can be adjusted to your requirements, and we walk through these settings so you can make informed decisions about how to tailor your charts.
The Chart Gallery contains detailed examples of the charts you can create with VerticaPy, including bar charts, interactive line plots, and heatmaps.
This guide covers general principles and best practices; the Chart Gallery itself is the best place to see them in action and to experiment.
Switching Between Libraries
VerticaPy provides flexibility by allowing you to choose among different charting libraries: Matplotlib, Highcharts, and Plotly. Depending on your needs and preferences, you can switch between these libraries when creating charts.
Let’s begin by importing VerticaPy.
import verticapy as vp
The following options show how to switch between libraries.
We can switch to using the plotly module.
vp.set_option("plotting_lib", "plotly")
We can switch to using the highcharts module.
vp.set_option("plotting_lib", "highcharts")
We can switch to using the matplotlib module.
vp.set_option("plotting_lib", "matplotlib")
Data Sources and Chart Types
When creating charts with VerticaPy, you have two options:
vDataFrame - The Python Object: vDataFrame is a Python object that simplifies chart creation. VerticaPy automatically generates the SQL queries that fetch the data needed for each chart, so you do not have to write them yourself.
SQL Queries: Alternatively, you can write your own SQL queries directly in Jupyter notebook magic cells, which gives you full control over data retrieval. Once the query has executed, VerticaPy uses the returned results to generate the final chart.
Whether you prefer the convenience of vDataFrame or the control of handwritten SQL queries, VerticaPy lets you visualize your data with either approach.
Let’s also import numpy to create a random dataset.
import numpy as np
Let’s generate a random dataset.
N = 100 # Number of Records
data = vp.vDataFrame({
    "score1": np.random.normal(5, 1, N),
    "score2": np.random.normal(8, 1.5, N),
    "score3": np.random.normal(10, 2, N),
    "category1": [np.random.choice(['A','B','C']) for _ in range(N)],
    "category2": [np.random.choice(['D','E']) for _ in range(N)],
})
In this dataset, we have two categorical columns and three numerical columns. We will use it in both Python and SQL statements.
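Because the columns are drawn at random, every run produces a different dataset. If you want charts that are reproducible between runs, one option is to draw the same columns from a seeded NumPy generator. This is a sketch rather than part of the original example: the make_columns helper and its seed argument are illustrative names, not VerticaPy features.

```python
import numpy as np

def make_columns(n, seed=0):
    """Build the same five columns as above from a seeded generator.

    `make_columns` and `seed` are hypothetical names used for illustration.
    """
    rng = np.random.default_rng(seed)
    return {
        "score1": rng.normal(5, 1, n),
        "score2": rng.normal(8, 1.5, n),
        "score3": rng.normal(10, 2, n),
        "category1": rng.choice(["A", "B", "C"], size=n).tolist(),
        "category2": rng.choice(["D", "E"], size=n).tolist(),
    }

columns = make_columns(100)
# The resulting dictionary has the same shape as the one passed to
# vp.vDataFrame(...) in the example above.
```

Calling make_columns twice with the same seed returns identical values, so the resulting charts stay stable across notebook restarts.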
Drawing a chart using vDataFrames
In Python, the process is straightforward. We can use various vDataFrame methods. For example, to draw a histogram of score1, you can simply call the hist method.
data["score1"].hist()
Drawing a chart using SQL Chart Magic
For SQL users, the chart magic extension allows you to create graphics.
We load the VerticaPy chart extension.
%load_ext verticapy.chart
In Python, the histogram interval h is computed automatically, whereas in SQL you must specify the binning yourself. In magic cells, prefix a Python variable name with the : operator to use its value in the query. Here we assign a value to h.
h = 2
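To make the : substitution concrete, here is a plain-Python sketch of the idea. It is a simplified illustration only and not VerticaPy's implementation: the substitute function is a hypothetical name, and it handles only plain numeric values.

```python
import re

def substitute(query, variables):
    """Replace each :name token with the value of variables[name].

    Hypothetical helper for illustration; not part of VerticaPy.
    """
    def replace(match):
        name = match.group(1)
        if name not in variables:
            return match.group(0)  # leave unknown tokens untouched
        return str(variables[name])

    # The lookbehind skips tokens preceded by ':' or a word character,
    # so SQL casts such as value::INT are left alone.
    return re.sub(r"(?<![:\w]):([A-Za-z_]\w*)", replace, query)

h, N = 2, 100
template = "SELECT FLOOR(score1 / :h) * :h AS score1, COUNT(*) / :N AS density"
print(substitute(template, {"h": h, "N": N}))
```

With h = 2 and N = 100, every :h in the template becomes 2 and :N becomes 100 before the query is sent to the database.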
We write the SQL query using Jupyter magic cells. You can change the type of plot using the -k option.
%%chart -k hist
SELECT
    FLOOR(score1 / :h) * :h AS score1,
    COUNT(*) / :N AS density
FROM :data
GROUP BY 1
ORDER BY 1;
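To see what the query computes, the same aggregation can be reproduced in NumPy: each value is mapped to the lower edge of its bin with floor(x / h) * h, and the row count per bin is divided by the total number of rows. The sketch below uses its own random sample instead of the data object, and the function name binned_density is an illustrative assumption.

```python
import numpy as np

def binned_density(values, h):
    """Mirror FLOOR(x / h) * h with COUNT(*) / N, grouped and ordered by bin."""
    values = np.asarray(values, dtype=float)
    # Lower edge of the bin that each value falls into.
    lower_edges = np.floor(values / h) * h
    # np.unique returns sorted edges, matching GROUP BY 1 ORDER BY 1.
    edges, counts = np.unique(lower_edges, return_counts=True)
    # Share of all rows that fall into each bin.
    return edges, counts / values.size

rng = np.random.default_rng(0)
sample = rng.normal(5, 1, 100)
edges, density = binned_density(sample, h=2)
for edge, share in zip(edges, density):
    print(f"[{edge:g}, {edge + 2:g}): {share:.2f}")
```

Note that the value labelled density here is the fraction of rows per bin, so the values sum to 1 regardless of the bin width h.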