A couple of weeks ago Vertica saw the interns teaming up and working on their own UDx projects. These projects were then presented under the eyes of judges who scored each group on various criteria. One hot project was our Heat Map UDT. After some additional work, the Heat Map is becoming a promising addition to the Vertica UDx family! Let’s get this show on the road!
Columns of data in need of some quick and dirty analysis? Enter Vertica and the Heat Map Transform, a highly extensible and parameterized analysis tool.
Imagine you had a client who came to you for advice on how to improve their popular first person shooter. After some talking you come to the conclusion that the maps may not be well balanced and players simply die too often. If only there was some way to keep them alive longer, and still have an exciting and fast-paced level! A good first step might be to figure out where the dangerous zones are in the maps so that the client may figure out where the maps may need balance changes. Perhaps by balancing out the map, players will start to utilize the level in its entirety and allow for more tactical, strategic play. Conveniently, the client has been logging all of the death and killing locations into his Vertica database. Now how to quickly process this information to get some useful visual results? Easy! Let’s try out the Heat Map UDT from the Vertica Extension Packages GitHub repository.
Thus the first step will be to get the HeatMap User Defined Transform functions from the repository. Go to an empty directory and run the following command to get the extension packages from the GitHub repository:
Now we need to build the HeatMap UDx and install it into Vertica. From the same directory we can now run:
$ cd heatmap_package
$ export VSQL=”vsql -U dbadmin”
$ make install
Which builds the HeatMap binaries and installs the included functions into Vertica server specified by the VSQL connection string.
If we’d like to test the package we can also run:
$ make test
Now that we have the library installed we can try creating a HeatMap from the users data:
>>> SELECT * FROM maps;
event_id | x | y | z | event_type | event_time
>>> SELECT heat_map(x,y USING PARAMETERS xbins=4, ybins=4, bounding_box=True) OVER () AS (x,y,x_t,y_t,c) FROM game_events WHERE event_type = “death”;
x | y | x_t | y_t | heat
-2160.0000 | -2026.0000 | -1044.5000 | -755.5000 | 4
-2160.0000 | -755.5000 | -1044.5000 | 515.0000 | 8
-2160.0000 | 515.0000 | -1044.5000 | 1785.5000 | 36
-2160.0000 | 1785.5000 | -1044.5000 | 3056.0000 | 28
-1044.5000 | -2026.0000 | 71.0000 | -755.5000 | 6
-1044.5000 | -755.5000 | 71.0000 | 515.0000 | 31
-1044.5000 | 515.0000 | 71.0000 | 1785.5000 | 23
This single SQL statement goes through all the death data with all of the speed of Vertica behind it. It takes in any number of rows and then outputs as rows the bins of a 2-D heat map histogram based on the death locations. But while the data in this form is more malleable and usable directly by Vertica, we’d prefer to see this data in a nice visual format. Fortunately the HeatMap UDT comes with a second function for directly graphing the data:
>>> SELECT heat_map_image(x,y,heat) OVER () FROM (SELECT heat_map(x,y) OVER() AS (x,y,heat) FROM game_events WHERE event_type = “death”) AS heat_map;
The result is a .png with opacity values that can be easily overlaid onto the game map to help our client find out what areas are safe and which are not. Both the heat_map function and heat_map_image function have a number of useful optional parameters such as data smoothing and automatic heat map image overlaying. Check out this example image created using data from a game called Counter Strike Source:
>>> SELECT heat_map_image(x,y,heat USING PARAMETERS inf=’/maps/de_dust2.png’’) OVER () FROM (SELECT heat_map(x,y) OVER() AS (x,y,heat) FROM game_events WHERE event_type = “death” AND map = “de_dust2”) AS heat_map;
Pretty Cool! But the image is a little blocky, how about we add some layer of smoothing on top? This is easy by adding in the “gaussian=True” parameter in the heat_map_image UDT. Here is the output now:
>>> SELECT heat_map_image(x,y,heat USING PARAMETERS inf=’/maps/de_dust2.png’’, gaussian=True) OVER () FROM (SELECT heat_map(x,y) OVER() AS (x,y,heat) FROM game_events WHERE event_type = “death” AND map = “de_dust2”) AS heat_map;
Finally we can spice things up by having a different color scheme than just the boring red – time to use the “contour” parameter! Now take a look:
>>> SELECT heat_map_image(x,y,heat USING PARAMETERS inf=’/maps/de_dust2.png’’, gaussian=True, contour=True) OVER () FROM (SELECT heat_map(x,y) OVER() AS (x,y,heat) FROM game_events WHERE event_type = “death” AND map = “de_dust2”) AS heat_map;
Now with a single SQL query we can generate images like the ones shown here. A quick glance at the map shows the most active and dangerous areas in the analyzed map. Our client will be able to quickly root out the danger zones and decide what areas need further investigation and testing.