

Data volume, velocity, and, most importantly, differing data formats and incomplete data sets represent an enormous hurdle. The untapped value of sensor data floats around in a Hadoop data lake or swims aimlessly in an Amazon S3 bucket. Data is being collected, but for many companies, that’s where the pipeline ends.
Vertica is uniquely designed to save this important data from drowning. Let’s look into this claim a little deeper. In an HPE Labs benchmark research report focused on a smart metering use case, data from 40 million meters was loaded into Vertica. The load step puts the raw data into a temporary “staging area” in Vertica, where quality issues in the raw data can be identified and repaired. Vertica does this with the gap filling and interpolation (GFI) functionality in our time series analytics. Just as with our end-to-end in-database machine learning workflow, our advanced analytical functions, such as time series, geospatial, and pattern matching (and many more!), all address data preparation and validation.
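To make gap filling and interpolation concrete, here is a minimal Python sketch of the idea: take irregular meter readings and produce one value per fixed time slice, linearly interpolating across any gaps. This illustrates the concept only and is not Vertica's implementation; the function name `gap_fill_linear` and the sample readings are hypothetical.

```python
from datetime import datetime, timedelta

def gap_fill_linear(readings, step):
    """Return one (time, value) pair per `step` between the first and last reading.

    `readings` is a list of (datetime, float) pairs sorted by time.
    Missing slices are filled by linear interpolation between the
    nearest readings on either side. (Hypothetical helper, for illustration.)
    """
    if not readings:
        return []
    filled = []
    end = readings[-1][0]
    i = 0
    t = readings[0][0]
    while t <= end:
        # Advance so that readings[i] is the last reading at or before t.
        while i + 1 < len(readings) and readings[i + 1][0] <= t:
            i += 1
        t0, v0 = readings[i]
        if t0 == t or i + 1 == len(readings):
            # A real reading exists for this slice.
            filled.append((t, v0))
        else:
            # Gap: interpolate between the surrounding readings.
            t1, v1 = readings[i + 1]
            frac = (t - t0) / (t1 - t0)  # timedelta / timedelta gives a float
            filled.append((t, v0 + frac * (v1 - v0)))
        t += step
    return filled

# Sample data: the 00:20 and 00:30 readings never arrived.
base = datetime(2017, 1, 1)
raw = [
    (base, 1.0),
    (base + timedelta(minutes=10), 2.0),
    (base + timedelta(minutes=40), 5.0),
]
series = gap_fill_linear(raw, timedelta(minutes=10))
for ts, value in series:
    print(ts.strftime("%H:%M"), round(value, 3))
```

Inside the database, the same repair step runs on the staged data itself, so the readings do not have to be moved out for cleaning.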
Now, we get to the analytics, and this is where we redefine the concept of “big data.” The HPE Labs benchmark data represents a “month in the life of a utility” in terms of data collection and analysis. The meter data is collected every 10 minutes for 40 million households. For a 31-day month, that means roughly 178 billion readings, or 4,464 times as much data as a utility that collects a reading only once a month. But wait … any intelligent utility will also want to perform trend analytics on that data over the course of months and years, so the benchmark study focused on a decade of data, which brought the data size to 22.8 trillion readings. Remember that utilities cannot bill their clients until they gather these readings on a daily basis from thousands to millions of smart meters. Through these benchmarks and real-world use cases, Vertica proves that it can load and analyze smart meter data faster and at greater scale than any of our competitors. The additional value for utilities, once they account for smart meter tampering or data communication issues, is to understand daily, monthly, and annual energy usage and to accurately predict demand in order to both optimize and protect the power grid.
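The monthly figures above follow directly from the collection interval. A short Python check, using only the numbers quoted in the text (the variable names are ours):

```python
METERS = 40_000_000          # households in the benchmark
INTERVAL_MINUTES = 10        # one reading every 10 minutes
DAYS_IN_MONTH = 31

readings_per_day = (60 // INTERVAL_MINUTES) * 24            # 6 per hour * 24 hours
readings_per_meter_month = readings_per_day * DAYS_IN_MONTH
total_month = readings_per_meter_month * METERS

print(readings_per_meter_month)   # readings per meter per month
print(f"{total_month:,}")         # readings across all meters per month
```

Each meter produces 4,464 readings in a 31-day month, which is where the 4,464X multiplier comes from, and 40 million meters together produce 178,560,000,000 readings.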

The time is now for us to raise this topic with every customer in every industry. There isn’t a single industry or company that doesn’t have the potential for some form of IoT analytics. It’s our job to protect the data from drowning in overflowing data lakes, and to ensure that our customers don’t end up in a history book rather than on the Fortune 1000 list because they failed to derive value from the sensor data they already had. We can help them bring their IoT analytical use cases to market with the peace of mind that comes from performance at massive scale.