Recently, as I always do, I was reading about some emerging trends in our industry and a headline popped out at me — did you know that time series databases are one of the fastest growing specialty database platforms? That was a surprise to me, so I did a little more research.
Time After Time, Data Grows
Let’s first discuss this time series thing. After all, most data relevant to this topic has a “time stamp” already. Think about every time you log into your Amazon shopping account, or every time I hit my 12K daily step goal, or every time my mom asks Alexa what time it is in the middle of the night. Every one of those transactions is stamped with the time it occurred. But here comes the tricky part. When the next transaction is received, in many cases the existing record is simply updated with a “last-login” timestamp. This is a logical approach because if you stored every transaction as a net new row for every user, and you’re talking about a lot of transactions, your data size is going to grow quickly and things can get messy.
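To make that trade-off concrete, here is a minimal Python sketch of the two storage approaches. The user ID, the field names, and the login timestamps are all made up for illustration; this is not how any particular product stores its data.

```python
from datetime import datetime

# Hypothetical login events for one user (illustrative data only).
logins = [
    datetime(2019, 5, 1, 8, 0),
    datetime(2019, 5, 1, 21, 30),
    datetime(2019, 5, 2, 2, 15),
]

# Approach 1: keep one record per user and overwrite "last_login".
# Storage stays small, but every earlier login is lost.
last_login = {}
for ts in logins:
    last_login["user_42"] = ts

# Approach 2: append every event as a new record keyed by time.
# Storage grows with each transaction, but the full history survives.
event_log = []
for ts in logins:
    event_log.append({"user": "user_42", "event": "login", "ts": ts})

print(last_login["user_42"])  # only the most recent login remains
print(len(event_log))         # one record per login
```

The first approach answers “when did this user last log in?” and nothing more. The second keeps the history that time series analysis needs, at the cost of the data growth described above.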
And Feeds IoT Analytical Use Cases
But when a company treats time as a primary axis, not as an attribute of something else, a whole new set of insights is possible. Think about that example of my mom’s use of Alexa in the middle of the night. Amazon doesn’t just want to understand her behavior; it wants to see how her usage increases or decreases over time, bucket other users who have similar Alexa usage patterns, and determine the demographics and more. The use case gets even more compelling when you think of sensor data that is constantly sending information about smart meters, or tracking vehicles and physical containers, or monitoring shop floor manufacturing.
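As a rough illustration of treating time as the primary axis, the Python sketch below buckets a handful of made-up request timestamps by hour and by day. The data, the function name, and the “before 5 a.m. counts as night” cutoff are my own assumptions, and this is plain Python rather than Vertica’s time series functions.

```python
from collections import Counter
from datetime import datetime

# Hypothetical timestamps of voice-assistant requests (illustrative only).
events = [
    datetime(2019, 5, 1, 2, 10),
    datetime(2019, 5, 1, 2, 40),
    datetime(2019, 5, 1, 14, 5),
    datetime(2019, 5, 2, 2, 20),
    datetime(2019, 5, 2, 3, 0),
    datetime(2019, 5, 2, 3, 45),
]

def bucket_by_hour(timestamps):
    """Count events per one-hour bucket by truncating each timestamp to the hour."""
    return Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )

hourly = bucket_by_hour(events)

# Night-time usage per day, assuming "night" means before 5 a.m.
night_by_day = Counter(ts.date() for ts in events if ts.hour < 5)

for day in sorted(night_by_day):
    print(day, night_by_day[day])
```

With only a last-activity timestamp per user, neither the hourly buckets nor the day-over-day trend could be computed, because the individual events would no longer exist.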
Do Companies Really Need Yet Another Specialty Database?
The most compelling justification for a dedicated specialty time series database, according to my research, is performance and scale. Time series data, especially sensor data, means a lot of data. But I have to confess, I actually laughed out loud when I read one article that gave an example of “a single connected car will collect 25GB per hour.” I immediately thought about Facebook in 2014 talking about how it was loading 35 TERABYTES an hour into Vertica. And, yes, that was 5 years ago! Admittedly, auto manufacturers like Tesla (and the insurance companies who crave this data) will be looking at more than one car, but Vertica is built for performance at scale. Vertica may not fit every use case, but it will fit a lot of them without the need for yet another tool in the already overflowing big data junk trunk.
Vertica is Also a Time Series Database
But after I laughed … I sighed. My mind immediately flipped to a dinner conversation I had with a long-time Vertica customer who was complaining that if only Vertica had time series analytics, they could FINALLY get rid of their NoSQL database. I calmly (and that was hard!) said, “Vertica has time series, geospatial, pattern matching, and even in-database machine learning.” And the customer said, “What? No way! How did I not know?”
Data Analysts Become Rock Stars with Vertica
Here’s the answer, in three parts. First, sometimes we forget to tell people (repeat, repeat, repeat) about the full capabilities tucked inside the amazing Vertica Advanced Analytics Platform, especially when they are not directly relevant to an active PoC or the current use case. Second, we usually work most directly with the DBA rather than with the business analyst community, who are the actual users of advanced analytical functions like time series. Third, and this is the one I want us all to remember, most popular visualization and BI reporting tools like Tableau, Qlik, Power BI, and more do NOT expose Vertica’s advanced analytics functions. If a business analyst community is using Tableau, then they may not know about the full value of Vertica.
To address this issue, we wrote a very compelling (and fun-to-read) whitepaper called “How Business Analysts become Rock Stars with Vertica”. Equally important, we should ask every customer whether they have time series use cases, geospatial use cases, pattern matching use cases, and machine learning projects. This matters especially to our customers because some of the newbies on the block, like Snowflake and Redshift, don’t have advanced analytic functions like Vertica’s. They can, however, connect to specialty platforms that charge separately for this functionality.
Spread the Word – Vertica is Purpose Built for Time Series!
The vast majority of Vertica users are already using Vertica to store time series data, even if they are not applying specific time series functions from Vertica’s analytics library. Retail customers who collect sales data over time are collecting time series data. Network/telecom operators storing network traffic metadata are collecting time series data. Any log data is time series. So, essentially, any data that has time as one of its axes can be categorized as time series. We have the opportunity to help with more use cases and expand the business value. The time for time series is now!
Learn more: Vertica Time Series Analytics