Of Hybrid Data and Ratchet Straps


If you’ve been following my recent blogs, you might recall that I’ve written about hybrid storage and hybrid IT as well as talking more than a little about cars.

Now it’s time for the final chapter in the series, on hybrid data. But first I need to talk about ratchet straps. Yep, ratchet straps!

My other passion, or affliction as my wife calls it, is building stuff. I build most anything, especially if it requires a tool I don’t already have. Said differently, I buy tools. Lots of them.

Recently I went to the big box hardware store to buy some ratchet straps, the things used to secure cargo to a truck, so I could safely transport 70 tons of concrete pavers I had purchased for our new home build. Of course, with all these pavers I now need a paver saw, which I don’t have. Yet.

Like every good engineer, I analyzed ratchet straps – strength, cost, length, attachment type, etc. on various websites before finally finding what I wanted for the right price. Then off I went to the big box hardware store and bought four heavy-duty ratchet straps for $8.97 each. Not very exciting, or so you would think.

Well, something did get very excited. I’m guessing it was an analytics algorithm used by a marketing firm that happened to buy my clickstream data from when I was doing my research, along with information about my ratchet strap purchase.

I’m not sure I’m quite ready for a career change, but I now know that long-haul logistics companies are looking for truck drivers. It also turns out that another big box store thought they had a better deal, and the big online company where I also buy lots of stuff had an offer for tarps. If you are strapping things down, clearly, they should be covered as well.

Let’s explore what transpired when I used my credit card to buy those four ratchet straps.

I inserted my credit card to purchase them, and at that moment living, growing data was born. Information about what I bought, along with some identifying information, was auctioned off to the highest bidder. Real-time analytics were performed to figure out who else might buy my information. My clickstream data was correlated with the purchase, linking my searches to a sale. Several companies purchased that information: one to market my next career in truck driving, others to present offers for more ratchet straps and tarps, and one that apparently concluded that people who purchase ratchet straps are in dire need of little blue pills. Some offers took a few milliseconds to arrive; others took minutes or even hours.

It seems I was suddenly a hot commodity, along with ratchet straps, if only for a moment.

There was a time when a single application owned data forever – for its entire lifecycle, birth to death. One application stored it, reported on it, consolidated it and reconciled it. The application was perceived to have delivered the value.

Today data is hybrid, used by many applications for different reasons over different time horizons.

Data has a life independent of the application. It has left the proverbial nest – and for good reason. Data has the value and tying it to one application traps that value.

Some data is junk, composed of intermediate, duplicate or incomplete results. But a lot of data is pure gold, like the record of me researching and buying ratchet straps. Now a whole bunch of companies know something about me and, as a result, stand a better chance of selling me stuff, hopefully tools I don’t have, because I can’t resist those. If they only knew about the paver saw I need…

Now, if you’re one of those companies the question is where and how to store these nuggets of gold.
 
You don’t want to trap them inside a server running an application or service that doles out access based on a single business strategy.
 
You want to store the data in a way that encourages more value to be extracted. In addition, you want to store it in ways that enable many applications to harvest it, even applications that don’t yet exist.

There are two key principles to ensure maximum value potential:

  • Store your data in formats that are widely recognized across applications
  • Store your data in systems that are broadly accessible across the data lifecycle

Data Formats: There are many data exchange formats for various types of data. The key is to utilize standardized formats designed for fidelity and metadata augmentation. You want to store metadata about the data with the data, not break it into pieces. If you break it into pieces you will diminish the value through lost context. This is one of the great promises of object storage, combining data and metadata that can be used and extended by any authorized user, thus extending the value of the data.
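To make the idea of keeping metadata with the data a little more concrete, here is a minimal sketch in Python. It is not any particular object store's API; the `StoredObject` class, its field names and the `purchase-v1` schema label are all hypothetical, chosen only to illustrate one record that carries its own context and can be extended by a later consumer.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class StoredObject:
    """A record that carries its own metadata (hypothetical schema)."""
    key: str
    data: dict
    metadata: dict = field(default_factory=dict)

    def annotate(self, **extra) -> None:
        """Let a later consumer extend the metadata without touching the data."""
        self.metadata.update(extra)

    def to_json(self) -> str:
        """Serialize data and metadata together in one widely readable format."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "StoredObject":
        """Rebuild the object, context included, from its JSON form."""
        return cls(**json.loads(text))


# The ratchet strap purchase, stored with its context (illustrative values).
purchase = StoredObject(
    key="purchase/0001",
    data={"item": "ratchet strap", "quantity": 4, "unit_price_usd": 8.97},
    metadata={"source": "point-of-sale", "schema": "purchase-v1"},
)

# A second application adds what it learned; the original data is unchanged.
purchase.annotate(linked_clickstream=True)

# Any application that reads JSON gets the data and its context in one piece.
restored = StoredObject.from_json(purchase.to_json())
```

Because the metadata travels inside the same object, an application written years later can still tell where the record came from and what has already been learned from it.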

Data Accessibility: The next task is to ensure the data is widely accessible and cost effective in all phases of its lifecycle. Data cools down quickly but can have a very long tail of value. Real-time analytics, as the name suggests, places a lot of value on speed. Right when I purchased my ratchet straps, while I was still in the store, I got an ad trying to get me to buy more before I left – the equivalent of “would you like fries and a drink with that?” The cost and potential value of the real-time transaction was relatively high compared to the cost of other ads I received hours later. This transaction data undoubtedly landed in a data lake somewhere waiting to be correlated to other activities from me and others via batch analytics.
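The cooling-down behavior can be sketched as a simple placement rule. This is only an illustration under stated assumptions: the tier names and the one-hour and thirty-day thresholds are invented for the example, and a real storage system would weigh far more than time since last access.

```python
from datetime import datetime, timedelta

# Hypothetical tiers, ordered from hottest to coolest: (maximum idle time, tier).
TIERS = [
    (timedelta(hours=1), "flash"),
    (timedelta(days=30), "disk"),
]
COLD_TIER = "archive"  # Assumed fallback for the long tail.


def choose_tier(last_access: datetime, now: datetime) -> str:
    """Pick a storage tier from how long the data has sat untouched."""
    idle = now - last_access
    for max_idle, tier in TIERS:
        if idle <= max_idle:
            return tier
    return COLD_TIER


now = datetime(2024, 1, 1, 12, 0)

# Minutes after the purchase: real-time analytics want the data on fast media.
in_store = choose_tier(now - timedelta(minutes=5), now)

# Weeks later: batch analytics can read it from cheaper storage.
weeks_later = choose_tier(now - timedelta(days=14), now)

# A year later, untouched: it sits in the cheapest tier.
long_tail = choose_tier(now - timedelta(days=365), now)

# A fresh access resets the clock, so old data becomes hot again.
hot_again = choose_tier(now, now)
```

The point of the sketch is that placement follows the current value of the data rather than its age at birth, so a record can move back to fast storage whenever something makes it interesting again.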

We could talk about Kafka and Spark for real-time analytics as well as Hadoop and data lakes for batch analytics, as it is likely that all of these were part of my ratchet strap experience, but let’s not fall back into talking about the application.

The last ten years have taught us two lessons about extracting value from data:

  • Big Data applications will change as new ways are invented to extract value from data.

  • Hardware used to store data at birth is seldom in use at data expiration, as much as a decade or more later.


What is needed is a storage strategy that helps you maximize data value while minimizing the long-tail investment, all while keeping current with best-in-class hardware and software technology. Get the data format right and data and metadata will thrive as they are used and re-used to generate value. But you also must get the storage strategy right.

You need a storage system, like Datera’s, that automatically manages the data lifecycle as well as automatically adapts to new technology.

It’s fast when the value is high, cost-effective when the value diminishes, but can be fast again if and when the data value increases. You need a storage system that provides access to any application you think needs it, even if the application is yet to be created. You need a storage system that evolves as your needs evolve:

  • Adapts to new applications and versions of applications
  • Accelerates new ways of extracting value from your data
  • Embraces new hardware and software technology

Datera was built explicitly for the dynamics of an ever-changing landscape.

It is impossible to predict the dynamic value of data because the value is often based on future events.

I am no longer getting ads based on my ratchet strap purchase, but you can bet that when I start looking at paver saws, scaffolding or compactors (all on my Christmas list), my ratchet strap purchase information will get hot again.

Someone who buys ratchet straps and is looking at paver saws must certainly need whatever products any number of companies are selling.

I wonder what kind of tools truck drivers need…